Esse::Document is the in-memory representation of a single indexable record. Repositories produce documents; the import pipeline sends them to Elasticsearch.
In most cases you don’t interact with Esse::Document directly — returning a Hash from your repository document block is enough. But understanding the contract is useful for custom document classes, partial updates, and lazy attributes.
The contract
Every document has these (optionally overridable) methods:
| Method | Description |
|---|---|
id | String/Integer/nil. Documents with nil id are ignored on index/delete. |
type | String or nil. Legacy ES type (modern clusters use _doc by default). |
routing | String or nil. Used for custom routing. |
source | Hash. The actual document body. |
meta | Hash. Extra metadata merged into bulk operation headers. |
Instance methods:
| Method | Returns |
|---|---|
object | The raw record the document was built from |
options | Frozen hash of construction options |
to_h | { _id, _type, _routing, ...meta, ...source } |
to_bulk(operation:, data:) | Bulk-formatted hash |
doc_header | { _id, _type, routing } |
ignore_on_index? | true when id.nil? |
ignore_on_delete? | true when id.nil? |
Writing a custom Document class
class UserDocument < Esse::Document def id object.id end
def routing object.tenant_id end
def source { name: object.name, email: object.email, roles: object.role_names } endend
# Use it from a repositoryrepository :user do collection { |**c, &b| User.find_in_batches { |u| b.call(u, c) } } document UserDocumentendVariants
HashDocument
Auto-extracts header fields from a hash. Most repository document blocks implicitly use HashDocument:
doc = Esse::HashDocument.new( _id: 'abc', _routing: 'shard-1', name: 'John', email: 'john@example.com')
doc.id # => 'abc'doc.routing # => 'shard-1'doc.source # => { name: 'John', email: 'john@example.com' }ID is looked up in this order: _id, id, :_id, :id. Reserved keys (_id, _type, _routing, routing) are stripped from source.
NullDocument
A no-op document, used to skip indexing from inside a document block:
document do |record, **| next Esse::NullDocument.new if record.deleted? { _id: record.id, name: record.name }endNullDocument#id is nil, so it’s automatically ignored by bulk operations.
DocumentForPartialUpdate
Used when updating a subset of fields on an already-indexed document. Usually created via:
doc = original_doc.document_for_partial_update(comment_count: 42)
doc.source # => { comment_count: 42 }# id, type, routing delegate to original_docLazyDocumentHeader
A lightweight placeholder that carries only the information needed to target a document (id, type, routing) and arbitrary extra options. It’s the object you receive in lazy_document_attribute blocks:
lazy_document_attribute :name do |headers| # headers is Array[Esse::LazyDocumentHeader] ids = headers.map(&:id) data = User.where(id: ids).pluck(:id, :name).to_h headers.each_with_object({}) { |h, acc| acc[h] = data[h.id] }endCoerce any input into a header:
Esse::LazyDocumentHeader.coerce(1) # from IDEsse::LazyDocumentHeader.coerce('id-123') # from stringEsse::LazyDocumentHeader.coerce(_id: 1, _routing: 'x')Esse::LazyDocumentHeader.coerce_each([1, 2, 3]) # batchMutations
You can mutate a document before indexing:
doc.mutate(:display_name) { "#{object.first_name} #{object.last_name}" }doc.mutations # => { display_name: "John Doe" }doc.mutated_source # => source.merge(display_name: "John Doe")Mutations are applied when building bulk payloads.
Equality
doc1 = UserDocument.new(user)doc2 = UserDocument.new(user)doc1.eql?(doc2) # => true if all headers and source match
# When comparing to a LazyDocumentHeader:doc1.eql?(header, match_lazy_doc_header: true)Bulk format
doc.to_bulk(operation: :index)# => { _id: 1, _type: '_doc', data: { name: 'John' } }
doc.to_bulk(operation: :update, data: { doc: { name: 'Jane' } })# => { _id: 1, _type: '_doc', data: { doc: { name: 'Jane' } } }When do you write a custom class?
Prefer returning a Hash from the repository document block for simple cases. Reach for a custom Esse::Document subclass when:
- You want shared logic across many indices.
- You need access to construction options (
doc.options[:foo]). - You want mutations / partial updates to be derived cleanly.
- You’re building a reusable serializer layer (e.g., across Active Record models).