Introduction - BlobHub

The Specification group documents how the worker operates internally: the BlobHub data model it builds on, the state it keeps on disk versus on the wire, how it synchronizes the two, and the per-job-type object notation. You do not need any of this to connect the worker to your sessions — read the Overview for that. This group is for understanding the mechanics.

The BlobHub data model the worker builds on

The worker is a client of the existing BlobHub session API. It introduces no new backend endpoints: acting as its own user, it reads and writes the same sessions, session objects, and thread items that any other client does. Everything below is part of the generic data model; the worker only attaches meaning to specific metadata blocks within it.

Sessions — a session is the shared context the worker attaches to. One session_agent_harness section attaches to one session.
Session objects — a thread-typed session object has an envelope of attributes (display metadata) and metadata (the structured state). The worker uses two kinds: a single worker object that marks its attachment, and one object per unit of work. See thread session object and upload_session_object.
Thread items — an append-only stream of items on a thread. The agent’s output is posted as items; a human’s posts arrive as items. See post_session_thread_item, list_session_thread_items, and get_session_thread_item.
Session events — the change feed the worker polls to learn what happened. The relevant event is session_thread_item_posted. See Session events.

The worker reuses these generic operations directly; this specification does not re-document their request and response shapes.

What the worker reads and writes

The worker never invents new object types or commands. On each object it touches, it reads and writes only specific metadata blocks — for example the control channel and agent settings on a unit-of-work object, and the attachment marker on the worker object. The exact blocks per object are documented on the per-job-type object pages, starting with the Job Session Object and the Worker Session Object.

Sync model

The worker stays current by polling the session’s change feed with a cursor: it calls list_session_events and advances the cursor past the events it has processed, so it never handles the same event twice. When it needs to change an envelope, it does an optimistic read-modify-write: read the current object, mutate only the fields it owns, and write the object back. On a write conflict it re-reads and retries, resyncing from the next event.
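The two mechanics above, cursor polling and optimistic read-modify-write, can be sketched against an in-memory stand-in. Everything here is an assumption made for illustration: FakeSession, WriteConflict, the integer cursor, and the version check are hypothetical and do not reflect the real request or response shapes of list_session_events or the object-write operations.

```python
from dataclasses import dataclass, field


class WriteConflict(Exception):
    """Raised when the object changed since it was read (hypothetical)."""


@dataclass
class FakeSession:
    """In-memory stand-in for a session; not the BlobHub API."""
    events: list = field(default_factory=list)    # append-only change feed
    metadata: dict = field(default_factory=dict)  # one object's metadata
    version: int = 0                              # bumped on every write

    def list_events(self, after: int) -> list:
        # Return (position, event) pairs strictly past the cursor.
        return [(i, e) for i, e in enumerate(self.events, start=1) if i > after]

    def read(self) -> tuple:
        return self.version, dict(self.metadata)

    def write(self, expected_version: int, metadata: dict) -> None:
        if expected_version != self.version:
            raise WriteConflict()
        self.metadata = dict(metadata)
        self.version += 1


def poll_once(session: FakeSession, cursor: int, handle) -> int:
    """Handle every event past the cursor and return the advanced cursor."""
    for position, event in session.list_events(after=cursor):
        handle(event)
        cursor = position  # advance only after the event is handled
    return cursor


def update_owned_fields(session: FakeSession, owned: dict, max_attempts: int = 5) -> None:
    """Optimistic read-modify-write that changes only the given keys."""
    for _ in range(max_attempts):
        version, metadata = session.read()
        metadata.update(owned)  # keys owned by other writers are left as read
        try:
            session.write(version, metadata)
            return
        except WriteConflict:
            continue  # another writer got there first: re-read and retry
    raise RuntimeError("gave up after repeated write conflicts")
```

In this sketch the cursor is the only state needed to avoid re-handling events, and the retry loop never overwrites keys it did not set, which is the property the prose describes as mutating only the fields the worker owns.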

Local vs wire state

The worker keeps two views of each unit of work, and they are deliberately not the same:

The wire envelope (the session object’s metadata) is authoritative for handoff — it is how the user and the worker communicate ownership and the control-channel state.
The local thread.yaml on the worker host is authoritative for recovery — it records the resume pointer, the last consumed/posted item cursors, and the local failure detail, none of which appear on the wire.

The worker persists local state before any observable side effect (before posting an item, before spawning an agent), so that a crash at any point leaves the on-disk record consistent with what the rest of the system can see. The two layouts differ field-by-field; see the Job Session Object for the wire shape and the wire-to-local mapping.
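A minimal sketch of the persist-before-side-effect ordering follows. It is an illustration under stated assumptions, not the worker's implementation: the field names pending_post and last_posted_item are hypothetical, the post callable stands in for whatever posts a thread item, and JSON replaces YAML only to keep the example within the standard library.

```python
import json
import os
import tempfile
from pathlib import Path


def persist_state(path: Path, state: dict) -> None:
    """Atomically replace the state file so a crash never leaves it half-written."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(state, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes reach the disk
        os.replace(tmp_name, path)  # atomic rename over the old file
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def post_with_intent(path: Path, state: dict, item: str, post) -> dict:
    """Record the intent locally, perform the side effect, then record the result."""
    state = {**state, "pending_post": item}  # hypothetical field name
    persist_state(path, state)               # on disk before anything is observable
    item_id = post(item)                     # the observable side effect
    state = {**state, "pending_post": None, "last_posted_item": item_id}
    persist_state(path, state)
    return state
```

If the process dies between the first persist and the post, the file still shows the pending item, so a recovery pass can decide whether the post needs to be retried or reconciled against the thread.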

Job types

A section’s job_type is the discriminator that selects which validator runs at preflight, which objects the section attaches to, and how it behaves at runtime. session_agent_harness is the one job type in v1. See Job Types for the dispatch mechanics and extensibility.
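Dispatch on a discriminator like job_type is commonly written as a registry keyed by the type string. The sketch below shows that pattern only; the registry, the decorator, the session_id check, and the error strings are hypothetical and are not taken from the worker's actual validators.

```python
from typing import Callable

# Maps a job_type string to the validator that runs at preflight.
VALIDATORS: dict[str, Callable[[dict], list[str]]] = {}


def register(job_type: str):
    """Decorator that registers a validator for one job_type."""
    def decorator(fn):
        VALIDATORS[job_type] = fn
        return fn
    return decorator


@register("session_agent_harness")
def validate_session_agent_harness(section: dict) -> list[str]:
    errors = []
    if not section.get("session_id"):  # hypothetical required field
        errors.append("session_id is required")
    return errors


def preflight(section: dict) -> list[str]:
    """Select the validator by job_type and return its list of errors."""
    job_type = section.get("job_type")
    validator = VALIDATORS.get(job_type)
    if validator is None:
        return [f"unknown job_type: {job_type!r}"]
    return validator(section)
```

With this shape, adding a second job type means registering one more validator; the dispatch code itself does not change.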
