TCP Protocol
The TCP binary ingest protocol provides high-throughput document streaming. It is significantly faster than HTTP JSONL import and supports both batch and realtime modes.
Overview
The TCP protocol runs on port 8109 by default (configurable via --ingest-addr). Documents are encoded in a compact binary format and streamed in batches over a persistent TCP connection.
| Mode | Description |
|---|---|
| Batch | Stream a large number of documents, then commit to rebuild the full index. Best for bulk loads and initial data ingestion. |
| Realtime | Instant upserts and deletes without a commit step. Documents are immediately searchable via an overlay that merges with the base index. |
Protocol Flow
Batch Mode
- Client opens TCP connection
- Client sends MSG_START_INGEST with collection name and mode (batch)
- Client sends one or more MSG_DOC_BATCH messages, each containing a batch of encoded documents
- Server responds with MSG_ACK for each batch (sequence number, total docs received)
- Client sends MSG_COMMIT to finalize
- Server rebuilds the index and responds with MSG_COMMIT_RESULT (success, docs indexed, build time)
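The batch sequence above can be sketched as a plan of client messages. This is a minimal sketch rather than an SDK API: `batch_session_plan`, `MODE_BATCH`, and `batch_size` are hypothetical names, the message IDs are taken from the Message Types tables, and nothing is written to a socket because the message framing is not described on this page.

```python
from typing import Iterator, List, Tuple

# Message IDs as listed in the Message Types tables.
MSG_START_INGEST = 0x01
MSG_DOC_BATCH = 0x02
MSG_COMMIT = 0x03

MODE_BATCH = 0  # mode byte for batch sessions (0=Batch, 1=Realtime)


def batch_session_plan(
    collection: str, docs: List[bytes], batch_size: int
) -> Iterator[Tuple[int, object]]:
    """Yield (message_id, payload) pairs in the order a batch client sends them.

    `docs` holds already-encoded documents; `batch_size` is a client-side
    choice (hypothetical here), bounded in practice by the server's limits.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    yield MSG_START_INGEST, (collection, MODE_BATCH)
    for start in range(0, len(docs), batch_size):
        # The server is expected to answer each batch with MSG_ACK.
        yield MSG_DOC_BATCH, docs[start:start + batch_size]
    # The server rebuilds the index and answers with MSG_COMMIT_RESULT.
    yield MSG_COMMIT, None
```

A real client would wait for MSG_ACK (or handle MSG_ERROR and MSG_BACKPRESSURE) between batches instead of emitting the whole plan up front.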
Realtime Mode
- Client opens TCP connection
- Client sends MSG_START_INGEST with collection name and mode (realtime)
- Client sends MSG_UPSERT messages for individual document upserts or deletes
- Documents are immediately visible in search results
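One way to picture "immediately visible" is an overlay that is consulted before the base index. The toy model below is an assumption made purely for illustration: `OverlayIndex` and its methods are hypothetical names and do not describe how the server actually stores or merges data.

```python
from typing import Dict, Optional

_TOMBSTONE = object()  # marks a document deleted in the overlay


class OverlayIndex:
    """Toy model: realtime writes land in an overlay that shadows the base."""

    def __init__(self, base: Dict[str, dict]) -> None:
        self.base = dict(base)          # stands in for the last committed index
        self.overlay: Dict[str, object] = {}

    def upsert(self, doc_id: str, doc: dict) -> None:
        self.overlay[doc_id] = doc      # visible to get() right away

    def delete(self, doc_id: str) -> None:
        self.overlay[doc_id] = _TOMBSTONE

    def get(self, doc_id: str) -> Optional[dict]:
        if doc_id in self.overlay:
            hit = self.overlay[doc_id]
            return None if hit is _TOMBSTONE else hit  # type: ignore[return-value]
        return self.base.get(doc_id)
```

In this model no commit step is needed for a write to become readable, which mirrors the behavior described for realtime mode.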
Message Types
Client → Server
| ID | Name | Description |
|---|---|---|
| 0x01 | MSG_START_INGEST | Start an ingest session. Payload: collection name (string), mode (u8: 0=Batch, 1=Realtime). |
| 0x02 | MSG_DOC_BATCH | A batch of encoded documents. Payload: document count (u32), binary document data. |
| 0x03 | MSG_COMMIT | Finalize the batch and rebuild the index. |
| 0x04 | MSG_ABORT | Cancel the current ingest session. |
| 0x05 | MSG_UPSERT | Stateless upsert or delete (realtime mode only). |
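The client message IDs map naturally onto an enum, and the MSG_START_INGEST payload can be built from the two fields listed in the table. The sketch below assumes the payload fields are concatenated in the listed order with no padding, and that strings use the u32 length prefix described under Wire Format. `ClientMsg`, `IngestMode`, and `start_ingest_payload` are hypothetical names, and the outer message framing (how the ID and payload are delimited on the wire) is deliberately left out because it is not specified here.

```python
import struct
from enum import IntEnum


class ClientMsg(IntEnum):
    START_INGEST = 0x01
    DOC_BATCH = 0x02
    COMMIT = 0x03
    ABORT = 0x04
    UPSERT = 0x05


class IngestMode(IntEnum):
    BATCH = 0
    REALTIME = 1


def start_ingest_payload(collection: str, mode: IngestMode) -> bytes:
    """Build the MSG_START_INGEST payload: collection name (string) + mode (u8).

    Assumes fields are packed back to back in table order.
    """
    name = collection.encode("utf-8")
    return struct.pack(">I", len(name)) + name + struct.pack(">B", mode)
```

For example, `start_ingest_payload("ab", IngestMode.REALTIME)` yields a 4-byte length, the two name bytes, and a trailing mode byte of 1.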
Server → Client
| ID | Name | Description |
|---|---|---|
| 0x10 | MSG_ACK | Batch acknowledged. Payload: batch sequence (u64), total docs received (u64). |
| 0x11 | MSG_COMMIT_RESULT | Index rebuild complete. Payload: success (bool), docs indexed (u64), build time ms (u64). |
| 0x12 | MSG_ERROR | Error response. Payload: error code (u16), message (string). |
| 0x13 | MSG_BACKPRESSURE | Server is busy. Payload: retry after ms (u32). |
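The server payloads listed above can be decoded with Python's `struct` module. This is a sketch under two assumptions that the table does not spell out: payload fields are packed back to back in the listed order, and `bool` occupies one byte (as it does in the document field encoding). The function names are hypothetical, and each function expects only the payload bytes, with any message framing already stripped.

```python
import struct
from typing import Tuple


def decode_ack(payload: bytes) -> Tuple[int, int]:
    """MSG_ACK: batch sequence (u64), total docs received (u64)."""
    return struct.unpack(">QQ", payload)


def decode_commit_result(payload: bytes) -> Tuple[bool, int, int]:
    """MSG_COMMIT_RESULT: success (bool), docs indexed (u64), build time ms (u64)."""
    return struct.unpack(">?QQ", payload)


def decode_error(payload: bytes) -> Tuple[int, str]:
    """MSG_ERROR: error code (u16), message (u32 length + UTF-8 bytes)."""
    code, length = struct.unpack_from(">HI", payload, 0)
    message = payload[6:6 + length].decode("utf-8")
    return code, message


def decode_backpressure(payload: bytes) -> int:
    """MSG_BACKPRESSURE: retry after ms (u32)."""
    (retry_ms,) = struct.unpack(">I", payload)
    return retry_ms
```

A client that receives MSG_BACKPRESSURE would typically pause for the returned number of milliseconds, for example with `time.sleep(retry_ms / 1000)`, before resending.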
Error Codes
| Code | Description |
|---|---|
| 1 | Collection not found |
| 2 | Schema mismatch |
| 3 | Session already active |
| 4 | No active session |
| 5 | Batch too large |
| 6 | Internal error |
| 7 | Limit exceeded (document count or memory) |
| 8 | Backpressure (overlay too large) |
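In client code, these codes are easier to handle as named constants than as bare integers. The following is a minimal sketch: the numeric values come from the table above, while `IngestErrorCode` and `IngestError` are hypothetical names and not part of any official SDK.

```python
from enum import IntEnum
from typing import Union


class IngestErrorCode(IntEnum):
    COLLECTION_NOT_FOUND = 1
    SCHEMA_MISMATCH = 2
    SESSION_ALREADY_ACTIVE = 3
    NO_ACTIVE_SESSION = 4
    BATCH_TOO_LARGE = 5
    INTERNAL_ERROR = 6
    LIMIT_EXCEEDED = 7
    BACKPRESSURE = 8


class IngestError(Exception):
    """Raised by a hypothetical client when the server sends MSG_ERROR."""

    def __init__(self, code: int, message: str) -> None:
        try:
            self.code: Union[IngestErrorCode, int] = IngestErrorCode(code)
            label = self.code.name
        except ValueError:
            # Keep codes this table does not list as plain ints.
            self.code = code
            label = "UNKNOWN"
        self.message = message
        super().__init__(f"[{int(code)} {label}] {message}")
```

Preserving unknown codes instead of failing keeps such a client usable if the server later adds codes beyond those listed here.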
Wire Format
All multi-byte integers use big-endian byte order. Strings are encoded as a u32 length prefix followed by UTF-8 bytes.
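As a concrete illustration of the string rule, the helpers below write and read a u32 big-endian length followed by UTF-8 bytes. The names `encode_string` and `decode_string` are hypothetical; note that the length counts bytes, not characters.

```python
import struct
from typing import Tuple


def encode_string(value: str) -> bytes:
    """Encode as u32 big-endian byte length followed by UTF-8 bytes."""
    data = value.encode("utf-8")
    return struct.pack(">I", len(data)) + data


def decode_string(buf: bytes, offset: int = 0) -> Tuple[str, int]:
    """Decode one string at `offset`; return (value, offset after the string)."""
    (length,) = struct.unpack_from(">I", buf, offset)
    start = offset + 4
    end = start + length
    if end > len(buf):
        raise ValueError("buffer ends before the declared string length")
    return buf[start:end].decode("utf-8"), end
```

Returning the next offset lets a reader walk through several consecutive strings in one buffer.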
Document Encoding
Each document is encoded as a sequence of field values, each identified by its field ID (u16, matching the schema field order):
| Field Type | Encoding |
|---|---|
| string | u32 length + UTF-8 bytes |
| int32 | 4 bytes big-endian |
| int64 | 8 bytes big-endian |
| float | 8 bytes big-endian (f64) |
| bool | 1 byte (0 or 1) |
| string[] | u32 count + per-string (u32 length + UTF-8 bytes) |
| int32[] | u32 count + per-element (4 bytes big-endian) |
| int64[] | u32 count + per-element (8 bytes big-endian) |
| geopoint | f64 latitude + f64 longitude (16 bytes big-endian) |
| float[] (vector) | u16 dimensions + per-element f32 (4 bytes big-endian) |
The document’s external ID is encoded at the end as a string (u32 length + UTF-8 bytes).
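The per-type encodings can be sketched with `struct`. Two points below are assumptions, since this page does not state them outright: integers are treated as signed two's complement, and each field value is preceded by its u16 field ID. `ENCODERS` and `encode_document` are hypothetical names, and the official SDKs should be preferred for real ingestion.

```python
import struct
from typing import Callable, Dict, List, Tuple


def _enc_string(value: str) -> bytes:
    data = value.encode("utf-8")
    return struct.pack(">I", len(data)) + data


# One encoder per field type in the table above (all big-endian).
ENCODERS: Dict[str, Callable[[object], bytes]] = {
    "string": _enc_string,
    "int32": lambda v: struct.pack(">i", v),   # ASSUMPTION: signed
    "int64": lambda v: struct.pack(">q", v),   # ASSUMPTION: signed
    "float": lambda v: struct.pack(">d", v),   # f64
    "bool": lambda v: struct.pack(">B", 1 if v else 0),
    "string[]": lambda v: struct.pack(">I", len(v))
    + b"".join(_enc_string(s) for s in v),
    "int32[]": lambda v: struct.pack(">I", len(v)) + struct.pack(f">{len(v)}i", *v),
    "int64[]": lambda v: struct.pack(">I", len(v)) + struct.pack(f">{len(v)}q", *v),
    "geopoint": lambda v: struct.pack(">dd", v[0], v[1]),  # (lat, lon)
    "float[]": lambda v: struct.pack(">H", len(v)) + struct.pack(f">{len(v)}f", *v),
}


def encode_document(fields: List[Tuple[int, str, object]], external_id: str) -> bytes:
    """Encode (field_id, field_type, value) triples, then the external ID."""
    out = bytearray()
    for field_id, field_type, value in fields:
        out += struct.pack(">H", field_id)  # ASSUMPTION: u16 ID precedes each value
        out += ENCODERS[field_type](value)
    out += _enc_string(external_id)         # external ID goes last, as a string
    return bytes(out)
```

For instance, `encode_document([(0, "string", "hi"), (1, "int32", 7)], "doc-1")` produces the two tagged field values followed by the length-prefixed ID `doc-1`.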
Client SDKs
Official SDKs with full TCP protocol support are available for .NET, Python, Go, Java, and Node.js. Each includes typed document builders, batch streaming, and error handling.
Performance
In benchmarks on 309,236 production records, the TCP binary protocol ingested the full set in 4.4 seconds (about 70K docs/s), 9.5x faster than Typesense v30.1 on the same hardware. The speedup comes from compact binary encoding (no JSON parsing overhead), persistent connections, and batch streaming. For even faster bulk loads, the TQBF binary file format pre-encodes documents ahead of time so the server performs zero parsing at import time.
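The quoted throughput follows directly from the record count and elapsed time. The short check below uses only the two figures stated above; the variable names are illustrative.

```python
records = 309_236   # production records in the benchmark
seconds = 4.4       # reported ingest time

docs_per_second = records / seconds
print(f"{docs_per_second:,.0f} docs/s")  # prints "70,281 docs/s"
```

This rounds to the 70K docs/s figure cited for the benchmark.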
Configuration
| Setting | Default | Description |
|---|---|---|
| --ingest-addr | 0.0.0.0:8109 | TCP listen address |
| --max-ingest-connections | 64 | Maximum concurrent connections |
| TOML: ingest_allow | all | IP allowlist for TCP connections (CIDR notation) |
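To illustrate what a CIDR allowlist means in practice, the sketch below checks a client address against a list of networks using Python's standard `ipaddress` module. It is not the server's implementation: `is_allowed` is a hypothetical helper, and treating the literal `all` as "accept everything" is an assumption based on the default shown in the table.

```python
import ipaddress
from typing import Iterable


def is_allowed(client_ip: str, allowlist: Iterable[str]) -> bool:
    """Return True if client_ip falls inside any CIDR entry in allowlist."""
    addr = ipaddress.ip_address(client_ip)
    for entry in allowlist:
        if entry == "all":  # ASSUMPTION: the default value accepts any client
            return True
        network = ipaddress.ip_network(entry, strict=False)
        # Only compare addresses and networks of the same IP version.
        if addr.version == network.version and addr in network:
            return True
    return False
```

With an allowlist of `["10.0.0.0/8"]`, a client at `10.1.2.3` is accepted while one at `192.168.1.5` is rejected.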