Today we're releasing Torque 0.5.0 — a high-performance, in-memory search engine built from scratch in Rust. It's a drop-in replacement for Typesense with 6x faster search, 7.5x faster ingestion, and sub-2-second restarts at multi-million document scale.
Why we built Torque
We needed a search engine for Traverse that could handle millions of documents with sub-millisecond latency. Existing options forced a trade-off: Elasticsearch is powerful but operationally heavy (JVM tuning, shard management, segment merges). Typesense is simple but limited at scale — no sharding, no BM25 scoring, and restart times measured in minutes to hours for large datasets. Algolia is fast but closed-source and SaaS-only.
We wanted something different: an engine that serves queries from memory with zero-copy data structures, persists everything to disk for durability, and restores instantly. So we built one.
Architecture
Torque holds all data in memory for serving. Inverted index posting lists and term frequencies are memory-mapped from disk using CRoaring's Frozen bitmap format — no deserialization, no heap allocation. The file on disk IS the serving data structure. Other components (filters, sort indexes, column store) are deserialized from compressed binary on startup.
Every collection's index is held behind an ArcSwap — a lock-free atomic pointer. Search queries load an Arc<IndexBundle> with a single atomic read. Index rebuilds happen in the background and swap the pointer atomically when complete. Queries never block on writes. Writes never block on queries.
| Interface | Port | Purpose |
|---|---|---|
| HTTP API | 8108 | Search, CRUD, Studio, admin, TLS |
| TCP Binary | 8109 | High-throughput streaming ingest |
| Studio | 8108 | Embedded web UI |
| MCP | 8108 | Model Context Protocol for AI agents |
Scoring: BM25F with SIMD
Torque uses BM25F — the standard information retrieval scoring function with inverse document frequency and document-length normalization. Rare terms are weighted higher. Long documents aren't penalized for containing more words. This matters for long-form content (articles, documentation, legal text) where term rarity is a strong relevance signal.
Typesense and Meilisearch use custom tie-breaking algorithms without IDF. Algolia uses a similar tie-breaking approach. Elasticsearch uses BM25 via Lucene.
Torque's BM25F has three hot-path implementations depending on query shape: SIMD AVX2 (single-field, 8 candidates scored per instruction), 4x unrolled (multi-field), and scalar fallback. On AMD EPYC with AVX-512, bitmap intersections process 512 document IDs per instruction via CRoaring's SIMD-optimized C library.
Persistence: 1.8-second restore
Torque persists everything to disk automatically. On restart, it memory-maps the inverted index files and deserializes the rest in parallel. The 4.64-million-document English Wikipedia index restores from disk in 1.8 seconds.
| Engine | Restart strategy | Cost at scale |
|---|---|---|
| Torque | mmap frozen bitmaps (zero-copy) | 1.8s for 4.64M docs |
| Typesense | Rebuild ART from RocksDB | 78 minutes reported for 28M docs |
| Elasticsearch | mmap Lucene segments | Seconds (but cold queries hit disk) |
| Meilisearch | LMDB mmap | Fast |
The design principle: the build pipeline produces data structures that are simultaneously the in-memory serving format AND the on-disk persistence format. There is no separate “load” step. Restart means mmap the files and start serving.
Three ingestion paths
Torque is push-based — it never connects to your database. Clients handle data fusion and stream documents to Torque.
- TCP Binary Protocol. Custom binary wire format with schema-aware encoding. No JSON parsing overhead. Measured at 69K docs/s on a 309K-document benchmark — 7.5x faster than Typesense's HTTP JSONL ingest.
- TQBF Binary File Upload. Pre-encode documents to a compressed binary file, upload in one HTTP POST. A single request imports the entire dataset. Measured at 244K docs/s on 4.64M Wikipedia documents.
- HTTP JSONL. Typesense-compatible REST API. Best for small ad-hoc updates and existing Typesense client compatibility.
All five client SDKs implement all three ingestion methods.
GPU acceleration
Torque is the only search engine we're aware of that uses GPU acceleration for the core search operations themselves — not just for embedding generation or vector index building.
Ten CUDA kernels accelerate bitmap intersection, numeric range filtering, BM25F scoring, facet counting, vector distance computation, and geo filtering. When no GPU is available, Torque falls back to CPU transparently with no configuration change.
Elasticsearch and Typesense use GPUs only for ML model inference. Vespa uses GPUs for ONNX model evaluation during ranking. None of them accelerate the inverted index search path itself.
Benchmark: Torque vs Typesense vs Quickwit
Test setup: 309K Swedish bankruptcy records, 67 fields with nested objects, geo coordinates, and financial data. All three engines on the same machine, same dataset, same query patterns. 20 iterations, 5 warmup, CPU-only, HTTP round-trip latency on localhost. Quickwit is backed by Tantivy (the Rust equivalent of Lucene).
| Pattern | Torque p50 | Typesense p50 | Quickwit p50 | vs Typesense | vs Quickwit |
|---|---|---|---|---|---|
| text_search | 2.7 ms | 2.8 ms | 2.9 ms | 1.1x | 1.1x |
| filter_only | 0.2 ms | 1.7 ms | 3.5 ms | 7.5x | 15x |
| faceted | 1.2 ms | 13.1 ms | 2.9 ms | 11.2x | 2.4x |
| sort_only | 0.2 ms | 8.0 ms | 7.9 ms | 53x | 52x |
| group_by | 0.5 ms | 75.5 ms | 1.4 ms | 149x | 2.8x |
| geo_sort | 0.3 ms | 13.0 ms | — | 46x | — |
| prefix_search | 0.4 ms | 1.1 ms | 7.5 ms | 2.5x | 17x |
| multi_filter | 0.6 ms | 18.8 ms | 3.2 ms | 29x | 5x |
| Geometric mean | 6.1x | 4.2x |
The largest advantages are in operations dominated by bitmap and column-store operations: filtering, sorting, grouping, and geo queries. Text search is similar across all engines because they're all fundamentally bounded by the same inverted index lookups.
Ingestion
| Engine | Method | Time (309K docs) | Throughput |
|---|---|---|---|
| Torque | TCP binary | 4.5s | 69K docs/s |
| Quickwit | HTTP NDJSON | 7.6s | 41K docs/s |
| Typesense | HTTP JSONL | 33.8s | 9K docs/s |
Wikipedia scale (4.64M documents)
| Metric | Value |
|---|---|
| Import (TQBF binary) | 19s (244K docs/s) |
| Index build | ~3 minutes (streaming, disk-backed text) |
| Persist to disk | 2.5 GB |
| Restore from disk | 1.8s |
| Serving RSS | 3.1 GB (inverted index mmap'd, not on heap) |
| Search “barack obama” | 8,181 results in 3 ms |
Typesense v30.1 API compatible
Torque implements the Typesense v30.1 REST API for all search and CRUD operations. If you have an existing Typesense deployment, you can point your clients at Torque and they work without code changes.
Fully compatible: Collections, Documents, Search, Multi-search, Aliases, API Keys, Synonyms, Presets, Stopwords.
Torque-only extensions: TCP binary ingest, TQBF binary file import, collection suspend/resume, compaction, GPU acceleration, decay sorting, nested element-level AND filters, MCP server.
Client SDKs
Five official SDKs, each with the full HTTP API, the TCP binary ingest protocol, and the TQBF binary file writer:
| Language | Package | Runtime |
|---|---|---|
| .NET 10+ | Torque.Http | Zero NuGet deps, source-generated JSON |
| Python 3.10+ | torque-http | Zero deps, stdlib only |
| Node.js 20+ | @truespar/torque-http | Zero deps, native fetch, TypeScript |
| Go 1.23+ | torque-sdk-go | Zero deps, stdlib only |
| Java 21+ | com.truespar:torque-http | Jackson 3.1, typed models |
Torque Studio
Torque ships with an embedded web UI — no separate install, no npm, no build step. Open http://localhost:8108 and you have a search editor with syntax highlighting, a schema browser, a data import panel, a benchmark dashboard, API key management, and full documentation.
MCP server
Torque includes a built-in MCP (Model Context Protocol) server at /api/mcp. AI assistants like Claude can search your collections, inspect schemas, and manage documents directly. Nine tools are available: search, multi-search, list collections, get schema, get document, create collection, upsert document, delete document, and delete collection.
// claude_desktop_config.json
{
"mcpServers": {
"torque": {
"url": "http://localhost:8108/api/mcp",
"headers": {
"X-TYPESENSE-API-KEY": "your-api-key"
}
}
}
}
What's in the box
- Search. Full-text with BM25F, multi-field, prefix, typo tolerance (SymSpell), synonyms, presets, stopwords, highlighting.
- Filtering. Exact match, numeric range, boolean, geo radius, nested object conditions, array element-level AND.
- Sorting. Field, text relevance, geo distance, decay functions (Gaussian, linear, exponential).
- Faceting. Per-field counts with min/max/sum/average stats.
- Grouping. Group results by field with configurable per-group limits.
- Vector search. HNSW with RaBitQ quantization (1-bit, no training). Hybrid text+vector via RRF or alpha-weighted interpolation.
- Stemming. 17 languages: English, German, French, Spanish, Portuguese, Italian, Dutch, Swedish, Norwegian, Danish, Finnish, Russian, Arabic, Greek, Hungarian, Romanian, Turkish.
- Auth. Bootstrap, admin, search-only, and HMAC-scoped API keys.
- Persistence. Automatic, atomic, zero-config. Collections, indexes, overlays, schemas, keys, synonyms, presets, stopwords — all restored on restart.
- Realtime mode. Instant upsert/delete with overlay persistence, auto-compaction, backpressure.
- Operations. TLS/HTTPS, Prometheus metrics, IP allowlist, slow query logging, TOML config, env vars, Windows Service, Docker.
Getting started
# Download and run
torque-server --api-key YOUR_KEY --license-key YOUR_LICENSE
# Or Docker
docker run -p 8108:8108 -p 8109:8109 -v torque-data:/data truespar/torque
Open http://localhost:8108 for Studio. Read the docs for the full setup guide, or reach out if you want a demo.
What's next
We're working on Raft-based clustering for multi-node replication, JOINs across collections, conversational search, and continuing to push on raw query performance. Torque already runs as a unikernel on Silicon — the same 84 ms boot time, the same hardware isolation, applied to search.