Introducing Torque: A Search Engine That Restores 4.64M Documents in 1.8 Seconds

Today we're releasing Torque 0.5.0 — a high-performance, in-memory search engine built from scratch in Rust. It's a drop-in replacement for Typesense with 6x faster search, 7.5x faster ingestion, and sub-2-second restarts at multi-million document scale.

Why we built Torque

We needed a search engine for Traverse that could handle millions of documents with sub-millisecond latency. Existing options forced a trade-off: Elasticsearch is powerful but operationally heavy (JVM tuning, shard management, segment merges). Typesense is simple but limited at scale — no sharding, no BM25 scoring, and restart times measured in minutes to hours for large datasets. Algolia is fast but closed-source and SaaS-only.

We wanted something different: an engine that serves queries from memory with zero-copy data structures, persists everything to disk for durability, and restores instantly. So we built one.

Architecture

Torque holds all data in memory for serving. Inverted index posting lists and term frequencies are memory-mapped from disk using CRoaring's Frozen bitmap format — no deserialization, no heap allocation. The file on disk IS the serving data structure. Other components (filters, sort indexes, column store) are deserialized from compressed binary on startup.

Every collection's index is held behind an ArcSwap — a lock-free atomic pointer. Search queries load an Arc<IndexBundle> with a single atomic read. Index rebuilds happen in the background and swap the pointer atomically when complete. Queries never block on writes. Writes never block on queries.
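As a rough illustration of the snapshot-swap idea (this is not Torque's Rust code, and not the arc-swap crate itself), the sketch below keeps an immutable bundle behind a single reference. The `Collection` class and the fields of `IndexBundle` are hypothetical. In CPython, rebinding an attribute is a single reference swap, which stands in here for the atomic pointer store.

```python
import threading
from dataclasses import dataclass
from types import MappingProxyType


@dataclass(frozen=True)
class IndexBundle:
    """Immutable index snapshot (hypothetical shape, for illustration only)."""
    version: int
    postings: MappingProxyType  # term -> tuple of doc IDs


class Collection:
    def __init__(self):
        self._current = IndexBundle(0, MappingProxyType({}))
        # Serializes rebuilds against each other; readers never take it.
        self._writer_lock = threading.Lock()

    def load(self):
        # Readers take one reference and use that snapshot for the whole query.
        return self._current

    def rebuild(self, docs):
        with self._writer_lock:
            postings = {}
            for doc_id, text in docs.items():
                for term in set(text.lower().split()):
                    postings.setdefault(term, []).append(doc_id)
            frozen = MappingProxyType(
                {term: tuple(sorted(ids)) for term, ids in postings.items()}
            )
            # Publish the new snapshot; in-flight queries keep the old one.
            self._current = IndexBundle(self._current.version + 1, frozen)


collection = Collection()
before = collection.load()
collection.rebuild({1: "rust search", 2: "search engine"})
after = collection.load()
```

A query that loaded `before` keeps seeing version 0 even after the rebuild publishes version 1, which is the property that lets reads and writes proceed without blocking each other.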

| Interface | Port | Purpose |
|---|---|---|
| HTTP API | 8108 | Search, CRUD, Studio, admin, TLS |
| TCP Binary | 8109 | High-throughput streaming ingest |
| Studio | 8108 | Embedded web UI |
| MCP | 8108 | Model Context Protocol for AI agents |

Scoring: BM25F with SIMD

Torque uses BM25F, the field-weighted variant of BM25, the standard information retrieval scoring function built on inverse document frequency and document-length normalization. Rare terms are weighted higher, and term frequency is normalized by length so that long documents do not win simply by containing more words. This matters for long-form content (articles, documentation, legal text) where term rarity is a strong relevance signal.
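To make the scoring concrete, here is a minimal sketch of the textbook BM25F formula for a single query term. It is not Torque's implementation: the field names, weights, `k1`, `b`, and the Lucene-style IDF variant are all assumptions chosen for illustration, and a single `b` is shared across fields for brevity.

```python
import math


def idf(total_docs, docs_with_term):
    # Lucene-style BM25 IDF: always positive, larger for rarer terms.
    return math.log(1.0 + (total_docs - docs_with_term + 0.5) / (docs_with_term + 0.5))


def bm25f_term_score(field_tf, field_len, avg_field_len, weights,
                     total_docs, docs_with_term, k1=1.2, b=0.75):
    """Score one query term against one document across several fields."""
    pseudo_tf = 0.0
    for field, tf in field_tf.items():
        # Per-field length normalization relative to the average field length.
        norm = 1.0 - b + b * field_len[field] / avg_field_len[field]
        pseudo_tf += weights[field] * tf / norm
    # Saturation: repeated occurrences add less and less.
    return idf(total_docs, docs_with_term) * pseudo_tf / (k1 + pseudo_tf)


avg_len = {"title": 8, "body": 400}
weights = {"title": 3.0, "body": 1.0}  # hypothetical field boosts
doc_tf = {"title": 1, "body": 2}
doc_len = {"title": 8, "body": 400}

# Same document and term counts; only the term's rarity in the corpus differs.
rare_term = bm25f_term_score(doc_tf, doc_len, avg_len, weights, 1_000_000, 120)
common_term = bm25f_term_score(doc_tf, doc_len, avg_len, weights, 1_000_000, 400_000)
```

With identical term counts, the term that appears in 120 documents scores far higher than the one that appears in 400,000, which is the IDF effect described above.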

Typesense and Meilisearch use custom tie-breaking algorithms without IDF. Algolia uses a similar tie-breaking approach. Elasticsearch uses BM25 via Lucene.

Torque's BM25F has three hot-path implementations depending on query shape: SIMD AVX2 (single-field, 8 candidates scored per instruction), 4x unrolled (multi-field), and scalar fallback. On AMD EPYC with AVX-512, bitmap intersections process 512 document IDs per instruction via CRoaring's SIMD-optimized C library.
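The "512 document IDs per instruction" figure follows from the bitmap representation: each document ID is one bit, so a single 512-bit AND intersects 512 candidate IDs at once. The sketch below shows the idea with Python integers standing in for bitmap containers. It is a conceptual illustration only; CRoaring also uses array and run containers, which are not modeled here.

```python
def to_bitmap(doc_ids):
    """Pack doc IDs into one integer: bit i is set when doc i matches."""
    bitmap = 0
    for doc_id in doc_ids:
        bitmap |= 1 << doc_id
    return bitmap


def from_bitmap(bitmap):
    """Unpack set bits back into a sorted list of doc IDs."""
    doc_ids = []
    doc_id = 0
    while bitmap:
        if bitmap & 1:
            doc_ids.append(doc_id)
        bitmap >>= 1
        doc_id += 1
    return doc_ids


rust_docs = to_bitmap([3, 17, 64, 200, 511])
search_docs = to_bitmap([17, 64, 90, 511])

# One AND intersects every doc ID covered by the bitmap in a single operation.
both = rust_docs & search_docs
matches = from_bitmap(both)
```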

Persistence: 1.8-second restore

Torque persists everything to disk automatically. On restart, it memory-maps the inverted index files and deserializes the rest in parallel. The 4.64-million-document English Wikipedia index restores from disk in 1.8 seconds.

| Engine | Restart strategy | Cost at scale |
|---|---|---|
| Torque | mmap frozen bitmaps (zero-copy) | 1.8s for 4.64M docs |
| Typesense | Rebuild ART from RocksDB | 78 minutes reported for 28M docs |
| Elasticsearch | mmap Lucene segments | Seconds (but cold queries hit disk) |
| Meilisearch | LMDB mmap | Fast |

The design principle: the build pipeline produces data structures that are simultaneously the in-memory serving format and the on-disk persistence format. For the inverted index there is no separate “load” step: restart means mmap the files and start serving, while the remaining components deserialize in parallel.
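The principle can be sketched in a few lines, assuming a deliberately simple layout: a posting list stored as raw native unsigned integers. This is not the CRoaring frozen format or Torque's file layout, and the `persist` and `restore` helpers are hypothetical. The point is only that the mapped bytes are read in place, with no parsing step between disk and query.

```python
import mmap
import os
import tempfile
from array import array


def persist(path, doc_ids):
    """Write sorted doc IDs as raw native unsigned ints (also the serving layout)."""
    with open(path, "wb") as f:
        array("I", sorted(doc_ids)).tofile(f)


def restore(path):
    """Map the file and view it as unsigned ints without parsing or copying."""
    with open(path, "rb") as f:
        mapped = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    return mapped, memoryview(mapped).cast("I")


path = os.path.join(tempfile.mkdtemp(), "postings.bin")
persist(path, [42, 7, 19])

mapped, doc_ids = restore(path)
restored = list(doc_ids)  # copied here only so the values outlive the mapping
doc_ids.release()
mapped.close()
```

Restore cost in this scheme depends on setting up the mapping, not on the number of entries, because pages are faulted in lazily as queries touch them.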

Three ingestion paths

Torque is push-based — it never connects to your database. Clients handle data fusion and stream documents to Torque.

  • TCP Binary Protocol. Custom binary wire format with schema-aware encoding. No JSON parsing overhead. Measured at 69K docs/s on a 309K-document benchmark — 7.5x faster than Typesense's HTTP JSONL ingest.
  • TQBF Binary File Upload. Pre-encode documents to a compressed binary file, upload in one HTTP POST. A single request imports the entire dataset. Measured at 244K docs/s on 4.64M Wikipedia documents.
  • HTTP JSONL. Typesense-compatible REST API. Best for small ad-hoc updates and existing Typesense client compatibility.

All five client SDKs implement all three ingestion methods.
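For the HTTP JSONL path, a request can be assembled with nothing but the standard library. The sketch below assumes Torque accepts the same import endpoint and header that Typesense documents (`/collections/{name}/documents/import` with `X-TYPESENSE-API-KEY`); the collection name and documents are made up, and the request is built but not sent.

```python
import json
import urllib.request


def build_import_request(base_url, collection, api_key, docs):
    """Build (but do not send) a Typesense-style JSONL import request."""
    # JSONL: one JSON document per line.
    body = "\n".join(json.dumps(doc) for doc in docs).encode("utf-8")
    url = f"{base_url}/collections/{collection}/documents/import?action=upsert"
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"X-TYPESENSE-API-KEY": api_key, "Content-Type": "text/plain"},
    )


request = build_import_request(
    "http://localhost:8108",
    "companies",  # hypothetical collection name
    "YOUR_KEY",
    [{"id": "1", "name": "Acme AB"}, {"id": "2", "name": "Nordic Freight"}],
)
# urllib.request.urlopen(request) would send it to a running server.
```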

GPU acceleration

Torque is the only search engine we're aware of that uses GPU acceleration for the core search operations themselves — not just for embedding generation or vector index building.

Ten CUDA kernels accelerate bitmap intersection, numeric range filtering, BM25F scoring, facet counting, vector distance computation, and geo filtering. When no GPU is available, Torque falls back to CPU transparently with no configuration change.

Elasticsearch and Typesense use GPUs only for ML model inference. Vespa uses GPUs for ONNX model evaluation during ranking. None of them accelerate the inverted index search path itself.

Benchmark: Torque vs Typesense vs Quickwit

Test setup: 309K Swedish bankruptcy records, 67 fields with nested objects, geo coordinates, and financial data. All three engines on the same machine, same dataset, same query patterns. 20 iterations, 5 warmup, CPU-only, HTTP round-trip latency on localhost. Quickwit is backed by Tantivy (the Rust equivalent of Lucene).
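The measurement loop can be approximated as follows. This is a sketch of the methodology rather than the actual benchmark harness, and it assumes the 5 warmup runs are discarded in addition to the 20 timed iterations. `run_query` is a placeholder for one HTTP round trip.

```python
import statistics
import time


def p50_latency_ms(run_query, warmup=5, iterations=20):
    """Median wall-clock latency of run_query in milliseconds."""
    for _ in range(warmup):
        run_query()  # warm caches and connections; timings discarded
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_query()
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples)


calls = []
latency = p50_latency_ms(lambda: calls.append(sum(range(1000))))
```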

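| Pattern | Torque p50 | Typesense p50 | Quickwit p50 | vs Typesense | vs Quickwit |
|---|---|---|---|---|---|
| text_search | 2.7 ms | 2.8 ms | 2.9 ms | 1.1x | 1.1x |
| filter_only | 0.2 ms | 1.7 ms | 3.5 ms | 7.5x | 15x |
| faceted | 1.2 ms | 13.1 ms | 2.9 ms | 11.2x | 2.4x |
| sort_only | 0.2 ms | 8.0 ms | 7.9 ms | 53x | 52x |
| group_by | 0.5 ms | 75.5 ms | 1.4 ms | 149x | 2.8x |
| geo_sort | 0.3 ms | 13.0 ms | | 46x | |
| prefix_search | 0.4 ms | 1.1 ms | 7.5 ms | 2.5x | 17x |
| multi_filter | 0.6 ms | 18.8 ms | 3.2 ms | 29x | 5x |
| Geometric mean | | | | 6.1x | 4.2x |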

The largest advantages are in queries dominated by bitmap and column-store work: filtering, sorting, grouping, and geo queries. Text search is similar across all engines because they are all fundamentally bounded by the same inverted index lookups.

Ingestion

| Engine | Method | Time (309K docs) | Throughput |
|---|---|---|---|
| Torque | TCP binary | 4.5s | 69K docs/s |
| Quickwit | HTTP NDJSON | 7.6s | 41K docs/s |
| Typesense | HTTP JSONL | 33.8s | 9K docs/s |
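The throughput and speedup figures follow directly from the reported times. A quick check, using the rounded 309K document count (so the results are approximate):

```python
DOCS = 309_000  # rounded document count from the benchmark description

timings_s = {"Torque": 4.5, "Quickwit": 7.6, "Typesense": 33.8}

# docs/s = documents / seconds
throughput = {engine: DOCS / seconds for engine, seconds in timings_s.items()}

# Ingest speedup is the ratio of wall-clock times.
speedup_vs_typesense = timings_s["Typesense"] / timings_s["Torque"]
```

Rounded to the nearest thousand, the throughputs come out at 69K, 41K, and 9K docs/s, and the time ratio rounds to 7.5x.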

Wikipedia scale (4.64M documents)

| Metric | Value |
|---|---|
| Import (TQBF binary) | 19s (244K docs/s) |
| Index build | ~3 minutes (streaming, disk-backed text) |
| Persist to disk | 2.5 GB |
| Restore from disk | 1.8s |
| Serving RSS | 3.1 GB (inverted index mmap'd, not on heap) |
| Search “barack obama” | 8,181 results in 3 ms |

Typesense v30.1 API compatible

Torque implements the Typesense v30.1 REST API for all search and CRUD operations. If you have an existing Typesense deployment, you can point your clients at Torque and they work without code changes.

Fully compatible: Collections, Documents, Search, Multi-search, Aliases, API Keys, Synonyms, Presets, Stopwords.

Torque-only extensions: TCP binary ingest, TQBF binary file import, collection suspend/resume, compaction, GPU acceleration, decay sorting, nested element-level AND filters, MCP server.
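In practice, compatibility means a Typesense-style search call should only need a different host. The sketch below builds such a request with the standard library, assuming the documented Typesense search endpoint and header; the collection name `articles` and the `title` and `body` fields are hypothetical, and the request is built but not sent.

```python
import urllib.parse
import urllib.request


def build_search_request(base_url, collection, api_key, **params):
    """Build (but do not send) a Typesense-style search request."""
    query = urllib.parse.urlencode(params)
    url = f"{base_url}/collections/{collection}/documents/search?{query}"
    return urllib.request.Request(url, headers={"X-TYPESENSE-API-KEY": api_key})


request = build_search_request(
    "http://localhost:8108",
    "articles",  # hypothetical collection name
    "YOUR_KEY",
    q="barack obama",
    query_by="title,body",  # hypothetical field names
    per_page=10,
)
# urllib.request.urlopen(request) would send it to a running server.
```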

Client SDKs

Five official SDKs, each with the full HTTP API, the TCP binary ingest protocol, and the TQBF binary file writer:

| Language | Package | Runtime |
|---|---|---|
| .NET 10+ | Torque.Http | Zero NuGet deps, source-generated JSON |
| Python 3.10+ | torque-http | Zero deps, stdlib only |
| Node.js 20+ | @truespar/torque-http | Zero deps, native fetch, TypeScript |
| Go 1.23+ | torque-sdk-go | Zero deps, stdlib only |
| Java 21+ | com.truespar:torque-http | Jackson 3.1, typed models |

Torque Studio

Torque ships with an embedded web UI — no separate install, no npm, no build step. Open http://localhost:8108 and you have a search editor with syntax highlighting, a schema browser, a data import panel, a benchmark dashboard, API key management, and full documentation.

MCP server

Torque includes a built-in MCP (Model Context Protocol) server at /api/mcp. AI assistants like Claude can search your collections, inspect schemas, and manage documents directly. Nine tools are available: search, multi-search, list collections, get schema, get document, create collection, upsert document, delete document, and delete collection.

// claude_desktop_config.json
{
  "mcpServers": {
    "torque": {
      "url": "http://localhost:8108/api/mcp",
      "headers": {
        "X-TYPESENSE-API-KEY": "your-api-key"
      }
    }
  }
}

What's in the box

  • Search. Full-text with BM25F, multi-field, prefix, typo tolerance (SymSpell), synonyms, presets, stopwords, highlighting.
  • Filtering. Exact match, numeric range, boolean, geo radius, nested object conditions, array element-level AND.
  • Sorting. Field, text relevance, geo distance, decay functions (Gaussian, linear, exponential).
  • Faceting. Per-field counts with min/max/sum/average stats.
  • Grouping. Group results by field with configurable per-group limits.
  • Vector search. HNSW with RaBitQ quantization (1-bit, no training). Hybrid text+vector via RRF or alpha-weighted interpolation.
  • Stemming. 17 languages: English, German, French, Spanish, Portuguese, Italian, Dutch, Swedish, Norwegian, Danish, Finnish, Russian, Arabic, Greek, Hungarian, Romanian, Turkish.
  • Auth. Bootstrap, admin, search-only, and HMAC-scoped API keys.
  • Persistence. Automatic, atomic, zero-config. Collections, indexes, overlays, schemas, keys, synonyms, presets, stopwords — all restored on restart.
  • Realtime mode. Instant upsert/delete with overlay persistence, auto-compaction, backpressure.
  • Operations. TLS/HTTPS, Prometheus metrics, IP allowlist, slow query logging, TOML config, env vars, Windows Service, Docker.
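The hybrid text+vector mode mentioned in the list above can fuse results with reciprocal rank fusion (RRF). Here is a minimal sketch of the standard formula, where each document's score is the sum of 1 / (k + rank) over the lists it appears in. The constant `k=60` is a commonly used default and an assumption here, not a documented Torque setting, and the document IDs are made up.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of doc IDs; earlier ranks contribute larger scores."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


text_hits = ["d3", "d1", "d7"]    # ranked by text relevance
vector_hits = ["d1", "d9", "d3"]  # ranked by vector similarity
fused = reciprocal_rank_fusion([text_hits, vector_hits])
```

Documents that appear in both lists (`d1`, `d3`) rise above those found by only one retriever, without needing the two score scales to be comparable.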

Getting started

# Download and run
torque-server --api-key YOUR_KEY --license-key YOUR_LICENSE

# Or Docker
docker run -p 8108:8108 -p 8109:8109 -v torque-data:/data truespar/torque

Open http://localhost:8108 for Studio. Read the docs for the full setup guide, or reach out if you want a demo.

What's next

We're working on Raft-based clustering for multi-node replication, JOINs across collections, conversational search, and continuing to push on raw query performance. Torque already runs as a unikernel on Silicon — the same 84 ms boot time, the same hardware isolation, applied to search.