Vector

ArcadeDB includes a native vector search engine for similarity-based retrieval of embeddings. Vector indexes are fully integrated into the SQL query engine and support ACID transactions, persistent storage, and automatic compaction.

How Vector Search Works

Vector search finds the nearest neighbors to a query vector in high-dimensional space. Instead of exact matching (like SQL WHERE), it finds the most similar items based on a distance or similarity metric.

Typical workflow:

  1. Generate embeddings from your data using an external model (OpenAI, Sentence Transformers, etc.)

  2. Store embeddings as vector properties on vertices or documents (see the sketch after this list)

  3. Create a vector index on the property

  4. Query with vectorNeighbors() to find the k most similar items
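
For example, step 2 stores an externally generated embedding like any other array property (a minimal sketch; the four literal values stand in for real model output, which typically has hundreds of components):

-- Store an embedding produced by an external model (values shortened for illustration)
INSERT INTO Document SET content = 'ArcadeDB supports vector search',
                         embedding = [0.12, -0.48, 0.33, 0.91]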

LSMVectorIndex Architecture

ArcadeDB’s vector index is built on two foundations:

  • LSM Tree storage — ArcadeDB’s proven LSM Tree architecture provides persistent, crash-safe storage with automatic compaction

  • JVector 4.0.0 — A high-performance vector search library that implements both HNSW (Hierarchical Navigable Small World) and Vamana (DiskANN) graph algorithms

The index stores vectors as a navigable graph where each node connects to its approximate nearest neighbors. Searches traverse this graph, narrowing in on the closest matches efficiently — typically in O(log n) time rather than O(n) brute-force scanning.

Flat vs Hierarchical Structure

The index supports two graph structures:

| Aspect | Flat (default) | Hierarchical |
|---|---|---|
| Algorithm | Single-layer Vamana graph | Multi-layer HNSW with exponential decay |
| Build speed | Faster | 10-20% slower |
| Disk usage | Baseline | 5-15% larger |
| Best for | < 100K vectors, well-clustered data | 100K+ vectors, 1024+ dimensions, diverse queries |

Enable hierarchical mode with addHierarchy: true in the index metadata.
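
For example (a sketch; the dimension count is illustrative):

CREATE INDEX ON Doc (embedding) LSM_VECTOR METADATA {
  dimensions: 1024,
  similarity: 'COSINE',
  addHierarchy: true
}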

Similarity Functions

Three distance metrics are available:

| Function | When to Use | Value Range |
|---|---|---|
| COSINE (default) | Text embeddings (BERT, GPT, Sentence Transformers). Direction matters, magnitude does not. | -1 to 1 |
| DOT_PRODUCT | Normalized vectors where speed matters. 10-15% faster than COSINE. | Unbounded |
| EUCLIDEAN | Spatial data, point clouds, continuous measurements. Absolute distance matters. | 0 to infinity |

If your embeddings are already L2-normalized (unit vectors), use DOT_PRODUCT for best performance — it produces the same ranking as COSINE but skips the normalization step.
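
If so, creating the index might look like this (a sketch; it assumes the application L2-normalizes each vector before insert):

-- Same ranking as COSINE on unit vectors, minus the normalization work
CREATE INDEX ON Doc (embedding) LSM_VECTOR METADATA {
  dimensions: 384,
  similarity: 'DOT_PRODUCT'
}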

Quantization

Quantization reduces memory usage by compressing vector components at the cost of slight accuracy loss:

| Type | Memory Reduction | Speed | Recall | Use Case |
|---|---|---|---|---|
| NONE | Baseline | Baseline | 100% | Small datasets (< 10K vectors), maximum accuracy |
| INT8 (recommended) | 4x (75% savings) | 10-15% faster | 95-98% | Best balance of speed and accuracy for most workloads |
| BINARY | 32x (97% savings) | 15-20% faster | 85-92% | Massive datasets, approximate search with reranking |
| PRODUCT | 16-64x | Approximate | Varies | Very large datasets (100K+), enables zero-disk-I/O graph construction |

Use INT8 quantization for most use cases. It provides 4x memory savings with minimal accuracy loss and significantly faster search. Only use NONE for very small datasets where maximum precision matters.
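
For example (a sketch mirroring the SQL Example below, with compression enabled):

CREATE INDEX ON Document (embedding) LSM_VECTOR METADATA {
  dimensions: 384,
  similarity: 'COSINE',
  quantization: 'INT8'
}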

Why INT8 is faster

Quantization doesn’t just save memory — it fundamentally changes how vectors are read during search.

Without quantization (NONE), each node visited during graph traversal requires a full document lookup: read the record from disk, deserialize it, and extract the vector property. With INT8, vectors are stored in compact contiguous index pages and read directly — no document deserialization needed.

In benchmarks with 500K 384-dimensional vectors (matching the all-MiniLM-L6-v2 embedding model), INT8 reduces search latency by 2.5x compared to NONE:

| Quantization | Mean latency | p95 latency | Vector fetch path |
|---|---|---|---|
| NONE | 3.50 ms | 4.36 ms | Document lookup (random I/O) |
| INT8 | 1.59 ms | 1.94 ms | Index pages (sequential I/O) |

The difference becomes even more significant under memory pressure. With NONE quantization, vector data is 4x larger, evicting more data from memory caches and forcing real disk I/O. INT8 keeps the working set small enough to stay in memory even with constrained resources.

Quantization is transparent — queries work identically regardless of quantization setting. The index automatically quantizes on insert and dequantizes on retrieval.

When using PRODUCT quantization, the graph build uses Product Quantization scores instead of exact vector distances. PQ codes are compact and stay in memory, eliminating disk I/O during graph construction. This is most effective on large datasets (100K+ vectors) where PQ quality is sufficient.
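
For example (a sketch; the dimension count is illustrative):

-- PQ codes stay in memory during the build, avoiding disk I/O
CREATE INDEX ON Document (embedding) LSM_VECTOR METADATA {
  dimensions: 768,
  quantization: 'PRODUCT'
}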

Table 1. Memory and performance example: 100K vectors with 384 dimensions

| Quantization | Vector memory | Search speed |
|---|---|---|
| NONE | 156 MB | Baseline |
| INT8 | 39 MB | ~2.5x faster |
| BINARY | 5 MB | ~3x faster (lower recall) |

Key Parameters

| Parameter | Default | Purpose |
|---|---|---|
| dimensions | (required) | Must match your embedding model output size |
| similarity | COSINE | Distance metric: COSINE, DOT_PRODUCT, or EUCLIDEAN |
| quantization | NONE | Compression: NONE, INT8, BINARY, or PRODUCT. INT8 is recommended for most use cases. |
| efSearch | adaptive | Search beam width at query time. Controls recall vs speed trade-off. See efSearch and Adaptive Search. |
| maxConnections | 16 | Connections per node. Higher = better recall, more memory |
| beamWidth | 100 | Search depth during build. Higher = better index quality, slower builds |
| addHierarchy | false | Enable multi-layer HNSW for large/complex datasets |
| storeVectorsInGraph | false | Co-locate vectors in the graph file for faster retrieval at large scale |
| buildGraphNow | true | Build the HNSW graph immediately at index creation time. Set to false for deferred (lazy) building on first search. |
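
Several of these can be combined in one definition (a sketch; the values are illustrative, not universal recommendations):

CREATE INDEX ON Document (embedding) LSM_VECTOR METADATA {
  dimensions: 1024,
  similarity: 'COSINE',
  quantization: 'INT8',
  maxConnections: 32,
  beamWidth: 200,
  addHierarchy: true
}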

efSearch and Adaptive Search

The efSearch parameter controls how many candidate nodes the search explores in the vector graph. Higher values find more accurate results but take longer.

When efSearch is not explicitly set (either on the index or per-query), ArcadeDB uses an adaptive two-pass strategy:

  1. First pass — Uses a moderate beam width (2 × k), which is sufficient for most queries on well-clustered data.

  2. Second pass — If the first pass returns insufficient results, the search automatically widens the beam to 10 × k.

For small indexes (< 10K vectors), the full default efSearch is always used since the cost is negligible.

This adaptive behavior gives you fast queries on easy lookups while still maintaining recall on harder queries — without requiring any tuning.

Setting efSearch

You can set efSearch at three levels:

Per-query (highest priority) — pass as the 4th argument to vectorNeighbors(), either positionally or via the named options map:

-- Higher efSearch for a critical query that needs maximum recall (positional)
SELECT expand(vectorNeighbors('Doc[embedding]', [...], 10, 500))

-- Lower efSearch for a latency-sensitive query (positional)
SELECT expand(vectorNeighbors('Doc[embedding]', [...], 10, 30))

-- Options map form (extensible; also supports `filter`)
SELECT expand(vectorNeighbors('Doc[embedding]', [...], 10, { efSearch: 500 }))

Per-index — set in the index metadata at creation time:

CREATE INDEX ON Doc (embedding) LSM_VECTOR METADATA {
  dimensions: 1024,
  similarity: 'COSINE',
  efSearch: 200
}

Adaptive (default) — when neither per-query nor per-index efSearch is specified, the adaptive strategy described above is used.

For most workloads, the adaptive default works well. Only set efSearch explicitly if you need consistently high recall regardless of query difficulty, or if you have strict latency requirements.

Filtered Search

Vector search can be combined with a logical filter on the same type by passing a filter option containing the allowed RIDs. The HNSW traversal restricts itself to that set, so non-matching vectors are skipped without decoding.

-- Find the 10 most similar documents within a specific tenant and category
SELECT vectorNeighbors(
         'Document[embedding]',
         :queryVector,
         10,
         { filter: (SELECT @rid FROM Document WHERE tenantId = 'acme' AND category = 'finance') }
       )

The filter value accepts a list of RIDs, RID strings, or any Identifiable. It can be produced by a subquery, a query parameter, or built programmatically.
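
Building the filter programmatically might look like this in Java (a sketch; it assumes an open Database db, a float[] queryVector, and the usual java.util and com.arcadedb.query.sql.executor imports):

// Collect the allowed RIDs with a prior query, then pass them as the filter option
final List<Object> allowed = new ArrayList<>();
try (final ResultSet rids = db.query("sql",
    "SELECT @rid FROM Document WHERE tenantId = 'acme'")) {
  while (rids.hasNext())
    allowed.add(rids.next().getProperty("@rid"));
}

// Named parameters are passed as a map; the filter accepts the RID list directly
try (final ResultSet neighbors = db.query("sql",
    "SELECT expand(vectorNeighbors('Document[embedding]', :vector, 10, { filter: :rids }))",
    Map.of("vector", queryVector, "rids", allowed))) {
  while (neighbors.hasNext())
    System.out.println(neighbors.next());
}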

Very selective filters (only a tiny fraction of records match) can starve the HNSW beam; combine filter with a higher efSearch to preserve recall.
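
For example (a sketch; the predicate is illustrative):

-- Very selective filter: widen the beam to keep recall high
SELECT expand(vectorNeighbors(
         'Document[embedding]',
         :queryVector,
         10,
         { efSearch: 500,
           filter: (SELECT @rid FROM Document WHERE tenantId = 'acme' AND archived = true) }))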

Multiple Vector Indexes

A single vertex type can have multiple vector indexes on different properties:

CREATE INDEX ON Product (imageEmbedding) LSM_VECTOR METADATA {dimensions: 512, similarity: 'COSINE'}
CREATE INDEX ON Product (textEmbedding)  LSM_VECTOR METADATA {dimensions: 768, similarity: 'COSINE'}

Query each index independently to search by image similarity, text similarity, or combine scores.
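
For example (a sketch; the query vectors are supplied as parameters):

-- Search by image similarity
SELECT expand(vectorNeighbors('Product[imageEmbedding]', :imageVector, 10));

-- Search by text similarity
SELECT expand(vectorNeighbors('Product[textEmbedding]',  :textVector, 10));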

Integration with Other Models

Vector search combines naturally with ArcadeDB’s other data models:

  • Graph + Vectors — Find similar items, then traverse relationships to discover connected context (Graph RAG pattern; see the sketch after this list)

  • Full-text + Vectors — Hybrid search combining keyword matching with semantic similarity (Knowledge Graph pattern)

  • Time Series + Vectors — Detect behavioral anomalies by comparing embedding patterns over time (Fraud Detection pattern)
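
As a minimal Graph RAG sketch (the Cites edge type and the record-property traversal are assumptions; the row shape follows the SQL Example below): retrieve the most similar documents, then expand to their connected context:

-- Find the 5 nearest documents, then traverse their citation edges for context
SELECT expand(record.both('Cites')) FROM (
  SELECT expand(vectorNeighbors('Document[embedding]', :queryVector, 5))
)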

SQL Example

Create a vector index and query it:

-- Create vertex type and property
CREATE VERTEX TYPE Document;
CREATE PROPERTY Document.content STRING;
CREATE PROPERTY Document.embedding ARRAY_OF_FLOATS;

-- Create vector index with 384 dimensions using COSINE similarity
CREATE INDEX ON Document (embedding) LSM_VECTOR METADATA {
  dimensions: 384,
  similarity: 'COSINE'
};

-- Query for the 10 nearest documents
-- Returns rows with .record (full document) and .distance (0 = identical for COSINE)
SELECT expand(vectorNeighbors('Document[embedding]', $queryVector, 10))

Java Example

Create and query a vector index programmatically:

import com.arcadedb.index.lsm.LSMVectorIndex;
import com.arcadedb.index.lsm.LSMVectorIndexBuilder;
import com.arcadedb.index.vector.VectorSimilarityFunction;
import com.arcadedb.query.sql.executor.ResultSet;

// Create index programmatically
final LSMVectorIndexBuilder builder = new LSMVectorIndexBuilder(
    database,
    "Document",
    new String[]{"embedding"})
    .withDimensions(384)
    .withSimilarity(VectorSimilarityFunction.COSINE)
    .withMaxConnections(16)
    .withBeamWidth(100);

final LSMVectorIndex index = builder.create();

// Query the index using SQL
final ResultSet resultSet = database.query("sql",
    "SELECT expand(vectorNeighbors('Document[embedding]', ?, 10))",
    queryVector);
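
Iterating the result set might look like this (a sketch; as in the SQL example above, each expanded row carries a record and a distance property — the cast and the Result and Document imports from com.arcadedb are assumptions):

// Each row exposes the matched record and its distance to the query vector
while (resultSet.hasNext()) {
  final Result row = resultSet.next();
  final Document doc = (Document) row.getProperty("record");    // the matched document
  final float distance = ((Number) row.getProperty("distance")).floatValue();
  System.out.println(doc.getString("content") + " -> " + distance);
}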

Configuration Parameters

When creating LSMVectorIndex instances, the following parameters can be configured:

  • dimensions: The dimensionality of the vectors (must match your embedding model output)

  • similarity: The distance function for similarity calculation (COSINE, DOT_PRODUCT, or EUCLIDEAN)

  • maxConnections: Maximum number of connections per layer in the HNSW graph (default: 16, increase for better recall)

  • beamWidth: Beam width used while constructing the graph (default: 100, increase for better index quality at the cost of slower builds)

Supported Similarity Functions

| Measure | Name | Type |
|---|---|---|
| COSINE | Cosine Similarity | L2 |
| DOT_PRODUCT | Inner Product | L2 |
| EUCLIDEAN | Euclidean Distance | L2 |

For more information on vector embeddings, see the Vector Embeddings section.

Further Reading