Vector Functions
Vector functions provide comprehensive operations for vector embeddings, similarity search, and machine learning workflows. These functions are available in both SQL and Cypher queries.
Vector functions are also available with the legacy vectorXXX() naming convention for backward compatibility. Both vector.dimension() and vectorDimension() work identically.
Aggregation Functions
vector.sum()
Aggregate function: element-wise sum of vectors.
Syntax: vector.sum(<field>)
Returns: Vector - Sum of all vectors
SELECT vector.sum(embedding) FROM documents
vector.avg()
Aggregate function: element-wise average of vectors.
Syntax: vector.avg(<field>)
Returns: Vector - Average vector
SELECT vector.avg(embedding) FROM documents
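To make "element-wise" concrete, here is a minimal Python sketch of what the two aggregates compute over a set of rows. It is a reference for the arithmetic only, not the database's implementation; the function names `vector_sum` and `vector_avg` are hypothetical, and the dimension check is an assumption about how mismatched inputs might be handled.

```python
from typing import List, Sequence


def vector_sum(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise sum of equal-length vectors (hypothetical helper)."""
    if not vectors:
        return []
    dim = len(vectors[0])
    # Assumption: vectors of different dimensions are rejected.
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same dimension")
    # zip(*vectors) groups the i-th component of every vector together.
    return [sum(column) for column in zip(*vectors)]


def vector_avg(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise average: the element-wise sum divided by the row count."""
    total = vector_sum(vectors)
    return [x / len(vectors) for x in total]


embeddings = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(vector_sum(embeddings))  # [9.0, 12.0]
print(vector_avg(embeddings))  # [3.0, 4.0]
```

The average vector is often used as a centroid for a group of documents.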
Basic Operations
vector.dimension()
Returns the dimension of a vector (length of the underlying array).
Syntax: vector.dimension(<vector>)
Returns: Integer - Vector dimension
SELECT vector.dimension([1.0, 2.0, 3.0])
-- Returns: 3
RETURN vector.dimension([1.0, 2.0, 3.0]) AS dim
vector.add()
Returns element-wise sum of two vectors.
Syntax: vector.add(<vector1>, <vector2>)
Returns: Vector - Sum vector
SELECT vector.add([1.0, 2.0, 3.0], [2.0, 3.0, 4.0])
-- Returns: [3.0, 5.0, 7.0]
vector.subtract()
Returns element-wise difference of two vectors.
Syntax: vector.subtract(<vector1>, <vector2>)
Returns: Vector - Difference vector
SELECT vector.subtract([3.0, 5.0, 7.0], [1.0, 2.0, 3.0])
-- Returns: [2.0, 3.0, 4.0]
Nearest Neighbor Search
vector.neighbors()
Returns k nearest neighbors from a vector index.
Syntax:
vector.neighbors(<index-spec>, <query-vector>, <k>)
vector.neighbors(<index-spec>, <query-vector>, <k>, <efSearch>)
vector.neighbors(<index-spec>, <query-vector>, <k>, { <option>: <value>, ... })
Parameters:
- index-spec - Index specification as 'TypeName[propertyName]'
- query-vector - Query vector or key to look up
- k - Number of neighbors to return
- efSearch - (optional, positional) Search beam width. Controls the trade-off between recall and speed. Higher values improve recall but increase latency. When omitted, the index uses adaptive efSearch.
Options map (optional, alternative to the positional efSearch argument):
| Key | Type | Description |
|---|---|---|
| efSearch | Integer | Same semantics as the positional form. |
| filter | List of RIDs or RID strings | Restricts the search to the provided set of records. Useful to combine a vector search with a logical filter: first select the candidate RIDs, then pass them as filter. |
Returns: List - Nearest neighbors with distances
-- Default (adaptive efSearch)
SELECT vector.neighbors('Document[embedding]', [0.1, 0.2, 0.3], 5)
-- Explicit efSearch for higher recall (positional form, backward compatible)
SELECT vector.neighbors('Document[embedding]', [0.1, 0.2, 0.3], 5, 500)
-- Options map form (named, extensible)
SELECT vector.neighbors('Document[embedding]', [0.1, 0.2, 0.3], 5, { efSearch: 500 })
-- Combine a vector search with a logical filter on the same type
SELECT vector.neighbors(
'Document[embedding]',
[0.1, 0.2, 0.3],
10,
{ filter: (SELECT @rid FROM Document WHERE tenantId = 'acme' AND category = 'finance') }
)
-- Using a parameter binding for the RID list
SELECT vector.neighbors('Document[embedding]', :queryVector, 10, { efSearch: 300, filter: :allowedRids })
The options map rejects unknown keys with a descriptive error, which helps catch typos in option names.
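The semantics of combining a filter with a nearest-neighbor search can be sketched as an exact, brute-force search in Python. This is only an illustration of the intended result. The real function uses an approximate index rather than a full scan, and the cosine distance metric, the `filtered_neighbors` name, and the in-memory `records` dictionary are all assumptions made for the example.

```python
import math
from typing import Dict, Iterable, List, Optional, Sequence, Tuple


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """1 - cosine similarity. Assumes neither vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def filtered_neighbors(
    records: Dict[str, Sequence[float]],
    query: Sequence[float],
    k: int,
    allowed_rids: Optional[Iterable[str]] = None,
) -> List[Tuple[str, float]]:
    """Return up to k (rid, distance) pairs, restricted to allowed_rids if given."""
    allowed = None if allowed_rids is None else set(allowed_rids)
    candidates = (
        (rid, cosine_distance(vec, query))
        for rid, vec in records.items()
        if allowed is None or rid in allowed
    )
    # Smallest distance first, then keep the top k.
    return sorted(candidates, key=lambda pair: pair[1])[:k]


records = {"#1:0": [1.0, 0.0], "#1:1": [0.0, 1.0], "#1:2": [1.0, 1.0]}
# "#1:0" is the closest record overall, but the filter excludes it.
print(filtered_neighbors(records, [1.0, 0.0], 2, allowed_rids=["#1:1", "#1:2"]))
```

Only records in the allowed set can appear in the result, so fewer than k neighbors may be returned when the filter is very selective.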
Cypher Usage:
The recommended way to use vector search from Cypher is via CALL, which uses the HNSW index for fast approximate nearest neighbor search:
// ArcadeDB-native syntax
CALL vector.neighbors('Document[embedding]', $queryVector, 10)
YIELD name, distance
RETURN name, distance
ORDER BY distance
// Neo4j-compatible syntax (returns node + score)
CALL db.index.vector.queryNodes('Document[embedding]', 10, $queryVector)
YIELD node, score
RETURN node.title AS title, score
ORDER BY score DESC
db.index.vector.queryNodes() returns score (cosine similarity, 1.0 = identical), while vector.neighbors() returns distance (0.0 = identical). The relationship is score = 1 - distance.
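When results from the two procedures are mixed in application code, the relationship score = 1 - distance can be applied directly. The sketch below uses hypothetical row data and helper names; it only shows the conversion and the change in sort direction.

```python
from typing import List, Tuple


def distance_to_score(distance: float) -> float:
    """Convert a distance (0.0 = identical) to a score (1.0 = identical)."""
    return 1.0 - distance


def score_to_distance(score: float) -> float:
    """Inverse conversion."""
    return 1.0 - score


# Hypothetical (name, distance) rows as a distance-based search might return them.
rows: List[Tuple[str, float]] = [("doc-a", 0.25), ("doc-b", 0.0), ("doc-c", 0.5)]

# Distances sort ascending; scores sort descending.
by_score = sorted(
    ((name, distance_to_score(d)) for name, d in rows),
    key=lambda row: row[1],
    reverse=True,
)
print(by_score)  # [('doc-b', 1.0), ('doc-a', 0.75), ('doc-c', 0.5)]
```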
Type-Specific Search with Inheritance:
When a vector index exists on a parent type, you can search specific child types:
-- Search only in EMBEDDING_IMAGE records
SELECT vector.neighbors('EMBEDDING_IMAGE[vector]', $queryVector, 10)
-- Search across all types (parent + children)
SELECT vector.neighbors('EMBEDDING[vector]', $queryVector, 10)
Normalization and Norms
vector.normalize()
Normalizes vector to unit length (L2 norm = 1.0).
Syntax: vector.normalize(<vector>)
Returns: Vector - Normalized vector
SELECT vector.normalize([3.0, 4.0])
-- Returns: [0.6, 0.8]
vector.isnormalized()
Checks if vector is normalized (L2 norm ≈ 1.0).
Syntax: vector.isnormalized(<vector>, [tolerance])
Returns: Boolean - true if normalized
SELECT vector.isnormalized([0.6, 0.8])
-- Returns: true
vector.magnitude()
Computes L2 norm (Euclidean length) of vector.
Syntax: vector.magnitude(<vector>)
Returns: Double - L2 norm
SELECT vector.magnitude([3.0, 4.0])
-- Returns: 5.0
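The three functions in this section are closely related, as this Python sketch of the underlying arithmetic shows. The helper names, the default tolerance of 1e-6, and the error raised for a zero vector are assumptions made for the example; the documentation above does not specify them.

```python
import math
from typing import List, Sequence


def magnitude(v: Sequence[float]) -> float:
    """L2 norm: square root of the sum of squared components."""
    return math.sqrt(sum(x * x for x in v))


def normalize(v: Sequence[float]) -> List[float]:
    """Scale v so that its L2 norm becomes 1.0."""
    norm = magnitude(v)
    if norm == 0.0:
        # Assumption: a zero vector has no direction and cannot be normalized.
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def is_normalized(v: Sequence[float], tolerance: float = 1e-6) -> bool:
    """True when the L2 norm is within `tolerance` of 1.0 (default is assumed)."""
    return abs(magnitude(v) - 1.0) <= tolerance


print(magnitude([3.0, 4.0]))      # 5.0
print(normalize([3.0, 4.0]))      # [0.6, 0.8]
print(is_normalized([0.6, 0.8]))  # True
```

The tolerance exists because floating-point rounding rarely leaves the norm at exactly 1.0 after normalization.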
Quantization
vector.quantizeint8()
Quantizes vector to 8-bit integers using min-max scaling.
Syntax: vector.quantizeint8(<vector>)
Returns: ByteArray - Quantized vector
SELECT vector.quantizeint8([0.1, 0.5, 0.9])
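Min-max scaling maps the smallest component to the bottom of the integer range and the largest to the top. The Python sketch below shows one common variant. The signed range [-128, 127], the rounding rule, and the handling of constant vectors are assumptions for illustration; the actual byte layout used by vector.quantizeint8() is not described here.

```python
from typing import List, Sequence


def quantize_int8(v: Sequence[float]) -> List[int]:
    """Min-max quantization to signed 8-bit integers (illustrative variant)."""
    lo, hi = min(v), max(v)
    if hi == lo:
        # Assumption: a constant vector carries no range, so map it to zeros.
        return [0] * len(v)
    scale = 255.0 / (hi - lo)  # 256 levels span the [lo, hi] interval
    # Shift to start at 0, scale to 0..255, round, then offset to -128..127.
    return [int(round((x - lo) * scale)) - 128 for x in v]


print(quantize_int8([0.0, 0.25, 1.0]))  # [-128, -64, 127]
```

Because the mapping is monotonic, the relative order of components is preserved, while the absolute values can only be recovered approximately and only if the minimum and maximum are kept alongside the bytes.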
Scoring and Fusion
vector.hybridscore()
Computes weighted average of two scores.
Syntax: vector.hybridscore(<score1>, <score2>, <alpha>)
Returns: Double - Weighted average
SELECT vector.hybridscore(0.8, 0.6, 0.7)
-- Returns: 0.7 * 0.8 + 0.3 * 0.6 = 0.74
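The weighted average in the example above follows the formula alpha * score1 + (1 - alpha) * score2. A minimal Python sketch, with a hypothetical function name and an assumed validation of alpha to the range [0, 1]:

```python
def hybrid_score(score1: float, score2: float, alpha: float) -> float:
    """Weighted average: alpha weights score1, (1 - alpha) weights score2."""
    # Assumption: alpha outside [0, 1] is treated as an error.
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    return alpha * score1 + (1.0 - alpha) * score2


print(round(hybrid_score(0.8, 0.6, 0.7), 10))  # 0.74
```

With alpha = 1.0 the result is score1 alone, and with alpha = 0.0 it is score2 alone, which makes alpha a convenient knob for balancing, for example, a vector score against a keyword score.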
vector.multiscore()
Combines multiple scores using a fusion method.
Syntax: vector.multiscore(<scores>, <method>, [weights])
Methods:
- 'MAX' - Maximum score (ColBERT style)
- 'AVG' - Arithmetic average
- 'MIN' - Minimum score
- 'WEIGHTED' - Weighted average (requires weights)
Returns: Double - Combined score
SELECT vector.multiscore([0.9, 0.7, 0.8], 'MAX')
-- Returns: 0.9
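The four fusion methods can be sketched in Python as follows. The function name is hypothetical, and normalizing the weighted average by the sum of the weights is an assumption; the documentation above does not say whether weights must already sum to 1.

```python
from typing import Optional, Sequence


def multi_score(
    scores: Sequence[float],
    method: str,
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Combine several scores into one using the named fusion method."""
    method = method.upper()
    if method == "MAX":
        return max(scores)
    if method == "MIN":
        return min(scores)
    if method == "AVG":
        return sum(scores) / len(scores)
    if method == "WEIGHTED":
        if weights is None or len(weights) != len(scores):
            raise ValueError("WEIGHTED requires one weight per score")
        # Assumption: divide by the total weight so weights need not sum to 1.
        return sum(s * w for s, w in zip(scores, weights)) / sum(weights)
    raise ValueError(f"unknown method: {method}")


print(multi_score([0.9, 0.7, 0.8], "MAX"))               # 0.9
print(multi_score([1.0, 0.0], "WEIGHTED", [3.0, 1.0]))   # 0.75
```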
vector.rrfscore()
Computes Reciprocal Rank Fusion (RRF) for combining multiple rankings.
Syntax: vector.rrfscore(<rank1>, <rank2>, …, [k])
Parameters:
- k - Center rank constant (default: 60)
Returns: Double - RRF score
SELECT vector.rrfscore(1, 2, 4, 60)
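Reciprocal Rank Fusion is usually defined as the sum of 1 / (k + rank) over all rankings in which an item appears. The Python sketch below assumes that standard definition and 1-based ranks; the function name is hypothetical, and the ranks are passed as a list rather than as variadic arguments.

```python
from typing import Sequence


def rrf_score(ranks: Sequence[int], k: int = 60) -> float:
    """Standard RRF: sum of 1 / (k + rank) over every ranking (1-based ranks)."""
    return sum(1.0 / (k + rank) for rank in ranks)


# An item ranked 1st, 2nd and 4th by three different retrievers.
print(rrf_score([1, 2, 4], k=60))  # 1/61 + 1/62 + 1/64, roughly 0.0481
```

A larger k flattens the difference between high and low ranks, so no single ranking dominates the fused score.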
vector.normalizescores()
Normalizes scores to [0, 1] range using min-max normalization.
Syntax: vector.normalizescores(<scores>)
Returns: Vector - Normalized scores
SELECT vector.normalizescores([1.0, 2.0, 3.0])
-- Returns: [0.0, 0.5, 1.0]
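Min-max normalization rescales each score as (score - min) / (max - min). A minimal Python sketch with a hypothetical function name; returning zeros when all scores are equal is an assumption, since the formula is undefined in that case.

```python
from typing import List, Sequence


def normalize_scores(scores: Sequence[float]) -> List[float]:
    """Rescale scores linearly so the minimum maps to 0.0 and the maximum to 1.0."""
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Assumption: identical scores all map to 0.0 to avoid dividing by zero.
        return [0.0] * len(scores)
    return [(s - lo) / (hi - lo) for s in scores]


print(normalize_scores([1.0, 2.0, 3.0]))  # [0.0, 0.5, 1.0]
```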
vector.scoretransform()
Transforms scores using various functions.
Syntax: vector.scoretransform(<score>, <method>)
Methods:
- 'LINEAR' - No transformation
- 'SIGMOID' - Logistic function
- 'LOG' - Natural logarithm
- 'EXP' - Exponential function
Returns: Double - Transformed score
SELECT vector.scoretransform(0.5, 'SIGMOID')
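The four transforms correspond to standard mathematical functions, sketched here in Python. The function name is hypothetical, and the behavior for non-positive input to 'LOG' (a ValueError raised by Python's math.log) is specific to this sketch.

```python
import math


def score_transform(score: float, method: str) -> float:
    """Apply the named transform to a single score."""
    method = method.upper()
    if method == "LINEAR":
        return score
    if method == "SIGMOID":
        # Logistic function: maps any real number into (0, 1).
        return 1.0 / (1.0 + math.exp(-score))
    if method == "LOG":
        # Natural logarithm; only defined for score > 0.
        return math.log(score)
    if method == "EXP":
        return math.exp(score)
    raise ValueError(f"unknown method: {method}")


print(score_transform(0.0, "SIGMOID"))  # 0.5
print(score_transform(0.5, "SIGMOID"))  # roughly 0.6225
```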
Similarity and Distance
vector.dotproduct()
Computes dot product (inner product) between two vectors.
Syntax: vector.dotproduct(<vector1>, <vector2>)
Returns: Double - Dot product
SELECT vector.dotproduct([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
-- Returns: 32.0
vector.cosinesimilarity()
Computes cosine similarity between two vectors. Returns value between -1 and 1.
Syntax: vector.cosinesimilarity(<vector1>, <vector2>)
Returns: Double - Cosine similarity (-1 to 1)
SELECT vector.cosinesimilarity([1.0, 0.0], [1.0, 0.0])
-- Returns: 1.0 (identical direction)
SELECT vector.cosinesimilarity([1.0, 0.0], [0.0, 1.0])
-- Returns: 0.0 (orthogonal)
vector.l2distance()
Computes L2 distance (Euclidean distance) between two vectors.
Syntax: vector.l2distance(<vector1>, <vector2>)
Returns: Double - Euclidean distance
SELECT vector.l2distance([0.0, 0.0], [3.0, 4.0])
-- Returns: 5.0
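For reference, the three measures above can be written out in a few lines of Python. The helper names are hypothetical, and the dimension check and the lack of zero-vector handling are choices made for this sketch only.

```python
import math
from typing import Sequence


def dot_product(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of the products of matching components."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product divided by the product of the L2 norms (non-zero vectors only)."""
    norm_a = math.sqrt(dot_product(a, a))
    norm_b = math.sqrt(dot_product(b, b))
    return dot_product(a, b) / (norm_a * norm_b)


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance: L2 norm of the component-wise difference."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


print(dot_product([1.0, 2.0, 3.0], [4.0, 5.0, 6.0]))  # 32.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))      # 0.0
print(l2_distance([0.0, 0.0], [3.0, 4.0]))            # 5.0
```

For unit-length vectors the dot product and the cosine similarity coincide, which is one reason to normalize embeddings before indexing.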
vector.approxdistance()
Computes approximate distance between quantized vectors without full dequantization.
Syntax: vector.approxdistance(<quantized1>, <quantized2>, <mode>)
Modes:
- 'INT8' - Faster than floats, preserves ranking order
- 'BINARY' - Very fast Hamming distance, 8x fewer operations
Returns: Double - Approximate distance
SELECT vector.approxdistance(
vector.quantizeint8([1.0, 2.0, 3.0]),
vector.quantizeint8([1.0, 3.0, 3.0]),
'INT8'
)
Sparse Vectors
vector.densetosparse()
Converts dense vector to sparse representation.
Syntax: vector.densetosparse(<vector>, [threshold])
Parameters:
- threshold - Values below this are considered zero (default: 0.0)
Returns: SparseVector - Sparse representation
SELECT vector.densetosparse([0.5, 0.0, 0.1], 0.2)
-- Only keeps elements >= 0.2
vector.sparsetodense()
Converts sparse vector back to dense representation.
Syntax: vector.sparsetodense(<sparsevector>)
Returns: Vector - Dense vector
SELECT vector.sparsetodense(vector.sparsecreate([0, 2], [0.5, 0.3]))
vector.sparsecreate()
Creates sparse vector from indices and values.
Syntax: vector.sparsecreate(<indices>, <values>, [dimension])
Returns: SparseVector - Sparse vector
SELECT vector.sparsecreate([0, 2, 5], [0.5, 0.3, 0.8], 7)
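A sparse vector stores only the positions and values of its non-zero components. The Python sketch below models the three functions in this section with a hypothetical `SparseVector` tuple. Several details are assumptions for illustration: the default dimension is taken as the largest index plus one, the threshold is compared against the absolute value, and exact zeros are always dropped.

```python
from typing import List, NamedTuple, Optional, Sequence


class SparseVector(NamedTuple):
    indices: List[int]    # positions of the stored components
    values: List[float]   # values at those positions
    dimension: int        # length of the equivalent dense vector


def sparse_create(
    indices: Sequence[int],
    values: Sequence[float],
    dimension: Optional[int] = None,
) -> SparseVector:
    if len(indices) != len(values):
        raise ValueError("indices and values must have the same length")
    if dimension is None:
        # Assumption: infer the smallest dimension that fits every index.
        dimension = max(indices) + 1 if indices else 0
    return SparseVector(list(indices), list(values), dimension)


def sparse_to_dense(sv: SparseVector) -> List[float]:
    dense = [0.0] * sv.dimension
    for i, x in zip(sv.indices, sv.values):
        dense[i] = x
    return dense


def dense_to_sparse(v: Sequence[float], threshold: float = 0.0) -> SparseVector:
    # Assumption: keep components that are non-zero and at least `threshold`
    # in absolute value.
    kept = [(i, x) for i, x in enumerate(v) if x != 0.0 and abs(x) >= threshold]
    return SparseVector([i for i, _ in kept], [x for _, x in kept], len(v))


print(dense_to_sparse([0.5, 0.0, 0.1], 0.2))
print(sparse_to_dense(sparse_create([0, 2], [0.5, 0.3])))  # [0.5, 0.0, 0.3]
```

Note that converting to sparse with a non-zero threshold is lossy: components dropped by the threshold come back as 0.0 when converted to dense again.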
Statistics
vector.variance()
Computes variance of vector elements.
Syntax: vector.variance(<vector>)
Returns: Double - Variance
SELECT vector.variance([1.0, 2.0, 3.0])
vector.stddev()
Computes standard deviation of vector elements.
Syntax: vector.stddev(<vector>)
Returns: Double - Standard deviation
SELECT vector.stddev([1.0, 2.0, 3.0])
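The examples above do not state their results, and the value depends on whether the population formula (divide by n) or the sample formula (divide by n - 1) is used. The Python sketch below computes both for the same input using the standard `statistics` module, so the output of a given installation can be compared against them. Which of the two the database functions implement is not specified here.

```python
import statistics

values = [1.0, 2.0, 3.0]

# Population estimators divide the sum of squared deviations by n.
population_variance = statistics.pvariance(values)
population_stddev = statistics.pstdev(values)

# Sample estimators divide by n - 1 (Bessel's correction).
sample_variance = statistics.variance(values)
sample_stddev = statistics.stdev(values)

print(population_variance, population_stddev)  # roughly 0.6667 and 0.8165
print(sample_variance, sample_stddev)          # 1.0 and 1.0
```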
Utility Functions
vector.tostring()
Converts vector to string representation.
Syntax: vector.tostring(<vector>, [format])
Formats:
- 'COMPACT' - Single line [1.0, 2.0, 3.0] (default)
- 'PRETTY' - Multi-line with formatting
- 'PYTHON' - Python list format
- 'MATLAB' - MATLAB format
Returns: String - Formatted vector
SELECT vector.tostring([0.5, 0.25, 0.75], 'PYTHON')