Documentation
Overview
Package vecgo provides a high-performance embedded vector database for Go.
Vecgo is an embeddable, hybrid vector database designed for production workloads. It combines commit-oriented durability with HNSW + DiskANN indexing for best-in-class performance.
Quick Start
Local mode:
ctx := context.Background()
db, _ := vecgo.Open(ctx, vecgo.Local("./data"), vecgo.Create(128, vecgo.MetricL2)) // create a new index
db, _ := vecgo.Open(ctx, vecgo.Local("./data")) // re-open existing
Cloud mode:
s3Store, _ := s3.New(ctx, "my-bucket", s3.WithPrefix("vectors/"))
db, _ := vecgo.Open(ctx, vecgo.Remote(s3Store)) // cache in the default temp directory
db, _ := vecgo.Open(ctx, vecgo.Remote(s3Store), vecgo.WithCacheDir("/fast/nvme")) // or cache in an explicit local directory
Insert Modes
Vecgo provides three insert modes optimized for different workloads:
// 1. SINGLE INSERT: real-time updates (~625 vec/s, 768 dim)
// Vectors are HNSW-indexed and searchable immediately.
id, _ := db.Insert(ctx, vector, metadata, payload)

// 2. BATCH INSERT: indexed batch (~3,000 vec/s at batch=100)
// All vectors are HNSW-indexed and searchable immediately.
ids, _ := db.BatchInsert(ctx, vectors, metadatas, payloads)

// 3. DEFERRED INSERT: bulk loading (~2,000,000 vec/s!)
// Vectors are NOT indexed into HNSW, stored only.
// Searchable after Commit() triggers flush to DiskANN segment.
ids, _ := db.BatchInsertDeferred(ctx, vectors, metadatas, payloads)
db.Commit(ctx) // Now searchable via DiskANN
Use Deferred mode for initial data loading, migrations, or nightly reindex. Use Single/Batch mode for real-time RAG where immediate search is required.
Search with Data
By default, search returns IDs, scores, metadata, and payload:
results, _ := db.Search(ctx, query, 10)
for _, r := range results {
fmt.Println(r.ID, r.Score, r.Metadata, r.Payload)
}
For minimal results (IDs + scores only), use WithoutData():
results, _ := db.Search(ctx, query, 10, vecgo.WithoutData())
Durability Model
Vecgo uses commit-oriented durability (append-only versioned commits):
db.Insert(ctx, vec, nil, nil) // buffered in memory
db.Commit(ctx)                // durable after this
Key Features
- HNSW + DiskANN hybrid indexing
- Commit-oriented durability (no WAL complexity)
- Full quantization suite (PQ, OPQ, SQ, BQ, RaBitQ, INT4)
- Cloud-native storage (S3/GCS/Azure via BlobStore)
- Time-travel queries
- Hybrid search (BM25 + vectors with RRF fusion)
- SIMD optimized (AVX-512/AVX2/NEON/SVE2)
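The hybrid search feature above names RRF (Reciprocal Rank Fusion) as the way lexical and vector rankings are merged. The following is a minimal, generic sketch of RRF, not Vecgo's implementation: the function name rrf, the constant k = 60, and the sample rankings are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// rrf fuses ranked ID lists with Reciprocal Rank Fusion:
// score(id) = sum over lists of 1 / (k + rank), with rank starting at 1.
func rrf(k float64, lists ...[]uint64) []uint64 {
	scores := make(map[uint64]float64)
	for _, list := range lists {
		for i, id := range list {
			scores[id] += 1.0 / (k + float64(i+1))
		}
	}
	ids := make([]uint64, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	// Highest fused score first; ties are broken by ID for determinism.
	sort.Slice(ids, func(i, j int) bool {
		if scores[ids[i]] != scores[ids[j]] {
			return scores[ids[i]] > scores[ids[j]]
		}
		return ids[i] < ids[j]
	})
	return ids
}

func main() {
	vectorHits := []uint64{1, 2, 3}  // hypothetical vector-search ranking
	lexicalHits := []uint64{3, 1, 4} // hypothetical BM25 ranking
	fmt.Println(rrf(60, vectorHits, lexicalHits))
}
```

IDs that appear near the top of both lists accumulate the largest scores, so agreement between the two rankings is rewarded without having to normalize BM25 scores against vector distances.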
Index
- Constants
- Variables
- type Backend
- type Candidate
- type CompactionConfig
- type CompactionPolicy
- type DB
- type FlushConfig
- type ID
- type Metric
- type MetricsObserver
- type Option
- func Create(dim int, metric Metric) Option
- func ReadOnly() Option
- func WithBlockCacheSize(size int64) Option
- func WithCacheDir(dir string) Option
- func WithCompactionConfig(cfg CompactionConfig) Option
- func WithCompactionPolicy(policy CompactionPolicy) Option
- func WithDimension(dim int) Option
- func WithFlushConfig(cfg FlushConfig) Option
- func WithLexicalIndex(idx lexical.Index, field string) Option
- func WithLogger(l *slog.Logger) Option
- func WithMemoryLimit(bytes int64) Option
- func WithMetric(m Metric) Option
- func WithMetricsObserver(observer MetricsObserver) Option
- func WithRetentionPolicy(p RetentionPolicy) Option
- func WithSchema(schema metadata.Schema) Option
- func WithTimestamp(t time.Time) Option
- func WithVersion(v uint64) Option
- type QuantizationType
- type QueryStats
- type Record
- type RetentionPolicy
- type SearchOption
- func WithFilter(filter *metadata.FilterSet) SearchOption
- func WithMetadata() SearchOption
- func WithNProbes(n int) SearchOption
- func WithPayload() SearchOption
- func WithPreFilter(preFilter bool) SearchOption
- func WithRefineFactor(factor float32) SearchOption
- func WithSelectivityCutoff(cutoff float64) SearchOption
- func WithStats(stats *QueryStats) SearchOption
- func WithVector() SearchOption
- func WithoutData() SearchOption
- type SegmentQueryStats
- type VacuumStats
Constants
const (
	MetricL2     = distance.MetricL2
	MetricCosine = distance.MetricCosine
	MetricDot    = distance.MetricDot

	QuantizationTypeNone   = quantization.TypeNone
	QuantizationTypePQ     = quantization.TypePQ
	QuantizationTypeOPQ    = quantization.TypeOPQ
	QuantizationTypeSQ8    = quantization.TypeSQ8
	QuantizationTypeBQ     = quantization.TypeBQ
	QuantizationTypeRaBitQ = quantization.TypeRaBitQ
	QuantizationTypeINT4   = quantization.TypeINT4
)
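The three metric constants correspond to the usual L2, cosine, and dot-product measures. As a rough illustration of what each one computes, here is a plain scalar sketch. It is an assumption-laden stand-in, not the SIMD code in the distance package: the function names are hypothetical, and whether Vecgo's MetricL2 uses the squared or the rooted distance is not stated in this documentation.

```go
package main

import (
	"fmt"
	"math"
)

// squaredL2 returns the squared Euclidean distance between a and b.
// It assumes both slices have the same length.
func squaredL2(a, b []float32) float32 {
	var sum float32
	for i := range a {
		d := a[i] - b[i]
		sum += d * d
	}
	return sum
}

// dot returns the inner product of a and b.
func dot(a, b []float32) float32 {
	var sum float32
	for i := range a {
		sum += a[i] * b[i]
	}
	return sum
}

// cosine returns the cosine similarity of a and b,
// or 0 if either vector has zero length.
func cosine(a, b []float32) float32 {
	na := math.Sqrt(float64(dot(a, a)))
	nb := math.Sqrt(float64(dot(b, b)))
	if na == 0 || nb == 0 {
		return 0
	}
	return dot(a, b) / float32(na*nb)
}

func main() {
	a := []float32{1, 0}
	b := []float32{0, 1}
	fmt.Println(squaredL2(a, b), dot(a, b), cosine(a, b))
}
```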
Variables
var (
	ErrClosed             = engine.ErrClosed
	ErrInvalidArgument    = engine.ErrInvalidArgument
	ErrCorrupt            = engine.ErrCorrupt
	ErrIncompatibleFormat = engine.ErrIncompatibleFormat
	ErrBackpressure       = engine.ErrBackpressure
	ErrReadOnly           = engine.ErrReadOnly
)
Public error taxonomy (re-exported from engine).
var (
	WithDiskANNThreshold    = engine.WithDiskANNThreshold
	WithCompactionThreshold = engine.WithCompactionThreshold
	WithQuantization        = engine.WithQuantization
)
Re-exported Option constructors.
var NewRecord = model.NewRecord
NewRecord creates a new RecordBuilder for fluent record construction.
Functions
This section is empty.
Types
type Backend added in v0.0.13
type Backend interface {
// contains filtered or unexported methods
}
Backend represents a storage backend for the vector database. Use Local() for filesystem storage or Remote() for cloud storage.
type CompactionConfig added in v0.0.13
type CompactionConfig = engine.CompactionConfig
CompactionConfig configures compaction.
type CompactionPolicy added in v0.0.13
type CompactionPolicy = engine.CompactionPolicy
CompactionPolicy determines when to compact segments.
type DB added in v0.0.13
DB is the main entry point for the vector database. It wraps the internal engine and provides a high-level API.
func Open added in v0.0.13
Open opens or creates a vector database with the given backend. The context is used for initialization I/O and can be used for timeouts.
For new indexes, use the Create option:
ctx := context.Background()
db, _ := vecgo.Open(ctx, vecgo.Local("./data"), vecgo.Create(128, vecgo.MetricL2))
For existing indexes, dimension and metric are loaded from the manifest:
db, _ := vecgo.Open(ctx, vecgo.Local("./data"))
For cloud storage:
db, _ := vecgo.Open(ctx, vecgo.Remote(s3Store))
func (*DB) BatchInsertRecords added in v0.0.13
BatchInsertRecords inserts multiple records in a single batch.
Example:
records := []vecgo.Record{rec1, rec2, rec3}
ids, err := db.BatchInsertRecords(ctx, records)
func (*DB) InsertRecord added in v0.0.13
InsertRecord inserts a record built with the fluent RecordBuilder API.
Example:
rec := vecgo.NewRecord(vec).
WithMetadata("category", metadata.String("tech")).
WithPayload(jsonData).
Build()
id, err := db.InsertRecord(ctx, rec)
func (*DB) Vacuum added in v0.0.13
Vacuum removes old manifest versions and their orphaned segments based on the RetentionPolicy configured via WithRetentionPolicy().
This reclaims disk space from:
- Old manifest versions (beyond KeepVersions or KeepDuration)
- Segments no longer referenced by any retained manifest
Time-travel queries to vacuumed versions will fail.
Safe to call periodically (e.g., daily cron job). No-op if no retention policy is set.
Example:
// Configure retention when opening
policy := vecgo.RetentionPolicy{KeepVersions: 10}
db, _ := vecgo.Open(ctx, vecgo.Local("./data"), vecgo.WithRetentionPolicy(policy))
// Later, reclaim space from expired versions
err := db.Vacuum(ctx)
type FlushConfig added in v0.0.13
type FlushConfig = engine.FlushConfig
FlushConfig configures MemTable flushing.
type MetricsObserver added in v0.0.13
type MetricsObserver = engine.MetricsObserver
MetricsObserver is the interface for observing engine metrics.
type Option added in v0.0.13
Option configures the engine.
func Create added in v0.0.13
func Create(dim int, metric Metric) Option
Create returns an Option that specifies creating a new index with the given dimension and metric. Use this when creating a new index for the first time. If omitted, vecgo will attempt to open an existing index and read dim/metric from the manifest.
func ReadOnly added in v0.0.13
func ReadOnly() Option
ReadOnly puts the engine in read-only mode. In this mode:
- Insert/Delete operations return ErrReadOnly
- No local state is required (pure memory cache)
- Ideal for stateless serverless search nodes
Example:
db, _ := vecgo.Open(ctx, vecgo.Remote(s3Store), vecgo.ReadOnly())
func WithBlockCacheSize added in v0.0.13
func WithBlockCacheSize(size int64) Option
WithBlockCacheSize sets the size of the block cache in bytes.
func WithCacheDir added in v0.0.13
func WithCacheDir(dir string) Option
WithCacheDir sets the local directory for caching remote data. Only applicable when using Remote() backend. Defaults to os.TempDir()/vecgo-cache-<random>.
func WithCompactionConfig added in v0.0.13
func WithCompactionConfig(cfg CompactionConfig) Option
WithCompactionConfig sets the compaction configuration.
func WithCompactionPolicy added in v0.0.13
func WithCompactionPolicy(policy CompactionPolicy) Option
WithCompactionPolicy sets the compaction policy.
func WithDimension added in v0.0.13
func WithDimension(dim int) Option
WithDimension sets the vector dimension.
func WithFlushConfig added in v0.0.13
func WithFlushConfig(cfg FlushConfig) Option
WithFlushConfig sets the flush configuration.
func WithLexicalIndex added in v0.0.13
func WithLexicalIndex(idx lexical.Index, field string) Option
WithLexicalIndex sets the lexical index and the metadata field to index.
func WithLogger added in v0.0.13
func WithLogger(l *slog.Logger) Option
WithLogger sets the logger for the engine.
func WithMemoryLimit added in v0.0.13
func WithMemoryLimit(bytes int64) Option
WithMemoryLimit sets the memory limit for the engine in bytes. If set to 0, memory is unlimited (no backpressure). Default is 1GB.
func WithMetric added in v0.0.13
func WithMetric(m Metric) Option
WithMetric sets the distance metric.
func WithMetricsObserver added in v0.0.13
func WithMetricsObserver(observer MetricsObserver) Option
WithMetricsObserver sets the metrics observer for the engine.
func WithRetentionPolicy added in v0.0.13
func WithRetentionPolicy(p RetentionPolicy) Option
WithRetentionPolicy sets the retention policy for time-travel versioning. Old manifests and their orphaned segments are removed by Vacuum() based on this policy.
Example:
policy := vecgo.RetentionPolicy{
KeepVersions: 10, // Keep last 10 versions
KeepDuration: 7 * 24 * time.Hour, // OR keep 7 days of history
}
db, _ := vecgo.Open(ctx, vecgo.Local("./data"), vecgo.WithRetentionPolicy(policy))
func WithSchema added in v0.0.13
func WithSchema(schema metadata.Schema) Option
WithSchema sets the metadata schema for the engine.
func WithTimestamp added in v0.0.13
func WithTimestamp(t time.Time) Option
WithTimestamp opens the database at the most recent committed state at or before the given time (Time-Travel). This enables querying historical data states without loading the latest version.
Example:
// Query the database as it was yesterday
yesterday := time.Now().Add(-24 * time.Hour)
db, err := vecgo.Open(ctx, vecgo.Local("./data"), vecgo.WithTimestamp(yesterday))
Note: The database must have been committed at least once at or before the given timestamp. If no version exists at or before the timestamp, an error is returned.
func WithVersion added in v0.0.13
func WithVersion(v uint64) Option
WithVersion opens the database at a specific manifest version ID (Time-Travel). Version IDs are monotonically increasing and assigned on each Commit().
Example:
// Open at version 42
db, err := vecgo.Open(ctx, vecgo.Local("./data"), vecgo.WithVersion(42))
Use Stats().ManifestID to get the current version ID after opening.
type QuantizationType added in v0.0.13
type QuantizationType = quantization.Type
QuantizationType defines the vector quantization method.
type QueryStats added in v0.0.13
type QueryStats = model.QueryStats
QueryStats provides detailed execution statistics for a search query. Use WithStats to collect these during a search.
type RetentionPolicy added in v0.0.13
type RetentionPolicy = engine.RetentionPolicy
RetentionPolicy defines rules for retaining old versions. Used with Vacuum() to control time-travel storage overhead.
type SearchOption added in v0.0.13
type SearchOption = func(*model.SearchOptions)
SearchOption configures a search query.
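SearchOption follows the functional options pattern: each option is a function that mutates an options struct, and options are applied in the order given. The sketch below mirrors the documented defaults (metadata and payload included, vector excluded) to show why WithoutData() followed by WithMetadata() returns metadata only. The searchOptions struct and its field names are hypothetical stand-ins for model.SearchOptions, whose fields are not shown in this documentation.

```go
package main

import "fmt"

// searchOptions is a hypothetical stand-in for model.SearchOptions.
type searchOptions struct {
	includeMetadata bool
	includePayload  bool
	includeVector   bool
}

// searchOption mirrors the shape of vecgo.SearchOption.
type searchOption func(*searchOptions)

// withoutData turns off both metadata and payload retrieval.
func withoutData() searchOption {
	return func(o *searchOptions) {
		o.includeMetadata = false
		o.includePayload = false
	}
}

// withMetadata turns metadata retrieval back on.
func withMetadata() searchOption {
	return func(o *searchOptions) { o.includeMetadata = true }
}

// withVector requests the stored vector as well.
func withVector() searchOption {
	return func(o *searchOptions) { o.includeVector = true }
}

// resolve applies the options in order on top of the documented defaults.
func resolve(opts ...searchOption) searchOptions {
	o := searchOptions{includeMetadata: true, includePayload: true}
	for _, opt := range opts {
		opt(&o)
	}
	return o
}

func main() {
	fmt.Printf("%+v\n", resolve())
	fmt.Printf("%+v\n", resolve(withoutData(), withMetadata()))
	fmt.Printf("%+v\n", resolve(withVector()))
}
```

Because later options overwrite earlier ones in this sketch, the order matters: placing withoutData() last would clear the metadata flag again.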
func WithFilter added in v0.0.13
func WithFilter(filter *metadata.FilterSet) SearchOption
WithFilter sets a typed metadata filter for the search.
func WithMetadata added in v0.0.15
func WithMetadata() SearchOption
WithMetadata requests metadata to be returned in the search results. Metadata IS included by default. Use WithoutData() to exclude all, then selectively include what you need with WithMetadata(), WithPayload(), etc.
func WithNProbes added in v0.0.13
func WithNProbes(n int) SearchOption
WithNProbes sets the number of probes for the search.
func WithPayload added in v0.0.15
func WithPayload() SearchOption
WithPayload requests payload to be returned in the search results. Payload IS included by default. Use WithoutData() to exclude all, then selectively include what you need with WithMetadata(), WithPayload(), etc.
func WithPreFilter added in v0.0.13
func WithPreFilter(preFilter bool) SearchOption
WithPreFilter forces pre-filtering (or post-filtering if false).
func WithRefineFactor added in v0.0.13
func WithRefineFactor(factor float32) SearchOption
WithRefineFactor sets the refinement factor for reranking.
func WithSelectivityCutoff added in v0.0.13
func WithSelectivityCutoff(cutoff float64) SearchOption
WithSelectivityCutoff sets the selectivity threshold above which HNSW with post-filtering is preferred over bitmap-based brute-force search.
When a filter matches more than this percentage of documents in a segment, the engine skips expensive bitmap materialization and uses HNSW traversal with post-filtering instead. This is faster because bitmap overhead dominates at high selectivity.
Default: 0.30 (30%) - based on benchmark analysis
Range: 0.0 - 1.0
- Lower values (e.g., 0.1): More aggressive HNSW usage
- Higher values (e.g., 0.5): More bitmap brute-force usage
- 0: Uses adaptive heuristic based on k and absolute match count
Example:
// Use HNSW when filter matches >20% of documents
results, _ := db.Search(ctx, query, 10,
vecgo.WithFilter(filter),
vecgo.WithSelectivityCutoff(0.20))
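The decision described above reduces to comparing the fraction of matching documents in a segment against the cutoff. The following is a simplified sketch of that comparison only. The function name useHNSWPostFilter is hypothetical, and the adaptive heuristic used when the cutoff is 0 is not modeled.

```go
package main

import "fmt"

// useHNSWPostFilter reports whether the filter matches a larger fraction of
// the segment than the cutoff, in which case HNSW traversal with
// post-filtering is preferred over bitmap-based brute-force search.
func useHNSWPostFilter(matches, total int, cutoff float64) bool {
	if total == 0 {
		return false
	}
	return float64(matches)/float64(total) > cutoff
}

func main() {
	// 250 of 1000 documents match (25%), which is above a 20% cutoff.
	fmt.Println(useHNSWPostFilter(250, 1000, 0.20))
	// 100 of 1000 documents match (10%), which is below it.
	fmt.Println(useHNSWPostFilter(100, 1000, 0.20))
}
```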
func WithStats added in v0.0.13
func WithStats(stats *QueryStats) SearchOption
WithStats enables collection of detailed query execution statistics. Pass a pointer to a QueryStats struct that will be populated after the query. Use this for query debugging, performance analysis, and cost estimation.
Example:
stats := &vecgo.QueryStats{}
results, _ := db.Search(ctx, query, 10, vecgo.WithStats(stats))
fmt.Println(stats.Explain()) // Human-readable explanation
fmt.Printf("Distance computations: %d\n", stats.DistanceComputations)
func WithVector added in v0.0.13
func WithVector() SearchOption
WithVector requests the vector to be returned in the search results. Vectors are NOT included by default due to their large size.
func WithoutData added in v0.0.13
func WithoutData() SearchOption
WithoutData disables automatic retrieval of metadata and payload. By default, search returns metadata and payload for each result. Use this option for high-throughput scenarios where only IDs and scores are needed.
Example:
results, _ := db.Search(ctx, query, 10, vecgo.WithoutData())
type SegmentQueryStats added in v0.0.13
type SegmentQueryStats = model.SegmentQueryStats
SegmentQueryStats contains execution statistics for a single segment.
type VacuumStats added in v0.0.13
type VacuumStats = engine.VacuumStats
VacuumStats holds results of the vacuum operation.
Directories
| Path | Synopsis |
|---|---|
| blobstore | Package blobstore provides storage abstraction for Vecgo's immutable segments. |
| blobstore/minio | Package minio provides a BlobStore implementation using the MinIO client. |
| blobstore/s3 | Package s3 provides S3 implementations of the blobstore.BlobStore interface. |
| distance | Package distance provides vector distance calculations with SIMD acceleration. |
| examples | |
| examples/basic (command) | |
| examples/bulk_load (command) | Package main demonstrates Vecgo's bulk loading capabilities. |
| examples/cloud_tiered (command) | |
| examples/explain (command) | Package main demonstrates Vecgo's query explanation and statistics capabilities. |
| examples/modern (command) | |
| examples/rag (command) | |
| examples/time_travel (command) | Package main demonstrates Vecgo's time-travel query capabilities. |
| internal | |
| internal/arena | Package arena provides an off-heap memory allocator for HNSW graphs. |
| internal/bitmap | Package bitmap provides a best-in-class query-time bitmap engine for vector search. |
| internal/bitset | Package bitset provides a lock-free segmented bitset for concurrent access. |
| internal/cache | Package cache provides LRU caching for block data. |
| internal/conv | Package conv provides safe integer type conversion utilities. |
| internal/engine | Package engine implements the core vector database engine. |
| internal/fs | Package fs provides filesystem abstractions for testability and fault injection. |
| internal/hash | Package hash provides fast, hardware-accelerated hashing utilities for data integrity. |
| internal/hnsw | Package hnsw implements Hierarchical Navigable Small World graphs. |
| internal/kmeans | Package kmeans implements k-means clustering for quantization training. |
| internal/manifest | Package manifest provides Bloom filters for fast categorical negative lookups. |
| internal/mem | Package mem provides memory allocation utilities for SIMD operations. |
| internal/metadata | Package imetadata provides internal metadata indexing for efficient filtering. |
| internal/mmap | Package mmap provides memory-mapped file access for zero-copy I/O. |
| internal/pk | Package pk provides a lock-free MVCC primary key index. |
| internal/quantization | Package quantization provides vector compression techniques for memory reduction. |
| internal/resource | Package resource implements the ResourceController for global limits and governance. |
| internal/searcher | Package searcher provides pooled search context for zero-allocation queries. |
| internal/segment | Package segment defines interfaces for immutable data segments. |
| internal/segment/diskann | Package diskann implements a DiskANN/Vamana segment for graph-based ANN search. |
| internal/segment/flat | Package flat implements an immutable Flat segment with brute-force search. |
| internal/segment/memtable | Package memtable implements the in-memory L0 segment with HNSW indexing. |
| internal/simd | Package simd provides SIMD-optimized vector operations. |
| internal/simd/cmd/generator (command) | Package main is a code generator for vecgo SIMD functions. |
| internal/vectorstore | Package vectorstore provides a canonical vector storage interface and a high-performance columnar implementation. |
| lexical | Package lexical defines the interface for lexical (keyword) search indexes. |
| lexical/bm25 | Package bm25 provides a BM25-based lexical search index. |
| metadata | Package metadata provides efficient metadata storage and filtering for Vecgo. |
| model | Package model defines core types used throughout Vecgo. |
| testutil | Package testutil provides testing utilities for Vecgo. |