vecgo

package module
v0.0.15
Published: Jan 19, 2026 License: Apache-2.0 Imports: 10 Imported by: 0

README ΒΆ

πŸ§¬πŸ” Vecgo


Vecgo is a pure Go, embeddable, hybrid vector database designed for high-performance production workloads. It combines commit-oriented durability with HNSW + DiskANN indexing for best-in-class performance.

⚠️ This is experimental and subject to breaking changes.

✨ Key Differentiators

  • ⚡ Faster & lighter than external services — no network overhead, no sidecar, 15MB binary
  • 🔧 More capable than simple libraries — durability, MVCC, hybrid search, cloud storage
  • 🎯 Simpler than CGO wrappers — pure Go toolchain, static binaries, cross-compilation
  • 🏗️ Modern architecture — commit-oriented durability (append-only versioned commits), no WAL complexity

📊 Performance

Vecgo is optimized for high-throughput, low-latency vector search with:

  • FilterCursor — zero-allocation push-based iteration
  • Zero-Copy Vectors — direct access to mmap'd memory
  • SIMD Distance — AVX-512/AVX2/NEON/SVE2 runtime detection

Run benchmarks locally to see performance on your hardware:

cd benchmark_test && go test -bench=. -benchmem -timeout=15m

See benchmark_test/baseline.txt for reference results.

🎯 Features

📊 Index Types
Index Description Use Case
HNSW Hierarchical Navigable Small World graph In-memory L0 (16-way sharded, lock-free search, arena allocator)
DiskANN/Vamana Disk-resident graph with quantization Large-scale on-disk segments with PQ/RaBitQ
FreshDiskANN Streaming updates for Vamana Lock-free reads, soft deletion, background consolidation
Flat Exact nearest-neighbor with SIMD Exact search, small segments
πŸ—œοΈ Quantization

Quantization reduces in-memory index size for DiskANN segments. Full vectors remain on disk for reranking.

Method RAM Reduction Recall Best For
Product Quantization (PQ) 8-64× 90-95% Large-scale, high compression
Optimized PQ (OPQ) 8-64× 93-97% Best recall with compression
Scalar Quantization (SQ8) 4× 95-99% General purpose, balanced
Binary Quantization (BQ) 32× 70-85% Pre-filtering, coarse search
RaBitQ ~30× 80-90% Better BQ alternative (SIGMOD '24)
INT4 8× 90-95% Memory-constrained

📖 See Performance Tuning Guide for detailed quantization configuration.
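To make the trade-offs in the table concrete, here is a minimal, self-contained sketch of the idea behind SQ8 (not vecgo's internal implementation): each float32 dimension is mapped to one byte using per-dimension min/scale learned from training data, giving the 4× memory reduction listed above.

```go
package main

import "fmt"

// sq8Train computes per-dimension min and scale from training vectors.
func sq8Train(vecs [][]float32) (min, scale []float32) {
	dim := len(vecs[0])
	min = make([]float32, dim)
	max := make([]float32, dim)
	copy(min, vecs[0])
	copy(max, vecs[0])
	for _, v := range vecs {
		for d, x := range v {
			if x < min[d] {
				min[d] = x
			}
			if x > max[d] {
				max[d] = x
			}
		}
	}
	scale = make([]float32, dim)
	for d := range scale {
		scale[d] = (max[d] - min[d]) / 255
		if scale[d] == 0 {
			scale[d] = 1 // guard constant dimensions against division by zero
		}
	}
	return min, scale
}

// sq8Encode maps each float32 to one byte: a 4x memory reduction.
func sq8Encode(v, min, scale []float32) []uint8 {
	code := make([]uint8, len(v))
	for d, x := range v {
		q := (x - min[d]) / scale[d]
		if q < 0 {
			q = 0
		}
		if q > 255 {
			q = 255
		}
		code[d] = uint8(q + 0.5) // round to nearest code
	}
	return code
}

// sq8Decode reconstructs an approximate vector for distance computation.
func sq8Decode(code []uint8, min, scale []float32) []float32 {
	v := make([]float32, len(code))
	for d, c := range code {
		v[d] = min[d] + float32(c)*scale[d]
	}
	return v
}

func main() {
	train := [][]float32{{0, 10}, {1, 20}, {0.5, 15}}
	min, scale := sq8Train(train)
	code := sq8Encode([]float32{0.5, 15}, min, scale)
	fmt.Println(sq8Decode(code, min, scale)) // close to the original vector
}
```

As the section notes, the compressed codes only need to be good enough for candidate generation: full-precision vectors stay on disk and are used to rerank the top candidates.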

🏢 Enterprise Features
  • ☁️ Cloud-Native Storage — S3/GCS/Azure via pluggable BlobStore interface
  • 🔒 Commit-Oriented Durability — Atomic commits with immutable segments
  • 🔀 Hybrid Search — BM25 + vector similarity with RRF fusion
  • 📸 Snapshot Isolation — Lock-free reads via MVCC
  • ⏰ Time-Travel Queries — WithTimestamp() / WithVersion() to query historical state
  • 🏷️ Typed Metadata — Schema-enforced metadata with filtering
  • 📊 Query Statistics — WithStats() + Explain() for debugging
  • 🎯 Segment Pruning — Triangle inequality, Bloom filters, numeric range stats
  • 🚀 SIMD Optimized — AVX-512/AVX2/NEON/SVE2 runtime detection

🚀 Quick Start

📦 Installation
go get github.com/hupe1980/vecgo

Platform Requirements: Vecgo requires a 64-bit architecture (amd64 or arm64). SIMD optimizations use AVX-512/AVX2 on x86-64 and NEON/SVE2 on ARM64.

💻 Basic Usage
package main

import (
    "context"
    "fmt"
    "log"

    "github.com/hupe1980/vecgo"
    "github.com/hupe1980/vecgo/metadata"
)

func main() {
    ctx := context.Background()

    // Create a new index (128 dimensions, L2 distance)
    db, err := vecgo.Open(ctx, vecgo.Local("./data"), vecgo.Create(128, vecgo.MetricL2))
    if err != nil {
        log.Fatal(err)
    }
    defer db.Close()

    // Insert with fluent builder API
    vector := make([]float32, 128)
    rec := vecgo.NewRecord(vector).
        WithMetadata("category", metadata.String("electronics")).
        WithMetadata("price", metadata.Float(99.99)).
        WithPayload([]byte(`{"desc": "Product description"}`)).
        Build()
    
    id, err := db.InsertRecord(ctx, rec)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("Inserted ID: %d\n", id)

    // Or use the simple API
    id, err = db.Insert(ctx, vector, nil, nil)

    // Commit to disk (data is durable after this)
    if err := db.Commit(ctx); err != nil {
        log.Fatal(err)
    }

    // Search — returns IDs, scores, metadata, and payload by default
    query := make([]float32, 128)
    results, err := db.Search(ctx, query, 10)
    if err != nil {
        log.Fatal(err)
    }

    for _, r := range results {
        fmt.Printf("ID: %d, Score: %.4f\n", r.ID, r.Score)
    }

    // High-throughput mode (IDs + scores only)
    results, _ = db.Search(ctx, query, 10, vecgo.WithoutData())
}
🔄 Re-open Existing Index
// Dimension and metric are auto-loaded from manifest
db, err := vecgo.Open(ctx, vecgo.Local("./data"))
☁️ Cloud Storage (Writer/Reader Separation)
import (
    "github.com/hupe1980/vecgo"
    "github.com/hupe1980/vecgo/blobstore/s3"
)

// === Writer Node (build index locally, then sync to S3) ===
db, _ := vecgo.Open(ctx, vecgo.Local("/data/vecgo"), vecgo.Create(128, vecgo.MetricL2))
db.Insert(ctx, vector, nil, nil)
db.Close()
// Sync: aws s3 sync /data/vecgo s3://my-bucket/vecgo/

// === Reader Nodes (stateless, horizontally scalable) ===
store, _ := s3.New(ctx, "my-bucket", s3.WithPrefix("vecgo/"))

// Remote() is automatically read-only
db, err := vecgo.Open(ctx, vecgo.Remote(store))

// Writes return ErrReadOnly
_, err = db.Insert(ctx, vec, nil, nil)  // err == vecgo.ErrReadOnly

// With explicit cache directory for faster repeated queries
db, err := vecgo.Open(ctx, vecgo.Remote(store),
    vecgo.WithCacheDir("/fast/nvme"),
    vecgo.WithBlockCacheSize(4 << 30),  // 4GB
)
🏷️ Typed Metadata & Filtering
// Define schema for type safety
schema := metadata.Schema{
    "category": metadata.FieldTypeString,
    "price":    metadata.FieldTypeFloat,
}

db, _ := vecgo.Open(ctx, vecgo.Local("./data"),
    vecgo.Create(128, vecgo.MetricL2),
    vecgo.WithSchema(schema),
)

// Search with filter
filter := metadata.NewFilterSet(
    metadata.Filter{Key: "category", Operator: metadata.OpEqual, Value: metadata.String("electronics")},
    metadata.Filter{Key: "price", Operator: metadata.OpLessThan, Value: metadata.Float(100.0)},
)

results, _ := db.Search(ctx, query, 10, vecgo.WithFilter(filter))
🔀 Hybrid Search (Vector + BM25)
// Insert with text for BM25 indexing
doc := metadata.Document{
    "text": metadata.String("machine learning neural networks"),
}
db.Insert(ctx, vector, doc, nil)

// Hybrid search with RRF fusion
results, _ := db.HybridSearch(ctx, vector, "neural networks", 10)
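HybridSearch merges the lexical and vector rankings with Reciprocal Rank Fusion. As a self-contained sketch of how RRF combines two ranked lists (the standard formulation with k = 60; vecgo's exact constants are not shown in this README):

```go
package main

import (
	"fmt"
	"sort"
)

// rrfFuse combines several ranked ID lists with Reciprocal Rank Fusion:
// score(id) = sum over lists of 1/(k + rank), with rank starting at 1.
// Documents that rank well in multiple lists accumulate the highest scores.
func rrfFuse(rankings [][]uint64, k float64) []uint64 {
	scores := make(map[uint64]float64)
	for _, list := range rankings {
		for rank, id := range list {
			scores[id] += 1 / (k + float64(rank+1))
		}
	}
	ids := make([]uint64, 0, len(scores))
	for id := range scores {
		ids = append(ids, id)
	}
	sort.Slice(ids, func(i, j int) bool { return scores[ids[i]] > scores[ids[j]] })
	return ids
}

func main() {
	vector := []uint64{7, 3, 9}  // IDs ranked by vector similarity
	lexical := []uint64{3, 9, 1} // IDs ranked by BM25
	// Doc 3 appears near the top of both lists, so it fuses to rank 1.
	fmt.Println(rrfFuse([][]uint64{vector, lexical}, 60))
}
```

RRF is rank-based rather than score-based, which is why it needs no normalization between BM25 scores and vector distances.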
⏰ Time-Travel Queries

Query historical snapshots without affecting the current state:

// Open at a specific point in time
yesterday := time.Now().Add(-24 * time.Hour)
db, _ := vecgo.Open(ctx, vecgo.Local("./data"), vecgo.WithTimestamp(yesterday))

// Or open at a specific version ID
db, _ := vecgo.Open(ctx, vecgo.Local("./data"), vecgo.WithVersion(42))

// Query as if it were that moment in time
results, _ := db.Search(ctx, query, 10)

How it works:

  • Old manifests are preserved (each points to immutable segments)
  • Compaction still runs β€” creates NEW optimized segments
  • Old segments retained until Vacuum() removes expired manifests
  • Storage: ~current_data × (1 + retained_versions × churn_rate)
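A quick back-of-envelope calculation makes the storage estimate concrete (illustrative numbers, not measurements):

```go
package main

import "fmt"

// estimateStorage applies the rule of thumb above:
// total ≈ currentData × (1 + retainedVersions × churnRate).
func estimateStorage(currentGB float64, retainedVersions int, churnRate float64) float64 {
	return currentGB * (1 + float64(retainedVersions)*churnRate)
}

func main() {
	// A 100 GB index with 10 retained versions and 5% of data
	// rewritten per version needs roughly 150 GB on disk.
	fmt.Printf("%.0f GB\n", estimateStorage(100, 10, 0.05))
}
```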

Use cases:

  • πŸ” Debug production issues: "What did the index look like before the bad deployment?"
  • πŸ“Š A/B testing: Compare recall against historical versions
  • πŸ”„ Recovery: Roll back to a known-good state

Managing retention:

// Configure retention policy
policy := vecgo.RetentionPolicy{KeepVersions: 10}
db, _ := vecgo.Open(ctx, vecgo.Local("./data"), vecgo.WithRetentionPolicy(policy))

// Reclaim disk space from expired versions
db.Vacuum(ctx)
📊 Query Statistics & Explain

Understand query execution for debugging and optimization:

var stats vecgo.QueryStats
results, _ := db.Search(ctx, query, 10, vecgo.WithStats(&stats))

// Summary explanation
fmt.Println(stats.Explain())
// Output: "searched 3 segments (1 pruned by stats, 0 by bloom), 
//          scanned 1200 vectors in 2.1ms, recalled 847 candidates (0.7 hit rate)"

// Detailed statistics
fmt.Printf("Segments searched: %d\n", stats.SegmentsSearched)
fmt.Printf("Segments pruned (stats): %d\n", stats.SegmentsPrunedByStats)
fmt.Printf("Segments pruned (bloom): %d\n", stats.SegmentsPrunedByBloom)
fmt.Printf("Vectors scanned: %d\n", stats.VectorsScanned)
fmt.Printf("Candidates recalled: %d\n", stats.CandidatesRecalled)
fmt.Printf("Latency: %v\n", stats.Latency)
fmt.Printf("Graph hops: %d\n", stats.GraphHops)
fmt.Printf("Cost estimate: %.2f\n", stats.CostEstimate())
🎯 Segment Pruning & Manifest Stats

Vecgo automatically prunes irrelevant segments using advanced statistics:

Pruning Strategy Description
Triangle Inequality Skip segments whose centroid-distance lower bound exceeds the current k-th best distance
Bloom Filters Skip segments missing required categorical values
Numeric Range Stats Skip segments with min/max outside filter range
Categorical Cardinality Prioritize high-entropy segments for broad queries

These statistics are automatically computed during Commit() and stored in the manifest (v3 format).

// Get current statistics
dbStats := db.Stats()
fmt.Printf("Manifest version: %d\n", dbStats.ManifestID)
fmt.Printf("Total vectors: %d\n", dbStats.TotalVectors)
fmt.Printf("Segment count: %d\n", dbStats.SegmentCount)
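The triangle-inequality strategy from the table above fits in a few lines (an illustrative sketch, not vecgo's internal code): if every vector in a segment lies within radius r of its centroid c, then dist(q, v) >= dist(q, c) - r for any query q, so the whole segment can be skipped once that lower bound exceeds the current k-th best distance.

```go
package main

import "fmt"

// canPruneSegment applies the triangle inequality. Every vector v in a
// segment with centroid c and radius r satisfies
//     dist(q, v) >= dist(q, c) - r
// so if that lower bound already exceeds the k-th best distance found so
// far, no vector in the segment can improve the result set.
func canPruneSegment(distQueryToCentroid, radius, kthBestDist float64) bool {
	return distQueryToCentroid-radius > kthBestDist
}

func main() {
	// The closest possible vector is 10 - 2 = 8 away; the k-th best is 5.
	fmt.Println(canPruneSegment(10, 2, 5)) // segment can be skipped
	// With a k-th best of 9, the segment could hold a closer vector.
	fmt.Println(canPruneSegment(10, 2, 9)) // segment must be searched
}
```

The per-segment centroids and radii would come from the manifest statistics computed at Commit() time, which is why pruning costs nothing at query time.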
📦 Insert Modes

Vecgo offers three insert modes optimized for different workloads:

Mode Method Searchable Best For
Single Insert() ✅ Immediately Real-time updates
Batch BatchInsert() ✅ Immediately Medium batches (10-100)
Deferred BatchInsertDeferred() ❌ After flush Bulk loading
// 1. SINGLE INSERT — Real-time updates (HNSW-indexed immediately)
//    Use when: you need vectors searchable immediately
id, err := db.Insert(ctx, vector, metadata, payload)

// 2. BATCH INSERT — Indexed batch (HNSW-indexed immediately)
//    Use when: you have medium batches and need immediate search
ids, err := db.BatchInsert(ctx, vectors, metadatas, payloads)

// 3. DEFERRED INSERT — Bulk loading (NO HNSW indexing)
//    Use when: you're bulk loading and don't need immediate search
//    Vectors become searchable after Commit() triggers flush
ids, err := db.BatchInsertDeferred(ctx, vectors, metadatas, payloads)
db.Commit(ctx) // Flush to disk, now searchable via DiskANN

When to use Deferred mode:

  • Initial data loading (embeddings from a corpus)
  • Periodic bulk updates (nightly reindex)
  • Migration from another database

When NOT to use Deferred mode:

  • Real-time RAG (documents must be searchable immediately)
  • Interactive applications with instant feedback
// Batch delete
err = db.BatchDelete(ctx, ids)

💾 Durability Model

Vecgo uses commit-oriented durability — append-only versioned commits:

sequenceDiagram
    participant App as Application
    participant MT as MemTable (RAM)
    participant Seg as Segment (Disk)
    participant Man as Manifest

    App->>MT: Insert(vector, metadata)
    Note over MT: Buffered in memory<br/>❌ NOT durable
    
    App->>MT: Insert(vector, metadata)
    
    App->>Seg: Commit()
    MT->>Seg: Write immutable segment
    Seg->>Man: Update manifest atomically
    Note over Seg,Man: ✅ DURABLE after Commit()

State Survives Crash?
After Insert(), before Commit() ❌ No
After Commit() ✅ Yes
After Close() ✅ Yes (auto-commits pending)

Why commit-oriented?

  • 🧹 Simpler code — no WAL rotation, recovery, or checkpointing
  • ⚡ Faster batch inserts — no fsync per insert
  • ☁️ Cloud-native — pure segment writes, ideal for S3/GCS
  • 🚀 Instant startup — no recovery/replay, just read manifest

📚 Documentation

Examples

Example Description
basic Create index, insert, search, commit
modern Fluent API, schema-enforced metadata, scan iterator
rag Retrieval-Augmented Generation workflow
cloud_tiered Writer/reader separation with S3
bulk_load High-throughput ingestion with BatchInsertDeferred
time_travel Query historical versions by time or version ID
explain Query statistics, cost estimation, performance debugging
observability Prometheus metrics integration

📄 Algorithm References

🤝 Contributing

Contributions welcome! Please open an issue or pull request.

📜 License

Licensed under the Apache License 2.0. See LICENSE for details.

Documentation ΒΆ

Overview ΒΆ

Package vecgo provides a high-performance embedded vector database for Go.

Vecgo is an embeddable, hybrid vector database designed for production workloads. It combines commit-oriented durability with HNSW + DiskANN indexing for best-in-class performance.

Quick Start ΒΆ

Local mode:

ctx := context.Background()
db, _ := vecgo.Open(ctx, vecgo.Local("./data"), vecgo.Create(128, vecgo.MetricL2))
db, _ := vecgo.Open(ctx, vecgo.Local("./data"))  // re-open existing

Cloud mode:

s3Store, _ := s3.New(ctx, "my-bucket", s3.WithPrefix("vectors/"))
db, _ := vecgo.Open(ctx, vecgo.Remote(s3Store))
db, _ := vecgo.Open(ctx, vecgo.Remote(s3Store), vecgo.WithCacheDir("/fast/nvme"))

Insert Modes ΒΆ

Vecgo provides three insert modes optimized for different workloads:

// 1. SINGLE INSERT — Real-time updates (~625 vec/s, 768 dim)
//    Vectors are HNSW-indexed and searchable immediately.
id, _ := db.Insert(ctx, vector, metadata, payload)

// 2. BATCH INSERT — Indexed batch (~3,000 vec/s at batch=100)
//    All vectors are HNSW-indexed and searchable immediately.
ids, _ := db.BatchInsert(ctx, vectors, metadatas, payloads)

// 3. DEFERRED INSERT — Bulk loading (~2,000,000 vec/s!)
//    Vectors are NOT indexed into HNSW, stored only.
//    Searchable after Commit() triggers flush to DiskANN segment.
ids, _ := db.BatchInsertDeferred(ctx, vectors, metadatas, payloads)
db.Commit(ctx)  // Now searchable via DiskANN

Use Deferred mode for initial data loading, migrations, or nightly reindex. Use Single/Batch mode for real-time RAG where immediate search is required.

Search with Data ΒΆ

By default, search returns IDs, scores, metadata, and payload:

results, _ := db.Search(ctx, query, 10)
for _, r := range results {
    fmt.Println(r.ID, r.Score, r.Metadata, r.Payload)
}

For minimal results (IDs + scores only), use WithoutData():

results, _ := db.Search(ctx, query, 10, vecgo.WithoutData())

Durability Model ΒΆ

Vecgo uses commit-oriented durability (append-only versioned commits):

db.Insert(ctx, vec, nil, nil)  // buffered in memory
db.Commit(ctx)                 // durable after this

Key Features ΒΆ

  • HNSW + DiskANN hybrid indexing
  • Commit-oriented durability (no WAL complexity)
  • Full quantization suite (PQ, OPQ, SQ, BQ, RaBitQ, INT4)
  • Cloud-native storage (S3/GCS/Azure via BlobStore)
  • Time-travel queries
  • Hybrid search (BM25 + vectors with RRF fusion)
  • SIMD optimized (AVX-512/AVX2/NEON/SVE2)

Index ΒΆ

Constants ΒΆ

View Source
const (
	MetricL2     = distance.MetricL2
	MetricCosine = distance.MetricCosine
	MetricDot    = distance.MetricDot

	QuantizationTypeNone   = quantization.TypeNone
	QuantizationTypePQ     = quantization.TypePQ
	QuantizationTypeOPQ    = quantization.TypeOPQ
	QuantizationTypeSQ8    = quantization.TypeSQ8
	QuantizationTypeBQ     = quantization.TypeBQ
	QuantizationTypeRaBitQ = quantization.TypeRaBitQ
	QuantizationTypeINT4   = quantization.TypeINT4
)

Variables ΒΆ

View Source
var (
	ErrClosed             = engine.ErrClosed
	ErrInvalidArgument    = engine.ErrInvalidArgument
	ErrCorrupt            = engine.ErrCorrupt
	ErrIncompatibleFormat = engine.ErrIncompatibleFormat
	ErrBackpressure       = engine.ErrBackpressure
	ErrReadOnly           = engine.ErrReadOnly
)

Public error taxonomy (re-exported from engine).

View Source
var (
	WithDiskANNThreshold    = engine.WithDiskANNThreshold
	WithCompactionThreshold = engine.WithCompactionThreshold
	WithQuantization        = engine.WithQuantization
)

Option constructors (re-exported from engine).

View Source
var NewRecord = model.NewRecord

NewRecord creates a new RecordBuilder for fluent record construction.

Functions ΒΆ

This section is empty.

Types ΒΆ

type Backend ΒΆ added in v0.0.13

type Backend interface {
	// contains filtered or unexported methods
}

Backend represents a storage backend for the vector database. Use Local() for filesystem storage or Remote() for cloud storage.

func Local ΒΆ added in v0.0.13

func Local(path string) Backend

Local creates a local filesystem backend.

Example:

ctx := context.Background()
db, _ := vecgo.Open(ctx, vecgo.Local("./data"), vecgo.Create(128, vecgo.MetricL2))

func Remote ΒΆ added in v0.0.13

func Remote(store blobstore.BlobStore) Backend

Remote creates a remote storage backend (S3, GCS, Azure, etc.).

Example:

s3Store, _ := s3.New(ctx, "my-bucket")
db, _ := vecgo.Open(ctx, vecgo.Remote(s3Store), vecgo.WithCacheDir("/fast/nvme"))

type Candidate ΒΆ added in v0.0.13

type Candidate = model.Candidate

Candidate represents a search result.

type CompactionConfig ΒΆ added in v0.0.13

type CompactionConfig = engine.CompactionConfig

CompactionConfig configures compaction.

type CompactionPolicy ΒΆ added in v0.0.13

type CompactionPolicy = engine.CompactionPolicy

CompactionPolicy determines when to compact segments.

type DB ΒΆ added in v0.0.13

type DB struct {
	*engine.Engine
}

DB is the main entry point for the vector database. It wraps the internal engine and provides a high-level API.

func Open ΒΆ added in v0.0.13

func Open(ctx context.Context, backend Backend, opts ...Option) (*DB, error)

Open opens or creates a vector database with the given backend. The context is used for initialization I/O and can be used for timeouts.

For new indexes, use the Create option:

ctx := context.Background()
db, _ := vecgo.Open(ctx, vecgo.Local("./data"), vecgo.Create(128, vecgo.MetricL2))

For existing indexes, dimension and metric are loaded from the manifest:

db, _ := vecgo.Open(ctx, vecgo.Local("./data"))

For cloud storage:

db, _ := vecgo.Open(ctx, vecgo.Remote(s3Store))

func (*DB) BatchInsertRecords ΒΆ added in v0.0.13

func (db *DB) BatchInsertRecords(ctx context.Context, records []Record) ([]ID, error)

BatchInsertRecords inserts multiple records in a single batch.

Example:

records := []vecgo.Record{rec1, rec2, rec3}
ids, err := db.BatchInsertRecords(ctx, records)

func (*DB) InsertRecord ΒΆ added in v0.0.13

func (db *DB) InsertRecord(ctx context.Context, rec Record) (ID, error)

InsertRecord inserts a record built with the fluent RecordBuilder API.

Example:

rec := vecgo.NewRecord(vec).
    WithMetadata("category", metadata.String("tech")).
    WithPayload(jsonData).
    Build()
id, err := db.InsertRecord(ctx, rec)

func (*DB) Vacuum ΒΆ added in v0.0.13

func (db *DB) Vacuum(ctx context.Context) error

Vacuum removes old manifest versions and their orphaned segments based on the RetentionPolicy configured via WithRetentionPolicy().

This reclaims disk space from:

  • Old manifest versions (beyond KeepVersions or KeepDuration)
  • Segments no longer referenced by any retained manifest

Time-travel queries to vacuumed versions will fail.

Safe to call periodically (e.g., daily cron job). No-op if no retention policy is set.

Example:

// Configure retention when opening
policy := vecgo.RetentionPolicy{KeepVersions: 10}
db, _ := vecgo.Open(ctx, vecgo.Local("./data"), vecgo.WithRetentionPolicy(policy))

// Later, reclaim space from expired versions
err := db.Vacuum(ctx)

type FlushConfig ΒΆ added in v0.0.13

type FlushConfig = engine.FlushConfig

FlushConfig configures MemTable flushing.

type ID ΒΆ added in v0.0.13

type ID = model.ID

ID is the unique identifier for a vector.

type Metric ΒΆ added in v0.0.13

type Metric = distance.Metric

Metric defines the distance comparison type.

type MetricsObserver ΒΆ added in v0.0.13

type MetricsObserver = engine.MetricsObserver

MetricsObserver is the interface for observing engine metrics.

type Option ΒΆ added in v0.0.13

type Option = engine.Option

Option configures the engine.

func Create ΒΆ added in v0.0.13

func Create(dim int, metric Metric) Option

Create returns an Option that specifies creating a new index with the given dimension and metric. Use this when creating a new index for the first time. If omitted, vecgo will attempt to open an existing index and read dim/metric from the manifest.

func ReadOnly ΒΆ added in v0.0.13

func ReadOnly() Option

ReadOnly puts the engine in read-only mode. In this mode:

  • Insert/Delete operations return ErrReadOnly
  • No local state is required (pure memory cache)
  • Ideal for stateless serverless search nodes

Example:

db, _ := vecgo.Open(ctx, vecgo.Remote(s3Store), vecgo.ReadOnly())

func WithBlockCacheSize ΒΆ added in v0.0.13

func WithBlockCacheSize(size int64) Option

WithBlockCacheSize sets the size of the block cache in bytes.

func WithCacheDir ΒΆ added in v0.0.13

func WithCacheDir(dir string) Option

WithCacheDir sets the local directory for caching remote data. Only applicable when using Remote() backend. Defaults to os.TempDir()/vecgo-cache-<random>.

func WithCompactionConfig ΒΆ added in v0.0.13

func WithCompactionConfig(cfg CompactionConfig) Option

WithCompactionConfig sets the compaction configuration.

func WithCompactionPolicy ΒΆ added in v0.0.13

func WithCompactionPolicy(policy CompactionPolicy) Option

WithCompactionPolicy sets the compaction policy.

func WithDimension ΒΆ added in v0.0.13

func WithDimension(dim int) Option

WithDimension sets the vector dimension.

func WithFlushConfig ΒΆ added in v0.0.13

func WithFlushConfig(cfg FlushConfig) Option

WithFlushConfig sets the flush configuration.

func WithLexicalIndex ΒΆ added in v0.0.13

func WithLexicalIndex(idx lexical.Index, field string) Option

WithLexicalIndex sets the lexical index and the metadata field to index.

func WithLogger ΒΆ added in v0.0.13

func WithLogger(l *slog.Logger) Option

WithLogger sets the logger for the engine.

func WithMemoryLimit ΒΆ added in v0.0.13

func WithMemoryLimit(bytes int64) Option

WithMemoryLimit sets the memory limit for the engine in bytes. If set to 0, memory is unlimited (no backpressure). Default is 1GB.

func WithMetric ΒΆ added in v0.0.13

func WithMetric(m Metric) Option

WithMetric sets the distance metric.

func WithMetricsObserver ΒΆ added in v0.0.13

func WithMetricsObserver(observer MetricsObserver) Option

WithMetricsObserver sets the metrics observer for the engine.

func WithRetentionPolicy ΒΆ added in v0.0.13

func WithRetentionPolicy(p RetentionPolicy) Option

WithRetentionPolicy sets the retention policy for time-travel versioning. Old manifests and their orphaned segments are removed by Vacuum() based on this policy.

Example:

policy := vecgo.RetentionPolicy{
    KeepVersions: 10,                    // Keep last 10 versions
    KeepDuration: 7 * 24 * time.Hour,    // OR keep 7 days of history
}
db, _ := vecgo.Open(ctx, vecgo.Local("./data"), vecgo.WithRetentionPolicy(policy))

func WithSchema ΒΆ added in v0.0.13

func WithSchema(schema metadata.Schema) Option

WithSchema sets the metadata schema for the engine.

func WithTimestamp ΒΆ added in v0.0.13

func WithTimestamp(t time.Time) Option

WithTimestamp opens the database at the state closest to the given time (Time-Travel). This enables querying historical data states without loading the latest version.

Example:

// Query the database as it was yesterday
yesterday := time.Now().Add(-24 * time.Hour)
db, err := vecgo.Open(ctx, vecgo.Local("./data"), vecgo.WithTimestamp(yesterday))

Note: The database must have been committed at least once at or before the given timestamp. If no version exists at or before the timestamp, an error is returned.

func WithVersion ΒΆ added in v0.0.13

func WithVersion(v uint64) Option

WithVersion opens the database at a specific manifest version ID (Time-Travel). Version IDs are monotonically increasing and assigned on each Commit().

Example:

// Open at version 42
db, err := vecgo.Open(ctx, vecgo.Local("./data"), vecgo.WithVersion(42))

Use Stats().ManifestID to get the current version ID after opening.

type QuantizationType ΒΆ added in v0.0.13

type QuantizationType = quantization.Type

QuantizationType defines the vector quantization method.

type QueryStats ΒΆ added in v0.0.13

type QueryStats = model.QueryStats

QueryStats provides detailed execution statistics for a search query. Use WithStats to collect these during a search.

type Record ΒΆ added in v0.0.13

type Record = model.Record

Record represents a single item to be inserted.

type RetentionPolicy ΒΆ added in v0.0.13

type RetentionPolicy = engine.RetentionPolicy

RetentionPolicy defines rules for retaining old versions. Used with Vacuum() to control time-travel storage overhead.

type SearchOption ΒΆ added in v0.0.13

type SearchOption = func(*model.SearchOptions)

SearchOption configures a search query.

func WithFilter ΒΆ added in v0.0.13

func WithFilter(filter *metadata.FilterSet) SearchOption

WithFilter sets a typed metadata filter for the search.

func WithMetadata ΒΆ added in v0.0.15

func WithMetadata() SearchOption

WithMetadata requests metadata to be returned in the search results. Metadata IS included by default. Use WithoutData() to exclude all, then selectively include what you need with WithMetadata(), WithPayload(), etc.

func WithNProbes ΒΆ added in v0.0.13

func WithNProbes(n int) SearchOption

WithNProbes sets the number of probes for the search.

func WithPayload ΒΆ added in v0.0.15

func WithPayload() SearchOption

WithPayload requests payload to be returned in the search results. Payload IS included by default. Use WithoutData() to exclude all, then selectively include what you need with WithMetadata(), WithPayload(), etc.

func WithPreFilter ΒΆ added in v0.0.13

func WithPreFilter(preFilter bool) SearchOption

WithPreFilter forces pre-filtering (or post-filtering if false).

func WithRefineFactor ΒΆ added in v0.0.13

func WithRefineFactor(factor float32) SearchOption

WithRefineFactor sets the refinement factor for reranking.

func WithSelectivityCutoff ΒΆ added in v0.0.13

func WithSelectivityCutoff(cutoff float64) SearchOption

WithSelectivityCutoff sets the selectivity threshold above which HNSW with post-filtering is preferred over bitmap-based brute-force search.

When a filter matches more than this percentage of documents in a segment, the engine skips expensive bitmap materialization and uses HNSW traversal with post-filtering instead. This is faster because bitmap overhead dominates at high selectivity.

Default: 0.30 (30%), based on benchmark analysis. Range: 0.0 - 1.0.

  • Lower values (e.g., 0.1): more aggressive HNSW usage
  • Higher values (e.g., 0.5): more bitmap brute-force usage
  • 0: uses an adaptive heuristic based on k and absolute match count

Example:

// Use HNSW when filter matches >20% of documents
results, _ := db.Search(ctx, query, 10,
    vecgo.WithFilter(filter),
    vecgo.WithSelectivityCutoff(0.20))
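The decision rule described above reduces to a simple ratio test (an illustrative sketch, not vecgo's internal code):

```go
package main

import "fmt"

// usePostFilterHNSW mirrors the selectivity cutoff rule: when a filter
// matches more than cutoff of a segment's documents, bitmap
// materialization costs more than it saves, so HNSW traversal with
// post-filtering is preferred.
func usePostFilterHNSW(matched, total int, cutoff float64) bool {
	if total == 0 {
		return false
	}
	return float64(matched)/float64(total) > cutoff
}

func main() {
	// 45% of documents match: above the 0.30 default, prefer HNSW.
	fmt.Println(usePostFilterHNSW(450, 1000, 0.30))
	// 5% match: highly selective, bitmap brute-force wins.
	fmt.Println(usePostFilterHNSW(50, 1000, 0.30))
}
```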

func WithStats ΒΆ added in v0.0.13

func WithStats(stats *QueryStats) SearchOption

WithStats enables collection of detailed query execution statistics. Pass a pointer to a QueryStats struct that will be populated after the query. Use this for query debugging, performance analysis, and cost estimation.

Example:

stats := &vecgo.QueryStats{}
results, _ := db.Search(ctx, query, 10, vecgo.WithStats(stats))
fmt.Println(stats.Explain()) // Human-readable explanation
fmt.Printf("Distance computations: %d\n", stats.DistanceComputations)

func WithVector ΒΆ added in v0.0.13

func WithVector() SearchOption

WithVector requests the vector to be returned in the search results. Vectors are NOT included by default due to their large size.

func WithoutData ΒΆ added in v0.0.13

func WithoutData() SearchOption

WithoutData disables automatic retrieval of metadata and payload. By default, search returns metadata and payload for each result. Use this option for high-throughput scenarios where only IDs and scores are needed.

Example:

results, _ := db.Search(ctx, query, 10, vecgo.WithoutData())

type SegmentQueryStats ΒΆ added in v0.0.13

type SegmentQueryStats = model.SegmentQueryStats

SegmentQueryStats contains execution statistics for a single segment.

type VacuumStats ΒΆ added in v0.0.13

type VacuumStats = engine.VacuumStats

VacuumStats holds results of the vacuum operation.

Directories ΒΆ

Path Synopsis
Package blobstore provides storage abstraction for Vecgo's immutable segments.
minio
Package minio provides a BlobStore implementation using the MinIO client.
s3
Package s3 provides S3 implementations of the blobstore.BlobStore interface.
Package distance provides vector distance calculations with SIMD acceleration.
examples
basic command
bulk_load command
Package main demonstrates Vecgo's bulk loading capabilities.
cloud_tiered command
explain command
Package main demonstrates Vecgo's query explanation and statistics capabilities.
modern command
rag command
time_travel command
Package main demonstrates Vecgo's time-travel query capabilities.
internal
arena
Package arena provides an off-heap memory allocator for HNSW graphs.
bitmap
Package bitmap provides a best-in-class query-time bitmap engine for vector search.
bitset
Package bitset provides a lock-free segmented bitset for concurrent access.
cache
Package cache provides LRU caching for block data.
conv
Package conv provides safe integer type conversion utilities.
engine
Package engine implements the core vector database engine.
fs
Package fs provides filesystem abstractions for testability and fault injection.
hash
Package hash provides fast, hardware-accelerated hashing utilities for data integrity.
hnsw
Package hnsw implements Hierarchical Navigable Small World graphs.
kmeans
Package kmeans implements k-means clustering for quantization training.
manifest
Package manifest provides Bloom filters for fast categorical negative lookups.
mem
Package mem provides memory allocation utilities for SIMD operations.
metadata
Package imetadata provides internal metadata indexing for efficient filtering.
mmap
Package mmap provides memory-mapped file access for zero-copy I/O.
pk
Package pk provides a lock-free MVCC primary key index.
quantization
Package quantization provides vector compression techniques for memory reduction.
resource
Package resource implements the ResourceController for global limits and governance.
searcher
Package searcher provides pooled search context for zero-allocation queries.
segment
Package segment defines interfaces for immutable data segments.
segment/diskann
Package diskann implements a DiskANN/Vamana segment for graph-based ANN search.
segment/flat
Package flat implements an immutable Flat segment with brute-force search.
segment/memtable
Package memtable implements the in-memory L0 segment with HNSW indexing.
simd
Package simd provides SIMD-optimized vector operations.
simd/cmd/generator command
Package main is a code generator for vecgo SIMD functions.
vectorstore
Package vectorstore provides a canonical vector storage interface and a high-performance columnar implementation.
Package lexical defines the interface for lexical (keyword) search indexes.
bm25
Package bm25 provides a BM25-based lexical search index.
Package metadata provides efficient metadata storage and filtering for Vecgo.
Package model defines core types used throughout Vecgo.
Package testutil provides testing utilities for Vecgo.
