Documentation ¶
Overview ¶
Package quantization provides vector compression techniques for memory reduction.
Vecgo supports six quantization methods:
- Binary Quantization (BQ): 32x compression (1 bit per dimension)
- RaBitQ: ~30x compression (1 bit per dimension plus a stored norm), better recall than BQ
- Scalar Quantization (SQ8): 4x compression (1 byte per dimension)
- INT4 Quantization: 8x compression (4 bits per dimension)
- Product Quantization (PQ): 8-64x compression
- Optimized Product Quantization (OPQ): 8-64x compression with learned rotation
Architecture ¶
┌───────────────────────────────────────────────────────────────────┐
│                       Quantization Methods                        │
├──────────┬──────────┬──────────┬──────────┬──────────┬────────────┤
│    BQ    │  RaBitQ  │   SQ8    │   INT4   │    PQ    │    OPQ     │
│ (1 bit)  │ (1 bit+) │ (8 bit)  │ (4 bit)  │ (8 bit)  │ (8 bit+R)  │
├──────────┴──────────┴──────────┴──────────┴──────────┴────────────┤
│                      SIMD Distance Functions                      │
│  Hamming (POPCNT)  │  L2/Dot (AVX-512/NEON)  │  ADC Lookup       │
└───────────────────────────────────────────────────────────────────┘
Binary Quantization ¶
Compresses vectors to 1 bit per dimension using sign-based encoding:
bq := quantization.NewBinaryQuantizer(128)
code1, _ := bq.EncodeUint64(vec1)   // 128 floats → 2 uint64 words (16 bytes)
code2, _ := bq.EncodeUint64(vec2)
dist := quantization.HammingDistance(code1, code2)
Performance:
- Compression: 32x (128-dim float32: 512 bytes → 16 bytes)
- Distance: Ultra-fast (POPCNT instruction)
- Accuracy: 70-85% recall (best for pre-filtering)
RaBitQ (Randomized Binary Quantization) ¶
Improves on standard BQ by storing the vector norm and using a modified distance estimator:
rq := quantization.NewRaBitQuantizer(128)
code, _ := rq.Encode(vector)          // 128 floats → 20 bytes (16 binary + 4 norm)
dist, _ := rq.Distance(query, code)   // Norm-corrected L2 approximation
Trade-off: Slightly larger storage (+4 bytes per vector) for significantly better L2 distance approximation.
Scalar Quantization (SQ8) ¶
Compresses each dimension to 8 bits using per-dimension min/max normalization:
sq := quantization.NewScalarQuantizer(128)
sq.Train(trainingVectors)
code, _ := sq.Encode(vector)   // 128 floats → 128 bytes
dist, _ := sq.L2Distance(query, code)
Performance:
- Compression: 4x (float32 → uint8)
- Accuracy: 95-99% recall (excellent for most use cases)
INT4 Quantization ¶
Compresses each dimension to 4 bits (2 dimensions per byte):
iq := quantization.NewInt4Quantizer(128)
iq.Train(trainingVectors)
code, _ := iq.Encode(vector)   // 128 floats → 64 bytes
dist, _ := iq.L2Distance(query, code)
Performance:
- Compression: 8x (float32 → 4-bit)
- Accuracy: 90-95% recall
Product Quantization (PQ) ¶
Splits vector into subvectors and quantizes each independently using k-means:
pq, _ := quantization.NewProductQuantizer(128, 8, 256)   // dim=128, M=8, K=256
pq.Train(trainingVectors)
code, _ := pq.Encode(vector)   // 128 floats → 8 bytes
Parameters:
- dimension: Vector dimensionality (must be divisible by numSubvectors)
- numSubvectors (M): How many splits (typically 8-16)
- numCentroids (K): Codebook size per subvector (typically 256 for uint8)
Memory reduction:
- 128-dim float32 = 512 bytes
- PQ(8, 256) = 8 bytes (64x compression)
- PQ(16, 256) = 16 bytes (32x compression)
Accuracy: 90-95% recall
Optimized Product Quantization (OPQ) ¶
Learns block-diagonal rotation matrices before PQ to improve reconstruction quality:
opq, _ := quantization.NewOptimizedProductQuantizer(128, 8, 256, 10)   // 10 training iterations
opq.Train(trainingVectors)
code, _ := opq.Encode(vector)   // 128 floats → 8 bytes
Benefits vs PQ:
- 20-30% better reconstruction quality
- Same compression ratio
- Higher training cost (alternating optimization with SVD)
Accuracy: 93-97% recall
Quantization Comparison ¶
| Method | Compression | Recall  | Speed    | Use Case              |
|--------|-------------|---------|----------|-----------------------|
| None   | 1x          | 100%    | Baseline | Default               |
| Binary | 32x         | 70-85%  | Fastest  | Pre-filtering         |
| RaBitQ | ~30x        | 80-90%  | Fast     | Better BQ alternative |
| SQ8    | 4x          | 95-99%  | Fast     | General purpose       |
| INT4   | 8x          | 90-95%  | Fast     | Memory-constrained    |
| PQ     | 8-64x       | 90-95%  | Medium   | High compression      |
| OPQ    | 8-64x       | 93-97%  | Slower   | High recall + low mem |
Thread Safety ¶
All quantizers are safe for concurrent read operations (Encode, Decode, Distance) after training. Training (Train) must be single-threaded or externally synchronized.
Serialization ¶
ScalarQuantizer and Int4Quantizer implement encoding.BinaryMarshaler/BinaryUnmarshaler for persistence. ProductQuantizer uses SetCodebooks/Codebooks for manual serialization.
Index ¶
- func HammingDistance(a, b []uint64) int
- func HammingDistanceBytes(a, b []byte) int
- func NormalizedHammingDistance(a, b []uint64, dimension int) float32
- type BinaryQuantizer
- func (bq *BinaryQuantizer) BytesPerDimension() int
- func (bq *BinaryQuantizer) BytesTotal() int
- func (bq *BinaryQuantizer) CompressionRatio() float32
- func (bq *BinaryQuantizer) ComputeHammingDistance(query []float32, codes []uint64) (int, error)
- func (bq *BinaryQuantizer) Decode(b []byte) ([]float32, error)
- func (bq *BinaryQuantizer) Dimension() int
- func (bq *BinaryQuantizer) Encode(v []float32) ([]byte, error)
- func (bq *BinaryQuantizer) EncodeUint64(v []float32) ([]uint64, error)
- func (bq *BinaryQuantizer) EncodeUint64Into(dst []uint64, v []float32) error
- func (bq *BinaryQuantizer) IsTrained() bool
- func (bq *BinaryQuantizer) Threshold() float32
- func (bq *BinaryQuantizer) Train(vectors [][]float32) error
- func (bq *BinaryQuantizer) WithThreshold(threshold float32) *BinaryQuantizer
- type Int4Quantizer
- func (q *Int4Quantizer) BytesPerDimension() int
- func (q *Int4Quantizer) Decode(b []byte) ([]float32, error)
- func (q *Int4Quantizer) Encode(v []float32) ([]byte, error)
- func (q *Int4Quantizer) L2Distance(query []float32, code []byte) (float32, error)
- func (q *Int4Quantizer) L2DistanceBatch(query []float32, codes []byte, n int, out []float32) error
- func (q *Int4Quantizer) MarshalBinary() ([]byte, error)
- func (q *Int4Quantizer) Train(vectors [][]float32) error
- func (q *Int4Quantizer) UnmarshalBinary(data []byte) error
- type OptimizedProductQuantizer
- func (opq *OptimizedProductQuantizer) BytesPerVector() int
- func (opq *OptimizedProductQuantizer) CompressionRatio() float64
- func (opq *OptimizedProductQuantizer) ComputeAsymmetricDistance(query []float32, codes []byte) (float32, error)
- func (opq *OptimizedProductQuantizer) Decode(codes []byte) ([]float32, error)
- func (opq *OptimizedProductQuantizer) Encode(vec []float32) ([]byte, error)
- func (opq *OptimizedProductQuantizer) IsTrained() bool
- func (opq *OptimizedProductQuantizer) Train(vectors [][]float32) error
- type ProductQuantizer
- func (pq *ProductQuantizer) AdcDistance(table []float32, codes []byte) (float32, error)
- func (pq *ProductQuantizer) BuildDistanceTable(query []float32) ([]float32, error)
- func (pq *ProductQuantizer) BytesPerVector() int
- func (pq *ProductQuantizer) Codebooks() ([]int8, []float32, []float32)
- func (pq *ProductQuantizer) CompressionRatio() float64
- func (pq *ProductQuantizer) ComputeAsymmetricDistance(query []float32, codes []byte) (float32, error)
- func (pq *ProductQuantizer) Decode(codes []byte) ([]float32, error)
- func (pq *ProductQuantizer) Encode(vec []float32) ([]byte, error)
- func (pq *ProductQuantizer) IsTrained() bool
- func (pq *ProductQuantizer) NumCentroids() int
- func (pq *ProductQuantizer) NumSubvectors() int
- func (pq *ProductQuantizer) SetCodebooks(codebooks []int8, scales, offsets []float32)
- func (pq *ProductQuantizer) Train(vectors [][]float32) error
- type Quantizer
- type RaBitQuantizer
- func (rq *RaBitQuantizer) BytesPerDimension() int
- func (rq *RaBitQuantizer) BytesTotal() int
- func (rq *RaBitQuantizer) Decode(b []byte) ([]float32, error)
- func (rq *RaBitQuantizer) Distance(query []float32, code []byte) (float32, error)
- func (rq *RaBitQuantizer) Encode(v []float32) ([]byte, error)
- func (rq *RaBitQuantizer) Train(vectors [][]float32) error
- type ScalarQuantizer
- func (sq *ScalarQuantizer) BytesPerDimension() int
- func (sq *ScalarQuantizer) CompressionRatio() float64
- func (sq *ScalarQuantizer) Decode(b []byte) ([]float32, error)
- func (sq *ScalarQuantizer) DecodeInto(b []byte, dst []float32)
- func (sq *ScalarQuantizer) DotProduct(q []float32, code []byte) (float32, error)
- func (sq *ScalarQuantizer) Encode(v []float32) ([]byte, error)
- func (sq *ScalarQuantizer) EncodeInto(v []float32, dst []byte)
- func (sq *ScalarQuantizer) L2Distance(q []float32, code []byte) (float32, error)
- func (sq *ScalarQuantizer) L2DistanceBatch(q []float32, codes []byte, n int, out []float32) error
- func (sq *ScalarQuantizer) MarshalBinary() ([]byte, error)
- func (sq *ScalarQuantizer) Max(dim int) float32
- func (sq *ScalarQuantizer) Maxs() []float32
- func (sq *ScalarQuantizer) Min(dim int) float32
- func (sq *ScalarQuantizer) Mins() []float32
- func (sq *ScalarQuantizer) QuantizationError() float32
- func (sq *ScalarQuantizer) SetBounds(mins, maxs []float32) error
- func (sq *ScalarQuantizer) Train(vectors [][]float32) error
- func (sq *ScalarQuantizer) UnmarshalBinary(data []byte) error
- type Type
Constants ¶
This section is empty.
Variables ¶
This section is empty.
Functions ¶
func HammingDistance ¶
HammingDistance computes the Hamming distance between two binary-encoded vectors. This counts the number of bit positions where the vectors differ. Uses POPCNT (population count) for maximum performance.
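The computation can be sketched in a few lines with the standard library's math/bits package, whose OnesCount64 compiles to a single POPCNT instruction on amd64 (hammingDistance here is a self-contained illustration, not the package's implementation):

```go
package main

import (
	"fmt"
	"math/bits"
)

// hammingDistance counts the bit positions where two packed binary codes
// differ. XOR sets a bit wherever the codes disagree; OnesCount64 counts
// the set bits (POPCNT).
func hammingDistance(a, b []uint64) int {
	d := 0
	for i := range a {
		d += bits.OnesCount64(a[i] ^ b[i])
	}
	return d
}

func main() {
	a := []uint64{0b1010, 0}
	b := []uint64{0b0110, 1}
	fmt.Println(hammingDistance(a, b)) // word 0 differs in 2 bits, word 1 in 1 bit → 3
}
```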
func HammingDistanceBytes ¶
HammingDistanceBytes computes the Hamming distance between two byte slices. This is a convenience wrapper that converts bytes to uint64 for POPCNT.
func NormalizedHammingDistance ¶
NormalizedHammingDistance returns Hamming distance normalized to [0, 1]. This is useful for comparing with other distance metrics.
Types ¶
type BinaryQuantizer ¶
type BinaryQuantizer struct {
// contains filtered or unexported fields
}
BinaryQuantizer implements binary quantization (1-bit per dimension). It compresses float32 vectors (4 bytes/dim) to bits (0.125 bytes/dim) for 32x memory savings.
Binary quantization uses a simple threshold: values >= threshold become 1, otherwise 0. Distance is computed using Hamming distance (popcount of XOR), which is extremely fast on modern CPUs using the POPCNT instruction.
Trade-offs:
- 32x compression ratio (vs float32)
- Very fast distance computation (Hamming via POPCNT)
- Significant accuracy loss for fine-grained similarity
- Best used for coarse filtering or with reranking
func NewBinaryQuantizer ¶
func NewBinaryQuantizer(dimension int) *BinaryQuantizer
NewBinaryQuantizer creates a new binary quantizer for the given dimension. The default threshold is 0.0 (sign-based quantization).
func (*BinaryQuantizer) BytesPerDimension ¶
func (bq *BinaryQuantizer) BytesPerDimension() int
BytesPerDimension returns the storage size per dimension. For binary quantization, this is effectively 1 bit (0.125 bytes). This method returns 0 to indicate sub-byte storage, but BytesTotal() should be used for actual size.
func (*BinaryQuantizer) BytesTotal ¶
func (bq *BinaryQuantizer) BytesTotal() int
BytesTotal returns the total storage size for a vector.
func (*BinaryQuantizer) CompressionRatio ¶
func (bq *BinaryQuantizer) CompressionRatio() float32
CompressionRatio returns the compression ratio vs float32 storage. For binary quantization, this is always 32x.
func (*BinaryQuantizer) ComputeHammingDistance ¶
func (bq *BinaryQuantizer) ComputeHammingDistance(query []float32, codes []uint64) (int, error)
ComputeHammingDistance computes the Hamming distance between a float32 query and binary codes. It uses a pooled buffer for the quantized query to avoid allocations.
func (*BinaryQuantizer) Decode ¶
func (bq *BinaryQuantizer) Decode(b []byte) ([]float32, error)
Decode reconstructs a float32 vector from binary representation. Note: This is a lossy reconstruction - values are either threshold-0.5 or threshold+0.5.
func (*BinaryQuantizer) Dimension ¶
func (bq *BinaryQuantizer) Dimension() int
Dimension returns the expected vector dimension.
func (*BinaryQuantizer) Encode ¶
func (bq *BinaryQuantizer) Encode(v []float32) ([]byte, error)
Encode quantizes a float32 vector to binary representation. Each dimension is converted to a single bit based on the threshold. The result is packed into uint64 words for efficient storage and POPCNT operations.
Storage format: ceil(dimension / 64) uint64 words, little-endian bit packing.
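The packing scheme described above can be sketched as follows (packBits is an illustrative helper, not the package's code; it follows the documented layout of ceil(dim/64) words with little-endian bit order):

```go
package main

import "fmt"

// packBits encodes v at 1 bit per dimension: bit (i mod 64) of word i/64
// is set when v[i] >= threshold.
func packBits(v []float32, threshold float32) []uint64 {
	words := make([]uint64, (len(v)+63)/64)
	for i, x := range v {
		if x >= threshold {
			words[i/64] |= 1 << (uint(i) % 64)
		}
	}
	return words
}

func main() {
	code := packBits([]float32{0.5, -0.2, 0.1, -0.9}, 0)
	fmt.Printf("%04b\n", code[0]) // dimensions 0 and 2 are >= 0 → 0101
}
```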
func (*BinaryQuantizer) EncodeUint64 ¶
func (bq *BinaryQuantizer) EncodeUint64(v []float32) ([]uint64, error)
EncodeUint64 quantizes a float32 vector to packed uint64 words. This is more efficient for distance computation as it avoids byte-to-uint64 conversion.
func (*BinaryQuantizer) EncodeUint64Into ¶
func (bq *BinaryQuantizer) EncodeUint64Into(dst []uint64, v []float32) error
EncodeUint64Into quantizes a float32 vector into an existing uint64 slice. The destination slice must be large enough to hold the quantized vector.
func (*BinaryQuantizer) IsTrained ¶
func (bq *BinaryQuantizer) IsTrained() bool
IsTrained returns whether the quantizer has been trained.
func (*BinaryQuantizer) Threshold ¶
func (bq *BinaryQuantizer) Threshold() float32
Threshold returns the current threshold value.
func (*BinaryQuantizer) Train ¶
func (bq *BinaryQuantizer) Train(vectors [][]float32) error
Train calibrates the quantizer by computing the mean value across all vectors. The mean is used as the threshold for binary encoding.
func (*BinaryQuantizer) WithThreshold ¶
func (bq *BinaryQuantizer) WithThreshold(threshold float32) *BinaryQuantizer
WithThreshold sets a custom threshold for binary encoding. Values >= threshold become 1, values < threshold become 0.
type Int4Quantizer ¶
type Int4Quantizer struct {
// contains filtered or unexported fields
}
Int4Quantizer implements 4-bit scalar quantization. It compresses float32 vectors (4 bytes/dim) to 4 bits (0.5 byte/dim) for 8x memory savings. Two dimensions are packed into a single byte.
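The two-dimensions-per-byte packing can be illustrated with nibble operations (this sketch assumes even-index codes in the low nibble and odd-index codes in the high nibble; the package's actual nibble order is not specified here):

```go
package main

import "fmt"

// packNibbles stores two 4-bit codes per byte: code 2i in the low nibble,
// code 2i+1 in the high nibble.
func packNibbles(codes []uint8) []byte {
	out := make([]byte, (len(codes)+1)/2)
	for i, c := range codes {
		if i%2 == 0 {
			out[i/2] = c & 0x0F
		} else {
			out[i/2] |= (c & 0x0F) << 4
		}
	}
	return out
}

// unpackNibble recovers the 4-bit code for dimension i.
func unpackNibble(packed []byte, i int) uint8 {
	if i%2 == 0 {
		return packed[i/2] & 0x0F
	}
	return packed[i/2] >> 4
}

func main() {
	p := packNibbles([]uint8{3, 12, 7}) // 3 codes fit in 2 bytes
	fmt.Println(len(p), unpackNibble(p, 1))
}
```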
func NewInt4Quantizer ¶
func NewInt4Quantizer(dim int) *Int4Quantizer
NewInt4Quantizer creates a new Int4Quantizer.
func (*Int4Quantizer) BytesPerDimension ¶
func (q *Int4Quantizer) BytesPerDimension() int
func (*Int4Quantizer) Decode ¶
func (q *Int4Quantizer) Decode(b []byte) ([]float32, error)
Decode reconstructs the vector.
func (*Int4Quantizer) Encode ¶
func (q *Int4Quantizer) Encode(v []float32) ([]byte, error)
Encode quantizes a vector to 4-bit packed bytes.
func (*Int4Quantizer) L2Distance ¶
func (q *Int4Quantizer) L2Distance(query []float32, code []byte) (float32, error)
L2Distance computes squared L2 distance between query and compressed vector. Uses SIMD-optimized precomputed lookup tables for fast distance calculation.
func (*Int4Quantizer) L2DistanceBatch ¶
func (q *Int4Quantizer) L2DistanceBatch(query []float32, codes []byte, n int, out []float32) error
L2DistanceBatch computes squared L2 distance for a batch of codes. Uses SIMD-optimized batch computation for better cache locality.
func (*Int4Quantizer) MarshalBinary ¶
func (q *Int4Quantizer) MarshalBinary() ([]byte, error)
MarshalBinary serializes the quantizer state.
func (*Int4Quantizer) Train ¶
func (q *Int4Quantizer) Train(vectors [][]float32) error
Train calculates min/max ranges for quantization.
func (*Int4Quantizer) UnmarshalBinary ¶
func (q *Int4Quantizer) UnmarshalBinary(data []byte) error
UnmarshalBinary deserializes the quantizer state.
type OptimizedProductQuantizer ¶
type OptimizedProductQuantizer struct {
// contains filtered or unexported fields
}
OptimizedProductQuantizer implements Block Optimized Product Quantization (OPQ). It splits vectors into subspaces (blocks) and learns an optimal rotation for each block to minimize quantization error. This is a "Parametric" OPQ implementation where the rotation matrix is constrained to be block-diagonal.
func NewOptimizedProductQuantizer ¶
func NewOptimizedProductQuantizer(dimension, numSubvectors, numCentroids, numIterations int) (*OptimizedProductQuantizer, error)
NewOptimizedProductQuantizer creates a new OPQ quantizer. It automatically selects a block size for the rotation matrices.
func (*OptimizedProductQuantizer) BytesPerVector ¶
func (opq *OptimizedProductQuantizer) BytesPerVector() int
BytesPerVector returns the compressed size per vector in bytes.
func (*OptimizedProductQuantizer) CompressionRatio ¶
func (opq *OptimizedProductQuantizer) CompressionRatio() float64
CompressionRatio returns the theoretical compression ratio.
func (*OptimizedProductQuantizer) ComputeAsymmetricDistance ¶
func (opq *OptimizedProductQuantizer) ComputeAsymmetricDistance(query []float32, codes []byte) (float32, error)
ComputeAsymmetricDistance computes distance between a query and OPQ codes.
func (*OptimizedProductQuantizer) Decode ¶
func (opq *OptimizedProductQuantizer) Decode(codes []byte) ([]float32, error)
Decode reconstructs a vector.
func (*OptimizedProductQuantizer) Encode ¶
func (opq *OptimizedProductQuantizer) Encode(vec []float32) ([]byte, error)
Encode quantizes a vector using OPQ.
func (*OptimizedProductQuantizer) IsTrained ¶
func (opq *OptimizedProductQuantizer) IsTrained() bool
IsTrained returns true if trained.
func (*OptimizedProductQuantizer) Train ¶
func (opq *OptimizedProductQuantizer) Train(vectors [][]float32) error
Train calibrates the OPQ quantizer using alternating optimization.
type ProductQuantizer ¶
type ProductQuantizer struct {
// contains filtered or unexported fields
}
ProductQuantizer implements Product Quantization (PQ) for 8-64x compression. PQ splits vectors into subvectors and quantizes each independently using k-means clustering.
Example: 128-dim vector with M=8 subvectors → 8 uint8 codes = 8 bytes (64x compression vs the 512-byte float32 original)
func NewProductQuantizer ¶
func NewProductQuantizer(dimension, numSubvectors, numCentroids int) (*ProductQuantizer, error)
NewProductQuantizer creates a new PQ quantizer. Parameters:
- dimension: Vector dimensionality (must be divisible by numSubvectors)
- numSubvectors: Number of subvectors to split into (M, typically 8, 16, or 32)
- numCentroids: Number of centroids per subspace (K, typically 256 for uint8 codes)
func (*ProductQuantizer) AdcDistance ¶
func (pq *ProductQuantizer) AdcDistance(table []float32, codes []byte) (float32, error)
AdcDistance computes the approximate distance between a query (represented by the distance table) and a quantized vector (represented by codes).
func (*ProductQuantizer) BuildDistanceTable ¶
func (pq *ProductQuantizer) BuildDistanceTable(query []float32) ([]float32, error)
BuildDistanceTable precomputes distances from a query to all centroids. Returns a flattened table of size M * K where table[m*K + k] is the squared distance from query subvector m to centroid k. This enables fast ADC computation using SIMD.
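The table layout and the subsequent lookup can be illustrated with a toy example (adcDistance is a hypothetical helper mirroring what AdcDistance does with the table returned by BuildDistanceTable):

```go
package main

import "fmt"

// adcDistance sums precomputed query-to-centroid distances: for each of
// the M subvectors, codes[m] selects one of the K entries in that
// subvector's row of the flattened M*K table.
func adcDistance(table []float32, codes []byte, K int) float32 {
	var sum float32
	for m, c := range codes {
		sum += table[m*K+int(c)]
	}
	return sum
}

func main() {
	// Toy table: M=2 subvectors, K=4 centroids each.
	table := []float32{
		0.1, 0.2, 0.3, 0.4, // distances from query subvector 0 to its centroids
		1.0, 2.0, 3.0, 4.0, // distances from query subvector 1 to its centroids
	}
	fmt.Println(adcDistance(table, []byte{2, 0}, 4)) // table[2] + table[4] = 0.3 + 1.0
}
```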
func (*ProductQuantizer) BytesPerVector ¶
func (pq *ProductQuantizer) BytesPerVector() int
BytesPerVector returns the compressed size per vector in bytes.
func (*ProductQuantizer) Codebooks ¶
func (pq *ProductQuantizer) Codebooks() ([]int8, []float32, []float32)
Codebooks returns the PQ codebooks and quantization parameters. Returns flat codebooks (M * K * subvectorDim), scales (M), and offsets (M).
func (*ProductQuantizer) CompressionRatio ¶
func (pq *ProductQuantizer) CompressionRatio() float64
CompressionRatio returns the theoretical compression ratio.
func (*ProductQuantizer) ComputeAsymmetricDistance ¶
func (pq *ProductQuantizer) ComputeAsymmetricDistance(query []float32, codes []byte) (float32, error)
ComputeAsymmetricDistance computes distance between a query vector and PQ codes. This is asymmetric distance computation (ADC) - query is full precision, database is quantized. Much faster than decoding and computing full distance.
func (*ProductQuantizer) Decode ¶
func (pq *ProductQuantizer) Decode(codes []byte) ([]float32, error)
Decode reconstructs an approximate vector from PQ codes.
func (*ProductQuantizer) Encode ¶
func (pq *ProductQuantizer) Encode(vec []float32) ([]byte, error)
Encode quantizes a vector into PQ codes. Returns M uint8 codes (one per subvector).
func (*ProductQuantizer) IsTrained ¶
func (pq *ProductQuantizer) IsTrained() bool
IsTrained returns whether the quantizer has been trained.
func (*ProductQuantizer) NumCentroids ¶
func (pq *ProductQuantizer) NumCentroids() int
NumCentroids returns the number of centroids per subspace (K).
func (*ProductQuantizer) NumSubvectors ¶
func (pq *ProductQuantizer) NumSubvectors() int
NumSubvectors returns the number of subvectors (M).
func (*ProductQuantizer) SetCodebooks ¶
func (pq *ProductQuantizer) SetCodebooks(codebooks []int8, scales, offsets []float32)
SetCodebooks sets the PQ codebooks directly (for loading from disk).
func (*ProductQuantizer) Train ¶
func (pq *ProductQuantizer) Train(vectors [][]float32) error
Train calibrates the PQ quantizer using k-means clustering on training vectors. This must be called before Encode/Decode.
type Quantizer ¶
type Quantizer interface {
// Encode quantizes a float32 vector to compressed representation
Encode(v []float32) ([]byte, error)
// Decode reconstructs a float32 vector from quantized representation
Decode(b []byte) ([]float32, error)
// Train calibrates the quantizer on a set of vectors (optional for some quantizers)
Train(vectors [][]float32) error
// BytesPerDimension returns the storage size per dimension
BytesPerDimension() int
}
Quantizer defines the interface for vector quantization methods.
type RaBitQuantizer ¶
type RaBitQuantizer struct {
// contains filtered or unexported fields
}
RaBitQuantizer implements Randomized Binary Quantization (RaBitQ). It improves upon standard Binary Quantization by preserving vector magnitudes and correcting distance estimates.
Reference: "RaBitQ: Quantizing High-Dimensional Vectors with a Single Bit per Dimension"
The basic idea is:

    L2^2(x, y) = ||x||^2 + ||y||^2 - 2<x, y>

We approximate <x, y> using the Hamming distance of the binary codes and the stored norms:

    <x, y> ≈ (||x|| * ||y|| / Dim) * (Dim - 2 * Hamming(Bx, By))
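A minimal self-contained sketch of this estimator (estimateL2Sq is a hypothetical helper; the package's Distance method additionally quantizes the query on the fly rather than taking two precomputed codes):

```go
package main

import (
	"fmt"
	"math/bits"
)

// estimateL2Sq applies the estimator above: the bit agreement term
// (Dim - 2*Hamming) scaled by the stored norms approximates <x, y>,
// which plugs into the L2 expansion ||x||^2 + ||y||^2 - 2<x, y>.
func estimateL2Sq(bx, by []uint64, normX, normY float32, dim int) float32 {
	h := 0
	for i := range bx {
		h += bits.OnesCount64(bx[i] ^ by[i])
	}
	dot := normX * normY / float32(dim) * float32(dim-2*h)
	return normX*normX + normY*normY - 2*dot
}

func main() {
	// Identical codes → Hamming 0 → estimated dot = ||x||*||y||, so the
	// estimate collapses to (||x|| - ||y||)^2, here 0.
	fmt.Println(estimateL2Sq([]uint64{0b1011}, []uint64{0b1011}, 2, 2, 4))
}
```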
Storage format: [Binary Code (Dim/8 bytes)] + [Norm (4 bytes float32)]
func NewRaBitQuantizer ¶
func NewRaBitQuantizer(dimension int) *RaBitQuantizer
NewRaBitQuantizer creates a new RaBitQ quantizer.
func (*RaBitQuantizer) BytesPerDimension ¶
func (rq *RaBitQuantizer) BytesPerDimension() int
func (*RaBitQuantizer) BytesTotal ¶
func (rq *RaBitQuantizer) BytesTotal() int
func (*RaBitQuantizer) Decode ¶
func (rq *RaBitQuantizer) Decode(b []byte) ([]float32, error)
Decode reconstructs a float32 vector (approximate). It restores magnitude but direction is quantized to binary hypercube vertices.
func (*RaBitQuantizer) Distance ¶
func (rq *RaBitQuantizer) Distance(query []float32, code []byte) (float32, error)
Distance computes the approximate L2 distance squared between a query vector and a quantized vector.
func (*RaBitQuantizer) Encode ¶
func (rq *RaBitQuantizer) Encode(v []float32) ([]byte, error)
Encode quantizes a float32 vector to RaBitQ representation. Returns binary code followed by float32 norm (little endian).
func (*RaBitQuantizer) Train ¶
func (rq *RaBitQuantizer) Train(vectors [][]float32) error
Train is a no-op for basic RaBitQ (reserved for future rotation learning).
type ScalarQuantizer ¶
type ScalarQuantizer struct {
// contains filtered or unexported fields
}
ScalarQuantizer implements 8-bit scalar quantization. It compresses float32 vectors (4 bytes/dim) to uint8 (1 byte/dim) for 4x memory savings.
This implementation uses per-dimension min/max values to maximize precision, which significantly improves recall compared to global min/max.
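The per-dimension mapping can be sketched with hypothetical helpers (quantizeDim and dequantizeDim are illustrative only; the package's Encode applies this to every dimension using the bounds found by Train):

```go
package main

import "fmt"

// quantizeDim linearly maps x from [lo, hi] to a uint8 code in [0, 255],
// rounding to the nearest code and clamping out-of-range values.
func quantizeDim(x, lo, hi float32) uint8 {
	if hi <= lo {
		return 0
	}
	t := (x-lo)/(hi-lo)*255 + 0.5 // +0.5 rounds on truncation
	if t < 0 {
		t = 0
	} else if t > 255 {
		t = 255
	}
	return uint8(t)
}

// dequantizeDim maps a code back to the midpoint-free linear estimate.
func dequantizeDim(c uint8, lo, hi float32) float32 {
	return lo + float32(c)/255*(hi-lo)
}

func main() {
	c := quantizeDim(0.5, 0, 1)
	fmt.Println(c)                      // 128
	fmt.Println(dequantizeDim(c, 0, 1)) // ≈ 0.502: small, bounded rounding error
}
```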
func NewScalarQuantizer ¶
func NewScalarQuantizer(dimension int) *ScalarQuantizer
NewScalarQuantizer creates a new 8-bit scalar quantizer for the given dimension.
func (*ScalarQuantizer) BytesPerDimension ¶
func (sq *ScalarQuantizer) BytesPerDimension() int
BytesPerDimension returns 1 (uint8 storage).
func (*ScalarQuantizer) CompressionRatio ¶
func (sq *ScalarQuantizer) CompressionRatio() float64
CompressionRatio returns the memory compression ratio (always 4.0 for 8-bit quantization).
func (*ScalarQuantizer) Decode ¶
func (sq *ScalarQuantizer) Decode(b []byte) ([]float32, error)
Decode reconstructs a float32 vector from quantized representation.
func (*ScalarQuantizer) DecodeInto ¶
func (sq *ScalarQuantizer) DecodeInto(b []byte, dst []float32)
DecodeInto reconstructs a float32 vector into a pre-allocated slice. dst must have length >= len(b). Caller is responsible for dimension validation. This is the zero-allocation hot path for batch decoding.
func (*ScalarQuantizer) DotProduct ¶
func (sq *ScalarQuantizer) DotProduct(q []float32, code []byte) (float32, error)
DotProduct computes the dot product between a float32 query and a quantized vector.
func (*ScalarQuantizer) Encode ¶
func (sq *ScalarQuantizer) Encode(v []float32) ([]byte, error)
Encode quantizes a float32 vector to 8-bit representation. Each dimension is linearly mapped from [min, max] to [0, 255].
func (*ScalarQuantizer) EncodeInto ¶
func (sq *ScalarQuantizer) EncodeInto(v []float32, dst []byte)
EncodeInto quantizes a float32 vector into a pre-allocated byte slice. dst must have length >= len(v). Caller is responsible for dimension validation. This is the zero-allocation hot path for batch encoding.
func (*ScalarQuantizer) L2Distance ¶
func (sq *ScalarQuantizer) L2Distance(q []float32, code []byte) (float32, error)
L2Distance computes the squared L2 distance between a float32 query and a quantized vector.
func (*ScalarQuantizer) L2DistanceBatch ¶
func (sq *ScalarQuantizer) L2DistanceBatch(q []float32, codes []byte, n int, out []float32) error
L2DistanceBatch computes the squared L2 distance between a float32 query and a batch of quantized vectors.
func (*ScalarQuantizer) MarshalBinary ¶
func (sq *ScalarQuantizer) MarshalBinary() ([]byte, error)
MarshalBinary implements encoding.BinaryMarshaler. Format (little-endian): [dimension:uint32] [min_0:float32][max_0:float32]...[min_n:float32][max_n:float32]
func (*ScalarQuantizer) Max ¶
func (sq *ScalarQuantizer) Max(dim int) float32
Max returns the maximum value used for quantization for a specific dimension.
func (*ScalarQuantizer) Maxs ¶
func (sq *ScalarQuantizer) Maxs() []float32
Maxs returns the per-dimension maximum values.
func (*ScalarQuantizer) Min ¶
func (sq *ScalarQuantizer) Min(dim int) float32
Min returns the minimum value used for quantization for a specific dimension.
func (*ScalarQuantizer) Mins ¶
func (sq *ScalarQuantizer) Mins() []float32
Mins returns the per-dimension minimum values.
func (*ScalarQuantizer) QuantizationError ¶
func (sq *ScalarQuantizer) QuantizationError() float32
QuantizationError estimates the average quantization error per dimension. This is a theoretical lower bound assuming uniform distribution.
func (*ScalarQuantizer) SetBounds ¶
func (sq *ScalarQuantizer) SetBounds(mins, maxs []float32) error
SetBounds initializes the quantizer with pre-computed bounds.
func (*ScalarQuantizer) Train ¶
func (sq *ScalarQuantizer) Train(vectors [][]float32) error
Train calibrates the quantizer by finding min/max values per dimension across all vectors.
func (*ScalarQuantizer) UnmarshalBinary ¶
func (sq *ScalarQuantizer) UnmarshalBinary(data []byte) error
UnmarshalBinary implements encoding.BinaryUnmarshaler.