Vector Annotation Databases (VAD)
An Architecture for Auditable Semantic Retrieval
Ian M. Tepoot • Crafted Logic Lab, Vancouver, BC (Canada)
ORCID: 0009-0004-9067-8049 • DOI: 10.5281/zenodo.20498724
ABSTRACT
The term vector database has, in industry practice, come to denote a specific architectural pattern in which high-dimensional embeddings serve as the primary lookup mechanism for stored data. We term this pattern a Vector Index Database (VID). While effective for similarity search, this architecture embeds systemic limitations: retrieval logic is opaque and resistant to audit, the index is coupled to a specific embedding model, and migration between models requires full re-indexing with no guarantee of functional equivalence. This paper proposes an alternative architecture, the Vector Annotation Database (VAD), which inverts the VID relationship. In a VAD, explicit data objects with deterministic semantic metadata form the stable retrieval core; embeddings serve as replaceable annotation layers that enrich rather than constitute the indexing surface. A two-stage retrieval pattern first applies Boolean narrowing over named fields, then applies embedding-based similarity refinement within the narrowed set.
We introduce the audit block, a write-time documentation artifact that closes the chain between opaque vector similarity and regulatory requirements for traceable, challengeable retrieval. The architecture yields stable re-indexing with persistent relational mappings, native multi-embedding concurrency, and operational fallback; the retrieval surface functions independently of the embedding layer. The VAD architecture is grounded in two Hephaestic engineering principles (Tepoot, 2026): Semantic Encoding Density, which demonstrates that natural language tokens are high-compression semantic addresses rather than a lossy interface, and the Semantic Interchange Property, which establishes semantic data portability across transformer architectures. We survey existing vector database products (including Chroma, Pinecone, Weaviate, Milvus, Qdrant, and Vespa) and identify the gap: no current system treats embeddings as systematically replaceable annotations with deterministic traversal, auditability, and migration semantics. VAD names and systematizes this contract. This paper describes technology that is the subject of U.S. Patent Application 64/077,244 (Tepoot, 2026a).
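As an illustration only, the two-stage retrieval pattern and the operational fallback described in the abstract might be sketched as follows. This is a minimal sketch under stated assumptions, not the paper's implementation: the names `Record`, `retrieve`, `RECORDS`, the field names, and the model labels `model_a` and `model_b` are all hypothetical, and plain cosine similarity stands in for whatever refinement measure a real system would use.

```python
from dataclasses import dataclass, field
from math import sqrt


@dataclass
class Record:
    """Hypothetical explicit data object: named fields form the stable core."""
    record_id: str
    fields: dict
    # Replaceable annotation layers, keyed by a hypothetical embedding-model name.
    annotations: dict = field(default_factory=dict)


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(records, filters, query_vec=None, model=None, top_k=3):
    # Stage 1: deterministic Boolean narrowing (AND over named fields).
    narrowed = [
        r for r in records
        if all(r.fields.get(name) == value for name, value in filters.items())
    ]
    # Stage 2 only considers records that carry an annotation for this model.
    annotated = [r for r in narrowed if model in r.annotations]
    # Fallback: with no usable embedding layer, return the narrowed set
    # in a stable, deterministic order instead of failing.
    if query_vec is None or not annotated:
        return sorted(narrowed, key=lambda r: r.record_id)[:top_k]
    # Stage 2: similarity refinement inside the narrowed set.
    ranked = sorted(
        annotated,
        key=lambda r: cosine(query_vec, r.annotations[model]),
        reverse=True,
    )
    return ranked[:top_k]


# Hypothetical data: record "a" carries annotations from two models at once.
RECORDS = [
    Record("a", {"type": "policy", "region": "CA"},
           {"model_a": [1.0, 0.0], "model_b": [0.0, 1.0, 0.0]}),
    Record("b", {"type": "policy", "region": "CA"}, {"model_a": [0.0, 1.0]}),
    Record("c", {"type": "memo", "region": "CA"}, {"model_a": [1.0, 0.0]}),
]

hits = retrieve(RECORDS, {"type": "policy", "region": "CA"}, [0.9, 0.1], "model_a")
fallback = retrieve(RECORDS, {"type": "policy"})
```

In this sketch, dropping or swapping an entry in `annotations` leaves the Stage 1 result unchanged, which is the sense in which the embedding layer is treated as an annotation rather than as the index.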
Tepoot, I. (2026). Vector Annotation Databases: An Architecture for Auditable Semantic Retrieval. Zenodo. https://doi.org/10.5281/zenodo.20498724