Learn how to build a custom retrieval-augmented generation (RAG) system for code using vector databases like LanceDB, embeddings models like voyage-code-3, and chunking strategies for better code search and context retrieval.
We recommend using `voyage-code-3`, which will give the most accurate results of any existing embeddings model for code. You can obtain an API key here. Because their API is OpenAI-compatible, you can use any OpenAI client by swapping out the base URL.
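As a minimal sketch of what this looks like, here is how you might request embeddings through the OpenAI Python client. The base URL is an assumption based on the OpenAI-compatibility note above; check Voyage AI's documentation for the current endpoint.

```python
# Sketch: generating code embeddings with voyage-code-3 through an
# OpenAI-compatible client. The base_url is an assumption; confirm the
# current endpoint in Voyage AI's documentation.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_VOYAGE_API_KEY",
    base_url="https://api.voyageai.com/v1",  # assumed OpenAI-compatible endpoint
)

response = client.embeddings.create(
    model="voyage-code-3",
    input=["def binary_search(arr, target):\n    ..."],
)
embedding = response.data[0].embedding  # one vector per input string
```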
If you are using `voyage-code-3`, it has a maximum context length of 16,000 tokens, which is enough to fit most files. This means that in the beginning you can get away with the naive strategy of truncating files that exceed the limit. In order of easiest to most comprehensive, 3 chunking strategies you can use are:

1. Truncate the file when it goes over the context length
2. Split the file into chunks of a fixed length (for example, a fixed number of lines)
3. Use a recursive, abstract syntax tree (AST)-based chunking strategy
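To make the first two strategies concrete, here is a hedged sketch. It approximates token counts with a characters-per-token heuristic rather than a real tokenizer, and the chunk size is an arbitrary choice; a production implementation should count tokens with the embeddings model's own tokenizer.

```python
# Sketch of strategies 1 and 2: truncation and fixed-length chunking.
# Token counts are approximated by characters; use a real tokenizer in practice.
MAX_TOKENS = 16_000
CHARS_PER_TOKEN = 4  # rough heuristic for code

def truncate(file_contents: str) -> str:
    """Strategy 1: keep only what fits in the context window (1 chunk per file)."""
    return file_contents[: MAX_TOKENS * CHARS_PER_TOKEN]

def chunk_fixed(file_contents: str, chunk_tokens: int = 512) -> list[str]:
    """Strategy 2: split the file into fixed-size chunks."""
    size = chunk_tokens * CHARS_PER_TOKEN
    return [
        file_contents[i : i + size]
        for i in range(0, len(file_contents), size)
    ]
```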
To use your custom retrieval endpoint, add it to the `contextProviders` array in your configuration:
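For example, using the built-in HTTP context provider to call your own server (the URL below is a placeholder, and this entry is a sketch; check the context provider reference for the full set of supported params):

```json
{
  "contextProviders": [
    {
      "name": "http",
      "params": {
        "url": "https://myserver.com/retrieve"
      }
    }
  ]
}
```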
For reranking, we recommend the `rerank-2` model from Voyage AI, which has usage examples here.
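As a sketch of how reranking fits in, here is an example using Voyage AI's official Python client. This assumes the `voyageai` package and its `rerank` method; verify the current API in their documentation.

```python
# Sketch: re-ranking retrieved chunks with Voyage AI's rerank-2 model.
# Assumes the `voyageai` Python package; check Voyage's docs for the API.
import voyageai

vo = voyageai.Client(api_key="YOUR_VOYAGE_API_KEY")

query = "How is authentication middleware configured?"
documents = ["chunk one ...", "chunk two ...", "chunk three ..."]

# Re-rank the candidate chunks returned by your vector search,
# keeping only the most relevant ones for the final context.
reranking = vo.rerank(query, documents, model="rerank-2", top_k=2)
for result in reranking.results:
    # Each result carries the original document and a relevance score
    print(result.relevance_score, result.document[:60])
```

A common pattern is to over-retrieve from the vector database (say, the top 50 chunks) and then use the reranker to select a much smaller set to include in the prompt.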