In the Pickaxe implementation of AI and Knowledge Bases, to what extent will responses be drawn from the Knowledge Base vs. the broad knowledge gained in the LLM's prior training?
If a prompt instructs a Pickaxe to use only its Knowledge Base when formulating responses, will it obey?
The model will ALWAYS draw from its training data; it has no way not to. Every word an LLM generates is sampled from a probability distribution learned from that training data. A prompt that says "use only the Knowledge Base" can therefore steer the model heavily toward the retrieved material, but it cannot switch the training data off. If you're interested in how this process works, you can watch this layman's explanation of how LLMs work.
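To make that concrete, here is a toy, purely illustrative sketch of the final step of text generation: the model assigns a score to every candidate next token (scores shaped entirely by its training), converts those scores into probabilities with a softmax, and samples one. The vocabulary and numbers below are invented for the example.

```python
import math
import random

def softmax(scores):
    """Turn raw model scores (logits) into a probability distribution."""
    exps = [math.exp(s) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for the next token after "The capital of France is".
# In a real LLM these come from billions of weights learned during training.
vocab = ["Paris", "London", "banana", "the"]
logits = [4.2, 1.1, -2.5, 0.3]

probs = softmax(logits)
for token, p in zip(vocab, probs):
    print(f"{token}: {p:.3f}")

# The model samples from this distribution; no prompt can remove it.
print("sampled:", random.choices(vocab, weights=probs, k=1)[0])
```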
Your Pickaxe draws from the Knowledge Base via a system called “RAG” or Retrieval Augmented Generation. Here’s a quick explanation:
The Knowledge Base in Pickaxe lets you upload documents that would normally be too large for an AI model to read, and your Pickaxe uses this information to better inform its answers. When you upload a file, our system breaks it into small uniform chunks, turns those chunks into vector embeddings, and stores them for fast semantic search.

Importantly, your Pickaxe does not read or memorize the entire document every time it answers a question. Instead, when an end-user sends a message, the system scans all the stored chunks, scores them by relevance to that message, and selects only the most relevant ones, usually just a few paragraphs. Those chunks are then inserted into the chatbot's context so it can answer accurately and efficiently. In other words, you can upload millions of words into the Knowledge Base, and your Pickaxe will look at only the most relevant few thousand each time it answers a question.
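For the curious, here is a minimal, hypothetical sketch of that retrieval step. This is not Pickaxe's actual implementation; it assumes the chunk embeddings were already computed at upload time and uses cosine similarity, a common relevance score in RAG systems. All function and parameter names are illustrative.

```python
import math

def cosine_similarity(a, b):
    """Score how semantically close two embedding vectors are."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def retrieve(query_embedding, stored_chunks, top_k=4):
    """Scan every stored chunk, score it against the user's message,
    and keep only the most relevant few.

    stored_chunks: list of (chunk_text, chunk_embedding) pairs created
    when the uploaded document was split into uniform chunks.
    """
    scored = [(cosine_similarity(query_embedding, emb), text)
              for text, emb in stored_chunks]
    scored.sort(reverse=True)  # highest relevance first
    return [text for _, text in scored[:top_k]]

def build_prompt(question, relevant_chunks):
    """Insert the selected chunks into the chatbot's context."""
    context = "\n\n".join(relevant_chunks)
    return ("Answer the question, drawing on the context below where relevant.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}")
```

Only the few chunks returned by `retrieve` ever reach the model, which is why an upload of millions of words costs only a few thousand words of context per answer.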