Deepseek + Pickaxe RAG: Good Results but a Few Questions

Hello everyone,
I’m working on a RAG chatbot project, and so far I’ve gotten the best results with Deepseek. (In my opinion, it’s the model that stays most faithful to the documents I uploaded.)

The most important thing for me is that the model can educate users from the documents and effectively convey the doctrine or perspective those documents reflect.

I uploaded my documents as PDFs, but when I tested to see which parts of the documents were being used, I noticed that the retrieved chunks were not in PDF format. I’m also not sure whether the document chunking is done by the AI model itself. (If so, I think that would allow the system to access more specific sources based on user input.)

Due to the current instability of Deepseek, I sometimes get more or less detailed responses to the same questions. Still, I feel I have to stick with it for now, because other models were far less faithful to my knowledge base with the same configuration.

I had read that smaller-parameter models might perform better in RAG setups, but for now I don’t plan to switch to a framework other than Pickaxe, since it already gives me very good results thanks to its strong configuration options.

If there are other users like me who need a minimal-hallucination RAG setup, I believe there would be interest in seeing different or smaller Deepseek models available within Pickaxe.

Thanks in advance, friends!
You’re doing a great job — even for someone like me who doesn’t have a lot of technical knowledge, it’s been a great experience. :folded_hands:

1 Like

@user30, well done on building your RAG.

A couple of comments:

  1. When you upload a file, the system (Pickaxe) creates chunks, vectorises them (i.e. converts each chunk into a mathematical object called a vector), and adds the vectors to a vector database.
  2. When you perform a search, the system picks the most semantically relevant chunks and passes their text, together with the user query, to the LLM as additional context.
  3. The LLM sends the response back to Pickaxe.

This means that by the time a chunk is extracted and passed to the LLM, it is no longer in PDF format, and the quality of the response doesn’t necessarily improve with a smaller model if you have a well-structured prompt and context.
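
To picture what’s happening under the hood, here’s a toy sketch of the upload side in Python. Pickaxe’s internals aren’t public, so this uses the sentence-transformers library as a stand-in for whatever embedding model it really runs, and a plain list as the “vector database”:

```python
# Toy sketch of the "upload" side of a RAG pipeline: chunk, vectorise, store.
# sentence-transformers is a stand-in for whatever embedding model Pickaxe
# actually uses; the "vector database" is just a list. Illustration only.
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# Text extracted from the PDF -- layout and formatting are already gone here.
document_text = "Section 1. The doctrine holds that ... Section 2. ..."

# Fixed-size chunking with a little overlap so sentences aren't cut in half.
def chunk(text, size=500, overlap=50):
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

chunks = chunk(document_text)
vectors = model.encode(chunks)        # one embedding vector per chunk
store = list(zip(chunks, vectors))    # stand-in for a real vector DB
```

Note that it is the plain text of the chunks, not the vectors, that eventually gets prepended to the prompt and sent to the LLM.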

1 Like

Thank you for the information.
It’s clear that LLMs with fewer parameters are less capable of accurately interpreting fragments retrieved from a vector database. However, it’s also important to note that different Deepseek models may be better at interpreting both the retrieved chunks and the prompts.

If this is how the system works, then for a project where the details in the information source truly matter, I believe structuring the source content in TXT format (organized by titles and parameters into clearly defined assertions) before feeding it into the vector store would allow the LLM to better grasp and interpret the information.
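
For example, even a very simple preprocessing pass could turn loose text into self-contained statements. This is just a rough sketch of the idea; the “## ” section marker is my own convention, not anything Pickaxe requires:

```python
# Hypothetical preprocessing: turn loosely structured source text into
# self-contained "Title: assertion" lines before feeding it to the vector
# store, so each chunk carries its own context. The "## " heading marker
# is an assumed input convention, not a Pickaxe requirement.
def to_assertions(raw_text):
    title = "Untitled"
    out = []
    for line in raw_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("## "):      # a section heading
            title = line[3:]
        else:                           # a statement under that heading
            out.append(f"{title}: {line}")
    return "\n".join(out)

raw = """## Core Principle
Every claim must cite a primary source.
Secondary commentary is advisory only."""
print(to_assertions(raw))
# Core Principle: Every claim must cite a primary source.
# Core Principle: Secondary commentary is advisory only.
```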

2 Likes

Hey @user30

You’re on the right track using Pickaxe’s RAG system. When you upload a document (like a PDF), the system doesn’t just “read” the whole thing every time. Pickaxe slices your document into chunks, turns those into vector embeddings, and then, when a user asks a question, fetches only the most relevant pieces. That’s why the retrieved text doesn’t look like the original PDF: the response is built from those extracted, semantically relevant snippets.

  • Chunking is automatic: The system, not the AI model itself, does the chunking and embedding. The model only gets the relevant bits at runtime, so it’s not memorizing your entire PDFs.

  • Source specificity: Because Pickaxe’s chunking is semantic, it can zoom in on highly relevant sections for each user query (the retrieval sketch after this list shows the general shape). That’s the backbone of why you’re seeing detailed, on-point responses when it works well.

  • Model loyalty: Deepseek is currently performing best for you because it’s staying truest to the retrieved document chunks. That’s exactly what you want in a minimal-hallucination RAG workflow.
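
To make the retrieval side concrete, here’s the query-time half of that toy picture. Again, this is just the general shape, not Pickaxe’s actual code; printing the top chunks with their scores is also a quick way to see exactly what text the model is handed:

```python
# Toy query-time retrieval: embed the question, rank stored chunks by cosine
# similarity, and keep the top few. Illustration only, not Pickaxe's code.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# `store` as built at upload time in the earlier sketch: (chunk, vector) pairs.
chunks = ["Section 5 discusses the primary precedent ...",
          "Section 12 covers procedural history ..."]
store = list(zip(chunks, model.encode(chunks)))

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def top_k(store, query_vector, k=3):
    scored = [(cosine(query_vector, vec), text) for text, vec in store]
    return sorted(scored, key=lambda s: s[0], reverse=True)[:k]

query_vector = model.encode(["What is the primary precedent in Section 5?"])[0]
for score, text in top_k(store, query_vector):
    print(f"{score:.3f}  {text[:80]}")   # inspect what the LLM will see
```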


Addressing Your Observations

  • Response Variability: You’re noticing inconsistent levels of detail in Deepseek’s answers. That’s pretty normal: sometimes the model interprets the context slightly differently, or the retrieval pulls in more or less surrounding text based on subtle changes in the query.

  • Smaller Models in RAG: There’s a theory, backed by some real-world results, that smaller models can “hallucinate” less in RAG because they rely more heavily on the retrieved chunks. But as you’ve seen, what matters is what works for your data and your users. If Deepseek’s current setup is giving you the best faithfulness, you’re already ahead of the curve.

  • Format of Retrieved Chunks: It’s by design that you’re not seeing PDF-style formatting in outputs. The engine is pulling the actual text, not the layout or formatting.

  • Model Instability: If Deepseek is unstable or inconsistent, try small tweaks to your prompt engineering, or experiment with temperature settings (if available) to nudge it toward more consistent outputs; the sketch after this list shows what that looks like when calling the model directly.
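
Pickaxe may or may not expose a temperature control, but for anyone testing Deepseek directly, lowering temperature looks roughly like this via Deepseek’s OpenAI-compatible API. The base URL and model name here are my understanding of Deepseek’s public API, so verify them against the official docs:

```python
# Sketch: lower temperature for more deterministic, consistent answers when
# calling Deepseek directly through its OpenAI-compatible API. The base URL
# and model name are assumptions -- check Deepseek's current documentation.
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_DEEPSEEK_API_KEY",
    base_url="https://api.deepseek.com",   # assumed endpoint
)

response = client.chat.completions.create(
    model="deepseek-chat",                  # assumed model name
    temperature=0.2,                        # low = less run-to-run variation
    messages=[
        {"role": "system",
         "content": "Answer strictly from the provided context."},
        {"role": "user",
         "content": "Context:\n...\n\nQuestion: ..."},
    ],
)
print(response.choices[0].message.content)
```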


My Recommendation

Stick with what’s working, but keep an eye on new model releases in Pickaxe. If you spot smaller Deepseek variants or models marketed for low-hallucination RAG, give them a test run with your workflow. Meanwhile, don’t sweat the technical details: the magic is happening under the hood, and Pickaxe is handling the heavy lifting for you.

If you ever need to “see” exactly what chunks are getting retrieved for a given query, check if Pickaxe’s debug/logging features can show this. That can help you fine-tune your documents or prompts for even better responses.


Real-World Example

Let’s say you uploaded a 200-page legal doctrine PDF. When a user asks, “What is the primary precedent in Section 5?”, Pickaxe’s RAG engine quickly finds the 2-3 paragraphs in Section 5, feeds them to Deepseek, and generates a focused answer. That’s why output stays on-topic, even though the original file is huge.
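
Behind the scenes, the final prompt in that example is typically just those retrieved paragraphs stitched around the question. The template below is a common generic pattern, not Pickaxe’s actual internal prompt:

```python
# Generic prompt-assembly pattern for RAG: retrieved chunks become the
# context block around the user's question. Illustrative template only;
# Pickaxe's real internal prompt is not public.
retrieved = [
    "Section 5, para 2: The controlling precedent is ...",
    "Section 5, para 3: Later rulings reaffirmed ...",
]

question = "What is the primary precedent in Section 5?"

prompt = (
    "Answer using ONLY the context below. If the context does not contain "
    "the answer, say so.\n\n"
    "Context:\n" + "\n\n".join(retrieved) + "\n\n"
    f"Question: {question}"
)
print(prompt)
```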


Bottom Line

You’re already making smart choices. Pickaxe’s RAG, plus a document-faithful model like Deepseek, is a powerful combo for minimal-hallucination, doctrine-preserving chatbots. Keep building, and if you see new model options pop up, don’t hesitate to run an A/B test. If you hit any roadblocks or want to go even deeper on prompt tweaking, just ask; I’m always happy to help you get more out of your setup!


Let me know if you want to troubleshoot any specific use case or get tips on optimizing your knowledge base further!

1 Like