This is a question I see a lot of people ask that I wanted to answer on the record.
So in the builder, you can switch on a toggle that says “Allow users to upload files”. This lets end-users upload documents. They can also already drop in web links. So how are these big inputs handled?
End-user uploads are flexible: people can drop in small documents (like a resume) or large documents (like a 500-page book), and the Pickaxe can handle both cases. It will read the entirety of small documents and turn large documents into vector embeddings.
Here’s a rundown of how it works:
- The end-user upload process always looks at the maximum input length setting of the Pickaxe. The owner can set this to be 500 tokens or 50,000 tokens.
- Then it looks at the size of the document. This could be 300 tokens, 30,000 tokens, or 3,000,000 tokens.
- If the document fits within the maximum input length, the system dumps the entire document’s content into the conversation. No vector embeddings.
- If the document does not fit within the maximum input length, it’s turned into vector embeddings and accessed via a RAG system.
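
To make the decision concrete, here is a minimal Python sketch of the rule described above. It is only an illustration, not Pickaxe's actual code: the function name `choose_upload_strategy` and the labels `"full_text"` and `"rag"` are made up for this example, and I'm assuming that "fits" means the document's token count is less than or equal to the maximum input length.

```python
# Hypothetical sketch of the routing rule described above.
# None of these names come from Pickaxe itself.

FULL_TEXT = "full_text"  # whole document is placed into the conversation
RAG = "rag"              # document is embedded and retrieved in chunks


def choose_upload_strategy(document_tokens: int, max_input_tokens: int) -> str:
    """Return which path an uploaded document would take.

    Assumption: a document "fits" when its token count is less than
    or equal to the maximum input length setting.
    """
    if document_tokens < 0:
        raise ValueError("document_tokens must not be negative")
    if max_input_tokens <= 0:
        raise ValueError("max_input_tokens must be positive")

    if document_tokens <= max_input_tokens:
        return FULL_TEXT
    return RAG


if __name__ == "__main__":
    # Example sizes taken from the post: a 50,000-token maximum input
    # length and documents of 300, 30,000, and 3,000,000 tokens.
    for size in (300, 30_000, 3_000_000):
        print(size, choose_upload_strategy(size, 50_000))
```

With the maximum input length at 50,000 tokens, the 300-token and 30,000-token documents go in whole and only the 3,000,000-token one goes through embeddings. Drop the setting to 500 tokens and the 30,000-token document takes the RAG path instead.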
The takeaway for Pickaxe users is that you can select which process you prefer based on your use case by increasing the size of the maximum input length.