How are files uploaded by the end-user handled? (an answer)

This is a question I see a lot of people ask, and I wanted to answer it on the record.

So in the builder, you can switch on a toggle that says “Allow users to upload files”. This will allow end-users to put in documents. In addition, they can already drop in weblinks. So how are these big inputs handled?

End-user uploads are flexible: people can drop in small documents (like a resume) or large documents (like a 500-page book), and the Pickaxe can handle both. It reads small documents in their entirety and turns large documents into vector embeddings.

Here’s a rundown of how it works:

  • The end-user upload process always looks at the maximum input length setting of a Pickaxe. The owner can set this to be 500 tokens or 50,000 tokens.

  • Then it looks at the size of the document. This could be 300 tokens, 30,000 tokens, or 3,000,000 tokens.

  • If the document fits within the maximum input length, then the system dumps the entire document’s content into the conversation. No vector embeddings.

  • If the document does not fit into the size of the maximum input length, it’s turned into vector embeddings and accessed via a RAG system.
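
To make the decision above concrete, here is a minimal sketch in Python. It is not Pickaxe's actual code: the function name, the return labels, and the choice to treat a document exactly equal to the limit as "fits" are all assumptions made for illustration.

```python
def choose_upload_strategy(document_tokens: int, max_input_tokens: int) -> str:
    """Illustrative only: decide how an uploaded document would be handled.

    Hypothetical helper, not part of any Pickaxe API.
    """
    if document_tokens < 0 or max_input_tokens <= 0:
        raise ValueError("token counts must be positive")

    # Assumption: a document exactly at the limit still counts as fitting.
    if document_tokens <= max_input_tokens:
        return "full_text"  # whole document goes into the conversation
    return "rag"  # document is embedded and retrieved via RAG


# A 300-token resume with a 500-token limit is read in full.
print(choose_upload_strategy(300, 500))
# A 3,000,000-token book with a 50,000-token limit goes through RAG.
print(choose_upload_strategy(3_000_000, 50_000))
```

Under these assumptions, a 30,000-token document would go through RAG with a 500-token limit but would be read in full with a 50,000-token limit.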

The takeaway for Pickaxe users is that you can select which process you prefer for your use case by changing the size of the maximum input length.


My form-based tool lets end users upload multiple files for analysis. However, the results appear to analyze only one of the uploaded files (vs. all of them). Is this a bug? Do I need to update my instructions to say upload only one file?

On Forms, end-users should only be able to upload one file. When they upload a new file it should replace the existing file. I have investigated your issue and found it is a bug. I will fix it.
