Uploading .json files and chunk size

I have a separate .json file for each item I want the LLM to know and reason about. However, when I inspect the chunks, they appear to be cut short and missing key information.
Or maybe they aren't, and I just can't scroll to the very bottom. I can't afford an upgrade to use the edit chunk button at this point.
How do I know the full file is uploaded and used?

I’ve noticed the same thing now that I can look at the uploaded knowledge file. I’m not sure whether the arbitrary splitting of the chunks has a significant adverse effect on the results, but it does look disconcerting.

I don’t have a technical solution for your challenge, but testing it for information contained at the bottom of the JSON should be one way to see if the Pickaxe is able to access the data.
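
To know what to ask about, it may help to pull out the last few fields of each file first. Below is a minimal sketch, assuming each file holds a single JSON object at the top level; the helper name `tail_fields` is made up for illustration and has nothing to do with Pickaxe itself.

```python
import json


def tail_fields(json_text, count=2):
    """Return the last `count` top-level key/value pairs, in file order."""
    data = json.loads(json_text)  # key order of the document is preserved
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return list(data.items())[-count:]


# Hypothetical item; replace with the contents of one of your own files.
sample = '{"title": "Item A", "summary": "short text", "warranty": "2 years"}'
for key, value in tail_fields(sample):
    print(f"Ask the Pickaxe about {key!r}; the file says {value!r}")
```

If the answers match the values printed here, the end of the file is most likely reaching the knowledge base even if the UI does not show it.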

Try uploading the data as a CSV. If you upload a CSV or an Excel file with useful headings, Pickaxe will conform its chunks to the rows of the document.

So, for example, if you have the text in one column, the name of the book in a second column, and perhaps some keywords in a third column, Pickaxe will take each row and make it a chunk. The first row will be taken as headers. Generally this context can be helpful, and I encourage you to experiment with adding it.

This allows you to take control of the chunking directly if you want to, as opposed to uploading a PDF, where we do the chunking ourselves.
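
For the one-JSON-file-per-item case from the original question, the conversion could look roughly like the sketch below. The column names (`name`, `keywords`, `text`), the use of top-level keys as keywords, and the folder and file names are all assumptions made for illustration; pick whatever headings describe your data best.

```python
import csv
import io
import json
from pathlib import Path

FIELDNAMES = ["name", "keywords", "text"]  # assumed headings, not a Pickaxe requirement


def item_to_row(name, item):
    """Flatten one JSON item into a single CSV row (one row per item)."""
    # Compact JSON keeps the whole item inside one cell.
    text = json.dumps(item, ensure_ascii=False, sort_keys=True)
    # Top-level keys stand in as rough keywords; swap in better ones if you have them.
    keywords = ", ".join(sorted(item)) if isinstance(item, dict) else ""
    return {"name": name, "keywords": keywords, "text": text}


def rows_to_csv(rows):
    """Render the rows as CSV text, with the header row first."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def convert_folder(folder, out_path):
    """Read every .json file in `folder` and write one CSV row per file."""
    rows = []
    for path in sorted(Path(folder).glob("*.json")):
        with open(path, encoding="utf-8") as handle:
            rows.append(item_to_row(path.stem, json.load(handle)))
    with open(out_path, "w", encoding="utf-8", newline="") as handle:
        handle.write(rows_to_csv(rows))
```

Calling `convert_folder("items", "items.csv")` (both names hypothetical) would produce a single file to upload. Whether a very long row still ends up as exactly one chunk is not something this sketch can guarantee.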

Pickaxe’s chunks are about 300 words long. We find that length to work well. It is open to debate whether longer or shorter would work better.

There might be a hard upper limit somewhere near 1,000 words.
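
If you are building your own rows, a quick word count can flag the ones that are well past those figures. This is a rough sketch: it counts whitespace-separated words, which may not match how Pickaxe counts, and the two thresholds are simply the numbers mentioned above, with the upper one unconfirmed.

```python
TYPICAL_CHUNK_WORDS = 300    # "about 300 words" as stated above
POSSIBLE_HARD_LIMIT = 1000   # "somewhere near 1,000 words", unconfirmed


def word_count(text):
    """Count whitespace-separated words (an approximation)."""
    return len(text.split())


def classify(text):
    """Label a row by how its length compares to the figures above."""
    count = word_count(text)
    if count > POSSIBLE_HARD_LIMIT:
        return "over possible limit"
    if count > TYPICAL_CHUNK_WORDS:
        return "longer than typical"
    return "ok"


# Hypothetical rows; in practice, pass the text cell of each CSV row.
for label, text in [("short", "word " * 50), ("long", "word " * 1200)]:
    print(label, word_count(text), classify(text))
```

Rows flagged as over the possible limit are the ones worth splitting by hand or testing with questions about their final fields.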

It would be nice to be able to see the whole chunk in the UI.

I believe you have to upgrade to Pro for that functionality. You can see more here.

You can see the chunks in the Knowledge Explorer. To do this, simply:

  • Go into the Learn tab
  • Click on a document

Then you’ll see all the chunks. You can click into a chunk to expand it if it is too large to see in the table.

Hi Mike, this happens with larger chunks. You can’t scroll any further.

I then thought to inspect the element in the page source. It is a p tag, and its content is truncated there too, so not all of the text is shown. That doesn’t necessarily mean the truncated text is the whole chunk used by the KB. I’m assuming the UI is built with React or similar.

You can click on the chunk to expand it to full size and edit it, but I believe that is only available in the Pro tier.

Not available for regular paid plans.