Regarding file types, you can train the models on most file types (PDFs, TXT files, DOC files, web pages, even YouTube videos).
Regarding the amount of training data, there’s no single right answer. You should upload all the information you want to train it on. But as a general rule of thumb, less is often more. Don’t give it too many empty calories. Instead of uploading 50 blog posts that cover similar material, I would recommend uploading the dozen most information-dense ones. As an example of why less can be more, imagine trying to answer a question by drawing on 12 overlapping sources instead of just one.
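If you have a large pile of posts and want a rough first pass at trimming it, here is a minimal sketch in Python. It is only an illustration and not how our product picks content: the function names, the word-overlap measure (Jaccard similarity), the 0.6 threshold, and the idea of using distinct-word count as a stand-in for "information density" are all assumptions I made for this example.

```python
import re


def word_set(text: str) -> set[str]:
    """Return the lowercased set of words in the text."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap between two word sets: 0.0 is disjoint, 1.0 is identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def pick_distinct(posts: dict[str, str], max_overlap: float = 0.6, limit: int = 12) -> list[str]:
    """Greedily keep posts that do not overlap too much with any post already kept.

    Posts with more distinct words are considered first (a crude, assumed
    proxy for information density). At most `limit` posts are returned.
    """
    vocab = {name: word_set(text) for name, text in posts.items()}
    ranked = sorted(vocab, key=lambda name: len(vocab[name]), reverse=True)
    kept: list[str] = []
    for name in ranked:
        # Keep the post only if it is sufficiently different from every kept post.
        if all(jaccard(vocab[name], vocab[k]) <= max_overlap for k in kept):
            kept.append(name)
        if len(kept) == limit:
            break
    return kept


# Hypothetical example data: "a" and "b" cover nearly the same material.
posts = {
    "a": "how to reset your password from the login page settings",
    "b": "how to reset your password from the login page",
    "c": "pricing plans and billing cycles explained",
}
selected = pick_distinct(posts)
```

With this sample data, `selected` keeps "a" and "c" and drops "b", since "b" says almost nothing that "a" doesn’t already say. Treat the output as a shortlist to review by hand, not a final answer.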
Regarding CSVs, yes, you can upload them to the knowledge base. We actually handle CSVs in a special way.