Pretty sure privateGPT can interact with PDFs
Free Open-Source Artificial Intelligence
So I either need something like this that I could host myself (is something like that even feasible?)
The closest thing I could find that already exists is GPT4All Chat with LocalDocs Plugin. That basically builds a DB of snippets from your documents and then tries to pick relevant stuff based on your query to provide additional input as part of your prompt to a local LLM. There are details about what it can and can't do further down the page. I have not tested this one myself, but this is something you could experiment with.
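To make the "pick relevant snippets and add them to the prompt" idea concrete, here is a toy sketch. It is not how LocalDocs is actually implemented (I haven't looked at its internals); it just ranks snippets by how many words they share with the query. The names `tokenize`, `top_snippets` and `build_prompt` are made up for illustration.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def top_snippets(query, snippets, k=2):
    """Return up to k snippets that share the most words with the query."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(s)), i, s) for i, s in enumerate(snippets)]
    # Drop snippets with no overlap; break ties by original position.
    scored = [item for item in scored if item[0] > 0]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [s for _, _, s in scored[:k]]


def build_prompt(query, snippets, k=2):
    """Prepend the best-matching snippets to the question as extra context."""
    context = "\n".join(top_snippets(query, snippets, k))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

A real tool would most likely use embeddings rather than raw word overlap, but the overall shape (score, pick the top few, paste into the prompt) is the same.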
Another idea, if you want to get more into engineering custom tools, would be to split the document (or documents) you want to interact with into multiple overlapping chunks that each fit within the context window. This assumes you can get the relevant content out; PyPDF2's documentation explains why extracting text from PDFs can be difficult. Then prompt the model with something like "Does this text contain anything that answers <query>? <chunk>". (It may take some experimentation to figure out how to engineer the prompt well.) Repeat that for each chunk, gathering snippets, and then do a second pass over all the snippets, asking the LLM to summarize and/or rate the quality of its own answers (or however you want to combine results).
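A minimal sketch of the chunking step, assuming you already have the document as one plain string (the PDF extraction itself is not shown). It counts characters rather than tokens, which is only a rough stand-in for the real context limit, and the function names and default sizes are my own picks, not anything standard.

```python
def split_into_chunks(text, chunk_size=1000, overlap=200):
    """Split text into chunks of chunk_size characters that overlap by `overlap`."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a chunk has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


def map_prompt(query, chunk):
    """Build the per-chunk question using the wording suggested above."""
    return f'Does this text contain anything that answers "{query}"?\n\n{chunk}'
```

The overlap is there so that a sentence cut in half at a chunk boundary still shows up whole in the neighbouring chunk.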
Basically you would need to give it two prompts: a prompt for the "map" phase, which you apply to every snippet to try to extract relevant info from it, and a second prompt for the "reduce" phase, which combines two answers into one. The reduce step is then chained across all the answers until a single result is left.
i.e.:
f(a) + f(b) + f(c) + ... + f(z)
where f(a) is the result of the first extraction on snippet a, and + means "combine these two snippets using the second prompt". (You can evaluate in whatever order you feel is appropriate, including in parallel, if you have enough compute power for that.)
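Here is roughly what that f(a) + f(b) + ... chain could look like in code. This is a sketch under assumptions: `llm` is a placeholder for whatever function sends a prompt to your local model and returns its text reply, and both prompt templates are just guesses that would need tuning.

```python
from functools import reduce

# Hypothetical prompt templates; the exact wording is an assumption.
MAP_PROMPT = 'Does this text contain anything that answers "{query}"?\n\n{chunk}'
REDUCE_PROMPT = (
    'Combine these two partial answers to "{query}" into one answer.\n\n'
    "Answer 1: {a}\n\nAnswer 2: {b}"
)


def map_reduce(llm, query, chunks):
    """Apply the map prompt to every chunk, then fold the answers pairwise."""
    # Map phase: f(a), f(b), ... one model call per chunk.
    partials = [llm(MAP_PROMPT.format(query=query, chunk=c)) for c in chunks]
    if not partials:
        return ""

    # Reduce phase: the "+" operator, one model call per combination.
    def combine(a, b):
        return llm(REDUCE_PROMPT.format(query=query, a=a, b=b))

    return reduce(combine, partials)
```

This version folds left to right, so n chunks cost n map calls plus n - 1 reduce calls. Combining in a tree shape instead would let the reduce calls run in parallel too.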
If you have enough context space for it, you could include a summary of the previous state of the conversation as part of the prompts in order to get something like an actual conversation with the document going.
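One way the conversation summary could be wired in, again only as a sketch: `llm` is the same kind of placeholder prompt-in, text-out function, and the helper names and prompt wording are invented for illustration.

```python
def with_history(summary, prompt):
    """Prefix a prompt with the running conversation summary, if there is one."""
    if not summary:
        return prompt
    return f"Summary of the conversation so far:\n{summary}\n\n{prompt}"


def update_summary(llm, summary, question, answer):
    """Ask the model to fold the latest exchange into the running summary."""
    request = (
        "Update this conversation summary to include the new exchange. "
        "Keep it short.\n\n"
        f"Current summary: {summary or '(none yet)'}\n"
        f"User asked: {question}\n"
        f"Answer given: {answer}"
    )
    return llm(request)
```

You would pass every map and reduce prompt through `with_history` and call `update_summary` once per question, which adds one more model call per turn.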
No idea how well that would work in practice (probably very slow!), but it might be fun to experiment with.