Something I'm finding very useful recently: using LLMs to build personal knowledge bases for various topics of research interest. In this way, a large fraction of my recent token throughput is going less into manipulating code, and more into manipulating knowledge (stored as markdown and images)...
Open source, agentic knowledge bases for all of humanity's knowledge
Alpha Research builds open source agentic knowledge bases. Our mission is to help knowledge workers work 100x faster. We launched and open sourced our first platform, Alpha Book, on a mirror of the Project Gutenberg dataset (75,000 books).
Knowledge work is bottlenecked in two key ways: finding evidence and testing hypotheses. We make both 100x faster. We extend keyword search and RAG with general coding agents, which use filesystem tools like ripgrep and reason iteratively to find broad evidence. Coding agents also write ad-hoc scripts to label datasets and test hypotheses.
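To make the iterative search idea concrete, here is a minimal sketch, not the Alpha Research implementation. It assumes a local directory of plain-text files and uses Python's `re` module in place of ripgrep so that it stays self-contained. The names `grep_corpus`, `iterative_search`, and `expand` are hypothetical; `expand` stands in for the agent's reasoning step, which reads the hits and proposes follow-up queries.

```python
import re
from pathlib import Path


def grep_corpus(root, pattern, max_hits=50):
    """Return (path, line_number, line) for each line matching `pattern`."""
    regex = re.compile(pattern, re.IGNORECASE)
    hits = []
    for path in sorted(Path(root).rglob("*.txt")):
        with open(path, encoding="utf-8", errors="ignore") as handle:
            for number, line in enumerate(handle, start=1):
                if regex.search(line):
                    hits.append((str(path), number, line.strip()))
                    if len(hits) >= max_hits:
                        return hits
    return hits


def iterative_search(root, queries, expand, rounds=2):
    """Run queries, then let `expand` propose new queries from the hits.

    `expand(query, hits)` is a placeholder for the agent's reasoning:
    it returns a list of follow-up patterns to try in the next round.
    """
    seen_queries = set()
    evidence = {}  # (path, line_number) -> line, so repeated hits are kept once
    frontier = list(queries)
    for _ in range(rounds):
        next_frontier = []
        for query in frontier:
            if query in seen_queries:
                continue
            seen_queries.add(query)
            hits = grep_corpus(root, query)
            for path, number, line in hits:
                evidence.setdefault((path, number), line)
            next_frontier.extend(expand(query, hits))
        frontier = next_frontier
    return [(path, number, line) for (path, number), line in evidence.items()]
```

In this sketch a search for "grief" could be expanded to "mourning" or "bereavement" in the second round, which is the sense in which the search reaches broader evidence than a single keyword query.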
Our roadmap is simple: repeat Alpha Book across every public dataset of humanity's knowledge. Next we'll launch Alpha Econ with US Census, BLS, NBER, and similar datasets, then Alpha Justice with US Supreme Court cases, and so on. We support museums, research libraries, historical societies, and archives. We also do custom integration work for enterprises.
Knowledge bases allow agents to work like knowledge workers
When to use Alpha Research
- Search. Looking for evidence? We compile and index public plaintext and image datasets for you to access.
- Hypothesis testing. Suppose you're a historian writing a thesis on grief in 19th-century fiction. Alpha Research finds evidence for you. Or suppose you're an economist with the hypothesis "Periods of compressed time-to-sale in housing correlate with above-ask transactions and predict mean reversion within 18 months." Alpha Research runs that experiment for you.
- Agentic knowledge workers. Autonomous research loops that plan → retrieve → filter → synthesize → verify.
- Fandom, wikis. Online communities have years of canon. Drive higher quality engagement and help onboard new members to your community by building a knowledge base for a Substack or a podcast.
- Alpha Search is open source. We are proud to support research institutes, museums, presidential libraries, historical societies, and enterprises.
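The plan → retrieve → filter → synthesize → verify loop mentioned above can be sketched as follows. This is an illustrative skeleton under stated assumptions, not Alpha Research's code: the `ResearchLoop` class and its five callables are hypothetical names, and in a real system each callable would be backed by an LLM or a search tool rather than the plain functions assumed here.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class ResearchLoop:
    # Each stage is injected as a plain callable (all hypothetical).
    plan: Callable[[str], List[str]]              # question -> search queries
    retrieve: Callable[[str], List[str]]          # query -> candidate passages
    keep: Callable[[str, str], bool]              # (question, passage) -> relevant?
    synthesize: Callable[[str, List[str]], str]   # (question, passages) -> draft
    verify: Callable[[str, List[str]], bool]      # (draft, passages) -> supported?
    max_rounds: int = 3

    def run(self, question: str) -> dict:
        passages: List[str] = []
        draft = ""
        for round_number in range(1, self.max_rounds + 1):
            # Plan and retrieve, filtering out irrelevant or repeated passages.
            for query in self.plan(question):
                for passage in self.retrieve(query):
                    if passage not in passages and self.keep(question, passage):
                        passages.append(passage)
            # Synthesize a draft, then stop early only if it verifies.
            draft = self.synthesize(question, passages)
            if self.verify(draft, passages):
                return {"answer": draft, "evidence": passages,
                        "rounds": round_number, "verified": True}
        # Round budget exhausted: return the last draft, flagged as unverified.
        return {"answer": draft, "evidence": passages,
                "rounds": self.max_rounds, "verified": False}
```

Keeping the stages as separate callables is a design choice made for this sketch: it lets the verify step reject a draft and trigger another round without the loop knowing how any stage is implemented.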
FAQ
What does Alpha Research do?
We build open source tooling for knowledge workers to work faster. The Alpha Research platform enables agentic search and hypothesis testing over large databases. Starting with Project Gutenberg, we are expanding to add datasets in law, economics, history, and every other archived dataset. Our mission is to enable a new wave of quantitative humanities research.
Can I use this on my private dataset?
Yes. Our code is open source, and we can install it for you. Contact us here.
What kind of data is it for?
Large datasets of reference text and images. Books, papers, films, photos, scans. Primary sources and archival datasets.
Who do you work with?
Individual researchers can sign up, ask questions, and test hypotheses. We are also proud to support research institutes, museums, presidential libraries, historical societies, and enterprises.
Can my agent use Alpha Book?
Yes! Give your agent this prompt:
Go to https://alpha-book.org/skill.md and follow the instructions there.