Chatterbox is intended to be an open-source, self-hostable auto-RAG service to allow:
This project is still being worked on.
Document indexing runs as a Temporal workflow, and chat uses a ReAct LLM architecture with a tool that queries the internal knowledge base.
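To make the ReAct part concrete, here is a minimal sketch of a thought/action/observation loop with a single knowledge-base tool. Everything in it is an assumption made for illustration: `react_loop`, `search_knowledge_base`, the `Action: tool[input]` text format, the in-memory `KNOWLEDGE_BASE`, and the `scripted_llm` stand-in (used so the example runs without a model or network access). It is not Chatterbox's actual implementation.

```python
import re
from typing import Callable, Dict

# Toy in-memory knowledge base (assumption); a real service would query an index.
KNOWLEDGE_BASE = {
    "refund policy": "Refunds are accepted within 30 days of purchase.",
    "support hours": "Support is available 9am-5pm on weekdays.",
}


def search_knowledge_base(query: str) -> str:
    """Hypothetical tool: return entries whose key appears in the query."""
    q = query.lower()
    hits = [text for key, text in KNOWLEDGE_BASE.items() if key in q]
    return " ".join(hits) if hits else "No matching documents."


# Assumed text protocol between the model and the loop.
ACTION_RE = re.compile(r"Action: (\w+)\[(.*)\]")
FINAL_RE = re.compile(r"Final Answer: (.*)")


def react_loop(
    question: str,
    llm: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> str:
    """Alternate model output and tool observations until a final answer appears."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = llm(transcript)
        transcript += reply + "\n"
        final = FINAL_RE.search(reply)
        if final:
            return final.group(1).strip()
        action = ACTION_RE.search(reply)
        if action:
            name, arg = action.groups()
            tool = tools.get(name)
            observation = tool(arg) if tool else f"Unknown tool: {name}"
            # Feed the tool result back so the next model call can use it.
            transcript += f"Observation: {observation}\n"
    return "No answer within the step limit."


def scripted_llm(transcript: str) -> str:
    """Stand-in for a real model: searches first, then answers from the observation."""
    if "Observation:" not in transcript:
        return (
            "Thought: I should look this up.\n"
            "Action: search_knowledge_base[refund policy]"
        )
    observation = transcript.rsplit("Observation: ", 1)[1].strip()
    return f"Thought: I have what I need.\nFinal Answer: {observation}"


answer = react_loop(
    "What is the refund policy?",
    scripted_llm,
    {"search_knowledge_base": search_knowledge_base},
)
print(answer)
```

Swapping `scripted_llm` for a call to a real model would leave the loop unchanged, which is the main reason the sketch takes the model as a plain callable.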
(Multi-tenancy) Either a single table holding all users’ knowledge bases, or one table per user knowledge base.
(Long-running indexing workflow) The indexing workflow may fail partway through, e.g. when the Mistral OCR API returns a rate-limit error.
(LLM-agnostic) Use LiteLLM or OpenRouter as the LLM gateway.
(Document indexing) The Mistral document OCR API is a good default tool for document parsing and indexing, assuming the PDF is well formatted and parses cleanly into Markdown.
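The rate-limit failure mode noted above for the long-running indexing workflow is usually handled by retrying with exponential backoff. Temporal activities can be configured with a retry policy, so in the real workflow this would likely be declared there. The plain-Python sketch below only illustrates the idea; `RateLimitError`, `retry_with_backoff`, and `flaky_ocr` are hypothetical names, and the delay values are arbitrary assumptions rather than project settings.

```python
import random
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for a rate-limit response from an OCR API."""


def retry_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], object] = time.sleep,
) -> T:
    """Call fn, retrying on RateLimitError with exponentially growing delays."""
    attempt = 1
    while True:
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts:
                raise  # Out of attempts: surface the error to the caller.
            # Delay doubles each attempt, capped at max_delay, plus up to 10% jitter.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay * 0.1))
            attempt += 1


# Demo: a fake OCR call that is rate-limited twice before succeeding.
calls = {"count": 0}


def flaky_ocr() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise RateLimitError("429 Too Many Requests")
    return "# Parsed document"


delays: List[float] = []
# Record delays instead of sleeping so the demo finishes immediately.
result = retry_with_backoff(flaky_ocr, sleep=delays.append)
print(result, len(delays))
```

Passing `sleep` as a parameter keeps the helper testable; in production the default `time.sleep` applies.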
Some cool existing solutions