What it does
- Reads every markdown file in the hub and parses YAML frontmatter against the hub schema.
- Chunks each body by heading.
- Embeds chunks via Ollama (default nomic-embed-text), OpenAI (text-embedding-3-small), or a deterministic local-hash fallback for CI.
- Upserts chunks into Qdrant with payload indices on kh_id, scope, scope_prefixes, sensitivity, source_type, team, and tags.
- Maintains a Postgres manifest so unchanged files are skipped on subsequent runs.
- Sweeps orphans: IDs that are in the manifest but no longer in the hub are deleted from both Qdrant and the manifest.
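
The heading-based chunking and the manifest logic above can be sketched roughly as follows. This is a minimal illustration, not the indexer's actual code: the function names (`chunk_by_heading`, `content_hash`, `plan_sync`) are hypothetical, and it assumes the manifest stores one SHA-256 content hash per ID, which the page does not state.

```python
import hashlib
import re

# Matches ATX-style markdown headings ("# Title" through "###### Title").
HEADING = re.compile(r"^#{1,6}\s+(.*)$")


def chunk_by_heading(body: str) -> list[tuple[str, str]]:
    """Split a markdown body into (heading, text) chunks, one per heading.

    Text before the first heading becomes a chunk with an empty heading.
    Simplification: '#' lines inside fenced code blocks are not special-cased.
    """
    chunks: list[tuple[str, str]] = []
    heading = ""
    lines: list[str] = []

    def flush() -> None:
        # Emit the pending chunk unless it is an empty preamble.
        if heading or any(line.strip() for line in lines):
            chunks.append((heading, "\n".join(lines).strip()))

    for line in body.splitlines():
        match = HEADING.match(line)
        if match:
            flush()
            heading, lines = match.group(1).strip(), []
        else:
            lines.append(line)
    flush()
    return chunks


def content_hash(text: str) -> str:
    """Assumed change detector: SHA-256 of the file text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_sync(hub: dict[str, str], manifest: dict[str, str]):
    """Compare the hub (id -> file text) with the manifest (id -> stored hash).

    Returns (to_index, unchanged, orphans) as sorted lists of IDs.
    """
    to_index = {k for k, text in hub.items() if manifest.get(k) != content_hash(text)}
    unchanged = set(hub) - to_index
    orphans = set(manifest) - set(hub)
    return sorted(to_index), sorted(unchanged), sorted(orphans)
```

In this sketch, IDs in `to_index` would be chunked, embedded, and upserted; IDs in `unchanged` would be skipped; and IDs in `orphans` would be deleted from both Qdrant and the manifest.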
Embedding providers
- Ollama (default): runs against a local or remote Ollama instance.
- OpenAI: hosted embedding API.
- Hash: deterministic local fallback for CI smoke tests; never used in production.
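
To illustrate what a deterministic hash fallback can look like, here is a small sketch. It is an assumption-laden example rather than the provider's real algorithm: the function name `hash_embed`, the use of SHA-256, and the 8-dimensional default are all hypothetical choices made for illustration. The vectors carry no semantic meaning; they only guarantee that the same text always maps to the same vector, which is what a CI smoke test needs.

```python
import hashlib
import math


def hash_embed(text: str, dim: int = 8) -> list[float]:
    """Return a deterministic unit vector derived from the text (hypothetical scheme)."""
    values: list[float] = []
    counter = 0
    while len(values) < dim:
        # Re-hash with a counter prefix to produce as many bytes as needed.
        digest = hashlib.sha256(f"{counter}:{text}".encode("utf-8")).digest()
        # Map each byte from [0, 255] to [-1.0, 1.0].
        values.extend(b / 127.5 - 1.0 for b in digest)
        counter += 1
    values = values[:dim]
    # Normalize to unit length so cosine and dot-product scores agree.
    norm = math.sqrt(sum(v * v for v in values)) or 1.0
    return [v / norm for v in values]
```

Because nothing here depends on a network call or a model download, a pipeline test built on such a function runs identically on every machine.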
CLI
Configuration
| Variable | Purpose |
|---|---|
| QDRANT_URL | Qdrant endpoint. |
| QDRANT_API_KEY | Qdrant auth (optional). |
| QDRANT_COLLECTION | Collection name; default cerebrum_knowledge_hub. |
| EMBEDDING_PROVIDER | One of ollama, openai, hash. |
| OLLAMA_BASE_URL, OLLAMA_EMBED_MODEL | Ollama provider config. |
| OPENAI_API_KEY, OPENAI_EMBED_MODEL | OpenAI provider config. |
| POSTGRES_DSN | Manifest persistence (optional). |
The manifest is not used with --no-manifest (first runs, debugging, or when Postgres is unavailable) or under --dry-run.
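
As a rough illustration of how these variables might be read, the sketch below loads them into a typed object. The class and function names (`IndexerConfig`, `load_config`) are hypothetical. The defaults for the collection name and the Ollama model come from this page; treating text-embedding-3-small as the OpenAI default and leaving the remaining values unset when absent are assumptions.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

VALID_PROVIDERS = {"ollama", "openai", "hash"}


@dataclass(frozen=True)
class IndexerConfig:
    qdrant_url: Optional[str]
    qdrant_api_key: Optional[str]
    qdrant_collection: str
    embedding_provider: str
    ollama_base_url: Optional[str]
    ollama_embed_model: str
    openai_api_key: Optional[str]
    openai_embed_model: str
    postgres_dsn: Optional[str]


def load_config(env: Mapping[str, str] = os.environ) -> IndexerConfig:
    """Build a config from environment variables (hypothetical helper)."""
    provider = env.get("EMBEDDING_PROVIDER", "ollama").lower()
    if provider not in VALID_PROVIDERS:
        raise ValueError(f"EMBEDDING_PROVIDER must be one of {sorted(VALID_PROVIDERS)}")
    return IndexerConfig(
        qdrant_url=env.get("QDRANT_URL"),
        qdrant_api_key=env.get("QDRANT_API_KEY"),  # optional
        qdrant_collection=env.get("QDRANT_COLLECTION", "cerebrum_knowledge_hub"),
        embedding_provider=provider,
        ollama_base_url=env.get("OLLAMA_BASE_URL"),
        ollama_embed_model=env.get("OLLAMA_EMBED_MODEL", "nomic-embed-text"),
        openai_api_key=env.get("OPENAI_API_KEY"),
        # Assumed default; the page names this model but does not call it the default.
        openai_embed_model=env.get("OPENAI_EMBED_MODEL", "text-embedding-3-small"),
        postgres_dsn=env.get("POSTGRES_DSN"),  # optional; None means no manifest
    )
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to exercise in tests, for example `load_config({"EMBEDDING_PROVIDER": "hash"})`.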