
AssistantHub


AssistantHub is a self-hosted RAG (Retrieval-Augmented Generation) data and chatbot platform. It enables you to create AI assistants that can answer questions grounded in your uploaded documents, powered by vector embeddings, hybrid search, and large language models. Upload PDFs, text files, HTML, and more -- AssistantHub automatically extracts content, summarizes, chunks, generates embeddings, and makes it searchable. Your assistants retrieve relevant context at query time and generate accurate, citation-ready responses.

AssistantHub ships as a fully orchestrated Docker Compose stack -- one command brings up the entire platform, including the LLM inference engine, document processing pipeline, vector database, object storage, and a browser-based management dashboard.

Slack support was added in v0.9.0, allowing each assistant to connect directly to Slack and process threaded Slack conversations through the same AssistantHub chat pipeline.



New in v0.9.0

  • Slack integration per assistant -- Configure Slack connectivity directly on assistant settings with Enable Slack, app token, bot token, channel ID, start-of-message indicator, and draft connectivity verification.
  • Shared chat execution rail -- Slack requests reuse the same retrieval, compaction, citation, inference, and history flow as AssistantHub chat instead of a separate inference path.
  • Thread-aware Slack replies -- Incoming Slack messages map to deterministic AssistantHub threads and replies are posted back to the originating Slack thread.
  • Slack verification API and dashboard flow -- Added POST /v1.0/assistants/{assistantId}/settings/slack/verify plus dashboard support for testing draft values before save.
  • Chat history origin tracking -- chat_history.origin now records request source such as web or slack.
  • Migration script: migrations/007_upgrade_to_v0.9.0.sql

Slack Integration (v0.9.0)

AssistantHub supports per-assistant Slack connectivity through Assistant Settings.

  • Enable Slack on an assistant and provide:
    • App Token (xapp-...)
    • Bot Token (xoxb-...)
    • Channel ID
    • Start-of-Message Indicator
  • Use Verify Connectivity in the dashboard before saving
  • AssistantHub maintains one Socket Mode connection per Slack-enabled assistant
  • In configured channels, messages are processed when they start with the configured indicator or mention the bot
  • Direct messages to the bot are also supported
  • Slack conversations reuse the same non-streaming chat execution rail as AssistantHub chat, including retrieval, citations, compaction, and history persistence
  • Slack responses are posted back into the originating Slack thread
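Before saving, draft values can be tested against the verification endpoint (POST /v1.0/assistants/{assistantId}/settings/slack/verify). A minimal payload-building sketch follows; the JSON field names are illustrative assumptions, not the confirmed schema — see REST_API.md for the actual request body.

```python
import json

def build_slack_verify_payload(app_token, bot_token, channel_id, indicator):
    """Assemble draft Slack settings for connectivity verification.
    Field names are hypothetical; consult REST_API.md for the real schema."""
    return {
        "AppToken": app_token,                    # xapp-... (Socket Mode app token)
        "BotToken": bot_token,                    # xoxb-... (bot user OAuth token)
        "ChannelId": channel_id,
        "StartOfMessageIndicator": indicator,     # e.g. "!ask"
    }

payload = build_slack_verify_payload("xapp-...", "xoxb-...", "C0123456789", "!ask")
body = json.dumps(payload)  # POST this body to .../settings/slack/verify
```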

Operational notes:

  • Slack tokens are stored in the AssistantHub database in plaintext, so rely on your deployment's at-rest protections
  • The Slack app must have Socket Mode enabled and be invited to any private channels it should service
  • AssistantHub consumes the EasySlack NuGet package at version 1.0.1

New in v0.7.0

  • Metadata filtering for chat completions -- Filter RAG retrieval to only return documents matching specified labels and/or tags. Labels are simple string lists (required/excluded). Tags are key-value conditions supporting operators: Equals, NotEquals, Contains, StartsWith, EndsWith, GreaterThan, LessThan, IsNull, IsNotNull. Filters can be configured as defaults on an assistant (applied to every conversation) and/or supplied per-request via the metadata_filter field on the chat completion request body. When both are present, they are merged (required labels/tags unioned, excluded labels/tags unioned).
  • Per-request metadata_filter on chat completions -- The POST /v1.0/assistants/{id}/chat endpoint accepts an optional metadata_filter object in the request body. This is an AssistantHub extension to the OpenAI-compatible chat schema. Clients that omit it get standard unfiltered retrieval. Example:
    {
      "messages": [{"role": "user", "content": "What were the Q4 results?"}],
      "metadata_filter": {
        "required_labels": ["finance", "quarterly-report"],
        "excluded_labels": ["draft"],
        "required_tags": [
          {"key": "department", "condition": "Equals", "value": "accounting"}
        ]
      }
    }
  • Assistant-level default filters -- New RetrievalLabelFilter and RetrievalTagFilter settings on each assistant. Configure via the dashboard (Retrieval Filters section) or API. These defaults are applied to every chat retrieval for that assistant.
  • Filter discovery endpoints -- Four new API endpoints to discover available filter values:
    • GET /v1.0/collections/{collectionId}/labels/distinct (admin)
    • GET /v1.0/collections/{collectionId}/tags/distinct (admin)
    • GET /v1.0/assistants/{assistantId}/labels/distinct (public)
    • GET /v1.0/assistants/{assistantId}/tags/distinct (public)
  • Dashboard -- Retrieval Filters configuration in assistant settings, collapsible metadata filter panel in the chat UI for per-session filtering, and metadata filter display in the history detail view
  • Auditing -- The effective merged filter is stored in ChatHistory.MetadataFilter and displayed in the History View modal
  • Docker image tags updated to v0.7.0
  • See CHANGELOG.md for full details
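The merge behavior described above (assistant defaults combined with a per-request metadata_filter, with required/excluded labels and tags unioned) can be sketched as follows. This is an illustrative reconstruction of the stated semantics, not AssistantHub's internal implementation.

```python
def merge_filters(default, per_request):
    """Union an assistant's default filter with a per-request filter,
    per the v0.7.0 description: labels and tags are unioned."""
    merged = {}
    for key in ("required_labels", "excluded_labels"):
        merged[key] = sorted(set(default.get(key, [])) | set(per_request.get(key, [])))
    for key in ("required_tags", "excluded_tags"):
        seen, tags = set(), []
        for tag in default.get(key, []) + per_request.get(key, []):
            ident = (tag["key"], tag["condition"], tag.get("value"))
            if ident not in seen:          # de-duplicate identical conditions
                seen.add(ident)
                tags.append(tag)
        merged[key] = tags
    return merged

default = {"required_labels": ["finance"], "excluded_labels": ["draft"]}
request = {"required_labels": ["finance", "quarterly-report"],
           "required_tags": [{"key": "department", "condition": "Equals",
                              "value": "accounting"}]}
merged = merge_filters(default, request)
```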

v0.6.0

  • LLM-based re-ranking -- After initial retrieval, an LLM scores each chunk's relevance to the user's query and filters out low-quality results before context injection
  • See CHANGELOG.md for full details

v0.5.0

  • Native web crawlers -- Built-in web crawling engine that automatically discovers, retrieves, and ingests website content. Configure a URL, schedule, and ingestion rule, and AssistantHub handles the rest
  • Crawl plans and scheduling -- Persistent crawler configurations with automatic recurring execution on configurable intervals (one-time, minutes, hours, days, weeks)
  • Delta-based crawling -- Subsequent crawls compare against the previous enumeration to process only new, changed, and deleted content
  • Document traceability -- Every crawled document is linked back to its source crawler and operation. Filter the Documents view by crawler to see all ingested content
  • On-demand controls -- Start, stop, test connectivity, and preview discovered content from the dashboard or API
  • Full dashboard integration -- Crawlers management view, operations viewer with statistics, enumeration browser, and Documents view integration
  • 16 new API endpoints -- Complete CRUD, lifecycle control, statistics, and enumeration access for crawl plans and operations
  • See CHANGELOG.md for full details

v0.4.0

  • Query rewrite -- LLM-based query rewriting for improved retrieval recall
  • Full multi-tenancy -- Row-level tenant isolation, three-tier authorization, auto-provisioning, tenant-scoped routes
  • See CHANGELOG.md for full details

v0.3.0

  • Initial release with multi-assistant platform, automated document ingestion, flexible search modes, streaming chat, and browser-based dashboard
  • See CHANGELOG.md for full details

Features

  • Assistants -- Create and manage multiple AI assistants, each with their own configuration, personality, and knowledge base.
  • Documents -- Upload documents (PDF, text, HTML, and more) to build a knowledge base for each assistant. Documents are automatically chunked, embedded, and indexed.
  • Crawlers -- Native web crawling engine that automatically discovers, retrieves, and ingests website content on a schedule. Supports delta-based crawling (only new/changed/deleted content is processed), configurable depth, parallelism, throttling, content filtering, and web authentication (Basic, API Key, Bearer Token). Each crawled document is traceable back to its source crawler and operation.
  • Ingestion Rules -- Define reusable ingestion configurations that specify target S3 buckets, RecallDB collections, summarization, chunking strategies, and embedding settings. Documents reference an ingestion rule for processing.
  • Summarization -- Optionally summarize document content before or after chunking using configurable completion endpoints, improving retrieval quality for long documents.
  • Endpoint Management -- Manage embedding and completion (inference) endpoints on the Partio service directly from the dashboard or API.
  • Search -- Leverages pgvector and RecallDB for vector, full-text, and hybrid search. Configure per-assistant search modes with tunable scoring weights for optimal retrieval from your document corpus.
  • Retrieval Gate -- Optional LLM-based retrieval gate that intelligently decides whether each user message requires a new document search or can be answered from existing conversation context, reducing unnecessary retrieval calls.
  • Chat -- Public-facing chat endpoint that retrieves relevant context from your documents and generates responses using configurable LLM providers (Ollama, OpenAI, Gemini). Supports real-time SSE streaming.
  • Conversation Compaction -- Automatic summarization of older messages when the conversation approaches the context window limit, preserving continuity across long conversations.
  • Feedback -- Collect thumbs-up/thumbs-down feedback and free-text comments on assistant responses to monitor quality and improve over time.
  • Multi-Tenant -- Full row-level tenant isolation with three-tier authorization (Global Admin via API key or IsAdmin flag, Tenant Admin, User). Auto-provisioning of tenant resources, per-tenant S3 bucket isolation ({tenantId}_ prefix), and tenant-scoped RecallDB mapping.
  • Dashboard -- Browser-based management UI for configuring assistants, uploading documents, viewing feedback, managing endpoints, and testing chat.
  • Query rewrite -- Optionally rewrite user queries into multiple semantically varied phrasings before retrieval to broaden recall and capture synonyms, alternate phrasing, and conceptual restatements
  • LLM-based re-ranking -- Re-ranking scores each retrieved chunk for relevance using an LLM, filtering low-quality results before context injection.
  • Metadata filtering -- Filter RAG retrieval by document labels (required/excluded string lists) and tags (key-value conditions with conditional operators). Configure default filters per assistant and/or override per-conversation via the metadata_filter field on chat completion requests.
  • Source citations -- Optional per-assistant citation metadata that maps model claims to source documents with bracket notation, relevance scores, and text excerpts. Configurable document linking via presigned S3 URLs or authenticated download endpoints
  • RAG evaluation -- Built-in evaluation framework for measuring retrieval and response quality. Define ground-truth facts (question/expected-facts pairs) per assistant, run automated evaluation passes with LLM-based judging, and review per-fact results with pass/fail verdicts. Supports custom judge prompts and real-time SSE progress streaming.

Quick Start (Docker)

The fastest way to run AssistantHub and all its dependencies is with Docker Compose. This is the recommended deployment method.

cd docker
docker compose up -d

Once all services are healthy, open http://localhost:8801 to access the dashboard.

On a fresh startup, assistanthub-server now waits for partio-server to become healthy before it starts. This avoids the transient partio-server:8400 DNS/startup race that could previously abort AssistantHub startup immediately after a factory reset.

Note: Deploying individual services outside of Docker is also possible, but requires manual configuration and deployment of each dependency (PostgreSQL with pgvector, Ollama, Less3, DocumentAtom, Partio, RecallDB). The Docker Compose stack handles all service wiring, health checks, and startup ordering automatically, which is why manual setup documentation is not provided.
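After docker compose up -d returns, you can script a readiness probe against the server's unauthenticated root endpoint instead of polling the dashboard by hand. A minimal sketch, assuming the default port mapping of 8800 for assistanthub-server:

```python
import time
import urllib.request

def backoff_schedule(attempts, base=1.0, cap=10.0):
    """Exponential backoff delays (seconds) between health probes."""
    return [min(base * 2 ** i, cap) for i in range(attempts)]

def wait_until_healthy(url, attempts=8):
    """Probe the unauthenticated health endpoint until it answers 200."""
    for delay in backoff_schedule(attempts):
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            pass  # container not up yet; wait and retry
        time.sleep(delay)
    return False

# wait_until_healthy("http://localhost:8800/")
```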

Services

The Docker Compose stack orchestrates the following services:

Service Port Description
assistanthub-server 8800 The core AssistantHub REST API server (.NET 10). Handles all business logic: assistant management, document ingestion orchestration, chat with RAG, user authentication, and integration with all downstream services.
assistanthub-dashboard 8801 Browser-based management dashboard (React 19, served by nginx). Provides a full UI for configuring assistants, uploading documents, managing endpoints, viewing feedback/history, and live chat testing. Proxies API requests to the server.
ollama 11434 Local LLM inference engine. Runs language models (e.g., gemma3:4b) for chat completion, conversation compaction, retrieval gate classification, and title generation. Models are persisted in a Docker volume.
less3 8000 S3-compatible object storage server. Stores uploaded document files. AssistantHub uses the S3 API to write, read, and delete document objects during ingestion and cleanup.
less3-ui 8001 Web-based management UI for Less3. Allows direct browsing and management of S3 buckets and objects.
documentatom-server 8301 Document processing service. Extracts text content from uploaded files (PDF, DOCX, HTML, text, and more), returning structured cells that represent the document's content.
documentatom-dashboard 8302 Web-based management UI for DocumentAtom.
partio-server 8321 Text chunking, embedding, and summarization service. Splits extracted text into chunks using configurable strategies, computes vector embeddings via configurable embedding endpoints, and optionally summarizes content using a completion endpoint. Also manages embedding and completion endpoint configurations.
partio-dashboard 8322 Web-based management UI for Partio. Allows direct management of embedding and completion endpoints.
pgvector 5432 PostgreSQL with the pgvector extension. Provides the underlying vector storage and full-text search capabilities used by RecallDB. Supports cosine similarity search over high-dimensional embedding vectors.
recalldb-server 8401 Vector and full-text search database. Wraps pgvector with a REST API for storing, searching, and managing document embeddings. Supports vector search (semantic similarity), full-text search (keyword matching), and hybrid search (weighted combination).
recalldb-dashboard 8402 Web-based management UI for RecallDB. Allows direct browsing of collections, records, and search testing.

Using an External Ollama Instance

If you already have Ollama running on your host machine or on another server, you can skip the containerized Ollama and point AssistantHub at your existing instance instead.

1. Comment out the Ollama service in docker/compose.yaml:

Comment out (or remove) the ollama service and its volume:

services:

  # --- Infrastructure ---

  # ollama:
  #   image: ollama/ollama:latest
  #   container_name: ollama
  #   ports:
  #     - "11434:11434"
  #   environment:
  #     OLLAMA_NUM_PARALLEL: "4"
  #     OLLAMA_MAX_LOADED_MODELS: "4"
  #   volumes:
  #     - ollama-models:/root/.ollama
  #   restart: unless-stopped

Also comment out the ollama-models volume at the bottom of the file:

volumes:
  pgvector-data:
  # ollama-models:

And remove - ollama from the partio-server service's depends_on list.

2. Update docker/assistanthub/assistanthub.json to point to your Ollama instance:

In the Inference section, change the Endpoint from the container hostname to your Ollama instance's address:

"Inference": {
  "Provider": "Ollama",
  "Endpoint": "http://host.docker.internal:11434",
  "ApiKey": "default",
  "DefaultModel": "gemma3:4b"
}
  • Ollama on the same machine (Docker Desktop): Use http://host.docker.internal:11434. The special hostname host.docker.internal resolves to your host machine from inside Docker containers. Do not use localhost -- inside a container, localhost refers to the container itself, not your host machine.
  • Ollama on the same machine (Linux without Docker Desktop): Use http://172.17.0.1:11434 (the default Docker bridge gateway), or run the compose stack with network_mode: host. You may also need to set OLLAMA_HOST=0.0.0.0 in your Ollama configuration so it listens on all interfaces.
  • Ollama on another machine: Use that machine's IP or hostname, e.g. http://192.168.1.50:11434. Ensure the Ollama port is accessible from the Docker network.

3. Update docker/partio/partio.json to point to your Ollama instance:

In the DefaultEmbeddingEndpoints section, change the Endpoint from the container hostname to match the address you used above:

"DefaultEmbeddingEndpoints": [
  {
    "Model": "all-minilm",
    "Endpoint": "http://host.docker.internal:11434",
    "ApiFormat": "Ollama",
    "ApiKey": null
  }
]

4. Update embedding and completion endpoints in the Partio dashboard:

After startup, open the Partio dashboard at http://localhost:8322 and update both the embedding endpoints and completion endpoints to point to your Ollama instance:

  • Change the Endpoint URL from http://ollama:11434 to your instance's address (e.g. http://host.docker.internal:11434).
  • Change the Health Check URL from a relative path (/api/tags) to a fully-qualified URL (e.g. http://host.docker.internal:11434/api/tags). Health checks using relative paths will fail with an "invalid request URI" error.

Without these changes, document ingestion (embeddings) and chat completions will fail.
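The health-check fix in step 4 is a simple URL transformation: prepend the endpoint's base address to the relative path. A small helper sketch (the function name is illustrative):

```python
def qualify_health_url(endpoint, health_path):
    """Turn a relative health-check path into a fully qualified URL,
    since relative paths fail with an 'invalid request URI' error."""
    if health_path.startswith(("http://", "https://")):
        return health_path                     # already fully qualified
    return endpoint.rstrip("/") + "/" + health_path.lstrip("/")

url = qualify_health_url("http://host.docker.internal:11434", "/api/tags")
```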

5. Start the stack:

cd docker
docker compose up -d

Dashboards

Dashboard URL Default Credentials
AssistantHub http://localhost:8801 Email: admin@assistanthub, Password: password
Less3 http://localhost:8001 Admin API Key: less3admin, Access Key: default, Secret Key: default
DocumentAtom http://localhost:8302 No authentication configured by default
Partio http://localhost:8322 Email: admin@partio, Password: password, Admin API Key: partioadmin
RecallDB http://localhost:8402 Email: admin@recall, Password: password, Admin API Key: recalldbadmin

Important: Change all default passwords immediately after first login.


Configuration

The server reads configuration from assistanthub.json in the working directory. For Docker deployments, this file is located at docker/assistanthub/assistanthub.json and is mounted into the container.

{
  "Webserver": {
    "Hostname": "*",
    "Port": 8800,
    "Ssl": false
  },
  "Database": {
    "Type": "Sqlite",
    "Filename": "./data/assistanthub.db",
    "Hostname": "",
    "Port": 0,
    "DatabaseName": "",
    "Username": "",
    "Password": ""
  },
  "S3": {
    "Region": "USWest1",
    "BucketName": "default",
    "AccessKey": "default",
    "SecretKey": "default",
    "EndpointUrl": "http://less3:8000",
    "UseSsl": false,
    "BaseUrl": "http://less3:8000"
  },
  "DocumentAtom": {
    "Endpoint": "http://documentatom-server:8000",
    "AccessKey": "default"
  },
  "Chunking": {
    "Endpoint": "http://partio-server:8400",
    "AccessKey": "partioadmin",
    "EndpointId": "default"
  },
  "Embeddings": {
    "Endpoint": "http://partio-server:8400",
    "AccessKey": "partioadmin",
    "EndpointId": "default"
  },
  "Inference": {
    "Provider": "Ollama",
    "Endpoint": "http://ollama:11434",
    "ApiKey": "default",
    "DefaultModel": "gemma3:4b"
  },
  "RecallDb": {
    "Endpoint": "http://recalldb-server:8600",
    "AccessKey": "recalldbadmin"
  },
  "AdminApiKeys": [
    "changeme"
  ],
  "DefaultTenant": {
    "Id": "default",
    "Name": "Default"
  },
  "ProcessingLog": {
    "Directory": "./processing-logs/",
    "RetentionDays": 30
  },
  "ChatHistory": {
    "RetentionDays": 7
  },
  "Crawl": {
    "EnumerationDirectory": "./crawl-enumerations/"
  },
  "Logging": {
    "ConsoleLogging": true,
    "EnableColors": false,
    "FileLogging": true,
    "LogDirectory": "./logs/",
    "LogFilename": "assistanthub.log",
    "IncludeDateInFilename": true,
    "MinimumSeverity": 1,
    "Servers": []
  }
}
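A quick sanity check before mounting the file: confirm it parses as JSON and contains the top-level sections shown above. The required-section list below simply mirrors the example configuration; it is not an official schema.

```python
import json

# Top-level sections taken from the example config above (not an official schema).
REQUIRED_SECTIONS = {"Webserver", "Database", "S3", "DocumentAtom",
                     "Chunking", "Embeddings", "Inference", "RecallDb"}

def validate_config(text):
    """Parse assistanthub.json and report any missing top-level sections."""
    config = json.loads(text)                  # raises on malformed JSON
    missing = REQUIRED_SECTIONS - config.keys()
    return config, sorted(missing)

sample = '{"Webserver": {"Port": 8800}, "Inference": {"Provider": "Ollama"}}'
config, missing = validate_config(sample)
```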

Key Settings

Section Description
Webserver Hostname, port, and SSL toggle for the HTTP listener.
Database Database type (Sqlite, Postgresql, SqlServer, Mysql) and connection details.
S3 S3-compatible object storage (Less3) for uploaded documents.
DocumentAtom Endpoint and access key for the DocumentAtom document-processing service.
Chunking Endpoint, access key, and default endpoint ID for the Partio chunking service.
Embeddings Endpoint, access key, and default endpoint ID for the Partio embeddings service.
Inference LLM provider (Ollama, OpenAI, or Gemini), endpoint, API key, and default model.
RecallDb Endpoint and access key for the RecallDB vector database service.
AdminApiKeys List of API keys that grant global admin access (not tied to any tenant). Users with IsAdmin=true also receive global admin privileges.
DefaultTenant ID and name for the default tenant, auto-created on first run.
ProcessingLog Directory and retention for per-document processing logs (namespaced by tenant).
ChatHistory Retention period in days for chat history records (0 = keep indefinitely). Background cleanup runs hourly.
Crawl Directory for storing crawl enumeration files (delta snapshots used for change detection between crawl runs).
Logging Console/file logging toggles, severity level, log directory, and optional syslog servers.

Factory Reset (Docker)

To completely reset AssistantHub to a clean state, use the factory reset script:

cd docker
docker compose down
cd factory
./reset.sh        # Linux/macOS
reset.bat         # Windows

The script will prompt you to type RESET to confirm. This destroys all runtime data (databases, uploaded documents, logs, vector data) and restores factory-default databases. Configuration files are preserved. Downloaded Ollama models are kept by default; pass --include-models to remove them as well.

After the reset completes, start the environment again:

cd docker
docker compose up -d

Expected behavior after reset:

  • assistanthub-server will not start until partio-server is healthy
  • This is intentional and prevents AssistantHub from failing early while validating chunking and embeddings connectivity
  • If startup appears slower than before, wait for Partio to finish its health checks and model initialization

API Overview

AssistantHub exposes a versioned REST API at /v1.0/. All authenticated endpoints require a bearer token in the Authorization header or as a token query parameter.

For complete endpoint documentation including request/response schemas and examples, see REST_API.md.
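Both authentication styles mentioned above (Authorization header or token query parameter) are trivial to construct client-side:

```python
from urllib.parse import urlencode

def auth_header(token):
    """Bearer token in the Authorization header (the usual form)."""
    return {"Authorization": f"Bearer {token}"}

def auth_query(url, token):
    """Alternative form: pass the token as a 'token' query parameter."""
    sep = "&" if "?" in url else "?"
    return url + sep + urlencode({"token": token})

hdr = auth_header("changeme")
url = auth_query("http://localhost:8800/v1.0/whoami", "changeme")
```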

Endpoint Summary

Category Endpoints Description
Health GET /, HEAD / Server info and health check (unauthenticated)
Authentication POST /v1.0/authenticate Authenticate with email/password (+ optional TenantId) or bearer token
WhoAmI GET /v1.0/whoami Return current authentication context (tenant, role, user)
Tenants PUT/GET /v1.0/tenants, GET/PUT/DELETE/HEAD /v1.0/tenants/{id} Tenant management (global admin only)
Users PUT/GET /v1.0/tenants/{tenantId}/users, GET/PUT/DELETE/HEAD .../users/{id} Tenant-scoped user management
Credentials PUT/GET /v1.0/tenants/{tenantId}/credentials, GET/PUT/DELETE/HEAD .../credentials/{id} Tenant-scoped credential management
Buckets PUT/GET /v1.0/buckets, GET/DELETE/HEAD /v1.0/buckets/{name} S3 bucket management (tenant-scoped by {tenantId}_ prefix)
Bucket Objects GET/PUT/POST/DELETE /v1.0/buckets/{name}/objects S3 object management with upload, download, metadata, and directory creation (tenant-scoped)
Collections PUT/GET /v1.0/collections, GET/PUT/DELETE/HEAD /v1.0/collections/{id} RecallDB collection management (admin only)
Collection Records PUT/GET /v1.0/collections/{id}/records, GET/DELETE .../records/{recordId} Browse and manage records within collections (admin only)
Collection Metadata GET /v1.0/collections/{id}/labels/distinct, GET .../tags/distinct Discover distinct label values and tag keys in a collection (admin only)
Ingestion Rules PUT/GET /v1.0/ingestion-rules, GET/PUT/DELETE/HEAD /v1.0/ingestion-rules/{id} Document processing rule management
Embedding Endpoints PUT /v1.0/endpoints/embedding, POST .../enumerate, GET/PUT/DELETE/HEAD .../{id}, GET .../health, POST .../test Partio embedding endpoint management and smoke testing (admin only)
Completion Endpoints PUT /v1.0/endpoints/completion, POST .../enumerate, GET/PUT/DELETE/HEAD .../{id}, GET .../health, POST .../test Partio completion endpoint management and smoke testing (admin only)
Assistants PUT/GET /v1.0/assistants, GET/PUT/DELETE/HEAD /v1.0/assistants/{id} Assistant management (owner or admin)
Assistant Settings GET/PUT /v1.0/assistants/{id}/settings, POST .../settings/slack/verify Per-assistant endpoint, prompt, RAG, and Slack configuration. Includes draft Slack connectivity verification (owner or admin).
Crawl Plans PUT/GET /v1.0/crawlplans, GET/PUT/DELETE/HEAD /v1.0/crawlplans/{id}, POST .../start, POST .../stop, POST .../connectivity, GET .../enumerate Crawler management with schedule control, connectivity testing, and content preview
Crawl Operations GET /v1.0/crawlplans/{id}/operations, GET .../statistics, GET/DELETE .../operations/{id}, GET .../statistics, GET .../enumeration Crawl execution history, statistics, and enumeration file access
Documents PUT/GET /v1.0/documents, GET/DELETE/HEAD /v1.0/documents/{id}, GET .../processing-log Document upload, management, and processing log access
Feedback GET /v1.0/feedback, GET/DELETE /v1.0/feedback/{id} View and manage user feedback
History GET /v1.0/history, GET/DELETE /v1.0/history/{id} View and manage chat history with timing metrics
Threads GET /v1.0/threads List conversation threads
Models GET /v1.0/models, POST /v1.0/models/pull, GET .../pull/status, DELETE /v1.0/models/{modelName} List, pull, delete, and check pull status for inference models
Eval Facts PUT/GET /v1.0/eval/facts, GET/PUT/DELETE /v1.0/eval/facts/{factId} Ground-truth fact management for RAG evaluation
Eval Runs POST/GET /v1.0/eval/runs, GET/DELETE /v1.0/eval/runs/{runId}, GET .../results, GET .../stream Start, list, and stream evaluation runs with LLM-judged results
Eval Results GET /v1.0/eval/results/{resultId} Retrieve individual evaluation result details
Eval Judge Prompt GET /v1.0/eval/judge-prompt/default Retrieve the default judge prompt template
Configuration GET/PUT /v1.0/configuration View and update server configuration (admin only)
Public Chat POST /v1.0/assistants/{id}/chat Chat completion with RAG and optional metadata filtering (unauthenticated, SSE or JSON)
Public Generate POST /v1.0/assistants/{id}/generate Lightweight inference without RAG (unauthenticated)
Public Compact POST /v1.0/assistants/{id}/compact Force conversation compaction (unauthenticated)
Public Feedback POST /v1.0/assistants/{id}/feedback Submit feedback (unauthenticated)
Public Info GET /v1.0/assistants/{id}/public Get assistant public info and appearance (unauthenticated)
Public Metadata GET /v1.0/assistants/{id}/labels/distinct, GET .../tags/distinct Discover available label and tag filter values for an assistant's collection (unauthenticated)
Public Threads POST /v1.0/assistants/{id}/threads Create a conversation thread (unauthenticated)

Architecture

                          ┌──────────────────┐
                          │    Dashboard     │
                          │  (React / Vite)  │
                          │    Port 8801     │
                          └────────┬─────────┘
                                   │
                                   │ HTTP (nginx reverse proxy)
                                   ▼
                          ┌──────────────────┐
                          │  AssistantHub    │
                          │ Server (.NET 10) │
                          │    Port 8800     │
                          └──┬────┬────┬──┬──┘
                             │    │    │  │
              ┌──────────────┘    │    │  └──────────────┐
              │                   │    │                 │
              ▼                   ▼    ▼                 ▼
   ┌──────────────────┐ ┌────────────────┐    ┌──────────────────┐
   │   DocumentAtom   │ │   RecallDB     │    │      Less3       │
   │ (Doc Processing) │ │(Vector Search) │    │  (S3 Storage)    │
   │    Port 8301     │ │   Port 8401    │    │    Port 8000     │
   └────────┬─────────┘ └────────┬───────┘    └──────────────────┘
            │                    │
            ▼                    ▼
   ┌──────────────────┐ ┌──────────────────┐
   │     Partio       │ │    pgvector      │
   │ (Chunk/Embed)    │ │  (PostgreSQL)    │
   │    Port 8321     │ │    Port 5432     │
   └────────┬─────────┘ └──────────────────┘
            │
            ▼
   ┌──────────────────┐
   │     Ollama       │
   │  (LLM Inference) │
   │   Port 11434     │
   └──────────────────┘

Document Ingestion Data Flow

  ┌─────────┐       ┌──────────────┐       ┌──────────────┐
  │  User   │       │ AssistantHub │       │    Less3     │
  │(Browser │──1───►│   Server     │──2───►│ (S3 Storage) │
  │  or API)│       │              │       └──────────────┘
  └─────────┘       └──────┬───────┘
                           │
                      3    │
                           ▼
                    ┌──────────────┐
                    │ DocumentAtom │   Extracts text cells
                    │              │   from PDF, DOCX, HTML, etc.
                    └──────┬───────┘
                           │
                      4    │  Text cells
                           ▼
                    ┌──────────────┐
                    │    Partio    │   Optionally summarizes cells,
                    │              │   chunks text, computes embeddings
                    └──────┬───────┘
                           │
                      5    │  Chunks + embeddings
                           ▼
                    ┌──────────────┐
                    │   RecallDB   │   Stores chunks and vectors
                    │  (pgvector)  │   for retrieval
                    └──────────────┘
  1. User uploads a document via the API or dashboard, selecting an ingestion rule.
  2. The document file is stored in the ingestion rule's S3 bucket via Less3.
  3. DocumentAtom extracts text content from the document, returning structured cells.
  4. Partio processes the cells: optionally summarizes (pre- or post-chunking per the rule), splits into chunks using the rule's chunking strategy, and computes vector embeddings via the configured embedding endpoint.
  5. Chunks and embeddings are stored in the ingestion rule's RecallDB collection. Chunk record IDs are saved on the document for cleanup on deletion.
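The five steps above can be sketched as a single orchestration function. Every helper here is a hypothetical in-memory stand-in, not an AssistantHub internal API; the point is the shape of the pipeline and the bookkeeping of chunk record IDs for later cleanup.

```python
def store_in_s3(file_bytes, bucket):
    return f"{bucket}/doc-1"                       # step 2: write object, return key

def extract_cells(object_key):
    return ["cell A", "cell B"]                    # step 3: DocumentAtom text cells

def chunk_and_embed(cells):
    return [{"text": c, "vector": [0.0]} for c in cells]   # step 4: Partio

def index_chunks(chunks, collection):
    return [f"{collection}:{i}" for i in range(len(chunks))]  # step 5: RecallDB

def ingest_document(file_bytes, rule):
    """Illustrative orchestration of the ingestion flow above."""
    key = store_in_s3(file_bytes, rule["bucket"])
    cells = extract_cells(key)
    chunks = chunk_and_embed(cells)
    record_ids = index_chunks(chunks, rule["collection"])
    # Record IDs are saved on the document so deletion can clean up chunks.
    return {"object_key": key, "chunk_record_ids": record_ids}

doc = ingest_document(b"...", {"bucket": "default", "collection": "kb"})
```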

Chat Data Flow

  ┌─────────┐       ┌──────────────┐       ┌──────────────┐
  │  User   │       │ AssistantHub │       │   RecallDB   │
  │(Browser │──1───►│   Server     │──2───►│  (pgvector)  │
  │  or API)│       │              │◄──3───│              │
  └─────────┘       └──────┬───────┘       └──────────────┘
       ▲                   │
       │                4  │  Context + messages
       │                   ▼
       │            ┌──────────────┐
       └─────6──────│    Ollama    │   Generates response
                    │  (Inference) │   (streaming or batch)
                    └──────────────┘
  1. User sends a message to the chat endpoint with conversation history.
  2. If RAG is enabled (and the retrieval gate permits), the server embeds the query and searches RecallDB using the assistant's configured search mode (vector, full-text, or hybrid).
  3. RecallDB returns relevant document chunks ranked by similarity score.
  4. The server assembles the system prompt with retrieved context and sends the full message list to the configured inference provider (Ollama, OpenAI, or Gemini). If the conversation exceeds the context window, older messages are compacted first.
  5. The LLM generates a response.
  6. The response is streamed back to the user token-by-token via SSE (or returned as a complete JSON response). Chat history with timing metrics is persisted.
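On the client side, consuming the SSE stream in step 6 amounts to collecting data: lines into events and decoding each event's JSON. A minimal parser sketch; the {"delta": ...} payload shape is an assumption for illustration — see REST_API.md for the actual streaming schema.

```python
import json

def parse_sse_events(lines):
    """Minimal SSE parser: accumulate 'data:' lines, emit the decoded
    JSON payload at each blank-line event boundary."""
    buffer = []
    for line in lines:
        if line.startswith("data:"):
            buffer.append(line[5:].strip())
        elif line == "" and buffer:            # blank line ends an event
            yield json.loads("\n".join(buffer))
            buffer = []

stream = [
    'data: {"delta": "Hello"}',
    "",
    'data: {"delta": " world"}',
    "",
]
tokens = [event["delta"] for event in parse_sse_events(stream)]
```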

Tech Stack

  • Backend: .NET 10 (C#), WatsonWebserver
  • Frontend: React 19, Vite 6, JavaScript
  • Database: SQLite (default), PostgreSQL, SQL Server, MySQL
  • Vector Search: RecallDB backed by PostgreSQL with pgvector
  • Document Processing: DocumentAtom (text extraction), Partio (chunking, embedding, summarization)
  • Object Storage: Less3 (S3-compatible)
  • Inference Providers: Ollama (local), OpenAI (cloud), Gemini (cloud)
  • Containerization: Docker, Docker Compose
  • Web Server (Dashboard): nginx

SDKs

Client libraries are available for integrating with the AssistantHub API:

SDK Location Description
JavaScript/TypeScript sdk/js/ Dual ESM/CJS output, native fetch, async generators for SSE streaming
Python sdk/python/ Pydantic v2 models, httpx client, PEP 561 compliant
C# sdk/csharp/ .NET 8.0, System.Text.Json, typed exceptions, IAsyncEnumerable streaming

Each SDK directory contains its own README with installation instructions and usage examples.


Issues, Feedback, and Improvements

  • Bug Reports and Feature Requests -- Use the Issues tab to report bugs or request new features.
  • Questions and Discussion -- Use the Discussions tab for general questions, ideas, and community feedback.
  • Improvements -- We are happy to accept pull requests; please keep them focused and short.

License

This project is licensed under the MIT License. See LICENSE.md for details.

Copyright (c) 2026 Joel Christner.
