Abstract AssistantServer behind ABC for pluggable voice frameworks #14
sboily wants to merge 2 commits into ServiceNow:main
Thank you for creating this PR! We would also love to make EVA framework-agnostic so that it's possible to test with frameworks outside of Pipecat. This makes sense as a start. However, the bigger challenge we see is implementing any AssistantServer so that it replicates the Pipecat logging exactly. Renaming the logs is fine, but our evaluation logic is tightly coupled to the Pipecat log entries and format, so any AssistantServer would need to produce exactly the same type of logs. It would help us if you could include the AssistantServer class you plan to use, in addition to the abstract base class, so we can see how it would work.
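One way to make the log-parity concern above checkable is to compare the shape of the records in a reference log with those in a candidate log. The sketch below is only an illustration: it assumes the logs are JSON Lines with one object per line, and it compares sets of keys rather than any real Pipecat event schema, which is not described in this thread. The function names are hypothetical.

```python
import json
from typing import FrozenSet, Set


def record_shapes(jsonl_text: str) -> Set[FrozenSet[str]]:
    """Return the distinct key sets found in a JSON Lines document.

    Assumption: every non-empty line is a JSON object.
    """
    shapes: Set[FrozenSet[str]] = set()
    for line in jsonl_text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        record = json.loads(line)
        shapes.add(frozenset(record))  # iterating a dict yields its keys
    return shapes


def unexpected_shapes(reference: str, candidate: str) -> Set[FrozenSet[str]]:
    """Key sets that appear in the candidate log but never in the reference."""
    return record_shapes(candidate) - record_shapes(reference)
```

An empty result only says that the candidate log introduces no record layout the reference lacks. It does not check values, ordering, or timing, which the evaluation logic may also depend on.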
Extract AssistantServerBase ABC from the Pipecat-coupled AssistantServer so that alternative voice frameworks can be plugged in without modifying the orchestrator, metrics, or evaluation pipeline.

Changes:
- New AssistantServerBase ABC (src/eva/assistant/base.py) defining the server contract: start(), stop(), get_conversation_stats()
- Rename AssistantServer -> PipecatAssistantServer (backward-compat alias)
- Factory function with lazy-import registry in assistant/__init__.py
- Add EVA_FRAMEWORK config field to RunConfig (default: "pipecat")
- Worker uses factory instead of direct import
- Rename pipecat_logs_path -> framework_logs_path throughout
- Remove dead execute_realtime_tool from ToolExecutor
- Move nvidia-riva-client to optional [nvidia] dep (conflicts with roomkit's websockets/deepgram-sdk requirements)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Concrete implementation of AssistantServerBase using RoomKit's voice
pipeline while reusing EVA's AgenticSystem, AuditLog, and ToolExecutor
for LLM reasoning.
Architecture:
- TwilioWebSocketBackend (from roomkit.voice.backends.twilio_ws)
bridges EVA's Twilio WebSocket protocol to RoomKit's VoiceChannel
- RoomKit VoiceChannel handles STT (Deepgram), TTS (ElevenLabs), VAD
- RoomKit WavFileRecorder with ALL mode for audio output
(inbound + outbound + mixed WAV files)
- RoomKit hooks produce framework_logs.jsonl events
- EVA's AgenticSystem handles LLM reasoning + tool calling unchanged
- Dedicated write queue for full-duplex WebSocket audio
Tested end-to-end: full conversations with correct metrics,
clean audio recordings, user_behavioral_fidelity = 1.0.
Usage: EVA_FRAMEWORK=roomkit EVA_MODEL__STT=deepgram \
EVA_MODEL__TTS=elevenlabs python main.py --debug
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
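The "dedicated write queue" item above can be pictured as a single writer task draining an asyncio queue, so the receive loop never waits on a send. This is a minimal sketch of that general pattern, not RoomKit's or EVA's actual code: the class name, the `send` callback, and the `maxsize` default are all assumptions made for illustration.

```python
import asyncio
from typing import Awaitable, Callable, Optional


class AudioWriteQueue:
    """Funnel outbound frames through one writer task (hypothetical helper).

    Construct and start it from inside a running event loop.
    """

    def __init__(self, send: Callable[[bytes], Awaitable[None]], maxsize: int = 256) -> None:
        self._send = send  # e.g. a coroutine that writes to the WebSocket
        self._queue: asyncio.Queue = asyncio.Queue(maxsize=maxsize)
        self._task: Optional[asyncio.Task] = None

    def start(self) -> None:
        # One task owns all writes, so frames go out in the order they were queued.
        self._task = asyncio.create_task(self._drain())

    async def put(self, frame: bytes) -> None:
        # Waits only when the queue is full, which applies backpressure to producers.
        await self._queue.put(frame)

    async def close(self) -> None:
        await self._queue.put(None)  # sentinel: stop after pending frames are sent
        if self._task is not None:
            await self._task

    async def _drain(self) -> None:
        while True:
            frame = await self._queue.get()
            if frame is None:
                break
            await self._send(frame)
```

The point of the design is that inbound audio handling and outbound audio delivery run as separate tasks, which is what full-duplex operation over one WebSocket needs.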
Thanks for the feedback! I've updated the PR with a concrete RoomKitAssistantServer implementation. What I did:
Dependency challenge: I propose restructuring pyproject.toml to separate framework-specific deps into optional extras:
Core deps (LLM, metrics, evaluation) stay in dependencies. This makes the framework choice explicit and avoids version conflicts. The PR includes this restructuring.
Current status:
Happy to iterate based on your input.
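With framework dependencies moved into optional extras, a server module can fail early with a clear message when its extra is not installed. The helper below is a sketch of that idea using only the standard library. The function name `require_extra` and the wording of the message are assumptions, not part of the PR.

```python
from importlib.util import find_spec


def require_extra(module: str, extra: str) -> None:
    """Raise a readable ImportError if `module` is not importable.

    `extra` is the name of the optional dependency group expected to provide it.
    """
    try:
        spec = find_spec(module)
    except ModuleNotFoundError:
        # Raised for dotted names whose parent package is missing.
        spec = None
    if spec is None:
        raise ImportError(
            f"Module '{module}' is not installed; install the '{extra}' extra "
            f"to use this framework."
        )
```

Calling this at the top of a framework-specific module (for example `require_extra("roomkit", "roomkit")`, with both names assumed here) keeps the failure close to the configuration choice instead of surfacing as a deep import traceback.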
Why
We're building RoomKit — a multi-channel voice/AI framework — and we'd love to use EVA to benchmark our voice agents. Today, the assistant server is tightly coupled to Pipecat, which means testing any other framework requires forking and rewriting internals.
We think EVA's evaluation methodology (bot-to-bot conversations, accuracy + experience metrics) is excellent and would benefit from being framework-agnostic, letting the community benchmark their own voice stacks against the same scenarios and metrics.
What
This PR introduces an AssistantServerBase ABC that codifies the existing contract between ConversationWorker and the assistant server (which was already narrow: just start(), stop(), and get_conversation_stats()). The current Pipecat implementation becomes PipecatAssistantServer, one concrete subclass behind a factory with lazy imports.

After this change, plugging in a new framework requires:
- AssistantServerBase
- Literal type in RunConfig.framework

No changes to the orchestrator, metrics, or evaluation pipeline.
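The contract and factory described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in the PR: the three method names come from the description, but whether start() and stop() are coroutines, the return type of get_conversation_stats(), the registry layout, the module paths, and the `register_framework` helper are all hypothetical.

```python
from abc import ABC, abstractmethod
from importlib import import_module
from typing import Any, Dict


class AssistantServerBase(ABC):
    """The narrow contract the worker relies on."""

    @abstractmethod
    async def start(self) -> None:
        """Bring the server up (assumed to be a coroutine)."""

    @abstractmethod
    async def stop(self) -> None:
        """Shut the server down (assumed to be a coroutine)."""

    @abstractmethod
    def get_conversation_stats(self) -> Dict[str, Any]:
        """Return per-conversation statistics (return type assumed)."""


# Framework name -> "module:ClassName". The module paths are hypothetical.
# Storing strings keeps the import lazy: nothing is loaded until requested.
_REGISTRY: Dict[str, str] = {
    "pipecat": "eva.assistant.pipecat_server:PipecatAssistantServer",
    "roomkit": "eva.assistant.roomkit_server:RoomKitAssistantServer",
}


def register_framework(name: str, target: str) -> None:
    """Add or replace a registry entry (hypothetical helper)."""
    _REGISTRY[name] = target


def create_assistant_server(framework: str = "pipecat", **kwargs: Any) -> AssistantServerBase:
    """Import the selected implementation on demand and instantiate it."""
    try:
        target = _REGISTRY[framework]
    except KeyError:
        known = ", ".join(sorted(_REGISTRY))
        raise ValueError(f"Unknown framework '{framework}'; known: {known}") from None
    module_name, _, class_name = target.partition(":")
    cls = getattr(import_module(module_name), class_name)
    if not (isinstance(cls, type) and issubclass(cls, AssistantServerBase)):
        raise TypeError(f"{target} does not subclass AssistantServerBase")
    return cls(**kwargs)
```

Because the registry holds strings rather than classes, selecting one framework never imports another framework's dependencies, which matters when those dependencies conflict.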
Heads up
This is a first proposal to open a discussion. We understand this is a meaningful architectural change and we're not expecting it to be merged as-is. We'd like to hear from the maintainers:
pipecat_logs → framework_logs)?

Happy to iterate on any of this based on your feedback.
🤖 Generated with Claude Code