Description
XPerto is a voice assistant designed to support multi-person conversations such as meetings or interviews. It is being developed for a research project at the Hochschule der Medien Stuttgart that studies the acceptance of AI assistants in professional settings.
The main features include:
- Voice Interaction: Allows users to interact with the assistant using natural language.
- Multi-User Turn Detection: Automatically detects from the conversation context whether a response is needed, so that multiple participants can talk to each other as well as to the assistant.
- Contextual Awareness: Maintains context of the conversation to provide relevant responses.
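To make the interplay of turn detection and contextual awareness more concrete, here is a minimal sketch in plain Python. It is not XPerto's actual implementation: the names `Utterance`, `ConversationContext`, and `should_respond`, the wake word, and the simple keyword heuristic are all assumptions chosen for illustration. A real system would likely base the decision on a richer reading of the conversation.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Utterance:
    speaker: str
    text: str


@dataclass
class ConversationContext:
    """Rolling window of recent utterances (hypothetical helper)."""

    max_turns: int = 20
    turns: deque = field(default_factory=deque)

    def add(self, utterance: Utterance) -> None:
        self.turns.append(utterance)
        # Drop the oldest utterances once the window is full.
        while len(self.turns) > self.max_turns:
            self.turns.popleft()


def should_respond(context: ConversationContext, utterance: Utterance,
                   wake_words: tuple = ("xperto",)) -> bool:
    """Illustrative heuristic: decide whether the assistant should answer.

    Call this before adding `utterance` to the context.
    """
    text = utterance.text.lower()
    # Case 1: the assistant is addressed by name.
    if any(word in text for word in wake_words):
        return True
    # Case 2: a question directly following the assistant's own turn
    # is treated as a follow-up to the assistant.
    if context.turns and context.turns[-1].speaker == "assistant":
        return text.rstrip().endswith("?")
    # Otherwise assume the participants are talking to each other.
    return False
```

With this sketch, a remark between two participants is ignored, while an utterance that names the assistant, or a question asked right after the assistant has spoken, triggers a response.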
Technologies
- Pydantic: Used for settings management.
- Pipecat: The main framework used for building the voice assistant.
- Deepgram: Speech-to-Text service used for transcribing audio input.
- GPT-4o: Language model used for generating responses.
- ElevenLabs: Text-to-Speech service used for generating spoken responses.
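The components above suggest a speech-to-text, language model, text-to-speech chain. The following sketch shows that data flow with plain Python functions. It deliberately does not use the Pipecat, Deepgram, OpenAI, or ElevenLabs APIs: `transcribe`, `generate_reply`, `synthesize`, and `run_pipeline` are hypothetical stand-ins, and the "audio" is just UTF-8 bytes so the example stays self-contained.

```python
from typing import Any, Callable, Optional, Sequence

Stage = Callable[[Any], Optional[Any]]


def transcribe(audio: bytes) -> str:
    """Stand-in for the speech-to-text stage (Deepgram in XPerto)."""
    return audio.decode("utf-8")


def generate_reply(transcript: str) -> Optional[str]:
    """Stand-in for the language model stage (GPT-4o in XPerto).

    Returns None when there is nothing to answer.
    """
    stripped = transcript.strip()
    if not stripped:
        return None
    return f"You said: {stripped}"


def synthesize(reply: str) -> bytes:
    """Stand-in for the text-to-speech stage (ElevenLabs in XPerto)."""
    return reply.encode("utf-8")


def run_pipeline(stages: Sequence[Stage], payload: Any) -> Optional[Any]:
    """Pass the payload through each stage; stop early if a stage returns None."""
    for stage in stages:
        payload = stage(payload)
        if payload is None:
            return None
    return payload


stages = [transcribe, generate_reply, synthesize]
spoken = run_pipeline(stages, b"hello")
```

Returning `None` from a stage models the case where no response is needed, so later stages are skipped and nothing is spoken.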
