XPerto - Meeting Voice Assistant

Description

XPerto is a voice assistant designed to assist during multi-person conversations, particularly in scenarios like meetings or interviews. It is being developed as part of a research project at the Hochschule der Medien Stuttgart that studies the acceptance of AI assistants in professional settings.

The main features include:

Voice Interaction: Allows users to interact with the assistant using natural language.
Multi-User Turn Detection: Automatically detects from the conversation context whether a response from the assistant is needed, so that multiple participants can talk to each other as well as to the assistant.
Contextual Awareness: Maintains context of the conversation to provide relevant responses.
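
To illustrate how turn detection and contextual awareness could work together, here is a minimal sketch in plain Python. It is not XPerto's actual implementation: the wake name `ASSISTANT_NAME`, the `Utterance` and `ConversationContext` types, and the two rules in `needs_response` are all assumptions made for this example, and a real system would likely use a language model rather than keyword rules.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional

# Assumed wake name; the real project may detect addressing differently.
ASSISTANT_NAME = "xperto"


@dataclass
class Utterance:
    speaker: str
    text: str


@dataclass
class ConversationContext:
    """Keeps a sliding window of recent utterances (hypothetical helper)."""
    max_turns: int = 20
    history: deque = field(default_factory=deque)

    def add(self, utterance: Utterance) -> None:
        self.history.append(utterance)
        # Drop the oldest turns once the window is full.
        while len(self.history) > self.max_turns:
            self.history.popleft()

    def last_speaker(self) -> Optional[str]:
        return self.history[-1].speaker if self.history else None


def needs_response(utterance: Utterance, context: ConversationContext) -> bool:
    """Decide whether the assistant should answer this utterance."""
    text = utterance.text.lower()
    # Rule 1: the assistant is addressed by name.
    if ASSISTANT_NAME in text:
        return True
    # Rule 2: a question that directly follows the assistant's own turn
    # is treated as a follow-up aimed at the assistant.
    if context.last_speaker() == "assistant" and text.rstrip().endswith("?"):
        return True
    # Otherwise assume the participants are talking to each other.
    return False
```

With these rules, a remark between two participants is ignored, while "XPerto, what did we decide last week?" or a question asked right after the assistant has spoken triggers a response.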

Technologies

Pydantic: Used for settings management.
Pipecat: The main framework used for building the Voice Assistant.
Deepgram: Speech-to-Text service used for transcribing audio input.
GPT-4o: Language model used for generating responses.
ElevenLabs: Text-to-Speech service used for generating spoken responses.
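
The services listed above form a speech-to-text, language model, text-to-speech chain. The sketch below shows only that data flow, using stand-in functions instead of the real services. None of the names (`transcribe`, `generate_reply`, `synthesize`, `handle_turn`) come from Pipecat, Deepgram, OpenAI, or ElevenLabs; they are hypothetical placeholders, and the stubs simply pass text through.

```python
from typing import List


def transcribe(audio: bytes) -> str:
    # Stand-in for the speech-to-text step: pretend the audio bytes are UTF-8 text.
    return audio.decode("utf-8")


def generate_reply(transcript: str, history: List[str]) -> str:
    # Stand-in for the language model step: echo the transcript.
    # The history argument is unused here; a real model would receive it as context.
    return f"You said: {transcript}"


def synthesize(text: str) -> bytes:
    # Stand-in for the text-to-speech step: encode the reply text as bytes.
    return text.encode("utf-8")


def handle_turn(audio: bytes, history: List[str]) -> bytes:
    """Run one turn through the three stages and record it in the history."""
    transcript = transcribe(audio)
    reply = generate_reply(transcript, history)
    history.extend([transcript, reply])
    return synthesize(reply)
```

In the actual project these stages are wired together by Pipecat, with each stub replaced by the corresponding service.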

Author

Jason Schühlein
