This was a weekend project over President's Day weekend '24, inspired by another one of my projects. I started with a very different approach, and along the way I tried multiple Voice APIs and text-to-speech providers.
First, ensure that all of the required dependencies (FastAPI, ngrok, etc.) are installed:
pip install -r requirements.txt
You can run the API by calling api.py:
python3 api.py
I have used ngrok in this project, as it let me quickly establish a tunnel from my development environment to the public internet (the telephony provider's webhooks need a reachable URL). It is not required, but removing it will take some editing.
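For reference, here is a minimal sketch of opening a tunnel programmatically, assuming the pyngrok package; the port number and exact wiring are assumptions, not necessarily how this project does it.

import os
from pyngrok import ngrok

# Authenticate with the token from the .env file
ngrok.set_auth_token(os.environ["NGROK_AUTHTOKEN"])

# Open an HTTP tunnel to the local FastAPI server (port 8000 is assumed here)
tunnel = ngrok.connect(8000, "http")
print("Public URL for telephony webhooks:", tunnel.public_url)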
I have provided an empty environment file template below. Copy and paste this into .env, and begin pasting in your API keys.
NGROK_AUTHTOKEN=""
# telephony
VONAGE_API_KEY=""
VONAGE_API_SECRET=""
VONAGE_APPLICATION_ID=""
VONAGE_APPLICATION_NAME=""
VONAGE_SIGNATURE_SECRET=""
VONAGE_JWT=""
# speech to text
DEEPGRAM_API_KEY=""
# gpt to generate a response
OPENAI_API_KEY=""
# text to speech
PLAYHT_USER_ID=""
PLAYHT_API_KEY=""
ELEVENLABS_VOICE_ID=""
ELEVENLABS_API_KEY=""
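These values can then be loaded at application startup. A minimal sketch, assuming the python-dotenv package (the project may load its settings differently):

import os
from dotenv import load_dotenv

# Read key=value pairs from .env into the process environment
load_dotenv()

# Individual keys are then available to the modules that need them
deepgram_key = os.environ["DEEPGRAM_API_KEY"]
openai_key = os.environ["OPENAI_API_KEY"]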
This project has been set up with three main components: api.py, where our FastAPI instance lives and which is also the entrypoint to this program; the project settings, where all project-related settings go; and the processing code, where all files related to processing go. The processing code contains four main modules:
generate_response
- This is where we generate the response to send to the client.
speechtotext
- This is where we convert speech to text.
texttospeech
- This is where we convert text to speech.
telephony
- This provides a class that allows us to change telephony providers.
All of the processing modules are object oriented. In each module, I have added an abstract.py where I define the base class. Each file inside a module implements the abstract class, and the implementation is finally imported in api.py. To swap modules, look at APISettings in api.py.
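As a rough illustration of the pattern (the names AbstractTextToSpeech, synthesize, and ElevenLabsTextToSpeech below are hypothetical, not the project's actual identifiers):

from abc import ABC, abstractmethod

# abstract.py: the base class that every provider in the module implements
class AbstractTextToSpeech(ABC):
    @abstractmethod
    async def synthesize(self, text: str) -> bytes:
        """Convert text into raw audio bytes."""

# eleven_labs.py: one concrete provider behind the same interface
class ElevenLabsTextToSpeech(AbstractTextToSpeech):
    async def synthesize(self, text: str) -> bytes:
        # Call the ElevenLabs API here; stubbed for illustration
        return b""

# api.py imports whichever implementation APISettings points at,
# so swapping providers only touches that one setting.

Because every provider satisfies the same abstract interface, the rest of the pipeline never has to know which vendor sits behind it.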
This was a fun project; however, I was not thrilled with some of the things I ran into. First, I started with the Telnyx Voice API. The Telnyx documentation was great, but I could not get high-quality audio without establishing an RTC connection. I also tried Twilio and found the same issue. Vonage, on the other hand, provided noticeably better audio over WebSockets, so that was nice to see.
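For context, Vonage streams call audio over a websocket when the call's answer webhook returns an NCCO with a connect action. A rough sketch with FastAPI (the route and tunnel URL here are assumptions, not this project's exact wiring):

from fastapi import FastAPI

app = FastAPI()

@app.get("/webhooks/answer")
async def answer():
    # NCCO telling Vonage to stream the call's audio to our websocket
    return [
        {
            "action": "connect",
            "endpoint": [
                {
                    "type": "websocket",
                    "uri": "wss://example.ngrok.io/socket",  # the ngrok tunnel URL
                    "content-type": "audio/l16;rate=16000",
                }
            ],
        }
    ]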
If I had more time, I would implement more of the pipeline on a local machine with a GPU, which would cut down on latency. For example, if I ran openai/whisper locally instead of calling Deepgram, I have a feeling I could cut down on the speech-processing time.
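A minimal sketch of that local path, assuming the openai-whisper package (the model size and file name are placeholders):

import whisper

# Load a model; on a machine with a GPU this runs on CUDA automatically
model = whisper.load_model("base")

# Transcribe a recorded utterance; result["text"] holds the transcript
result = model.transcribe("caller_audio.wav")
print(result["text"])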
Finally, context seems to be important to streaming text-to-speech, so right now we have to wait for the chat completion to finish before generating audio. I think I would change this to feed the text-to-speech service a sliding window of context, but I am not sure how I would do that yet.
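Purely as a sketch of the idea (nothing like this is implemented here): one option is to buffer the streamed completion and flush it to the text-to-speech service one sentence at a time, so synthesis can begin before the full completion arrives.

import re

def sentence_chunks(token_stream):
    """Yield complete sentences from a stream of completion tokens."""
    buffer = ""
    for token in token_stream:
        buffer += token
        # Split on sentence-ending punctuation followed by whitespace
        parts = re.split(r"(?<=[.!?])\s+", buffer)
        # Everything before the last piece is a finished sentence
        for sentence in parts[:-1]:
            yield sentence
        buffer = parts[-1]
    if buffer.strip():
        yield buffer  # flush whatever remains at the end

# Hypothetical usage: feed each sentence to the TTS provider as it completes
# for sentence in sentence_chunks(stream_of_tokens):
#     audio = tts.synthesize(sentence)

This gives each TTS request a little local context without waiting on the whole completion, though it is a cruder cut than a true sliding window.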
Also, as seen in the texttospeech/open_ai.py file, I was not able to get the Opus bindings to work for Python, so I did not finish implementing OpenAI text-to-speech (I could probably get it working, but it's Monday and time for schoolwork).