Text to speech with AI (ChatGPT) APIs?

Ryan_Hartz · June 6, 2024, 5:29pm

Has anyone dabbled in TTS with AI models? I see ChatGPT has an API, but I have no idea how to convert the code to Xojo, or whether this is even possible yet. I tried using ChatGPT to convert the Python code to Xojo, but it didn’t quite work out. Not surprising.

https://platform.openai.com/docs/guides/text-to-speech

Andrew_Lambert · June 6, 2024, 6:37pm

Check out my open source wrapper for OpenAI:

Edwin_van_den_Akker · June 7, 2024, 11:38am

I actually made an app that helps me with my video production work.
The app enhances my scripts. It can even use OpenAI’s TTS to generate “dummy” or “guide” voice-over audio, which is later re-recorded by a real voice-over talent.

To test the capabilities, I use RapidAPI (formerly known as Paw).
I set certain reusable variables in the environment variables list. These hold my API keys, base URL and various URL endpoints.
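As a rough sketch of the same idea in code (not Edwin's actual setup), the request to OpenAI's speech endpoint can be assembled from environment variables. The variable names `OPENAI_BASE_URL` and `OPENAI_API_KEY` and the helper names below are assumptions made for illustration. The endpoint path, the `tts-1` model and the `alloy` voice come from OpenAI's text-to-speech guide.

```python
import json
import os
import urllib.request

# Assumed environment variable name; not taken from the original post.
BASE_URL = os.environ.get("OPENAI_BASE_URL", "https://api.openai.com/v1")
SPEECH_ENDPOINT = "/audio/speech"


def build_speech_request(text, api_key, voice="alloy", model="tts-1"):
    """Build (but do not send) a POST request for the speech endpoint."""
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        BASE_URL + SPEECH_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(text, out_path="speech.mp3"):
    """Send the request and save the returned audio bytes (MP3 by default)."""
    request = build_speech_request(text, os.environ["OPENAI_API_KEY"])
    with urllib.request.urlopen(request) as response, open(out_path, "wb") as f:
        f.write(response.read())
```

Keeping the key and base URL outside the code means the same request can be replayed in an API client first and then ported to Xojo once it works.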

Video:

Fun fact… the voice-overs in this video are straight from OpenAI’s TTS generator.

And definitely watch Geoff’s video on YouTube:
https://youtu.be/fEkFrspP9oM?si=xvaL13TnbQ6RZR24&t=2839

Ryan_Hartz · June 10, 2024, 7:16pm

This is amazing! Thank you, Andrew, for your work on this! I’ve been playing around with it, and it works very nicely. I see there is a character limit of 4096, which is an OpenAI limitation for TTS and not specific to your project. I do have some text that exceeds this threshold, but I think I can figure out a way around it. If anyone has any suggestions, I’d love to hear them.

Thanks again!

Edwin_van_den_Akker · June 11, 2024, 7:29am

You could split the text into smaller chunks of ≤ 4096 characters.
For example, split it into an array of lines, then build each chunk by adding lines until the chunk size exceeds the limit. Then simply pass the chunk without that last line, and start the next chunk with it.
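A minimal sketch of that line-based chunking, written in Python for brevity (the logic should port to Xojo more or less line for line). The function name `chunk_text` is hypothetical, and the hard cut for a single line longer than the limit is an added assumption that the suggestion above does not cover.

```python
def chunk_text(text, limit=4096):
    """Split text into chunks of at most `limit` characters, breaking at line ends."""
    chunks = []
    current = ""
    for line in text.splitlines(keepends=True):
        # Assumption: a single line longer than the limit is cut hard.
        while len(line) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(line[:limit])
            line = line[limit:]
        # Adding this line would exceed the limit: emit the chunk without it.
        if len(current) + len(line) > limit:
            chunks.append(current)
            current = ""
        current += line
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be sent to the TTS endpoint as its own request and the resulting audio files played or joined in order. Splitting on sentence boundaries instead of line ends may sound more natural where a line break falls mid-thought.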

Ryan_Hartz · June 12, 2024, 1:35am

Yes, that is the way to go. Thanks!