this post was submitted on 27 Aug 2025

I’m a firefighter who’s also a software engineer, and I’m working on a training app. It generates text dispatches for various scenes for us to discuss on slow, rainy training days, such as “Respond to 123 Main St, for a report of a smell of smoke.”
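For anyone curious what that dispatch generation might look like, here is a minimal TypeScript sketch. Everything in it is an assumption for illustration: the names `INCIDENT_TYPES`, `pick`, and `buildDispatch` are hypothetical, the incident list is made up apart from the "smell of smoke" example, and the address is assumed to come from elsewhere (the Maps lookup in my case).

```typescript
// Hypothetical sketch: none of these names come from the real app.
type Rng = () => number; // returns a float in [0, 1), like Math.random

// Example incident phrasings; only the first one is from the post itself.
const INCIDENT_TYPES: readonly string[] = [
  "a report of a smell of smoke",
  "an automatic fire alarm",
  "a motor vehicle accident with injuries",
];

// Pick one element using the supplied random source.
function pick<T>(items: readonly T[], rng: Rng): T {
  const index = Math.min(items.length - 1, Math.floor(rng() * items.length));
  return items[index];
}

// Build the dispatch sentence for an address generated elsewhere.
function buildDispatch(address: string, rng: Rng = Math.random): string {
  return `Respond to ${address}, for ${pick(INCIDENT_TYPES, rng)}.`;
}
```

Passing the random source in as a parameter keeps the function easy to test and lets a training session be replayed with a fixed seed if that ever becomes useful.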

I already use Google Maps to generate a random address and show the map / street view. With the Maps API I can domain-lock the key…

But with their Text-to-Speech API I cannot. Seems silly, but I get it.

Are there any alternatives? I would be OK spinning up a middle server that also reworks the audio to add radio static, etc., but as a first pass I am looking for non-robotic (i.e., not browser-based) TTS.
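On the radio static idea, a rough sketch of the simplest version: mixing white noise into decoded audio samples. This assumes the TTS output has already been decoded to mono floating-point PCM in the range [-1, 1]; the function name `addRadioStatic` and the default `noiseLevel` are hypothetical. A convincing radio sound would probably also need band-limiting and some distortion, which this does not attempt.

```typescript
// Hypothetical sketch: mix white noise into mono PCM samples in [-1, 1].
// The input buffer is left untouched; a new buffer is returned.
function addRadioStatic(
  samples: Float32Array,
  noiseLevel = 0.05, // peak noise amplitude, an arbitrary starting value
  rng: () => number = Math.random,
): Float32Array {
  const out = new Float32Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    // Uniform white noise in [-noiseLevel, noiseLevel)
    const noise = (rng() * 2 - 1) * noiseLevel;
    // Clamp so the mixed signal stays within the valid sample range
    out[i] = Math.max(-1, Math.min(1, samples[i] + noise));
  }
  return out;
}
```

Because it only touches a sample buffer, the same function could run on a middle server or in the browser.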

Thoughts?

[–] mesamunefire@piefed.social 5 points 4 months ago* (last edited 4 months ago) (2 children)

Sometime last year, I was trying to get an open source TTS to generate audiobooks from books that will most likely never get a human narrator. I ended up using Piper for it. It was "good enough" for car rides and such. Bark felt kind of like cheating / uncanny valley and would do strange things to the voice after a while.

Hope that helps!

[–] flandish@lemmy.world 1 points 4 months ago (1 children)

If I read this correctly, Piper has WebAssembly bindings too? I am running the training app in a browser on a Vue 3 + TS stack. So I could spin up a worker on Cloudflare, lock the API key, and use that, or run it all browser-side with WASM.

Time to tinker!
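For the worker route, one way to approximate the domain lock is to check the request's `Origin` header before doing any TTS work. A minimal sketch, assuming a hypothetical helper `isAllowedOrigin` and a placeholder hostname `training.example`; it uses only the standard `URL` class. Note that non-browser clients can send any `Origin` they like, so this deters casual reuse of the endpoint rather than securing it.

```typescript
// Hypothetical helper: returns true only when the Origin header parses as a
// URL whose hostname is in the allow-list.
function isAllowedOrigin(
  originHeader: string | null,
  allowedHosts: readonly string[],
): boolean {
  if (originHeader === null) return false;
  try {
    const { hostname } = new URL(originHeader);
    return allowedHosts.includes(hostname);
  } catch {
    // new URL throws on malformed input; treat that as not allowed
    return false;
  }
}

// Placeholder hostname, not the real app's domain.
const ALLOWED_HOSTS = ["training.example"];
```

Inside a worker's request handler this would be called with `request.headers.get("Origin")`, returning a 403 response when it comes back false.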

[–] mesamunefire@piefed.social 3 points 4 months ago* (last edited 4 months ago) (1 children)

Give it a shot. It's been a full year since I've messed with this, so let me know how it goes. There might be some better stuff out now. But yeah, Piper and eSpeak both have fantastic performance. You can even put it on your phone should the need arise: https://f-droid.org/packages/org.woheller69.ttsengine/

[–] flandish@lemmy.world 1 points 4 months ago

Cool! I know CF Workers AI can run a TTS model and I can domain-lock the API key… but this seems cooler. :)

[–] flandish@lemmy.world 1 points 4 months ago

sweet. thanks!