I use the HA Voice Preview in two different rooms and got rid of my Alexa Dots. For STT I've been trying both speech-to-phrase and whisper (medium.en, running on the GPU), and for the LLM I've tried llama3.2 and granite4 with local command handling.
I've been trying to get it working better, but it's been a struggle. The wake word responds to my voice but not to my girlfriend's. When I try to set a timer, it says "done", but the timer never triggers.
I'd love to improve my assistant's performance, and I want to know what setups work well for others. I've been experimenting with an intermediary STT proxy that sends the same audio to both whisper and speech-to-phrase to see which one returns the more confident result.
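To make the proxy idea concrete, here is a rough sketch of the selection step only. Everything in it is hypothetical: `SttResult`, `transcribe_best`, and the two fake backends are names made up for illustration, not part of Home Assistant, whisper, or speech-to-phrase. It also assumes each backend wrapper can produce a confidence normalized to 0.0-1.0, which the two engines may not report in a directly comparable way.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class SttResult:
    engine: str
    text: str
    confidence: float  # assumption: each wrapper normalizes this to 0.0-1.0


# A backend is any callable that takes raw audio bytes and returns an SttResult.
SttBackend = Callable[[bytes], SttResult]


def transcribe_best(audio: bytes, backends: Sequence[SttBackend]) -> SttResult:
    """Run every backend on the same audio in parallel; keep the most confident result."""
    with ThreadPoolExecutor(max_workers=len(backends)) as pool:
        results = list(pool.map(lambda backend: backend(audio), backends))
    return max(results, key=lambda r: r.confidence)


# Stand-in backends with hard-coded output, for illustration only.
# Real wrappers would call the actual whisper and speech-to-phrase services.
def fake_whisper(audio: bytes) -> SttResult:
    return SttResult("whisper", "set a timer for five minutes", 0.72)


def fake_speech_to_phrase(audio: bytes) -> SttResult:
    return SttResult("speech-to-phrase", "set a timer for 5 minutes", 0.91)


if __name__ == "__main__":
    best = transcribe_best(b"", [fake_whisper, fake_speech_to_phrase])
    print(best.engine, "->", best.text)
```

Running both engines in parallel means the added latency is roughly that of the slower engine rather than the sum of the two. The weak point is the comparison itself: if the two scores are not on the same scale, a simple `max` will favor whichever engine reports higher numbers in general, so some per-engine calibration or a fixed preference order might work better.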
What about throughput, latency, schema modeling, query load balancing/routing, confidentiality, regulatory compliance, operational tooling? How easily can I write a CRUD or line of business service using it?