Honestly, a lot of newer TTS is worse than the 80s/90s stuff like DECtalk or PlainTalk (/MacinTalk). Both of those, while not exactly human-sounding, actually sounded better (at least in a sort of aesthetic way). For example, Microsoft Sam (and whatever espeak's default voice is) is such a downgrade IMO.
I'm not sure how heavy Piper models are (in data size or to run), but I'm sure TTS could be better without neural anything.