I first noticed it with Anton Petrov's daily science content. At first I thought he had a cold or something. He has gotten sick many times since he started uploading around 2017, but this has been persistent and has gotten worse over time. I figured maybe it was long COVID or something. Over the past week it has varied markedly in intensity, becoming more annoyingly abrasive in sections of informative value and easing elsewhere to the point where his original voice came through almost untouched. I know how he produces and edits his content, and the patterns in the videos do not follow a natural distribution of segments he has spliced together.
If this were only one channel, I might write the issue off. It is suspicious specifically because Anton uploads daily, commercial-free science content of excellent quality and reputation about recent research in white papers. He cites valid sources while making simple connections to other recent research.
Now the same throaty audio pattern is showing up with Asianometry on YouTube.
The pattern is no longer even within the range of plausible human voice issues. There is a distinct, periodic, frequency-modulation-like sound. I strongly believe this is a new technique they are using to obfuscate AI training. It started around the time of, or before, the potentially seizure-inducing flashing distortion that Google added to Anton's content to obfuscate AI training. This type of audio distortion is far more difficult to detect conclusively than visual effects.
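For anyone who would rather measure this than trust my ears, here is a minimal sketch of one way to look for a periodic modulation in a clip. It is not a method anyone is confirmed to use, and it proves nothing about the cause. It assumes you already have mono audio as a NumPy array of floats; the function name `modulation_peak`, the 2 to 50 Hz search band, and the 5 ms frame size are all placeholders I picked for illustration. It also only looks at periodic changes in loudness (the amplitude envelope), which is a simpler stand-in for true frequency modulation.

```python
import numpy as np

def modulation_peak(samples, sample_rate, lo_hz=2.0, hi_hz=50.0, frame_ms=5.0):
    """Return (peak_hz, peak_to_median) for the amplitude-envelope spectrum.

    A large peak_to_median ratio means the loudness wobbles at a steady rate.
    All defaults are illustrative assumptions, not calibrated thresholds.
    """
    samples = np.asarray(samples, dtype=float)
    frame = max(1, int(sample_rate * frame_ms / 1000.0))
    n_frames = len(samples) // frame

    # RMS over short frames gives a coarse amplitude envelope.
    frames = samples[: n_frames * frame].reshape(n_frames, frame)
    envelope = np.sqrt(np.mean(frames ** 2, axis=1))
    envelope = envelope - envelope.mean()  # remove the constant (DC) part

    # Spectrum of the envelope: peaks here are modulation rates, not pitches.
    env_rate = sample_rate / frame
    spectrum = np.abs(np.fft.rfft(envelope * np.hanning(n_frames)))
    freqs = np.fft.rfftfreq(n_frames, d=1.0 / env_rate)

    band = (freqs >= lo_hz) & (freqs <= hi_hz)
    band_spec = spectrum[band]
    peak_idx = int(np.argmax(band_spec))
    ratio = band_spec[peak_idx] / (np.median(band_spec) + 1e-12)
    return float(freqs[band][peak_idx]), float(ratio)

# Synthetic check: white noise with and without a 30 Hz loudness wobble.
rng = np.random.default_rng(0)
sample_rate = 16000
t = np.arange(2 * sample_rate) / sample_rate
noise = rng.standard_normal(t.size)
wobbled = noise * (1.0 + 0.5 * np.sin(2 * np.pi * 30.0 * t))

print("clean  :", modulation_peak(noise, sample_rate))
print("wobbled:", modulation_peak(wobbled, sample_rate))
```

On real speech the ratio will never be flat, because syllables have their own rhythm at a few Hz. The only meaningful comparison would be between an older upload and a recent one from the same channel, and even then a difference could come from a microphone, codec, or editing change.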
The scope of people this is applied to could be narrow. Specifically, I have messed around with training open-weights models, and I exclusively watch reputable science edutainment content with no appeals to emotion or spurious opinionated slant. Everyone I watch regularly has a master's or PhD, except in the 3D printing space. If adversarial targeting of individuals for potential open-weights AI training is possible, I might be in that pool. I have never used programmatic scraping tools or agentic online APIs, but I have searched for and downloaded content in outlier patterns. Odds are it is not targeted, and most people are seeing (hearing) this too.
I played with using and training speech-to-text and text-to-speech models around a year ago, and this is exactly the kind of pattern that would make training data garbage. It also makes the content garbage for humans IMO... I wrote this instead of consuming any such content... welcome to the dark age...