cross-posted from: https://lemmy.world/post/28011368
So I started by doing research, and by research I mean watching two videos on YouTube about basic recommendation algorithms.
I also watched a 30-minute video of a Netflix software engineer talking about using machine learning, complex matrix methods, and bandit-style machine learning algorithms to recommend TV shows and movies. The basic conclusion, as I understood it, is that all these complex techniques give about a 50% improvement over their baseline measurement. Baseline here means the traditional, pre-neural-network algorithms.
The way I interpret it, the basics take you a long way, and all the basics amount to is turning each PeerTube video into a vector and each watcher into a vector as well. The idea is that videos whose vectors are similar to each other make good recommendations: if a watcher watched one of them, recommend the others, and if they didn't like it, don't recommend anything similar. Once the videos are vectorized, the watcher's vector can be updated in a basic way: more watch time means the video is more of what they want, a like gives it a boost, and a comment could act as a boost multiplier.
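Here's a rough sketch of how that could look in Python, just to make the idea concrete. Everything in it is an assumption on my part: the three dimensions, the video names, the choice of cosine similarity, and especially the constants (0.5 for a like, 1.5 for a comment) are made up and would need tuning.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def engagement_weight(watch_fraction, liked=False, disliked=False, commented=False):
    """Turn viewing signals into one signed weight. All constants are made up."""
    weight = watch_fraction          # more watch time, more weight
    if liked:
        weight += 0.5                # a like gives a boost
    if commented:
        weight *= 1.5                # a comment acts as a multiplier
    if disliked:
        weight = -abs(weight)        # push the profile away from this video
    return weight

def update_watcher(watcher, video, weight):
    """Nudge the watcher's vector toward (or away from) a video's vector."""
    return [w + weight * v for w, v in zip(watcher, video)]

def recommend(watcher, videos, top_n=3):
    """Rank video ids by similarity to the watcher's vector, best first."""
    ranked = sorted(videos, key=lambda vid: cosine(watcher, videos[vid]), reverse=True)
    return ranked[:top_n]

# Hypothetical dimensions: [cooking, sports, livestream]
videos = {
    "pasta-tutorial": [1.0, 0.0, 0.0],
    "match-commentary": [0.0, 1.0, 0.0],
    "cooking-stream-vod": [0.8, 0.0, 0.6],
}
watcher = [0.0, 0.0, 0.0]
watcher = update_watcher(watcher, videos["pasta-tutorial"], engagement_weight(0.9, liked=True))
print(recommend(watcher, videos))
```

After watching and liking the pasta tutorial, the cooking stream ranks above the sports commentary. A real version would also skip videos the watcher has already seen, which this sketch doesn't do.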
I'd say the watcher's vector can be stored locally while the videos' vectors are public. It will take a while to figure out a function/algorithm that adapts to the watcher. Does the watcher's taste change? Do they like multiple things? Should the algorithm adapt fast or slow as new videos come in? How should it balance novelty against consistency? I don't expect this problem to be solved anytime soon, but I expect recommendation algorithms to simply evolve and split, each with its own unique benefits and drawbacks.
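To show what "adapt fast or slow" could mean in practice, here's a small sketch using an exponential moving average. That's one common way to do it, not necessarily the right one here. The `rate` parameter, the two dimensions, and the numbers are all assumptions for illustration.

```python
def blend(profile, video, rate):
    """Exponential moving average: move the profile a fraction `rate` toward the video."""
    return [(1 - rate) * p + rate * v for p, v in zip(profile, video)]

# Hypothetical dimensions: [cooking, sports]
cooking_fan = [1.0, 0.0]
sports_video = [0.0, 1.0]

fast = cooking_fan
slow = cooking_fan
for _ in range(3):                       # the watcher views three sports videos in a row
    fast = blend(fast, sports_video, rate=0.5)
    slow = blend(slow, sports_video, rate=0.05)

print(fast)  # cooking interest has mostly faded
print(slow)  # cooking interest is still dominant
```

A high rate chases whatever was watched last (more novelty), while a low rate keeps the long-term taste (more consistency). Different algorithms could simply pick different rates, which is one way they might split.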
The foundation to start with is a standard for the video vector. A video can be described by quantities and by qualities. There are only a few measurable quantities, like video length and existing views. Qualitative attributes like "is it a cooking tutorial", "is it a sports commentary", or "is it a livestream VOD" are going to require that the vector be stored in a format that can adapt to the expanding number of dimensions a PeerTube video's qualities can have. The next issue is turning qualities into actual numbers: if something is sports-related or sports-adjacent, would a 1 mean yes, and would a 0 mean neutral/agnostic or no?
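One possible shape for such a format is to store the qualities by name instead of by position, so new dimensions can be added without breaking older vectors. This is only a sketch under my own assumptions: the field names, the tag names, and the -1 / 0 / 1 scale are invented here, and the scale is just one possible answer to the question above.

```python
import json

# Hypothetical record for one video. Field and tag names are invented for illustration.
video_record = {
    "quantities": {"length_seconds": 754, "views": 1200},
    # Qualities use -1.0 (clearly not) .. 0.0 (neutral/unknown) .. 1.0 (clearly yes).
    "qualities": {"cooking_tutorial": 1.0, "sports": -1.0, "livestream_vod": 0.0},
}

def to_dense(qualities, dimensions):
    """Lay out sparse qualities in a fixed order; missing dimensions count as neutral (0.0)."""
    return [qualities.get(name, 0.0) for name in dimensions]

# An older client only knows two dimensions; a newer one knows a third this video never set.
old_dims = ["cooking_tutorial", "sports"]
new_dims = ["cooking_tutorial", "sports", "music"]

print(to_dense(video_record["qualities"], old_dims))  # [1.0, -1.0]
print(to_dense(video_record["qualities"], new_dims))  # [1.0, -1.0, 0.0]

# The record round-trips through JSON, so it could be published alongside the video.
assert json.loads(json.dumps(video_record)) == video_record
```

Treating a missing quality as 0 means "nobody has said anything about this", which keeps old videos usable when someone adds a new dimension later.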
The last and simplest issue would be distributing the algorithm that updates the watcher's vector, since that can be done via updates from the PeerTube server or GitHub.
I'm not technical enough to know what all the details mean, but I'm excited about the idea of a simple algorithm for PeerTube.