As I'm sure y'all have noticed, the server has been having stability issues over the past week. Every once in a while, the server seems to "crash": it times out on all network requests for about 5 minutes, and this can happen several times within an hour. Additionally, there have been several times over the past week when the server queues got backed up, resulting in incoming and outgoing federation traffic being delayed.
At first, I assumed the server hardware had simply become insufficient to handle all of the network and federation traffic we have been receiving, potentially due to a DDoS attack. I was honestly starting to think I would need to upgrade the server hardware, which is ridiculous, as the VPS is already quite powerful and is certainly expensive enough.
Luckily, @green_copper helped with debugging and noticed the server timeouts only occur when the microblog section is viewed (and sometimes when the combined section is viewed). So rather than a lack of hardware being the issue, I believe it actually has to do with the recent Mbin upgrade. My guess is that an SQL query was changed in one of the commits between Mbin v1.9.1 and v1.10.0, which resulted in a wildly inefficient query that causes the server to freeze up and time out on requests.
I have been investigating the issue and will keep at it this week. I will start by analyzing which database queries are causing the timeouts, but if needed, I will roll back to Mbin v1.9.1 and try to figure out which specific Mbin commit introduced the bug. Additionally, I may restrict access to signed-in users while I'm working on this.
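For anyone curious how hunting down a single bad commit between two releases can work in general, here is a minimal sketch of a bisection search in Python. It is only an illustration, not necessarily how the search will actually be done here: the commit names are made up, and `is_bad` is a hypothetical stand-in for "deploy this commit and check whether the microblog view times out". It assumes the commits are ordered oldest to newest, the first is known good, the last is known bad, and every commit after the offending one is also bad.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is True.

    Assumes commits[0] is known good, commits[-1] is known bad, and
    every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known-good index, hi: known-bad index
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the bug was introduced at mid or earlier
        else:
            lo = mid  # the bug was introduced after mid
    return commits[hi]


# Hypothetical history between the two releases; "c3" plays the culprit.
commits = ["v1.9.1", "c1", "c2", "c3", "c4", "c5", "v1.10.0"]
culprit = "c3"


def is_bad(commit: str) -> bool:
    # Stand-in for a real check, such as timing the microblog page.
    return commits.index(commit) >= commits.index(culprit)


found = first_bad_commit(commits, is_bad)
print(found)
```

Each round halves the remaining range, so even a long list of commits only needs a handful of checks.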
In the meantime, if you guys can refrain from viewing microblogs (or the combined view), that would keep the server accessible for everyone and help me work on it faster, as I can't do anything while the server is frozen. If you want real-time updates on the server issue (or anything else related to kbin.earth), feel free to join the Matrix room.
Thanks for sticking around and for your patience!
Absolutely. We do appreciate you.