RoundSparrow

joined 3 years ago

For lemmy server testing and performance baseline measurement, I think it would be cool to have every API call exercised.

Anyone willing to create and share some JavaScript client code? Normally these are run with Jest via NodeJS - but we can make it an extra step to integrate into Jest. I'm just thinking someone doing front-end app work can do a well organized hit on every API surface.

You can skip the creation of a user if you want; that code is already in the existing tests.

Probably ideal to organize moderator vs non-moderator.

Something like: edit profile with every option one at a time, create a community, edit it, create posts, edit, delete, undelete, reply, etc. Imagine you were doing interactive tests of a major upgrade and wanted to hit every feature and button.
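A rough sketch of how such a walk-through could be organized, split by moderator vs non-moderator. Every name here (`ApiStep`, `runPlan`, `examplePlan`) is hypothetical and nothing is taken from lemmy-js-client; the `run` callbacks are placeholders where real API calls would go, and the role assigned to each step is my assumption.

```typescript
// Hypothetical sketch: these types and names are not part of any Lemmy library.
type Role = "moderator" | "non-moderator";

interface ApiStep {
  name: string;              // human-readable label for the action
  role: Role;                // which kind of account performs it
  run: () => Promise<void>;  // the real API call, supplied by the test author
}

interface StepResult {
  name: string;
  role: Role;
  ok: boolean;
  millis: number;
}

// Group step names by role so missing coverage is easy to spot.
function groupByRole(steps: ApiStep[]): Record<Role, string[]> {
  const grouped: Record<Role, string[]> = { "moderator": [], "non-moderator": [] };
  for (const step of steps) {
    grouped[step.role].push(step.name);
  }
  return grouped;
}

// Run the steps in order, recording success and elapsed wall-clock time.
async function runPlan(steps: ApiStep[]): Promise<StepResult[]> {
  const results: StepResult[] = [];
  for (const step of steps) {
    const started = Date.now();
    let ok = true;
    try {
      await step.run();
    } catch (_err) {
      ok = false; // keep going so one failure does not hide the rest
    }
    results.push({ name: step.name, role: step.role, ok, millis: Date.now() - started });
  }
  return results;
}

// Placeholder that does nothing; swap in real client calls.
const todo = async (): Promise<void> => {};

const examplePlan: ApiStep[] = [
  { name: "edit profile", role: "non-moderator", run: todo },
  { name: "create community", role: "moderator", run: todo },
  { name: "edit community", role: "moderator", run: todo },
  { name: "create post", role: "non-moderator", run: todo },
  { name: "edit post", role: "non-moderator", run: todo },
  { name: "delete post", role: "non-moderator", run: todo },
  { name: "undelete post", role: "non-moderator", run: todo },
  { name: "reply to post", role: "non-moderator", run: todo },
];
```

Inside a Jest test, `await runPlan(examplePlan)` would give one timing per step, and `groupByRole(examplePlan)` shows at a glance which role each action is filed under.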

Right now most of that testing is done in independent scripts, such as a user test: https://github.com/LemmyNet/lemmy/blob/main/api_tests/src/user.spec.ts

And you can see it only tests that editing a profile works, not that the edited settings actually change behavior. I've started to add that for back-end behaviors, like showing read/unread posts on the list. Front-end devs are the ones who know what end-users do and the fringe cases to look out for. Thank you.

 

I thought some people were out there in June creating stress-testing scripts, but I haven't seen anything materializing/showing results in recent weeks?

I think it would be useful to have an API client that establishes a baseline performance number, one that can be run before each new release of Lemmy to at least ensure there is no performance regression.
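As a sketch of what "no performance regression" could mean in practice: compare timings from the current run against a stored baseline and flag anything slower by more than some tolerance. The `Timing` shape, the function name, and the 20% default are all assumptions of mine, not anything Lemmy ships.

```typescript
// Hypothetical sketch: names and the 20% tolerance are illustrative choices.
interface Timing {
  name: string;   // label of the API call that was measured
  millis: number; // elapsed time for that call
}

// Return the names of calls that got slower than baseline by more than `tolerance`.
function findRegressions(baseline: Timing[], current: Timing[], tolerance: number = 0.2): string[] {
  const baseByName: Record<string, number | undefined> = {};
  for (const t of baseline) {
    baseByName[t.name] = t.millis;
  }
  const regressions: string[] = [];
  for (const t of current) {
    const base = baseByName[t.name];
    if (base === undefined) {
      continue; // call did not exist in the baseline, nothing to compare
    }
    if (t.millis > base * (1 + tolerance)) {
      regressions.push(t.name);
    }
  }
  return regressions;
}
```

A release check could then fail whenever `findRegressions(...)` returns a non-empty list. Single-run timings are noisy, so a real version would probably want several runs and a median.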

The biggest problem I have had since day 1 is not being able to reproduce the data that lemmy.ml has inside. There is a lot of older content stored that does not get replicated, etc.

The site_aggregates UPDATE statement lacking a WHERE clause and hitting 1500 rows (number of known Lemmy instances) of data instead of 1 row is exactly the kind of data-centered problem that has slipped through the cracks. That was generating a ton of extra PostgreSQL I/O for every new comment and post from a local user.
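To make the row counts concrete, here is a tiny in-memory model of that kind of UPDATE. It is not Lemmy's code or SQL; the `siteId` and `comments` fields and the idea that the local site is row 1 are assumptions for illustration only.

```typescript
// Hypothetical in-memory stand-in for a site_aggregates-style table.
interface SiteAggregateRow {
  siteId: number;
  comments: number;
}

// Build `count` rows, one per known instance.
function makeRows(count: number): SiteAggregateRow[] {
  const rows: SiteAggregateRow[] = [];
  for (let i = 1; i <= count; i++) {
    rows.push({ siteId: i, comments: 0 });
  }
  return rows;
}

// Mimics "UPDATE ... SET comments = comments + 1 [WHERE ...]".
// Returns how many rows were written.
function incrementComments(
  rows: SiteAggregateRow[],
  where?: (row: SiteAggregateRow) => boolean
): number {
  let touched = 0;
  for (const row of rows) {
    if (where === undefined || where(row)) {
      row.comments += 1;
      touched += 1;
    }
  }
  return touched;
}

const table = makeRows(1500);
// No predicate: every row is rewritten for one new comment.
const touchedWithoutWhere = incrementComments(table);
// With a predicate on the (assumed) local site row: a single write.
const touchedWithWhere = incrementComments(table, (row) => row.siteId === 1);
```

A test that asserts "rows touched equals 1" after a local comment is the sort of data-centered check that only bites when the table has more than a handful of rows.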

The difficult things to take on:

  1. Simulating 200 instances instead of just 5 that the current API testing code does. First, just to have 200 rows in many of the instance-specific tables so that local = false API calls are better exercised. And probably about 25 of those instances have a large number of remote subscribers to communities.

  2. Async federation testing. The API testing in Lemmy right now does immediate delivery with the API call, so you don't get to find out the tricky cases of servers being unreachable.

  3. Bulk loading of data. On one hand it is good to exercise the API by inserting posts and comments one at a time, but maybe loading data directly into the PostgreSQL backend would speed up development and testing?

  4. The impact of scheduled jobs such as updates to certain aggregate data and post ranking for sorting. We may want to add a special API feature for testing code to trigger these on demand, to stress-test that concurrency with PostgreSQL isn't running into overloads.

  5. Historically, there have been changes to the PostgreSQL table layout and indexes (schema) with new versions of Lemmy, which can take significant time to execute on a production server with existing data. Server operators need some kind of expectation of how long an upgrade can take to modify data.

  6. Searching on communities, posts, comments with significant amounts of data in PostgreSQL. Scanning content of large numbers of posts and comments can be done by users at any time.

  7. Non-Lemmy federated content in the database. Possible performance and code behavior that arises from Mastodon and other non-Lemmy interactions.
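For item 1, a sketch of generating the simulated instance list: 200 entries, with about 25 of them carrying a large number of remote subscribers. The domain pattern, field names, and subscriber counts are made up for illustration; how the rows would actually get into PostgreSQL (API calls vs direct bulk load, item 3) is left open.

```typescript
// Hypothetical sketch: field names and numbers are placeholders, not Lemmy's schema.
interface SimulatedInstance {
  domain: string;            // fake domain under the reserved .test TLD
  remoteSubscribers: number; // how many remote community subscribers to simulate
}

function buildInstances(
  total: number = 200,
  heavy: number = 25,
  heavySubscribers: number = 5000,
  lightSubscribers: number = 10
): SimulatedInstance[] {
  const instances: SimulatedInstance[] = [];
  for (let i = 0; i < total; i++) {
    instances.push({
      domain: "sim-" + (i + 1) + ".test",
      // The first `heavy` instances get the large subscriber count.
      remoteSubscribers: i < heavy ? heavySubscribers : lightSubscribers,
    });
  }
  return instances;
}

const simulated = buildInstances();
```

The output is deterministic on purpose, so two test runs load the same data and their timings stay comparable.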

I don't think it would be a big deal if the test takes 30 minutes or even longer to run.

And I'll go out and say it: Is a large Lemmy server willing to offer a copy of their database for performance troubleshooting and testing? Lemmy.ca cloned their database last Sunday, which led to the discovery of the site_aggregates UPDATE-without-WHERE problem. Maybe we can create a procedure for removing private messages and getting a dump once a month from a big server to analyze possible causes of PostgreSQL overloads? This may be a faster path than building up from scratch with new testing logic.

 

Autism brain can get latched on technical issues and have no awareness of just how much people with different attitudes about art and entertainment, like Rock Stars making music, do things differently.

I've had to think of analogies of the situation. Like criticizing the maintenance cost or oil leak of an exotic sports car. And not realizing it's all about the 'hand crafted' nature and not about practicality.

On the Reddit adult autism community people have shared experiences about drinking. For some autistic people it helps, others say it causes them problems.

I haven't drunk heavily since New Year's, but I had to drink to reset my brain over my social interpretation mistake.

I just got too worked up over the style and fashion of an art-centered creative project. And I really regret how I didn't realize the social situation. I really regret that I worried about a problem taking so long to be fixed when I haven't found a single other person who cares about it. Stupid brain. Social things like this can be so confusing sometimes, I didn't even realize it.

Street vendor food, the wildly different social conventions and standards people have: I don't have trouble with those, but "Rock Star" culture in business/workplace and music/entertainment/art cultures can confuse me. I get bewildered why people would let Harvey Weinstein get away with so much for so many years. That kind of "Rock Star" culture and how people have different standards for what they consider normal.... bewildering. Rock Star politicians, I've never seen the appeal of wearing clothing with a politician on it. I have such a hard time seeing these situations, my mind doesn't go there. But I can be that way about actual rock stars, art, so the problem I am having I am just shifting my mind to interpret the whole situation that way.

But I feel so dumb and stupid for not recognizing it earlier.

 

I'm supposed to be able to live in the current era without spectrum of years

Many nations: Rent/Housing perspective/10 years ago?

 

Reference, June 4, 2023: https://github.com/LemmyNet/lemmy/issues/2910

Questions:

  1. Lemmy.ml was crashing every 10 minutes of every single day since May 25. Do you dispute this claim I make? Server log Evidence?

  2. On June 4, issue 2910 was on GitHub, and the same developers who run lemmy.ml put a tag on the issue that day.

  3. Did it say server-crashing topic?

  4. Were you aware of the end-of-June Reddit API change?

  5. When was the issue fixed, date and who?

  6. Did you ask PostgreSQL communities on Lemmy for help? Which posts?

  7. Is lemmy.ml a developer run instance? Is this your idea of "supported" example of Lemmy's rust lemmy_server code?

Please explain. Like PostgreSQL EXPLAIN?

Was this what was given as an Issue 2910 to you on June 4, 2023?

 
 

Would fight or flight kick in with their behavior in avoiding the critical deepest (crash-causing, lost-data) software component name and identity? If the mistakes in such programming languages were causing multiple independent servers to melt down in CPU overload?

Can media machines traumatize developers, if there are extreme dehumanization contents and pornography stored in that back-end critical component? If the avoided programming language and logical errors in their own responsibilities utilized the word "TRIGGER" (programming syntax), could that induce social anxiety behavior?

 

IIRC, it was lemmy.ca full copy of live data that was used (copy made on development system, not live server - if I'm following). Shared Saturday July 22 on GitHub was this procedure:

...

Notable are the auto_explain activation statements:

LOAD 'auto_explain';
SET auto_explain.log_min_duration = 0;
SET auto_explain.log_analyze = true;
SET auto_explain.log_nested_statements = true;

This technique would be of great use for developers doing changes and study of PostgreSQL activity. Thank you!

 

Even if this starts out doing HTTP connections one to one, the way vote rows are replicated today, it would at least cut down the PostgreSQL storage of non-local instance votes in favor of having just one single row per comment and post (which is already existing overhead, even for non-local posts and comments).

The home instance of a community (which owns the post, which owns the comment) is the only place to have the full history of activity given how Lemmy 0.18.2 has no backfill procedure for activities before first instance subscriber to a community.

 

Right now querying posts has logic like this:

WHERE (((((((((("community"."removed" = $9) AND ("community"."deleted" = $10)) AND ("post"."removed" = $11)) AND ("post"."deleted" = $12)) AND (("community"."hidden" = $13)

Note that a community can be hidden or deleted, separate fields. And it also has logic to see if the creator of the post is banned in the community:

LEFT OUTER JOIN "community_person_ban" ON (("post"."community_id" = "community_person_ban"."community_id") AND ("community_person_ban"."person_id" = "post"."creator_id"))

And there is both a deleted boolean (end-user delete) and removed boolean (moderator removed) on a post.

Much of this also applies to comments, which are owned by the post, which is in turn owned by the community.
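The same flags, written out as a sketch in TypeScript to show how many independent conditions sit on one post listing. This assumes every bound parameter ($9 through $13) is false, which the query fragment does not actually state, and it ignores whatever follows the truncated hidden clause. The ban lookup mirrors the LEFT OUTER JOIN and is reported separately, since the fragment only shows the join and not how its result is used. All type and function names are hypothetical.

```typescript
// Hypothetical sketch of the flags in the query above; not Lemmy's code.
interface CommunityFlags {
  removed: boolean;
  deleted: boolean;
  hidden: boolean;
}

interface PostFlags {
  removed: boolean; // moderator removed
  deleted: boolean; // end-user delete
}

interface CommunityPersonBan {
  communityId: number;
  personId: number;
}

// List every reason a post would drop out of a listing, in the same order
// as the WHERE clause, assuming each parameter is bound to false.
function exclusionReasons(community: CommunityFlags, post: PostFlags): string[] {
  const reasons: string[] = [];
  if (community.removed) reasons.push("community removed");
  if (community.deleted) reasons.push("community deleted");
  if (post.removed) reasons.push("post removed");
  if (post.deleted) reasons.push("post deleted");
  if (community.hidden) reasons.push("community hidden");
  return reasons;
}

// Equivalent of the join condition: a ban row matching both the post's
// community and the post's creator.
function creatorIsBanned(
  bans: CommunityPersonBan[],
  communityId: number,
  creatorId: number
): boolean {
  return bans.some((b) => b.communityId === communityId && b.personId === creatorId);
}
```

Five booleans plus a join per row is what each test data set needs to vary if the listing logic is going to be exercised properly.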
