fiat_lux

joined 1 week ago
[–] fiat_lux@lemmy.zip 1 points 1 minute ago

Your pizzas always look fabulous, but I really want to introduce your wife to some better olives. If you ever get the chance to pick up some Kalamata or Ligurian olives, be sure to try them out, but you'll probably want to reduce the quantity you add, because they have a lot of flavor.

Black olives are one of the food victims of industrial farming. It's difficult to find the ones that are actually black from natural ripening instead of being processed to look ripe, but they taste very different.

[–] fiat_lux@lemmy.zip 39 points 2 days ago* (last edited 2 days ago) (1 children)

Link is to a shit PDF on a Proton Drive. It's a basic description of the Google auction house. The prices they list are largely driven by the bids advertisers place, but that's not to say Google doesn't charge a bigger minimum for different demographic segments; they very much do. As does Facebook etc.

For example, one reason that parents are worth less is because of the products they listed. Diapers cost less than business lawyers, so the margins are much slimmer, so advertisers aren't going to bid as much for an ad placement.

It does miss one thing that is, in my opinion, one of the more revolting aspects of their auction house. As a bidder your dollar is worth less than a big company's dollar, even as little as one tenth. You could bid a million dollars on an ad space that Apple only bid $100,001 on and you'd lose. That gap is dynamically calculated (at least in part) based on comparative search rankings.
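To make the "your dollar is worth less" point concrete, here is a minimal sketch of a weighted auction. It is not Google's actual algorithm: the `multiplier` field and the simple `amount * multiplier` scoring are assumptions standing in for whatever ranking-based adjustment is really applied, and the 0.1 weight is just the "one tenth" case from the paragraph above.

```python
from dataclasses import dataclass


@dataclass
class Bid:
    advertiser: str
    amount: float      # dollars the advertiser offers
    multiplier: float  # hypothetical weight standing in for the ranking-based adjustment


def effective_bid(bid: Bid) -> float:
    """Score a bid as the raw amount scaled by the bidder's weight (an assumed formula)."""
    return bid.amount * bid.multiplier


def pick_winner(bids: list[Bid]) -> Bid:
    """Return the bid with the highest weighted score, not the highest dollar amount."""
    return max(bids, key=effective_bid)


# The example from the comment: a small bidder weighted at one tenth versus Apple at full weight.
small = Bid("small bidder", 1_000_000, 0.1)
apple = Bid("Apple", 100_001, 1.0)

winner = pick_winner([small, apple])
print(f"winner: {winner.advertiser}")
```

Under these assumed weights the million-dollar bid scores roughly $100,000, so the $100,001 bid wins despite being about a tenth of the size.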

Here's the text without their ad at the end:

The Price of Free Google

What the Ad Industry Pays to Target Americans

A Proton Mail analysis of 54,216 advertiser-defined profiles across the U.S.

The price of your attention

Every user has a price

Every Google search triggers an invisible, real-time auction where advertisers bid for access to your attention. These bids are calculated in milliseconds based on how likely you are to spend. This is how the system decides what you are worth to advertisers.

Proton analyzed 54,216 advertiser-defined profiles across 251 U.S. cities using real ad-market pricing.

● Highest-value user: $17,929/year
● Lowest-value user: $31/year

That’s a 577x difference. This disparity is not an anomaly — it is the business model.

“Google doesn’t just build a profile from the information you knowingly provide. If you sign up for services, click ads, or ignore others, that creates signals the system can use to infer much more than you realize. It can start with age or interests, then expand into assumptions about income, family status, political leanings, or religion.
When the system isn’t sure, it tests those assumptions by serving different ads, links, or recommendations and watching how you respond. It isn’t just tracking who you are. It’s constantly learning, so it can price access to you more precisely.”
— Eamonn Maguire, Director of Engineering, Machine Learning & AI

Who the system values most, and least

These two profiles illustrate how the same system assigns radically different value.

$17,929/year
● 35–44, male
● Bozeman, MT
● Not a parent
● Desktop, heavy user

High-intent, high-margin services:
● business lawyer
● home renovation
● golf courses

$31/year
● 18–24, male
● Fort Smith, AR
● Parent
● Android, casual user

Price-sensitive, lower-margin searches:
● cheap diapers
● family apartments
● toddler clothes

Same system. Same country. 577x difference.

Value is not distributed equally
The gap between the average and the median shows that a small number of high-value users disproportionately influence the system.

The top 10% of users generate 43% of total value.

● Average value: $1,605/year
● Median value: $760/year

Most users are worth far less than the system’s top performers.

How your value is calculated

Your value is constantly recalculated

Your value is not fixed. It is continuously recalculated based on signals that predict the likelihood of a commercially valuable action.

These signals include:
● What you search
● When you search
● What device you use
● Who you are inferred to be

High-intent searches (such as legal services, insurance, or financial products) command significantly higher prices than general browsing or informational queries. Your value can change from one moment to the next depending on what you do. In this system, behavior matters more than time spent.

The signals behind the price

Your device changes your value

Device usage has a measurable impact on how users are valued.
● Desktop: $2,894/year
● iPhone: $1,338/year
● Android: $585/year

Desktop users are worth nearly 5x more than Android users — even when everything else is the same.

These differences reflect observed behavior — including conversion rates and commercial intent — not the cost of the device itself. Your device becomes a proxy for purchasing behavior.

Parents are systematically valued less

Parental status affects how users are priced within the system.

Non-parents are worth ~17% more on average.

The gap increases during peak earning years:
● 25–34: +24%
● 35–44: +34.5%

Having children reduces your perceived commercial value.

Same age — same location — same device. Different value.

Value peaks in midlife

User value is highest between the ages of 25 and 44.

This period corresponds with:
● Major financial decisions
● High-value purchases
● Career-related services

As users age, overall value declines — but does not disappear. For users 65+, approximately 75% of value is concentrated in:

● Health
● Real estate
● Financial planning

The system adapts by narrowing focus rather than reducing targeting.

Gender is not a primary driver of value

Gender has a measurable but limited impact on how users are priced within the ad ecosystem.

Average values across genders are broadly similar — with differences in the single digits.

Differences in value are driven primarily by how advertisers price categories of demand — not by gender alone. Higher-value industries — such as finance, legal services, and B2B technology — tend to influence outcomes more strongly than identity itself.

As a result, gender can affect value indirectly, but it is not a consistent or defining factor.

Where you live affects what you’re worth

Local economies shape how much advertisers are willing to pay for access to users.

Location alone can dramatically change what you’re worth.

Highest-value markets include:

  1. Edmond, OK
  2. Bozeman, MT
  3. Naperville, IL
  4. Santa Fe, NM
  5. Durham, NC

Lowest-value markets include:
247. Greensboro, NC
248. Gulfport, MS
249. Fort Smith, AR
250. Lowell, MA
251. West Valley City, UT

More usage means more value

Frequency of use acts as a multiplier on user value.

● Heavy users: $3,611/year
● Average users: $843/year
● Casual users: $362/year

Heavy users generate nearly 10x more value than casual users. More usage doesn’t just increase your value — it multiplies it.

This creates strong incentives to maximize engagement.
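
A quick sanity check on the multiples quoted in the report, using only the dollar figures that appear in the text above. The labels are mine; the numbers are copied as printed.

```python
# Dollar-per-year figures copied from the report text; nothing here is new data.
pairs = {
    "highest vs lowest profile": (17_929, 31),
    "desktop vs Android": (2_894, 585),
    "heavy vs casual users": (3_611, 362),
    "average vs median": (1_605, 760),
}

# Ratio of the larger figure to the smaller one for each comparison.
ratios = {label: high / low for label, (high, low) in pairs.items()}

for label, value in ratios.items():
    print(f"{label}: {value:.2f}x")
```

The device and usage multiples match the "nearly 5x" and "nearly 10x" wording. The headline ratio comes out slightly above 578 from the rounded dollar figures, so the report's 577x was presumably computed from unrounded values; that is my assumption, as the report doesn't say.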

[–] fiat_lux@lemmy.zip 11 points 3 days ago (2 children)

I'm happy for both of you, this is adorable.

I'm assuming the red part detaches easily from the rest? I hope your bumper crop means this will never be necessary, but should you find yourself in a pinch, the search term "Sequin Horsehair Crinoline Tube" might help you in the future.

[–] fiat_lux@lemmy.zip 9 points 4 days ago* (last edited 4 days ago)

I didn't see red as risk-free at all. You're setting yourself up for a post-button Mad Max world where you know all of your fellow survivors are willing to kill you and up to 49% of humanity.

[–] fiat_lux@lemmy.zip 15 points 4 days ago* (last edited 4 days ago) (1 children)

When I was about 12, I got into a discussion about the environment with another kid at school. She told me that it didn't matter if we ruined the environment of the countries we all live in now, because we could all just move to the Arctic or Antarctica.

I was so surprised by the absurdity of that statement that it stuck with me vividly. To her credit, some years later she asked if I remembered her saying that and then admitted that it was a dumb thing to say. I occasionally remember this as an amusing childhood experience.

Besides the credit part, I remembered it again today for a different reason, this time in a conversation about model collapse.

[Model collapse is] a solved problem. We can see that it’s solved by the fact that AI models continue to get better, despite an increasing amount of AI-generated data being present in the world that training data is being drawn from.
...
AI models are never going to get worse than they are now because if they did get worse we’d just throw them out and go back to the earlier ones that worked better, perhaps re-training with the same data but better training techniques or model architectures.

This is my fault for letting myself get into a discussion about model collapse on the fediverse.

I'm not sure why model collapse isn't a big topic anymore, but maybe that's just because the environmental catastrophes are a more pressing concern. To be clear, I'm not concerned about the models themselves, just our increasing inability to verify the authenticity or accuracy of any information we encounter, including search engines just not turning up any useful results.

On a slightly different topic, if anyone has suggestions for how a person could acquire money to live, which can't involve physical labor, is probably remote-only, and possibly allows part-time flexibility, while unable to move from an expensive location for at least the next couple of years: I'm open to ideas. Because scamming people on Polymarket with a hairdryer sounded far more appealing than it ought.

[–] fiat_lux@lemmy.zip 2 points 4 days ago (1 children)

We can see that it’s solved by the fact that AI models continue to get better despite an increasing amount of AI-generated data being present in the world that training data is being drawn from.

Even if it logically followed that model improvement means model collapse is a solved problem, which it absolutely doesn't, the premise that models are improving to a significant degree is itself up for debate.

Image: line graph of the MMLU-Pro (Massive Multitask Language Understanding) benchmark over time, 07-2023 to 01-2026, showing plateauing values

A lot of people really want to believe that AI is going to just “go away” somehow, and this notion of model collapse is a convenient way to support that belief

Model collapse may for some people be an argument used to support a hope that AI will go away, but the reality of that hope does not alter the validity of the model collapse problem.

You can tell it's not a solved problem because researchers are still trying to quantify the risk and severity of collapse - as you can see even just from the abstracts in the links I provided.

Some choice excerpts from the abstracts, for those who don't want to click the links:

Our results show that even the smallest fraction of synthetic data (e.g., as little as 1% of the total training dataset) can still lead to model collapse

...we establish ... that collapse can be avoided even as the fraction of real data vanishes. On the other hand, we prove that some assumptions ... are indeed necessary: Without them, model collapse can occur arbitrarily quickly, even when the original data is still present in the training set.

[–] fiat_lux@lemmy.zip 3 points 4 days ago (3 children)

It can't only be from data from previous generations, even if the initial demonstration used that, because that would mean a single piece of human-generated text is sufficient to avoid collapse.

The loss of data from generation to generation is one way model collapse can occur, but it's only one way. The actual issues that cause collapse are replication of errors and increasing data homogeneity. In a world where an unknown quantity of new data is AI generated, it is not possible to ensure only a certain quantity is used as future training data.
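
The generation-to-generation loss is easy to see in a toy setting. The sketch below is my own illustration, not taken from any of the papers: each "model" is just a Gaussian refitted to a small sample, and the names `run_generations` and `real_per_gen` are made up for the example. It says nothing about how real language models are trained.

```python
import random
import statistics


def run_generations(generations, sample_size, real_per_gen=0, seed=0):
    """Refit a Gaussian each generation to samples drawn mostly from the previous fit.

    real_per_gen of the sample_size points come from the original N(0, 1)
    "real" distribution; the rest come from the previous generation's model.
    Returns the fitted standard deviation after each generation.
    """
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0  # generation 0 is the "real" distribution
    history = []
    for _ in range(generations):
        synthetic = [rng.gauss(mu, sigma) for _ in range(sample_size - real_per_gen)]
        real = [rng.gauss(0.0, 1.0) for _ in range(real_per_gen)]
        data = synthetic + real
        mu = statistics.fmean(data)
        sigma = statistics.pstdev(data)  # maximum-likelihood estimate of the spread
        history.append(sigma)
    return history


pure = run_generations(generations=500, sample_size=10)
mixed = run_generations(generations=500, sample_size=10, real_per_gen=5)
print(f"synthetic only: final std = {pure[-1]:.3g}")
print(f"half real data: final std = {mixed[-1]:.3g}")
```

With only synthetic data, each refit loses a little spread and the estimation errors compound until the samples are nearly identical, which is the homogeneity problem in miniature. Feeding in fresh real samples keeps the spread from vanishing in this toy, but that is a property of a one-parameter Gaussian, not evidence that mixing is enough for large models.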

Additionally, as new human generated content is based on the information provided by AI, even if not used intentionally in the construction of the text itself, the error replication and data diversity issues cross over from being only an AI-generated content problem to an all content problem. You can see examples of this happening now in the media where a journalist relies on AI output to fact check, and then the article with the error gets republished by other media outlets.

Real AI training methods may stave off some model collapse, if we ignore existing issues around the cultural homogeneity of training data from across all time periods, or assume the models are sufficiently weighted to mitigate those issues, but it's by no means settled that collapse is a non-problem.

You've mentioned using data mixing to prevent collapse, but some of the research suggests that even iterative mixing isn't sufficient, depending on the quantities of real vs synthetic data. "Strong Model Collapse" (2024) by Dohmatob, Feng, Subramonian, and Kempe goes into that. Since then there's been "When Models Don’t Collapse: On the Consistency of Iterative MLE" (2025) by Barzilai and Shamir, which presents one theoretical case where collapse won't occur provided some assumptions hold, but the math is beyond me. They also note multiple situations where near-instant collapse can occur.

How much data poisoning might affect any of that is not at all clear. It would need to be present in sufficient quantity, for whatever model, to have an effect, but it certainly wouldn't help. The recent Bixonimania scandal suggests it's feasible.

[–] fiat_lux@lemmy.zip 4 points 4 days ago

I went looking for the author of this. It's credited to Vinay Krishnan, but the Twitter account it was posted to is now private. He unfortunately doesn't have it on his website, but the site is filled with other very good writing.

[–] fiat_lux@lemmy.zip 3 points 5 days ago (5 children)

“model collapse” was demonstrated by repeatedly training generation after generation of models on the output of previous generations

the best models these days are trained largely on synthetic data - data that’s been pre-processed by other AIs to turn it into stuff that makes for better training material

You can prevent model collapse simply by enriching the training data with good data - stuff that is already archived, that can’t be “contaminated.”

This feels like an odd juxtaposition.

If model collapse can be avoided by enriching with uncontaminated data, and model collapse comes from using training data generated by previous generations, doesn't that imply that:

  1. Either the best models are headed towards model collapse, or,
  2. Models can't be updated because modern data isn't usable?
 

Image Text

"S3d and Diamond Multimedia make your system scream.

Fast forward to the future of 3D multimedia. Supersonic graphics.

All the power 3D has to offer for business and entertainment on your PC, right here- right now.

Stealth 3D 2000 from Diamond Multimedia™ does it all with the S3d chip on board. Use Diamond's Stealth 3D 2000 together with S3d logo software. They'll make your system scream.

SEEK FIND. DEMAND. S3d Onboard S3 Incorporated

hit our web site for the real stuff: seek.s3.com

Designed for Microsoft Windows 95

S3d is compatible with Windows 95, Windows 3.1, Windows NT, and OS/2

Product Information Number 316"

Image description

The image is a composite photo by Gerald Bybee, made from a high-contrast black and white closeup of a bald Asian woman screaming, with her hands loosely covering her ears. In this composite her eyes have been replaced with copies of her open mouth.

[–] fiat_lux@lemmy.zip 4 points 5 days ago (1 children)

They also had an animal neglect version, but only the bondage ad was pulled in the UK. Late 90s advertising for games and consoles tried to be as edgy as possible when published in magazines targeted at male demographics.

https://www.timeextension.com/features/flashback-when-nintendo-was-forced-to-pull-its-offensive-game-boy-advert

[–] fiat_lux@lemmy.zip 23 points 5 days ago

Congratulations. There's something about convincing a cat you're a source of enjoyment that is ridiculously rewarding. You earned those purrs.

I hope he remembers that he enjoyed this experience so you can both keep enjoying future purring!

[–] fiat_lux@lemmy.zip 7 points 5 days ago

Panels 3 and 4 aren't quite right.

The guy in Panel 3 didn't just remix it, he cherry picked the parts that would be most likely to rank for either a short or long tail keyword strategy depending on the size and business of his client or employer.

And that guy doesn't have his paper taken away in Panel 4. He's feeding as many papers as he can to the AI which are tailored for "Answer Engine Optimization" or "Generative Engine Optimization" (they haven't settled on a catchy name yet for what is largely the same thing, even if some claim they're different).

The techniques have changed slightly but SEO has been a filthy game for much longer than AI. Google made sure of that with their auction house, "featured snippet" sections and backlink authority ranking systems.
