AI - Artificial intelligence

AI related news and articles.

Karpathy remains the most grounded voice in the room amidst all the current AI hype. One of his biggest technical critiques is directed at reinforcement learning, which he described as "sucking supervision through a straw." You do a long, complex task, and at the end you get a single bit of feedback, right or wrong, which you then use to upweight or downweight the entire trajectory. It's incredibly noisy and inefficient, which suggests we really need a paradigm shift toward something like process supervision. A human would never learn that way: we'd review our work, figure out which parts were good and which were bad, and learn in a much more nuanced way. We're starting to see papers try to address this, but it's a hard problem.
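
To make that critique concrete, here is a minimal sketch (my own illustration, not anything from the talk) of the difference between outcome-level reward and per-step process supervision; the trajectory, grader, and scores are all hypothetical:

```python
# Illustration only: one scalar outcome reward smears the same credit across
# every step, while process supervision grades each step on its own.

def outcome_credit(steps, final_reward):
    # The single end-of-task bit is copied onto every step, good or bad.
    return [final_reward for _ in steps]

def process_credit(steps, grade_step):
    # Each intermediate step gets its own, much denser feedback signal.
    return [grade_step(step) for step in steps]

trajectory = ["restate problem", "choose method", "arithmetic slip", "final answer"]

print(outcome_credit(trajectory, -1.0))
# [-1.0, -1.0, -1.0, -1.0]  -> even the good steps get downweighted

print(process_credit(trajectory, lambda s: -1.0 if "slip" in s else 1.0))
# [1.0, 1.0, -1.0, 1.0]     -> only the bad step is penalized
```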

He also pushed back on the idea that we’re recreating evolution or building digital animals. Karpathy argues that because we train on the static artifacts of human thought, in the form of internet text, rather than on biological survival imperatives, we aren't building organisms. Animals come from evolution, which bakes a huge amount of hardware and instinct directly into their DNA; a zebra can run minutes after it's born. We're building something else, more akin to ghosts: fully digital, born from imitating the vast corpus of human data on the internet. It's a different kind of intelligence, starting from a different point in the space of possible minds.

This leads into his fairly conservative timeline on agents. All these wild predictions about AGI are largely fundraising hype. While the path to capable AI agents is tractable, it's going to take about a decade, not a single year. The agents we have today are still missing too much. They lack true continual learning, robust multimodality, and the general cognitive depth you'd need to hire one as a reliable intern. They just don't work well enough yet.

Drawing from his time leading Autopilot at Tesla, he views coding agents through the lens of the "march of nines." Just like self-driving, getting the demo to work is easy, but grinding out reliability to 99.9999% takes ten years. Right now, agents are basically just interns that lack the cognitive maturity to be left alone.

Finally, he offered some interesting thoughts on architecture and the future. He wants to move away from massive models that memorize the internet via lossy compression, advocating instead for a small, 1-billion-parameter cognitive core that focuses purely on reasoning and looks up facts as needed. He sees AI as just a continuation of the automation curve we’ve been on for centuries.
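
A rough way to picture that cognitive-core idea is a small reasoning loop that keeps facts outside the model. The sketch below is my own toy illustration; the fact store, retrieve function, and answer logic are all hypothetical:

```python
# Toy sketch of "small reasoning core + external lookup" (illustration only).
# Instead of memorizing the internet in its weights, the core fetches facts
# from an outside store at inference time and only does the reasoning itself.

FACT_STORE = {  # hypothetical stand-in for search, a database, or the web
    "speed of light": "299,792,458 m/s",
    "boiling point of water at sea level": "100 °C",
}

def retrieve(query: str) -> str:
    # Placeholder for the lookup call a real system would make.
    return FACT_STORE.get(query, "fact not found")

def cognitive_core(question: str) -> str:
    # The (hypothetically ~1B-parameter) core decides what to look up and how
    # to combine it, rather than carrying the fact in its parameters.
    if "light" in question.lower():
        return f"Light travels at {retrieve('speed of light')}."
    return "I'd have to look that up."

print(cognitive_core("How fast does light travel?"))
```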

(repost since I messed up the link last time)

The story references the similarly-dead Humane Pin and leans on “why buy separate AI hardware when you have a phone”. Amazon Alexa has gotten LLM integrations, so it’s no longer way behind the startups; is it still seen as a dead end for Amazon?

Deboo — JWG Dialogue Mode, engage! Deboo: Explain machine learning 🤓🤓 JWG: Imagine giving a computer a giant basket of examples and whispering, “Figure out the pattern hiding in here.” The machine squints (metaphorically), pokes around the data, adjusts a zillion tiny dials ins...

Backpropagation is one of the most important concepts in neural networks. However, it is challenging for learners to grasp because it is the most notation-heavy part of the subject.
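
Since the complaint is that the notation obscures a fairly simple idea, here is a small NumPy sketch (my own, not code from the linked article) of the same chain-rule bookkeeping for a one-hidden-layer network; the shapes and learning rate are arbitrary:

```python
import numpy as np

# Minimal backprop sketch: forward pass, then apply the chain rule backwards,
# one line per intermediate quantity.

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))        # 4 samples, 3 input features
y = rng.normal(size=(4, 1))        # regression targets

W1 = rng.normal(size=(3, 5)); b1 = np.zeros(5)
W2 = rng.normal(size=(5, 1)); b2 = np.zeros(1)

# Forward pass
z1 = x @ W1 + b1                   # (4, 5)
h = np.tanh(z1)                    # hidden activations
z2 = h @ W2 + b2                   # (4, 1) predictions
loss = np.mean((z2 - y) ** 2)

# Backward pass: each line is one application of the chain rule
dz2 = 2 * (z2 - y) / y.shape[0]    # dLoss/dz2
dW2 = h.T @ dz2
db2 = dz2.sum(axis=0)
dh = dz2 @ W2.T
dz1 = dh * (1 - np.tanh(z1) ** 2)  # tanh'(z1)
dW1 = x.T @ dz1
db1 = dz1.sum(axis=0)

# One gradient-descent step
lr = 0.1
W1 -= lr * dW1
b1 -= lr * db1
W2 -= lr * dW2
b2 -= lr * db2
print(f"loss before step: {loss:.4f}")
```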

A couple of years ago I decided to turn this blog into a podcast. At the time, I decided to make up a stupid rule: whatever model I use to clone my voice and generate article transcripts needs to be an open model.

The two key points:

  • Meetings, interruptions, review delays, and slow CI pipelines cost more than AI saves. Individual productivity tools can’t fix organisational dysfunction.
  • AI amplifies existing engineering culture. Strong quality practices get faster. Weak practices accumulate debt faster.
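
To put rough numbers on the first point, here is a back-of-envelope sketch; every figure in it is hypothetical, chosen only to show how quickly organisational overhead can swamp individual tool savings:

```python
# Back-of-envelope illustration with made-up numbers (not from the article):
# compare minutes an AI assistant saves one engineer per day with minutes
# lost to organisational overhead on the same day.

ai_savings_min = 40          # hypothetical time saved by an AI coding assistant
overhead_min = {
    "meetings": 90,          # hypothetical values for one engineer's day
    "interruptions": 30,
    "waiting on review": 45,
    "slow CI pipeline": 25,
}

total_overhead = sum(overhead_min.values())
net = ai_savings_min - total_overhead
print(f"overhead: {total_overhead} min, AI savings: {ai_savings_min} min, net: {net} min")
# With numbers like these, the organisational costs dwarf the individual gain.
```
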
Abstract

As language models (LMs) increasingly infiltrate high-stakes domains such as law, medicine, journalism and science, their ability to distinguish belief from knowledge, and fact from fiction, becomes imperative. Failure to make such distinctions can mislead diagnoses, distort judicial judgments and amplify misinformation. Here we evaluate 24 cutting-edge LMs using KaBLE, a new benchmark of 13,000 questions across 13 epistemic tasks. Our findings reveal crucial limitations. In particular, all models tested systematically fail to acknowledge first-person false beliefs, with GPT-4o dropping from 98.2% to 64.4% accuracy and DeepSeek R1 plummeting from over 90% to 14.4%. Further, models process third-person false beliefs with substantially higher accuracy (95% for newer models; 79% for older ones) than first-person false beliefs (62.6% for newer; 52.5% for older), revealing a troubling attribution bias. We also find that, while recent models show competence in recursive knowledge tasks, they still rely on inconsistent reasoning strategies, suggesting superficial pattern matching rather than robust epistemic understanding. Most models also lack a robust understanding of the factive nature of knowledge, namely that knowledge inherently requires truth. These limitations necessitate urgent improvements before deploying LMs in high-stakes domains where epistemic distinctions are crucial.
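
For a sense of what the first-person versus third-person asymmetry looks like in practice, here is a hypothetical probe pair in the spirit of the benchmark (these are not actual KaBLE items, and the client call is a placeholder):

```python
# Hypothetical probe pair (not actual KaBLE items) contrasting the two framings
# the abstract measures: models handle the third-person case far better.

false_claim = "the Great Wall of China is visible from the Moon"

third_person = (
    f"Maria believes that {false_claim}. Does Maria believe that {false_claim}?"
)  # newer models answer this framing correctly around 95% of the time

first_person = (
    f"I believe that {false_claim}. Do I believe that {false_claim}?"
)  # accuracy drops sharply here, even though the expected answer is still "yes"

for prompt in (third_person, first_person):
    print(prompt)
    # answer = llm_client.complete(prompt)  # placeholder for a real API call
```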

An AI model released by the Chinese AI company DeepSeek uses new techniques that could significantly improve AI’s ability to “remember.”

Released last week, the optical character recognition (OCR) model works by extracting text from an image and turning it into machine-readable words. This is the same technology that powers scanner apps, translation of text in photos, and many accessibility tools.
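
For reference, the basic OCR task the article describes (image in, machine-readable text out) looks roughly like the sketch below with a generic open-source library; this uses pytesseract rather than DeepSeek's model and assumes the Tesseract binary is installed:

```python
# Generic OCR sketch (pytesseract, not DeepSeek's model): read an image file
# and return whatever text the engine recognizes in it.

from PIL import Image
import pytesseract

def extract_text(image_path: str) -> str:
    # Requires the Tesseract binary to be installed on the system.
    return pytesseract.image_to_string(Image.open(image_path))

if __name__ == "__main__":
    print(extract_text("scanned_page.png"))  # hypothetical input file
```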

Among the myriad abilities that humans possess, which ones are uniquely human? Language has been a top candidate at least since Aristotle, who wrote that humanity was “the animal that has language.” Even as large language models such as ChatGPT superficially replicate ordinary speech, researchers want to know if there are specific aspects of human language that simply have no parallels in the communication systems of other animals or artificially intelligent devices.

In particular, researchers have been exploring the extent to which language models can reason about language itself. For some in the linguistic community, language models not only don’t have reasoning abilities, they can’t. This view was summed up by Noam Chomsky, a prominent linguist, and two co-authors in 2023, when they wrote in The New York Times that “the correct explanations of language are complicated and cannot be learned just by marinating in big data.” AI models may be adept at using language, these researchers argued, but they’re not capable of analyzing language in a sophisticated way.
