AI - Artificial intelligence


AI related news and articles.

Olmo 3 is a fully open LLM (simonwillison.net)
submitted 19 hours ago* (last edited 19 hours ago) by cm0002@toast.ooo to c/Aii@programming.dev
 
 

Olmo is the LLM series from Ai2, the Allen Institute for AI. Unlike most open-weight models, these releases are notable for including the full training data, training process, and checkpoints.

The new Olmo 3 claims to be “the best fully open 32B-scale thinking model” and has a strong focus on interpretability:

At its center is Olmo 3-Think (32B), the best fully open 32B-scale thinking model that for the first time lets you inspect intermediate reasoning traces and trace those behaviors back to the data and training decisions that produced them.
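For anyone who wants to poke at it, here is a minimal sketch of loading the model through Hugging Face transformers. The repo id below is an assumption (check Ai2's Hugging Face organization for the actual name), and a 32B model will need substantial GPU memory or quantization.

    # Sketch: load an Olmo 3 checkpoint and generate a response.
    # "allenai/Olmo-3-32B-Think" is an assumed repo id, not confirmed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "allenai/Olmo-3-32B-Think"  # assumed, verify on Hugging Face
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

    prompt = "Explain why the sky is blue, showing your reasoning."
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=512)
    print(tokenizer.decode(output[0], skip_special_tokens=True))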


Are you a wizard with words? Do you like money without caring how you get it? You could be in luck now that a new role in cybercrime appears to have opened up – poetic LLM jailbreaking.

A research team in Italy published a paper this week, with one of its members saying that the "findings are honestly wilder than we expected."

Researchers found that attempts to bypass top AI models' guardrails – the safeguards preventing them from spewing harmful content – were vastly more successful when composed in verse than as typical prose prompts.

Man vs. Machine (philosophyofbalance.com)

Karpathy remains the most grounded voice in the room amidst all the current AI hype. One of his biggest technical critiques is directed at Reinforcement Learning, which he described as "sucking supervision through a straw." You do a long, complex task, and at the end you get a single bit of feedback, right or wrong, and you use that to upweight or downweight the entire trajectory. It's incredibly noisy and inefficient, suggesting we really need a paradigm shift toward something like process supervision. A human would never learn that way: we'd review our work, figure out which parts were good and which were bad, and learn in a much more nuanced way. We're starting to see papers try to address this, but it's a hard problem.
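To make the "straw" point concrete, here is a minimal sketch in plain Python, with a made-up three-step trajectory and toy numbers, showing how an outcome-only reward scales every step by the same scalar while process supervision scores each step on its own:

    # Contrast outcome-only reward with per-step (process) supervision.
    # "trajectory" is a hypothetical list of (step_log_prob, step_is_correct)
    # pairs; all numbers are illustrative, not from any specific paper.
    trajectory = [
        (-0.2, True),   # step 1: good reasoning step
        (-1.5, False),  # step 2: a wrong turn
        (-0.4, True),   # step 3: recovers
    ]
    final_answer_correct = True

    # Outcome supervision: one bit of feedback at the end scales the gradient
    # signal for every step equally, good and bad alike.
    outcome_reward = 1.0 if final_answer_correct else -1.0
    outcome_loss = -outcome_reward * sum(logp for logp, _ in trajectory)

    # Process supervision: each step gets its own reward, so the wrong turn
    # in step 2 is penalized even though the final answer happened to be right.
    process_loss = -sum((1.0 if ok else -1.0) * logp for logp, ok in trajectory)

    print(f"outcome-supervised loss: {outcome_loss:.2f}")
    print(f"process-supervised loss: {process_loss:.2f}")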

He also pushed back on the idea that we're recreating evolution or building digital animals. Karpathy argues that because we train on the static artifacts of human thought, in the form of internet text, rather than on biological survival imperatives, we aren't building organisms. Animals come from evolution, which bakes a huge amount of hardware and instinct directly into their DNA; a zebra can run minutes after it's born. We're building something else, more akin to ghosts: fully digital, born from imitating the vast corpus of human data on the internet. It's a different kind of intelligence, starting from a different point in the space of possible minds.

This leads into his fairly conservative timeline on agents. He sees the wild predictions about imminent AGI as largely fundraising hype. While the path to capable AI agents is tractable, it's going to take about a decade, not a single year. The agents we have today are still missing too much: they lack true continual learning, robust multimodality, and the general cognitive depth you'd need to hire one as a reliable intern. They just don't work well enough yet.

Drawing from his time leading Autopilot at Tesla, he views coding agents through the lens of the "march of nines." Just like self-driving, getting the demo to work is easy, but grinding out reliability to 99.9999% takes ten years. Right now, agents are basically just interns that lack the cognitive maturity to be left alone.

Finally, he offered some interesting thoughts on architecture and the future. He wants to move away from massive models that memorize the internet via lossy compression, advocating instead for a small, 1-billion-parameter cognitive core that focuses purely on reasoning and looks up facts as needed. He sees AI as just a continuation of the automation curve we've been on for centuries.
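As a rough illustration of the cognitive-core idea (not Karpathy's actual design; the names small_reasoning_model and fact_store below are hypothetical stand-ins), the reasoning component stays small and factual knowledge lives outside the weights:

    # Sketch: a small "core" reasons over facts fetched from an external store
    # instead of memorizing them in its parameters.
    fact_store = {
        "boiling point of water at sea level": "100 degrees Celsius",
    }

    def small_reasoning_model(question: str, facts: dict[str, str]) -> str:
        # Stand-in for a ~1B-parameter model that reasons over retrieved facts.
        return f"Given {facts}, the answer to '{question}' follows by lookup."

    def answer(question: str) -> str:
        # The core decides what it needs to know, retrieves it, then reasons.
        retrieved = {k: v for k, v in fact_store.items() if k in question.lower()}
        return small_reasoning_model(question, retrieved)

    print(answer("What is the boiling point of water at sea level?"))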


Deboo — JWG Dialogue Mode, engage! Deboo: Explain machine learning 🤓🤓 JWG: Imagine giving a computer a giant basket of examples and whispering, “Figure out the pattern hiding in here.” The machine squints (metaphorically), pokes around the data, adjusts a zillion tiny dials ins...
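The "tiny dials" image maps onto gradient descent. A toy sketch with a single dial and made-up example data:

    # One dial (a slope w), nudged repeatedly so the line w * x fits a basket
    # of examples. Data points are invented for illustration.
    examples = [(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)]  # (x, y) pairs, roughly y = 2x

    w = 0.0              # the dial, starting from a bad guess
    learning_rate = 0.05

    for _ in range(200):
        # Mean-squared-error gradient with respect to w, averaged over examples.
        grad = sum(2 * (w * x - y) * x for x, y in examples) / len(examples)
        w -= learning_rate * grad  # nudge the dial a little

    print(f"learned dial setting: w = {w:.2f}")  # should land near 2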


(repost since I messed up the link last time)

The story references the similarly dead Humane AI Pin and leans on the argument "why buy separate AI hardware when you have a phone?" Amazon Alexa has gotten LLM integrations, so it's no longer way behind the startups; is it still seen as a dead end for Amazon?


Backpropagation is one of the most important concepts in neural networks; however, it is challenging for learners to understand because it is the most notation-heavy part.
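For anyone put off by the notation, here is a tiny numeric sketch of the same idea (made-up values, one intermediate variable), walking the chain rule backward:

    # y = (w * x + b) squared, differentiated by applying the chain rule
    # backward through the intermediate value z.
    x, w, b = 2.0, 0.5, 1.0

    # Forward pass, keeping the intermediate value.
    z = w * x + b          # z = 2.0
    y = z ** 2             # y = 4.0

    # Backward pass: start from dy/dy = 1 and step back through the graph.
    dy_dz = 2 * z          # derivative of z**2
    dy_dw = dy_dz * x      # chain rule: dy/dw = dy/dz * dz/dw
    dy_db = dy_dz * 1      # dz/db = 1

    print(f"dy/dw = {dy_dw}, dy/db = {dy_db}")  # 8.0 and 4.0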


A couple of years ago I decided to turn this blog into a podcast. At the time, I made up a stupid rule: whatever model I use to clone my voice and generate article transcripts needs to be an open model.
