AI - Artificial intelligence

AI related news and articles.

Intel software engineers continue to be hard at work on LLM-Scaler, their solution for running vLLM on Intel GPUs in a Docker containerized environment. A new beta release of LLM-Scaler was published overnight with support for running more large language models.

Since the project's "LLM-Scaler 1.0" debut back in August, there have been frequent updates expanding LLM coverage on Intel GPUs and exposing more features for harnessing the AI compute power of Intel graphics hardware. The versioning scheme, though, remains a mess: today's test version is "llm-scaler-vllm beta release 0.10.2-b6" even though "1.0" was previously announced.
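
Once an LLM-Scaler container is up, the served model is reachable over vLLM's standard OpenAI-compatible HTTP API, so a quick smoke test is just an HTTP request. A minimal sketch, assuming the server listens on the default port 8000 and that the model name matches whatever the container was launched with (both are assumptions here, not LLM-Scaler specifics):

```python
# Minimal sketch: query a vLLM server (e.g. one started from an LLM-Scaler
# container) through its OpenAI-compatible API. Assumes the server listens on
# localhost:8000 and that "my-model" matches the model it was launched with.
import json
import urllib.request

payload = {
    "model": "my-model",  # hypothetical name; must match the served model
    "messages": [{"role": "user", "content": "Summarize what vLLM does."}],
    "max_tokens": 128,
}

req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)

with urllib.request.urlopen(req) as resp:
    body = json.load(resp)
    print(body["choices"][0]["message"]["content"])
```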

Anthropic released Claude Opus 4.5 this morning, which they call the "best model in the world for coding, agents, and computer use". This is their attempt to retake the crown for best coding model after significant challenges from OpenAI's GPT-5.1-Codex-Max and Google's Gemini 3, both released within the past week!

The core characteristics of Opus 4.5 are a 200,000 token context (same as Sonnet), 64,000 token output limit (also the same as Sonnet), and a March 2025 “reliable knowledge cutoff” (Sonnet 4.5 is January, Haiku 4.5 is February).

The pricing is a big relief: $5/million tokens for input and $25/million for output. This is a lot cheaper than the previous Opus at $15/$75 and keeps it a little more competitive with the GPT-5.1 family ($1.25/$10) and Gemini 3 Pro ($2/$12, or $4/$18 for >200,000 tokens). For comparison, Sonnet 4.5 is $3/$15 and Haiku 4.5 is $1/$5.
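
To make those per-million-token prices concrete, here is a quick back-of-the-envelope comparison. The prices are the ones quoted above; the workload size is an arbitrary example, not a benchmark:

```python
# Back-of-the-envelope cost comparison using the per-million-token prices
# quoted above. The workload size (tokens in/out) is an arbitrary example.
PRICES = {  # (input $/M tokens, output $/M tokens)
    "Opus 4.5": (5.00, 25.00),
    "Previous Opus": (15.00, 75.00),
    "Sonnet 4.5": (3.00, 15.00),
    "Haiku 4.5": (1.00, 5.00),
    "GPT-5.1": (1.25, 10.00),
    "Gemini 3 Pro (<=200k)": (2.00, 12.00),
}

input_tokens = 2_000_000   # hypothetical monthly usage
output_tokens = 500_000

for model, (in_price, out_price) in PRICES.items():
    cost = input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price
    print(f"{model:<22} ${cost:,.2f}")
```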

President Donald Trump has issued a new Executive Order launching the "Genesis Mission," an AI-focused initiative that aims to create the "most complex and powerful scientific instrument ever built."

Are you a wizard with words? Do you like money without caring how you get it? You could be in luck now that a new role in cybercrime appears to have opened up – poetic LLM jailbreaking.

A research team in Italy published a paper this week, with one of its members saying that the "findings are honestly wilder than we expected."

The researchers found that attempts to bypass top AI models' guardrails – the safeguards preventing them from spewing harmful content – were vastly more successful when composed in verse than as typical prose prompts.

Olmo 3 is a fully open LLM (simonwillison.net)

Olmo is the LLM series from Ai2, the Allen Institute for AI. Unlike most open-weight models, these are notable for including the full training data, training process, and checkpoints along with the model releases.

The new Olmo 3 claims to be “the best fully open 32B-scale thinking model” and has a strong focus on interpretability:

At its center is Olmo 3-Think (32B), the best fully open 32B-scale thinking model that for the first time lets you inspect intermediate reasoning traces and trace those behaviors back to the data and training decisions that produced them.

Man vs. Machine (philosophyofbalance.com)

Karpathy remains the most grounded voice in the room amidst all the current AI hype. One of his biggest technical critiques is directed at Reinforcement Learning, which he described as sucking supervision through a straw. You do a long, complex task, and at the end, you get a single bit of feedback, right or wrong, and you use that to upweight or downweight the entire trajectory. It's incredibly noisy and inefficient, suggesting we really need a paradigm shift toward something like process supervision. A human would never learn that way because we'd review our work, figure out which parts were good and which were bad, and learn in a much more nuanced way. We're starting to see papers try to address this, but it's a hard problem.
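
A toy sketch of the contrast he is drawing (not his code): with outcome-based RL, a single final reward gets smeared across every step of the trajectory, while process supervision credits each step on its own. The step names and scores below are invented for illustration:

```python
# Toy illustration of outcome reward vs. process supervision. A "trajectory"
# here is just a list of steps, each of which can be scored individually in
# the process-supervised case.

def outcome_supervision(num_steps: int, final_reward: float) -> list[float]:
    """One bit of feedback at the end, smeared over the whole trajectory."""
    return [final_reward] * num_steps

def process_supervision(step_scores: list[float]) -> list[float]:
    """Each step gets its own credit: good steps in a failed attempt are
    still reinforced, and bad steps in a successful one are not."""
    return step_scores

steps = ["plan", "write code", "run tests", "fix bug", "submit"]
per_step_quality = [1.0, 0.8, 1.0, -0.5, 1.0]  # hypothetical judgments

print(outcome_supervision(len(steps), final_reward=-1.0))  # every step punished
print(process_supervision(per_step_quality))               # nuanced credit
```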

He also pushed back on the idea that we're recreating evolution or building digital animals. Karpathy argues that because we train on the static artifacts of human thought, in the form of internet text, rather than biological survival imperatives, we aren't building organisms. Animals come from evolution, which bakes a huge amount of hardware and instinct directly into their DNA. A zebra can run minutes after it's born. We're building something else that's more akin to ghosts. They are fully digital, born from imitating the vast corpus of human data on the internet. It's a different kind of intelligence, starting from a different point in the space of possible minds.

This leads into his fairly conservative timeline on agents. In his view, the wild predictions about AGI are largely fundraising hype. While the path to capable AI agents is tractable, it's going to take about a decade, not a single year. The agents we have today are still missing too much. They lack true continual learning, robust multimodality, and the general cognitive depth you'd need to hire one as a reliable intern. They just don't work well enough yet.

Drawing from his time leading Autopilot at Tesla, he views coding agents through the lens of the "march of nines." Just like self-driving, getting the demo to work is easy, but grinding out reliability to 99.9999% takes ten years. Right now, agents are basically just interns that lack the cognitive maturity to be left alone.

Finally, he offered some interesting thoughts on architecture and the future. He wants to move away from massive models that memorize the internet via lossy compression, advocating instead for a small, 1-billion-parameter cognitive core that focuses purely on reasoning and looks up facts as needed. He sees AI as just a continuation of the automation curve we've been on for centuries.

Deboo — JWG Dialogue Mode, engage! Deboo: Explain machine learning 🤓🤓 JWG: Imagine giving a computer a giant basket of examples and whispering, “Figure out the pattern hiding in here.” The machine squints (metaphorically), pokes around the data, adjusts a zillion tiny dials ins...

(repost since I messed up the link last time)

The story references the similarly-dead Humane Pin and leans on “why buy separate AI hardware when you have a phone”. Amazon Alexa has gotten LLM integrations, so it’s no longer way behind the startups; is it still seen as a dead end for Amazon?

Backpropagation is one of the most important concepts in neural networks; however, it is challenging for learners to understand because it is the most notation-heavy part.
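
As a counterweight to the notation, a minimal numerical sketch helps: a two-layer network trained on one made-up sample, with the backward pass written as explicit chain-rule steps. The layer sizes, data, and learning rate are arbitrary:

```python
# Minimal backpropagation example: a 2-layer network (linear -> tanh -> linear)
# trained on one made-up sample with squared error, gradients via the chain rule.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 1))          # input (3 features)
y = np.array([[1.0]])                # target

W1 = rng.normal(size=(4, 3)) * 0.1   # first-layer weights
W2 = rng.normal(size=(1, 4)) * 0.1   # second-layer weights
lr = 0.1

for step in range(100):
    # Forward pass
    z1 = W1 @ x                        # (4,1) pre-activation
    h = np.tanh(z1)                    # (4,1) hidden activation
    y_hat = W2 @ h                     # (1,1) prediction
    loss = 0.5 * float((y_hat - y) ** 2)

    # Backward pass (chain rule, from the loss back toward the input)
    dy_hat = y_hat - y                 # dL/dy_hat, shape (1,1)
    dW2 = dy_hat @ h.T                 # dL/dW2,    shape (1,4)
    dh = W2.T @ dy_hat                 # dL/dh,     shape (4,1)
    dz1 = dh * (1 - np.tanh(z1) ** 2)  # through tanh, shape (4,1)
    dW1 = dz1 @ x.T                    # dL/dW1,    shape (4,3)

    # Gradient descent update
    W1 -= lr * dW1
    W2 -= lr * dW2

print(f"final loss: {loss:.6f}")
```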

A couple of years ago I decided to turn this blog into a podcast. At the time, I decided to make up a stupid rule: whatever model I use to clone my voice and generate article transcripts needs to be an open model.
