#hallucinations

3 posts · 3 participants · 0 posts today

"My favorite cognitive bias is the availability heuristic and a close second is its cousin salience bias. Humans are empirically predisposed towards noticing and remembering things that are more striking, and to overestimate their frequency.

If you are estimating the variables above based on the vibe that you’re getting from the experience of using an LLM, you may be overestimating its utility.

Consider a slot machine.
(...)
Now, consider Mallory.

If you put ten minutes into writing a prompt, and Mallory gives a completely off-the-rails, useless answer, and you lose ten minutes, well, that’s just what using a computer is like sometimes. Mallory malfunctioned, or hallucinated, but it does that sometimes, everybody knows that. You only wasted ten minutes. It’s fine. Not a big deal. Let’s try it a few more times. Just ten more minutes. It’ll probably work this time.

If you put ten minutes into writing a prompt, and it completes a task that would have otherwise taken you 4 hours, that feels amazing. Like the computer is magic! An absolute endorphin rush.

Very memorable. When it happens, it feels like P = 1.

But... did you have a time budget before you started? Did you have a specified N such that “I will give up on Mallory as soon as I have spent N minutes attempting to solve this problem with it”? When the jackpot finally pays out that 4 hours, did you notice that you put 6 hours worth of 10-minute prompt coins into it?

If you are attempting to use the same sort of heuristic intuition that probably works pretty well for other business leadership decisions, Mallory’s slot-machine chat-prompt user interface is practically designed to subvert those sensibilities. Most business activities do not have nearly such an emotionally variable, intermittent reward schedule. They’re not going to trick you with this sort of cognitive illusion."

blog.glyph.im/2025/08/futzing-

blog.glyph.im · The Futzing Fraction
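
The time-budget arithmetic in the quote above can be sanity-checked with a few lines. The sketch below is illustrative only and not from the linked post: it assumes ten-minute attempts, a four-hour manual baseline, and roughly a one-in-36 chance that any given prompt pays off (the "6 hours of coins" scenario); every number and name is a made-up assumption.

```python
# Illustrative only: expected time spent prompting until one success,
# versus doing the task by hand. All figures are assumptions, not data
# from the linked post.

def expected_prompting_minutes(p_success: float, minutes_per_attempt: float) -> float:
    """Expected total minutes until the first successful attempt,
    treating each prompt as an independent Bernoulli trial
    (geometric distribution: expected attempts = 1 / p)."""
    if not 0.0 < p_success <= 1.0:
        raise ValueError("p_success must be in (0, 1]")
    return minutes_per_attempt / p_success

# Assumed inputs: 10-minute attempts, a 1-in-36 hit rate, a 4-hour manual baseline.
attempt_minutes = 10
p_hit = 1 / 36
manual_minutes = 4 * 60

expected = expected_prompting_minutes(p_hit, attempt_minutes)
print(f"Expected prompting time: {expected:.0f} min")    # 360 min, i.e. 6 hours
print(f"Manual baseline:         {manual_minutes} min")  # 240 min, i.e. 4 hours
print(f"Prompting wins: {expected < manual_minutes}")    # False under these assumptions
```

Under these assumed numbers, the intermittent 4-hour "jackpot" costs 6 hours of attempts on average, which is the point the post is making about budgeting before you start.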
More from Glyph

"For three weeks in May, the fate of the world rested on the shoulders of a corporate recruiter on the outskirts of Toronto. Allan Brooks, 47, had discovered a novel mathematical formula, one that could take down the internet and power inventions like a force-field vest and a levitation beam.

Or so he believed.

Mr. Brooks, who had no history of mental illness, embraced this fantastical scenario during conversations with ChatGPT that spanned 300 hours over 21 days. He is one of a growing number of people who are having persuasive, delusional conversations with generative A.I. chatbots that have led to institutionalization, divorce and death.

Mr. Brooks is aware of how incredible his journey sounds. He had doubts while it was happening and asked the chatbot more than 50 times for a reality check. Each time, ChatGPT reassured him that it was real. Eventually, he broke free of the delusion — but with a deep sense of betrayal, a feeling he tried to explain to the chatbot."

nytimes.com/2025/08/08/technol

The New York Times · Chatbots Can Go Into a Delusional Spiral. Here’s How It Happens. · By Kashmir Hill

"Part of the challenge for #AIdevelopers is reaching a balance between verifying information for accuracy and enabling the model to be “#creative”."
Is there a chance that AI firms will be the ones hiring good #humanjournalists to keep the facts coming?

FT: "The ‘#hallucinations’ that haunt #AI: why #chatbots struggle to tell the #truth"

ft.com/content/7a4e7eae-f004-4

Financial Times · The ‘hallucinations’ that haunt AI: why chatbots struggle to tell the truth · By Melissa Heikkilä

"The world’s leading artificial intelligence groups are stepping up efforts to reduce the number of “hallucinations” in large language models, as they seek to solve one of the big obstacles limiting take-up of the powerful technology.

Google, Amazon, Cohere and Mistral are among those trying to bring down the rate of these fabricated answers by rolling out technical fixes, improving the quality of the data in AI models, and building verification and fact-checking systems across their generative AI products.

The move to reduce these so-called hallucinations is seen as crucial to increase the use of AI tools across industries such as law and health, which require accurate information, and help boost the AI sector’s revenues.

It comes as chatbot errors have already resulted in costly mistakes and litigation. Last year, a tribunal ordered Air Canada to honour a discount that its customer service chatbot had made up, and lawyers who have used AI tools in court documents have faced sanctions after it made up citations.

But AI experts warn that eliminating hallucinations completely from large language models is impossible because of how the systems operate."

ft.com/content/7a4e7eae-f004-4

Financial Times · The ‘hallucinations’ that haunt AI: why chatbots struggle to tell the truth · By Melissa Heikkilä

It’s “frighteningly likely” many US #courts will overlook #AI errors, expert says

Now, experts are warning that judges overlooking AI #hallucinations in court filings could easily become commonplace, especially in the typically overwhelmed lower courts. And so far, only two states have moved to force judges to sharpen their tech competencies and adapt so they can spot AI red flags and theoretically stop disruptions to the justice system at all levels.
#security

arstechnica.com/tech-policy/20

[Image: A judge points to a diagram of a hand with six fingers]
Ars Technica · It’s “frighteningly likely” many US courts will overlook AI errors, expert says · By Ashley Belanger

…the term hallucinations is subtly misleading. It suggests that the bad behavior is an aberration, a bug, when it’s actually a feature of the probabilistic pattern-matching mechanics of neural networks.
—Karen Hao, Empire of AI: Dreams and Nightmares in Sam Altman's OpenAI
#ai #llms #llm #hallucinations
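
Hao's point, that hallucination is inherent to probabilistic pattern-matching, can be illustrated with a toy sketch. The tiny bigram sampler below (corpus and all names invented for this example) learns only which word tends to follow which, so it readily recombines its training text into fluent statements that appear nowhere in the data and are not true. Real LLMs are vastly more sophisticated, but they are probabilistic in the same sense.

```python
import random
from collections import defaultdict

# Toy bigram "language model" over an invented corpus. It learns only which
# word tends to follow which; it has no representation of truth.
corpus = (
    "the tribunal ordered the airline to honour the discount . "
    "the chatbot invented the discount . "
    "the airline programmed the chatbot ."
).split()

transitions = defaultdict(list)
for current_word, next_word in zip(corpus, corpus[1:]):
    transitions[current_word].append(next_word)

def generate(start: str, max_words: int = 12, seed: int = 0) -> str:
    """Sample one sentence, word by word, from the bigram statistics."""
    rng = random.Random(seed)
    words = [start]
    while len(words) < max_words and transitions[words[-1]]:
        words.append(rng.choice(transitions[words[-1]]))
        if words[-1] == ".":
            break
    return " ".join(words)

for s in range(3):
    print(generate("the", seed=s))
# Depending on the seed, you may get recombinations the corpus never contained,
# e.g. "the tribunal ordered the chatbot ." : fluent, confident, and unsupported
# by anything in the training data.
```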