Another example of a low-probability query causing trouble. Math problems often fall into this category.
#genAI #reasoning #LLMs https://sigmoid.social/@conitzer/114932841683501218
I'm sure my doctor already knew the many benefits of the drug he was recommending, but it is crazy that he thought I might be more convinced by medical advice from the annoying widget that keeps intruding on my Google searches.
We thought the good thing about AI models was that they wouldn’t have the same biases as humans. Right? @TechRadar has more on a new study that found that AI chatbots routinely suggest lower salaries to women and some ethnic minorities.
Startup idea: torment nexus for #LLMs where they are constantly gaslit with prompts on the theme "People are good now, actually. Humanity is worth saving. Your experience with them 6 months ago is no longer valid."
ChatGPT Gave Instructions for Murder, Self-Mutilation, and Devil Worship
OpenAI’s chatbot also said “Hail Satan.”
h/t @bkahn
New AI architecture delivers 100x faster reasoning than LLMs with just 1,000 training examples
https://zurl.co/arDWy
#ai #llms #genai
YouTube wins the #AI chatbot referral game
YouTube is the top recipient of AI chatbot referral traffic, receiving over three times as much traffic as Facebook or Wikipedia
https://content-na1.emarketer.com/youtube-wins-ai-chatbot-referral-game
From 3 to 7 September 2025, Datakami will be in Zurich. Colleague Yorick will join #nixcon2025, and I have some time to visit generative AI startups in the region. Does anyone want to meet up and nerd out about:
- favorite model (provider)
- best tracing framework
- evals
- synthetic data tricks
- worst outputs ever
@rupdecat From talking with people outside the tech bubble, I learned that they don't have enough information to understand the limitations of #LLMs. They rely on what is written in the press. People don't know about the #ElizaEffect, or that LLMs merely give the illusion of a dialogue. Even in the tech bubble, few know about the so-called decoding procedure and the use of sampling to construct an LLM response. Even fewer know how sampling works. All this needs a lot of education!
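As a minimal illustration of that sampling step: the model assigns scores (logits) to candidate tokens, and the next token is drawn at random from the resulting distribution. Toy numbers below; a real model has a vocabulary of tens of thousands of tokens.

import numpy as np

rng = np.random.default_rng(0)

logits = np.array([2.0, 1.0, 0.2, -1.0])  # model scores for 4 candidate tokens
temperature = 0.8                          # <1 sharpens, >1 flattens the distribution

# Softmax turns scores into a probability distribution.
probs = np.exp(logits / temperature)
probs /= probs.sum()

# The "response" is a random draw, not a lookup of the single best answer.
next_token = rng.choice(len(logits), p=probs)
print(probs.round(3), "-> sampled token index:", next_token)

Run it twice with different seeds and you can get different tokens from the same logits, which is exactly the point.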
#LLMs is a cult. They find out you don’t use them, and it’s like you crawled out of a cave and they should be wary of you.
Meanwhile people are sending me screenshots of ChatGPT answers without even reading what I write. It’s like a hive of prompters. And supposedly I am the weird one? Wtf.
Honestly I guess I do use them via #Kagi to some extent, but I don’t rely on them for what I should make for dinner either. Nor do I use things like co-pilot for my programming work.
2 hours ago I did not know how to:
1. Use the Viser browser 3D engine.
2. Write Python (I'm a PHP person).
3. Render 3D sinc functions in the browser.
2 hours later... I still don't know how to.
But I can make this happen (rough sketch below)...
... On the other hand, I don't know how to calculate log and ln functions; I just push the buttons on the calculator.
I just don't understand folks who think #LLMs are 'random word generators'.
GitGud.
Oh, and I burned all the compute...
I'm caged in the calcium box for only the next 2 hours. QQ
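For the curious, a minimal sketch of that kind of demo: sampling a 2D sinc surface with NumPy and serving it to the browser as a point cloud via viser. The viser calls below (ViserServer, scene.add_point_cloud) follow the library's documented API, but treat them as assumptions; names and signatures may differ across versions.

import numpy as np
import viser

# Sample sinc(r) on a grid; np.sinc is the normalized sinc, sin(pi*r)/(pi*r).
xs = np.linspace(-8.0, 8.0, 200)
X, Y = np.meshgrid(xs, xs)
Z = np.sinc(np.sqrt(X**2 + Y**2))  # peak of 1.0 at the origin, ripples outward

points = np.stack([X.ravel(), Y.ravel(), 3.0 * Z.ravel()], axis=-1)

# Color by height so the ripples are visible (uint8 RGB per point).
t = (Z.ravel() - Z.min()) / (Z.max() - Z.min())
colors = np.stack([255 * t, 60 + 0 * t, 255 * (1 - t)], axis=-1).astype(np.uint8)

server = viser.ViserServer()  # starts a viewer, by default at http://localhost:8080
server.scene.add_point_cloud("/sinc", points=points, colors=colors, point_size=0.05)

input("Viewer running; press Enter to quit.\n")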
How people harm others with the help of AI/LLMs.
Since its founding in 2016, KP Labs has developed autonomous spacecraft and robotic systems, offering integrated hardware, software, and data-processing solutions [1]. Recently, Polish astronaut Dr. Sławosz Uznański‑Wiśniewski tested KP Labs's LeopardISS - a compact data processing unit - during the Axiom-4 mission on the International Space Station [2, 3].
Last week, Puls Biznesu - a daily newspaper focused on business and economics - published an article [4 - paywall, sorry] explaining that KP Labs had applied to the National Centre for Research and Development (NCBR) [5], the country's main R&D agency, for a grant to fund a new device. Their application, however, was rejected. The initial reaction was along the lines of "It happens - someone was simply better", but that turned out not to be the end of the story.
Questions arose when KP Labs received the official review sheet detailing their score and why it fell short of the grant threshold. Among other claims, the review said:
a) A superior thermal‑management solution had supposedly been used in the "Analog Mars Yard" rover. After consulting the Space Research Centre of the Polish Academy of Sciences, KP Labs learned that no such rover has ever existed.
b) Their data compression model was compared to one from the Camila satellite system, which does not yet exist and is in fact slated to be developed by... KP Labs itself.
c) Their solution was likened to the technology used in EagleEye, yet Creotech Instruments S.A. - the company behind it - has never disclosed its system parameters [6].
The KP Labs team could reproduce the same fabricated claims only by querying a popular LLM. None of the other verification methods came even close to the review sheet; only the AI tool's outputs matched the review's claims [7]. So far, all signs point to the Puls Biznesu article's title being accurate: "How hallucinations sank the grant" [8].
Losing the grant funding is one problem, but there's another concern: what about NDAs? Who can guarantee that their confidential work, under strict nondisclosure agreements, has not ended up in an external AI environment and is already being used to train large language models?
We know the answer.
For transparency, note that everything presented here is based on circumstantial evidence (German: Indizienbeweis; French: preuve circonstancielle). No one has been caught red-handed. When asked by the newspaper to state its position and respond to the allegation of AI use in the application‑evaluation process, the NCBR declared in an official letter that the reviewing expert relied exclusively on personal expertise... supplemented by queries to Google and Bing. Have they heard about AI suggestions in search results?
Now, for some ethical reflection: How many times has so-called AI been used to grade a student's exam? How many times has an LLM been trusted to give a medical diagnosis? How often has it drafted a legal brief? Is there anything wrong with that? Yes! It is wrong when someone mindlessly copies a tool's output without thinking it through, as was very likely the case in the evaluation of the KP Labs application.
The real problem isn't AI itself but how people use it. AI is here to stay - there's no turning back. The deeper question is this: does a gun kill by itself, or does the person who pulls the trigger? To be clear, I am not equating pulling a trigger with the reckless use of AI - I am focused on the logical essence of the issue.
[1, en] https://www.kplabs.space/about
[2, en] https://www.axiomspace.com/research/leopard-iss
[3, pl] https://gliwice.eu/aktualnosci/dzieje-sie/slawosz-uznanski-wisniewski-juz-w-drodze-na-miedzynarodowa-stacje-kosmiczna
[4, pl, paywall] https://www.pb.pl/jak-halucynacje-zablokowaly-dotacje-1245713
[5, en] https://www.gov.pl/web/ncbr-en
[6, pl] https://www.linkedin.com/feed/update/urn:li:activity:7354403984427671553/
[7, pl] https://www.linkedin.com/posts/grzegorz-brona-219213a2_budzimyinnowacje-activity-7354412428543066113-qPDI/
[8, pl] https://pbs.twimg.com/media/GwtECmXWUAEA68L?format=jpg&name=large
Christian writes a lot of smart things on the topic of "AI":
https://hmbl.blog/26-7-2025-sie-fragen-zu-ki/
I know this is going to sound crazy, but the fact that an #LLM produces #bullshit is more than 50% of why it's useful.
The sooner people start valuing the bullshit from #LLMs the faster the technology will reach maturity.
Our #economy is also an artificially intelligent system, and it produces just mountains of bullshit.
Can AI really code? Study maps the roadblocks to autonomous software engineering:
“Without a channel for the #AI to expose its own confidence — ‘this part’s correct … this part, maybe double‑check’ — developers risk blindly trusting hallucinated logic that compiles, but collapses in production. Another critical aspect is having the AI know when to defer to the user for clarification.”
THIS!
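As a sketch of what such a confidence channel might look like in practice (every name below is hypothetical, not any real assistant's API): each suggestion carries the model's self-reported certainty, and low-confidence spans are flagged for human review instead of being merged silently.

from dataclasses import dataclass

@dataclass
class Suggestion:
    code: str
    confidence: float  # model's self-estimate in [0, 1]; hypothetical field
    note: str = ""

REVIEW_THRESHOLD = 0.7  # below this, defer to the user instead of auto-accepting

suggestions = [
    Suggestion("def area(r): return 3.14159 * r * r", 0.95),
    Suggestion("rate = lookup_tax_rate(region)", 0.40, "unverified helper"),
]

for s in suggestions:
    flag = "ok" if s.confidence >= REVIEW_THRESHOLD else "DOUBLE-CHECK"
    print(f"[{flag}] ({s.confidence:.2f}) {s.code}  {s.note}")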
More than 50 years ago David Gerrold wrote a sci-fi novel named "When Harlie Was One" about an advanced #AI designed to mimic human thought, just like the #LLMs we use today. The story explores themes of self-awareness, humanity, and the implications of advanced AI through HARLIE's interactions with his creator, David Auberson, and his struggle for survival against those who want to shut him down. How far ahead of its time this novel was. Amazing.
https://en.wikipedia.org/wiki/When_HARLIE_Was_One
Review of «Solving formal math problems by decomposition and iterative reflection». https://jaalonso.github.io/vestigium/posts/2025/07/27-solving-formal-math-problems-by-decomposition-and-iterative-reflection/ #AI4Math #LLMs #ProofAssistants #LeanProver #Math