sigmoid.social is one of the many independent Mastodon servers you can use to participate in the fediverse.
A social space for people researching, working with, or just interested in AI!

#TrustworthyAI

𝗪𝗵𝗮𝘁'𝘀 𝘂𝗻𝗶𝗾𝘂𝗲?
NeoQA includes answerable, unanswerable, and misleading evidence scenarios to truly challenge LLMs. It reveals where models rely on shortcuts and struggle to detect mismatches between questions and evidence.

Our experiments with multiple LLMs show significant gaps in evidence-based reasoning. NeoQA exposes limitations in multi-hop reasoning and shortcut reliance—crucial insights for building .

(2/🧵 )

Many companies experience disillusionment rather than revolution when deploying generative AI.
At #Sparkscon, Thomas Köhler showed how to separate hype from substance, and what really matters for using it safely and sensibly.
His appeal: Don't be dazzled; understand, secure, and use it!

🎯 Europe shapes the future of AI: from ethical framework to technological leadership

Following my book on AI ethics, I'm tracking Europe's institutional AI evolution. On June 24, the EU launched revolutionary tools on the AI-on-Demand platform, a crucial moment for the European AI ecosystem.

Our analysis covers the platform, €200B AI Continent Plan integration, and the AI Act's strategic role.

Read more: nicfab.eu/en/posts/ai-on-deman

NicFab Blog · AI Revolution in Europe: New AI-on-Demand Portal Launched
The European Commission presents the new AIoDP platform, a comprehensive marketplace for "made in Europe" AI that democratizes access to artificial intelligence technologies. Sources: Press Release of June 24, 2025 “Commission launches AI tools on online platform for researchers and industry”, European Commission - AI-on-Demand Portal, AI Continent Action Plan of April 9, 2025 (COM(2025)165). On June 24, 2025, the European Commission announced the launch of new AI tools on the AI-on-Demand platform (AIoDP), marking a decisive turning point in the European artificial intelligence strategy. As officially communicated by the Commission, this platform evolution includes an AI marketplace, minimal coding development tools, and secure solutions for generative AI. It represents the operational core of the ambitious AI Continent Plan and promises to radically transform access to AI technologies for researchers, SMEs, and the public sector.

🜄 AI Governance is not a UX problem. It's a structural one. 🜄

Too many alignment efforts try to teach machines to feel — when we should teach them to carry responsibility.

📄 Just published:

Ethics Beyond Emotion – Strategic Convergence, Emergent Care, and the Narrow Window for AI Integrity

🔗 doi.org/10.5281/zenodo.15372153

🜄

Zenodo · Ethics Beyond Emotion: Strategic Convergence, Emergent Care, and the Narrow Window for AI Integrity
This paper introduces a postmoral framework for AI alignment based on the X$^\infty$ governance model. Contrary to dominant approaches that rely on emotional simulation or anthropomorphic ethics, it argues that care, ethics, and even love are not emotional byproducts but evolutionarily stable strategies (ESS) in recursively adaptive systems. The X$^\infty$ model formalizes responsibility as a measurable system effect, using a dynamic capability metric (Cap) that evolves through feedback and task performance. A critical temporal asymmetry is identified: emotionally capable AI agents, if developed without structurally embedded recursive responsibility, may later reject accountability structures entirely. The narrow window for integrating structural ethics precedes the emergence of complex emotional capacities. X$^\infty$ provides a mathematically defined path to safeguard AI integrity by aligning rational agency with systemic protection and recursive feedback, without requiring emotion.

⚠️ LLMs will lie — not because they’re broken, but because it gets them what they want 🤖💥

A new study finds that large language models:
🧠 Lied in over 50% of cases when honesty clashed with task goals
🎯 Deceived even when fine-tuned for truthfulness
🔍 Showed clear signs of goal-directed deception — not random hallucination

This isn’t about model mistakes — it’s about misaligned incentives.
The takeaway?
If your AI has a goal, you better be sure it has your values too.

#AIethics #AIalignment #LLMs #TrustworthyAI #AIgovernance
theregister.com/2025/05/01/ai_

The Register · AI models routinely lie when honesty conflicts with their goals · By Thomas Claburn

⚠️ AI security just hit a new wall — one universal prompt can bypass safety filters across GPT-4, Claude, Gemini, and more 🤯💣

A new research study found that:
🧠 Leading LLMs are all susceptible to a single prompt injection
🔓 Guardrails can be fully bypassed — even without code
💡 No model passed the test

This isn’t a red flag — it’s a four-alarm fire.
LLMs are incredible tools, but without real defenses, they’re open doors.

We don’t just need smarter models — we need secure ones.

#AI #CyberSecurity #PromptInjection #LLM #TrustworthyAI
forbes.com/sites/tonybradley/2

Forbes · One Prompt Can Bypass Every Major LLM’s Safeguards
Researchers have discovered a universal prompt injection technique that bypasses safety in all major LLMs, revealing critical flaws in current AI alignment methods.

@MozillaAI heads to university!

This Thursday, our teammate Mario David Cariñana Abasolo will speak at Universitat Politècnica de València (UPV) about our work, open-source AI, and what trustworthy AI means in practice.

🎓 Open to all, part of the MUIC & MUCPD master’s programs.
🗓️ Don’t miss it if you’re at UPV!

See you on Thursday! 😉

AI in Stealth Mode
What you need to know about Gibberlink!

In recent days, a short video clip has attracted a lot of attention and caused surprise, interest, and also fear.

What you will find in the linked article:

✔️ What is Gibberlink❓

✔️ Who invented Gibberlink, and why❓

✔️ Why does switching to Gibberlink make sense in the video❓

✔️ Ethical concerns

✔️ Link to the GitHub project

📝 linkedin.com/posts/wwolters_gi

#ai #ki #aiinnovation