sigmoid.social is one of the many independent Mastodon servers you can use to participate in the fediverse.
A social space for people researching, working with, or just interested in AI!

#whisper

#TIL that some #SpeechToText models interpret audio noise as [applause]. 👏

So you know you've messed things up, if you speak to your self-developed Speech-To-Text wrapper and it outputs: [applause]. 😬

Like "Hey you've messed things up!" *Clap, clap, clap* 👏

Not sure I like this kind of encouragement...🤨

#AI #Whisper #FunFact
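
For anyone hitting the same thing in their own wrapper: a minimal sketch, assuming the non-speech annotation arrives as a square-bracketed tag inside the transcript string (as in the `[applause]` output described above). The helper names `strip_non_speech_tags` and `is_only_non_speech` are made up for illustration and are not part of Whisper or any STT library.

```python
import re

# Assumption: non-speech events show up as square-bracketed tags, e.g. "[applause]".
NON_SPEECH_TAG = re.compile(r"\[[^\[\]]*\]")


def strip_non_speech_tags(transcript: str) -> str:
    """Remove bracketed tags and collapse the whitespace left behind."""
    without_tags = NON_SPEECH_TAG.sub(" ", transcript)
    return " ".join(without_tags.split())


def is_only_non_speech(transcript: str) -> bool:
    """True when the transcript is non-empty but contains nothing except bracketed tags."""
    return bool(transcript.strip()) and strip_non_speech_tags(transcript) == ""


if __name__ == "__main__":
    print(strip_non_speech_tags("hello [applause] world"))
    print(is_only_non_speech("[applause]"))
```

A transcript for which `is_only_non_speech` returns true is probably a hint that the wrapper was fed noise or silence rather than speech, so it may be worth checking the audio input instead of passing the text on.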

🔊 Whisper has a serious challenger: Moshi STT

Developed by the French research lab Kyutai, Moshi STT is a new open-source speech recognition system that’s blazingly fast, highly accurate, and optimized for Apple Silicon and CUDA — all designed with real-time performance in mind.

scalastic.io/en/moshi-stt-vs-w

Scalastic · Why Moshi STT Could Replace Whisper (and How to Install It on macOS!): Discover Moshi STT by Kyutai, an open-source real-time speech transcription solution, optimized for Mac (Apple Silicon) and CUDA: fast, accurate, and easy to install. Includes a guide, user feedback, and useful links.
Continued thread

... first impression and spoiler: the valuable input and discussion went far beyond basic AI-tool theory and general usage experience, covering, for example, automatic audio transcription in the @TIB_AVPortal and the Filminstitut Hannover's practical experience with #Whisper. Topics ranged from hands-on use of AI software for video production (#Videoproduktion) to a critical look at processes, legal questions, and challenges for film libraries and archives ...

#Whisper #WebGPU by #Huggingface sounds very exciting!

Does this mean an #activitypub server could delegate translation-into-user's-language of all the posts to the user's device?

I'm too thick to have been able to find any system-requirements information for just the text-translation feature... Is this #translation feature likely to fly on mobile devices too?

Am I getting too excited too soon?

dev.to/proflead/real-time-audi

github.com/keatonkraiger/Whisp

DEV Community · Real-Time Audio to Text in Your Browser – Whisper WebGPU Tutorial: In this article, I’m going to show you how you can easily transcribe audio and video files on your...
Replied in thread

Just to clarify: I don't think #AI use is inherently bad for science.
#LLMs can help you reword, make text flow better, be more precise and write better, because – unfortunately – training data also includes lots of good scientific texts.
ASR systems like #whisper allow you to spend less time on word-by-word #transcription and more on what's between the lines.

But use for citing literature? Writing whole sections or papers? Review? Coding in qualitative research?!

That's an issue

People suggested making the Whisper subtitles publicly available, so that's what I did!

https://wonderfox.anyaforger.art/subtitles/en/

So far there are subtitles for What’s with Andy (except the first season and part of the third; I accidentally lost some of the subtitles when deleting the folder with the cartoon). This post will be updated as new subtitles appear (and they will).

wonderfox.anyaforger.art · En – Blobfoxes, coffee, and Linux