#docvqa

michabbb<p><a href="https://social.vivaldi.net/tags/TechNews" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>TechNews</span></a>: <a href="https://social.vivaldi.net/tags/Qwen" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Qwen</span></a> Releases New <a href="https://social.vivaldi.net/tags/VisionLanguage" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>VisionLanguage</span></a> <a href="https://social.vivaldi.net/tags/LLM" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>LLM</span></a> Qwen2-VL 🖥️👁️</p><p>After a year of development, <a href="https://social.vivaldi.net/tags/Qwen" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Qwen</span></a> has released Qwen2-VL, its latest <a href="https://social.vivaldi.net/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> system for interpreting visual and textual information. 🚀</p><p>Key Features of Qwen2-VL:</p><p>1. 🖼️ Image Understanding:</p><p> Qwen2-VL achieves state-of-the-art performance on <a href="https://social.vivaldi.net/tags/VisualUnderstanding" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>VisualUnderstanding</span></a> benchmarks including <a href="https://social.vivaldi.net/tags/MathVista" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>MathVista</span></a>, <a href="https://social.vivaldi.net/tags/DocVQA" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DocVQA</span></a>, <a href="https://social.vivaldi.net/tags/RealWorldQA" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>RealWorldQA</span></a>, and <a href="https://social.vivaldi.net/tags/MTVQA" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>MTVQA</span></a>. </p><p>2. 🎬 Video Analysis:</p><p> Qwen2-VL can analyze videos over 20 minutes in length. 
This is achieved through online streaming capabilities, allowing for video-based <a href="https://social.vivaldi.net/tags/QuestionAnswering" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>QuestionAnswering</span></a>, <a href="https://social.vivaldi.net/tags/Dialog" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Dialog</span></a>, and <a href="https://social.vivaldi.net/tags/ContentCreation" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ContentCreation</span></a>. <a href="https://social.vivaldi.net/tags/VideoAnalysis" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>VideoAnalysis</span></a></p><p>3. 🤖 Device Integration:</p><p> The <a href="https://social.vivaldi.net/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> can be integrated with <a href="https://social.vivaldi.net/tags/mobile" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>mobile</span></a> phones, <a href="https://social.vivaldi.net/tags/robots" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>robots</span></a>, and other devices. It uses reasoning and decision-making abilities to interpret visual environments and text instructions for device control. <a href="https://social.vivaldi.net/tags/AIAssistants" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AIAssistants</span></a> 📱</p><p>4. 🌍 Multilingual Capabilities:</p><p> Qwen2-VL understands text in images across multiple languages. It supports most European languages, Japanese, Korean, Arabic, Vietnamese, among others, in addition to English and Chinese. 
<a href="https://social.vivaldi.net/tags/MultilingualAI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>MultilingualAI</span></a></p><p>This release represents an advancement in <a href="https://social.vivaldi.net/tags/ArtificialIntelligence" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>ArtificialIntelligence</span></a>, combining visual perception and language understanding. 🧠 Potential applications include <a href="https://social.vivaldi.net/tags/education" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>education</span></a>, <a href="https://social.vivaldi.net/tags/healthcare" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>healthcare</span></a>, <a href="https://social.vivaldi.net/tags/robotics" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>robotics</span></a>, and <a href="https://social.vivaldi.net/tags/contentmoderation" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>contentmoderation</span></a>.</p><p><a href="https://github.com/QwenLM/Qwen2-VL" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">github.com/QwenLM/Qwen2-VL</span><span class="invisible"></span></a></p>