sigmoid.social is one of the many independent Mastodon servers you can use to participate in the fediverse.
A social space for people researching, working with, or just interested in AI!

#fp8

Hacker News<p>FP8 is ~100 tflops faster when the kernel name has "cutlass" in it</p><p><a href="https://twitter.com/cis_female/status/1943069934332055912" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">twitter.com/cis_female/status/</span><span class="invisible">1943069934332055912</span></a></p><p><a href="https://mastodon.social/tags/HackerNews" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HackerNews</span></a> <a href="https://mastodon.social/tags/FP8" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>FP8</span></a> <a href="https://mastodon.social/tags/tflops" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>tflops</span></a> <a href="https://mastodon.social/tags/cutlass" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>cutlass</span></a> <a href="https://mastodon.social/tags/performance" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>performance</span></a> <a href="https://mastodon.social/tags/optimization" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>optimization</span></a> <a href="https://mastodon.social/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a></p>
Benjamin Carr, Ph.D. 👨🏻‍💻🧬<p><a href="https://hachyderm.io/tags/JackDongarra" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>JackDongarra</span></a> Makes a Stand for Traditional <a href="https://hachyderm.io/tags/HPC" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HPC</span></a>: "US still doesn’t have a clear, long-term plan for what comes next.... U.S. risks falling behind."</p><p>Challenges to high-performance computing threaten <a href="https://hachyderm.io/tags/US" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>US</span></a> <a href="https://hachyderm.io/tags/innovation" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>innovation</span></a></p><p>The <a href="https://hachyderm.io/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> boom has led chip makers to focus on <a href="https://hachyderm.io/tags/FP16" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>FP16</span></a> and <a href="https://hachyderm.io/tags/FP8" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>FP8</span></a>, not the <a href="https://hachyderm.io/tags/FP64" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>FP64</span></a> used by scientific research. If chip companies stop making the parts that <a href="https://hachyderm.io/tags/scientists" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>scientists</span></a> need, then it could become harder to do important research. <br><a href="https://theconversation.com/challenges-to-high-performance-computing-threaten-us-innovation-255188" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="ellipsis">theconversation.com/challenges</span><span class="invisible">-to-high-performance-computing-threaten-us-innovation-255188</span></a></p>
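The FP64-versus-narrow-format concern in the post above can be made concrete with a short sketch. This is an illustration, not anything from the linked article: Python floats are FP64, and the stdlib `struct` format code `'e'` rounds through IEEE 754 half precision (FP16), so we can watch a running sum degrade when every partial result is forced into the narrow format.

```python
import struct

def to_fp16(x: float) -> float:
    """Round a Python float (FP64) through IEEE 754 half precision and back."""
    return struct.unpack('e', struct.pack('e', x))[0]

# Sum 10,000 copies of 0.1. In FP64 the result is ~1000 to many digits.
# In FP16 the running sum stalls once the increment falls below half an
# ulp of the accumulator (at 256.0, where the FP16 spacing is 0.25).
fp64_sum = 0.0
fp16_sum = 0.0
for _ in range(10_000):
    fp64_sum += 0.1
    fp16_sum = to_fp16(fp16_sum + to_fp16(0.1))

print(f"FP64 sum: {fp64_sum:.4f}")  # 1000.0000
print(f"FP16 sum: {fp16_sum:.4f}")  # 256.0000
```

The same effect, scaled up, is why iterative scientific solvers that accumulate over millions of steps lean on FP64 even as AI hardware optimizes for FP16/FP8 throughput.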
Hacker News<p>DeepSeek Open Sources DeepGEMM: Clean and efficient FP8 GEMM kernels — <a href="https://github.com/deepseek-ai/DeepGEMM" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://</span><span class="">github.com/deepseek-ai/DeepGEMM</span><span class="invisible"></span></a><br><a href="https://mastodon.social/tags/HackerNews" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>HackerNews</span></a> <a href="https://mastodon.social/tags/DeepSeek" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DeepSeek</span></a> <a href="https://mastodon.social/tags/DeepGEMM" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DeepGEMM</span></a> <a href="https://mastodon.social/tags/FP8" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>FP8</span></a> <a href="https://mastodon.social/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> <a href="https://mastodon.social/tags/Kernels" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Kernels</span></a> <a href="https://mastodon.social/tags/OpenSource" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>OpenSource</span></a></p>
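DeepGEMM itself is CUDA targeting Hopper tensor cores; as a rough, hedged illustration of what any scaled FP8 GEMM does, here is a pure-Python sketch (helper names are my own, not DeepGEMM's API). It simulates round-to-nearest OCP E4M3 quantization (3 mantissa bits, max finite value 448) and computes `C = s_a * s_b * (Q(A/s_a) @ Q(B/s_b))` with a wide accumulator, mirroring the quantize-multiply-rescale structure of FP8 GEMM kernels.

```python
import math

E4M3_MAX = 448.0  # largest finite OCP FP8 E4M3 value

def quantize_e4m3(x: float) -> float:
    """Round x to the nearest representable E4M3 value (saturating at +-448)."""
    if x == 0.0:
        return 0.0
    sign = math.copysign(1.0, x)
    mag = min(abs(x), E4M3_MAX)
    # Binade of mag, clamped at -6 so values below 2**-6 use the subnormal step.
    e = max(math.floor(math.log2(mag)), -6)
    ulp = 2.0 ** (e - 3)  # 3 mantissa bits -> 8 steps per binade
    return sign * round(mag / ulp) * ulp

def fp8_gemm(a, b, scale_a, scale_b):
    """C = scale_a * scale_b * (Q(A/scale_a) @ Q(B/scale_b)), accumulated in FP64."""
    n, k, m = len(a), len(b), len(b[0])
    qa = [[quantize_e4m3(x / scale_a) for x in row] for row in a]
    qb = [[quantize_e4m3(x / scale_b) for x in row] for row in b]
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            c[i][j] = scale_a * scale_b * sum(qa[i][t] * qb[t][j] for t in range(k))
    return c
```

A typical per-tensor scale is `max(abs(X)) / 448` so the largest entry lands on the E4M3 ceiling; real kernels like DeepGEMM use finer-grained (per-block) scaling to control quantization error.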
michabbb<p>Introducing Phind-405B and faster, high quality <a href="https://social.vivaldi.net/tags/AI" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AI</span></a> answers for everyone</p><p>🚀 Phind-405B: New flagship <a href="https://social.vivaldi.net/tags/llm" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>llm</span></a>, based on Meta Llama 3.1 405B, designed for programming &amp; technical tasks. <a href="https://social.vivaldi.net/tags/Phind405B" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Phind405B</span></a></p><p>⚡ 128K-token context (32K available at launch), 92% on HumanEval, great for web app design. <a href="https://social.vivaldi.net/tags/Programming" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Programming</span></a> <a href="https://social.vivaldi.net/tags/AIModel" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>AIModel</span></a></p><p>💡 Trained on 256 H100 GPUs with FP8 mixed precision, 40% memory reduction. <a href="https://social.vivaldi.net/tags/DeepSpeed" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DeepSpeed</span></a> <a href="https://social.vivaldi.net/tags/FP8" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>FP8</span></a></p><p>⚡ Phind Instant Model: Super fast, 350 tokens/sec, based on Meta Llama 3.1 8B. <a href="https://social.vivaldi.net/tags/PhindInstant" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>PhindInstant</span></a></p><p>🚀 Runs on NVIDIA TensorRT-LLM with flash decoding, fused CUDA kernels. 
<a href="https://social.vivaldi.net/tags/NVIDIA" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>NVIDIA</span></a> <a href="https://social.vivaldi.net/tags/GPUs" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>GPUs</span></a></p><p>🔍 Faster Search: Prefetches results, saves up to 800ms latency, better embeddings. <a href="https://social.vivaldi.net/tags/FastSearch" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>FastSearch</span></a></p><p>👨‍💻 Goal: Help developers experiment faster, new features coming soon! <a href="https://social.vivaldi.net/tags/DevTools" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>DevTools</span></a> <a href="https://social.vivaldi.net/tags/Innovation" class="mention hashtag" rel="nofollow noopener" target="_blank">#<span>Innovation</span></a></p><p><a href="https://www.phind.com/blog/introducing-phind-405b-and-better-faster-searches" rel="nofollow noopener" translate="no" target="_blank"><span class="invisible">https://www.</span><span class="ellipsis">phind.com/blog/introducing-phi</span><span class="invisible">nd-405b-and-better-faster-searches</span></a></p>
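The post's FP8 memory claim is plausible from parameter width alone. A hedged back-of-envelope (my numbers, not Phind's: it assumes weight storage dominates and ignores optimizer state and activations):

```python
PARAMS = 405e9  # Llama 3.1 405B parameter count

bf16_bytes = PARAMS * 2  # 16-bit weights: 2 bytes each
fp8_bytes = PARAMS * 1   # 8-bit weights: 1 byte each

print(f"BF16 weights: {bf16_bytes / 1e9:.0f} GB")  # 810 GB
print(f"FP8 weights:  {fp8_bytes / 1e9:.0f} GB")   # 405 GB
# Weight storage alone halves; an end-to-end ~40% reduction is consistent
# once optimizer state and activations kept in higher precision are counted.
```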
Charlie Blake<p>Glad to be on here! My <a href="https://sigmoid.social/tags/introduction" class="mention hashtag" rel="tag">#<span>introduction</span></a>:</p><p>I&#39;m an AI researcher in the UK, working at Graphcore - a semiconductor company that develops the <a href="https://sigmoid.social/tags/IPU" class="mention hashtag" rel="tag">#<span>IPU</span></a> (a <a href="https://sigmoid.social/tags/GPU" class="mention hashtag" rel="tag">#<span>GPU</span></a> alternative) 💻 I joined last year, having previously been at Oxford for my MSc.</p><p>My interests are in <a href="https://sigmoid.social/tags/numerics" class="mention hashtag" rel="tag">#<span>numerics</span></a> (especially <a href="https://sigmoid.social/tags/fp8" class="mention hashtag" rel="tag">#<span>fp8</span></a> 8️⃣), <a href="https://sigmoid.social/tags/LLMs" class="mention hashtag" rel="tag">#<span>LLMs</span></a>, mixture-of-expert models, and anything to do with <a href="https://sigmoid.social/tags/solitaire" class="mention hashtag" rel="tag">#<span>solitaire</span></a> ♣️ ♦️ </p><p>Thanks to <span class="h-card" translate="no"><a href="https://sigmoid.social/@thegradient" class="u-url mention">@<span>thegradient</span></a></span> for making this happen 😃</p>