#IJCAI2025 Exploring Canada’s AI ecosystem with the IJCAI Local Organizing Committee at MILA, founded by Yoshua Bengio in 1993 and now among the global leaders in deep learning #DL.
Thesis: Efficient deep learning inference on end devices
GPU-centric Communication Schemes for HPC and ML Applications
AI is a hot topic these days, but what does that word even mean? Originally it was about our quest to understand our own minds, but it has come to refer to one technology: deep learning. We often talk about AI as if it were just a human mind in a box, but the reality is quite different, in nuanced ways that AI companies play down. In this month's blog post, I explore how AI relates to human intelligence, what it reproduces, and what it doesn't.
https://thinkingwithnate.wordpress.com/2025/04/02/how-is-ai-like-human-intelligence/
Moore’s Law for AI agents: the length of tasks that AIs can do is doubling about every 7 months.
These results appear robust. The authors were able to retrodict the trend back to GPT-2, and further experiments on SWE-bench Verified showed a similar pattern.
Read more: https://metr.org/blog/2025-03-19-measuring-ai-ability-to-complete-long-tasks
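The doubling claim is just exponential growth with a 7-month period. A minimal sketch (the starting horizon and time span below are hypothetical numbers for illustration, not METR's measurements):

```python
# Illustrative only: task-length horizon under a 7-month doubling trend.
def task_horizon(initial_minutes: float, months: float,
                 doubling_months: float = 7.0) -> float:
    """Task length an agent can complete after `months` of exponential growth."""
    return initial_minutes * 2.0 ** (months / doubling_months)

# Starting from a 1-hour horizon, 21 months is three doublings:
print(task_horizon(60, 21))  # 480.0 minutes, i.e. 8 hours
```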
The rise of AI research in a graph — see how its ArXiv submissions compare to other fields over the past decade. #AI #ArXiV #ArtificialIntelligence #DL #ML #CV #NLP #XAI #AIResearch #CS #ComputerScience #DataScience #Research #Revolution #AIBoom #AIRevolution
Self-Improving Reasoners.
Both expert human problem solvers and successful language models employ four key cognitive behaviors:
1. verification (systematic error-checking),
2. backtracking (abandoning failing approaches),
3. subgoal setting (decomposing problems into manageable steps), and
4. backward chaining (reasoning from desired outcomes to initial inputs).
Some language models naturally exhibit these reasoning behaviors and show substantial gains, while others don't and quickly plateau.
The presence of reasoning behaviors, not the correctness of answers, is the critical factor: models trained on incorrect solutions that contain proper reasoning patterns achieve performance comparable to models trained on correct solutions.
It seems that the presence of cognitive behaviors enables self-improvement through RL.
Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs
https://arxiv.org/abs/2503.01307
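Two of the behaviors above, verification and backtracking, are easy to picture as a toy search. A minimal sketch (the subset-sum problem and all names here are my own illustration, not from the paper; assumes non-negative inputs):

```python
# Toy backtracking search: verification = systematically checking each
# candidate; backtracking = abandoning a failing branch and trying another.
def find_subset(nums, target, chosen=()):
    """Return a subset of nums summing to target, or None."""
    if sum(chosen) == target:          # verification: check the candidate
        return list(chosen)
    if not nums or sum(chosen) > target:
        return None                    # backtrack: abandon failing approach
    head, *rest = nums
    return (find_subset(rest, target, chosen + (head,))  # try including head
            or find_subset(rest, target, chosen))        # else exclude it

print(find_subset([3, 9, 8, 4], 12))  # [3, 9]
```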
CRIUgpu: Transparent Checkpointing of GPU-Accelerated Workloads
TritonBench: Benchmarking Large Language Model Capabilities for Generating Triton Operators
#CUDA #CodeGeneration #LLM #DeepLearning #DL #Python #Package
Read any Deep Learning papers that made you do a double take?
Share them here, and we can make a list to blow each other's minds and get closer to actually understanding what the hell is going on. Boosts appreciated!
We've learned a ton about Deep Learning over the years, but in a fundamental way we still don't get it. There are tons of tricks we use without knowing why, and weird examples that work much better or much worse than you'd expect. We try to probe and visualize what's going on inside the black box, and what we find is often strange and hard to interpret.
I'm in an excellent class right now exploring the "surprises" of deep learning, reading papers like this to build a better understanding. I've shared a few of them here, but now I'm looking for more to share back with the class.
Any suggestions?
"We encourage the open source community, regulatory authorities and industry to continue to strive toward greater transparency and alignment with open source development principles when training and fine-tuning AI models" https://buff.ly/3Eyn85w #AI #ML #DL #NN #oss #opensource
I know this is provocative, but I agree with Kelsey: this is just a different type of software, and we know that a lot of best practices apply: open source, containers, CI/CD, etc. https://buff.ly/42TrnTu #AI #ML #DL #NN #oss #opensource
LLM guardrails explained for system admins https://buff.ly/3EEopbo #AI #ML #DL #NN #oss #opensource
Why is transparent, open data important to LLMs (Part 2)? https://buff.ly/3QhjQ9t #AI #ML #DL #NN #oss #opensource
Why is transparent, open data important to LLMs? https://buff.ly/42UZwSV #AI #ML #DL #NN #oss #opensource
Thesis: Towards autonomous resource management: Deep learning prediction of CPU-GPU load balancing