In today’s episode, I’m providing you with the five most important takeaways from the release of OpenAI’s long-anticipated GPT-5 model.
LLM Pre-Training and Post-Training 101, with Julien Launay
How are cutting-edge LLMs trained? Find out in today's exceptional episode with Julien Launay, who digs into pre-training (supervised learning) and post-training (reinforcement learning) in eloquent detail.
Julien:
• CEO and co-founder of Adaptive ML, a remarkably fast-growing startup focused on enabling A.I. models to learn from experience.
• Previously led the extreme-scale research teams at Hugging Face and LightOn, where he helped develop state-of-the-art open-source models.
• Organizer of the "Efficient Systems for Foundation Models" workshop at ICML (the prestigious International Conference on Machine Learning).
Today's episode will appeal most to hands-on practitioners, but other folks who are open to getting into the technical weeds on Large Language Model (LLM) training should also listen in.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
LLM Benchmarks Are Lying to You (And What to Do Instead), with Sinan Ozdemir
Sensational episode for you today with the illustrious A.I. author, educator and entrepreneur Sinan Ozdemir on how LLM benchmarks are lying to you... and what you can do about it.
Sinan:
Is Founder and CTO of LoopGenius, a generative A.I. startup.
Authored several excellent books, including, most recently, the bestselling "Quick Start Guide to Large Language Models".
Hosts the "Practically Intelligent" podcast.
Was previously adjunct faculty at The Johns Hopkins University and now teaches several times a month on the O'Reilly platform.
Is a serial A.I. entrepreneur; he founded a Y Combinator-backed generative A.I. startup way back in 2015 that was later acquired.
Holds a Master’s in Pure Math from Johns Hopkins.
Today’s episode skews slightly toward our more technical listeners, but Sinan excels at explaining complex concepts in a clear way, so it may appeal to any listener of this podcast.
In today’s episode, Sinan details:
Why the A.I. benchmarks everyone relies on might be lying to you.
How the leading A.I. labs are gaming the benchmark system.
Tricks for effectively evaluating LLMs’ capabilities for your use cases.
What the future of benchmarking will involve, including how to benchmark agentic and multimodal models.
How a simple question about watermelon seeds reveals the 40% failure rate of even today’s most advanced A.I. models.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
My Four-Hour Agentic AI Workshop is Live and 100% Free
In case you missed my post last week, my four-hour Agentic A.I. workshop (with Ed Donner) is live. 8,000 people have already watched it! Here's what they're saying:
Agentic AI Hands-On in Python: MCP, CrewAI and OpenAI Agents SDK (by Jon Krohn and Ed Donner)
Now live! Four hours long and 100% free, this hands-on workshop covers all the Agentic A.I. theory and tools you need to develop and deploy multi-agent teams with Python.
Beautifully shot by a professional film crew (led by the exceptional Lucie McCormick) at the Open Data Science Conference (ODSC) East in Boston a few weeks ago and then meticulously edited by SuperDataScience's inimitable Mario Pombo, this training (within the GenAI-forward Cursor IDE) features all of today's essential agent frameworks:
OpenAI Agents SDK
CrewAI
Anthropic's Model Context Protocol (MCP)
From design considerations through to practical implementation tips, by completing all four modules in this video, you will have all the knowledge and skills needed to create effective multi-agent systems. The four modules are:
Defining Agents
Designing Agents
Developing Agents
The Future of Agents
The coding elements are led by the wonderful Ed Donner, whom many of you will already know as one of the very best in the world at creating and teaching hands-on A.I. content.
We received rave reviews for the session at ODSC East and the lecture hall was standing-room only for the entire duration, so I anticipate that you'll love it too!
Watch the full training here: youtu.be/LSk5KaEGVk4
The “State of AI” Report 2025
In today’s Five-Minute Friday episode, I’ll cover the five biggest takeaways from the 2025 edition of the renowned AI Index Report, which was published a few weeks ago by the Stanford University Institute for Human-Centered AI. Every year this popular report — often called the “State of AI” report — covers the biggest technical advances, new achievements in benchmarking, investment flowing into AI and more. Here’s a link to the colossal full report in the show notes; today’s episode will cover the five most essential items.
Model Context Protocol (MCP) and Why Everyone’s Talking About It
Today we're diving into Model Context Protocol, or MCP – the hot topic taking the AI world by storm in early 2025.
Beyond GPUs: The Power of Custom AI Accelerators, with Emily Webber
The mind-blowing A.I. capabilities of recent years are made possible by vast quantities of specialized A.I.-accelerator chips. Today, AWS's (brilliant, amusing and Zen!) Emily Webber explains how these chips work.
Emily:
• Is a Principal Solutions Architect in the elite Annapurna Labs ML service team that is part of Amazon Web Services (AWS).
• Works directly on the Trainium and Inferentia hardware accelerators (for, respectively, training and making inferences with A.I. models).
• Also works on the NKI (Neuron Kernel Interface) that acts as a bare-metal language and compiler for programming AWS instances that use Trainium and Inferentia chips.
• Wrote a book on pretraining foundation models.
• Spent six years developing distributed systems for customers on Amazon’s cloud-based ML platform SageMaker.
• Leads the Neuron Data Science community as well as the technical aspects of the “Build On Trainium” program, a $110m compute-credit program for academic researchers.
Today’s episode is on the technical side and will appeal to anyone who’s keen to understand the relationship between today’s gigantic A.I. models and the hardware they run on.
In today’s episode, Emily details:
• The little-known story of how Annapurna Labs revolutionized cloud computing.
• What it takes to design hardware that can efficiently train and deploy models with billions of parameters.
• How Trainium2 became the most powerful A.I. chip on AWS.
• Why AWS is investing $110 million worth of compute credits in academic AI research.
• How meditation and Buddhist practice can enhance your focus and problem-solving abilities in tech.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
Serverless, Parallel, and AI-Assisted: The Future of Data Science is Here, with Zerve’s Dr. Greg Michaelson
What are "code nodes" and "RAG DAGs"? Listen to today's episode with the highly technical (but also highly hilarious) Dr. Greg Michaelson to get a glimpse into the future of data science and A.I. model development.
Greg:
Is a Co-Founder of Zerve AI, a super-cool platform for developing and delivering A.I. products that launched to the public on this very podcast a little over a year ago.
Previously spent 7 years as DataRobot’s Chief Customer Officer and 4 years as Senior Director of Analytics & Research for Travelers.
Was a Baptist pastor while he obtained his PhD in Applied Statistics!
Today’s episode is on the technical side and so will appeal most to hands-on practitioners like data scientists, AI/ML engineers and software developers… but Greg is such an engaging communicator that anyone interested in how the practice of data science is rapidly being revolutionized may enjoy today’s episode.
In it, Greg details:
How Zerve's collaborative, graph-based coding environment has matured over the past year, including their revolutionary 'Fleet' feature (in beta) that allows massive parallelization of code execution without additional cost.
How AI assistants are changing the coding experience by helping build, edit, and connect your data science projects.
Why the rise of LLMs might spell trouble for many SaaS businesses as building in-house solutions becomes increasingly viable.
The innovative ways companies are using retrieval-augmented generation (RAG) to create more powerful A.I. applications.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
LLMs and Agents Are Overhyped, with Dr. Andriy Burkov
Andriy Burkov's ML books are mega-bestsellers and his newsletter has a wild 900,000 subscribers. He seldom does interviews so don't miss today's episode, in which he takes compelling, contrarian views on LLMs and agents.
More on Dr. Burkov:
His indispensable "100-Page Machine Learning Book" seems to be on *every* data scientist / ML engineer's bookshelf.
He also wrote "ML Engineering" and his latest book, "The 100-Page Language Model Book", was released this year to rave reviews.
His "Artificial Intelligence" newsletter is subscribed to by 900,000 people on LinkedIn.
He's the Machine Learning Lead at TalentNeuron, a global labor-market analytics provider.
He runs his own book-publishing company, True Positive Inc.
Previously held data science / ML roles at Gartner, Fujitsu and more.
Holds a PhD in Computer Science (A.I.) from Université Laval in Quebec, where his doctoral dissertation focused on multi-agent decision-making — 15 years ago!
Despite Dr. Burkov being such a technical individual, most of today’s episode should appeal to anyone interested in A.I. (although some parts here and there will be particularly appealing to hands-on machine-learning practitioners).
In today’s episode, Andriy details:
Why he believes AI agents are destined to fail.
How he managed to create a chatbot that never hallucinates — by deliberately avoiding LLMs.
Why he thinks DeepSeek AI crushed Bay Area A.I. leaders like OpenAI and Anthropic.
What makes human intelligence distinct from that of all other animals and why A.I. researchers need to crack this in order to attain human-level intelligence in machines.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
OpenAI’s o3-mini: SOTA reasoning and exponentially cheaper
Today’s episode will fill you in on everything you need to know about an important model OpenAI recently released to the public called o3-mini.
Generative AI for Business, with Kirill Eremenko and Hadelin de Ponteves
Craving an intro to building and deploying commercially successful Generative A.I. applications? In today's episode, superstar data-science instructors Kirill and Hadelin (>5 million students between them) will fill you in!
Kirill Eremenko is one of our two guests today. He's:
Founder and CEO of SuperDataScience, an e-learning platform.
Founded the SuperDataScience Podcast in 2016 and hosted the show until he passed me the reins four years ago.
Our second guest is Hadelin de Ponteves:
Was a data engineer at Google before becoming a content creator.
In 2020, took a break from Data Science content to produce and star in a Bollywood film featuring "Miss Universe" Harnaaz Sandhu.
Together, Kirill and Hadelin:
Have created dozens of data science courses; they are the most popular data science instructors on the Udemy platform, with over five million students between them!
They also co-founded CloudWolf, an education platform for quickly mastering Amazon Web Services (AWS) certification.
And, in today’s episode, they announce (for the first time anywhere!) another brand-new venture they have co-founded.
Today’s episode is intended for anyone who’s interested in real-world, commercial applications of Generative A.I. — a technical background is not required.
In today’s episode, Kirill and Hadelin detail:
What generative A.I. models like Large Language Models are and how they fit within the broader category of “Foundation Models”.
The 12 crucial factors to consider when selecting a foundation model for a given application in your organization.
The 8 steps to deploying foundation models in a commercially successful way.
Many real-world examples of how companies are customizing A.I. models quickly and at remarkably low cost.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
AI Systems as Productivity Engines, with You.com’s Bryan McCann
Today, wildly intelligent Bryan McCann describes the Agentic A.I. behind his skyrocketing startup You.com and how it will lead to scientific discoveries human scientists couldn't dream of making. Don't miss this episode!
Bryan:
• Co-Founder and CTO of You.com, a prominent Bay Area A.I. startup that has raised $99m in venture capital (including a $50m Series B in September that valued the firm at nearly a billion dollars).
• Was previously Lead Research Scientist at Salesforce and a course assistant at Stanford, including for Andrew Ng’s wildly popular machine-learning course.
• Holds a Master’s in Computer Science, a Bachelor’s in Computer Science and a Bachelor’s in Philosophy, all from Stanford University.
Today’s episode should be fascinating to anyone interested in AI. In it, extremely well-spoken Bryan details:
• The philosophical underpinnings of the breakthroughs that led to the leading A.I. models we have today as well as the ones that will emerge in the coming years.
• How a coding mistake he made serendipitously revealed fundamental insights about meaning and language model alignment.
• Why he believes humanity is entering an existential crisis due to A.I., but nevertheless remains optimistic about the future.
• The fascinating connection between language models and biological proteins.
• Why A.I. systems might soon be able to make scientific discoveries humans could never dream of making.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
PyTorch Lightning, Lit-Serve and Lightning Studios, with Dr. Luca Antiga
Lightning AI makes tons of tools that speed A.I. model dev and deployment, including the wildly popular open-source library PyTorch Lightning. Today, hear from hands-on CTO Dr. Luca Antiga how all the magic happens ⚡️
More on Luca:
CTO of Lightning AI, which (as one of the world’s hottest startups developing A.I. tools) has raised over $80m in venture capital.
Is also CTO of OROBIX, an A.I. services company that Luca co-founded 15 years ago.
Holds a PhD in biomedical engineering from Politecnico di Milano… and did his postdoc at the Robarts Research Institute in London, Ontario (coincidentally around the same time I was doing brain-imaging research there).
Today’s episode will probably appeal most to hands-on practitioners like data scientists, software developers and ML engineers, but any tech-savvy professional could find it valuable.
In today’s episode, Luca details:
How Lightning AI's suite of tools (in addition to PyTorch Lightning, this includes Lightning Studios, LitServe and the Thunder Compiler) is making A.I. development faster and easier.
The rise of small language models and their potential to rival LLMs.
His journey from biomedical imaging to deep learning pioneer.
How software developers’ work will be transformed by A.I. in the coming years.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
PyTorch: From Zero to Hero, with Luka Anicin
Today's episode is on Python's most popular auto-differentiation library, PyTorch, and how you can use it to design, train and deploy deep neural nets, including LLMs. Acclaimed PyTorch instructor Luka Anicin is our guide.
Luka:
Is one of Udemy’s all-time bestselling instructors on A.I.; over 500,000 students have taken his courses.
His latest course, available exclusively at SuperDataScience.com, is called “PyTorch: From Zero to Hero”.
Is CEO of the full-lifecycle A.I. consultancy Datablooz.
Holds a Bachelor’s in Computer Science, a Master’s in Data Science and is nearing completion of his PhD in Applied A.I.
Today’s episode will probably appeal most to hands-on practitioners like data scientists, software developers and ML engineers.
In it, Luka details:
What the popular Python library PyTorch is for.
Why you would select PyTorch over TensorFlow or Scikit-learn.
The tensor building blocks PyTorch provides for designing, training and deploying state-of-the-art deep neural networks, including Large Language Models (LLMs).
His top tips for accurate and efficient deep learning.
Guidance on PyTorch portfolio projects.
Real-world PyTorch case-studies from his experience leading an A.I. consultancy.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
The AI Scientist: Towards Fully Automated, Open-Ended Scientific Discovery
A team of researchers from Sakana AI, a Japanese A.I. startup founded last year by Google alumni that was reportedly valued at over $1 billion in June, this week published a paper titled "The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery" that is making big waves and could revolutionize how we conduct scientific research.
Agentic AI, with Shingai Manjengwa
Today's episode is all about Agentic A.I. — perhaps the hottest topic in A.I. today. Astoundingly intelligent and articulate Shingai Manjengwa couldn't be a better guide for us on this hot topic 🔥
Shingai:
Head of A.I. Education at ChainML, a prestigious startup focused on developing tools for a future powered by A.I. agents.
Founder and former CEO of Fireside Analytics Inc. (developed online data-science courses that have been undertaken by 500,000 unique students).
Previously was Director of Technical Education at the prominent global A.I. research center, the Vector Institute in Toronto.
Holds an MSc in Business Analytics from New York University.
Today’s episode should be equally appealing to hands-on practitioners like data scientists as to folks who generally yearn to stay abreast of the most cutting-edge A.I. techniques.
In today’s episode, Shingai details:
What A.I. agents are.
Why agents are the most exciting, fastest-growing A.I. application today.
How LLMs relate to agentic A.I.
Why multi-agent systems are particularly powerful.
How blockchain technology enables humans to better understand and trust A.I. agents.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
Llama 3.1 405B: The First Open-Source Frontier LLM
Meta releasing its giant (405-billion parameter) Llama 3.1 model is a game-changer: For the first time, an "open-source" LLM competes at the frontier (against proprietary models GPT-4o and Claude).
Merged LLMs Are Smaller And More Capable, with Arcee AI’s Mark McQuade and Charles Goddard
Today's episode is seriously mind-expanding. In it, Mark and Charles detail how they're pushing the A.I. frontier through LLM merging, extremely efficient (even CPU-only!) LLM training, and *Small* Language Models.
Mark McQuade:
• Is Co-Founder and CEO of Arcee.ai.
• Previously, he held client-facing roles at Hugging Face and Roboflow as well as leading the data science and engineering practice of a Rackspace company.
• He studied electronic engineering at Fleming College in Canada.
Charles Goddard:
• Is Chief of Frontier Research at Arcee.ai.
• Previously, he was a software engineer at Apple and the famed NASA Jet Propulsion Laboratory.
• Studied engineering at Olin College in Massachusetts.
Today’s episode is relatively technical so will likely appeal most to hands-on practitioners like data scientists and ML engineers. In it, Charles and Mark detail:
• How their impressive open-source model-merging approach combines the capabilities of multiple LLMs without increasing the model’s size.
• A separate open-source approach for training LLMs efficiently by targeting specific modules of the network to train while freezing others.
• The pros and cons of Mixture-of-Experts versus Mixture-of-Agents approaches.
• How to enable small language models to outcompete the big foundation LLMs like GPT-4, Gemini and Claude.
• How to leverage open-source projects to land big enterprise contracts and attract big chunks of venture capital.
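To give a flavor of what "merging" means here, below is a purely illustrative sketch of uniform weight averaging, one of the simplest merge strategies; this is not Arcee.ai's actual method (their open-source work uses far more sophisticated techniques), and the tiny "checkpoints" are hypothetical stand-ins for real model weights:

```python
# Illustrative sketch only: uniform weight averaging, one of the simplest
# model-merging strategies. The checkpoint dicts below are hypothetical
# stand-ins for real model weights (which would be large tensors).

def merge_checkpoints(checkpoints, weights=None):
    """Average parameters (here, plain lists of floats) across models.

    `checkpoints` is a list of dicts mapping parameter names to lists of
    floats; `weights` optionally assigns each checkpoint a mixing weight.
    """
    n = len(checkpoints)
    if weights is None:
        weights = [1.0 / n] * n  # uniform average by default
    merged = {}
    for name in checkpoints[0]:
        merged[name] = [
            sum(w * ckpt[name][i] for w, ckpt in zip(weights, checkpoints))
            for i in range(len(checkpoints[0][name]))
        ]
    return merged

# Two tiny hypothetical "models" with identical architectures:
model_a = {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]}
model_b = {"layer.weight": [3.0, 4.0], "layer.bias": [2.0]}

merged = merge_checkpoints([model_a, model_b])
print(merged)  # {'layer.weight': [2.0, 3.0], 'layer.bias': [1.0]}
```

Note that the merged model has exactly the same parameter count as either parent, which is the key point Mark and Charles make: capabilities are combined without increasing the model's size.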
On that final note, congrats to the Arcee.ai team on announcing their $24m Series A round this very day... unsurprising given their tremendously innovative tech and rapid revenue ramp-up! It's very rare to see runaway A.I. startup successes like this one.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
Deep Learning Classics and Trends, with Dr. Rosanne Liu
Today's guest is the amazing Google DeepMind research scientist, Dr. Rosanne Liu!
Rosanne:
• Is a Research Scientist at Google DeepMind in California.
• Is Co-Founder and Executive Director of ML Collective, a non-profit that provides global ML research training and mentorship.
• Was a founding member of Uber AI Labs, where she served as a Senior Research Scientist.
• Has published deep learning research in top academic venues such as NeurIPS, ICLR, ICML and Science, and her work has been covered in publications like WIRED and the MIT Tech Review.
• Holds a PhD in Computer Science from Northwestern University.
Today’s episode, particularly in the second half when we dig into Rosanne’s fascinating research, is relatively technical so will probably appeal most to hands-on practitioners like data scientists and ML engineers.
In today’s episode, Rosanne details:
• The problem she founded the ML Collective to solve.
• How her work on the “intrinsic dimension” of deep learning models inspired the now-standard LoRA approach to fine-tuning LLMs.
• The thorny problems with LLM evaluation benchmarks and how they might be solved.
• The pros and cons of curiosity- vs goal-driven ML research.
• The positive impacts of diversity, equity and inclusion in the ML community.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.