Today, we're diving into the fascinating AI boom that's been sweeping across China since early 2025, examining what this means for the global AI landscape and markets.
Serverless, Parallel, and AI-Assisted: The Future of Data Science is Here, with Zerve’s Dr. Greg Michaelson
What are "code nodes" and "RAG DAGs"? Listen to today's episode with the highly technical (but also highly hilarious) Dr. Greg Michaelson to get a glimpse into the future of data science and A.I. model development.
Greg:
Is a Co-Founder of Zerve AI, a super-cool platform for developing and delivering A.I. products that launched to the public on this very podcast a little over a year ago.
Previously spent 7 years as DataRobot’s Chief Customer Officer and 4 years as Senior Director of Analytics & Research for Travelers.
Was a Baptist pastor while he obtained his PhD in Applied Statistics!
Today’s episode is on the technical side and so will appeal most to hands-on practitioners like data scientists, AI/ML engineers and software developers… but Greg is such an engaging communicator that anyone interested in how the practice of data science is rapidly being revolutionized may enjoy today’s episode.
In it, Greg details:
How Zerve's collaborative, graph-based coding environment has matured over the past year, including their revolutionary 'Fleet' feature (in beta) that allows massive parallelization of code execution without additional cost (a toy scheduling sketch follows this list).
How AI assistants are changing the coding experience by helping build, edit, and connect your data science projects.
Why the rise of LLMs might spell trouble for many SaaS businesses as building in-house solutions becomes increasingly viable.
The innovative ways companies are using retrieval-augmented generation (RAG) to create more powerful A.I. applications.
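To make the graph-based execution idea concrete: if each "code node" declares its upstream dependencies, every node whose inputs are ready can run at the same time. Here is a minimal, hypothetical scheduling sketch using only the Python standard library; it illustrates the general concept, not Zerve's actual engine or Fleet implementation.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy "code nodes": each maps to a function plus its upstream dependencies.
def load():               return list(range(10))
def clean(data):          return [x for x in data if x % 2 == 0]
def stats(data):          return sum(data) / len(data)
def report(evens, mean):  return f"{len(evens)} even values, mean={mean}"

nodes = {
    "load":   (load,   []),
    "clean":  (clean,  ["load"]),
    "stats":  (stats,  ["load"]),
    "report": (report, ["clean", "stats"]),
}

results = {}
remaining = dict(nodes)
with ThreadPoolExecutor() as pool:
    while remaining:
        # All nodes whose dependencies are satisfied run in parallel.
        ready = [name for name, (_, deps) in remaining.items()
                 if all(d in results for d in deps)]
        futures = {name: pool.submit(remaining[name][0],
                                     *(results[d] for d in remaining[name][1]))
                   for name in ready}
        for name, future in futures.items():
            results[name] = future.result()
            del remaining[name]

print(results["report"])  # "5 even values, mean=4.5"
```

Here "clean" and "stats" both depend only on "load", so they execute concurrently; a real fleet of workers would distribute such nodes across machines rather than threads.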
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
In Case You Missed It in March 2025
We had absolutely killer guests and killer conversations on my podcast in March. This isn't bluster; I learned a ton from Andriy, Richmond, Natalie and Varun... Today's episode features all the best highlights!
The specific conversation highlights included in today's episode are:
The mega-bestselling author of "The 100-Page Machine Learning Book" (and now "The 100-Page Language Models Book"!) Dr. Andriy Burkov on the missing piece of AGI: Why LLMs can't plan or self-reflect.
Relatedly, the fascinating and exceptionally well-spoken Natalie Monbiot contrasted artificial intelligence with the human variety, detailing what makes us unique.
The charismatic software engineer Richmond Alake (of MongoDB) explained his "A.I. Stack" concept and how you can leverage it to build better A.I. applications.
Former Google Gemini engineer Varun Godbole provided a helpful overview of his guide to neural network design, the (freely available!) "Deep Learning Tuning Playbook".
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
The Neural Processing Units Bringing AI to PCs, with Shirish Gupta
In many situations, it's impractical (or even impossible!) to have A.I. executed in the cloud. In today's episode, Shirish Gupta details when to run A.I. locally and how Neural Processing Units (NPUs) make it practical.
Today's episode is about efficiently designing and deploying AI applications that run on the edge. Our guide on that journey is SuperDataScience Podcast fan Shirish! Here's more on him:
• Has spent more than two decades working for the global technology juggernaut, Dell Technologies, in their Austin, Texas headquarters.
• Has held senior systems engineering, quality engineering and field engineering roles.
• For the past three years, has been Director of AI Product Management for Dell’s PC Group.
• Holds a Master’s in Mechanical Engineering from the University of Maryland.
Today’s episode should appeal to anyone who is involved with or interested in real-world A.I. applications.
In this episode, Shirish details:
• What Neural Processing Units (NPUs) are and why they're transforming A.I. on edge devices (a code sketch of targeting an NPU follows this list).
• Four clear, compelling reasons to consider moving AI workloads from the cloud to your local device.
• The "A.I. PC" revolution that's bringing A.I. acceleration to everyday laptops and workstations.
• What kinds of Large Language Models are best-suited to local inference on AI PCs.
• How Dell's Pro A.I. Studio toolkit will drastically reduce enterprise A.I. deployment time.
• Plenty of real-life A.I. PC examples, including how a healthcare provider achieved physician-level accuracy with a custom vision model.
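For a taste of what "run it on the NPU" can look like in code, below is a hedged sketch using ONNX Runtime's execution providers. The provider names shown (QNN for Qualcomm NPUs, DirectML on Windows) and the "model.onnx" path are assumptions that depend on your hardware, drivers, and onnxruntime build; this is an illustration, not Dell Pro AI Studio's API.

```python
import onnxruntime as ort

# Prefer an NPU-backed execution provider when present, falling back to
# GPU/CPU. Which providers exist depends on your onnxruntime build.
preferred = ["QNNExecutionProvider",   # e.g., Qualcomm NPUs (assumption)
             "DmlExecutionProvider",   # DirectML on Windows
             "CPUExecutionProvider"]   # always available
available = ort.get_available_providers()
providers = [p for p in preferred if p in available]

# "model.onnx" is a placeholder for any exported ONNX model file.
session = ort.InferenceSession("model.onnx", providers=providers)
print("Running on:", session.get_providers()[0])
```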
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
Hugging Face’s smolagents: Agentic AI in Python Made Easy
Today, we’re diving into Hugging Face’s smolagents – a new development that gives AI models more autonomy. Hugging Face, the open-source AI powerhouse behind technologies like Transformers, has now turned its attention to AI agents – programs where AI models can plan and execute tasks on their own – and their latest library smolagents makes building these agents simpler than ever. In this short episode, I’ll break down what smolagents are, how they work, and why they’re a big deal for developers, businesses, and researchers alike.
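To ground that, here is a minimal agent in the style of the library's quickstart: a CodeAgent that plans by writing and executing Python, calling a web-search tool along the way. Class names follow the initial release (e.g., HfApiModel) and may evolve, so treat this as a sketch rather than a definitive recipe.

```python
from smolagents import CodeAgent, DuckDuckGoSearchTool, HfApiModel

# The model defaults to a hosted LLM via the Hugging Face Inference API
# (requires a Hugging Face token in your environment).
model = HfApiModel()

# A CodeAgent reasons by generating Python code that can call its tools.
agent = CodeAgent(tools=[DuckDuckGoSearchTool()], model=model)

result = agent.run(
    "How many seconds would it take for a leopard at full speed "
    "to run the length of Pont des Arts?"
)
print(result)
```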
How Semiconductors Are Made (And Fuel the AI Boom), with Kai Beckmann
Today's episode is an important one on the hardware that underlies all computing and is fueling the A.I. boom. It’s hard to imagine a better guest than Kai Beckmann for this essential topic.
Kai:
• Is a Member of the Executive Board of Merck KGaA, Darmstadt, Germany (a 350-year-old firm that's the world's oldest chemical and pharmaceutical company and that has more than 62,000 employees across 60 countries).
• Having worked at the gigantic firm for over 35 years, he’s been CEO of their Electronics business for the past eight years.
• Under his leadership, Merck KGaA develops cutting-edge, materials-based solutions and equipment for leading chip companies — 99% of electronic devices contain one of their products 🤯
• A leading speaker within the semiconductor industry, he’s an expert in material-based semiconductor solutions, A.I., digitalization, and change management.
Today’s episode will be of interest to anyone looking to understand the hardware that all of computing and data science depend on. In it, Kai details:
• How materials from one company are found in virtually every electronic device on the planet.
• How A.I. is being used to develop materials that power... more A.I.
• His vinyl-record analogy for understanding computer-chip manufacturing.
• The impact that scaled-up, stable quantum computing will have on society.
• How a neuromorphic chip might someday run on the power of a low-wattage light bulb while matching human brain capabilities.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
How AI is Transforming Baseball (with Lessons For All of Us)
Baseball has always been a game of numbers. For decades, teams have pored over stats like batting averages and ERAs to gain an edge. But in recent years, artificial intelligence has taken baseball analytics to new heights. In today’s episode, we’ll explore how AI is revolutionizing baseball – from scouting and player performance to in-game strategy and even fan experience – and what that means for the future of sports and other industries.
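As a tiny worked example of the traditional stats that AI-era analytics builds on: batting average is simply hits divided by at-bats, and ERA normalizes earned runs allowed to a nine-inning game. The numbers below are made up for illustration.

```python
def batting_average(hits: int, at_bats: int) -> float:
    """Fraction of at-bats that produced a hit."""
    return hits / at_bats

def era(earned_runs: int, innings_pitched: float) -> float:
    """Earned runs allowed, scaled to a nine-inning game."""
    return 9 * earned_runs / innings_pitched

print(round(batting_average(180, 550), 3))  # 0.327
print(round(era(65, 190.0), 2))             # 3.08
```

Modern AI systems layer far richer signals (tracking data, video) on top of summary statistics like these.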
Become Your Best Self Through AI Augmentation — feat. Natalie Monbiot
The deep-thinking and highly articulate Natalie Monbiot returns to my podcast today for a can't-miss episode (one of my favorite convos ever) on how A.I. will overhaul our lives, our work, our society in the coming years.
More on Natalie:
Through her consultancy, Virtual Human Economy, she advises startups like Wizly and investment firms like Blue Tulip Ventures on virtual humans and A.I. clones.
Was previously Head of Strategy at Hour One, a leading virtual-human video-generation startup.
Regularly speaks at the world's largest conferences, including Web Summit and SXSW.
Holds a Master's in Modern Languages and Literature from the University of Oxford.
Today’s fascinating episode will be of great interest to all listeners. In it, Natalie details:
How A.I. is making us dumber — and what we can do about it.
Why the "virtual human economy" could be the next evolution of human civilization.
The two states of being humans are seeking (and how A.I. could help us achieve them).
Why focusing on merely 10x’ing our capabilities misses the much bigger opportunity of A.I.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
Microsoft’s “Majorana 1” Chip Brings Quantum ML Closer
Microsoft’s Majorana 1 is a newly unveiled quantum computing chip that marks a major breakthrough in the quest for practical quantum computers. It’s the world’s first quantum processor built on a so-called Topological Core architecture – meaning it uses topological qubits (based on exotic Majorana particles that I’ll dig into more shortly) instead of the fragile qubits found in today’s machines. Microsoft believes this innovation could accelerate the timeline for solving real-world, industrial-scale problems with quantum computing from “decades” to just a few years.
NoSQL Is Ideal for AI Applications, with MongoDB's Richmond Alake
In today's episode (#871), I'm joined by the gifted writer, speaker and ML developer Richmond Alake, who details what NoSQL databases are and why they're ideally suited for A.I. applications.
Richmond:
Is Staff Developer Advocate for AI and Machine Learning at MongoDB, a huge publicly-listed database company with over 5000 employees and over a billion dollars in annual revenue.
With Andrew Ng, he co-developed the DeepLearning.AI course "Prompt Compression and Query Optimization", which has been taken by over 13,000 people since its release last year.
Has delivered his courses on Coursera, DataCamp, and O'Reilly.
Authored 200+ technical articles with over a million total views, including as a writer for NVIDIA.
Previously held roles as an ML Architect, Computer Vision Engineer and Web Developer at a range of London-based companies.
Holds a Master’s in computer vision, machine learning and robotics from The University of Surrey in the UK.
Today's episode (filmed in-person at MongoDB's London HQ!) will appeal most to hands-on practitioners like data scientists, ML engineers and software developers, but Richmond does a stellar job of introducing technical concepts so any interested listener should enjoy the episode.
In today’s episode, Richmond details:
How NoSQL databases like MongoDB differ from relational, SQL-style databases.
Why NoSQL databases like MongoDB are particularly well-suited for developing modern A.I. applications, including Agentic A.I. applications.
How Mongo incorporates a native vector database, making it particularly well-suited to RAG (retrieval-augmented generation); a toy retrieval sketch follows this list.
Why 2025 marks the beginning of the "multi-era" that will transform how we build A.I. systems.
His powerful framework for building winning A.I. strategies in today's hyper-competitive landscape.
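For the hands-on crowd, here is a hedged sketch of RAG-style retrieval against MongoDB's native vector search, using the documented $vectorSearch aggregation stage. The cluster URI, the "vector_index" index name, the "embedding" field, and the placeholder query vector are all assumptions you would replace with your own setup.

```python
from pymongo import MongoClient

# Assumes a MongoDB Atlas cluster with a vector-search index named
# "vector_index" over the "embedding" field of the docs.chunks collection.
client = MongoClient("mongodb+srv://<your-cluster-uri>")
collection = client["docs"]["chunks"]

query_vector = [0.01] * 1536  # placeholder: embed the user's query first

results = collection.aggregate([
    {"$vectorSearch": {
        "index": "vector_index",
        "path": "embedding",
        "queryVector": query_vector,
        "numCandidates": 100,  # candidates considered before ranking
        "limit": 5,            # top matches returned
    }},
    {"$project": {"text": 1, "score": {"$meta": "vectorSearchScore"}}},
])

for doc in results:
    print(doc["score"], doc["text"][:80])
```

The retrieved chunks would then be stuffed into an LLM prompt, which is the "augmented generation" half of RAG.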
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
OpenAI’s “Deep Research”: Get Days of Human Work Done in Minutes
What does Deep Research do?
AI Should Make Humans Wiser (But It Isn't), with Varun Godbole
Today's trippy, brain-stimulating episode features Varun Godbole, a former Google Gemini LLM researcher who’s turned his attention to the future implications of the crazy-fast-moving exponential moment we're in.
Varun:
Spent the past decade doing Deep Learning research at Google, across pure and applied research projects.
For example, he was co-first author of a Nature paper where a neural network beat expert radiologists at detecting tumors.
Also co-authored the Deep Learning Tuning Playbook (that has nearly 30,000 stars on GitHub!) and, more recently, the LLM Prompt Tuning Playbook.
He's worked on engineering LLMs so that they generate code and most recently spent a few years as a core member of the Gemini team at Google.
Holds a degree in Computer Science as well as in Electrical and Electronic Engineering from The University of Western Australia.
Varun mostly keeps today’s episode high-level so it should appeal to anyone who, like me, is trying to wrap their head around how vastly different society could be in a few years or decades as a result of abundant intelligence.
In today’s episode, Varun details:
How human relationship therapy has helped him master A.I. prompt engineering.
Why focusing on A.I. agents so much today might be the wrong approach — and what we should focus on instead.
How the commoditization of knowledge could make wisdom the key differentiator in tomorrow's economy.
Why the future may belong to "full-stack employees" rather than traditional specialized roles.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
In Case You Missed It in February 2025
February was another insane month on my podcast. In addition to having stunning smiles, all four guests I hosted are fascinating, highly knowledgeable experts. Today's episode features highlights of my convos with them.
The specific conversation highlights included in today's episode are:
Professional-athlete-turned-data-engineer Colleen Fotsch on how dbt simplifies data modeling and documentation.
Engineer-turned-entrepreneur Vaibhav Gupta on the new programming language, BAML, he created for AI applications. He details how BAML will save you time and a considerable amount of money when calling LLM APIs.
Professor Frank Hutter on TabPFN, the first deep learning approach to become the state of the art for modeling tabular data (i.e., the structured rows and columns of data that, until now, deep learning was feeble at modeling).
The ebullient Cal Al-Dhubaib on the keys to scaling (and selling!) a thriving data science consultancy.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
LLMs and Agents Are Overhyped, with Dr. Andriy Burkov
Andriy Burkov's ML books are mega-bestsellers and his newsletter has a wild 900,000 subscribers. He seldom does interviews so don't miss today's episode, in which he takes compelling, contrarian views on LLMs and agents.
More on Dr. Burkov:
His indispensable "100-Page Machine Learning Book" seems to be on *every* data scientist / ML engineer's bookshelf.
He also wrote "ML Engineering" and his latest book, "The 100-Page Language Model Book", was released this year to rave reviews.
His "Artificial Intelligence" newsletter is subscribed to by 900,000 people on LinkedIn.
He's the Machine Learning Lead at TalentNeuron, a global labor-market analytics provider.
He runs his own book-publishing company, True Positive Inc.
Previously held data science / ML roles at Gartner, Fujitsu and more.
Holds a PhD in Computer Science (A.I.) from Université Laval in Quebec, where his doctoral dissertation focused on multi-agent decision-making — 15 years ago!
Despite Dr. Burkov being such a technical individual, most of today’s episode should appeal to anyone interested in A.I. (although some parts here and there will be particularly appealing to hands-on machine-learning practitioners).
In today’s episode, Andriy details:
Why he believes AI agents are destined to fail.
How he managed to create a chatbot that never hallucinates by deliberately avoiding LLMs (a hypothetical sketch of one such approach follows this list).
Why he thinks DeepSeek AI crushed Bay Area A.I. leaders like OpenAI and Anthropic.
What makes human intelligence unique from all other animals and why A.I. researchers need to crack this in order to attain human-level intelligence in machines.
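Andriy doesn't share his implementation on air, but to illustrate the general idea of an LLM-free bot that cannot hallucinate, here is a hypothetical sketch: the bot only ever returns a human-written answer retrieved from a curated FAQ, or declines, so it can never fabricate content.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Curated question -> human-written answer pairs. Because the bot can only
# return one of these verbatim answers, hallucination is impossible.
faq = {
    "What are your opening hours?": "We're open 9am-5pm, Monday to Friday.",
    "How do I reset my password?": "Use the 'Forgot password' link on the login page.",
    "Do you offer refunds?": "Yes, within 30 days of purchase.",
}

questions = list(faq)
vectorizer = TfidfVectorizer().fit(questions)
question_vectors = vectorizer.transform(questions)

def answer(user_query: str, threshold: float = 0.3) -> str:
    scores = cosine_similarity(vectorizer.transform([user_query]),
                               question_vectors)[0]
    best = scores.argmax()
    if scores[best] < threshold:  # no confident match: decline, don't invent
        return "Sorry, I don't know that one. Let me find a human."
    return faq[questions[best]]

print(answer("what are your opening hours on weekdays?"))
print(answer("tell me a joke"))  # falls back to the safe decline
```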
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
Bringing Back Extinct Animals like the Wooly Mammoth and Dodo Bird
For this week’s Five-Minute Friday-style episode, I’m diving into a biotechnology story I found mind-blowing: bringing back extinct animals like the wooly mammoth and the dodo bird.
How to Grow (and Sell) a Data Science Consultancy, with Cal Al-Dhubaib
Today, my ebullient long-time friend Cal Al-Dhubaib makes his debut on my podcast to spill the beans on how you can launch your own thriving (data science / A.I. / ML) consultancy and, eventually, sell it 💰
Cal:
Is Head of AI & Data Science at Further, a data and A.I. company based in Atlanta that has hundreds of employees.
Previously, he was founder and CEO of Pandata, an Ohio-based A.I. and machine learning consultancy that he grew for over eight years until it was acquired by Further a year ago.
Delivers terrific talks — don’t miss him if you have the chance!
Holds a degree in data science from Case Western Reserve University in Cleveland.
Today's episode should appeal to any listener, particularly anyone who would like to drive revenue and profitability from data science or AI projects.
In it, Cal covers:
Why his first startup was unsuccessful, but how the experience allowed him to discover an untapped market and build Pandata, a thriving data science consultancy.
His unconventional strategy of requiring a sizable up-front commitment, which initially scared away some prospects but ultimately attracted the best clients.
The way core values inspired by his "tin can to Mars" thought experiment shaped his hiring and company culture.
How making data science "boring", helping his clients trust AI systems and delivering a clear return on investment became his formula for success.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
OpenAI’s o3-mini: SOTA reasoning and exponentially cheaper
Today’s episode will fill you in on everything you need to know about an important model OpenAI recently released to the public called o3-mini.
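If you want to try o3-mini yourself, a minimal call through OpenAI's Python SDK looks roughly like this; the reasoning_effort parameter (low/medium/high) is the model's documented knob for trading answer quality against latency and cost, but check the current API docs before relying on the details.

```python
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in your environment

response = client.chat.completions.create(
    model="o3-mini",
    reasoning_effort="medium",  # low | medium | high
    messages=[
        {"role": "user",
         "content": "Prove that the sum of two odd integers is even."},
    ],
)
print(response.choices[0].message.content)
```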
TabPFN: Deep Learning for Tabular Data (That Actually Works!), with Prof. Frank Hutter
I've been teaching Deep Learning for a decade. In that time, countless students have been disappointed by applying DL to tabular and time-series data. Finally, thanks to Prof. Frank Hutter, that will no longer be the case!
Frank:
Is a tenured professor of machine learning and head of the Machine Learning Lab at the University of Freiburg, although he has been on leave since May to focus on…
His fellowship on AutoML and Tabular Foundation Models at the ELLIS Institute Tübingen in Germany…
As well as becoming Co-Founder and CEO of Prior Labs, a German startup that provides a commercial counterpart to his tabular deep-learning model research and open-source projects… and that has just announced a huge €9m pre-seed funding round.
Holds a PhD in Computer Science from The University of British Columbia and his research has been extremely impactful: It has been cited over 87,000 times!
Today’s episode is on the technical side and will largely appeal to hands-on practitioners like data scientists, AI/ML engineers, software developers and statisticians (especially Bayesian statisticians)!
For a bit of context: Pretty much everyone works with tabular data, either primarily or occasionally. Tabular data are data stored in a table format, so structured into rows and columns, where the columns might be different data types, say, some numeric, some categorical and some text. For a decade, deep learning has ushered in the A.I. era by making huge advancements across many kinds of data — pixels from cameras, sound from microphones and of course natural language — but through all of this revolution, deep learning has struggled to be impactful on highly ubiquitous tabular data… until now.
In today’s episode, Prof. Hutter details:
How his revolutionary transformer architecture, TabPFN, has finally cracked the code on using deep learning for tabular data and is outperforming traditionally leading approaches like gradient-boosted trees on tabular datasets.
How version 2 of TabPFN, released last month to much fanfare thanks to its publication in the prestigious journal Nature, is a massive advancement, allowing it to handle orders of magnitude more training data.
How embracing Bayesian principles allowed TabPFN v2 to work "out of the box" on time-series data, beating specialized models and setting a new state of the art on the key time-series analysis benchmark.
The breadth of verticals that TabPFN has already been applied to and how you can now get started with this (conveniently!) open-source project on your tabular data today (a minimal code sketch follows this list).
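TabPFN ships with a scikit-learn-style interface, so a first experiment is only a few lines. The dataset choice below is mine for illustration; note that fit() involves no per-dataset training loop, because the pretrained transformer does in-context learning over the rows you hand it.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from tabpfn import TabPFNClassifier  # pip install tabpfn

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# fit() mostly stores the data; prediction is a single forward pass in
# which the training rows act as the model's "context".
clf = TabPFNClassifier()
clf.fit(X_train, y_train)
print("Accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```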
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
In Case You Missed It in January 2025
Happy Valentine's Day 💘 ! My high-calorie gift to you is today's episode, which features the best highlights from conversations I had with the (absolutely epic!) guests I hosted on my podcast in January.
The specific conversation highlights included in today's episode are:
Famed futurist Azeem Azhar on how to break your linear mindset to prepare for the exponential technological change that we are experiencing (and will experience even more rapidly in years to come).
Global quantum-computing expert Dr. Florian Neukart on practical, real-world applications of quantum computing today.
Kirill Eremenko and Hadelin de Ponteves — who have together taught over 5 million people data science — with their 12-step checklist for selecting an appropriate foundation model (e.g., large language model) for a given application.
Brooke Hopkins (former engineer at Waymo, now founder and CEO of Y Combinator-backed startup Coval) on why you should evaluate A.I. agents with reference-free metrics (a toy sketch of the idea follows below).
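"Reference-free" means scoring an agent's output by checking properties of the output itself (valid format, constraints respected) rather than comparing it against a gold-standard answer, which often doesn't exist for open-ended agent tasks. A toy sketch of the idea (not Coval's actual metrics):

```python
import json

def evaluate_agent_output(raw_output: str) -> dict:
    """Reference-free checks: no gold answer needed, only properties
    that any acceptable output must satisfy."""
    checks = {}
    try:
        payload = json.loads(raw_output)
        checks["valid_json"] = True
        checks["has_required_fields"] = (isinstance(payload, dict)
                                         and {"action", "args"} <= payload.keys())
    except json.JSONDecodeError:
        checks["valid_json"] = False
        checks["has_required_fields"] = False
    checks["within_length_budget"] = len(raw_output) <= 2000
    return checks

print(evaluate_agent_output('{"action": "search", "args": {"q": "npu"}}'))
# {'valid_json': True, 'has_required_fields': True, 'within_length_budget': True}
```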
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
From Pro Athlete to Data Engineer: Colleen Fotsch’s Inspiring Journey
Colleen Fotsch won national swimming championships and was a pro athlete in both CrossFit and bobsledding. Now she's excelling at data analytics and engineering! Today, hear her fun, inspiring and practical story.
More on Colleen:
As a collegiate swimmer, she won national championships and set an American record in the relay.
As a pro CrossFit athlete, she twice competed at the “Games”, which is the highest echelon of the sport.
And then she simultaneously pursued a degree in data analytics while training with the US Bobsled team.
An injury ended her Olympic Bobsled team dream, but luckily she’d been pursuing that analytics career in parallel!
She began working full-time as a data analyst four years ago and has now grown into a data-engineering leadership role at a healthcare-staffing firm called CHG Healthcare in Utah, where she serves as Senior Technical Manager of their Data Platform.
Inspires her 280,000 Instagram followers on a daily basis.
Today’s episode essentially has two separate parts:
The first half focuses on Colleen’s exciting journey to the highest levels of three sports: swimming, CrossFit and bobsledding. That part should be fascinating to just about anyone.
The second half covers Colleen's transition into data analytics and data engineering; that part will appeal to technically-minded listeners, particularly those considering, or early in, a career in analytics or engineering.
In today’s episode, Colleen details:
The connection between a competitive sports mindset and data-career success.
Proven strategies for being hired into your first data role later in your career.
Why being "not smart enough" for coding was a mental block she had to overcome.
How analytics engineering bridges the gap between data engineering and analysis.
The huge benefits deskbound professionals can enjoy by including regular exercise in their week, and tips and tricks for developing or growing an exercise habit.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.