In today’s Five-Minute Friday episode, I’ll cover the five biggest takeaways from the 2025 edition of the renowned AI Index Report, which was published a few weeks ago by the Stanford University Institute for Human-Centered AI. Every year this popular report — often called the “State of AI” report — covers the biggest technical advances, new achievements in benchmarking, investment flowing into AI and more. Here’s a link to the colossal full report in the show notes; today’s episode will cover the five most essential items.
Blackwell GPUs Are Now Available at Your Desk, with Sama Bali and Logan Lawler
Today's charming and complementary guests — Sama Bali from NVIDIA and Logan Lawler from Dell — make for an extra fun episode on the powerful new Blackwell GPUs... now available at your desk!
More on Sama:
A.I. Solutions leader at NVIDIA who specializes in bringing A.I. products to market.
Prior to NVIDIA, held a Machine Learning Solutions role at Amazon Web Services (AWS).
Focused on educating data scientists and developers on A.I. innovations and implementing them effectively in enterprises.
Holds a Master's in Engineering Management from San José State University.
More on Logan:
Leads Dell Pro Max A.I. Solutions (if you haven’t heard of Pro Max before, we’ll cover that in this episode!)
Over his sixteen-year tenure at Dell Technologies, has held positions across merchandising, services, marketing and e-commerce.
Holds an MBA in management from Texas State University.
Today’s episode will be particularly appealing to hands-on data science, machine learning and A.I. practitioners but it isn’t especially technical and so can be enjoyed by anyone!
In today’s episode, Sama and Logan detail:
Why data scientists are camping out at 6AM to attend NVIDIA's GTC conference.
The killer specs of NVIDIA’s next-generation Blackwell GPUs.
How Dell and NVIDIA have joined forces to bring server-level A.I. power right to your desktop.
How microservices are revolutionizing A.I. development and deployment.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
In Case You Missed It in March 2025
We had absolutely killer guests and killer conversations on my podcast in March. This isn't bluster; I learned a ton from Andriy, Richmond, Natalie and Varun... Today's episode features all the best highlights!
The specific conversation highlights included in today's episode are:
The mega-bestselling author of "The 100-Page Machine Learning Book" (and now "The 100-Page Language Models Book"!) Dr. Andriy Burkov on the missing piece of AGI: Why LLMs can't plan or self-reflect.
Relatedly, the fascinating and exceptionally well-spoken Natalie Monbiot contrasted artificial intelligence with the human variety, detailing what makes us unique.
The charismatic software engineer Richmond Alake (of MongoDB) explained his "A.I. Stack" concept and how you can leverage it to build better A.I. applications.
Former Google Gemini engineer Varun Godbole provides a helpful overview of a guide to neural network design, the (freely available!) "Deep Learning Tuning Playbook".
OpenAI’s “Deep Research”: Get Days of Human Work Done in Minutes
What does Deep Research do?
In Case You Missed It in February 2025
February was another insane month on my podcast. In addition to having stunning smiles, all four guests I hosted are fascinating, highly knowledgeable experts. Today's episode features highlights of my convos with them.
The specific conversation highlights included in today's episode are:
Professional-athlete-turned-data-engineer Colleen Fotsch on how DBT simplifies data modeling and documentation.
Engineer-turned-entrepreneur Vaibhav Gupta on the new programming language, BAML, he created for AI applications. He details how BAML will save you time and a considerable amount of money when calling LLM APIs.
Professor Frank Hutter on TabPFN, the first deep learning approach to become the state of the art for modeling tabular data (i.e., the structured rows and columns of data that, until now, deep learning was feeble at modeling).
The ebullient Cal Al-Dhubaib on the keys to scaling (and selling!) a thriving data science consultancy.
In Case You Missed It in January 2025
Happy Valentine's Day 💘 ! My high-calorie gift to you is today's episode, which features the best highlights from conversations I had with the (absolutely epic!) guests I hosted on my podcast in January.
The specific conversation highlights included in today's episode are:
Famed futurist Azeem Azhar on how to break your linear mindset to prepare for the exponential technological change that we are experiencing (and will experience even more rapidly in years to come).
Global quantum-computing expert Dr. Florian Neukart on practical, real-world applications of quantum computing today.
Kirill Eremenko and Hadelin de Ponteves — who have together taught over 5 million people data science — with their 12-step checklist for selecting an appropriate foundation model (e.g., large language model) for a given application.
Brooke Hopkins (former engineer at Waymo, now founder and CEO of Y Combinator-backed startup Coval) on why you should evaluate A.I. agents with reference-free metrics.
DeepSeek R1: SOTA Reasoning at 1% of the Cost
In recent weeks, I’m sure you’ve noticed that there’s been a ton of excitement over DeepSeek, a Chinese A.I. company that was spun out of a Chinese hedge fund just two years ago.
The Six Epochs of Intelligence Evolution
The six epochs of intelligence evolution. I came across the definition of these six stages in the futurist Ray Kurzweil’s latest book, The Singularity is Nearer. Per Kurzweil, each of the six stages of intelligence builds on the complexity of the information processing in the preceding stage, so the third epoch depends on the second one happening and the second epoch depends on the first.
2025 AI and Data Science Predictions, with Sadie St. Lawrence
Happy New Year! To prepare you for 2025, today's guest is the clairvoyant Sadie St. Lawrence, who predicts what the biggest A.I. trends will be in the year ahead. We also pick the A.I. winners and losers of 2024.
In a bit more detail, in today’s episode (which will appeal to technical and non-technical listeners alike):
• We cover how Sadie’s predictions for 2024 (which she made a year ago on this show) panned out.
• We award our “wow moment” of 2024, our comeback of the year, our disappointment of the year and our overall winner of 2024.
• And then, of course, we speculate on the five biggest trends to prepare for in 2025.
As with our 2022, 2023 and 2024 predictions episodes, our special guest again this year is Sadie St. Lawrence, who is:
• A data science and machine learning instructor whose content has been enjoyed by over 600,000 students.
• The Founder and CEO of the Human Machine Collaboration Institute as well as being founder and chair of Women In Data™️, a community of over 60,000 women across 55 countries.
• Serves on multiple start-up boards.
• Hosts the Data Bytes podcast.
Happy Holidays from the SuperDataScience Podcast
2024 was unquestionably the fastest-moving year yet for A.I. innovation. In particular, we witnessed the meteoric rise of generative A.I. from its largely-proof-of-concept phase to being commercially indispensable. According to survey results, nearly two-thirds of organizations are now regularly using generative A.I. – a number that has almost doubled since a year earlier. From enhancing product development to facilitating medical breakthroughs, generative A.I. has become a cornerstone of innovation across industries. For those of us who practice data science hands-on, GenAI has proved itself to be near-magical at composing functional code and debugging our errors.
Indeed, as we’ll discuss in detail in next Tuesday’s episode with Sadie St. Lawrence, this year GenAI models crossed reliability and accuracy thresholds, enabling them to power independently acting A.I. agents, even multi-agent systems that can tackle complex tasks without human supervision. 2025 looks set to be the year Agentic A.I. takes center stage, the next phase in A.I. transforming every industry and overhauling our way of life; if we get the tricky parts right, that will be for the better for all of us on this planet.
I hope you’ve enjoyed our exploration of these developments (and much more!) in depth over the course of the year through our podcast episodes, allowing you to hear directly from leading experts and practitioners like Andrew Ng, Bernard Marr and Sol Rashidi. Our discussions have covered a wide range of topics, from the industrialization of data science processes to the ethical considerations surrounding AI implementation.
Through exploring the tricky bits like ethics and equity alongside the breathtaking technological breakthroughs, I hope that overall we’ve left you feeling optimistic about our capacity as a species to get this tech revolution right and have it benefit all of us. This holiday season, I hope you’ll also be able to sit with these positive vibes, get some time away from your screened devices and enjoy the wonder of life — including how lucky we are to be alive at this extraordinary time in history — with your loved ones.
From all of us here at the SuperDataScience Podcast, happy holidays!
In Case You Missed It in November 2024
We had a ton of laughs and I had some seriously mind-expanding moments thanks to my guests on the SuperDataScience Podcast last month. ICYMI, today's episode highlights the most riveting moments from November.
The specific conversation highlights included in today's episode are:
Deepali Vyas, Global Head of Data and A.I. at executive-search giant Korn Ferry, on how A.I. is transforming recruitment and how job-seekers can stay ahead of the curve.
Jess Ramos, data analyst and leading content creator on data careers, on where to start if you yourself are seeking a career in data.
Bryan McCann, co-founder and CTO of the rapidly-scaling A.I. platform You.com, on why machines will make much better scientists than humans... and how they will surpass human scientists surprisingly soon.
Martin Goodson, CEO of the prestigious British A.I. firm Evolution AI, on how the public figures who are speaking most loudly about A.I. are probably not the people we should be listening to.
Career Success in the AI Era, with Deepali Vyas
Rapid A.I. advances can be intimidating: How can you approach your career so you flourish in the coming A.I. era? Find out from Deepali Vyas — a world-leading A.I.-exec headhunter — in today's episode.
Deepali is:
• Senior Partner and Global Head of the Data, A.I. and Financial Technology Practice of Korn Ferry, one of the world’s largest executive-search firms.
• Founder of ProFolios.ai, a video-centric, A.I.-enhanced professional-branding platform.
• Founder of Fearless+, a platform that empowers tens of thousands of young people for career success.
• Holds a Bachelor’s in Financial Mathematics and a Master’s in International Finance from the London School of Economics.
Today’s episode should be interesting to everyone. In it, Deepali details:
• How A.I. has driven a 10x increase in applications per position and how you can compete in this high-volume climate.
• Why technical skills are becoming "table stakes" and what will differentiate the best candidates in the A.I. era.
• An insider's view on the talent flows between Wall Street and Silicon Valley, and how you can capitalize on these flows in your career.
• The "green flags" to look for in potential bosses and employers.
In Case You Missed It in October 2024
It's unreal to be able to speak to folks like the guests I hosted on the SuperDataScience Podcast last month. ICYMI, today's episode highlights the most riveting moments from October.
The specific conversation highlights included in today's episode are:
UC San Diego neuroscience professor Dr. Bradley Voytek on how data science facilitates breakthroughs in our understanding of the brain.
Eloquent Natalie Monbiot on how lifelike, digital versions of ourselves can scale up our public-facing work.
Lightning AI CTO Dr. Luca Antiga on where he sees generative A.I. being most useful in our professional lives.
Gable CEO Chad Sanderson on how, when we work with data, we always need to think about how downstream users might come to interpret our data... which is why he finds data contracts so important that he's writing an O'Reilly book about them.
Polars CEO Ritchie Vink on the incredible specs (e.g., efficiency speedups) of his open-source DataFrame-operations library for Python.
The 10 Reasons AI Projects Fail, with Dr. Martin Goodson
Most A.I. projects fail. In today's episode, the brilliant (and hilarious) Dr. Martin Goodson details the top 10 reasons why A.I. projects fail and how to avoid these common pitfalls.
Martin:
• Is CEO and Chief Scientist at Evolution AI, a firm that uses generative A.I. to extract information from millions of documents a day for their clients.
• Is Founder and Organizer of the London ML Meetup, which (with >15,000 members) is the largest community of AI/ML experts in Europe.
• Previously led data science at startups that apply ML to billions of data points daily.
• Was a statistical geneticist at the University of Oxford (where we shared a small office together)!
Today’s episode will be of interest to anyone even vaguely interested in data science, ML or AI. In today’s episode, Martin details:
• The 10 reasons why data science projects fail and how to avoid these common pitfalls.
• His insights on building A.I. startups that serve large enterprises.
• The importance of open-source A.I. development.
Neuroscience Fueled by ML, with Prof. Bradley Voytek
Today's guest is the extraordinarily intelligent and well-spoken UC San Diego theoretical neuroscience professor, Bradley Voytek. He reveals how AI/ML is accelerating our understanding of the brain.
More on Brad:
• Professor in UC San Diego's Department of Cognitive Science, Data Science Institute, and the Neurosciences Graduate Program.
• Joined Uber as their first data scientist, when it was a 10-person startup, helping build their data science strategy and team.
• Outreach work has appeared in Scientific American, NPR... and Comic-Con!
• Co-authored the amusing book "Do Zombies Dream of Undead Sheep?"
Today’s episode has some brief exchanges that will appeal most to hands-on practitioners, but should overall be fascinating to anyone.
In today’s episode, Brad details:
• How large-scale data science and machine learning are accelerating neuroscience research.
• Discoveries his lab has recently made that overturn nearly a century of neuroscience doctrine.
• Insights on structuring data science education to balance technical skills with creative, practical problem-solving.
• Lessons from using data science to optimize Uber's early ride-prediction algorithms.
In Case You Missed It in September 2024
Another month, another set of invaluable conversations on the SuperDataScience Podcast I host. ICYMI, today's episode highlights the most fascinating moments from September.
The specific conversation highlights included in today's episode are:
Posit PBC engineering manager Dr. Julia Silge explains why Positron, the next-generation IDE she's leading development of, is better-suited to data scientists than any existing IDE.
PyTorch expert Luka Anicin provides his top tips for training more accurate and compute-efficient ML models.
Exceptional open-source developer Marco Gorelli on why Polars is anywhere from 10 to 100x faster than Pandas, the incumbent Python library for working with DataFrames.
Microsoft's Marck Vaisman on what companies hiring data scientists should be looking for... as opposed to what they typically (and mistakenly!) look for today.
The Skills You Need to Be an Effective Data Scientist, with Marck Vaisman
Based on extensive research and analytical evaluations, in today's episode Marck Vaisman details all the skills that are essential for today's data professional.
Marck:
• Has been at Microsoft for seven years; for 5+ years, he’s been a Senior Cloud Solutions Architect, specializing in data, data science and AI/ML.
• For nearly a decade he’s also been an adjunct professor at both Georgetown University and The George Washington University, teaching graduate-level courses on math, stats, analytics and decision sciences.
• Co-Founded a non-profit in Washington, DC that runs both the Data Science DC and Statistical Programming DC Meetups.
• Holds a Bachelor's in Mechanical Engineering from Boston University and an MBA from Vanderbilt University.
Today’s episode will be of interest to anyone who is, manages, or aspires to be a data professional.
In today’s episode, Marck details:
• The skills, competencies and personas that data scientists and related professionals (such as analysts, data engineers, ML engineers and A.I. engineers) can have.
• The academic research on why “data scientist” is such a difficult job title to define.
• A comprehensive characterization of the essential skills that every data professional needs to be effective and the skills that allow you to specialize as a particular subtype of data scientist.
• The implications of all of this for both folks hunting for a data role and the companies that are looking to hire them.
In Case You Missed It in August 2024
We had a slew of eye-opening conversations in August on the SuperDataScience Podcast I host. ICYMI, today's episode highlights the most fascinating moments from those convos.
Specifically, conversation highlights include:
1. ChainML's Head of A.I. Education Shingai Manjengwa on how multiple, individual A.I. agents can come together to perform complex actions.
2. Renowned futurist and entrepreneur Dr. Daniel Hulme on how A.I. can help us become better and faster at our jobs by circumventing the traditional corporate hierarchies that today seem only to slow us down.
3. Mathematical-optimization guru Jerome Yurchisin (of Gurobi Optimization) on how continuing education will be vital in our increasingly automated work environment... and how this education will be streamlined by A.I.
4. Nick Elprin, Co-Founder and CEO of the wildly successful Domino Data Lab, on why it's essential for enterprises to clearly define their A.I. infrastructure in order for their A.I. deployments to prosper.
Check out today's episode (#818) to hear all these eye-opening conversations. The "Super Data Science Podcast with Jon Krohn" is available on all major podcasting platforms and a video version is on YouTube.
The AI Scientist: Towards Fully Automated, Open-Ended Scientific Discovery
A team of researchers from Sakana AI, a Japanese AI startup that was founded last year by Google alumni and was reportedly valued at over $1 billion in June, this week published a paper titled "The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery" that is making big waves and could revolutionize how we conduct scientific research.
Scaling Data Science Teams Effectively, with Nick Elprin
Today's episode with (extremely intelligent and wildly successful ML entrepreneur) Nick Elprin covers efficiently scaling data science teams and ensuring A.I. projects are commercial wins 🥇
Nick:
• Is Co-Founder and CEO of Domino Data Lab, a colossal Bay Area startup that has raised over $200m in venture capital from some of the world’s most prestigious VC firms.
• Prior to co-founding Domino Data Lab 11 years ago, he worked as a technologist at Bridgewater Associates, the well-known hedge fund.
• He holds both a BA and MS in Computer Science from Harvard University.
Today’s episode may appeal most to technical folks but has tons of content that will be of interest to anyone in or interested in commercializing data science or A.I.
In this episode, Nick details:
• How organizations can leverage enterprise platforms to efficiently scale their data science teams and data science workflows.
• The exact team size at which integrating such a platform becomes worthwhile.
• How to ensure A.I. projects are commercially successful.
• The tech stack they use at Domino to create such a performant platform.
• His top tip for growing your own colossal data science startup.