AI is Disrupting Journalism: The Good, The Bad and The Opportunity

Added on August 1, 2025 by Jon Krohn.

Back in Episode #896, I argued that AI probably won’t be taking your job anytime soon. I followed that up in Episode #904 by discussing how some industries are nevertheless being rapidly and thoroughly disrupted by AI, focusing on the advertising industry in particular. My LinkedIn post announcing that episode generated a lot of discussion in the comments and garnered over 50,000 impressions within the first few hours. That response gave me the idea for a series of Friday episodes covering how particular industries, like advertising, are being rapidly and thoroughly overhauled by AI, with lessons for everyone on how we can adapt to this inevitable change and potentially leverage it to thrive professionally.

Causal AI, with Dr. Robert Osazuwa Ness

Added on July 29, 2025 by Jon Krohn.

Today's guest, Dr. Robert Osazuwa Ness, wrote the popular new book "Causal A.I." so enjoy this episode on what Causal A.I. is and what advantages it has over "normal" (correlation-based) models.

Robert:

• Senior Researcher at "Microsoft Research A.I."

• His research focuses on statistical and causal inference techniques for controllable, human-aligned multimodal models.

• He is also founder of Altdeep.ai, where he teaches professionals advanced topics in machine learning.

• Holds a PhD in Statistics from Purdue University in Indiana.

Today’s episode will resonate most with hands-on practitioners like data scientists, statisticians and A.I. engineers.

In today’s episode, Robert details:

• The three-rung ladder of causation that determines what types of causal questions you can actually answer with your data.

• The surprising connections between Bayesian networks, graphical models and modern causal A.I.

• Why A.I. systems have been dominated by correlation-based learning and what's stopping them from adopting causal reasoning like humans and animals naturally do.

• How tools like PyTorch, Pyro, and DoWhy are revolutionizing causal inference by separating statistical complexity from causal assumptions.

• How large language models like GPT-4o can act as "causal knowledge bases" and outperform traditional causal methods in some scenarios.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

Guest Co-Host of Last Week in A.I., Episode #217

Added on July 25, 2025 by Jon Krohn.

The *only* podcast I listen to is "Last Week in A.I." — and, for this week's episode, I had the privilege of co-hosting it! Get caught up on all the biggest A.I.-related news of the past week, including:

• OpenAI's new "ChatGPT Agent", which can control computers and perform a wide range of tasks, showcasing powerful performance benchmarks and potential applications in business and research.

• xAI's racy new romantic chat... complete with stripping down to lingerie.

• Major business moves in the A.I. space, including Google's acquisition of Windsurf's top talent after OpenAI's deal fell through, and several notable hires by Meta from OpenAI and Apple.

Thanks to Andrey Kurenkov and Jeremie Harris for doing such a tremendous job of curating and hosting this weekly news show — as well as for welcoming me back to co-host again :)

This week's episode of "Last Week in AI" is #217. You can catch it on your favorite podcasting app or on YouTube.

Neuroscience, AI and the Limitations of LLMs, with Dr. Zohar Bronfman

Added on July 23, 2025 by Jon Krohn.

I was blown away by today's guest, the brilliant dual-PhD Zohar Bronfman, as we discussed neuroscience, A.I., and why predictive models offer a better ROI than generative ones. Enjoy!

Dr. Bronfman:

• Is the co-founder and CEO of Pecan AI, a predictive analytics platform that has raised over $100m in venture capital.

• Holds two PhDs — one in computational neuroscience and another in philosophy — bringing a deep, multidisciplinary lens to the design and impact of A.I. systems.

• Focuses on the evolution of machine learning from statistical models to agentic systems that influence real-world outcomes.

Today’s episode will be fascinating for every listener.

In it, Zohar details:

• The trippy implications of the reality that your brain makes decisions hundreds of milliseconds before you're consciously aware of them.

• The intelligence feat that bumblebees can do that current A.I. cannot, with implications for the realization of human-like intelligence in machines.

• Why predictive models are more important than generative models for businesses, and how generative LLMs can nevertheless make building and deploying predictive models much easier and more accessible.

• The rollercoaster journey that led him to create a sensationally successful A.I. startup immediately upon finishing his academic degrees.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

How Prof. Jason Corso Solved Computer Vision’s Data Problem

Added on July 21, 2025 by Jon Krohn.

Exceptional episode for you today with Prof. Jason Corso, in which he details how he's tackling the biggest problem in machine vision. Jason's super sharp and very well-spoken... don't miss this one!

Jason:

• Professor of Robotics, Electrical Engineering and Computer Science at the prestigious University of Michigan, with over 20 years of research spanning video understanding, robotics, and AI.

• Has published over 150 academic papers that, together, have been cited over 20,000 times.

• Co-founder and Chief Science Officer at Voxel51, a leading platform for visual AI development.

• His work bridges academic innovation and real-world impact, earning him more major honors than I have space to list!

Today’s episode skews a bit toward hands-on practitioners like data scientists and AI/ML engineers, particularly anyone tackling computer-vision problems. That said, Jason is a charismatic and exceptional communicator so perhaps any listener to this podcast will enjoy today’s episode.

In it, Jason details:

• How his research spinout, Voxel51, is solving the biggest bottleneck in computer vision.

• The surprising way autonomous vehicles learn to handle accidents they've never seen.

• Why the secret to better AI models isn't better algorithms — it's something else that’s hiding in plain sight.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

Why RAG Makes LLMs Less Safe (And How to Fix It), with Bloomberg’s Dr. Sebastian Gehrmann

Added on July 15, 2025 by Jon Krohn.

In today's episode, A.I. researcher Dr. Sebastian Gehrmann details what RAG is and why it makes LLMs *less* safe... despite popular perception of the opposite.

Sebastian:

• Is Head of Responsible A.I. at Bloomberg, the New York-based financial, software, data, and media company that (with 20,000 employees) is huge.
• Previously, as Head of NLP at Bloomberg, he directed the development and adoption of language technology to bring the best A.I.-enhanced products to the Bloomberg Terminal.
• Prior to Bloomberg, was a senior researcher at Google, where he worked on the development of large language models, including the groundbreaking BLOOM and PaLM models.
• He holds a Ph.D. in computer science from Harvard University.

Today’s episode skews slightly toward our more technical listeners like data scientists, A.I. engineers and software developers, but anyone who’d like to be up to date on the latest A.I. research may want to give it a listen.

In today’s episode, Sebastian details:

• The shocking discovery that retrieval-augmented generation (RAG) actually makes LLMs LESS safe, despite the popular perception of the opposite.
• Why the difference between 'helpful' and 'harmless' A.I. matters more than you may think.
• The hidden “attack surfaces” that emerge when you combine RAG with enterprise data.
• The problems that can happen when you push LLMs beyond their intended context window limits.
• What you can do to ensure your LLMs are Helpful, Honest and Harmless for your particular use cases.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

LLM Benchmarks Are Lying to You (And What to Do Instead), with Sinan Ozdemir

Added on July 8, 2025 by Jon Krohn.

Sensational episode for you today with the illustrious A.I. author, educator and entrepreneur Sinan Ozdemir on how LLM benchmarks are lying to you... and what you can do about it.

Sinan:

• Is Founder and CTO of LoopGenius, a generative A.I. startup.
• Authored several excellent books, including, most recently, the bestselling "Quick Start Guide to Large Language Models".
• Hosts the "Practically Intelligent" podcast.
• Was previously adjunct faculty at The Johns Hopkins University and now teaches several times a month within the O'Reilly platform.
• Is a serial A.I. entrepreneur, including founding a Y Combinator-backed generative A.I. startup way back in 2015 that was later acquired.
• Holds a Master’s in Pure Math from Johns Hopkins.

Today’s episode skews slightly toward our more technical listeners but Sinan excels at explaining complex concepts in a clear way so today’s episode may appeal to any listener of this podcast.

In today’s episode, Sinan details:

• Why the A.I. benchmarks everyone relies on might be lying to you.
• How the leading A.I. labs are gaming the benchmark system.
• Tricks for effectively evaluating LLMs’ capabilities for your use cases.
• What the future of benchmarking will involve, including how to benchmark agentic and multimodal models.
• How a simple question about watermelon seeds reveals the 40% failure rate of even today’s most advanced A.I. models.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

Automating Legal Work with Data-Centric ML (feat. Lilith Bat-Leah)

Added on July 1, 2025 by Jon Krohn.

Today, exceptional communicator Lilith Bat-Leah explains why "Data-Centric ML Research" trumps our typical focus on model capability, with examples from her extensive Legal A.I. background.

Lilith:

• Has over a decade of experience specializing in the application of ML to legal tech.
• Is Senior Director of A.I. Labs at Epiq, a leading LegalTech firm that has over 6,000 employees.
• Has published work on evaluation methods for the use of ML in legal discovery as well as on Data-centric ML Research (DMLR).
• Is co-chair of the DMLR working group at MLCommons and has organized DMLR workshops at the International Conference on Machine Learning (ICML) and ICLR, two of the most important A.I. conferences.
• Holds a degree from Northwestern University, in which she focused on statistics.

Today’s episode will appeal primarily to hands-on practitioners like data scientists, AI/ML engineers and software developers.

In today’s episode, Lilith details:

• How A.I. is revolutionizing the legal industry by automating up to 80% of traditional discovery processes.
• Why 'elusion' is a critical metric that only exists in LegalTech, and what it reveals about machine learning evaluation.
• The surprising reason why we should stop obsessing over model improvements and focus on something that takes up 80% of data scientists’ time instead.
• How she grew from being a temp receptionist to an A.I. lab director by falling in love with statistics.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

Landing $200k+ AI Roles: Real Cases from the SuperDataScience Community, with Kirill Eremenko

Added on June 24, 2025 by Jon Krohn.

As we approach episode #900, the original SuperDataScience Podcast host Kirill Eremenko returns to reflect on what leads to the highest-paying opportunities in AI. This is a special one; enjoy!

Many of you will already know Kirill:

• Founder and CEO of SuperDataScience.com, the eponymous e-learning platform.
• Founded the SuperDataScience Podcast nine years ago and hosted the show until he passed me the reins five years ago.
• With over 3 million students, he’s the most popular data science and A.I. instructor on Udemy.
• He holds a Master’s from The University of Queensland in Australia and a Bachelor’s in Applied Physics and Mathematics from the Moscow Institute of Physics and Technology.

Today’s episode is ideal for anyone looking to advance their data science or A.I. career — or looking to break into a career in this field for the first time.

In today’s episode, Kirill details:

• Why employers are still testing A.I. engineers on basic machine learning fundamentals, even for LLM-focused roles.
• The surprising reason why staying in data science (as opposed to developing an A.I. specialization) could be the right career move for you.
• How one developer discovered the hidden age bias in tech recruiting, and the simple hack to beat it.
• The two critical skill areas that separate amateur A.I. engineers from the pros commanding huge salaries.
• Why the "back to office" movement could give you a competitive advantage in landing a top A.I. role.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

My Four-Hour Agentic AI Workshop is Live and 100% Free

Added on June 22, 2025 by Jon Krohn.

In case you missed my post last week, my four-hour Agentic A.I. workshop (with Ed Donner, pictured) is live. 8,000 people have already watched it! Here's what they're saying:

How to Enable Enterprise AI Transformation, with Strategy Consultant Diane Hare

Added on June 19, 2025 by Jon Krohn.

People, not technical capability, are holding back A.I.'s impact in organizations. In today's episode, Diane Hare explains how to overcome friction and enable strategic A.I. transformation.

Diane:

• Founder and CEO of the New York-based strategic consulting firm BizLove, which has been mobilizing key stakeholders to deliver on enterprise-wide priorities (like A.I. initiatives!) at Fortune 100 companies for seven years.
• Prior to her seven years leading BizLove, spent seven years at EY, the global professional services giant (they have nearly 400,000 employees) formerly known as Ernst & Young.
• Board Member at NANO Nuclear Energy Inc. (NASDAQ: NNE).
• Holds an MBA and was captain of a semi-professional women’s soccer team in New York City!

Today’s episode is well-suited to anyone looking to make an impact with A.I. and automation, which I suspect is about every listener to my podcast!

In today’s episode, Diane details:

• Why people, not technical capability, are holding back A.I.’s transformative power in organizations.
• How to prioritize the items on an enterprise A.I. roadmap.
• Why storytelling is essential for gaining buy-in from stakeholders on an A.I. initiative.
• Her top five tips for enabling A.I. transformation.

This was a super-cool episode for me because Diane's consultancy, BizLove, is a formal partner of my own consultancy, Y Carrot 🥕. While Y Carrot brings rich technical expertise on A.I. (from development through to production deployment), BizLove naturally complements us with their deep experience enabling digital and A.I. transformations of enterprises. Together, we offer every service organizations need to make lasting, impactful improvements with A.I.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

AI (Probably) Isn’t Taking Your Job (At Least Anytime Soon)

Added on June 19, 2025 by Jon Krohn.

Is AI actually taking jobs? Spoiler alert: the data suggest it's not happening yet, despite all the anxiety out there.

Agentic AI Hands-On in Python: MCP, CrewAI and OpenAI Agents SDK (by Jon Krohn and Ed Donner)

Added on June 13, 2025 by Jon Krohn.

Now live! Four hours long and 100% free, this hands-on workshop covers all the Agentic A.I. theory and tools you need to develop and deploy multi-agent teams with Python.

Beautifully shot by a professional film crew (led by the exceptional Lucie McCormick) at the Open Data Science Conference (ODSC) East in Boston a few weeks ago and then meticulously edited by SuperDataScience's inimitable Mario Pombo, this training (within the GenAI-forward Cursor IDE) features all of today's essential agent frameworks:

• OpenAI Agents SDK
• CrewAI
• Anthropic's Model Context Protocol (MCP)

From design considerations through to practical implementation tips, by completing all four modules in this video, you will have all the knowledge and skills needed to create effective multi-agent systems. The four modules are:

• Defining Agents
• Designing Agents
• Developing Agents
• The Future of Agents

The coding elements are led by the wonderful Ed Donner, whom many of you will already know as one of the very best in the world at creating and teaching hands-on A.I. content.

We received rave reviews for the session at ODSC East and the lecture hall was standing-room only for the entire duration, so I anticipate that you'll love it too!

Watch the full training here: youtu.be/LSk5KaEGVk4

The Future of Enterprise AI: Investor Shaun Johnson Reveals What Actually Works

Added on June 11, 2025 by Jon Krohn.

What are the biggest opportunities for A.I. startups? Find out in today's episode with the trailblazing venture capitalist Shaun Johnson, including tricks for gaining enterprise A.I. adoption.

Shaun:

• Co-founder and general partner at AIX Ventures in San Francisco, where he’s led deals into companies including Perplexity, Chroma, and Workhelix.
• He is a former VP of Engineering, Product and Design at Lilt; and a former VP of Product and Design at NimbleRx.
• Holds a Master’s in Electrical Engineering from Stanford University and an MBA from the University of California, Berkeley.

Today’s episode is well-suited to any listener to this podcast. In it, Shaun details:

• How having investment partners like Richard Socher and Christopher Manning, who are practitioners actively building at the cutting edge of A.I., gives AIX Ventures an edge.
• What it takes to become one of the few thousand people in the world pushing the A.I. frontier.
• The surprising strategy that makes enterprise A.I. adoption 10x easier.
• Why some A.I. startups are better off building in 'red oceans' full of competition rather than seeking blue-ocean opportunities.
• The reason big tech companies are buying A.I. talent without acquiring the actual startups.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

In Case You Missed It in May 2025

Added on June 11, 2025 by Jon Krohn.

We had stellar guests in May, including one episode that had the most positive social-media response of any episode ever. In today's "In Case You Missed It" episode, hear the best parts of all my May convos.

The specific conversation highlights included in today's episode are:

• John Roese, Dell Technologies' global CTO and Chief A.I. Officer, on the biggest A.I. opportunities for enterprises in the coming months/years. (This is the episode that received an unprecedented social-media response.)
• The authors of the brand-new O'Reilly book "Python Polars: The Definitive Guide", Jeroen Janssens and Thijs Nieuwdorp, on a real-world Polars success story.
• Space engineer and entrepreneur Mary Spio on solving global talent shortages with A.I.-infused virtual reality hardware.
• Martin Brunthaler, serial entrepreneur/CTO, on how platforms like Adverity allow you to talk with your data in natural language.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

How to Jumpstart Your Data Career (by Applying Like a Scientist), with Avery Smith

Added on June 3, 2025 by Jon Krohn.

Today's fun episode with superstar Avery Smith (>140k LinkedIn and >40k YouTube subscribers) is for folks looking to jumpstart their data career — either landing your first data role or advancing your career. Enjoy!

Avery:

• Is the creator of Data Career Jumpstart, a platform to help working professionals break into, well, data careers (like data analyst or data scientist roles).
• Hosts the popular Data Career Podcast.
• Runs Snow Data Science, an analytics and data-solutions consultancy with clients including the Utah Jazz 🏀
• Previously held data scientist roles at ExxonMobil and Vaporsens.
• Holds a Master’s in Data Analytics from Georgia Tech.

Today’s episode contains helpful tips for anyone looking to advance their career but is particularly intended for listeners who are seeking their first role working with data.

In today’s episode, Avery details:

• How spilling acid on himself led him to becoming a data professional.
• His "Every Turtle Swims Past" learning ladder for breaking into data careers.
• What’s even more important than skills or experience for landing a job.
• How one of his bootcamp students went from delivery driver to data analyst by A/B testing her text messages.
• Which job boards are killing your data career applications.
• Why GitHub is not a portfolio, but what you can use instead.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

We’re In The AI “Trough of Disillusionment” (and that’s Great!)

Added on June 3, 2025 by Jon Krohn.

Today we're diving into a shift happening in the AI landscape right now — one that might surprise you (and perhaps even be worrying!) given all the hype we've been hearing. While tech giants continue pouring billions into AI infrastructure, many organizations are hitting a wall when it comes to actually implementing AI — particularly generative AI — in meaningful ways. Let's explore what the heck is going on.

Conversational AI is Overhauling Data Analytics, with Martin Brunthaler

Added on June 3, 2025 by Jon Krohn.

Fascinating new episode for you from serial entrepreneur/CTO Martin Brunthaler on how GenAI and Agentic A.I. are transforming data analytics today... and how analytics will continue to evolve in the coming years.

Martin Brunthaler:

• CTO of Adverity, an Austrian data analytics platform he co-founded a decade ago and that has since raised over $160m in venture capital.
• Before Adverity, Martin was co-founder and CTO at two other European tech start-ups, giving him over 20 years of combined experience in starting, scaling and exiting companies across multiple industries including eCommerce, media and mobile.
• Holds an engineering diploma (equivalent to a Bachelor's degree) from the Salzburg University of Applied Sciences in Austria.

Today’s episode should interest just about any listener of this podcast because it touches on data analytics, transforming user experiences with modern AI capabilities, and growing tech businesses.

In today’s episode, Martin details:

• How a childhood fascination with computer programming evolved into founding a globally leading platform for marketing data analytics.
• What "data democratization" really means and how the traditional dashboard-based approach to data reporting is failing businesses.
• Why data analysts are spending too much time on "busy work" instead of delivering business value.
• How conversational AI is overhauling how data insights are gleaned for hands-on data practitioners and business users alike.
• His no-nonsense tips for tech startup success.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

The “State of AI” Report 2025

Added on May 27, 2025 by Jon Krohn.

In today’s Five-Minute Friday episode, I’ll cover the five biggest takeaways from the 2025 edition of the renowned AI Index Report, which was published a few weeks ago by the Stanford University Institute for Human-Centered AI. Every year this popular report (often called the “State of AI” report) covers the biggest technical advances, new achievements in benchmarking, investment flowing into AI and more. A link to the colossal full report is in the show notes; today’s episode will cover the five most essential items.