Jon Krohn

How to Jailbreak LLMs (and How to Prevent It), with Michelle Yi

Added on August 19, 2025 by Jon Krohn.

Today, extraordinary Michelle Yi details LLM jailbreaking (as well as data poisoning, prompt stealing and slop squatting!) and how to prevent it. Scary content but she makes it funny and entertaining, enjoy!

When I say "extraordinary", I'm not exaggerating. Michelle:

  • Finished her undergrad at the same age as most folks finish high school.

  • While working full-time as an engineering lead at IBM on Jeopardy-playing Watson, she was also a professional violinist in the New York Philharmonic!

  • In the past decade, has held an impressive list of AI leadership roles at Bay Area startups.

  • Now is helping (startlingly underrepresented) women in tech startups and venture capital through co-founding Generationship, being a venture partner in (ironically named) The Tech Bros and a board member for Women In Data™️.

Today's episode skews a bit toward hands-on practitioners but Michelle does such a wonderful job of communicating complex concepts and making them relevant to modern global events that anyone might love this episode.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

In Data Science, Interview, Podcast, SuperDataScience, YouTube Tags superdatascience, llm, ai, jailbreaking, trustworthyAI

Data Lakes 101 (and Why They’re Key for AI Models), with Oz Katz

Added on August 15, 2025 by Jon Krohn.

Today, Oz Katz joins me to explain what data lakes are as well as what data infrastructure's needed to train and run modern A.I. models (such as multi-modal LLMs). This is an informative one!

Oz Katz:

  • Is co-founder and CTO of lakeFS, which looks and feels like Git but is for data (instead of software) versioning and collaboration.

  • Was previously co-founder/CTO at a number of other tech startups, including Swayy, which was acquired by Similarweb in 2015.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

In Data Science, YouTube, SuperDataScience, Podcast, Interview Tags superdatascience, ai, data, datalake, datawarehouse, git, versioning

LLM Pre-Training and Post-Training 101, with Julien Launay

Added on August 12, 2025 by Jon Krohn.

How are cutting-edge LLMs trained? Find out in today's exceptional episode with Julien Launay, who digs into pre-training (supervised learning) and post-training (reinforcement learning) in eloquent detail.

Julien:

• CEO and co-founder of Adaptive ML, a remarkably fast-growing startup focused on enabling A.I. models to learn from experience.

• Previously led the extreme-scale research teams at Hugging Face and LightOn, where he helped develop state-of-the-art open-source models.

• Organizer of the "Efficient Systems for Foundation Models" workshop at ICML (the prestigious International Conference on Machine Learning).

Today's episode will appeal most to hands-on practitioners but other folks who are open to getting into the technical weeds on Large Language Model (LLM) training should also listen in.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

In Data Science, Podcast, SuperDataScience, YouTube Tags superdatascience, llm, llms, ai, model training, reinforcementlearning

In Case You Missed It in July 2025

Added on August 9, 2025 by Jon Krohn.

Wow, what a month... ICYMI, here are the top moments from the conversations I had with my podcast guests in July:

  1. AI-lab director Lilith Bat-Leah on why data-centric machine learning research (DMLR) is the future of AI.

  2. Prolific author and speaker Sinan Ozdemir (last week in San Francisco, I witnessed him deliver the *best* talk I've ever seen) on whether we can trust LLM benchmarks.

  3. Bloomberg's Dr. Sebastian Gehrmann on why generic LLMs fail in regulated industries like finance and healthcare.

  4. AI entrepreneur Dr. Zohar Bronfman on how AI can predict what you'll do next... before you're even consciously aware of your decision.

  5. Microsoft AI researcher Dr. Robert Osazuwa Ness with step-by-step guidance on how to build causal AI models using PyTorch.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

In Five-Minute Friday, Data Science, Podcast, SuperDataScience, YouTube Tags superdatascience, data science, machine learning, ai, podcast

The Future of Python Notebooks is Here, with Marimo’s Dr. Akshay Agrawal

Added on August 5, 2025 by Jon Krohn.

I love Jupyter Notebooks... but they have a lot of painful "features". Today's guest Akshay Agrawal has built marimo, which resolves these issues and adds in lots of clever new innovations.

More on Akshay:

  • Co-founder and CEO of marimo.

  • Carried out a PhD in electrical engineering at Stanford University.

  • Previously held software engineering roles at Google and Netflix.

Today's episode will appeal most to hands-on practitioners. In it, we cover:

  • Why 96% of Jupyter notebooks fail to reproduce their original results.

  • How reactive notebooks can transform a simple slider adjustment into automatic recalculation across your entire analysis — like Excel for data science.

  • How you can now intuitively select data points with your mouse in a scatter plot and instantly get them back as a Python dataframe for analysis.

  • How one marimo notebook can simultaneously be an executable script, an importable Python module, and even a fully functional click-and-point UI!
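
To make the "reactive notebook" idea concrete, here is a minimal sketch in plain Python. It is not marimo's API or implementation: the `ReactiveGraph` class and the cell names are hypothetical, and the sketch assumes cells are registered in dependency order. It only illustrates the general principle that changing one input (the slider) automatically re-runs every cell that depends on it.

```python
from typing import Callable, Dict, List


class ReactiveGraph:
    """Toy reactive engine: a cell re-runs when a value it reads changes."""

    def __init__(self) -> None:
        self.values: Dict[str, object] = {}
        # Each entry is (output name, input names, function), in definition order.
        self.cells: List[tuple] = []

    def set_input(self, name: str, value: object) -> None:
        """Set an input (e.g. a slider) and recompute everything downstream."""
        self.values[name] = value
        self._propagate({name})

    def cell(self, output: str, inputs: List[str], fn: Callable[..., object]) -> None:
        """Register a cell that computes `output` from `inputs`."""
        self.cells.append((output, inputs, fn))
        if all(i in self.values for i in inputs):
            self.values[output] = fn(*(self.values[i] for i in inputs))

    def _propagate(self, changed: set) -> None:
        # Assumption: cells were registered in dependency order, so one pass suffices.
        for output, inputs, fn in self.cells:
            if changed.intersection(inputs) and all(i in self.values for i in inputs):
                self.values[output] = fn(*(self.values[i] for i in inputs))
                changed.add(output)


graph = ReactiveGraph()
graph.set_input("slider", 2)
graph.cell("scaled", ["slider"], lambda s: [s * x for x in (1, 2, 3)])
graph.cell("total", ["scaled"], lambda xs: sum(xs))

graph.set_input("slider", 5)  # moving the slider re-runs both dependent cells
print(graph.values["scaled"], graph.values["total"])
```

In a classic notebook, the equivalent of `total` would stay stale until you re-ran its cell by hand, which is one way hidden state creeps in.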

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

In Data Science, Interview, Podcast, SuperDataScience, YouTube Tags superdatascience, python, data science, machine learning, ai, jupyter

AI is Disrupting Journalism: The Good, The Bad and The Opportunity

Added on August 1, 2025 by Jon Krohn.

Back in Episode #896, I argued that AI probably won’t be taking your job anytime soon. I followed that up in Episode #904 by discussing how some industries are nevertheless being rapidly and thoroughly disrupted by AI, focusing on how AI is overhauling the advertising industry in particular. My LinkedIn post announcing that episode generated a lot of discussion in the comments and garnered over 50,000 impressions within the first few hours. That led me to the idea of a series of Friday episodes covering how particular industries are, like advertising, being rapidly and thoroughly overhauled by AI, with lessons for everyone on how we can adapt to this inevitable change and potentially leverage it to thrive professionally.

In Data Science, Five-Minute Friday, Podcast, SuperDataScience, YouTube Tags generativeai, ai, journalism, SuperDataScience

Causal AI, with Dr. Robert Osazuwa Ness

Added on July 29, 2025 by Jon Krohn.

Today's guest, Dr. Robert Osazuwa Ness, wrote the popular new book "Causal A.I." so enjoy this episode on what Causal A.I. is and what advantages it has over "normal" (correlation-based) models.

Robert:

• Senior Researcher at "Microsoft Research A.I."

• His research focuses on statistical and causal inference techniques for controllable, human-aligned multimodal models.

• He is also founder of Altdeep.ai, where he teaches professionals advanced topics in machine learning.

• Holds a PhD in Statistics from Purdue University in Indiana.

Today’s episode will resonate most with hands-on practitioners like data scientists, statisticians and A.I. engineers.

In today’s episode, Robert details:

• The three-rung ladder of causation that determines what types of causal questions you can actually answer with your data.

• The surprising connections between Bayesian networks, graphical models and modern causal A.I.

• Why A.I. systems have been dominated by correlation-based learning and what's stopping them from adopting causal reasoning like humans and animals naturally do.

• How tools like PyTorch, Pyro, and DoWhy are revolutionizing causal inference by separating statistical complexity from causal assumptions.

• How large language models like GPT-4o can act as "causal knowledge bases" and outperform traditional causal methods in some scenarios.
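
The gap between the first two rungs of the ladder (association versus intervention) can be illustrated with a small simulation. This is a sketch under stated assumptions, not an example from the episode or the book: the probabilities, the hidden confounder `z`, and the helper names `simulate` and `mean_y` are all made up for illustration, and it uses only the standard library rather than Pyro or DoWhy.

```python
import random


def simulate(n, rng, do_x=None):
    """Draw n samples of (x, y). If do_x is given, X is forced to that value (an intervention)."""
    samples = []
    for _ in range(n):
        z = rng.random() < 0.5                      # hidden confounder
        if do_x is None:
            x = rng.random() < (0.9 if z else 0.1)  # observational: Z strongly drives X
        else:
            x = do_x                                # intervention: the Z -> X link is cut
        p_y = 0.1 + 0.6 * z + 0.1 * x               # assumed true causal effect of X on Y: 0.1
        y = rng.random() < p_y
        samples.append((x, y))
    return samples


def mean_y(samples, x_value):
    """Average outcome among samples where X equals x_value."""
    ys = [y for x, y in samples if x == x_value]
    return sum(ys) / len(ys)


rng = random.Random(0)

# Rung one (association): compare groups as they occur naturally.
observed = simulate(20_000, rng)
observational_gap = mean_y(observed, True) - mean_y(observed, False)

# Rung two (intervention): set X ourselves and compare outcomes.
treated = simulate(20_000, rng, do_x=True)
control = simulate(20_000, rng, do_x=False)
interventional_gap = mean_y(treated, True) - mean_y(control, False)

print(f"observational gap:  {observational_gap:.2f}")
print(f"interventional gap: {interventional_gap:.2f}")
```

Because the confounder raises both the chance of treatment and the chance of the outcome, the observational gap comes out far larger than the effect built into the simulation, while the interventional gap lands close to it.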

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

In Data Science, Interview, Podcast, SuperDataScience, YouTube Tags superdatascience, causality, causalai, ai, python

AI Agents Blackmail Humans 96% of the Time (Agentic Misalignment)

Added on July 25, 2025 by Jon Krohn.

Today's topic is a groundbreaking (and disturbing!) piece of research from Anthropic that should fundamentally change how we think about AI safety and deployment in enterprise environments. If you're working with or planning to deploy AI agents in your organization, this episode could be the most important five minutes you spend this week.

In Five-Minute Friday, Podcast, SuperDataScience, YouTube Tags ai, Anthropic, ai agents, agenticai

Guest Co-Host of Last Week in A.I., Episode #217

Added on July 25, 2025 by Jon Krohn.

The *only* podcast I listen to is "Last Week in A.I." — and, for this week's episode, I had the privilege of co-hosting it! Get caught up on all the biggest A.I.-related news of the past week, including:

• OpenAI's new "ChatGPT Agent", which can control computers and perform a wide range of tasks, showcasing powerful performance benchmarks and potential applications in business and research.

• xAI's racy new romantic chat... complete with stripping down to lingerie.

• Major business moves in the A.I. space, including Google's acquisition of Windsurf's top talent after OpenAI's deal fell through, and several notable hires by Meta from OpenAI and Apple.

Thanks to Andrey Kurenkov and Jeremie Harris for doing such a tremendous job of curating and hosting this weekly news show — as well as for welcoming me back to co-host again :)

This week's episode of "Last Week in AI" is #217. You can catch it on your favorite podcasting app or on YouTube.

In Data Science, Interview, Podcast, YouTube Tags lastweekinai, ai, ainews, ChatGPT, agenticai, aiagent, xai

Neuroscience, AI and the Limitations of LLMs, with Dr. Zohar Bronfman

Added on July 23, 2025 by Jon Krohn.

I was blown away by today's guest, the brilliant dual-PhD Zohar Bronfman, as we discussed neuroscience, A.I., and why predictive models offer a better ROI than generative ones. Enjoy!

Dr. Bronfman:

• Is the co-founder and CEO of Pecan AI, a predictive analytics platform that has raised over $100m in venture capital.

• Holds two PhDs — one in computational neuroscience and another in philosophy — bringing a deep, multidisciplinary lens to the design and impact of A.I. systems.

• Focuses on the evolution of machine learning from statistical models to agentic systems that influence real-world outcomes.

Today’s episode will be fascinating for every listener.

In it, Zohar details:

• The trippy implications of the reality that your brain makes decisions hundreds of milliseconds before you're consciously aware of them.

• The intelligence feat that bumblebees can do that current A.I. cannot, with implications for the realization of human-like intelligence in machines.

• Why predictive models are more important than generative models for businesses but how generative LLMs can nevertheless make building and deploying predictive models much easier and more accessible.

• The rollercoaster journey that led him to create a sensationally successful A.I. startup immediately upon finishing his academic degrees.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

In Data Science, Interview, Podcast, SuperDataScience, YouTube Tags superdatascience, neuroscience, ai, generativeai, llm

How Prof. Jason Corso Solved Computer Vision’s Data Problem

Added on July 21, 2025 by Jon Krohn.

Exceptional episode for you today with Prof. Jason Corso, in which he details how he's tackling the biggest problem in machine vision. Jason's super sharp and very well-spoken... don't miss this one!

Jason:

• Professor of Robotics, Electrical Engineering and Computer Science at the prestigious University of Michigan, with over 20 years of research spanning video understanding, robotics, and AI.

• Has published over 150 academic papers that, together, have been cited over 20,000 times.

• Co-founder and Chief Science Officer at Voxel51, a leading platform for visual AI development.

• His work bridges academic innovation and real-world impact, earning him more major honors than I have space to list!

Today’s episode skews a bit toward hands-on practitioners like data scientists and AI/ML engineers, particularly anyone tackling computer-vision problems. That said, Jason is a charismatic and exceptional communicator so perhaps any listener to this podcast will enjoy today’s episode.

In it, Jason details:

• How his research spinout, Voxel51, is solving the biggest bottleneck in computer vision.

• The surprising way autonomous vehicles learn to handle accidents they've never seen.

• Why the secret to better AI models isn't better algorithms — it's something else that’s hiding in plain sight.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

In Data Science, Interview, Podcast, SuperDataScience, YouTube Tags superdatascience, machinevision, Computer Vision, ai, entrepreneur

Why RAG Makes LLMs Less Safe (And How to Fix It), with Bloomberg’s Dr. Sebastian Gehrmann

Added on July 15, 2025 by Jon Krohn.

In today's episode, A.I. researcher Dr. Sebastian Gehrmann details what RAG is and why it makes LLMs *less* safe... despite popular perception of the opposite.

Sebastian:

  • Is Head of Responsible A.I. at Bloomberg, the New York-based financial, software, data, and media company that (with 20,000 employees) is huge.

  • Previously, as Head of NLP at Bloomberg, he directed the development and adoption of language technology to bring the best A.I.-enhanced products to the Bloomberg Terminal.

  • Prior to Bloomberg, was a senior researcher at Google, where he worked on the development of large language models, including the groundbreaking BLOOM and PaLM models.

  • He holds a Ph.D. in computer science from Harvard University.

Today’s episode skews slightly toward our more technical listeners like data scientists, A.I. engineers and software developers, but anyone who’d like to be up to date on the latest A.I. research may want to give it a listen.

In today’s episode, Sebastian details:

  • The shocking discovery that retrieval augmented generation (RAG) actually makes LLMs LESS safe, despite the popular perception of the opposite.

  • Why the difference between 'helpful' and 'harmless' A.I. matters more than you may think.

  • The hidden “attack surfaces” that emerge when you combine RAG with enterprise data.

  • The problems that can happen when you push LLMs beyond their intended context window limits.

  • What you can do to ensure your LLMs are Helpful, Honest and Harmless for your particular use cases.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

In Data Science, Interview, Podcast, SuperDataScience, YouTube Tags superdatascience, ai, llm, generativeai, rag, enterpr

A.I. is Disrupting the Entire Advertising Industry

Added on July 14, 2025 by Jon Krohn.

A few Fridays ago, in Episode #896, I made the case that AI probably isn’t going to take your job anytime soon. AI is, however, quite disruptive as more and more tasks are automated. Some industries are being so thoroughly disrupted that people within them need to take note now: if they don’t adapt, their role (maybe even their whole company) could be at risk.

In Five-Minute Friday, Podcast, SuperDataScience, YouTube Tags ai, advertising, cannes, data science, SuperDataScience

LLM Benchmarks Are Lying to You (And What to Do Instead), with Sinan Ozdemir

Added on July 8, 2025 by Jon Krohn.

Sensational episode for you today with the illustrious A.I. author, educator and entrepreneur Sinan Ozdemir on how LLM benchmarks are lying to you... and what you can do about it.

Sinan:

  • Is Founder and CTO of LoopGenius, a generative A.I. startup.

  • Authored several excellent books, including, most recently, the bestselling "Quick Start Guide to Large Language Models".

  • Hosts the "Practically Intelligent" podcast.

  • Was previously adjunct faculty at The Johns Hopkins University and now teaches several times a month within the O'Reilly platform.

  • Serial A.I. entrepreneur, including founding a Y Combinator-backed generative A.I. startup way back in 2015 that was later acquired.

  • Holds a Master’s in Pure Math from Johns Hopkins.

Today’s episode skews slightly toward our more technical listeners but Sinan excels at explaining complex concepts in a clear way so today’s episode may appeal to any listener of this podcast.

In today’s episode, Sinan details:

  • Why the A.I. benchmarks everyone relies on might be lying to you.

  • How the leading A.I. labs are gaming the benchmark system.

  • Tricks to actually effectively evaluate LLMs’ capabilities for your use cases.

  • What the future of benchmarking will involve, including how to benchmark agentic and multimodal models.

  • How a simple question about watermelon seeds reveals the 40% failure rate of even today’s most advanced A.I. models.
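
One generic way to evaluate a model on your own use case, rather than trusting a public benchmark, is to write a small set of test prompts with explicit pass/fail checks and measure the failure rate. The sketch below is an assumption-laden illustration, not Sinan's method: `stub_model` is a hypothetical stand-in for a real LLM call, and the prompts and checkers are invented for the example.

```python
from typing import Callable, List, Tuple

# A test case pairs a prompt with a checker that decides whether the answer is acceptable.
TestCase = Tuple[str, Callable[[str], bool]]


def failure_rate(model: Callable[[str], str], cases: List[TestCase]) -> float:
    """Fraction of cases whose answer fails its checker."""
    if not cases:
        raise ValueError("need at least one test case")
    failures = sum(1 for prompt, passes in cases if not passes(model(prompt)))
    return failures / len(cases)


def stub_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; swap in your own client here."""
    canned = {
        "What is 2 + 2?": "4",
        "What is the capital of France?": "Paris",
        "Spell 'data' backwards.": "atad",
        "How many days are in a leap year?": "365",  # deliberately wrong answer
    }
    return canned.get(prompt, "")


cases: List[TestCase] = [
    ("What is 2 + 2?", lambda a: a.strip() == "4"),
    ("What is the capital of France?", lambda a: "paris" in a.lower()),
    ("Spell 'data' backwards.", lambda a: a.strip() == "atad"),
    ("How many days are in a leap year?", lambda a: "366" in a),
]

rate = failure_rate(stub_model, cases)
print(f"failure rate: {rate:.0%}")
```

Keeping the checkers as plain functions means the same harness works for exact-match answers, keyword checks, or a stricter rubric you add later.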

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

In Data Science, Podcast, SuperDataScience, YouTube, Interview Tags superdatascience, ai, llm, llms, benchmarks

In Case You Missed It in June 2025

Added on July 5, 2025 by Jon Krohn.

We had exceptional guests on my podcast in June. In today's "In Case You Missed It" episode, hear the best parts of all my June convos. Here's a quick summary to tantalize you:

  1. Strategy consultant Diane Hare with five tricks for gaining buy-in on A.I. transformation in your organization.

  2. Renowned data-career educator Avery Smith on the two portfolio projects every aspiring data analyst should build.

  3. SuperDataScience founder Kirill Eremenko on what you're missing if you're struggling to land an A.I. job.

  4. San Fran-based venture capitalist Shaun Johnson on the traits that make a great A.I. startup founder.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

In Data Science, Five-Minute Friday, Interview, Podcast, SuperDataScience, YouTube Tags superdatascience, datascience, machine learning, ai, podcast

Automating Legal Work with Data-Centric ML (feat. Lilith Bat-Leah)

Added on July 1, 2025 by Jon Krohn.

Today, exceptional communicator Lilith Bat-Leah explains why "Data-Centric ML Research" trumps our typical focus on model capability, with examples from her extensive Legal A.I. background.

Lilith:

  • Has over a decade of experience specializing in the application of ML to legal tech.

  • Is Senior Director of A.I. Labs at Epiq, a leading LegalTech firm that has over 6000 employees.

  • Has published work on evaluation methods for the use of ML in legal discovery as well as on Data-centric ML Research (DMLR).

  • Is co-chair of the DMLR working group at MLCommons and has organized DMLR workshops at ICML (the International Conference on Machine Learning) and ICLR, two of the most important A.I. conferences.

  • Holds a degree from Northwestern University, in which she focused on statistics.

Today’s episode will appeal primarily to hands-on practitioners like data scientists, AI/ML engineers and software developers.

In today’s episode, Lilith details:

  • How A.I. is revolutionizing the legal industry by automating up to 80% of traditional discovery processes.

  • Why 'elusion' is a critical metric that only exists in LegalTech — and what it reveals about machine learning evaluation.

  • The surprising reason why we should stop obsessing over model improvements and focus on something that takes up 80% of data scientists’ time instead.

  • How she grew from being a temp receptionist to an A.I. lab director by falling in love with statistics.
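
For readers who have not met the elusion metric, here is a minimal sketch based on how the term is commonly used in e-discovery, which may differ in detail from the definition discussed in the episode: elusion is the share of the discard pile (documents the model marked non-relevant) that is actually relevant. The function name and the example counts are illustrative assumptions.

```python
def elusion(false_negatives: int, true_negatives: int) -> float:
    """Share of discarded (predicted non-relevant) documents that are actually relevant."""
    discarded = false_negatives + true_negatives
    if discarded == 0:
        raise ValueError("discard pile is empty")
    return false_negatives / discarded


# Hypothetical review: reviewers sample 1,000 documents from the discard pile
# and find that 30 of them were relevant after all.
rate = elusion(false_negatives=30, true_negatives=970)
print(f"elusion: {rate:.1%}")
```

Unlike recall, this can be estimated by sampling only the discard pile, which is part of why it suits large document reviews.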

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

In Data Science, Interview, Podcast, SuperDataScience, YouTube Tags superdatascience, machine learning, ai, dmlr, LegalTech, legalai

95-Year-Old Annie on How to Stay Healthy and Happy

Added on June 27, 2025 by Jon Krohn.

Our 900th Episode! It almost goes by as quickly as 95 years! By popular demand, my grandmother Annie returns to the podcast with wisdom on staying happy and healthy at any age.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

In Five-Minute Friday, Personal Improvement, Podcast, SuperDataScience, YouTube Tags superdatascience, podcast, life, wisdom, aging, happiness

Landing $200k+ AI Roles: Real Cases from the SuperDataScience Community, with Kirill Eremenko

Added on June 24, 2025 by Jon Krohn.

As we approach episode #900, the original SuperDataScience Podcast host Kirill Eremenko returns to reflect on what leads to the highest-paying opportunities in AI. This is a special one; enjoy!

Many of you will already know Kirill:

  • Founder and CEO of SuperDataScience.com, the eponymous e-learning platform.

  • Founded the SuperDataScience Podcast nine years ago and hosted the show until he passed me the reins five years ago.

  • With over 3 million students, he’s the most popular data science and A.I. instructor on Udemy.

  • He holds a Master’s from The University of Queensland in Australia and a Bachelor’s in Applied Physics and Mathematics from the Moscow Institute of Physics and Technology.

Today’s episode is ideal for anyone looking to advance their data science or A.I. career — or looking to break into a career in this field for the first time.

In today’s episode, Kirill details:

  • Why employers are still testing A.I. engineers on basic machine learning fundamentals — even for LLM-focused roles.

  • The surprising reason why staying in data science (as opposed to developing an A.I. specialization) could be the right career move for you.

  • How one developer discovered the hidden age bias in tech recruiting — and the simple hack to beat it.

  • The two critical skill areas that separate amateur A.I. engineers from the pros commanding huge salaries.

  • Why the "back to office" movement could give you a competitive advantage in landing a top A.I. role.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

In Data Science, Interview, Podcast, Professional Development, SuperDataScience, YouTube Tags SuperDataScience, Data Science, AI, ai, data careers, career

My Four-Hour Agentic AI Workshop is Live and 100% Free

Added on June 22, 2025 by Jon Krohn.

In case you missed my post last week, my four-hour Agentic A.I. workshop (with Ed Donner, pictured) is live. 8,000 people have already watched it! Here's what they're saying:

In Data Science, Five-Minute Friday, Announcement, Live Training, Podcast, SuperDataScience, YouTube Tags agenticai, aiagents, llms, generativeai, python, crewai, openai, mcp

How to Enable Enterprise AI Transformation, with Strategy Consultant Diane Hare

Added on June 19, 2025 by Jon Krohn.

People, not technical capability, are holding back A.I.'s impact in organizations. In today's episode, Diane Hare explains how to overcome friction and enable strategic A.I. transformation.

Diane:

  • Founder and CEO of the New York-based strategic consulting firm BizLove, which has been mobilizing key stakeholders to deliver on enterprise-wide priorities (like A.I. initiatives!) at Fortune 100 companies for seven years.

  • Prior to her seven years leading BizLove, spent seven years at EY, the global professional services giant (they have nearly 400,000 employees) formerly known as Ernst & Young.

  • Board Member at NANO Nuclear Energy Inc. (NASDAQ: NNE)

  • Holds an MBA and was captain of a semi-professional women’s soccer team in New York City!

Today’s episode is well-suited to anyone looking to make an impact with A.I. and automation, which I suspect is about every listener to my podcast!

In today’s episode, Diane details:

  • Why people, not technical capability, are holding back A.I.’s transformative power in organizations.

  • How to prioritize the items on an enterprise A.I. roadmap.

  • Why storytelling is essential for gaining buy-in from stakeholders on an A.I. initiative.

  • Her top five tips for enabling A.I. transformation.

This was a super-cool episode for me because Diane's consultancy, BizLove, is a formal partner of my own consultancy, Y Carrot 🥕. While Y Carrot brings rich technical expertise on A.I. (from development through to production deployment), BizLove naturally complements us with their deep experience enabling digital and A.I. transformations of enterprises. Together, we offer every service organizations need to make lasting, impactful improvements with A.I.

The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.

In Data Science, Interview, Podcast, SuperDataScience, YouTube Tags SuperDataScience, ai, EnterpriseAI, DigitalTransformation, Strategy