Assessing the fastest-growing job is tricky. For example, job-posting data isn’t great because there could be lots of duplicate postings out there, or a lot of the postings could be going unfilled. Another big issue is defining exactly what a job is: The exact same responsibilities could be associated with the job title “data scientist”, “data engineer” or “ML engineer”, depending on the job titles a particular company decides to go with. So, whoever’s evaluating job growth is going to end up bucketing groups of related jobs and responsibilities into one standardized job-title bucket, probably these days in a largely automated, data-driven way. If you dug into individual examples, I’m sure you’d find lots of job-title standardizations you disagreed with, but some kind of standardization approach is essential to ensuring identical roles with slightly different job titles get counted as the same thing.
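To make the bucketing idea concrete, here is a minimal Python sketch of what de-duplicating postings and standardizing job titles could look like. Everything in it is an assumption for illustration: the `BUCKET_RULES` keywords, the "Data & ML Practitioner" bucket name and the helper functions are hypothetical, and a real analysis would more likely rely on a learned, data-driven mapping than on hand-written rules.

```python
import re
from collections import Counter

# Hypothetical keyword rules: each tuple of title phrases maps to one
# standardized bucket. These phrases and the bucket name are illustrative only.
BUCKET_RULES = [
    (
        ("data scientist", "data engineer", "ml engineer", "machine learning engineer"),
        "Data & ML Practitioner",
    ),
]


def normalize(title):
    """Lowercase a title, replace punctuation with spaces and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", title.lower())
    return re.sub(r"\s+", " ", cleaned).strip()


def standardize(title):
    """Return the standardized bucket for a raw job title, or "Other"."""
    padded = f" {normalize(title)} "
    for phrases, bucket in BUCKET_RULES:
        # Pad with spaces so we only match whole words
        # (e.g. "html engineer" should not match "ml engineer").
        if any(f" {phrase} " in padded for phrase in phrases):
            return bucket
    return "Other"


def count_roles(postings):
    """Count postings per bucket after dropping duplicate (company, title) pairs."""
    unique = {(company.strip().lower(), normalize(title)) for company, title in postings}
    return Counter(standardize(title) for _, title in unique)


postings = [
    ("Acme", "Senior Data Scientist"),
    ("Acme", "senior data scientist"),  # duplicate posting, counted once
    ("Beta", "ML Engineer"),
    ("Gamma", "Data Engineer II"),
    ("Delta", "HTML Engineer"),
]
counts = count_roles(postings)
```

In this toy example the three differently titled data roles land in the same bucket, while the duplicate posting is only counted once. You could easily disagree with any individual rule, which is exactly the trade-off described above.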
Exponential Views on AI and Humanity’s Greatest Challenges, with Azeem Azhar
Today, the famed futurist Azeem Azhar eloquently details the exponential forces that are overhauling society — and why A.I. is essential for solving humanity's biggest challenges. This is a special episode; don't miss it!
In case you aren't familiar with his legendary name already, Azeem:
Is creator of the invaluable "Exponential View" newsletter (>100k subscribers).
Hosts the "Exponential View" podcast (well-known guests include Tony Blair and Andrew Ng).
Hosted the Bloomberg TV show "Exponentially" (guests include Sam Altman).
Holds fellowships at Stanford University and Harvard Business School.
Was Founder & CEO of PeerIndex, a venture capital-backed machine-learning startup that was acquired in 2014.
Holds an MA in PPE (Politics, Philosophy and Economics) from the University of Oxford.
Today’s episode will appeal to any listener. In it, Azeem details:
The exponential forces that will overhaul society in the coming decades.
Why AI is essential for solving humanity's biggest challenges.
His own cutting-edge, personal use of A.I. agents, LLMs, and automation.
Why there's no 'solid ground' in the future of work and how we can adapt.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
Generative AI for Business, with Kirill Eremenko and Hadelin de Ponteves
Craving an intro to building and deploying commercially successful Generative A.I. applications? In today's episode, superstar data-science instructors Kirill and Hadelin (>5 million students between them) will fill you in!
Kirill Eremenko is one of our two guests today. He's:
Founder and CEO of SuperDataScience, an e-learning platform.
Founder of the SuperDataScience Podcast, which he launched in 2016 and hosted until he passed me the reins four years ago.
Our second guest is Hadelin de Ponteves:
Was a data engineer at Google before becoming a content creator.
In 2020, took a break from Data Science content to produce and star in a Bollywood film featuring "Miss Universe" Harnaaz Sandhu.
Together, Kirill and Hadelin:
Have created dozens of data science courses; they are the most popular data science instructors on the Udemy platform, with over five million students between them!
Also co-founded CloudWolf, an education platform for quickly mastering Amazon Web Services (AWS) certification.
And, in today’s episode, they announce (for the first time anywhere!) another brand-new venture they co-founded.
Today’s episode is intended for anyone who’s interested in real-world, commercial applications of Generative A.I. — a technical background is not required.
In today’s episode, Kirill and Hadelin detail:
What generative A.I. models like Large Language Models are and how they fit within the broader category of “Foundation Models”.
The 12 crucial factors to consider when selecting a foundation model for a given application in your organization.
The 8 steps to ensuring foundation models are deployed with commercial success.
Many real-world examples of how companies are customizing A.I. models quickly and at remarkably low cost.
In Case You Missed It in December 2024
Today's "In Case You Missed It" episode... is one not to miss! Several of the most fascinating conversations I've ever had on the SuperDataScience Podcast I host happened in December.
The specific conversation highlights included in today's episode are:
1. The legendary Dr. Andrew Ng on why LLM cost doesn't matter for your A.I. proof of concept.
2. Building directly on Andrew's segment, CTO (and my fellow Nebula.io co-founder) Ed Donner on how to choose the right LLM for a given application.
3. Extremely intelligent and clear-spoken Dr. Eiman Ebrahimi (CEO of Protopia AI) on the future of autonomous systems and data security in our Agentic A.I. future.
4. From our 2024 recap episode, Sadie St. Lawrence's three biggest A.I. "wow" moments of the year... as well as the biggest flop of the year. (One company was behind both!)
5. Harvard/MIT humanist chaplain (and bestselling author on tech in society) Greg Epstein on the ethics of accelerating A.I. advancements. Should we, for example, consider slowing A.I. progress down?
2025 AI and Data Science Predictions, with Sadie St. Lawrence
Happy New Year! To prepare you for 2025, today's guest is the clairvoyant Sadie St. Lawrence, who predicts what the biggest A.I. trends will be in the year ahead. We also pick the A.I. winners and losers of 2024.
In a bit more detail, in today’s episode (which will appeal to technical and non-technical listeners alike):
• We cover how Sadie’s predictions for 2024 (which she made a year ago on this show) panned out.
• We award our “wow moment” of 2024, our comeback of the year, our disappointment of the year and our overall winner of 2024.
• And then, of course, we speculate on the five biggest trends to prepare for in 2025.
As with our 2022, 2023 and 2024 predictions episodes, our special guest again this year is Sadie St. Lawrence, who is:
• A data science and machine learning instructor whose content has been enjoyed by over 600,000 students.
• The Founder and CEO of the Human Machine Collaboration Institute as well as being founder and chair of Women In Data™️, a community of over 60,000 women across 55 countries.
• A member of multiple start-up boards.
• The host of the Data Bytes podcast.
AI Engineering 101, with Ed Donner
My holiday gift to you is my Nebula.io co-founder Ed Donner, one of the most brilliant, articulate people I know. In today's episode, Ed introduces the exciting, in-demand "A.I. Engineer" career — what's involved and how to become one.
After working daily alongside this world-class mind and exceptional communicator for nearly a decade, it is at long last my great pleasure to have the extraordinary Ed as my podcast guest. Ed:
• Is co-founder and CTO of Nebula, a platform that leverages generative and encoding A.I. models to source, understand, engage and manage talent.
• Previously, was co-founder and CEO of an A.I. startup called untapt that was acquired in 2020.
• Prior to becoming a tech entrepreneur, Ed had a 15-year stint leading technology teams on Wall Street, at the end of which he was a Managing Director at JPMorganChase, leading a team of 300 software engineers.
• He holds a Master’s in Physics from the University of Oxford.
Today’s episode will appeal most to hands-on practitioners, particularly those interested in becoming an A.I. Engineer or leveling up their command of A.I. Engineering skills.
In today’s episode, Ed details:
• What an A.I. Engineer (also known as an LLM Engineer) is.
• How the data indicate A.I. Engineers are in as much demand today as Data Scientists.
• What an A.I. Engineer actually does, day to day.
• How A.I. Engineers decide which LLMs to work with for a given task, including considerations like open- vs closed-source, what model size to select and what leaderboards to follow.
• Tools for efficiently training and deploying LLMs.
• LLM-related techniques including RAG and Agentic A.I.
Making Enterprise Data Ready for AI, with Anu Jain and Mahesh Kumar
Today's episode features execs (from fast-growing, VC-backed A.I. startups) Anu Jain and Mahesh Kumar elucidating how enterprises can prepare and manage their data for powerful A.I. applications.
In a bit more detail, today's guests are:
• Anu Jain — CEO of Nexus Cognitive.
• Mahesh Kumar — CMO of Acceldata (he has an engineering background and still writes code!).
This episode was filmed live at Insight Partners' ScaleUp:AI conference in New York last month.
The episode features highlights of a session I hosted at ScaleUp:AI on “Managing Data to Embrace an A.I.-First Mindset for Enterprises”. It should be interesting to folks looking to make A.I. implementations effective in large organizations that have lots of data.
In the episode, Anu and Mahesh detail:
• How a tiny data error can lead to millions of dollars in losses for an enterprise.
• Why data storage isn't a major cost driver anymore (and what is!)
• What the heck data governance actually is and why it matters.
Tech is Our New Religion And It Needs Reformation, with Greg Epstein
In today's fascinating episode, Harvard/MIT's humanist chaplain (and bestselling author!) Greg Epstein details how tech has become our foremost religion... and why this new religion needs reformation 😈
More on Greg:
• Serves as humanist chaplain at both Harvard University and the Massachusetts Institute of Technology.
• Wrote the bestselling book “Good Without God: What a Billion Nonreligious People Do Believe.”
Today’s episode focuses largely on Greg’s latest book, "Tech Agnostic", and should be interesting to everyone. In today’s episode, Greg details:
• How technology has supplanted traditional religion as society's most powerful belief system.
• Why Silicon Valley's promise of technological salvation parallels religious prophecies.
• The concerning parallel between AI's "singularity" and religious end-of-times narratives.
• How we can embrace technological progress while maintaining our humanity.
• Why building genuine human connections matters more than chasing technological utopias.
In Case You Missed It in November 2024
We had a ton of laughs and I had some seriously mind-expanding moments thanks to my guests on the SuperDataScience Podcast last month. ICYMI, today's episode highlights the most riveting moments from November.
The specific conversation highlights included in today's episode are:
Deepali Vyas, Global Head of Data and A.I. at executive-search giant Korn Ferry, on how A.I. is transforming recruitment and how job-seekers can stay ahead of the curve.
Jess Ramos, data analyst and leading content creator on data careers, on where to start if you yourself are seeking a career in data.
Bryan McCann, co-founder and CTO of the rapidly-scaling A.I. platform You.com, on why machines will make much better scientists than humans... and how they will surpass human scientists surprisingly soon.
Martin Goodson, CEO of the prestigious British A.I. firm Evolution AI, on how the public figures who are speaking most loudly about A.I. are probably not the people we should be listening to.
Safe, Fast and Efficient AI, with Protopia’s Dr. Eiman Ebrahimi
Nearly all A.I. projects get stuck in "POC Purgatory" because of complex trade-offs between cost, speed and security. Thankfully, today's guest — Dr. Eiman Ebrahimi — cogently provides a path to production A.I. heaven.
Eiman is extremely intelligent and well-spoken; don't miss this episode! It was a delight developing this episode with him and I learned a ton from his gifted mind throughout the process.
Eiman:
• Is CEO of Protopia AI, a venture capital-backed startup based in Austin that converts sensitive data into a special, stochastic format that improves A.I. model accuracy, protects privacy and reduces compute costs.
• Prior to founding Protopia, spent a decade at NVIDIA as Senior Research Scientist and Computer Architect.
• Holds a PhD in Computer Engineering from The University of Texas at Austin.
Today’s episode is relatively technical so might appeal most to technical listeners, but Eiman is such a terrific communicator that anyone interested in A.I. might love it.
In today’s episode, Eiman details:
• How he went from optimizing GPU performance at NVIDIA to revolutionizing A.I. data security.
• Why many promising A.I. projects get stuck in what he calls "proof of concept purgatory" - and how to escape it.
• Gripping, deep detail on the real-world tradeoffs between the cost, speed and security of running A.I. models in production.
• How to make your enterprise A.I. products profitable.
• Why having your own private server doesn't make your A.I. system as secure as you think.
• What Alan Watts' philosophy teaches us about entrepreneurship and innovation.
Andrew Ng on AI Vision, Agents and Business Value
My guest today is the inimitable Andrew Ng! In his trademark, clear-spoken style, Andrew gives us a glimpse of the Agentic A.I. future, particularly how the coming Vision Agent tsunami will change the world.
I suspect pretty much everyone knows Dr. Ng already, but just in case:
Was director of Stanford University's AI Lab, where his research group played a key role in the development of deep learning (which led to him founding the influential Google Brain team) as well as in educating millions on machine learning (which led to him co-founding Coursera).
Is Managing General Partner of AI Fund, a world-leading A.I. venture studio.
Was CEO (is now Executive Chairman) of LandingAI, a computer-vision platform that specializes in domain-specific Large Vision Models (analogous to LLMs for language).
Founded DeepLearning.AI, which provides excellent technical training on ML, deep learning (of course!), generative A.I. and many other associated subjects.
Was co-CEO (as well as co-founder and chairman) of Coursera, which brought online learning from 300 leading universities to over 100 million students.
This episode was recorded live at the ScaleUp:AI conference in New York a few weeks ago. Thanks to George Mathew and Jennifer Jordan for inviting me back to the conference to interview Andrew :)
In today’s episode, Andrew details:
Why a cheaper A.I. model with smart agentic A.I. workflow might outperform more expensive, more advanced models.
The surprising truth about A.I. API costs that most businesses don't realize.
How Marvin Minsky's "Society of Mind" theory from the 1980s is making an unexpected comeback in modern A.I.
A groundbreaking new way to process visual data that goes beyond traditional computer vision.
Why unstructured data will be the key to A.I.'s next big revolution.
Career Success in the AI Era, with Deepali Vyas
Rapid A.I. advances can be intimidating: How can you approach your career so you flourish in the coming A.I. era? Find out from Deepali Vyas — a world-leading A.I.-exec headhunter — in today's episode.
Deepali is:
• Senior Partner and Global Head of the Data, A.I. and Financial Technology Practice of Korn Ferry, one of the world’s largest executive-search firms.
• Founder of ProFolios.ai, a video-centric, A.I.-enhanced professional-branding platform.
• Founder of Fearless+, a platform that empowers tens of thousands of young people for career success.
• Holds a Bachelor’s in Financial Mathematics and a Master’s in International Finance from the London School of Economics.
Today’s episode should be interesting to everyone. In it, Deepali details:
• How A.I. has driven a 10x increase in applications per position and how you can compete in this high-volume climate.
• Why technical skills are becoming "table stakes" and what will differentiate the best candidates in the A.I. era.
• An insider's view on the talent flows between Wall Street and Silicon Valley, and how you can capitalize on these flows in your career.
• The "green flags" to look for in potential bosses and employers.
AI Systems as Productivity Engines, with You.com’s Bryan McCann
Today, wildly intelligent Bryan McCann describes the Agentic A.I. behind his skyrocketing startup You.com and how it will lead to scientific discoveries human scientists couldn't dream of making. Don't miss this episode!
Bryan:
• Is Co-Founder and CTO of You.com, a prominent Bay Area A.I. startup that has raised $99m in venture capital (including a $50m Series B in September that valued the firm at nearly a billion dollars).
• Was previously Lead Research Scientist at Salesforce and an assistant on courses at Stanford such as Andrew Ng’s wildly popular machine learning course.
• Holds a Master’s in Computer Science, a Bachelor’s in Computer Science and a Bachelor’s in Philosophy, all from Stanford University.
Today’s episode should be fascinating to anyone interested in AI. In it, extremely well-spoken Bryan details:
• The philosophical underpinnings of the breakthroughs that led to the leading A.I. models we have today as well as the ones that will emerge in the coming years.
• How a coding mistake he made serendipitously revealed fundamental insights about meaning and language model alignment.
• Why he believes humanity is entering an existential crisis due to A.I., but nevertheless remains optimistic about the future.
• The fascinating connection between language models and biological proteins.
• Why A.I. systems might soon be able to make scientific discoveries humans could never dream of making.
In Case You Missed It in October 2024
It's unreal to be able to speak to folks like the guests I hosted on the SuperDataScience Podcast last month. ICYMI, today's episode highlights the most riveting moments from October.
The specific conversation highlights included in today's episode are:
UC San Diego neuroscience professor Dr. Bradley Voytek on how data science facilitates breakthroughs in our understanding of the brain.
Eloquent Natalie Monbiot on how lifelike, digital versions of ourselves can scale up our public-facing work.
Lightning AI CTO Dr. Luca Antiga on where he sees generative A.I. being most useful in our professional lives.
Gable CEO Chad Sanderson on how, when we work with data, we always need to think about how downstream users might come to interpret our data... which is why he finds data contracts so important that he's writing an O'Reilly book about it.
Polars CEO Ritchie Vink on the incredible specs (e.g., efficiency speedups) of his open-source DataFrame-operations library for Python.
The 10 Reasons AI Projects Fail, with Dr. Martin Goodson
Most A.I. projects fail. In today's episode, the brilliant (and hilarious) Dr. Martin Goodson details the top 10 reasons why A.I. projects fail and how to avoid these common pitfalls.
Martin:
• Is CEO and Chief Scientist at Evolution AI, a firm that uses generative A.I. to extract information from millions of documents a day for their clients.
• Is Founder and Organizer of the London ML Meetup, which (with >15,000 members) is the largest community of AI/ML experts in Europe.
• Previously led data science at startups that apply ML to billions of data points daily.
• Was a statistical geneticist at the University of Oxford (where we shared a small office)!
Today’s episode will be of interest to anyone even vaguely interested in data science, ML or AI. In today’s episode, Martin details:
• The 10 reasons why data science projects fail and how to avoid these common pitfalls.
• His insights on building A.I. startups that serve large enterprises.
• The importance of open-source A.I. development.
PyTorch Lightning, Lit-Serve and Lightning Studios, with Dr. Luca Antiga
Lightning AI makes tons of tools that speed A.I. model dev and deployment, including the wildly popular open-source library PyTorch Lightning. Today, hear from hands-on CTO Dr. Luca Antiga how all the magic happens ⚡️
More on Luca:
Is CTO of Lightning AI, which (as one of the world’s hottest startups developing A.I. tools) has raised over $80m in venture capital.
Is also CTO of OROBIX, an A.I. services company that Luca co-founded 15 years ago.
Holds a PhD in biomedical engineering from Politecnico di Milano… and did his postdoc at the Robarts Research Institute in London, Ontario (coincidentally around the same time I was doing brain-imaging research there).
Today’s episode will probably appeal most to hands-on practitioners like data scientists, software developers and ML engineers, but any tech-savvy professional could find it valuable.
In today’s episode, Luca details:
How Lightning AI's suite of tools (in addition to PyTorch Lightning, this includes Lightning Studios, LitServe and the Thunder Compiler) is making A.I. development faster and easier.
The rise of small language models and their potential to rival LLMs.
His journey from biomedical imaging to deep learning pioneer.
How software developers’ work will be transformed by A.I. in the coming years.
Neuroscience Fueled by ML, with Prof. Bradley Voytek
Today's guest is the extraordinarily intelligent and well-spoken UC San Diego theoretical neuroscience professor, Bradley Voytek. He reveals how AI/ML is accelerating our understanding of the brain.
More on Brad:
• Professor in UC San Diego's Department of Cognitive Science, Data Science Institute, and the Neurosciences Graduate Program.
• Joined Uber as their first data scientist, when it was a 10-person startup, helping build their data science strategy and team.
• Outreach work has appeared in Scientific American, NPR... and Comic-Con!
• Co-authored the amusing book "Do Zombies Dream of Undead Sheep?"
Today’s episode has some brief exchanges that will appeal most to hands-on practitioners, but should overall be fascinating to anyone.
In today’s episode, Brad details:
• How large-scale data science and machine learning are accelerating neuroscience research.
• Discoveries his lab has recently made that overturn nearly a century of neuroscience doctrine.
• Insights on structuring data science education to balance technical skills with creative, practical problem-solving.
• Lessons from using data science to optimize Uber's early ride-prediction algorithms.
Are “Citizen Data Scientists” A Myth? With Keith McCormick
In a recent episode, Nick Elprin and I laughed that "citizen data scientists" don't exist. Keith McCormick joins me today to eloquently rebut us and demonstrate the clear value of low-code/no-code tools.
Keith is:
• Data Science Principal at the enterprise A.I. consultancy Further.
• Creator of dozens of LinkedIn Learning courses on machine learning and A.I. with, in aggregate, over a million students!
• Author of four statistics books.
Today’s short episode should be of interest to just about any listener. In it, Keith details:
• Common circumstances where low-code/no-code data science tools are the best option for you, even if you are a coding whiz.
• Whether citizen data scientists are myth or reality.
• How AutoML fits into the data science workflow - and why it won't replace data science teams.
Polars: Past, Present and Future, with Polars Creator Ritchie Vink
Because of its stunning speed, Polars is an extremely popular open-source library for DataFrame operations in Python. Kinda unreal to have Ritchie Vink, Polars' creator, as today's guest!
Ritchie:
• Is CEO and Co-Founder of Polars, Inc., a startup that has raised $4m in seed funding to support his Polars open-source project.
• Previously worked as an ML Engineer, Data Scientist and Data Engineer at companies like adidas and KLM Royal Dutch Airlines.
• Holds a Master’s in Structural Engineering and worked as a civil engineer prior to catching the data-science bug.
Today’s episode will appeal most to hands-on practitioners like data scientists and ML engineers. In it, Ritchie details:
• How Polars regularly achieves 5-20x (sometimes 100x!) speed improvements over Pandas for most DataFrame operations.
• The Eager and Lazy execution APIs Polars offers and when you should use one or the other.
• Ritchie's vision for scaling Polars to handle massive distributed datasets.
• How we can continue to make data-processing efficiency gains even as Moore's Law slows down.
In Case You Missed It in September 2024
Another month, another set of invaluable conversations on the SuperDataScience Podcast I host. ICYMI, today's episode highlights the most fascinating moments from September.
The specific conversation highlights included in today's episode are:
Posit PBC engineering manager Dr. Julia Silge explains why Positron, the next-generation IDE she's leading development of, is better-suited to data scientists than any existing IDE.
PyTorch expert Luka Anicin provides his top tips for training more accurate and compute-efficient ML models.
Exceptional open-source developer Marco Gorelli on why Polars is anywhere from 10 to 100x faster than Pandas, the incumbent Python library for working with DataFrames.
Microsoft's Marck Vaisman on what companies hiring data scientists should be looking for... as opposed to what they typically (and mistakenly!) look for today.