
Jon Krohn


Now Hosting the SuperDataScience Podcast

Added on January 4, 2021 by Jon Krohn.

Happy new year! I'm delighted to announce that I've taken over the reins as host of the SuperDataScience podcast as of the January 1st episode (#432).

In November, Kirill Eremenko blew me away by asking me if I'd be interested in hosting the program and of course I immediately said, "yes!"

By releasing two riveting episodes per week since 2016, Kirill has amassed an extraordinary global audience of 10,000 listeners per episode. I'm over the moon to have the opportunity to share cutting-edge content from the fields of data science, machine learning, and A.I. with so many engaged professionals and students.

Kirill left behind super-sized shoes to fill, but I'm committed to maintaining the lofty standard that he set. I'll also be maintaining the structure that preceded me:

  • Odd-numbered episodes feature guests and are released every Wednesday. I've already recorded fun episodes packed with practical insights from Ben Taylor, Erica Greene, and Claudia Perlich, with many more "household" data science names lined up.

  • Even-numbered episodes, like #432, are "Five-Minute Fridays". These are short, come out every Friday (duh), and focus on a specific item of data science or career advice.

With a challenging 2020 behind us, I hope you're as excited as I am to be starting 2021 off with something new.


Co-Hosted SuperDataScience Podcast on 2020's Biggest Breakthroughs

Added on January 1, 2021 by Jon Krohn.

Alongside Kirill Eremenko, Founder and CEO of SuperDataScience, I co-hosted this podcast episode on 2020’s biggest machine learning breakthroughs including:

  • AlphaFold 2

  • GPT-3

  • The latest GPUs

We also announced an exciting upcoming podcast-host transition for 2021! Further details to come on air during the next podcast episode.


Co-Hosted SuperDataScience Podcast Episode

Added on December 23, 2020 by Jon Krohn.

I had the honor of co-hosting SuperDataScience podcast episode #427 with the podcast's founder and host, Kirill Eremenko. The guest on the episode was Syafri Bahar, the director of data science at Gojek, a decacorn (a company valued at over $10 billion) "super app". The discussion was wide-ranging, but we returned time and again to the concept of using data science and technology to make a positive social impact.


"Foundations for Machine Learning" Live Online Bootcamp

Added on November 29, 2020 by Jon Krohn.

On Thursday, I kick off a live, 14-lecture bootcamp on the four foundational subjects underlying machine learning theory:

  1. Linear Algebra
  2. Calculus
  3. Probability and Statistics
  4. Computer Science

All of the bootcamp details -- including lecture dates, a detailed topic-by-topic syllabus, and an introductory video -- are available here.

The premise of the bootcamp is that to be an outstanding data scientist or ML engineer, it doesn't suffice to only know how to use ML algorithms via the abstract interfaces that the most popular libraries (e.g., scikit-learn, Keras) provide. To train innovative models or deploy them efficiently in production, an in-depth appreciation of machine learning theory (pictured as the central, purple floor in my metaphorical house diagram; see below) is required. And, to cultivate such in-depth appreciation of ML, one must possess a working understanding of the four foundational subjects.

When the foundations of the "Machine Learning House" are firm, it also makes it much easier to make the jump from general ML principles (purple floor) to specialized ML domains (the top floor, shown in gray) such as deep learning, natural language processing, machine vision, and reinforcement learning. This is because the more specialized or cutting-edge the application, the more likely its implementation details are available only in academic papers or graduate-level textbooks, either of which typically assume an understanding of the four foundational subjects.

In any event, all 14 lectures of the bootcamp are included as part of a subscription to the AI+ Training platform that was launched earlier this year by the Open Data Science Conference (ODSC). Through the platform, you also get unlimited access to recordings of the lectures so you can brush up anytime or attend lectures that you miss.

I love offering online lectures because I get to meet intelligent, ambitious people from all over the world. They're also great for students because of the interactivity. Speaking of which, the course will be filled with paper-and-pencil exercises and we'll work through the solutions together. On top of that, I've included hundreds of hands-on code demos in Python, with a particular focus on low-level operations in the PyTorch and TensorFlow libraries.

All of the code is available open-source in GitHub now.


Filming "Calculus for Machine Learning"

Added on November 29, 2020 by Jon Krohn.

Same shirt, different day with my "weekend crew". We burnt the midnight oil Friday through Sunday last weekend filming what will be eight hours of interactive videos on Calculus for Machine Learning.

In the above photo, pictured from left to right at New York's Production Central Studios: myself, technician Guillaume Rousseau, and producer Erina Sanders.

All of the code (featuring the Python libraries NumPy, TensorFlow, and PyTorch) is available open-source in GitHub today.

The videos themselves will appear in the O'Reilly learning platform later this year, with thanks to Pearson's Debra Williams Cauley for bringing another project concept of mine to life.


Deep Q-Learning Networks tutorial on YouTube

Added on November 17, 2020 by Jon Krohn.

Here's a free 72-minute intro to Deep Reinforcement Learning with Deep Q-Learning Networks. It's my most popular video, with 44k views. I'd never have predicted it would do so well, which goes to show how important it is to produce on a schedule.

Don't wait for inspiration; it's unlikely to ever come. Simply commit to regular deadlines and produce, produce, produce, because some of the content you release thinking it's rushed or won't interest your audience will end up surprising you by being right on the money.

As an added benefit, studies show that focusing on quantity over quality paradoxically tends to result in significantly higher quality products.

Anyway, enjoy the video, which I've never shared on my website before. It uses hands-on code demos in Python to bring the mathematical theory of Deep Q-Learning to life (Jupyter notebook is here).
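To give a flavor of what the video builds toward, here's the Q-learning update rule sketched in plain NumPy. This is a minimal tabular version (the "deep" in Deep Q-Learning replaces the table with a neural network), and the tiny environment and hyperparameters below are illustrative, not taken from the actual notebook.

```python
import numpy as np

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    """One Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    td_target = r + gamma * np.max(Q[s_next])  # bootstrapped estimate of return
    td_error = td_target - Q[s, a]             # temporal-difference error
    Q[s, a] += alpha * td_error                # nudge Q(s,a) toward the target
    return td_error

# Tiny illustration: 2 states, 2 actions, all Q-values start at zero.
Q = np.zeros((2, 2))
q_update(Q, s=0, a=1, r=1.0, s_next=1)
print(Q[0, 1])  # 0.1 -- alpha * reward, since Q[s_next] is still all zeros
```

A Deep Q-Network swaps `Q` for a function approximator and samples `(s, a, r, s_next)` tuples from a replay buffer, but the update target is the same expression.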


Free Launch of Udemy Course

Added on October 28, 2020 by Jon Krohn.

My first Udemy course, Machine Learning and Data Science Foundations Masterclass, went live on Monday and is free (for now)! An astounding 20,000 students signed up in the first 24 hours and they've given it a five-star average rating.

The course is currently two hours long because Udemy has a two-hour cap on free courses. In about a month, we'll add four more hours of content, which will put us well over the two-hour cap. From that point on, the course will be paid on Udemy, but anyone who enrolled while the course was free will automatically get unlimited access to all of the paid-tier content.

Eventually the course will grow to be ~30 hours long and provide a comprehensive overview of all the subjects -- linear algebra, calculus, probability, and computer science -- that underlie modern machine learning. Again, if you sign up now, all of that content will be available to you for free when it comes out; no strings attached!

Many thanks to the super-duper SuperDataScience Team (especially Kirill Eremenko, Leonid Golub and Roberto Lambertini) for partnering with me on this course. And thanks as ever to Sangbin Lee, who flawlessly produced and edited all of the video content.


Filming "Linear Algebra for Machine Learning" Videos

Added on October 22, 2020 by Jon Krohn.

Back in studio this past weekend with magical producer Erina Sanders, creating seven hours of interactive Linear Algebra for Machine Learning videos.

All of the code (featuring the Python libraries NumPy, TensorFlow, and PyTorch) is available open-source in GitHub today.

The videos themselves will appear in the O'Reilly learning platform later this year, with thanks to Pearson's Debra Williams Cauley for bringing another project concept of mine to life.

(December 2020 update: These videos are now live in O'Reilly here.)


"Intro to Linear Algebra": The First Subject of my ML Foundations Series is Live!

Added on October 15, 2020 by Jon Krohn.

Intro to Linear Algebra, the first subject of my Machine Learning Foundations tutorial series, is now available in its entirety! It's three content-rich hours, featuring code demos and exercises, split over a total of 24 YouTube videos.

I released the final Intro to Linear Algebra videos over the past week, all of which are from the Matrix Properties segment of the subject (see the segment intro video below):

  • Topic 18: The Frobenius Norm
  • Topic 19: Matrix Multiplication
  • Topic 20: Symmetric and Identity Matrices
  • Topic 21: Matrix Multiplication Exercises
  • Topic 22: Matrix Inversion
  • Topic 23: Diagonal Matrices
  • Topic 24: Orthogonal Matrices

The YouTube playlist for the entire Machine Learning Foundations series is here.

The series is full of hands-on code demos in Python (particularly the NumPy, TensorFlow, and PyTorch libraries) and all of the code is available open-source in GitHub.
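In the spirit of those NumPy demos, a few of the Matrix Properties topics can be sketched in a handful of lines. This is an illustrative example I've put together here, not code from the course repo:

```python
import numpy as np

X = np.array([[1., 2.],
              [3., 4.]])

# Topic 18: Frobenius norm -- square root of the sum of squared elements
frob = np.sqrt((X ** 2).sum())  # equivalent to np.linalg.norm(X)

# Topic 22: matrix inversion -- multiplying X by its inverse recovers the identity
X_inv = np.linalg.inv(X)
identity = X @ X_inv

# Topic 24: orthogonal matrices -- orthonormal columns mean Q^T Q = I
theta = np.pi / 4
Qm = np.array([[np.cos(theta), -np.sin(theta)],
               [np.sin(theta),  np.cos(theta)]])  # a rotation matrix

print(frob)                               # ~5.477 (sqrt of 30)
print(np.allclose(identity, np.eye(2)))   # True
print(np.allclose(Qm.T @ Qm, np.eye(2)))  # True
```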

Please let me know what you think of the series so far as it will shape my creation of the remaining seven subjects. Up next in the series will be the second subject, Linear Algebra II: Matrix Operations. Stay tuned!

TensorFlow vs PyTorch for Deep Learning

Added on October 8, 2020 by Jon Krohn.

The question I get asked most is, "Should I use TensorFlow or PyTorch for deep learning?" The YouTube video above is my 26-minute response, detailing the pros and cons of both libraries.

The talk begins with a survey of the primary families of deep learning approaches: Convolutional Neural Networks, Recurrent Neural Networks, Generative Adversarial Networks, and Deep Reinforcement Learning. (Thanks to the Belgian artist Aglaé Bassens for the stunning illustrations that feature throughout this section of the slide deck.)

If you're already familiar with deep learning, you can skip ahead to the 12:24 mark. Via interactive demos, the meat of the talk appraises the two leading deep learning libraries: TensorFlow and PyTorch. With respect to both model development and production deployment, the strengths and weaknesses of the two libraries are detailed with code examples from reference Jupyter notebooks. There's a particular focus on the TensorFlow 2 release that integrates the easy-to-use, high-level Keras API as a formal module within the library.

Thanks to Jared Lander and Amada Echeverría for hosting me at the terrifically well-executed virtual version of this year's New York R Conference, where I provided this lecture. (Note that the content I covered is equally relevant to folks working primarily in Python.)


ODSC West Virtual Conference

Added on September 30, 2020 by Jon Krohn.

Open Data Science Conference (ODSC) West, my favorite conference, is virtual this year and coming up soon, October 27-30. Deeply honored to be speaking alongside these industry legends.

All in all, there will be over 200 speakers this year, and I'll be giving a four-hour workshop that introduces deep learning via hands-on demos featuring TensorFlow 2.

Registration is here.

Techniques for Solving Linear Systems of Equations

Added on September 22, 2020 by Jon Krohn.

Two further videos from my Machine Learning Foundations series out today! Both cover techniques for solving systems of linear equations algebraically:

  • The first video (and 16th topic in the series overall) introduces substitution for solving linear systems by hand.
  • The second video (17th topic in the series) uses elimination, which comes in handy in other situations where we'd like to solve linear systems by hand.
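Both techniques, plus the computational approach covered later in the series, can be seen on one small system. The system below is a hypothetical example of my own, not one from the videos:

```python
import numpy as np

# Hypothetical system:
#   2x + y = 5
#    x - y = 1
# Elimination by hand: adding the two equations cancels y,
# giving 3x = 6, so x = 2; substituting back gives y = 1.

A = np.array([[2.,  1.],
              [1., -1.]])
b = np.array([5., 1.])

x = np.linalg.solve(A, b)  # the computational approach
print(x)  # [2. 1.]
```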

These videos wrap up Segment Two, "Common Tensor Operations". They set us up perfectly for Segment Three, "Matrix Properties", in which we'll cover computational approaches to solving linear systems.

The playlist for the entire series, which will consist of 30+ hours of videos, is here.

Common Tensor Operations: A Fresh Segment of my ML Foundations Series

Added on September 10, 2020 by Jon Krohn.

As detailed on GitHub and covered in the short explanatory video above, the ML Foundations series consists of eight subjects:

  1. Intro to Linear Algebra
  2. Linear Algebra II
  3. Calculus I: Limits & Derivatives
  4. Calculus II: Partial Derivatives & Integrals
  5. Probability & Information Theory
  6. Statistics
  7. Algorithms & Data Structures
  8. Optimization

Each of the eight subjects consists of two or three segments, which group together closely related topics and make a given subject more structured and readily digestible. The Intro to Linear Algebra subject, for example, consists of three segments:

  1. Data Structures for Algebra
  2. Common Tensor Operations
  3. Matrix Properties

All of the previously released videos in the ML Foundations YouTube playlist featured topics from the first segment, Data Structures for Algebra. Today, we released five new videos, which mark the beginning of Common Tensor Operations, the second segment:

  • The first topic in the segment (and Topic 11 in the series overall) is Tensor Transposition.
  • Topic 12 is Basic Tensor Arithmetic, including coverage of the Hadamard product.
  • Topic 13 is Reduction from higher-dimensional tensors to lower-dimensional ones.
  • Topic 14 introduces the Dot Product of two vectors.
  • And, finally, video 15 provides exercises to test your comprehension of the content covered in the segment so far.
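The segment's operations all fit in a few lines of NumPy. This is a quick sketch of my own for orientation, not the series' actual demo code:

```python
import numpy as np

X = np.array([[1, 2],
              [3, 4]])
Y = np.array([[5, 6],
              [7, 8]])

X_T = X.T                 # Topic 11: tensor transposition
hadamard = X * Y          # Topic 12: Hadamard (element-wise) product
total = X.sum()           # Topic 13: reduction to a lower-dimensional tensor
dot = np.dot(X[0], Y[0])  # Topic 14: dot product of two vectors

print(hadamard)  # [[ 5 12] [21 32]]
print(total)     # 10
print(dot)       # 17 (1*5 + 2*6)
```

The same operations exist nearly verbatim in PyTorch and TensorFlow, which is why the series demos all three libraries side by side.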

We aim to release two further videos next week, which will wrap up Segment 2, leaving us well-positioned to tackle Segment 3, Matrix Properties.

Modeling Natural Language Data

Added on September 3, 2020 by Jon Krohn.

Here's a new two-hour video on:

  • Preprocessing natural language data,
  • Creating word vectors, and
  • Designing convolutional neural networks for NLP.
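The first of those steps -- preprocessing -- can be boiled down to a minimal sketch. This toy pipeline (lowercasing, stripping punctuation, tokenizing, building a frequency-ranked vocabulary) is my own simplification, not the video's actual pipeline:

```python
import re
from collections import Counter

def preprocess(text):
    """Lowercase, strip punctuation, and split into word tokens."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", "", text)
    return text.split()

corpus = "Natural language processing turns text into numbers. Text is data!"
tokens = preprocess(corpus)

# Assign each word an integer index by descending frequency.
vocab = {word: i for i, (word, _) in enumerate(Counter(tokens).most_common())}

print(tokens[:4])     # ['natural', 'language', 'processing', 'turns']
print(vocab['text'])  # 0 -- 'text' is the most frequent token
```

Those integer indices are what get mapped to word vectors in the next step of the pipeline.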

Thanks to Debra Williams Cauley for believing in my vision for this video and to Erina Sanders for her always flawless editing.

These two hours come from a complete five-hour tutorial on Deep Learning for Natural Language Processing. It's available here in the O'Reilly learning platform. Or, if you'd like to purchase it, my publisher is running an extreme 75%-off sale on all my videos through September 10th. Use code LEARNDL after following this link.

Final videos of "Data Structures for Algebra" segment of ML Foundations

Added on August 26, 2020 by Jon Krohn.

The final videos from the first segment of my Machine Learning Foundations series, Data Structures for Algebra, are out today:

  • Video 8 in the series is on matrix tensors.
  • Topic 9 is Generic Tensor Notation.
  • And the tenth video is a quick one that provides three comprehension questions on the content covered thus far.

In the second, forthcoming segment of the ML Foundations series, we'll move from primarily creating static tensors to interacting with them via common tensor operations.

The playlist for the entire series, which will consist of 30+ hours of videos, is here.

These particular videos feature hands-on demos in PyTorch and TensorFlow, and all of the code is in GitHub.

Superb editing from Sangbin Lee, as always. Thank you!


Minimizing Unwanted Biases (e.g., by Gender, Ethnicity) within ML Models

Added on August 20, 2020 by Jon Krohn.

Here's a new blog post I wrote on how my team eliminates unwanted biases (e.g., by gender, ethnicity) from algorithms we've deployed in the recruitment sector.

Devising algorithms that stamp out unwanted biases without skimping on accuracy or performance adds time and effort to the machine learning model-design process. When algorithms can have a considerable social impact, as ours do in the human-resources space at GQR Global Markets, investing this time and effort is essential to ensuring equitable treatment of all people.

Vectors and Norms: Three New "ML Foundations" Videos

Added on August 13, 2020 by Jon Krohn.

Three further videos in my Machine Learning Foundations series out today! Taking it up a dimension, these are on vector tensors and the norm functions we use to measure them:

  • Video 5 in the series is "Vectors and Vector Transposition"
  • Topic 6 is "Norms and Unit Vectors"
  • And topic 7 is a quick one on "Basis, Orthogonal, and Orthonormal Vectors"
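For a taste of the topics above, here's a NumPy sketch of norms and unit vectors. It's an illustrative snippet of my own rather than code from the videos:

```python
import numpy as np

x = np.array([3., 4.])

l2 = np.linalg.norm(x)   # L2 norm: sqrt(3^2 + 4^2) = 5
l1 = np.abs(x).sum()     # L1 norm: |3| + |4| = 7
unit = x / l2            # unit vector: same direction, norm 1

print(l2, l1)                # 5.0 7.0
print(np.linalg.norm(unit))  # 1.0
```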

The YouTube playlist for the entire series, which will consist of 30+ hours of videos, is here.

The series is full of hands-on demos in NumPy, PyTorch, and TensorFlow, and all of the code is in GitHub here.

As usual, producer/editor Sangbin Lee did some fine work here.

A4N Episode 4: Automated Cancer Detection & Self–Driving Cars with Dr. Rasmus Rothe

Added on August 5, 2020 by Jon Krohn.

The fourth episode of A4N — the Artificial Neural Network News Network podcast — is out (listen on my website, on Apple Podcasts, Spotify, Google Podcasts, or YouTube). In this episode, our guest Dr. Rasmus Rothe joins us to discuss Merantix, his rapidly growing AI venture studio, and how they are applying machine learning to revolutionize cancer detection, self-driving cars, and more.

Dr. Rasmus Rothe is a German native, and co-founder of Merantix, the world’s first AI-focused venture studio. Merantix has already launched three successful AI-driven companies with three more operating in stealth, and raised an additional EUR 25 MM in 2020 to continue to apply world-class AI research to solving practical issues. Rasmus published 15+ papers on deep learning while attending Oxford, Princeton and ETH Zurich, where he received his Ph.D. in computer vision and deep learning. Before founding Merantix, Rasmus worked for BCG, Google, and built a deep learning service with 150m+ users. He is also a founding board member of the German AI Association.

Click through for more detail, including reference links and a full transcript.


"Hidden in Plain Sight" Podcast

Added on July 30, 2020 by Jon Krohn.

I've been a guest on many wonderful podcasts, but this one on Hidden in Plain Sight easily takes the cake in terms of production value. I highly recommend tuning in for a candid take on the impacts of A.I. today and in the years to come: It’s titled Deep Learning Illustrated with Jon Krohn and can be found here.

Thank you Chad Grills and Lacey Peace for having me on the program. I've been thoroughly impressed by your operation from start to finish! 

Next Three Videos in the ML Foundations Series are Live

Added on July 22, 2020 by Jon Krohn.

I published three new videos in my Machine Learning Foundations series today, including an introduction to tensors, with a focus on handling scalar tensors in PyTorch and TensorFlow (vectors and matrices coming up next).

I hope you enjoy them! The response to the first videos in the series -- released a fortnight ago -- has blown me away. I'm excited to continue to get more of these out on YouTube ASAP.

The playlist for the entire series, which will consist of 30+ hours of videos, is here.

The series is full of hands-on demos in Python and all of the code is in GitHub here.

Thanks to Sangbin Lee for continuing to be an outstanding producer and editor.
