If you've been using fine-tuned open-source LLMs (e.g. for generative A.I. functionality or natural-language conversations with your users), it's very likely time you switch your starting model over to Llama 2. Here's why:
Generative A.I. without the Privacy Risks (with Prof. Raluca Ada Popa)
Consumers and enterprises dread that Generative A.I. tools like ChatGPT breach privacy by using convos as training data, storing PII and potentially surfacing confidential data as responses. Prof. Raluca Ada Popa has all the solutions.
Today's guest, Raluca:
• Is Associate Professor of Computer Science at University of California, Berkeley.
• Specializes in computer security and applied cryptography.
• Her papers have been cited over 10,000 times.
• Is Co-Founder and President of Opaque Systems, a confidential computing platform that has raised over $31m in venture capital to enable collaborative analytics and A.I., including allowing you to securely interact with Generative A.I.
• Previously co-founded PreVeil, a now-well-established company that provides end-to-end document and message encryption to over 500 clients.
• Holds a PhD in Computer Science from MIT.
Despite being such a deep expert, Raluca does a stellar job of communicating complex concepts simply, so today’s episode should appeal to anyone who wants to dig into the thorny issues around data privacy and security associated with Large Language Models (LLMs) and how to resolve them.
In the episode, Raluca details:
• What confidential computing is and how to do it without sacrificing performance.
• How you can perform inference with an LLM (or even train an LLM!) without anyone — including the LLM developer! — being able to access your data.
• How you can use commercial generative models like OpenAI’s GPT-4 without OpenAI being able to see sensitive or personally-identifiable information you include in your API query.
• The pros and cons of open-source versus closed-source A.I. development.
• How and why you might want to seamlessly run your compute pipelines across multiple cloud providers.
• Why you should consider a career that blends academia and entrepreneurship.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
“The Dream of Life” by Alan Watts
For episode #700 today, I bring you the "Dream of Life" thought experiment originally penned by Alan Watts. You are terrifically powerful (particularly now that you're armed with A.I.!) — are you making good use of your power?
Also, time flies, eh? Another hundred episodes in the bag today! Thanks for listening, providing feedback and otherwise contributing to making SuperDataScience, with over a million downloads per quarter, the most listened-to podcast in the data science industry. We've got some serious awesomeness lined up for the next hundred episodes — I can't wait for the amazing, inspiring, mind-opening conversations.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
The Modern Data Stack, with Harry Glaser
Today, eloquent Harry Glaser details the Modern Data Stack, including cloud collab tools (like Deepnote), running ML from data warehouses (like Snowflake), using dbt Labs for model orchestration, and model deployment best-practices.
Harry:
• Is Co-Founder and CEO of Modelbit, a San Francisco-based startup that has raised $5m in venture capital to make the productionization of machine learning models as fast and as simple as possible.
• Previously, was Co-Founder and CEO of Periscope Data, a code-driven analytics platform that was acquired by Sisense for $130m.
• And, prior to that, was a product manager at Google.
• Holds a degree in Computer Science from the University of Rochester.
Today’s episode is squarely targeted at practicing data scientists but could be of interest to anyone who’d like to enrich their understanding of the modern data stack and how ML models are deployed into production applications.
In the episode, Harry details:
• The major tools available for developing ML models.
• The best practices for model deployment such as version control, CI/CD, load balancing and logging.
• The data warehouse options for running models.
• What model orchestration is.
• How BI tools can be leveraged to collaborate on model prototypes across your organization.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
How Firms Can Actually Adopt A.I., with Rehgan Avon
Rehgan Avon's DataConnect conference is this week and is getting rave reviews. In this SuperDataScience episode, the silver-tongued entrepreneur joins Jon Krohn to detail how organizations can successfully adopt A.I.
The (Short) Path to Artificial General Intelligence, with Dr. Ben Goertzel
Today, the luminary Dr. Ben Goertzel details how we could realize Artificial General Intelligence (AGI) in 3-7 years, why he's optimistic about the Artificial Super Intelligence (ASI) this would trigger, and what post-Singularity society could be like.
Dr. Goertzel:
• Is CEO of SingularityNET, a decentralized open market for A.I. models that aims to bring about AGI and thus the singularity that would transform society beyond all recognition.
• Has been Chairman of The AGI Society for 14 years.
• Has been Chairman of the foundation behind OpenCog — an open-source AGI framework — for 16 years.
• Was previously Chief Scientist at Hanson Robotics Limited, the company behind Sophia, the world’s most recognizable humanoid robot.
• Holds a PhD in mathematics from Temple University and held tenure-track professorships prior to transitioning to industry.
Today’s episode has parts that are relatively technical, but much of the episode will appeal to anyone who wants to understand how AGI — a machine that has all of the cognitive capabilities of a human — could be brought about and the world-changing impact that would have.
In the episode, Ben details:
• The specific approaches that could be integrated with deep learning to realize, in his view, AGI in as few as 3-7 years.
• Why the development of AGI would near-instantly trigger the development of ASI — a machine with intellectual capabilities far beyond humans’.
• Why, despite triggering the singularity — beyond which we cannot make confident predictions about the future — he’s optimistic that AGI will be a positive development for humankind.
• The connections between self-awareness, consciousness and the ASI of the future.
• With admittedly wide error bars, what a society that includes ASI may look like.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
Brain-Computer Interfaces and Neural Decoding, with Prof. Bob Knight
In today's extraordinary episode, Prof. Bob Knight details how ML-powered brain computer interfaces (BCIs) could allow real-time thought-to-speech synthesis and the reversal of cognitive decline associated with aging.
This is a rare treat as "Dr. Bob" doesn't use social media and has only made two previous podcast appearances: on Ira Flatow's "Science Friday" and a little-known program called "The Joe Rogan Experience".
Dr. Bob:
• Is Professor of Neuroscience and Psychology at University of California, Berkeley.
• Is Adjunct Professor of Neurology and Neurosurgery at UC San Francisco.
• Over his career, has amassed tens of millions of dollars in research funding, 75 patents, and countless international awards for neuroscience and cognitive computing research.
• His hundreds of papers have together been cited over 70,000 times.
In this episode, Bob details:
• Why the “prefrontal cortex” region of our brains makes us uniquely intelligent relative to all the other species on this planet.
• The invaluable data that can be gathered by putting recording electrodes through our skulls and directly into our brains.
• How "dynamic time-warping" algorithms allow him to decode imagined sounds, even musical melodies, through recording electrodes implanted into the brain.
• How BCIs are life-changing for a broad range of illnesses today.
• The extraordinary ways that advances in hardware and machine learning could revolutionize medical care with BCIs in the coming years.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
NLP with Transformers, feat. Hugging Face’s Lewis Tunstall
Lewis Tunstall — brilliant author of the bestseller "NLP with Transformers" and an ML Engineer at Hugging Face — today details how to train and deploy your own LLMs, the race for an open-source ChatGPT, and why RLHF leads to better models.
Dr. Tunstall:
• Is an ML Engineer at Hugging Face, one of the most important companies in data science today because they provide much of the most critical infrastructure for A.I. through open-source projects such as their ubiquitous Transformers library, which has a staggering 100,000 stars on GitHub.
• Is a member of Hugging Face’s prestigious research team, where he is currently focused on bringing us closer to having an open-source equivalent of ChatGPT by building tools that support RLHF (reinforcement learning from human feedback) and large-scale model evaluation.
• Authored “Natural Language Processing with Transformers”, an exceptional bestselling book that was published by O'Reilly last year and covers how to train and deploy Large Language Models (LLMs) using open-source libraries.
• Prior to Hugging Face, was an academic at the University of Bern in Switzerland and held data science roles at several Swiss firms.
• Holds a PhD in theoretical and mathematical physics from Adelaide in Australia.
Today’s episode is definitely on the technical side so will likely appeal most to folks like data scientists and ML engineers, but as usual I made an effort to break down the technical concepts Lewis covered so that anyone who’s keen to be aware of the cutting edge in NLP can follow along.
In the episode, Lewis details:
• What transformers are.
• Why transformers have become the default model architecture in NLP in just a few years.
• How to train NLP models when you have little to no labeled data available.
• How to optimize LLMs for speed when deploying them into production.
• How you can optimally leverage the open-source Hugging Face ecosystem, including their Transformers library and their hub for ML models and data.
• How RLHF aligns LLMs with the outputs users would like.
• How open-source efforts could soon meet or surpass the capabilities of commercial LLMs like ChatGPT.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
CatBoost: Powerful, efficient ML for large tabular datasets
CatBoost is making waves in open-source ML as it's often the top approach for tasks as diverse as classification, regression, ranking, and recommendation. This is especially so if working with tabular data that include categorical variables.
With this justifiable excitement in mind, today's "Five-Minute Friday" episode of SuperDataScience is dedicated to CatBoost (short for “category” and “boosting”).
CatBoost has been around since 2017, when it was released by Yandex, a tech giant based in Moscow. In a nutshell, CatBoost — like the more established (and regularly Kaggle-leaderboard-topping) approaches XGBoost and LightGBM — is at its heart a decision-tree algorithm that leverages gradient boosting. So that explains the “boost” part of CatBoost.
The “cat” (“category”) part comes from CatBoost’s superior handling of categorical features. If you’ve trained models on categorical data before, you’ve likely experienced the tedium of the preprocessing and feature engineering it requires. CatBoost comes to the rescue here, handling categorical variables automatically with techniques such as target encoding and one-hot encoding, eliminating the need for extensive manual preprocessing or feature engineering.
In addition to CatBoost’s superior handling of categorical features, the algorithm also makes use of:
• A specialized gradient-based optimization scheme known as Ordered Boosting, which computes each example’s residuals using only models fit on examples earlier in a random permutation of the training data, reducing target leakage and the resulting prediction shift.
• Symmetric decision trees, which have a fixed tree depth that enables a faster training time relative to XGBoost and a comparable training time to LightGBM, which is famous for its speed.
• Regularization techniques, such as the well-known L2 regularization, which (together with the ordered boosting and symmetric trees already discussed) make CatBoost less likely to overfit to training data than other boosted-tree algorithms.
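The ordered target-statistics idea behind CatBoost's categorical handling can be sketched in a few lines of plain Python. To be clear, this is a simplified illustration of the principle, not CatBoost's actual implementation: each example's category is encoded using only the labels of examples that appeared *earlier* in a random permutation, plus a smoothing prior, so the encoding never leaks an example's own target.

```python
import random

def ordered_target_encode(categories, labels, prior=0.5, strength=1.0, seed=42):
    """Encode each categorical value using target statistics computed
    only from examples earlier in a random permutation (no target leakage)."""
    rng = random.Random(seed)
    order = list(range(len(categories)))
    rng.shuffle(order)

    sums = {}    # running sum of labels seen so far, per category
    counts = {}  # running count of examples seen so far, per category
    encoded = [0.0] * len(categories)

    for idx in order:
        cat = categories[idx]
        s, c = sums.get(cat, 0.0), counts.get(cat, 0)
        # Smoothed target statistic from *earlier* examples only;
        # a category's first occurrence simply receives the prior.
        encoded[idx] = (s + prior * strength) / (c + strength)
        sums[cat] = s + labels[idx]
        counts[cat] = c + 1
    return encoded

features = ["red", "blue", "red", "red", "blue"]
targets = [1, 0, 1, 0, 0]
print(ordered_target_encode(features, targets))
```

In practice you would just pass your raw categorical columns to CatBoost via its `cat_features` argument and let the library do this (and more) for you; the sketch is only meant to show why the scheme avoids the leakage that naive target encoding suffers from.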
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
YOLO-NAS: The State of the Art in Machine Vision, with Harpreet Sahota
Deci's YOLO-NAS architecture provides today's state of the art in Machine Vision, specifically the key task of Object Detection. Harpreet Sahota joins us from Deci today to detail YOLO-NAS as well as where Computer Vision is going next.
Harpreet:
• Leads the deep learning developer community at Deci AI, an Israeli startup that has raised over $55m in venture capital and that recently open-sourced the YOLO-NAS deep learning model architecture.
• Through prolific data science content creation, including The Artists of Data Science podcast and his LinkedIn live streams, has amassed a social-media following in excess of 70,000.
• Previously worked as a lead data scientist and as a biostatistician.
• Holds a master’s in mathematics and statistics from Illinois State University.
Today’s episode will likely appeal most to technical practitioners like data scientists, but we did our best to break down technical concepts so that anyone who’d like to understand the latest in machine vision can follow along.
In the episode, Harpreet details:
• What exactly object detection is.
• How object detection models are evaluated.
• How machine vision models have evolved to excel at object detection, with an emphasis on the modern deep learning approaches.
• How a “neural architecture search” algorithm enabled Deci to develop YOLO-NAS, an optimal object detection model architecture.
• The technical approaches that will enable large architectures like YOLO-NAS to be compute-efficient enough to run on edge devices.
• His “top-down” approach to learning deep learning, including his recommended learning path.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
Lossless LLM Weight Compression: Run Huge Models on a Single GPU
Many recent episodes have been focused on open-source Large Language Models that you can download and fine-tune to particular use cases depending on your needs or your users’ needs. I’ve particularly been highlighting LLMs with seven billion up to 13 billion model parameters because this size of model can typically be run on a single consumer GPU so it’s relatively manageable and affordable both to train and have in production.
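To see why that seven-to-13-billion-parameter range is the consumer-GPU sweet spot, a quick back-of-the-envelope calculation helps. The sketch below assumes 16-bit (2-byte) weights and counts only the memory needed to hold the weights themselves:

```python
def model_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed just to hold the model weights."""
    return n_params * bytes_per_param / 1e9

# 16-bit weights: a 7B model needs ~14 GB, a 13B model ~26 GB.
print(f"7B  @ fp16: {model_memory_gb(7e9):.0f} GB")
print(f"13B @ fp16: {model_memory_gb(13e9):.0f} GB")
# 8-bit quantization halves that, bringing 13B within reach of a 24 GB card.
print(f"13B @ int8: {model_memory_gb(13e9, bytes_per_param=1):.0f} GB")
```

Real usage adds activation memory and, for training, optimizer state on top of the weights, so training typically needs several times more memory than inference, but the weight count alone already shows why this model size lands near the limits of a single consumer GPU.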
A.I. Accelerators: Hardware Specialized for Deep Learning
Today we’ve got an episode dedicated to the hardware we use to train and run A.I. models (particularly LLMs) such as GPUs, TPUs and AWS's Trainium and Inferentia chips. Ron Diamant may be the best guest on earth for this fascinating topic.
Ron:
• Works at Amazon Web Services (AWS) where he is Chief Architect for their A.I. Accelerator chips, which are designed specifically for training (and making inferences with) deep learning models.
• Holds over 200 patents across a broad range of processing hardware, including security chips, compilers and, of course, A.I. accelerators.
• Has been at AWS for nearly nine years – since the acquisition of the Israeli hardware company Annapurna Labs, where he served as an engineer and project manager.
• Holds a Masters in Electrical Engineering from Technion, the Israel Institute of Technology.
Today’s episode is on the technical side but doesn’t assume any particular hardware expertise. It’s primarily targeted at people who train or deploy machine learning models but might be accessible to a broader range of listeners who are curious about how computer hardware works.
In the episode, Ron details:
• CPUs versus GPUs.
• GPUs versus specialized A.I. Accelerators such as Tensor Processing Units (TPUs) and his own Trainium and Inferentia chips.
• The “AI Flywheel” effect between ML applications and hardware innovations.
• The complex tradeoffs he has to consider when embarking upon a multi-year chip-design project.
• The various ways we can split up training and inference across our available devices once models reach Large Language Model scale, with billions of parameters.
• How to get popular ML libraries like PyTorch and TensorFlow to interact optimally with A.I. accelerator chips.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
How to Catch and Fix Harmful Generative A.I. Output
Today, the A.I. entrepreneur Krishna Gade joins me to detail open-source solutions for overcoming the safety and security issues associated with generative A.I. systems, such as those powered by Large Language Models (LLMs).
The remarkably well-spoken Krishna:
• Is Co-Founder and CEO of Fiddler AI, an observability platform that has raised over $45m in venture capital to build trust in A.I. systems.
• Previously worked as an engineering manager on Facebook’s Newsfeed, as Head of Data Engineering at Pinterest, and as a software engineer at both Twitter and Microsoft.
• Holds a Masters in Computer Science from the University of Minnesota.
In this episode, Krishna details:
• How the LLMs that enable Generative A.I. are prone to inaccurate statements, can be biased against protected groups and are susceptible to exposing private data.
• How these undesirable and even harmful LLM outputs can be identified and remedied with open-source solutions like the Fiddler Auditor that his team has built.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
Observing LLMs in Production to Automatically Catch Issues
Today, Amber Roberts and Xander Song provide a technical deep dive into the major challenges (such as drift) that A.I. systems (particularly LLMs) face in production. They also detail solutions, such as open-source ML Observability tools.
Both Amber and Xander work at Arize AI, an ML observability platform that has raised over $60m in venture capital.
Amber:
• Serves as an ML Growth Lead at Arize, where she has also been an ML engineer.
• Prior to Arize, worked as an AI/ML product manager at Splunk and as the head of A.I. at Insight Data Science.
• Holds a Masters in Astrophysics from the Universidad de Chile in South America.
Xander:
• Serves as a developer advocate at Arize, specializing in their open-source projects.
• Prior to Arize, he spent three years as an ML engineer.
• Holds a Bachelors in Mathematics from UC Santa Barbara as well as a BA in Philosophy from the University of California, Berkeley.
Today’s episode will appeal primarily to technical folks like data scientists and ML engineers, but we made an effort to break down technical concepts so that it’s accessible to anyone who’d like to understand the major issues that A.I. systems can develop once they’re in production as well as how to overcome these issues.
In the episode, Amber and Xander detail:
• The kinds of drift that can adversely impact a production A.I. system, with a particular focus on the issues that can affect Large Language Models (LLMs).
• What ML Observability is and how it builds upon ML Monitoring to automate the discovery and resolution of production A.I. issues.
• Open-source ML Observability options.
• How frequently production models should be retrained.
• How ML Observability relates to discovering model biases against particular demographic groups.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
Six Reasons Why Building LLM Products Is Tricky
Many of my recent podcast episodes have focused on the bewildering potential of fine-tuning open-source Large Language Models (LLMs) to your specific needs. There are, however, six big challenges when bringing LLMs to your users:
1. Strictly limited context windows
2. LLMs are slow and compute-intensive at inference time
3. "Engineering" reliable prompts can be tricky
4. Prompt-injection attacks make you vulnerable to data and IP theft
5. LLMs aren't (usually) products on their own
6. There are legal and compliance issues
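Challenge 1 is often handled with a crude but effective guard: trim the oldest conversation turns until the prompt fits the window. Here's a minimal sketch of that idea, using whitespace-delimited word counts as a stand-in for a real tokenizer (actual token counts depend on the model's tokenizer):

```python
def fit_to_context(turns, max_tokens):
    """Keep the most recent conversation turns that fit the token budget.
    Uses whitespace splitting as a crude proxy for a real tokenizer."""
    kept, used = [], 0
    # Walk backwards from the newest turn, keeping turns until the budget
    # is exhausted, then restore chronological order.
    for turn in reversed(turns):
        n = len(turn.split())
        if used + n > max_tokens:
            break
        kept.append(turn)
        used += n
    return list(reversed(kept))

history = [
    "user: summarize this long document please",
    "assistant: here is a summary of the document",
    "user: now translate it to French",
]
print(fit_to_context(history, max_tokens=12))
```

Production systems tend to do something smarter, such as summarizing the dropped turns rather than discarding them outright, but the budget-enforcement shape is the same.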
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
Generative Deep Learning, with David Foster
Today, bestselling author David Foster provides a fascinating technical introduction to cutting-edge Generative A.I. concepts including variational autoencoders, diffusion models, contrastive learning, GANs and (my favorite!) "world models".
David:
• Wrote the O'Reilly book “Generative Deep Learning”; the first edition from 2019 was a bestseller while the second edition was released just last week.
• Is a Founding Partner of Applied Data Science Partners, a London-based consultancy specialized in end-to-end data science solutions.
• Holds a Master’s in Mathematics from the University of Cambridge and a Master’s in Management Science and Operational Research from the University of Warwick.
Today’s episode is deep in the weeds on generative deep learning pretty much from beginning to end and so will appeal most to technical practitioners like data scientists and ML engineers.
In the episode, David details:
• How generative modeling is different from the discriminative modeling that dominated machine learning until just the past few months.
• The range of application areas of generative A.I.
• How autoencoders work and why variational autoencoders are particularly effective for generating content.
• What diffusion models are and how latent diffusion in particular results in photorealistic images and video.
• What contrastive learning is.
• Why “world models” might be the most transformative concept in A.I. today.
• What transformers are, how variants of them power different classes of generative models such as BERT architectures and GPT architectures, and how blending generative adversarial networks with transformers supercharges multi-modal models.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
Open-Source “Responsible A.I.” Tools, with Ruth Yakubu
In today's episode, Ruth Yakubu details what Responsible A.I. is and open-source options for ensuring we deploy A.I. models — particularly the Generative variety that are rapidly transforming industries — responsibly.
Ruth:
• Has been a cloud expert at Microsoft for nearly seven years; for the past two, she’s been a Principal Cloud Advocate that specializes in A.I.
• Previously worked as a software engineer and manager at Accenture.
• Has been a featured speaker at major global conferences like Websummit.
• Studied computer science at the University of Minnesota.
In this episode, Ruth details:
• The six principles that underlie whether a given A.I. model is responsible or not.
• The open-source Responsible A.I. Toolbox that allows you to quickly assess how your model fares across a broad range of Responsible A.I. metrics.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
Tools for Building Real-Time Machine Learning Applications, with Richmond Alake
Today, the astonishingly industrious ML Architect and entrepreneur Richmond Alake crisply describes how to rapidly develop robust and scalable Real-Time Machine Learning applications.
Richmond:
• Is a Machine Learning Architect at Slalom Build, a huge Seattle-based consultancy that builds products embedded with analytics and ML.
• Is Co-Founder of two startups: one uses computer vision to correct people’s form in the gym and the other is a generative A.I. startup that works with human speech.
• Creates/delivers courses for O'Reilly and writes for NVIDIA.
• Previously worked as a Computer Vision Engineer and as a Software Developer.
• Holds a Masters in Computer Vision, ML and Robotics from the University of Surrey.
Today’s episode will appeal most to technical practitioners, particularly those who incorporate ML into real-time applications, but there’s a lot in this episode for anyone who’d like to hear about the latest tools for developing real-time ML applications from a leader in the field.
In this episode, Richmond details:
• The software choices he’s made up and down the application stack — from databases to ML to the front-end — across his startups and the consulting work he does.
• The most valuable real-time ML tools he teaches in his courses.
• Why writing for the public is an invaluable career hack that everyone should be taking advantage of.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
Get More Language Context out of your LLM
The "context window" limits the number of words that can be input to (or output by) a given Large Language Model. Today's episode introduces FlashAttention, a trick that allows for much larger context windows.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.
Contextual A.I. for Adapting to Adversaries, with Dr. Matar Haller
Today, the wildly intelligent Dr. Matar Haller introduces Contextual A.I. (which considers adjacent, often multimodal information when making inferences) as well as how to use ML to build a moat around your company.
Matar:
• Is VP of Data and A.I. at ActiveFence, an Israeli firm that has raised over $100m in venture capital to protect online platforms and their users from malicious behavior and malicious content.
• Is renowned for her top-rated presentations at leading conferences.
• Previously worked as Director of Algorithmic A.I. at SparkBeyond, an analytics platform.
• Holds a PhD in neuroscience from the University of California, Berkeley.
• Prior to data science, taught soldiers how to operate tanks.
Today’s episode has some technical moments that will resonate particularly well with hands-on data science practitioners but for the most part the episode will be interesting to anyone who wants to hear from a brilliant person on cutting-edge A.I. applications.
In this episode, Matar details:
• The “database of evil” that ActiveFence has amassed for identifying malicious content.
• Contextual A.I. that considers adjacent (and potentially multimodal) information when classifying data.
• How to continuously adapt A.I. systems to real-world adversarial actors.
• The machine learning model-deployment stack she uses.
• The data she collected directly from human brains and how this research relates to the brain-computer interfaces of the future.
• Why being a preschool teacher is a more intense job than the military.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.