What does it take to build foundation LLMs from scratch today? Deeply impressive Jazmia Henry breaks down the four stages in today's episode, enjoy!
Jazmia:
• Holds degrees from Tulane University and Columbia University... and is partway through a PhD at the University of Oxford.
• Held a technical fellowship at Stanford University.
• Previously worked as a data strategist at Morgan Stanley, head of ML at The Motley Fool and a Lead Applied AI engineer at Microsoft.
• Published a top paper at NeurIPS, the world's most prestigious academic AI conference.
• Currently works as "Member of Technical Staff for AI/ML" at collide., a Texas-based startup that’s building AI infrastructure (including all aspects of specialized foundation models) for the energy industry.
Key topics covered in this episode include:
• What foundation models are.
• Her "full-stack" foundation-model building's four distinct stages.
• How reinforcement learning (RL) models are "bursty" because they idle the GPU during reward calculation and then dump enormous loads on it all at once.
• Reward hacking by RL models.
Thanks to Mark Freeman II for recommending Jazmia as a guest.
The SuperDataScience podcast is available on all major podcasting platforms, YouTube, and at SuperDataScience.com.