Filtering by Category: SuperDataScience

The World is Awful (and it’s Never Been Better)

Added on August 2, 2021 by Jon Krohn.

Feel like the world is kinda poopy? Well, it is! BUT, the COVID pandemic notwithstanding, it's also WAY better than ever before. I articulate this idea with data and charts for this week's Five-Minute Friday episode.

Thanks to Benjamin Todd for pointing me in the direction of a blog post by Max Roser (founder of Our World in Data) that formed the basis of this podcast episode.

Watch or listen here.

R in Production

Added on July 27, 2021 by Jon Krohn.

Dutch national-podium-level powerlifter Veerle van Leemput joins me this week to detail how R is not only an option for production, but may in fact be the *best* production option if data models are central to your application.

Over the course of the episode, Veerle runs down for us her favorite R tools for:
• Data gathering
• Model development
• Deployment into production systems

Veerle has held a number of data-science leadership roles at Dutch companies. She now serves as Managing Director and Head of Data Science at Analytic Health, a London-based firm that builds data-centric software for the healthcare industry. And she was silver medalist in the 57kg class of the 2021 Dutch national powerlifting championships with a total of 335kg (~739 pounds) across the back squat, bench press, and deadlift.

Listen or watch here.

DataScienceGo This Weekend

Added on July 21, 2021 by Jon Krohn.

The DataScienceGO conference is this weekend — registration for Friday and Saturday is 100% free! I'm speaking Saturday on the pros and cons of TensorFlow vs PyTorch for training and deploying deep-learning models.

Awesome speakers — whom you may already be familiar with from recent SuperDataScience episodes — include:
• Erica Greene (episode # 435)
• Harpreet Sahota (# 457)
• Andrew Jones (# 483)

I don't (yet!) personally know the other speakers pictured here but their weighty reputations precede them and I'm looking forward to getting to know them better over the course of the weekend: Gabriela de Queiroz, Karen JEAN-FRANCOIS, Yudan Lin, Ken Jee, and Danny Ma.

Free registration here!

Monetizing Machine Learning

Added on July 21, 2021 by Jon Krohn.

This week's guest is the legendary Vin Vashishta! Vin details his A.I. commercialization strategy, which allows data science teams and machine learning companies alike to be profitable and successful long-term.

Vin is founder of and chief data scientist at V Squared, his own consulting practice that specializes in monetizing machine learning by helping Fortune 100 companies with A.I. strategy. He's also the creator of several platforms (including The ML Rebellion) for learning about critical skill gaps related to artificial intelligence such as commercial strategy, data science leadership, and model explainability.

In addition to the episode's focus on A.I. strategy, Vin answers questions from SuperDataScience listeners (thanks, Serg, Joe, Daniel, Nikhil, and Michael!), including on:
• Efficiency gains from no-code or low-code machine learning tools
• The biggest skills gaps that data scientists have
• The most disturbing data sets
• Investing in socially beneficial models
• The most challenging problem with commercializing AI

Listen or watch here.

(With thanks to Harpreet Sahota for another stellar guest suggestion!)

The Price of Your Attention

Added on July 19, 2021 by Jon Krohn.

Time is money. Every second of your life is yours to use, and one of your options is to use it to generate income. You can earn that income hourly or, as a data scientist, invest time in a digitally shareable product with huge potential ROI.

Listen or watch here.

Fixing Dirty Data

Added on July 14, 2021 by Jon Krohn.

My guest this week is the fixer of dirty data herself, the one and only Susan Walsh. We have a lot of laughs in this episode as we discuss how organizations can save substantial sums by tidying up their data.

Susan has worked for a decade as a data-quality specialist for a wide range of firms across the private and public sectors. For the past four years, she's been doing this work as the founder and managing director of her own company, The Classification Guru Ltd. She's also the author of the forthcoming book, "Between the Spreadsheets", and she hosts her own video interview show called "Live from the Data Den".

Listen or watch here.

The History of Calculus

Added on July 11, 2021 by Jon Krohn.

Y'all seem to love these "History of..." episodes, so for Five-Minute Friday this week, here's another one. It's on the History of Calculus! Enjoy 😄

(Leibniz and Newton, who independently devised modern calculus around the same time, are pictured.)

Listen or watch here.

Financial Data Engineering

Added on July 7, 2021 by Jon Krohn.

This week's guest is Doug Eisenstein, an exceptionally clear and content-rich communicator. He fills us in on the complexity of engineering a coherent source of truth for financial models, integrating hundreds of data sources.

Topics covered in the episode include:
• A breakdown of the primary financial sectors and departments
• Why data source integration for finance is wildly complicated
• Specific data engineering approaches that resolve these issues, including entity resolution, knowledge graph mapping, and tri-temporality

Twenty years ago, Doug founded the consulting firm Advanti, which has since become a critical provider of solutions to complex data engineering problems faced by some of the world's largest banks and asset managers, including Morgan Stanley, Bank of America, Citibank, and State Street.

Listen or watch here.

Algorithm Aversion

Added on July 5, 2021 by Jon Krohn.

This article was originally adapted from a podcast, which you can check out here.

In many domains, algorithms predict the future better than human forecasters. Despite this, people are susceptible to a cognitive bias called Algorithm Aversion.

Setting Yourself Apart in Data Science Interviews

Added on June 29, 2021 by Jon Krohn.

For this week's guest episode, I interrogated Andrew Jones on his data science interview secrets. If you want to improve your interview performance — especially if you're in a data-related career — this episode's for you.

Andrew has held a number of senior data roles over the past decade, including at the tech giant Amazon. In those roles, Andrew interviewed hundreds upon hundreds of data scientists, leading him to create his Data Science Infinity educational program, a curriculum that provides you with the hard and soft skills you need to set yourself apart from other data scientists during the interview process.

Listen or watch here.

Continuous Calendars

Added on June 25, 2021 by Jon Krohn.

Extremely practical post for you today! It's on the Continuous Calendar, which in my opinion is vastly superior to the standard monthly calendar in every imaginable respect. Click through for more detail.

Performance Marketing Analytics

Added on June 24, 2021 by Jon Krohn.

My guest this week is Kris Tait, who fills us in on how data and machine learning have transformed — and will continue to transform — marketing, enabling even small firms to effectively target customers and grow their revenue.

In this episode of the SuperDataScience show, we cover:
• What performance marketing is
• The rapidly shifting digital marketing ecosystem, as well as how data and ML can mitigate the risks associated with these changes
• The sweet spot for augmenting human marketers' skills with machines
• How any firm should define metrics to maximize return on marketing investment, thereby ensuring broader commercial success
• The most useful modern data science tools for global digital marketing

Kris is the managing director for the US at Croud (Performance Marketing Agency of the Year), an innovative marketing agency that is driven by data analytics and machine learning algorithms.

Listen or watch here.

Top Resume Tips

Added on June 19, 2021 by Jon Krohn.

In recent weeks, I've received several messages from folks struggling to get callbacks for Data Scientist interviews. In reviewing their résumés, I realized there are five specific tips that I highly recommend adhering to.

You can listen or watch here.

Knowledge Graphs

Added on June 15, 2021 by Jon Krohn.

In this week's guest episode, wildly intelligent and meticulously communicative Maureen Teyssier, Ph.D. explains what Knowledge Graphs are, why they're so powerful, and how to grow a flourishing data science team.

In more detail, in today’s episode we cover:
• The theory and applications of Knowledge Graphs, a cool and powerful data type at the heart of much of Maureen’s work at Reonomy
• The data science techniques that Reonomy uses to flow data through extremely high-volume pipelines, enabling them to efficiently apply models to their massive data sets
• What Maureen looks for in the data scientists that she hires and the tools and approaches she leverages in order to grow a highly effective data science team
• The differences between data scientists, data analysts, data engineers, and machine learning engineers
• Maureen’s fascinating academic work in which she used gigantic supercomputers to simulate solar systems and galaxies

Maureen is Chief Data Scientist at Reonomy, a very well-funded New York start-up — they’ve raised over 100 million dollars — that is transforming the world of commercial real estate with data and data science. Prior to working in industry, Maureen was an academic working in the field of computational astrophysics; she obtained her PhD from Columbia University in the City of New York and then carried out research at Rutgers University in New Jersey.

Listen here.

Five Keys to Success

Added on June 14, 2021 by Jon Krohn.

I've recently been able to achieve markedly better results than ever before across my personal and professional lives. For Five-Minute Friday, I reflect on five keys to success that may allow achievement of many complex, long-term goals.

You can listen or watch here.

How to Thrive as an Early-Career Data Scientist

Added on June 8, 2021 by Jon Krohn.

Getting started in data science? Today's episode is for you! Sidney Arcidiacono is absolutely crushing her first year in the field; we discuss the options for getting started in the field and top tips for early-career success.

Trained as a phlebotomist (a specialist in blood-sample collection), Sidney was inspired by the potential for machine learning to revolutionize healthcare, so she jumped feet first into a full-time computer science degree at Make School, specializing in the data science track. From no familiarity with code or models just a year ago, Sidney's immersion has paid off: She's now fluent in the modern data science software stack and landed a summer data science internship at GreenLight Biosciences, Inc., a firm that develops RNA-based therapeutics (RNA being the molecule behind the Pfizer/BioNTech and Moderna vaccines).

Sidney is terrifically sharp and engaging; I think you'll enjoy hearing from her as much as I did during filming.

Watch or listen here.

Peer-Driven Learning

Added on June 4, 2021 by Jon Krohn.

"Peer-driven" learning — where you are formally taught by your coworkers — not only results in team members learning key new skills, but can have added benefits like team bonding, confidence, and innovation. Something to try!

Today's episode is directly inspired by a LinkedIn post by Laura Rodriguez. She tagged me in the post, citing a SuperDataScience episode on communication and relating it to her workplace at ForwardKeys. Thank you, Laura!

The 20% of Analytics Driving 80% of ROI

Added on June 1, 2021 by Jon Krohn.

Today’s episode is with freakin' David Langer, people!! (So obviously it's brilliant, witty, and full of laughs.) He fills us in on the most powerful 20% of analytics — the analytics that drive 80% of companies’ return on investment.

Publishing under his Dave on Data brand, Dave's YouTube channel is top-notch, with several videos that have over a million views (and the thumbnails are hilarious; check 'em out). He is an exceptionally accomplished data scientist and software engineer, including spending nearly a decade at Microsoft's Global HQ, where his titles included principal software architect, principal data scientist, and director of analytics.

Topics in the episode include:
• Surprisingly powerful modeling approaches in spreadsheet tools like Excel
• The SQL databases we'll need if the data sets we're working with are too big for spreadsheets
• Why R programming is easy and should be our default language choice for moderate to advanced statistical analysis
• How companies can maximize value from machine learning

Listen or watch here.
