EmpiricalHR - data-driven employee decisions - Issue #138

Last week I was listening to Pieter Abbeel’s excellent new podcast ‘Robot Brains’, and specifically an interview with Andrej Karpathy, the head of AI at Tesla. What fascinated me was how many similarities there were (resources aside) between the way we have set up our machine learning ‘infrastructure’ for text and the way Tesla have set up theirs.
One key part is the belief that models aren’t created by data scientists but by domain experts curating data. Karpathy calls this ‘software 2.0’. In this shared approach, the data scientists build the infrastructure that lets data labellers train the models.
All of our ML models depend on vast amounts of training data; however, more isn’t automatically better. I haven’t counted, but I’d be surprised if we had fewer than 1 million rows in our carefully curated training dataset for the generic models. As I discuss in my People Analytics World presentations, it’s all in the edge cases. We have approaches to find and label these edge cases. Tesla identify issues (for example, signs hidden in foliage) and then ask their almost 1m cars to send back real-world examples for the labelling teams. Both Tesla and we base our Machine Learning approaches around a family of techniques called ‘Active Learning’.
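To make ‘Active Learning’ a little more concrete, here is a minimal sketch of one common variant, uncertainty sampling: rather than labelling data at random, you ask the labellers to look at the examples the current model is least sure about. This is an illustration only and not a description of how Tesla or we actually implement it; the function names, the comment ids and the probabilities are all made up for the example.

```python
import math


def entropy(probs):
    """Shannon entropy of a predicted class distribution (higher = less certain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labelling(predictions, k):
    """Return the ids of the k unlabelled items the model is least sure about.

    `predictions` maps an item id to the model's predicted class probabilities.
    """
    ranked = sorted(
        predictions,
        key=lambda item_id: entropy(predictions[item_id]),
        reverse=True,  # most uncertain first
    )
    return ranked[:k]


# Hypothetical model outputs for four unlabelled comments (made-up numbers).
predictions = {
    "comment_1": [0.98, 0.01, 0.01],  # model is confident
    "comment_2": [0.40, 0.35, 0.25],  # model is torn between all three classes
    "comment_3": [0.55, 0.44, 0.01],  # model is torn between two classes
    "comment_4": [0.90, 0.05, 0.05],  # model is fairly confident
}

to_label = select_for_labelling(predictions, k=2)
print(to_label)
```

In a real loop the selected items would go to the domain experts, their labels would be added to the training set, the model would be retrained, and the selection would be repeated.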
In my workshop on text analysis at the forthcoming People Analytics World (see below for a discount code), I’m going to share not only how to do text analysis but also the infrastructure, including organisation structure, that I think you need to do this well. It’s not hard to find a guide on how to apply the latest algorithm to text. What I hope to provide is insight into how to build a scalable capability within your teams.
Have a great week.

Ethics
For the analysts
Software Engineering Best Practices for Data Scientists
Why's it hard to teach data cleaning?
Most popular last week
People Analytics World 2021
Andrew Marritt | OrganizationView

EmpiricalHR - the leading weekly People Analytics newsletter. A carefully curated collection of the best writing from around the web for those making data-driven employee decisions.

OrganizationView GmbH, Via dal Bagn 15B, 7500 St. Moritz, Switzerland