Disclosure: I’m not really a data scientist these days; I’m the founder of a tech startup (which has a core AI component). These are my unfair, biased, and prejudiced views, based on roughly a decade of experience as a professional data scientist.
So firstly, I think it’s worth declaring that machine learning and analytics are incredibly important tools for businesses. The business impact of data science CAN be very powerful. However, I think it’s also fair to say that the impact has been somewhat restricted. Data science has been a lucky profession for the past five years or so: you could leave university and get a well-paid job in most major tech markets. Compared to many other skillsets, that is very fortunate.
I also think it has definitely been challenging for some businesses to extract value from data science. I think this comes down to a combination of factors: innovation is hard, incentives matter, and businesses rarely empower their data science teams well enough. There’s been a lot of writing about this, including my own article: https://peadarcoyle.com/2017/08/07/how-do-we-deliver-data-science-in-the-enterprise/
Hacker News has a bad effect on Data Scientists
I also think Hacker News, plus the “they’re the brainiacs” attitude with which some cultures treat R&D-like teams, is part of the problem. In cost-cutting times people cut what they don’t understand, and “innovation teams” are often the first to go. So I used to think I only wanted to work on algorithms or R&D, but the reality is you need to be doing something that has real-world impact or real-world deadlines.
Let’s explore this idea further, and articulate what this means.
I think this Will Larson diagram is super helpful.
It sorts a team’s work along two axes: forced versus discretionary, and short-term versus long-term. Most DevOps teams, say, are in ‘short-term and forced’.
And a lot of data science teams are discretionary and long-term. Now, I used to think this was great: I get to work on discretionary, long-term projects all the time, isn’t that an amazing job? Except there’s a problem, as Will points out: “However, the complete lack of forced and short-term work typically implies that there are few to zero folks using your software, or that the feedback loop between those users and the developers is entirely absent.” How many data science teams suffer from this? They either work on long modeling projects that go on for several quarters with no results, or they work on projects that have no sponsor or user. I call this the trophy data scientist problem.
But you want to be a bit further up: you want to do SOME forced work, because forced means you’re aligned with the business. And data scientists who aren’t, say, fixing bugs in production or doing analytics that is connected to the bottom line... well, they’re in trouble. If you’re in this quadrant, this is probably a time to be very afraid.
Where do you want to be?
As Will says in his talks on this topic, you want to be somewhere in between.
You want some forced work, which means that teams are using your systems for something. You also want to do some discretionary work, so you can continue to develop technical leverage over time. We’ll define technical leverage as the building of technology that delivers continued value to the business. It’s important for ALL teams to do this. I’ve consulted with numerous data science teams over the years, and been a member of some. I’d say that not all data science teams continuously deliver value to the business.
So what can you do?
- Focus on business questions. If a SQL query adds more value than a machine learning model, work on the SQL query.
- Projects need to be trackable. Timelines need to be predictable. Requirements need to be met. Costs need to be controlled. The unlimited budget and lack of oversight are gone.
- Machine learning needs to provide value that’s better than a simpler approach.
- Avi Bryant once told me something like this: the biggest mistake teams make in ML is not focusing on getting something working end-to-end. They either go off and yak-shave a large infrastructure project, OR they build a complicated model without the infrastructure that model needs. When something is in production and can be shown to users it has a LOT of value: they can respond, and the model can improve. Get something into production. Not just a Jupyter notebook.
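
One way to make the “better than a simpler approach” point concrete is to score the model and the simple approach on the same held-out data before investing further. Here’s a minimal sketch, assuming mean absolute error is a sensible metric for your problem and that a 10% relative improvement is the bar to clear. The function names, the threshold, and the toy numbers are all made up for illustration.

```python
from statistics import mean


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def relative_improvement(actual, baseline_pred, model_pred):
    """Fraction by which the model reduces the baseline's error.

    Negative if the model is worse than the baseline.
    """
    baseline_error = mean_absolute_error(actual, baseline_pred)
    model_error = mean_absolute_error(actual, model_pred)
    if baseline_error == 0:
        return 0.0  # the baseline is already perfect; nothing left to improve
    return (baseline_error - model_error) / baseline_error


def worth_shipping(actual, baseline_pred, model_pred, min_improvement=0.10):
    """True if the model beats the baseline by at least min_improvement.

    The 10% default is an arbitrary, hypothetical threshold.
    """
    return relative_improvement(actual, baseline_pred, model_pred) >= min_improvement


# Hypothetical held-out data: weekly sales, a "predict the average" baseline
# (the kind of thing a SQL query could produce), and an ML model's predictions.
actual_sales = [100, 120, 90, 110]
baseline_forecast = [105, 105, 105, 105]
model_forecast = [102, 115, 95, 108]

gain = relative_improvement(actual_sales, baseline_forecast, model_forecast)
ship_it = worth_shipping(actual_sales, baseline_forecast, model_forecast)
```

If `ship_it` comes back `False`, that’s a decent signal to stick with the SQL query. The threshold itself is a business decision, not a technical one, so agree on it with whoever sponsors the project.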
These changes require bringing in experienced, competent leadership at the team level. These are healthy, necessary changes for our field. Maturity is good for our field and the businesses we support.
It’s easy to say ‘but management doesn’t get it’. If you’re the leader of a technical team, your job is to add value and to communicate that value.
P.S. If you’re a data scientist looking for career advice, or if you’ve been affected by a layoff, feel free to send me a message: Peadarcoyle[at]googlemail[dot]com. I can’t promise to help much, but I’ve got some experience and I can share some war stories.