Why building ML systems is about more than ML

I’m using a bit of a clickbait title for this article, but the aim is to share experiences I’ve gathered from about 10 years of building ML systems and ML teams.


Why ML is about more than ML

I saw the following tweets by Erik so I’ve added them.

The key phrase here is ‘the plumbing around it’. Looking back at the whole ‘big data’ wave and the ‘Machine Learning’ wave, I think there was a lot of vendor-driven hype around ‘technology’. As I wrote in this post:

The biggest mistake teams make in ML is they don’t focus on getting something working end-to-end. They either go off and yak-shave a large infrastructure project, OR they build a complicated model without the infrastructure that’s needed for that model. When something is in production and can be shown to users it has a LOT of value: they can respond, and the model can improve. Get something into production. Not just a Jupyter notebook.

Peadar Coyle

I sometimes call this the ‘Hacker News’ effect, and I’ve often found it in teams: someone comes to me with ‘we want to use deep learning because Airbnb/Uber/Google/Stripe/etc. is using it’. They may say that explicitly, or they may use different words that amount to the same thing.

The purpose of any team is to add value to the business. So don’t call yourself a programmer or a data scientist.

I’ve just spent nearly two years building out our pipelines for audio production at my startup, and the core business value is rarely on the ML side. This isn’t to say that ML isn’t important, but it’s a small part of the puzzle.

  • Algorithms are just cogs in a system
  • Algorithms are dumb and are often optimising for a specific use case
  • Algorithms live in a mess

What are the core lessons?

Most ML systems live as part of an end-to-end system. In the case of our audio production system, the rough parts are:

  • A UI for writing text and annotation
  • Data collection and automatic transcription of audio for training jobs
  • ML models for synthetic voice
  • cron jobs for copying wav files and mp3 files from system to system
  • A business rules engine for quality assurance
  • Audio post production process
  • An asset store for storing assets and delivering them via an API

All of these are connected via batch processes, API calls and various cron jobs. Testing this stuff end to end is tricky and requires a nuanced understanding of a domain-specific problem.
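To make the testing point concrete, here is a minimal sketch of an end-to-end smoke test in Python. It assumes each part of the system can be wrapped as a plain function. The `Job` fields, the stage names and the checks are all hypothetical stand-ins, not the real pipeline.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Job:
    """State passed from stage to stage (hypothetical fields)."""
    text: str
    audio: bytes = b""
    passed_qa: bool = False
    log: list[str] = field(default_factory=list)


def synthesize(job: Job) -> Job:
    # Stand-in for the ML model: fake "audio" derived from the text.
    job.audio = job.text.encode("utf-8")
    return job


def quality_check(job: Job) -> Job:
    # Stand-in for a business rule: reject empty audio.
    if not job.audio:
        raise ValueError("empty audio")
    job.passed_qa = True
    return job


def publish(job: Job) -> Job:
    # Stand-in for the asset store: only QA-approved audio is published.
    if not job.passed_qa:
        raise RuntimeError("unapproved audio")
    return job


STAGES: list[tuple[str, Callable[[Job], Job]]] = [
    ("synthesize", synthesize),
    ("quality_check", quality_check),
    ("publish", publish),
]


def run_pipeline(job: Job) -> Job:
    """Run every stage in order and record which stage failed, if any."""
    for name, stage in STAGES:
        try:
            job = stage(job)
            job.log.append(f"{name}: ok")
        except Exception as exc:
            job.log.append(f"{name}: FAILED ({exc})")
            break
    return job
```

Calling `run_pipeline(Job(text="hello"))` and reading `job.log` shows how far a job got. The point of naming each stage is that a failure tells you where to look, rather than just that something is broken.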

ML is just one part of a much larger system. It’s frankly a messy process, and it can often be very difficult to figure out what’s broken. However, our leverage is often on the automation side: improving the reliability of our software allows us to onboard more clients, not to mention offer more product features.

There’s an element of human-in-the-loop here as well, especially for handling edge cases.
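One way to picture the human-in-the-loop part is as a routing step: simple rules decide which outputs go straight through and which get flagged for a person. This is only a sketch. The `Clip` fields, the rule names and the thresholds are all made up for illustration.

```python
from __future__ import annotations

from typing import Callable, NamedTuple


class Clip(NamedTuple):
    clip_id: str
    duration_s: float
    peak_level: float  # 0.0 to 1.0, a hypothetical loudness measure


# Each rule returns True when the clip looks fine. Thresholds are invented.
RULES: dict[str, Callable[[Clip], bool]] = {
    "not_too_short": lambda c: c.duration_s >= 1.0,
    "not_clipping": lambda c: c.peak_level < 0.99,
}


def failed_rules(clip: Clip) -> list[str]:
    """Names of the rules this clip violates, in rule order."""
    return [name for name, rule in RULES.items() if not rule(clip)]


def route(clips: list[Clip]) -> tuple[list[str], list[tuple[str, list[str]]]]:
    """Split clips into auto-approved ids and (id, reasons) pairs for review."""
    auto, review = [], []
    for clip in clips:
        failures = failed_rules(clip)
        if failures:
            review.append((clip.clip_id, failures))
        else:
            auto.append(clip.clip_id)
    return auto, review
```

The reviewer gets the list of failed rules alongside the clip id, so their attention goes to the edge cases and not to everything.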

So what should you do?

  • Focus on understanding business problems. If you can reason about the business and talk in trade-offs, you’ll always have a job.
  • Learn to sell your work as a data scientist. You need to be able to communicate results.
  • Work on your software skills – learn stuff like React, Vue, Serverless etc – so you can show results end to end
  • See yourself as a problem solver.

We’re probably seeing a bit of pushback against AI/ML. Even so, we shouldn’t neglect that we are seeing process improvements and enhancements in productivity. These are often due to automation and the interaction of humans with algorithms, as opposed to anything else.

Further reading

Don’t call yourself a programmer
The future of Data Science is past
