An extension of the Data Science process – OSEMIC

One of the most famous taxonomies of data science is OSEMN, pronounced ‘Awesome’.

It stands for Obtain, Scrub, Explore, Model, iNterpret.

I was recently chatting with some data scientists on Twitter, and they asked: shouldn’t it be OSEMIC?

Obtain, Scrub, Explore, Model, Interpret and Communicate!!!

I hadn’t thought of this, but I agree that it is part of the process. Interpretation by a specialist like myself isn’t the full battle; the results need to be translated into something that business stakeholders can understand. And the challenge is to not lose them with ‘this is the R^2 part’.

I think this ‘last mile’ problem of data science is a real challenge: how do you turn something as complicated as a machine learning model or a differential equation model into something that stakeholders can act on? I suspect that this is even harder than just learning the mathematics or the programming. I think data scientists can also learn a lot from storytellers such as journalists and designers.

Thanks to everyone who contributed ideas for this post.

