My friend JD Long has been a source of good inspiration over the years as I learned more about analytics, reporting, software and data science.
I don’t have a full answer to his question. In fact, I think it would take a number of blog posts to explore exactly what he’s talking about.
So this post is based on my 5+ years of experience interfacing with engineers, IT professionals, risk analysts, product analysts and others.
First, I want to say that we live in a wonderful open source world, and we should definitely encourage people to learn these new tools. If you have a talented, keen colleague who wants to learn more than ‘just Excel’, teach them.
However, here’s the rub: we’re taking people from Excel-speed prototypes, often built without things like version control or reproducibility, to open source software tools. And we’re doing that without teaching them about reproducibility, or even giving them the support they need.
Lisa is a risk analyst working in insurance, and she wants to learn R. Someone has taught her the basics. She’s learning basic coding, but she’s never been taught things like git or how to write functions properly. She’s working hard without code review, and then one day she wants to get her code into production.
She goes to speak to Victor, a super talented IT professional and a staff engineer.
What goes wrong?
I think one problem is the lack of support for Lisa. It might be because she’s the only analyst with any technical skills, or it might be a lack of support from other engineers or data scientists. Either way, she should be getting code review, mentoring and support.
Victor basically insults her work, talking about functions, OOP and other things. She hasn’t a clue what he’s talking about, and she says, ‘You guys didn’t give me crap when I was building Excel models.’
What happens next?
She talks to her manager because she’s a bit overwhelmed. She starts to go to meetups in her city, meets other professionals, and eventually starts picking up what ‘git’ is, along with other complicated skills.
What does Lisa need?
Lisa is like a lot of us when we start out as analysts. We learn a programming language, we start hearing about things like ‘data science’, and we get bored by the 30 hours of pivot tables we do each week and want to automate them. We start to learn things like Linux and how to run things on a server.
What is the source of this conflict?
I think one of the biggest sources of this conflict is that learning software skills is hard. Some solutions are things like pair programming, and also understanding the difference in mental models between analysts and engineers.
The Excel stuff was ‘quick and dirty’ but probably wasn’t reproducible. In the past I’ve spent 3 weeks trying to get an Excel sheet to work, with all sorts of VBA. At that stage, we should be moving from Excel to something more automated, with more tests and more control. However, moving from ‘dirty prototype’ to ‘well written software code’ involves learning a lot. In fact, there are copious resources out there for this; one of note is Software Carpentry.
What is the solution?
I think part of the solution is acknowledging the sources of conflict. We shouldn’t demotivate people like Lisa; our industry needs all sorts of talented people. I’ll write more on this in the future. This short blog post was just to get a sense of what the problems are and how we can navigate them.
Joel Grus wrote a brilliant talk on his journey from ‘analyst’ to ‘research engineer’. It’s instructive.