Interview with a Data Scientist (Hadley Wickham)

(Repost from 2015) I recently interviewed Hadley Wickham, the creator of ggplot2 and a well-known figure in the R statistics community. He works for RStudio, where his job is to build open source software aimed at data geeks. Hadley is famous for his contributions to data science tooling, and his work inspires projects in a lot of other languages! I include some light edits. 1.…

Data Science as a Process

Hilary Mason, one of the shining lights of the world of data science, tweeted recently: ‘Data people: What is the very first thing you do when you get your hands on a new data set?’ ‘What I do when I get a new dataset’, a recent article on the Simply Statistics blog, is a response…

Information Retrieval

Attention conservation notice: 680 words about Information Retrieval, and highly unoriginal. The following is very much inspired by a course by Cosma Shalizi, but I felt it was worth rewriting to get to grips with the concepts. This is the first of what is hopefully a series of posts on ‘Information Retrieval’, and applications of…