Probabilistic Programming Primer

Data Science and interpretability


I’ve been involved in industrial applications of machine learning, analytics and what is generally referred to as ‘data science’ for about 5 years now, and for longer than that if you count my time in academia.

We’ve seen remarkable advances in that time, and a greater appreciation of the value of data in industry. We now live in a world where practically every business process, whether it is a supply chain process, a marketing process or selling an insurance product, includes the collection of data as part of its value chain. With newer sensors, data collection is becoming even more ubiquitous. This has led to data sets that are both very large and very diverse. Together with the immense computing power now accessible via cloud computing, this gave rise to what we’ll call ‘machine learning’. Historically, accuracy has been treated as more important than interpretability, and this has resulted in black box systems that can classify the world better than before but that we can’t explain. With the rise of newer regulatory regimes such as GDPR and changing consumer preferences, the ‘computer says so’ answer will be unsatisfactory.

There is an approach that lets you handle small data sets, explain what your model is predicting and encode domain knowledge, all at once: a statistical technique called Bayesian inference.

This technique begins with our stating prior beliefs about the system being modelled, allowing us to encode expert opinion and domain-specific knowledge into our system. This can be very useful, for example, in regulatory contexts or in models that rely on deep domain knowledge. These beliefs are combined with data to constrain the details of the model. Then, when used to make a prediction, the model doesn’t give one answer, but rather a distribution of likely answers, allowing us to make decisions about risks.

Bayesian inference has long been a method of choice in academic science for just those reasons: it natively incorporates the idea of confidence, it performs well with sparse data, and the model and results are highly interpretable.
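To make the "prior beliefs plus data gives a distribution of answers" idea concrete, here is a minimal sketch in plain Python. It assumes a made-up scenario (estimating a conversion rate) and uses the textbook Beta-binomial update; the prior parameters and the data are hypothetical numbers chosen purely for illustration.

```python
import random

# Prior belief (hypothetical): the conversion rate is probably around 10%.
# A Beta(2, 18) distribution has mean 2 / (2 + 18) = 0.10.
prior_alpha, prior_beta = 2.0, 18.0

# A small, made-up data set: 7 conversions out of 40 visitors.
successes, trials = 7, 40

# Conjugate update: a Beta prior combined with binomial data
# gives a Beta posterior with these parameters.
post_alpha = prior_alpha + successes
post_beta = prior_beta + (trials - successes)

# The posterior mean sits between the prior mean (0.10)
# and the raw observed rate (7 / 40 = 0.175).
posterior_mean = post_alpha / (post_alpha + post_beta)

# The answer is a whole distribution, so we can summarise its spread.
# Here we approximate a 95% credible interval from posterior samples.
rng = random.Random(0)
samples = sorted(rng.betavariate(post_alpha, post_beta) for _ in range(20_000))
lower = samples[int(0.025 * len(samples))]
upper = samples[int(0.975 * len(samples))]

print(f"posterior mean: {posterior_mean:.3f}")
print(f"95% credible interval: ({lower:.3f}, {upper:.3f})")
```

The interval is what lets you reason about risk: instead of a single estimated rate, you get a range of plausible rates given both your prior and the data you actually saw.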

This makes it simple to combine what you know about the world with a relatively small or messy data set to predict what the world might look like in the future.

What’s new? Why now?

Until recently, the practical engineering challenges of implementing these systems were prohibitive, and required a large amount of specialized knowledge. Recently, a new programming paradigm, probabilistic programming, has emerged. Probabilistic programming hides the complexity of Bayesian inference, making these advanced techniques accessible to a broad audience of programmers and data analysts.
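As a rough illustration of that separation, here is a toy sketch in plain Python. You write the model as an ordinary program that tells a generative story, and a generic inference routine does the rest. The names `model` and `infer` are hypothetical, and the rejection sampler is chosen only because it is easy to read; real probabilistic programming systems such as PyMC or Stan use far more efficient inference algorithms.

```python
import random

def model(rng):
    """Generative story: pick an unknown coin bias, then flip the coin 10 times."""
    bias = rng.random()  # prior: any bias between 0 and 1 is equally likely
    heads = sum(rng.random() < bias for _ in range(10))
    return bias, heads

def infer(model, observed, n_runs=50_000, seed=0):
    """Generic rejection sampler: keep the latent value from every run
    of the model whose simulated data matches what we observed."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_runs):
        latent, simulated = model(rng)
        if simulated == observed:
            accepted.append(latent)
    return accepted

# Hypothetical observation: 8 heads in 10 flips.
posterior = infer(model, observed=8)
mean_bias = sum(posterior) / len(posterior)
print(f"{len(posterior)} accepted samples, posterior mean bias ~ {mean_bias:.2f}")
```

Note that `infer` knows nothing about coins. Swapping in a different `model` function would reuse the same inference code, which is the core idea that probabilistic programming languages industrialise.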

The new course

I’m delighted to announce a course called Probabilistic Programming Primer. This course will take you through the underlying ideas of modern-day probabilistic programming in a developer- and analyst-friendly way. The only knowledge you’ll need is high school or early university level probability and linear algebra, and I’ve included pointers to other resources if you need a refresher.

If you’re interested, you can sign up at this link.

If you’d like a free taster, there is a video here where I introduce the Bayesian way of doing A/B testing.

