One for your machine learning toolkit

Benford’s Distribution


Knowing about Benford’s distribution could just keep you out of jail. Surely that has to be worth an hour of your time?

Data analytics, which is deeply interconnected with machine learning, is essentially about looking for patterns in data. Part of the skill is knowing which patterns are expected and which are unusual.

Benford’s distribution is a pattern that is at once very common and very often overlooked (for the simple reason that most people don’t know about it). In fact, it is so common that it is usually the absence of Benford’s distribution that is significant. A good example is that fraudulently generated data often fails to display this distribution, so fraud detection is often quoted as the classic use of Benford’s. However, to see it simply as a fraud-detection tool is to significantly underestimate its power.
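For readers who want a feel for the pattern before the session, here is a minimal sketch in Python. It relies on the standard statement of Benford's law, that a leading digit d appears with probability log10(1 + 1/d), and compares that with the leading digits of the first 1,000 powers of 2, a sequence chosen purely for illustration. The function names are hypothetical and are not taken from the webcast.

```python
import math
from collections import Counter


def benford_expected():
    """Expected share of each leading digit 1-9 under Benford's law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(n):
    """First decimal digit of a non-zero integer."""
    return int(str(abs(n))[0])


def observed_frequencies(values):
    """Share of each leading digit 1-9 among the non-zero values."""
    digits = [leading_digit(v) for v in values if v != 0]
    counts = Counter(digits)
    total = len(digits)
    return {d: counts.get(d, 0) / total for d in range(1, 10)}


# Illustrative data only (an assumption, not from the source): 2^1 .. 2^1000
sample = [2 ** k for k in range(1, 1001)]

expected = benford_expected()
observed = observed_frequencies(sample)

for d in range(1, 10):
    print(f"{d}: expected {expected[d]:.3f}, observed {observed[d]:.3f}")
```

Running the same comparison on a data set of your own gives a rough first look at whether it follows the pattern. A proper assessment would need a statistical test and, as the session argues, a view on whether the pattern should be expected in that data at all.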

In part one of our MCubed webcast series, Prof. Mark Whitehorn will outline what a Benford’s distribution looks like (obviously!) but, more importantly, will explain WHY this distribution occurs. It is vital to understand this because that knowledge allows you to decide whether the pattern is expected in a given set of data. If it is, then “nothing to see here, move along.” If it isn’t, further investigation may well yield fascinating insights (and put fraudsters in jail).

The session will be rounded off with a look at the latest news in machine learning-related software development.