Show me the Data. pt.1

Overview of the data sources currently available for all levels

“Without data you’re just another person with an opinion.”

This simple quote is one of my favourites, if not my favourite, because it encapsulates what I’m trying to do with this blog and, to an even greater extent, the kind of assertions I hope to make on this journey. So before I even think about loading VS Code, I need to acquire the right, high-quality data to answer my problem statement.

‘Can we identify a more suitable on-field position for any given (outfield) player using machine learning?’

The following sections break this down into:

The objective of analytics

To even get started making assertions about football, or any event-based activity, we need the capability to analyse historical data and forecast what might happen in the future. Assessing all the analytical options at our disposal is a huge task, but at a high level we can categorise analytics into three distinct types. It’s also key to note that no one type is better than another; in fact, if we are looking to create robust and thorough analytics, these types coexist and work in tandem with each other.

Descriptive Analytics: Insight into the past

Descriptive analytics, or descriptive statistics, do exactly what the name implies: they “describe”, or summarise, raw data and make it interpretable by humans. They are analytics that describe the past, where the past is any point in time at which an event has occurred, whether one minute ago or one year ago. Descriptive analytics are useful because they allow us to learn from past behaviours and understand how they might influence future outcomes. The vast majority of the statistics we encounter in football fall into this category. This comprises information such as:
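To make the idea of summarising raw data concrete, here is a minimal Python sketch. The match records, field names and the `describe` helper are all made up for illustration; they are not taken from any real data source and only show the general shape of a descriptive summary.

```python
from statistics import mean

# Hypothetical per-match records for one player (illustrative values only).
matches = [
    {"goals": 1, "passes_attempted": 40, "passes_completed": 34},
    {"goals": 0, "passes_attempted": 52, "passes_completed": 45},
    {"goals": 2, "passes_attempted": 38, "passes_completed": 30},
    {"goals": 0, "passes_attempted": 50, "passes_completed": 41},
]


def describe(records):
    """Summarise raw match records into a few descriptive statistics."""
    attempted = sum(r["passes_attempted"] for r in records)
    completed = sum(r["passes_completed"] for r in records)
    return {
        "matches": len(records),
        "total_goals": sum(r["goals"] for r in records),
        "goals_per_match": mean(r["goals"] for r in records),
        "pass_completion": completed / attempted,
    }


summary = describe(matches)
print(summary)
```

Nothing here looks forward in time: every number is a restatement of events that have already happened, which is what places it in the descriptive category.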

Predictive Analytics: Understanding the future

Predictive analytics has its roots in the ability to “predict” what might happen. These analytics are about understanding the future. Predictive analytics provides us with actionable insights by applying the descriptive data we have aggregated to statistical models in order to estimate the likelihood of a future outcome. It is important to remember that no statistical algorithm can “predict” the future with 100% certainty, because predictive analytics is founded on probabilities. Key aspects for consideration in predictive analytics can include:
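As a rough illustration of turning aggregated descriptive data into a probability, the Python sketch below feeds a historical goals-per-match average into a simple Poisson model. The Poisson assumption, the sample values and the `prob_at_least_one_goal` helper are my own choices for the example, not a recommended model for real matches.

```python
import math


def prob_at_least_one_goal(goals_per_match):
    """P(goals >= 1) in one match, assuming goals follow a Poisson distribution."""
    if goals_per_match < 0:
        raise ValueError("goals_per_match must be non-negative")
    # P(X >= 1) = 1 - P(X = 0) = 1 - exp(-rate)
    return 1.0 - math.exp(-goals_per_match)


# Hypothetical goals scored in a player's last four matches (illustrative only).
historical_goals = [1, 0, 2, 0]

# Descriptive step: aggregate the past into a single rate.
rate = sum(historical_goals) / len(historical_goals)

# Predictive step: estimate the likelihood of a future outcome.
p = prob_at_least_one_goal(rate)
print(f"Estimated chance of scoring in the next match: {p:.2f}")
```

The output is a likelihood, not a certainty: even with a perfect model, the answer stays strictly between 0 and 1 for any positive scoring rate.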

Prescriptive Analytics: Advise on possible outcomes

The relatively new field of prescriptive analytics allows users to “prescribe” a number of different possible actions and guides them towards a solution. In a nutshell, these analytics are all about providing advice. Prescriptive analytics attempts to quantify the effect of future decisions in order to advise on possible outcomes before the decisions are actually made. At their best, prescriptive analytics predict not only what will happen but also why it will happen, providing recommendations for actions that take advantage of the predictions. This category effectively aims to answer my initial hypothesis. For the prescriptive analytics, we will assess the validity of our predictive outcomes against the real-life scenario.

In part two, we will assess the data landscape in football analytics and how that data can be used for meaningful analysis.

Thanks for reading,

Steve