A deterministic forecast is a prediction of a single value for a point or interval in time. In contrast, a probabilistic forecast is a prediction of a distribution of values. We're still rather new to this field, but one method that fascinated us when we learned about it is the analogue ensemble.
An analogue ensemble is built on past observations and works like this: given a current situation, we look for similar situations in the past, the so-called analogues. What it means to be similar has to be defined, e.g. via similar feature values. Since we know how those past situations continued, we can treat their futures as possible predictions for our current situation. The distribution of those predictions is the analogue ensemble forecast.
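The idea can be sketched in a few lines. This is a minimal brute-force version, not the code we actually used; the function name, window length, and toy sine series are ours for illustration:

```python
import numpy as np

def analogue_ensemble(series, window=10, horizon=1, k=20):
    """Return the `horizon`-step futures of the k past windows
    most similar to the current (last) window of `series`."""
    current = series[-window:]
    # Candidate end points of past windows; each must leave room for a future.
    stops = range(window, len(series) - horizon)
    candidates = np.array([series[s - window:s] for s in stops])
    futures = np.array([series[s:s + horizon] for s in stops])
    # Similarity = Euclidean distance between feature windows.
    dists = np.linalg.norm(candidates - current, axis=1)
    nearest = np.argsort(dists)[:k]
    return futures[nearest]  # shape (k, horizon): the ensemble forecast

# Toy example: a noisy sine wave stands in for a real measurement series.
rng = np.random.default_rng(0)
series = np.sin(np.linspace(0, 60, 600)) + 0.1 * rng.standard_normal(600)
ensemble = analogue_ensemble(series)
print(ensemble.mean(), ensemble.std())  # summary of the forecast distribution
```

Note the linear cost: every forecast compares the current window against every past window, which is exactly what motivates the tree structure below.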
In order to construct such an ensemble, we need to compare our current data against a lot of past data, which reminded us of a paper we read some years ago.
"Human motion database with a binary tree and node transition graphs" by Katsu Yamane et al. describes how to construct a binary tree from measured time series. In their case the measurements are human motion, tracked with body sensors, and the fascinating part is that thanks to the tree structure, new data can be compared to past measurements rather quickly. Here is a figure from their paper where they visualize some tree nodes on top of the time series.
We were wondering whether an analogue ensemble could also benefit from such a binary tree structure, so we tested it to see where it leads us.
For the similarity we simply used the last 10 values before a timestamp as features and applied k-means clustering. In the binary tree, every node applies k-means with k=2 to its input data, and the two resulting clusters become the inputs of its two child nodes. A tree of depth three therefore has eight clusters in its bottom layer, but it only takes six distance comparisons (two per level) to decide which cluster a new data point belongs to. The runtime grows logarithmically instead of linearly.
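A compact sketch of that tree, again with names and details of our own choosing (a hand-rolled two-cluster k-means keeps it dependency-free; the real implementation is in our repository):

```python
import numpy as np

def two_means(points, iters=20, seed=0):
    """Plain k-means with k=2; returns (centers, labels)."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), 2, replace=False)].astype(float)
    for _ in range(iters):
        # Distance of every point to both centers, then assign and update.
        d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in (0, 1):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return centers, labels

def build_tree(points, depth):
    """Each node splits its input into two clusters for its children."""
    if depth == 0 or len(points) < 2:
        return {"points": points}  # leaf: one bottom-layer cluster
    centers, labels = two_means(points)
    return {
        "points": points,  # kept so every node has a distribution to plot
        "centers": centers,
        "children": [build_tree(points[labels == j], depth - 1) for j in (0, 1)],
    }

def descend(tree, x):
    """Two distance comparisons per level instead of one per leaf cluster."""
    path = []
    while "children" in tree:
        j = int(np.linalg.norm(tree["centers"] - x, axis=1).argmin())
        path.append(j)
        tree = tree["children"][j]
    return tree, path

# Toy usage: 500 random ten-value feature windows, tree of depth three.
rng = np.random.default_rng(1)
windows = rng.standard_normal((500, 10))
tree = build_tree(windows, depth=3)
leaf, path = descend(tree, windows[0])
print(len(path), len(leaf["points"]))
```

Keeping the points in every inner node is what later lets us look at the forecast distribution at each depth, not just at the leaves.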
We took a sample from our time series and used the tree to compute probabilistic forecasts for it. The tree structure also lets us watch how the forecasts converge as we go further down the tree.
Here we tried to visualize the tree by plotting the cluster centers and connecting them according to the tree structure.
We also tried to visualize the tree by plotting the distributions of the clusters stored in the nodes. Unfortunately, those plots become hard to read very quickly. Here's a plot of the first few tree layers.
Finally, there are a lot of aspects to consider when evaluating our effort. The choice of features is very important for producing high-quality forecasts, and we did not expect our "last ten measurements" feature to yield very good results. We're also not sure whether k-means is a good way to compute similarities. Still, here is a measurement that we computed forecasts for. Since we can't plot the entire distributions, we show the means of the distributions for different tree depths. (Which kinda defeats the purpose of probabilistic forecasts =D)
As expected, the forecast becomes better in deeper tree layers. We hope to learn more about analogue ensembles in the future, and maybe this exercise will help us. Our tree implementation and the analysis notebook are on our GitHub.