One of the defining characteristics of time series data is that it often fluctuates sharply over time, as shown in the following figure:
As the figure shows, the data points are so dense and fluctuate so frequently that once they are connected by line segments they overlap into an unreadable blur, making the visualization ineffective. The raw data points therefore need to be sampled, for example from 10,000 raw points down to 200.
A simple sampling algorithm is to compute a statistic such as the average, maximum, or minimum over groups of points. To sample 10,000 raw points into 200, for example, divide the raw points into 200 groups of 50 points each (200 × 50 = 10,000), then take the average of all the raw points in each group.
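A minimal sketch of this grouping-and-averaging approach (the use of NumPy and the synthetic series here are illustrative assumptions, not part of the original article):

```python
import numpy as np

# Stand-in for a real series: 10,000 raw data points.
raw = np.random.randn(10_000).cumsum()

# 10,000 raw points -> 200 sampled points: 200 buckets of 50 points each.
buckets = raw.reshape(200, 50)           # 200 * 50 == 10,000

mean_sampled = buckets.mean(axis=1)      # one average per bucket
max_sampled = buckets.max(axis=1)        # or keep each bucket's maximum
min_sampled = buckets.min(axis=1)        # or keep each bucket's minimum
```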
This algorithm is very simple, but it has a problem: many details of the original data's fluctuations are lost, as shown in the following figure:
The gray line in the figure is the raw data, and the dark line is the sampled data. The data becomes much smoother after sampling, and many details are lost. In particular, a very prominent peak in the raw data, highlighted by the red box, is erased entirely, even though it likely represents a business anomaly.
The paper "Downsampling Time Series for Visual Representation" describes a sampling algorithm called LTTB (Largest-Triangle-Three-Buckets; several similar algorithms are also covered in the paper) that preserves the fluctuation details of the original data while sampling. The underlying principle is not expanded on here; only the effect of the algorithm is shown.
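For reference, a minimal NumPy sketch of the LTTB idea as described in the paper: keep the first and last points, split the interior points into buckets, and from each bucket pick the point that forms the largest triangle with the previously selected point and the average of the next bucket. The function name `lttb` and all variable names here are assumptions for illustration.

```python
import numpy as np

def lttb(x, y, n_out):
    """Largest-Triangle-Three-Buckets downsampling (sketch)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    if n_out >= n or n_out < 3:
        return x, y

    # Boundaries of the n_out - 2 interior buckets over points 1 .. n-2.
    bounds = np.linspace(1, n - 1, n_out - 1).astype(int)

    selected = [0]          # always keep the first point
    a = 0                   # index of the most recently selected point
    for i in range(n_out - 2):
        lo, hi = bounds[i], bounds[i + 1]
        # Average point of the *next* bucket; for the final bucket,
        # the "next" point is simply the last point of the series.
        if i < n_out - 3:
            avg_x = x[bounds[i + 1]:bounds[i + 2]].mean()
            avg_y = y[bounds[i + 1]:bounds[i + 2]].mean()
        else:
            avg_x, avg_y = x[-1], y[-1]
        # Twice the triangle area for each candidate in the bucket;
        # the constant factor does not change the argmax.
        area = np.abs((x[a] - avg_x) * (y[lo:hi] - y[a])
                      - (x[a] - x[lo:hi]) * (avg_y - y[a]))
        a = lo + int(np.argmax(area))
        selected.append(a)

    selected.append(n - 1)  # always keep the last point
    idx = np.array(selected)
    return x[idx], y[idx]

# Example: downsample 10,000 noisy points to 200 while keeping peaks.
xs = np.arange(10_000)
ys = np.sin(xs / 30) + np.random.randn(10_000) * 0.1
sx, sy = lttb(xs, ys, 200)
```

Because each bucket keeps an actual raw point rather than an average, sharp peaks like the one in the red box above survive the sampling.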
LTTB can be widely applied in monitoring products, where it solves the following two problems well: