Detecting events in time series data

Submitted by: @import:stackexchange-cs

Problem

I am collecting data from a sensor over time, and I'm trying to figure out how to detect "events" in the data: specifically, when a given event begins and ends. The frequency, duration, and amplitude of these events vary.

Rule-based schemes have proven rather ineffective, so I want to train a model to detect these events for me. An example of my data (in only one dimension; I have multiple variables that I'm reading) looks like this:

There's one event here and some baseline data on either side.

My raw data before and during this event looks like:

1/1/2016 12:48:03   14.2
1/1/2016 12:48:04   12.8
1/1/2016 12:48:05   13.6
1/1/2016 12:48:06   13.5
1/1/2016 12:48:07   12.9
1/1/2016 12:48:08   15
...
1/1/2016 12:48:34   27.7
1/1/2016 12:48:35   30.3
1/1/2016 12:48:36   31
1/1/2016 12:48:37   32.8
1/1/2016 12:48:38   31.1
1/1/2016 12:48:39   28.7
1/1/2016 12:48:40   32.1
...


I have training data consisting of raw readings and timestamps that I can correlate to a truthset of event start and end times, but I don't know how I should go about featurizing to build a model. I assume I would keep a window of, say, 30 seconds or a minute of readings. (Events are typically more than a minute long, but I don't know whether I would need my window to contain the entire event.)

I'm familiar (but not an expert) with regression and boosted trees, and I know of tools that can generate code I can use without third-party libraries. But I don't know whether these are feasible approaches to solving my problem.

How can I go about detecting these events?

Edit: After some discussion, I've created the following graph to show what more data might look like from different signals. The duration of the manually labeled event is in blue (using the right axis labeled 0/1). Three signals show up strongly:

  • Red covers the overall event reasonably well and is essentially the above example



  • Purple-blue has an even stronger signal but only for the first half

Solution

Your first step is to characterize what effect you expect an event to have on your signals. Does it change the mean? Increase the mean? Change the variability? The more you can say about the type of effect it will have, the more specific a test you'll be able to build, and thus the more effective any analysis is likely to be.
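
As a rough illustration of this first step, the sketch below compares a few summary statistics before and during an event. It assumes the first block of readings in the question is baseline and the second block falls inside the event; the `summarize` helper and the three statistics it computes are illustrative choices, not something the question prescribes.

```python
from statistics import mean, stdev

# Readings copied from the question's sample. Assumption: the first block is
# baseline and the second block lies inside the event.
baseline = [14.2, 12.8, 13.6, 13.5, 12.9, 15.0]
event = [27.7, 30.3, 31.0, 32.8, 31.1, 28.7, 32.1]


def summarize(values):
    """Return the mean, sample standard deviation, and mean absolute
    first difference of a sequence of readings."""
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return {
        "mean": mean(values),
        "std": stdev(values),
        "mean_abs_diff": mean(diffs),
    }


baseline_stats = summarize(baseline)
event_stats = summarize(event)

# Print the two segments side by side to see which statistics move.
for name in baseline_stats:
    print(f"{name}: baseline={baseline_stats[name]:.2f} "
          f"event={event_stats[name]:.2f}")
```

On these few samples the mean more than doubles (about 13.7 versus 30.5), which suggests a shift in the mean is at least one effect worth testing for. A handful of readings is far too little to conclude much else.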

Then, your second step is to apply appropriate techniques from the statistics literature on time series analysis. There are many techniques that look applicable here:

  • You could look for change points, where the mean value changes. For instance, the CUSUM procedure might be useful. Look into the theory on change point detection; there's lots of work on it. See also https://stats.stackexchange.com/questions/tagged/change-point.

  • You might also find literature that describes this as looking for a level shift in time series data. It's basically the same thing.

  • You might be looking for a "structural change" in the signal, which is a bit more broad/general than change point analysis. See, e.g., https://stats.stackexchange.com/questions/tagged/structural-change.
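
To make the CUSUM idea concrete, here is a minimal sketch of a one-sided (upward) CUSUM detector. The function name `cusum_upward`, the parameter names, the parameter values, and the synthetic readings are all assumptions chosen for illustration; in practice the baseline mean, slack, and threshold would have to be tuned on your own labeled data.

```python
def cusum_upward(values, target_mean, slack, threshold):
    """Return the index at which the one-sided upper CUSUM statistic first
    exceeds `threshold`, or None if it never does.

    The statistic accumulates how far each reading sits above
    `target_mean + slack` and is reset to zero whenever it would go negative.
    """
    s = 0.0
    for i, x in enumerate(values):
        s = max(0.0, s + (x - target_mean - slack))
        if s > threshold:
            return i
    return None


# Synthetic readings (not from the question): a flat baseline near 14,
# then a rise toward 30.
readings = [14.0, 13.0, 14.5, 13.5, 13.0, 15.0, 20.0, 26.0, 30.0, 31.0]

# Hypothetical parameter values, picked by eye for this toy series.
alarm_index = cusum_upward(readings, target_mean=14.0, slack=1.0, threshold=10.0)
print(alarm_index)
```

A mirrored detector that accumulates deviations below the in-event mean could flag the end of an event in the same way. Note that the alarm fires a little after the true change, because the statistic needs a few samples to build up.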

I recommend you start with some exploratory analysis. Based on the underlying physics/mechanics of how your system works, try to identify some ways that an event might affect the resulting signals. Also, do some exploratory data analysis: find some examples of known events, visualize the signals during those time periods, and hypothesize some feature values that seem to be affected during the event (e.g., the mean got larger; the standard deviation got larger; the mean of the absolute value of the first derivative got larger; something like that). Finally, look for appropriate techniques in the statistical literature. If you can characterize the effect of an event well, you may be able to use change point detection techniques directly. If you can't characterize the effect of an event, you might need to fall back to some kind of machine learning, trained using a large training set. Even then, you should still do each of the steps I listed, because they will help you identify which features are likely to be useful in your machine learning model.
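
If you do fall back to machine learning, one possible way to turn the features mentioned above into training rows is a sliding window over the readings. The sketch below is only an assumption about how that could look: the `window_features` helper, the choice to label each window by its last sample, and the toy data are all hypothetical.

```python
from statistics import mean, pstdev


def window_features(values, labels, window):
    """Return a list of (features, label) pairs, one per full window.

    Features are [mean, standard deviation, mean absolute first difference]
    of the window. The label is that of the window's last sample
    (1 = inside a labeled event, 0 = baseline).
    """
    if window < 2:
        raise ValueError("window must contain at least 2 samples")
    rows = []
    for end in range(window, len(values) + 1):
        chunk = values[end - window:end]
        diffs = [abs(b - a) for a, b in zip(chunk, chunk[1:])]
        features = [mean(chunk), pstdev(chunk), mean(diffs)]
        rows.append((features, labels[end - 1]))
    return rows


# Toy data (not from the question): three baseline samples, three event samples.
toy_values = [1.0, 1.0, 1.0, 5.0, 5.0, 5.0]
toy_labels = [0, 0, 0, 1, 1, 1]

rows = window_features(toy_values, toy_labels, window=3)
for features, label in rows:
    print(features, label)
```

The resulting rows could then be fed to a classifier such as boosted trees. With several sensor signals, the same features could be computed per signal and concatenated into one row.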

Context

StackExchange Computer Science Q#61102, answer score: 6