Hi,

Looking at the KMeansAD code, you can see that anomaly scores are first computed for each moving window, based on the distance to a cluster center. The point anomaly scores on the original time series are then obtained through a reverse windowing operation that averages the scores of all windows covering each point.

You would need to inspect the cluster centers to confirm, but what might be happening is that most cluster centers (the method produces 20 clusters by default if you don't change the parameters) are scattered along the downward slope, creating subsequences with low point anomaly scores at the start and high scores at the end (or the reverse). The averaging step would then also contribute to the results you see there. To illustrate this, you can reduce the number of clusters to 2.

In general, the better approach is to fit KMeansAD only on data that is considered "normal" (i.e. a continuous line in your example) and then try predicting anomalies. The number of clusters and the window size are the two important parameters for obtaining the desired behaviour.

It might also be interesting to z-normalize the windows prior to computing the clusters and anomaly scores, but I don't think we currently have the option to do that in KMeansAD (it would take minimal effort to implement, though).
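To make the two-step scoring described above concrete, here is a minimal pure-Python sketch. It is not the KMeansAD implementation: the helper names (`sliding_windows`, `window_scores`, `reverse_windowing`), the toy series, and the single hand-picked cluster center are all assumptions chosen for illustration, and no clustering is actually fitted.

```python
import math


def sliding_windows(series, window_size, stride=1):
    """Return every length-`window_size` subsequence of `series`."""
    return [
        series[start:start + window_size]
        for start in range(0, len(series) - window_size + 1, stride)
    ]


def window_scores(windows, centers):
    """Score each window by its Euclidean distance to the nearest center."""
    return [min(math.dist(w, c) for c in centers) for w in windows]


def reverse_windowing(scores, n_points, window_size, stride=1):
    """Average the window scores back onto the original time points."""
    totals = [0.0] * n_points
    counts = [0] * n_points
    for i, score in enumerate(scores):
        start = i * stride
        for t in range(start, start + window_size):
            totals[t] += score
            counts[t] += 1
    # Points not covered by any window (possible when stride > 1) get 0.0.
    return [tot / cnt if cnt else 0.0 for tot, cnt in zip(totals, counts)]


# Toy series: flat line with a single spike at index 3.
series = [0.0, 0.0, 0.0, 5.0, 0.0, 0.0, 0.0]
# Hypothetical cluster center standing in for a fitted k-means model.
centers = [[0.0, 0.0, 0.0]]

windows = sliding_windows(series, window_size=3)
scores = window_scores(windows, centers)
point_scores = reverse_windowing(scores, len(series), window_size=3)
```

In this toy case the spike sits at a single index, yet every window containing it receives a high score, so after averaging the neighbouring points also get non-zero scores that ramp up towards the spike and back down. This is the kind of smearing the averaging step can introduce, and it grows with the window size.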