Frequency-based Algorithms in TSMD

SetFinder

This algorithm [Bagnall et al. 2014] finds the K-motif sets directly, based on a counting and separating principle. In practice, each subsequence is compared to every other subsequence, and its non-overlapping matches are counted. Then, each subsequence with a non-zero count is kept only if its distance to any subsequence with a larger number of matches is greater than a given threshold.
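The counting and separating steps described above can be sketched with brute-force NumPy code. This is only an illustration, not the tsmd implementation: the function names, the plain Euclidean distance, and the default separation threshold of twice the radius are assumptions made for the example.

```python
import numpy as np


def count_nonoverlapping(match_idx, wlen):
    """Greedily count matches whose windows do not overlap each other."""
    count, last_end = 0, -1
    for i in np.sort(match_idx):
        if i > last_end:
            count += 1
            last_end = i + wlen - 1
    return count


def setfinder_sketch(signal, wlen, radius, n_patterns, separation=None):
    """Hypothetical helper: return (motif center start indices, match counts)."""
    # Assumption: candidates must be more than 2 * radius apart.
    separation = 2 * radius if separation is None else separation
    # All sliding windows, shape (n_subsequences, wlen).
    subs = np.lib.stride_tricks.sliding_window_view(signal, wlen)
    n = len(subs)
    idx = np.arange(n)

    # Counting step: non-overlapping matches within `radius` of each window,
    # ignoring windows that overlap the candidate itself.
    counts = np.zeros(n, dtype=int)
    for i in range(n):
        d = np.linalg.norm(subs - subs[i], axis=1)
        matches = idx[(d <= radius) & (np.abs(idx - i) >= wlen)]
        counts[i] = count_nonoverlapping(matches, wlen)

    # Separating step: visit candidates by decreasing count and keep one only
    # if it is far enough from every center that was kept before it.
    centers = []
    for i in np.argsort(-counts, kind="stable"):
        if counts[i] == 0 or len(centers) == n_patterns:
            break
        if all(np.linalg.norm(subs[i] - subs[c]) > separation for c in centers):
            centers.append(int(i))
    return centers, counts


# Toy signal: noise with one pattern planted at three positions.
rng = np.random.default_rng(0)
toy = rng.normal(size=140)
pattern = 5 * np.sin(np.linspace(0, 2 * np.pi, 10))
for start in (20, 60, 100):
    toy[start:start + 10] = pattern
centers, counts = setfinder_sketch(toy, wlen=10, radius=0.5, n_patterns=1)
```

The loop costs one distance computation per pair of subsequences, so the sketch is only practical for short signals.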

class tsmd.competitors.setfinder.Baseline(n_patterns: int, radius: int, wlen: int, distance_name: str = 'UnitEuclidean', distance_params={}, n_jobs=1)

SetFinder algorithm for motif discovery.

Parameters
  • n_patterns (int) – Number of patterns to detect.

  • radius (float) – Threshold factor for pattern inclusion.

  • wlen (int) – Window length.

  • distance_name (str, optional (default='UnitEuclidean')) – Name of the distance.

  • distance_params (dict, optional (default=dict())) – Additional distance parameters.

  • n_jobs (int, optional (default=1)) – Number of jobs.

prediction_mask_

Binary mask indicating the presence of motifs across the signal. Each row corresponds to one discovered motif, and each column to a time step. A value of 1 means the motif is present at that time step, and 0 means it is not.

Type

np.ndarray of shape (n_patterns, n_samples)
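To make the mask layout concrete, the sketch below converts a binary mask of this shape into per-motif lists of (start, end) intervals. The helper name `mask_to_intervals` and the toy mask are hypothetical and are not part of tsmd.

```python
import numpy as np


def mask_to_intervals(mask):
    """Return, for each row, a list of (start, end) pairs with end exclusive."""
    intervals = []
    for row in np.asarray(mask, dtype=int):
        # Pad with zeros so runs touching either border are still detected.
        edges = np.diff(np.concatenate(([0], row, [0])))
        starts = np.flatnonzero(edges == 1)
        ends = np.flatnonzero(edges == -1)
        intervals.append(list(zip(starts.tolist(), ends.tolist())))
    return intervals


# Two motifs over eight time steps.
toy_mask = np.array([
    [0, 1, 1, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0, 0, 0],
])
intervals = mask_to_intervals(toy_mask)
# intervals == [[(1, 3), (5, 8)], [(0, 2)]]
```

Because the mask is binary, two occurrences of the same motif that touch each other appear as a single run and cannot be separated this way.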

find_patterns_()

Identify the most representative motifs based on neighbor counts and distance variability.

fit(signal: ndarray) → None

Fit SetFinder

Parameters

signal (numpy array of shape (n_samples, )) – The input time series, where n_samples is its length.

Returns

self – Fitted estimator.

Return type

object
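The usage snippet in this section assumes that a one-dimensional `signal` array already exists. For quick experiments, a synthetic signal with a planted pattern can be generated as in the sketch below. The helper `make_toy_signal` and all of its defaults are hypothetical and are not part of tsmd.

```python
import numpy as np


def make_toy_signal(n_samples=2000, wlen=200, starts=(100, 700, 1400),
                    noise=0.1, seed=0):
    """Gaussian noise with the same sine pattern added at each start index."""
    rng = np.random.default_rng(seed)
    signal = rng.normal(scale=noise, size=n_samples)
    pattern = np.sin(np.linspace(0, 4 * np.pi, wlen))
    for s in starts:
        signal[s:s + wlen] += pattern
    return signal


signal = make_toy_signal()  # shape (2000,), suitable as input to fit
```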

neighborhood_() → None

Compute neighborhoods for all subsequences using parallel computation.

This method divides the task into chunks based on n_jobs, and collects the neighborhood indices and distances for each subsequence.

Returns

self – The updated Baseline instance with idxs_ and dists_ attributes set.

Return type

Baseline
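The chunking idea behind this method can be sketched with the standard library alone. This is an assumption-laden illustration rather than the tsmd code: the function names are hypothetical, a thread pool stands in for whatever parallel backend the library uses, and plain Euclidean distance replaces the configurable distance.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def neighbors_for_chunk(subs, chunk, radius):
    """For each window index in `chunk`, return (neighbor indices, distances)."""
    out = []
    for i in chunk:
        d = np.linalg.norm(subs - subs[i], axis=1)
        idx = np.flatnonzero(d <= radius)  # includes the window itself
        out.append((idx, d[idx]))
    return out


def neighborhoods(signal, wlen, radius, n_jobs=1):
    """Split the window indices into n_jobs chunks and process them in parallel."""
    subs = np.lib.stride_tricks.sliding_window_view(signal, wlen)
    chunks = np.array_split(np.arange(len(subs)), n_jobs)
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        results = pool.map(lambda c: neighbors_for_chunk(subs, c, radius), chunks)
        # map preserves chunk order, so flattening restores window order.
        flat = [item for chunk_result in results for item in chunk_result]
    idxs = [idx for idx, _ in flat]
    dists = [d for _, d in flat]
    return idxs, dists


wave = np.sin(np.linspace(0, 20, 120))
idxs_1, dists_1 = neighborhoods(wave, wlen=10, radius=0.3, n_jobs=1)
idxs_3, dists_3 = neighborhoods(wave, wlen=10, radius=0.3, n_jobs=3)
```

The result should not depend on `n_jobs`, since each chunk is processed independently and the outputs are concatenated in order.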

Usage

from tsmd.competitors.setfinder import Baseline
from tsmd.tools.utils import transform_label
from tsmd.tools.plotting import plot_signal_pattern


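# `signal` is a 1-D numpy array of shape (n_samples,)
sf = Baseline(n_patterns=2, radius=1, wlen=200, distance_name='UnitEuclidean')
sf.fit(signal)

# Convert the binary mask to labels and plot them over the signal
labels = transform_label(sf.prediction_mask_)
plot_signal_pattern(signal, labels)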

SetFinder output

Reference

[Bagnall et al. 2014] Anthony Bagnall, Jon Hills, and Jason Lines. 2014. Finding motif sets in time series. arXiv preprint arXiv:1407.3685 (2014).

LatentMotif

This method [Grabocka et al. 2016] casts a variant of the K-Motifs problem as a constrained optimization task in which the center of each motif is learned (the center need not be a subsequence of \(S\); it can be any element of \(\mathbb{R}^n\)). The initial objective and constraint functions are regularized to enable gradient ascent. The learned subsequences are then returned as the centers of the motif sets. To identify all occurrences of each motif set, a complete scan of the time series subsequences is conducted: non-overlapping subsequences within a distance \(R\) of the learned center are considered occurrences of the motif set.
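The final scan can be sketched as follows, leaving the gradient-based learning of the centers aside. The function name is hypothetical, and two choices are assumptions made for the example rather than details confirmed by the source: plain Euclidean distance, and resolving overlaps greedily in favor of the closest subsequence.

```python
import numpy as np


def scan_occurrences(signal, center, radius):
    """Return sorted start indices of non-overlapping windows within `radius`."""
    wlen = len(center)
    subs = np.lib.stride_tricks.sliding_window_view(signal, wlen)
    dists = np.linalg.norm(subs - center, axis=1)
    selected = []
    # Visit windows from closest to farthest and stop once outside the radius.
    for start in np.argsort(dists, kind="stable"):
        if dists[start] > radius:
            break
        if all(abs(start - s) >= wlen for s in selected):
            selected.append(int(start))
    return sorted(selected)


# The center is a slightly perturbed pattern, so it is not itself a
# subsequence of the signal.
pattern = 3 * np.sin(np.linspace(0, np.pi, 10))
toy = np.zeros(100)
toy[10:20] = pattern
toy[50:60] = pattern
occurrences = scan_occurrences(toy, center=pattern + 0.01, radius=0.5)
# occurrences == [10, 50]
```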

class tsmd.competitors.latentmotifs.LatentMotif(n_patterns: int, wlen: int, radius: float, alpha=1.0, learning_rate=0.1, n_iterations=100, n_starts=1, verbose=False)

LatentMotif algorithm for motif discovery.

Parameters
  • n_patterns (int) – Number of patterns to detect.

  • radius (float) – Threshold factor for pattern inclusion.

  • wlen (int) – Window length.

  • alpha (float, optional (default=1.0)) – Regularization parameter.

  • learning_rate (float, optional (default=0.1)) – Learning rate.

  • n_iterations (int, optional (default=100)) – Number of gradient iterations.

  • n_starts (int, optional (default=1)) – Number of trials.

  • verbose (bool, optional (default=False)) – Verbose.

prediction_mask_

Binary mask indicating the presence of motifs across the signal. Each row corresponds to one discovered motif, and each column to a time step. A value of 1 means the motif is present at that time step, and 0 means it is not.

Type

np.ndarray of shape (n_patterns, n_samples)

fit(signal: ndarray) → None

Fit LatentMotif

Parameters

signal (numpy array of shape (n_samples, )) – The input time series, where n_samples is its length.

Returns

self – Fitted estimator.

Return type

object

Usage

from tsmd.competitors.latentmotifs import LatentMotif
from tsmd.tools.utils import transform_label
from tsmd.tools.plotting import plot_signal_pattern


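# `signal` is a 1-D numpy array of shape (n_samples,)
lm = LatentMotif(n_patterns=2, radius=10, wlen=200)
lm.fit(signal)

# Convert the binary mask to labels and plot them over the signal
labels = transform_label(lm.prediction_mask_)
plot_signal_pattern(signal, labels)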

LatentMotif output

Reference

[Grabocka et al. 2016] Josif Grabocka, Nicolas Schilling, and Lars Schmidt-Thieme. 2016. Latent time-series motifs. ACM Transactions on Knowledge Discovery from Data (TKDD) 11, 1 (2016), 1–20.