Temporal Aspects in Stream Active Learning

Daniel Kottke, Georg Krempl, Myra Spiliopoulou

Knowledge Management and Discovery Lab
Otto-von-Guericke-University Magdeburg, Germany

daniel.kottke@ovgu.de

www.daniel.kottke.eu/talks/2016_DAGSTAT

Pool vs. Stream Learning

  • The spatial component is extended with temporal information
    • The classification model may change over time (drift)

  • Fast, endless instance generation
    • Efficient (on-line) algorithms required

  • Applications:
    • Data from twitter, sensors, bank transactions

Spatial and temporal selection

  • Measure spatial usefulness
  • Choose the most useful instances based on the spatial value

Spatial Selection Methods


Uncertainty Sampling [1]


Chooses the instances with the highest uncertainty, i.e., those near the decision boundary, based on the posterior probabilities

\[ \mathrm{argmin}_x \left( \mathrm{max}_y \big( P(y \mid x) \big) \right) \]

[1] "Active learning with drifting streaming data", by I. Zliobaite, A. Bifet, B. Pfahringer, G. Holmes.
IEEE Transactions on Neural Networks and Learning Systems, 25(1), 2014.
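A minimal sketch of this criterion: given posterior estimates \(P(y \mid x)\) for a batch of candidates, pick those whose top posterior is smallest. The function names are illustrative, not from the paper.

```python
import numpy as np

def uncertainty_score(posteriors):
    """Confidence of the most likely class; lower means more uncertain."""
    return np.max(posteriors, axis=1)

def select_most_uncertain(posteriors, k=1):
    """Pick the k candidates whose top posterior is smallest,
    i.e. those closest to the decision boundary."""
    scores = uncertainty_score(np.asarray(posteriors))
    return np.argsort(scores)[:k]

# three candidates with binary posteriors P(y | x)
P = [[0.9, 0.1], [0.55, 0.45], [0.7, 0.3]]
print(select_most_uncertain(P, k=1))  # candidate 1 sits nearest the boundary
```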


Probabilistic Active Learning [2]

Evaluates each labeling candidate in its neighborhood using its label statistics:
  • Posterior probability (\(\hat{p}\))
  • Number of labels (\(n\))
  • Density (\(d\))
\[ \mathrm{argmax}_x \left( d \cdot \mathbb{E}_{p}\Big[ \mathbb{E}_{y} \big[ \mathrm{gain}_{p}((n, \hat{p}), y) \big] \Big] \right) \]

[2] "Probabilistic Active Learning: Towards Combining Versatility, Optimality and Efficiency",
by G. Krempl, D. Kottke, M. Spiliopoulou. Discovery Science, Bled, 2014. Springer.
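A hedged numeric sketch of the two inner expectations, assuming a binary task, a Beta posterior over the true \(p\) induced by the label statistics \((n, \hat{p})\), and accuracy gain as the gain function; the density weight \(d\) would multiply this score. Names and the grid integration are illustrative, not taken from the paper.

```python
import numpy as np

def accuracy(p_hat, p):
    # accuracy under the true posterior p when predicting the
    # majority class suggested by the observed statistics
    return p if p_hat >= 0.5 else 1.0 - p

def pgain(n, p_hat, grid=2001):
    """E_p[ E_y[ gain ] ]: expected accuracy gain of one more label,
    with p weighted by a Beta(n*p_hat + 1, n*(1 - p_hat) + 1) density."""
    ps = np.linspace(0.0, 1.0, grid)
    a, b = n * p_hat + 1.0, n * (1.0 - p_hat) + 1.0
    w = ps ** (a - 1.0) * (1.0 - ps) ** (b - 1.0)  # unnormalised Beta pdf
    w /= w.sum()
    gains = []
    for p in ps:
        acc_now = accuracy(p_hat, p)
        acc_pos = accuracy((n * p_hat + 1) / (n + 1), p)  # if y = 1 arrives
        acc_neg = accuracy((n * p_hat) / (n + 1), p)      # if y = 0 arrives
        gains.append(p * acc_pos + (1 - p) * acc_neg - acc_now)
    return float(np.sum(w * np.array(gains)))
```

With few labels the expected gain is high (here `pgain(2, 0.5)` is roughly ten times `pgain(50, 0.5)`), so uncertain, sparsely labelled regions are preferred.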

Temporal Selection

Problem Specification

  • Input: Single-value stream of spatial usefulness values
     
  • Task: Select the best b % of this value stream on the fly, deciding for each value immediately
  • Additional Policy:
    • Difference of current and target budget should not exceed a given tolerance window (BIQF)
    • Budget should not be exceeded (Adaptive Threshold)

Incremental Quantile Filter

Balancing

  • Tolerance window (\(w_\textrm{tol}\)):
    maximal difference between current and the target budget
  • If there are label acquisitions left (\(acq_\textrm{left}\) > 0)
    \(\rightarrow\) decrease threshold \(\theta\) (and vice versa)
     

\[ \theta_\textrm{bal} = \theta - \Delta \cdot \frac{acq_\textrm{left}}{w_\textrm{tol}} \]
  • \(\theta_\textrm{bal}\) - Balanced threshold
  • \(\theta\) - IQF acquisition threshold
  • \(\Delta\) - Data range of IQF window
  • \(w_\textrm{tol}\) - Tolerance window size
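The balancing rule can be sketched as follows, with the IQF threshold approximated by a sliding-window quantile (the paper's incremental window structure is simplified here, and all names are illustrative):

```python
from collections import deque
import numpy as np

class BalancedIQF:
    """Sliding-window quantile filter with budget balancing (a sketch)."""

    def __init__(self, budget, w=100, w_tol=50):
        self.budget, self.w_tol = budget, w_tol
        self.window = deque(maxlen=w)
        self.acq_left = 0.0   # positive: acquisitions left (under budget)

    def acquire(self, u):
        self.window.append(u)
        vals = np.array(self.window)
        theta = np.quantile(vals, 1.0 - self.budget)  # IQF threshold
        delta = vals.max() - vals.min()               # data range of window
        # theta_bal = theta - delta * acq_left / w_tol (slide formula)
        theta_bal = theta - delta * self.acq_left / self.w_tol
        take = u >= theta_bal
        # budget "owed" grows by b per instance, shrinks by 1 per acquisition
        self.acq_left += self.budget - (1.0 if take else 0.0)
        self.acq_left = float(np.clip(self.acq_left, -self.w_tol, self.w_tol))
        return take
```

Clipping `acq_left` to the tolerance window keeps the deviation between current and target budget bounded, as the policy above demands.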

Drift in spatial usefulness

  • Usefulness values may drift when the underlying
    data distribution changes

Fast Trend Correction (FTCIQF)


  • Interpolate trend with linear function
  • Curve fitting using incremental mean \(\mu\) and standard deviation \(\sigma\)
  • Transforms usefulness values into the standard score
\(\qquad u' = \frac{u-\mu}{\sigma}\)
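The standard-score transform can be maintained online with Welford's incremental mean/variance updates; a sketch (the linear trend interpolation of FTC is omitted, class and method names are illustrative):

```python
import math

class IncrementalStandardScore:
    """Welford's incremental mean/variance; maps each usefulness
    value u to its standard score u' = (u - mean) / std."""

    def __init__(self):
        self.n, self.mean, self.m2 = 0, 0.0, 0.0

    def update(self, u):
        self.n += 1
        d = u - self.mean
        self.mean += d / self.n
        self.m2 += d * (u - self.mean)  # running sum of squared deviations

    def transform(self, u):
        # population standard deviation; fall back to 1 while degenerate
        std = math.sqrt(self.m2 / self.n) if self.n > 1 and self.m2 > 0 else 1.0
        return (u - self.mean) / std
```

After updating with 1, 2, 3, 4, 5, the mean is 3 and the standard deviation is \(\sqrt{2}\), so `transform(3)` returns 0.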

Evaluation

Experimental Settings

  • Algorithms:
    • dPAL + BFTCIQF
    • dPAL + BIQF [3]
    • Variable Uncertainty (VarUncer) [1], Split [1]
    • Random
  • 10 fold cross-validation
  • Different budgets  \(b \in \{0.02, 0.05, 0.1, 0.2, 0.3, 0.5\}\)
  • BIQF parameter:  \(w = 100, w_{\mathrm{tol}} = 50\)

[1] "Active learning with drifting streaming data", by I. Zliobaite, A. Bifet, B. Pfahringer, G. Holmes.
IEEE Transactions on Neural Networks and Learning Systems, 25(1), 2014.
[3] "Probabilistic Active Learning in Data Streams", by D. Kottke, G. Krempl, M. Spiliopoulou.
Symposium on Intelligent Data Analysis, Saint-Etienne, 2015.

Evaluation of Active Learning

Randomize BFTCIQF with dPAL

Randomize BFTCIQF with Uncertainty Sampling

Conclusion

  • IQF strategies successfully perform the temporal selection of the best instances based on their usefulness values
  • FTCIQF improves dPAL
  • FTCIQF is not beneficial for Uncertainty Sampling
     

Future Work

  • Extend dPAL with Optimised Probabilistic Active Learning [4]

[4] "Optimised Probabilistic Active Learning", by G. Krempl, D. Kottke, V. Lemaire.
Machine Learning, 2015.

Thank you for your attention!

Slides, Paper, Bibtex:
www.daniel.kottke.eu/talks/2016_DAGSTAT


Supplemental material:
kmd.cs.ovgu.de/res/pal/

Temporal Aspects of Stream Active Learning
Daniel Kottke, Georg Krempl, Myra Spiliopoulou
Tagung der Deutschen Arbeitsgemeinschaft Statistik (DAGSTAT)
Göttingen, Germany, 2016.

dPAL and high budgets

Uncer and BFTCIQF