Abstract
Many emerging Big Data applications involve real-time classification, in which data instances arriving sequentially over time must be classified based on their feature vectors. A common, implicit assumption in existing work is that all features of an instance become available instantly and simultaneously upon its arrival, which rarely holds in practice. Instead, the features of an instance may become available only after various random delays. In such scenarios, an important trade-off emerges between accurate classification and timely classification. In this paper, we provide a first formulation of this problem and propose efficient online algorithms, namely Delay-Aware Real-time Classification (DARC) algorithms, that maximize classification accuracy subject to a constraint on the average classification delay. The algorithms are developed using the Lyapunov stochastic optimization technique, which provides strong performance guarantees. Numerical results on an intrusion detection dataset demonstrate the effectiveness of the proposed algorithms.