Online clustering of parallel data streams


In recent years, the management and processing of so-called data streams has become a topic of active research in several fields of computer science, such as distributed systems, database systems, and data mining. A data stream can roughly be thought of as a transient, continuously growing sequence of timestamped data. In this paper, we consider the problem of clustering parallel streams of real-valued data, that is to say, continuously evolving time series. In other words, we are interested in grouping data streams whose evolution over time is similar in a specific sense. To maintain an up-to-date clustering structure, the incoming data must be analyzed online, tolerating no more than a constant time delay. For this purpose, we develop an efficient online version of the classical K-means clustering algorithm. Our method's efficiency is mainly due to a scalable online transformation of the original data, which allows fast computation of approximate distances between streams.
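The following is a minimal sketch of the general idea, not the paper's actual method. The abstract does not say which online transformation is used, so this example assumes a simple stand-in: each stream keeps the means of the most recent fixed-size blocks in a sliding window, and distances are computed on these compressed vectors. The names `StreamSummary`, `approx_distance`, `kmeans_step`, and `demo` are hypothetical and chosen only for illustration.

```python
from collections import deque
import math
import random


class StreamSummary:
    """Sliding-window summary of one stream (assumed transformation):
    the means of the last `num_blocks` complete blocks of `block_len` values."""

    def __init__(self, num_blocks, block_len):
        self.block_len = block_len
        # A deque with maxlen drops the oldest block mean automatically.
        self.means = deque([0.0] * num_blocks, maxlen=num_blocks)
        self._acc = 0.0
        self._count = 0

    def push(self, value):
        """Consume one new value in constant time."""
        self._acc += value
        self._count += 1
        if self._count == self.block_len:
            self.means.append(self._acc / self.block_len)
            self._acc, self._count = 0.0, 0

    def vector(self):
        return list(self.means)


def approx_distance(u, v, block_len):
    """Distance on block means, scaled by sqrt(block_len).
    This never exceeds the Euclidean distance between the raw windows."""
    return math.sqrt(block_len * sum((a - b) ** 2 for a, b in zip(u, v)))


def kmeans_step(vectors, centers, block_len):
    """One K-means assignment and update step on the compressed vectors."""
    labels = [
        min(range(len(centers)),
            key=lambda j: approx_distance(x, centers[j], block_len))
        for x in vectors
    ]
    new_centers = []
    for j, old in enumerate(centers):
        members = [x for x, lab in zip(vectors, labels) if lab == j]
        if members:
            new_centers.append([sum(col) / len(members) for col in zip(*members)])
        else:
            new_centers.append(list(old))  # keep an empty cluster's center
    return labels, new_centers


def demo():
    """Four synthetic parallel streams: two near level 0, two near level 5."""
    rng = random.Random(0)
    levels = [0.0, 0.0, 5.0, 5.0]
    streams = [StreamSummary(num_blocks=4, block_len=5) for _ in levels]
    for _ in range(40):
        for s, level in zip(streams, levels):
            s.push(level + rng.uniform(-0.1, 0.1))
    vectors = [s.vector() for s in streams]
    centers = [vectors[0], vectors[2]]  # illustrative initialization
    labels, centers = kmeans_step(vectors, centers, block_len=5)
    return labels


if __name__ == "__main__":
    print(demo())
```

In an online setting, one would presumably rerun `kmeans_step` each time a block completes, starting from the previous centers rather than from scratch, so that the per-update cost stays bounded. That warm-start choice is also an assumption of this sketch.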
