Improving the predictability of distributed stream processors

Pablo Basanta-Val*, Norberto Fernández-García, Andy J. Wellings, Neil C. Audsley

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Next generation real-time applications demand big-data infrastructures to process huge and continuous data volumes under complex computational constraints. This type of application raises new issues for current big-data processing infrastructures. The first issue is that most current infrastructures for big-data processing were designed for general-purpose applications. Thus, they set aside real-time performance, which is in some cases an implicit requirement. A second important limitation is the lack of clear computational models that could be supported by current big-data frameworks. In an effort to reduce this gap, this article contributes along several lines. First, it provides a set of improvements to a computational model called distributed stream processing in order to formalize it as a real-time infrastructure. Second, it proposes some extensions to Storm, one of the most popular stream processors. These extensions are designed to gain extra control over the resources used by the application in order to improve its predictability. Lastly, the article presents empirical evidence on the performance that can be expected from this type of infrastructure.

Original language: English
Article number: 2742
Pages (from-to): 22-36
Number of pages: 15
Journal: Future Generation Computer Systems
Volume: 52
Publication status: Published - 4 Jun 2015

Keywords

  • Distributed stream processing
  • Predictable infrastructure
  • Real-time
