TY - JOUR
T1 - Improving the predictability of distributed stream processors
AU - Basanta-Val, Pablo
AU - Fernández-García, Norberto
AU - Wellings, Andy J.
AU - Audsley, Neil C.
PY - 2015/6/4
Y1 - 2015/6/4
N2 - Next-generation real-time applications demand big-data infrastructures to process huge and continuous data volumes under complex computational constraints. This type of application raises new issues for current big-data processing infrastructures. The first issue to be considered is that most current infrastructures for big-data processing were defined for general-purpose applications. Thus, they set aside real-time performance, which is in some cases an implicit requirement. A second important limitation is the lack of clear computational models that could be supported by current big-data frameworks. In an effort to reduce this gap, this article contributes along several lines. First, it provides a set of improvements to a computational model called distributed stream processing in order to formalize it as a real-time infrastructure. Second, it proposes some extensions to Storm, one of the most popular stream processors. These extensions are designed to gain extra control over the resources used by the application in order to improve its predictability. Lastly, the article presents some empirical evidence on the performance that can be expected from this type of infrastructure.
AB - Next-generation real-time applications demand big-data infrastructures to process huge and continuous data volumes under complex computational constraints. This type of application raises new issues for current big-data processing infrastructures. The first issue to be considered is that most current infrastructures for big-data processing were defined for general-purpose applications. Thus, they set aside real-time performance, which is in some cases an implicit requirement. A second important limitation is the lack of clear computational models that could be supported by current big-data frameworks. In an effort to reduce this gap, this article contributes along several lines. First, it provides a set of improvements to a computational model called distributed stream processing in order to formalize it as a real-time infrastructure. Second, it proposes some extensions to Storm, one of the most popular stream processors. These extensions are designed to gain extra control over the resources used by the application in order to improve its predictability. Lastly, the article presents some empirical evidence on the performance that can be expected from this type of infrastructure.
KW - Distributed stream processing
KW - Predictable infrastructure
KW - Real-time
UR - http://www.scopus.com/inward/record.url?scp=84930635178&partnerID=8YFLogxK
U2 - 10.1016/j.future.2015.03.023
DO - 10.1016/j.future.2015.03.023
M3 - Article
AN - SCOPUS:84930635178
SN - 0167-739X
VL - 52
SP - 22
EP - 36
JO - Future Generation Computer Systems
JF - Future Generation Computer Systems
M1 - 2742
ER -