
On the locality of Java 8 streams in real-time big data applications

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution




Publication details

Title of host publication: JTRES '14
Date: Published - 13 Oct 2014
Number of pages: 9
Publisher: Association for Computing Machinery (ACM)
Original language: English
ISBN (Electronic): 978-1-4503-2813-5

Publication series

Name: ACM International Conference Proceeding Series


Typical Big Data frameworks do not consider the architecture of the servers that make up the cluster. However, these computers are increasingly heterogeneous and are based on a ccNUMA architecture. In such architectures, main memory access times differ depending on the core on which access is requested. Hence, as well as locality of data access throughout a cluster of servers, locality of memory access within individual servers can have an impact on performance. Java is a commonly used language for Big Data applications (through the popularity of Hadoop) and the newly released Java 8 introduces streams to simplify data-parallel programming. However, this paper argues that there are no built-in parallel stream sources that can efficiently operate on very large datasets and take data locality into account. This paper details recent work from the JUNIPER project, an EU Framework 7 Project, which is investigating how the Java 8 platform (augmented by the Real-Time Specification for Java) can be used for real-time Big Data applications. JUNIPER introduces architecture-aware stream sources which are suitable for Big Data systems and which preserve locality of data. Our results show that when reading data from disk, thread affinity can seriously degrade the performance of standard Java streams, but JUNIPER's architecture-aware streams maintain their performance.
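
To make the idea of a custom parallel stream source more concrete, the following is a minimal sketch in Java 8, not the JUNIPER implementation, which this page does not show. It assumes the data is already divided into partitions, with each inner array standing in for data that is local to one memory node. The class name `PartitionSpliterator` and its `stream` helper are hypothetical. The sketch only shows a source that splits on partition boundaries, so that each partition is traversed by a single task. Standard Java has no API for pinning threads to cores or NUMA nodes, so no thread affinity is set here.

```java
import java.util.Spliterator;
import java.util.function.LongConsumer;
import java.util.stream.LongStream;
import java.util.stream.StreamSupport;

/** Hypothetical stream source that only ever splits on partition boundaries. */
final class PartitionSpliterator implements Spliterator.OfLong {
    // Assumption: each inner array models data local to one memory node.
    private final long[][] partitions;
    private final int end; // exclusive upper bound on the partition index
    private int part;      // index of the partition currently being traversed
    private int pos;       // position inside the current partition

    PartitionSpliterator(long[][] partitions, int from, int to) {
        this.partitions = partitions;
        this.part = from;
        this.end = to;
    }

    @Override
    public boolean tryAdvance(LongConsumer action) {
        while (part < end) {
            if (pos < partitions[part].length) {
                action.accept(partitions[part][pos++]);
                return true;
            }
            part++;   // current partition exhausted (or empty): move on
            pos = 0;
        }
        return false;
    }

    @Override
    public Spliterator.OfLong trySplit() {
        int remaining = end - part;
        // Refuse to split mid-partition or when fewer than two partitions remain.
        if (pos != 0 || remaining < 2) {
            return null;
        }
        int mid = part + remaining / 2;
        // Hand off the prefix [part, mid); this spliterator keeps [mid, end).
        PartitionSpliterator prefix = new PartitionSpliterator(partitions, part, mid);
        part = mid;
        return prefix;
    }

    @Override
    public long estimateSize() {
        long n = 0;
        for (int i = part; i < end; i++) {
            n += partitions[i].length;
        }
        return n - pos; // exact count of elements not yet consumed
    }

    @Override
    public int characteristics() {
        return ORDERED | SIZED | SUBSIZED | NONNULL;
    }

    /** Builds a parallel LongStream over the given partitions. */
    static LongStream stream(long[][] partitions) {
        return StreamSupport.longStream(
                new PartitionSpliterator(partitions, 0, partitions.length), true);
    }
}
```

A call such as `PartitionSpliterator.stream(data).sum()` then runs as an ordinary parallel stream. Because the source never splits inside a partition, no partition is shared between two tasks. Which thread runs each task is still decided by the common fork/join pool.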
