Abstract
Efficient resource and application management
is one of the most complex and challenging tasks in high
performance computing. Large-scale computing systems
that contain hundreds, thousands or even millions of cores
demand solutions that can operate in a distributed, robust,
and scalable fashion. However, while hardware parallelism
is relatively straightforward to achieve, this is not generally
the case for software. This leads to under-utilization of the
hardware parallelism as well as imbalanced load distribution
causing inefficiency and hotspots. In response to this challenge,
this paper introduces a novel distributed and decentralized
run-time management algorithm. The proposed method is
guided by an optimization model inspired by artificial bee
colonies (ABC). While ABC algorithms have proven useful for optimizing
large sets of numerical test functions, this is the first time they
have been applied in the context of many-core system management.
Initial results show that the ABC model is promising in the
context of run-time management for many-core systems. It is
also anticipated that the algorithm's bio-inspired foundations
will inherently enable scalability, reliability, and adaptation.
We present initial experiments whose results indicate that
the model can improve the thermal distribution across the
system.
Original language | English |
---|---|
Title of host publication | 2018 IEEE Symposium Series on Computational Intelligence (SSCI) |
Place of Publication | USA |
Publisher | IEEE |
Pages | 1084-1091 |
Number of pages | 8 |
ISBN (Electronic) | 978-1-5386-9276-9 |
ISBN (Print) | 978-1-5386-9277-6 |
Publication status | Published - 30 Nov 2018 |
Event | 2018 SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE - Sheraton Grand Bangalore Hotel @ Brigade Gateway, Bengaluru, India Duration: 18 Nov 2018 → 21 Nov 2018 http://ieee-ssci2018.org/ |
Conference
Conference | 2018 SYMPOSIUM SERIES ON COMPUTATIONAL INTELLIGENCE |
---|---|
Abbreviated title | IEEE-SSCI 2018 |
Country/Territory | India |
City | Bengaluru |
Period | 18/11/18 → 21/11/18 |
Internet address | http://ieee-ssci2018.org/ |
Bibliographical note
© IEEE, 2018. This is an author-produced version of the published paper. Uploaded in accordance with the publisher’s self-archiving policy. Further copying may not be permitted; contact the publisher for details.
Keywords
- Many-core system
- Bio-inspired Hardware
- bee colony
- run-time management