Invited Talks

The Changing Software Stack of Extreme-Scale Supercomputers

Pete Beckman

Director, Exascale Technology and Computing Institute, Argonne National Laboratory and the University of Chicago, USA

Abstract:

For more than 20 years, the software stack used by large-scale supercomputers has been stable and deployments around the globe have grown. There are two main reasons for this success: a stable message-passing programming model and a largely open-source software stack that effectively leverages community development. Our stable programming model and the constantly improving software stack have allowed for sustained investment in complex simulation codes. Code teams have worked for decades inventing new algorithms and increasing the performance and scalability of simulation and modeling. However, we can all see dramatic changes on the horizon. The basic technology for computing systems is changing. Power efficiency is now a key concern. CPU clock frequencies are no longer increasing; instead, parallelism within a CPU is multiplying every year. Furthermore, architectures seem to be diverging, dividing the once cohesive software stack. Looking forward, we can see that the new architectures for extreme-scale supercomputers will change our programming models and lead to a new software stack that can respond dynamically to system power.

On the Confluence of Exascale and Big Data

Sudip Dosanjh

Director of the National Energy Research Scientific Computing (NERSC) Center, Lawrence Berkeley National Laboratory, USA

Abstract:

Exascale computing has rightly received considerable attention within the high performance computing community. In many fields, scientific progress requires a thousand-fold increase in supercomputing performance over the next decade. Science needs include performing single simulations that span a large portion of an exascale system, as well as high-throughput computing. The big data problem has also received considerable attention, but is sometimes viewed as being orthogonal to exascale computing. This talk focuses on the confluence of exascale and big data. Exascale and big data face many similar technical challenges, including increasing power/energy constraints, the growing mismatch between computing and data movement speeds, an explosion in concurrency, and the reduced reliability of large computing systems. Even though exascale and data-intensive systems might have different system-level architectures, the fundamental building blocks will be similar. Analyzing all the information produced by exascale simulations will also generate a big data problem. And finally, many experimental facilities are being inundated with large quantities of data as sensors and sequencers improve at rates that surpass Moore's Law. It is becoming increasingly difficult to analyze all of the data from a single experiment, and it is often impossible to make comparisons across data sets. It will only be possible to accelerate scientific discovery if we bring together the high performance computing and big data communities.

Challenges for Exascale: Architectures and Workflows for Big Data in Life Sciences

Wolfgang Nagel

Director, Center for Information Services and High Performance Computing (ZIH); Professor for Computer Architecture, Institute for Computer Engineering, Technical University of Dresden, Germany

Abstract:

Over the past years, a tremendous increase in research data has been observed. These data originate from a variety of sources - experiments, sensor networks, computer simulations, microscopes, digitized objects, and even the public via the world wide web. Challenges in exploring these data arise not only from their sheer quantity but also from the complexity with which they hide the knowledge embedded in them.

The management of such data needs to be supported by tools. These tools must provide both scalable management and scalable access capabilities: increasing amounts of data not only need to be managed but also need to be stored in and retrieved from the management system in a scalable way. The same holds for the metadata, which are needed to describe and reuse the research data. A data analysis often consists of many individual steps and might submit thousands or millions of jobs. Tools for workflow support therefore need more intelligence and must be scalable as well. For example, they must manage data and computing tasks together, organizing and balancing the resources needed for both. For resilience, they must be able to recognize exceptions and errors and react to them, which means more than restarting or recreating a workflow: they must react to the source of the exception and find ways to circumvent it automatically, as sketched below.
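A minimal sketch (in Python) of such cause-aware error handling: instead of blindly restarting, a task wrapper inspects why a step failed and either backs off or reschedules the step on a different resource. The step names, failure categories, and retry policy are hypothetical assumptions, not taken from any existing workflow system.

    # Hypothetical sketch: react to the source of a failure instead of
    # simply restarting the whole workflow.
    import random
    import time

    class TransientIOError(Exception):
        """Failure caused by a temporarily unavailable storage resource."""

    class NodeFailure(Exception):
        """Failure caused by the compute resource the step ran on."""

    def run_step(step, resource):
        """Stand-in for handing a single workflow step to a batch system."""
        step(resource)

    def run_with_circumvention(step, resources, max_attempts=3):
        """Run a workflow step and react to the cause of any failure."""
        resource = resources[0]
        for attempt in range(max_attempts):
            try:
                run_step(step, resource)
                return True
            except TransientIOError:
                time.sleep(2 ** attempt)   # storage hiccup: back off and retry
            except NodeFailure:
                # bad node: circumvent it by moving to another resource
                resource = random.choice([r for r in resources if r != resource])
        return False

    if __name__ == "__main__":
        def analysis_step(resource):
            # hypothetical step that fails on one particular node
            if resource == "node-001":
                raise NodeFailure("node-001 is unreachable")
            print("analysis step completed on " + resource)

        ok = run_with_circumvention(analysis_step, ["node-001", "node-002"])
        print("workflow step succeeded" if ok else "workflow step failed")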

The talk will address challenges expected for managing data from life sciences on exascale systems, which will provide extremely large main memories. Because of size and speed, there will be a need to support different kinds of I/O technologies. Depending on dynamic requirements, an intelligent I/O middleware will have to migrate data between different storage devices, as illustrated by the sketch below. Integrating the data management into the workflow system will ultimately ease finding and accessing the data relevant for a task.
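As a purely illustrative sketch of such an I/O middleware policy (again in Python, with made-up tier names and thresholds), the decision of where a data set should live can be driven by its size and current access rate, with a migration triggered whenever the dynamic requirements change:

    # Hypothetical tiered-storage policy; the tiers, thresholds, and the
    # migration step are illustrative assumptions, not an existing API.
    from dataclasses import dataclass

    TIERS = ["node_local_memory", "burst_buffer", "parallel_file_system", "archive"]

    @dataclass
    class DataSet:
        name: str
        size_gb: float
        accesses_per_hour: float
        tier: str = "parallel_file_system"

    def choose_tier(ds):
        """Pick a tier from access rate and size (thresholds are made up)."""
        if ds.accesses_per_hour > 100 and ds.size_gb < 512:
            return "node_local_memory"
        if ds.accesses_per_hour > 10:
            return "burst_buffer"
        if ds.accesses_per_hour > 0.1:
            return "parallel_file_system"
        return "archive"

    def migrate_if_needed(ds):
        """Move the data set if the policy says it belongs on another tier."""
        target = choose_tier(ds)
        if target != ds.tier:
            print("migrating %s: %s -> %s" % (ds.name, ds.tier, target))
            ds.tier = target   # a real middleware would copy the data here

    if __name__ == "__main__":
        genome_index = DataSet("genome_index", size_gb=120, accesses_per_hour=500)
        raw_reads = DataSet("raw_reads", size_gb=8000, accesses_per_hour=0.01)
        for ds in (genome_index, raw_reads):
            migrate_if_needed(ds)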

Performance Analysis Techniques for the Exascale Co-Design Process

Martin Schulz

Computer Scientist at the Center for Applied Scientific Computing (CASC) at Lawrence Livermore National Laboratory (LLNL), USA

Abstract:

Reaching exascale will require substantial advances at all levels of the computational ecosystem: the hardware, the OS, the runtime system, algorithms, as well as the applications themselves. Further, we need to work on these aspects together - individual solutions limited to single layers won't provide the necessary benefits. Following this idea, a wide range of efforts focus on the idea of Co-Design for exascale, including three dedicated Exascale Co-Design centers initiated by the US Department of Energy. Central to all of these Co-Design efforts are techniques to measure, track, and analyze a wide range of performance metrics, including execution time, memory system behavior, power consumption, and resiliency to faults.

In this talk I will highlight two approaches that provide analysis frameworks for exascale efforts, along with their use in the Co-Design centers: PAVE, a project that investigates a new way of mapping performance data to more intuitive domains and uses advanced visualization techniques to pinpoint problems, and GREMLIN, an exascale evaluation environment capable of emulating expected properties of exascale architectures on petascale machines. Combined, these projects give us meaningful insight into the target applications' characteristics as well as their expected behavior and, more importantly, their likely bottlenecks on future-generation machines.
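To make the general emulation idea concrete, the toy sketch below shows one way a property of a future machine (here, reduced effective memory bandwidth) could be approximated on today's hardware: run the target application alongside artificial interference processes. This is an assumption-laden illustration in Python and is not the GREMLIN implementation or its interface.

    # Toy illustration of the general emulation idea (not GREMLIN itself):
    # steal memory bandwidth from a target application by streaming through
    # a large buffer in concurrent "interference" processes.
    import multiprocessing as mp
    import subprocess
    import sys

    def memory_interference(stop_event, buffer_mb=256):
        """Repeatedly stream through a large buffer to generate memory traffic."""
        data = bytearray(buffer_mb * 1024 * 1024)
        while not stop_event.is_set():
            sum(data[::4096])   # touch pages across the buffer

    def run_under_interference(command, n_workers=2):
        """Launch interference workers, run the target command, then clean up."""
        stop = mp.Event()
        workers = [mp.Process(target=memory_interference, args=(stop,))
                   for _ in range(n_workers)]
        for w in workers:
            w.start()
        try:
            return subprocess.call(command)
        finally:
            stop.set()
            for w in workers:
                w.join()

    if __name__ == "__main__":
        # e.g.: python emulate.py ./my_proxy_app --input small.cfg  (hypothetical)
        exit_code = run_under_interference(sys.argv[1:] or ["sleep", "1"])
        sys.exit(exit_code)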
