Peter J Braam (Braam Research and the University of Cambridge)
Mini symposium website: https://sites.google.com/a/braamresearch.com/wodic/
The Workshop on Data Intensive Computing will focus on the interaction between storage, data movement, and the production and consumption of data in large-scale, high-performance applications.
Three invited lectures will be delivered by:
- Eric Barton, Intel
- Dr Bojan Nikolic, SKA SDP workgroup, University of Cambridge
- Prof. Dr. Peter Sanders, KIT, Germany
WODIC welcomes original submissions in a range of areas, including but not limited to:
- I/O systems and middleware and their interactions with applications
- Data formats and protocols
- Studies of data use by applications
- Solutions for future hardware (e.g. non-volatile memory, accelerators) in connection with data use by applications
Papers should present original research. As Data Intensive Computing spans many disciplines, papers should provide sufficient background material to make them accessible to the broader community.
10:00am - 11:00am : Eric Barton (Intel), DAOS – An Architecture for Exascale Storage
Three emerging trends must be considered when assessing how HPC storage will operate at exascale. First, exascale simulation workflows will greatly expand the volume and complexity of data and metadata to be stored and analysed. Second, ever-increasing core and node counts will require corresponding scaling of application concurrency while simultaneously increasing the frequency of hardware failure. Third, new NVRAM technologies will allow storage, accessible at extremely fine grain and low latency, to be distributed across the entire cluster fabric to exploit full cross-sectional bandwidth. This talk describes Distributed Application Object Storage (DAOS) – a new storage architecture that Intel is developing to address the scalability and resilience issues and to exploit the performance opportunities presented by these emerging trends.
11:30am - 12:30pm : Dr. Bojan Nikolic (University of Cambridge), The Science Data Processor for the Square Kilometre Array Telescope
The Square Kilometre Array (SKA) is planned to be, by a large factor, the largest and most sensitive radio telescope ever constructed. It has been made possible in large part by advances in electronics and computing that allow simpler and smaller mechanical structures to record the radio signals, in exchange for greater demands on the computing needed to reconstruct the images. The Science Data Processor (SDP) is the part of the SKA that will take the signal stream of around 10 TeraByte/s from the digital electronics section and turn it into images of the radio sky that are usable for scientific analysis. It is a unique computing challenge because of the combined demands of very high input data rates (~10 TeraByte/s), a large computational requirement (around 100 PetaFLOP/s), and the need for iterative algorithms requiring large working storage (around 200 PetaByte). The solution needs to wed elements of traditional scientific "high-performance computing" with data-driven parallel work-distribution strategies. After reviewing the main drivers of the computational requirements for the SDP, I will present the architecture of both the hardware and the software that has been developed to meet those requirements. The architecture can be summarised as follows:
- A hybrid programming model combining coarse-grained dataflow programming with fine-grained, multi-threaded actors
- Data-locality aware scheduling
- A unified model for dealing with failures and stragglers based on "non-precious" data
- A distributed, throughput-optimised storage subsystem for the observed data during the iterative processing stage
I will present the progress in detailed design and prototyping of this architecture and highlight some of the remaining challenges and risks.
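The combination of coarse-grained dataflow and "non-precious" data described in the abstract above can be illustrated with a minimal toy sketch in Python. This is not SDP code: the `DataflowGraph` class, the recompute-on-demand cache, and the "visibilities" → "gridded" → "image" pipeline are all hypothetical names chosen for illustration. The idea shown is only that intermediate products which can be recomputed from their inputs need not be protected against loss.

```python
# Hypothetical sketch (not SDP code): coarse-grained dataflow tasks whose
# intermediate ("non-precious") outputs can be recomputed from their inputs
# if a failure or straggler causes them to be lost.

from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Task:
    name: str
    deps: List[str]          # names of upstream tasks
    fn: Callable             # pure function of the upstream results


class DataflowGraph:
    def __init__(self):
        self.tasks: Dict[str, Task] = {}
        self.cache: Dict[str, object] = {}   # non-precious: safe to drop

    def add(self, name: str, deps: List[str], fn: Callable) -> None:
        self.tasks[name] = Task(name, deps, fn)

    def result(self, name: str):
        # Recompute on demand if the cached value was lost.
        if name not in self.cache:
            task = self.tasks[name]
            inputs = [self.result(d) for d in task.deps]
            self.cache[name] = task.fn(*inputs)
        return self.cache[name]

    def drop(self, name: str) -> None:
        # Simulate losing an intermediate product (e.g. node failure).
        self.cache.pop(name, None)


g = DataflowGraph()
g.add("visibilities", [], lambda: list(range(8)))             # observed data
g.add("gridded", ["visibilities"], lambda v: [x * x for x in v])
g.add("image", ["gridded"], lambda gr: sum(gr))

print(g.result("image"))   # -> 140
g.drop("gridded")          # lose a non-precious intermediate
print(g.result("image"))   # recomputed transparently -> 140
```

In the real system the observed data themselves are held in the throughput-optimised storage subsystem, while intermediates like the gridded data are treated as recoverable by re-execution; the sketch captures only that distinction, not the scale or the scheduling.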
Submission deadline: March 24, 2015
Notification of acceptance: April 13, 2015
Mini Symposium: September 2, 2015