Technical digests from visited conferences

From HP-SEE Wiki



NWS 2012

Networkshop 2012, 11th April 2012, Veszprém, Hungary

Technical digest:

Two vendors (SGI and IBM) gave presentations about their new developments: SGI introduced the ICE X and Prism XL, while IBM demonstrated the iDataPlex DX360 M4. Here is some information about them:

SGI ICE X:

  • Intel Xeon Processor E5-2600
  • Five times the processing power density
  • 53 Tflops/rack
  • FDR InfiniBand
  • Scalable to thousands of nodes
  • Multiple cooling options (air, cold water and even warm water)
  • Fewer cables
  • Two 2-socket nodes tilted face to face
  • Closed-loop airflow + cellular cooling

More details: http://www.sgi.com/products/servers/ice/x/

SGI Prism XL:

  • GPU system
  • Petaflop in a cabinet
  • Supports NVIDIA and AMD GPU-based accelerators as well as Tilera's TILEncore cards
  • CUDA, OpenCL, and Tilera MDE programming environments can be used
  • AMD Opteron 4100 series, two per stick

More details: http://www.sgi.com/products/servers/prism_xl/

IBM iDataPlex DX360 M4:

  • Water cooling door at the back of the rack
  • Warm water cooling
  • Up to 256GB memory per server
  • Intel Xeon E5-2600 Series processors
  • Hot-swap and redundant power supplies
  • Two Gen-3 PCIe slots plus slotless 10GbE or QDR/FDR10 InfiniBand

More details: http://www-03.ibm.com/systems/x/hardware/rack/dx360m4/index.html

7th International Workshop on Parallel Matrix Algorithms and Applications

7th International Workshop on Parallel Matrix Algorithms and Applications (PMAA 2012) 28-30 June 2012, Birkbeck University of London, UK

Technical digest:

The topics of this workshop were relevant to the HP-SEE project and the technology watch, not only because it was a forum for an exchange of ideas, insights and experiences in different areas of parallel computing (multicore, manycore and GPU), but also because of the well-presented stream devoted to energy-aware performance metrics.

Recent years have seen a dramatic change in core technology: voltage and thus frequency scaling has stopped. To continue the exponential overall improvements, the technology has turned to multi-core chips and parallelism at all scales. With these new trends, a series of new problems arises: how to program such complex machines and how to keep pace with the very fast increase in power requirements. In his invited presentation, Dr. Bekas (IBM) concentrated on the latter; the focal point of his recent research is energy-aware performance metrics. It was demonstrated that traditional energy-aware performance metrics that are directly derived from the old Flop/s performance metric have serious shortcomings and can potentially give a picture that is completely different from reality. Instead, it was shown that by optimizing functions of time to solution and energy at the same time, one can get a much clearer picture. This immediately implies a change in the way we gauge the performance of computing systems: we need to abandon the single benchmark and rather opt for a set of benchmarks that are basic kernels with widely different characteristics.
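As an illustration of this argument (a minimal sketch with made-up numbers, not the specific metric proposed in the talk), the following C snippet contrasts the plain Flop/s view with energy to solution and one simple combined time-energy figure, the energy-delay product:

  /* Minimal sketch: computes the classic Flop/s figure, the energy to
     solution and a simple combined time-energy metric (the energy-delay
     product) from measured time and average power. The input values are
     illustrative placeholders only. */
  #include <stdio.h>

  int main(void)
  {
      double flops     = 2.0e12;   /* floating-point operations in the run */
      double seconds   = 40.0;     /* measured time to solution            */
      double avg_watts = 300.0;    /* measured average power draw          */

      double flop_rate = flops / seconds;       /* classic Flop/s view     */
      double energy_j  = avg_watts * seconds;   /* energy to solution (J)  */
      double edp       = energy_j * seconds;    /* energy-delay product    */

      printf("Flop/s             : %.3e\n", flop_rate);
      printf("Energy to solution : %.1f J\n", energy_j);
      printf("Energy-delay prod. : %.1f J*s\n", edp);
      return 0;
  }

Two runs with the same Flop/s rating can differ widely in energy to solution, which is exactly the distinction that a single Flop/s benchmark hides.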

The 11th International Symposium on Parallel and Distributed Computing

The 11th International Symposium on Parallel and Distributed Computing - ISPDC 2012, in conjunction with MAC Summer Workshop 2012, June 25-29, 2012, Leibniz Supercomputing Centre, Munich, Germany

Technical digest:

The topics of the workshop were highly relevant to the HP-SEE project and the technology watch, encompassing issues of hardware, middleware and application software developments. The most important presentations regarding the deployment of advanced hardware in European HPC centers were given by Arndt Bode, chairman of the Board of Directors of the Leibniz Supercomputing Centre (LRZ) of the Bavarian Academy of Sciences and Humanities, Germany (“Energy efficient supercomputing with SuperMUC”), and Wolfgang E. Nagel, Director of the Center for Information Services and High Performance Computing (ZIH), Dresden, Germany (“Petascale-Computing: What have we learned, and why do we have to do something?”).

We point out that the plans for the expansion of the Bulgarian supercomputer center also include the procurement of a Blue Gene/Q system in the future. Many talks presented new advanced methods of software development for the new hardware architectures. For example, the talk by David I. Ketcheson, Assistant Professor of Applied Mathematics, King Abdullah University of Science and Technology, Saudi Arabia, “PyClaw: making a legacy code accessible and parallel with Python”, presented a general hyperbolic PDE solver that is easy to operate yet achieves efficiency near that of hand-coded Fortran and scales to the largest supercomputers, using Python for most of the code while employing automatically wrapped Fortran kernels for the computationally intensive routines. Several talks presented other developments regarding the use of Python in HPC environments.

Another area of active development is the use of accelerators, and more specifically GPUs. Many such talks were presented. We highlight the talk by Rio Yokota, Extreme Computing Research Group, KAUST Supercomputing Lab, King Abdullah University of Science and Technology, Saudi Arabia, on petascale fast multipole methods on GPUs.

Issues and problems regarding the operation of a distributed HPC infrastructure in Europe, and the ways to approach their resolution, were presented by Achim Streit, Director of the Steinbuch Centre for Computing (SCC), Karlsruhe Institute of Technology (KIT), Germany, in his talk “Distributed Computing in Germany and Europe and its Future”, where he emphasized the suitability of the UNICORE middleware.

Several talks presented approaches that combine HPC and cloud usage. A representative from MathWorks demonstrated the ease of using cloud resources from the MATLAB prompt. Another interesting cloud platform was presented by Miriam Schmidberger and Markus Schmidberger in “Software Engineering as a Service for HPC”, where a prototype platform was shown and a call was opened for testing it and gathering user and operator requirements, so that the HP-SEE project can consider such an option.

PRACE 1IP WP9 Workshop

PRACE 1IP Work Package 9 Future Technologies Workshop

Technical digest:

The workshop held in Daresbury, UK, included presentations from PRACE (http://www.prace-ri.eu/) members on the latest progress on work in Work Package 9 tasks, and presentations from external speakers on technology developments key to the delivery of HPC systems in 2014 and beyond.

Software

Tools and libraries

  • The talk on advanced debugging techniques, “Debugging for task-based programming models”, covered new concepts in debugging parallel, task-based applications with the Temanejo tool. More information about Temanejo can be found at http://temanejo.wordpress.com/
  • The presentation “Enabling MPI communication between GPU devices using MVAPICH2-GPU” summarized CSCS WP 9.2.C efforts, which included the deployment and evaluation of MVAPICH2-GPU on the CSCS iDataPlex cluster, the evaluation of InfiniBand routing schemes on the 2-D partition of the iDataPlex cluster, the integration of rCUDA (a remote CUDA GPU virtualization interface) with the SLURM resource management system, and JSC I/O prototype evaluations (a minimal usage sketch follows this list). More about MVAPICH2-GPU can be read at http://mvapich.cse.ohio-state.edu/performance/gpu.shtml
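To make the MVAPICH2-GPU idea above concrete, here is a minimal sketch (illustrative only, not code from the presentation) of CUDA-aware MPI usage: a device pointer is passed straight to MPI, and a CUDA-aware build such as MVAPICH2-GPU moves the data between GPUs without explicit staging through host buffers.

  /* Hedged sketch of CUDA-aware MPI: device pointers go directly into
     MPI calls. Run with two ranks on a CUDA-aware MPI build, e.g.
     mpirun -np 2 ./gpu_sendrecv */
  #include <mpi.h>
  #include <cuda_runtime.h>
  #include <stdio.h>

  int main(int argc, char **argv)
  {
      const int n = 1 << 20;                 /* one million floats */
      int rank;
      float *d_buf;

      MPI_Init(&argc, &argv);
      MPI_Comm_rank(MPI_COMM_WORLD, &rank);

      cudaMalloc((void **)&d_buf, n * sizeof(float));
      cudaMemset(d_buf, 0, n * sizeof(float));

      if (rank == 0) {
          /* The device pointer is handed straight to MPI_Send; the
             library moves the data GPU to GPU (over InfiniBand where
             available) without an explicit cudaMemcpy to host memory. */
          MPI_Send(d_buf, n, MPI_FLOAT, 1, 0, MPI_COMM_WORLD);
      } else if (rank == 1) {
          MPI_Recv(d_buf, n, MPI_FLOAT, 0, 0, MPI_COMM_WORLD,
                   MPI_STATUS_IGNORE);
          printf("rank 1 received %d floats into device memory\n", n);
      }

      cudaFree(d_buf);
      MPI_Finalize();
      return 0;
  }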

New programming languages and models

  • "Programming heterogeneous systems with OpenACC directives" gave example of parallelizing OpenMP code with OpenACC, compared performance results between OpenMP and OpenACC and pointed out how much impact on performance data locality can have. More information about OpenACC can be found at http://www.openacc-standard.org/
  • The talk titled "Experiments with porting to UPC" introduced Unified Parallel C (UPC) and presented experiences with porting applications to UPC, together with the results and observations of the experiments executed, and pointed to further material on UPC compilers and ported software.
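A minimal sketch of the directive-based approach from the OpenACC talk (illustrative, not code from the presentation): a simple loop offloaded with OpenACC directives, with an explicit data region showing the data-locality point, since without it each offloaded loop would pay for host-device transfers.

  /* Hedged sketch: a loop that an OpenMP version would mark with
     #pragma omp parallel for, ported to OpenACC. Compile with an
     OpenACC-capable compiler, e.g. pgcc -acc. */
  #include <stdio.h>

  #define N 1000000

  int main(void)
  {
      static float x[N], y[N];
      const float a = 2.0f;

      for (int i = 0; i < N; i++) {
          x[i] = 1.0f;
          y[i] = 2.0f;
      }

      /* The data region keeps x and y resident on the accelerator;
         without it, every offloaded loop would copy them back and
         forth, which is the data-locality cost noted in the talk. */
      #pragma acc data copyin(x) copy(y)
      {
          #pragma acc parallel loop
          for (int i = 0; i < N; i++)
              y[i] = a * x[i] + y[i];
      }

      printf("y[0] = %f\n", y[0]);   /* expect 4.0 */
      return 0;
  }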

System software, middleware and programming environments

  • Rogue Wave presented a talk titled “Productivity and Performance in HPC”, where they described their new performance analysis and optimization tool, called ThreadSpotter, which features automatic detection of possible optimizations and context-driven guidance on how to perform them. More information about ThreadSpotter can be found at http://www.roguewave.com/products/threadspotter.aspx
  • In the talk about scheduling for heterogeneous resource systems, Can Özturan (Bogazici University, Turkey) presented a simulator for testing scheduling algorithms, a formulation of the problem as an integer programming (IP) problem together with an implementation of an IP-based scheduling plugin for SLURM, and tests performed via SLURM simulation. The code for the SLURM plugin is available at: http://code.google.com/p/slurm-ipsched/
  • The talk titled “System software components for production quality large-scale hybrid cluster for industry and academia” presented a set of system software components, among them HNagios (a customized version of Nagios for HPC), GPFS (a parallel file system, compared with Lustre), and PBSPro (a resource manager, with a focus on GPUs).
  • The presentation from TU Dresden about challenges for performance support environments covered the definition of performance and the key terms behind it, as well as the challenges that software tools face, with an overview of challenges specific to monitoring, storage and presentation tools. It demonstrated how Vampir (http://www.vampir.eu/) handles such challenges and how it compares with or integrates with other tools such as TAU (http://tau.uoregon.edu/) and Scalasca (http://www.scalasca.org/).

Hardware

Green HPC (energy efficiency, cooling)

  • JKU presented an experimental evaluation of an FPGA-accelerated prototype, comparing stream and dataflow architectures and assessing energy efficiency in comparison with multicores.
  • The presentation “Measuring Energy-to-Solution on CoolMUC, a hot water cooled MPP Cluster”, by LRZ, assessed the use of warm-water cooling and its possible benefits.
  • BSC gave a report on the Energy2Solution prototype, showing current progress on an ARM+GPU prototype.
  • The talk about cooling and energy, from PSNC, explained why cooling matters by showing yearly costs and the proportion between energy input and generated heat, gave an overview of various cooling methods, and suggested using air for insulation rather than cooling.
  • In the presentation titled “Electricity in HPC Data centre”, from PSNC, general information about electrical power distribution was given along with the results of a survey among 15 HPC data centres across Europe, presenting running costs and providing some recommendations:
    • Secure at least two independent mains supplies
    • Use MV power lines for small/middle-size data centres and HV (110–137 kV) power lines for larger ones
    • Adopt a modular design
    • Always allow for future expansion (20 to 25 percent spare capacity)
    • Use low-loss equipment
    • Use distributed UPS (ultracapacitors) and ATS switches instead of static UPS
    • Use UPS + diesel backup and “N+1” redundancy only for critical systems
    • Maintenance is key to safe and smooth HPC data centre operation (it is important to have a scheduled and detailed maintenance and operational plan as well as a technical condition assessment at least once per year)
    • Safety is important (provide comprehensive training for the staff)
    • Always try to negotiate the energy price

Processor architectures

  • The evaluation of a hybrid CPU/GPU cluster, by CaSToRC, presented results of evaluating a prototype GPU cluster, which included an investigation of programming models, paradigms and techniques for multi-GPGPU programming, new developments in the interconnection of GPUs, and an evaluation of power efficiency in terms of the energy required to solve a given problem.
  • The presentation “Evaluation of AMD GPUs”, from PSNC, evaluated the new AMD Accelerated Processing Units (APUs), which combine x86 cores and a GPU on a single chip, concluding that zero-copy worked very well for tightly coupled GPU-CPU code (see the sketch after this list).
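A minimal sketch of the zero-copy pattern mentioned above (illustrative, not PSNC's code), assuming an OpenCL platform on an APU where CPU and GPU share physical memory, so a mapped buffer can be written by the CPU and read by the GPU without an explicit transfer:

  /* Hedged sketch: an OpenCL buffer allocated with CL_MEM_ALLOC_HOST_PTR
     and accessed via map/unmap, the usual zero-copy path on APUs.
     Error handling is omitted for brevity. */
  #include <stdio.h>
  #include <CL/cl.h>

  int main(void)
  {
      cl_platform_id platform;
      cl_device_id device;
      cl_int err;

      clGetPlatformIDs(1, &platform, NULL);
      clGetDeviceIDs(platform, CL_DEVICE_TYPE_GPU, 1, &device, NULL);

      cl_context ctx = clCreateContext(NULL, 1, &device, NULL, NULL, &err);
      cl_command_queue queue = clCreateCommandQueue(ctx, device, 0, &err);

      const size_t n = 1024;
      /* Ask the runtime for host-visible memory it can share with the GPU. */
      cl_mem buf = clCreateBuffer(ctx,
                                  CL_MEM_READ_WRITE | CL_MEM_ALLOC_HOST_PTR,
                                  n * sizeof(float), NULL, &err);

      /* The CPU writes through a mapped pointer instead of copying. */
      float *p = (float *)clEnqueueMapBuffer(queue, buf, CL_TRUE, CL_MAP_WRITE,
                                             0, n * sizeof(float),
                                             0, NULL, NULL, &err);
      for (size_t i = 0; i < n; i++)
          p[i] = (float)i;
      clEnqueueUnmapMemObject(queue, buf, p, 0, NULL, NULL);

      /* A kernel enqueued on this queue would now read buf directly,
         with no host-to-device transfer on an APU. */
      clFinish(queue);

      clReleaseMemObject(buf);
      clReleaseCommandQueue(queue);
      clReleaseContext(ctx);
      printf("zero-copy buffer of %zu floats written by the CPU\n", n);
      return 0;
  }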

Interconnect technologies

  • The talk about the evolution and perspective of topology-based interconnects, from Eurotech (http://www.eurotech.com/), gave an overview of copper and optical interconnect technology, followed by coverage of interconnect topologies focusing on the fat tree and the 3D torus. The rest of the talk covered the 3D torus in FPGA, the IB-based 3D torus and the next-generation FPGA-based 3D torus.
  • The presentation from CEA and CINES, titled “Exascale I/O prototype evaluation results”, showed the results of benchmarking various storage systems, presenting findings about the Xyratex RAID engine and the embedded-server I/O model. More information about Xyratex can be found at http://www.xyratex.com/

Memory and storage units

  • The presentation by Aad van der Steen, titled "An (incomplete) survey of future memory technologies", surveyed a number of memory technologies that might help ease the speed mismatch between CPU and memory while offering better cost, durability and reliability, lower power consumption, smaller size and non-volatility. The survey covered P(C)RAM, MRAM, memristors (RRAM), racetrack memory and graphene memory.

Miscellaneous

SC 2012

The International Conference for High Performance Computing 2012, 10-16 November 2012, Salt Lake City, Utah

Technical digest:

The fastest supercomputer on the TOP500 list is Titan, and the greenest supercomputer on the Green500 list is Beacon.

I. Software

a) Tools and libraries

  • The rCUDA middleware provides applications with the illusion of dealing with a real local GPU. This solution virtualizes remote CUDA-compatible devices, enabling concurrent access to them in a way that is completely transparent to programmers and applications (see the sketch after this list). Website: http://rcuda.net
  • There are over a hundred GPU-accelerated applications and growing. More details: http://www.nvidia.com/TeslaApps
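To make the rCUDA point above concrete, the sketch below is ordinary CUDA runtime code with nothing rCUDA-specific in it; when built against the rCUDA client libraries and pointed at a GPU server, the same calls are executed on the remote device. The environment variable names in the comment are an assumption here; consult the rCUDA documentation for the exact configuration.

  /* Hedged sketch: plain CUDA code that rCUDA can transparently forward
     to a remote GPU. The client is typically configured via environment
     variables along the lines of (names assumed, check the rCUDA docs):
       RCUDA_DEVICE_COUNT=1
       RCUDA_DEVICE_0=gpuserver.example.org:0 */
  #include <stdio.h>
  #include <cuda_runtime.h>

  __global__ void scale(float *v, float a, int n)
  {
      int i = blockIdx.x * blockDim.x + threadIdx.x;
      if (i < n)
          v[i] *= a;
  }

  int main(void)
  {
      const int n = 1024;
      float h[1024], *d;

      for (int i = 0; i < n; i++)
          h[i] = 1.0f;

      /* These calls look local; rCUDA forwards them to the remote GPU. */
      cudaMalloc((void **)&d, n * sizeof(float));
      cudaMemcpy(d, h, n * sizeof(float), cudaMemcpyHostToDevice);
      scale<<<(n + 255) / 256, 256>>>(d, 3.0f, n);
      cudaMemcpy(h, d, n * sizeof(float), cudaMemcpyDeviceToHost);
      cudaFree(d);

      printf("h[0] = %f\n", h[0]);   /* expect 3.0 */
      return 0;
  }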

b) System software, middleware and programming environments

  • ScaleMP's Versatile SMP (vSMP) architecture aggregates multiple x86 systems into a single virtual x86 system, delivering an industry-standard, high-end symmetric multiprocessor (SMP) machine. By using software in place of expensive custom hardware and components, it can be a good alternative to dedicated SMP/NUMA architectures. Webpage: http://www.scalemp.com
  • ScienceSoft is an initiative to assist scientific communities in finding the software they need, to promote the development and use of open source software for scientific research and provide a one-stop-shop to match user needs and software products and services. Website: http://sciencesoft.web.cern.ch

II. Hardware

a) Green HPC (energy efficiency, cooling)

  • More energy-efficient equipment
  • More efficient cooling, placed closer to the chips

b) Processor architectures

  • Increased usage of GPUs by all vendors (the fastest supercomputer, Titan, also uses GPUs)
  • Other coprocessor types are also coming to the fore; the Intel Xeon Phi has strong support from vendors. More details: http://www.intel.com/xeonphi

c) Interconnect technologies

  • The usage of 10G/100G Ethernet is increasing alongside InfiniBand interconnect technology

d) Memory and storage units

  • SSDs and flash disks are becoming more popular
  • More energy-saving memory, for example Samsung's 4th-generation Green Memory. More details: http://www.samsung.com/greenmemory
  • An interesting initiative is the Hybrid Memory Cube Consortium, whose goal is to facilitate HMC integration into a wide variety of systems, platforms and applications by defining an adoptable industry-wide interface that enables developers, manufacturers and enablers to leverage this revolutionary technology. More details: http://hybridmemorycube.org

ISC 2012

International Supercomputing Conference 2012, 17-21 June 2012, Hamburg, Germany

Technical digest:

Good overview of new technologies by HPC vendors: http://lecture2go.uni-hamburg.de/konferenzen/-/k/13747
