On Thursday, November 14, 2019, Prof. Dr. Dieter Kranzlmüller, head of the Leibniz Supercomputing Centre (not to be confused with the Leibniz Institute), honored us with a talk about SuperMUC, the most powerful computer in Germany.
The Leibniz Supercomputing Centre has been providing its services since 1962. These include IT for the universities in Munich, “BayernShare”, a Dropbox-like service for Bavarian scientists, and the national task of operating the most powerful computer in Germany. This computer, called SuperMUC, is available free of charge to every German scientist, provided that their application with a research proposal is accepted. Scientists from other parts of Europe can apply as well, though they are usually not prioritized. SuperMUC is intended for civil applications only, so military projects, for example, will not be approved, no matter how much money is offered.
The computing power of SuperMUC is mostly used for simulating and modelling phenomena such as earthquakes, blood flow in the human body or the rotation of galaxies. The Leibniz Supercomputing Centre also cooperates with the environmental research station on the Zugspitze. The supercomputer is available for less CPU-intensive purposes as well. One example is the German TV show “Tatort”, in one episode of which the computer played a major part. The movie “Snowden” was also shot in the SuperMUC building.
The building itself is already considered a research project, as there is not much data available on building and operating supercomputers. The racks are located on the topmost floor, as this floor does not contain any pillars. The processors are cooled using hot water (40-45°C), which is pumped to the rooftop once it reaches 68°C, where it cools down before entering the building again. This requires 3,000 liters of water per rack per hour. As an additional note, SuperMUC does not contain a single GPU.
Still, the performance of the so-called Phase 1 is about 3.2 PetaFLOPS, and Phase 2 can reach 3.6 PetaFLOPS using only one-fourth of the area and one-third of the electricity of Phase 1, which was then dismantled due to its inefficiency. The next phase, called NextGen, consists of approximately 311,000 cores with a combined peak performance of 26.9 PetaFLOPS. It also has 719 TB of memory. When benchmarking this phase, the goal, set by the government, was to reach the top 10 of the most powerful computers worldwide and then to stop immediately in order to save money, as running SuperMUC for an hour generates an electricity bill of 1,000-1,200€ and its total power consumption reportedly equals three times that of Munich.
Different benchmarks produce different results: when running an SSSP benchmark, SuperMUC is placed at rank 1. Running LINPACK, in turn, is a perfect reliability test for the infrastructure, as major faults in key components become evident when the whole system crashes and something smells burnt. Faulty components are not the only cause of failures, however: even too much dust on some parts can bring the whole system down, even though the parts were cleaned as often as the manufacturers recommended. If the whole power supply suddenly fails, this is not a big deal as long as the blackout lasts less than eleven seconds, as this is how long the flywheels will continue to spin on their own. On average there is one failure every nine hours, and restarting the system takes about eight hours.
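To put the figures from the talk into relation, they can be combined with a little arithmetic. The following Python sketch is our own illustration and not something shown in the talk. It assumes that the 719 TB are decimal terabytes, that the core count is exact rather than approximate, and, pessimistically, that every failure costs a full eight-hour restart. All function and constant names are made up for this example.

```python
# Figures as quoted in the talk. Everything derived from them is a rough estimate.
PHASE1_PFLOPS = 3.2
PHASE2_PFLOPS = 3.6
PHASE2_AREA_FRACTION = 1 / 4    # area of Phase 2 relative to Phase 1
PHASE2_POWER_FRACTION = 1 / 3   # electricity of Phase 2 relative to Phase 1

NEXTGEN_CORES = 311_000         # "approximately" in the talk, treated as exact here
NEXTGEN_PFLOPS = 26.9
NEXTGEN_MEMORY_TB = 719         # assumption: decimal terabytes (10**12 bytes)

HOURS_BETWEEN_FAILURES = 9
RESTART_HOURS = 8
COST_PER_HOUR_EUR = (1000, 1200)
HOURS_PER_YEAR = 24 * 365


def relative_efficiency(perf_old, perf_new, resource_fraction):
    """Factor by which performance per unit of a resource (area, power) improved."""
    return (perf_new / perf_old) / resource_fraction


def per_core(total, cores):
    """Evenly split a system-wide total across all cores."""
    return total / cores


def availability(uptime_hours, downtime_hours):
    """Fraction of time the system is up.

    Assumption: each failure is followed by one full restart of the system.
    """
    return uptime_hours / (uptime_hours + downtime_hours)


def yearly_cost_range(cost_per_hour_range, hours=HOURS_PER_YEAR):
    """Electricity bill for running non-stop for the given number of hours."""
    low, high = cost_per_hour_range
    return low * hours, high * hours


if __name__ == "__main__":
    per_watt = relative_efficiency(PHASE1_PFLOPS, PHASE2_PFLOPS, PHASE2_POWER_FRACTION)
    per_area = relative_efficiency(PHASE1_PFLOPS, PHASE2_PFLOPS, PHASE2_AREA_FRACTION)
    gflops_per_core = per_core(NEXTGEN_PFLOPS * 1e15, NEXTGEN_CORES) / 1e9
    gb_per_core = per_core(NEXTGEN_MEMORY_TB * 1e12, NEXTGEN_CORES) / 1e9
    uptime = availability(HOURS_BETWEEN_FAILURES, RESTART_HOURS)
    low, high = yearly_cost_range(COST_PER_HOUR_EUR)

    print(f"Phase 2 vs. Phase 1, performance per watt: {per_watt:.2f}x")
    print(f"Phase 2 vs. Phase 1, performance per area: {per_area:.2f}x")
    print(f"NextGen peak performance per core: {gflops_per_core:.1f} GFLOPS")
    print(f"NextGen memory per core: {gb_per_core:.2f} GB")
    print(f"Availability (pessimistic): {uptime:.0%}")
    print(f"Electricity per year of non-stop operation: {low:,} to {high:,} EUR")
```

Under these assumptions, Phase 2 delivers roughly 3.4 times the performance per watt and 4.5 times the performance per area of Phase 1, and a NextGen core has about 86 GFLOPS of peak performance and 2.3 GB of memory at its disposal. The availability of about 53% is a lower bound that only holds if every failure really requires a full restart, which the talk did not state.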
SuperMUC is powered by renewable energy, and in order to become more sustainable, the heat generated by the processors is used to heat the buildings of the Leibniz Supercomputing Centre. This, however, does not nearly exhaust the hot water supply generated by the supercomputer, and the idea cannot be extended to the rest of Munich due to bureaucratic issues. Should the opportunity present itself, Prof. Kranzlmüller is considering using the hot water to brew a traditional Bavarian beverage.
We were very glad to receive such varied and detailed insights into the workings of a supercomputer and are thankful for the interesting talk.