National Energy Research Scientific Computing Center

Shyh Wang Hall, which houses the National Energy Research Scientific Computing Center at Lawrence Berkeley National Laboratory.

The National Energy Research Scientific Computing Center, or NERSC, is a high performance computing (supercomputer) user facility operated by Lawrence Berkeley National Laboratory for the United States Department of Energy Office of Science. As the mission computing center for the Office of Science, NERSC houses high performance computing and data systems used by nearly 7,000 scientists at national laboratories and universities around the country. NERSC's Cori supercomputer[1] was ranked 5th on the TOP500 list of the world's fastest supercomputers in November 2016.[2] NERSC is located on the main Berkeley Lab campus in Berkeley, California.

History

NERSC was founded in 1974 as the Controlled Thermonuclear Research Computer Center, or CTRCC, at Lawrence Livermore National Laboratory. The center was created to provide computing resources to the fusion energy research community and began with a Control Data Corporation 6600 computer (SN-1). The first machine procured directly by the center was a CDC 7600, installed in 1975 with a peak performance of 36 megaflop/s (36 million floating point operations per second). In 1976, the center was renamed the National Magnetic Fusion Energy Computer Center.

Subsequent supercomputers included a Cray-1 (SN-6), which was called the "c" machine, installed in May 1978, and in 1985 the world's first Cray-2 (SN-1), which was the "b" machine, nicknamed "Bubbles" because of the bubbles visible in the fluid of its unique direct liquid cooling system. In 1983, the center began providing a small portion of its resources to researchers outside the fusion community. As the center increasingly supported science across many research areas, it changed its name to the National Energy Research Supercomputer Center in 1990.

In 1995, the Department of Energy (DOE) made the decision to move NERSC from LLNL to Lawrence Berkeley National Laboratory. A cluster of Cray J90 systems was installed in Berkeley before the main systems at Livermore were shut down for the move in 1996, thus ensuring continuous support for the research community. As part of the move, the center was renamed the National Energy Research Scientific Computing Center, but kept the NERSC acronym. In 2000, NERSC moved to a new site in Oakland to accommodate the growing footprint of air-cooled supercomputers.

In November 2015, NERSC moved back to the main Berkeley Lab site and is housed in Shyh Wang Hall.[3] As with the move from LLNL, a new system was first installed in Berkeley before the machines in Oakland were taken down and moved.

Computers

Cori-Cray Supercomputer at NERSC.

To reflect NERSC's mission to support scientific research, the center names its major systems after scientists. The center is located in Shyh Wang Hall, one of the nation's most energy-efficient supercomputer facilities. The building was financed by the University of California, which manages Berkeley Lab for the U.S. Department of Energy (DOE). The utility infrastructure and computer systems are provided by DOE.

The newest supercomputer, Perlmutter, is named in honor of Saul Perlmutter, an astrophysicist at Berkeley Lab who shared the 2011 Nobel Prize in Physics for his contributions to research showing that the expansion of the universe is accelerating. It is a Cray system based on the Shasta architecture, with Zen 3 based AMD Epyc CPUs ("Milan") and NVIDIA Ampere GPUs.[4]

Another NERSC supercomputer is Cori, named in honor of Gerty Cori, a biochemist who was the first American woman to receive a Nobel Prize in science. Cori is a Cray XC40 system with 622,336 Intel processor cores and a theoretical peak performance of 30 petaflop/s (30 quadrillion operations per second). Cori was delivered in two phases. The first phase, also known as the Data Partition, was installed in late 2015 and comprises 12 cabinets and more than 1,600 Intel Xeon "Haswell" compute nodes. It was customized to support data-intensive science and the analysis of large datasets through a combination of hardware and software configurations and queue policies.

The second phase[5] of Cori, installed in summer 2016,[6] added another 52 cabinets and more than 9,300 nodes with second-generation Intel Xeon Phi processors (code-named Knights Landing, or KNL for short), making Cori the largest supercomputing system for open science based on KNL processors. With 68 active physical cores on each KNL node and 32 on each Haswell node, Cori has almost 700,000 processor cores. The two phases of Cori are integrated via the Cray Aries interconnect, which has a dragonfly network topology that provides scalable bandwidth.
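The "almost 700,000" figure can be roughly reproduced from the numbers quoted above. The following is a minimal sketch, assuming the rounded lower-bound node counts from the text (1,600 Haswell nodes and 9,300 KNL nodes) and treating the quoted core counts as per-node values; the constant and function names are illustrative and not taken from any NERSC tool.

```python
# Rounded figures quoted in the text; the actual node counts are
# "more than" these values, so the result is a lower-bound estimate.
KNL_NODES = 9_300
KNL_CORES_PER_NODE = 68
HASWELL_NODES = 1_600
HASWELL_CORES_PER_NODE = 32  # assumption: the quoted 32 cores apply per node


def total_cores(knl_nodes=KNL_NODES, haswell_nodes=HASWELL_NODES):
    """Estimate the combined core count of both Cori phases."""
    knl = knl_nodes * KNL_CORES_PER_NODE
    haswell = haswell_nodes * HASWELL_CORES_PER_NODE
    return knl + haswell


if __name__ == "__main__":
    # 632,400 KNL cores + 51,200 Haswell cores
    print(f"{total_cores():,}")  # prints 683,600
```

With these rounded inputs the estimate comes to 683,600 cores, consistent with "almost 700,000" given that both node counts are stated as minimums.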

Cori features a Burst Buffer based on the Cray DataWarp technology. The Burst Buffer, a 1.5 PB layer of NVRAM storage, sits between compute node memory and Cori's 30-petabyte Lustre parallel scratch file system. The Burst Buffer provides about 1.5 TB/s of I/O bandwidth, more than twice that of the scratch file system. NERSC has also added software-defined networking features to Cori to move data in and out of the system more efficiently, giving users end-to-end connectivity and bandwidth for real-time data analysis, and a real-time queue for time-sensitive analyses of data.

NERSC formerly ran a system called Edison, a Cray XC30 named in honor of American inventor and scientist Thomas Edison, which had a peak performance of 2.57 petaflop/s. Fully installed in 2014, Edison consisted of 133,824 compute cores for running scientific applications, 357 terabytes of memory, and 7.56 petabytes of online disk storage with a peak I/O bandwidth of 168 gigabytes (GB) per second. Edison was replaced by Perlmutter. In May 2019, the computer was shut down and shipped back to Cray.[7]

Other systems at NERSC include:

  • PDSF, a networked distributed computing cluster designed primarily to meet the detector simulation and data analysis requirements of physics, astrophysics and nuclear science collaborations. PDSF is the longest continually operating Linux cluster in the world.
  • Genepool, an Intel-based cluster dedicated to the computing needs of the DOE Joint Genome Institute.
  • A 100 petabyte[8] High Performance Storage System (HPSS) installation for archival storage. In use since 1998, HPSS is a modern, flexible, performance-oriented mass storage system. NERSC was one of the original developers of HPSS, along with five other DOE labs and IBM.

NERSC facilities are accessible through the Energy Sciences Network, or ESnet, which is also managed by Lawrence Berkeley National Laboratory for the Department of Energy.

Projects

NERSC staff are leading a number of special projects to advance computational science while also helping prepare the broader research community for the exascale era. Examples are:

NESAP: The NERSC Exascale Science Applications Program is a collaborative effort in which NERSC is partnering with code teams and library and tools developers to prepare critical applications to make the most effective use of Cori's manycore architecture. NESAP represents an important opportunity for researchers to prepare application codes for the new architecture and to help advance the missions of the Department of Energy's Office of Science. The NESAP partnership allows 20 projects to collaborate with NERSC, Cray, and Intel by providing access to early hardware, special training and preparation sessions with Intel and Cray staff. Eight of those 20 will also have an opportunity for a postdoctoral researcher to investigate computational science issues associated with energy-efficient manycore systems.

Shifter: NERSC is working to increase flexibility and usability of its HPC systems by enabling Docker-like Linux container technology. Developed by NERSC staff, Shifter is an open-source software tool based on Docker containers that enables NERSC users to more easily analyze datasets from experimental facilities. Such containers allow an application to be packaged with its entire software stack (including some portions of the base OS files), as well as defining the needed user environment variables and the application "entry point."

HPC4Mfg (High Performance Computing for Manufacturing): NERSC is one of three DOE supercomputing centers working to create an ecosystem in which experts at DOE's national laboratories work directly with manufacturing industry members, teaching them how to adopt or advance their use of high performance computing (HPC) to address manufacturing challenges. The goal is to increase energy efficiency, reduce environmental impacts and advance clean energy technologies. The project is led by Lawrence Livermore National Laboratory.

NERSC's user community

In 2016, NERSC supported nearly 7,000 active users from universities, national labs and industry who used about 3 billion supercomputing hours. NERSC has users in 49 states across the U.S., as well as in 45 countries around the world.

University researchers (1.23 billion hours) and DOE labs (1.51 billion hours) together accounted for most of the computing time used in 2016, followed by other government labs (157 million), industry (32 million) and non-profits (1 million).
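The sector breakdown above can be checked against the roughly 3 billion total hours quoted earlier. The sketch below is illustrative only: it assumes the university and DOE lab figures are in billions of hours (an interpretation chosen so that the parts add up to the quoted total), takes the remaining figures as millions, and uses hypothetical names throughout.

```python
# Hours of computing time by sector in 2016, as quoted in the text.
# Assumption: the two largest figures are billions of hours, so that
# the sum is consistent with the "about 3 billion" total.
HOURS_BY_SECTOR = {
    "DOE labs": 1_510_000_000,
    "Universities": 1_230_000_000,
    "Other government labs": 157_000_000,
    "Industry": 32_000_000,
    "Non-profits": 1_000_000,
}


def sector_shares(hours):
    """Return each sector's fraction of the total hours."""
    total = sum(hours.values())
    return {name: value / total for name, value in hours.items()}


if __name__ == "__main__":
    print(f"total: {sum(HOURS_BY_SECTOR.values()):,} hours")
    for name, share in sector_shares(HOURS_BY_SECTOR).items():
        print(f"{name:>22}: {share:6.1%}")
```

Under that reading the sectors sum to 2.93 billion hours, with DOE labs at a little over half of the total and universities at roughly two fifths.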

The top 10 research areas (in terms of computing time) are fusion energy, materials science, climate, lattice QCD, chemistry, astrophysics, high energy physics, nuclear physics, computer science and geosciences.

Of the 129 universities using NERSC, the University of California San Diego logs the most compute time (141 million hours), with the University of Arizona, Massachusetts Institute of Technology, University of California Berkeley, Princeton University, University of California Los Angeles, University of Kentucky, University of California Irvine, George Washington University and the University of Chicago rounding out the top 10.

Geographically, 5,853 of NERSC's users are in North America, 30 in South America, seven in Africa, 335 in the Middle East/Asia Pacific region and 662 in Europe.

References

  1. ^ "Cori Cray XC40 2016". www.nersc.gov. Retrieved 2017-03-23.
  2. ^ "November 2016 | TOP500 Supercomputer Sites". TOP500. Retrieved 2017-03-23.
  3. ^ "Berkeley Lab Opens State-of-the-Art Facility for Computational Science | Berkeley Lab". News Center. 2015-11-12. Retrieved 2018-02-08.
  4. ^ "Perlmutter".
  5. ^ "Cori Intel Xeon Phi (KNL) Nodes". www.nersc.gov. Retrieved 2018-02-09.
  6. ^ "Cori Supercomputer Now Fully Installed at Berkeley Lab". www.nersc.gov. Retrieved 2018-02-09.
  7. ^ "NERSC's Edison Supercomputer to Retire after Five Years of Service".
  8. ^ "About". www.nersc.gov. Retrieved 2018-02-08.
