
Homegrown China Supercomputer 'Sunway BlueLight MPP' Hits One Petaflop with Lower Power (神威蓝光 Petaflop-Class Computer System)
China is making strides in the technological world by producing its own semiconductor chips for the latest Chinese supercomputer, officials announced earlier this week.
The supercomputer, known as the Sunway BlueLight MPP (Chinese: 神威蓝光), was installed in September at the National Super Computer Center in Jinan, the capital of Shandong Province in eastern China.
The Sunway system, which can perform about 1,000 trillion calculations per second (a petaflop), will probably rank among the 20 fastest computers in the world. More significantly, it is composed of 8,700 ShenWei SW1600 microprocessors, designed at a Chinese computer institute and manufactured in Shanghai. The Sunway BlueLight MPP has 150TB of main storage and 2PB of external storage. Each ShenWei SW1600 processor is a 64-bit, 16-core, RISC-based design.
Currently, the Chinese are about three generations behind the state-of-the-art chip-making technologies used by world leaders such as the United States, South Korea, Japan and Taiwan.
Against that background, China's creation of its own chips was viewed as a striking development.

China’s most powerful supercomputer, the 2.5-petaflop Tianhe-1A, used 2,048 Chinese-developed FT1000 processors, but it got the bulk of its performance from 7,168 Nvidia Tesla GPUs and 14,336 Intel CPUs.
Last fall, another Chinese-based supercomputer, the 2.5-petaflop Tianhe-1A, created an international sensation when it was briefly ranked as the world’s fastest, before it was displaced in the spring by a rival Japanese machine, the K Computer, designed by Fujitsu. But the Tianhe was built from processor chips made by American companies, Intel and Nvidia, though its internal switching system was designed by Chinese engineers. Similarly, the K computer was based on Sparc chips, originally designed at Sun Microsystems in Silicon Valley.
Sunway’s theoretical peak performance was about 74 percent as fast as the fastest United States computer, the Jaguar supercomputer at the Department of Energy facility at Oak Ridge National Laboratory, which is made by Cray Inc. and uses microprocessors from Advanced Micro Devices Inc. That machine consumes about seven megawatts to deliver roughly 1.7 petaflops of processing performance and is currently the third fastest on the list.
The Department of Energy is planning three supercomputers that would run at 10 to 20 petaflops. And the United States is embarking on an effort to reach an exaflop, or one million trillion mathematical operations in a second, sometime before the end of the decade, though most computer scientists say the necessary technologies do not yet exist.
To build such a computer from existing components would require immense amounts of electricity — roughly the amount produced by a medium-size nuclear power plant.
The Sunway computer is power-efficient, consuming about a megawatt when running, compared with seven megawatts for the United States' fastest computer, Jaguar, which is capable of 1.7 petaflops. This is partly due to an advanced water-cooling system. If true, that would be less than half the power used by the one-petaflop Blue Gene/P JUGENE system in Germany, one of the most energy-efficient CPU-based supercomputers in production today. The Tianhe supercomputer consumes about four megawatts.
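To put the power figures side by side, here is a minimal Python sketch that turns the rounded numbers quoted in this article into gigaflops per watt and then extrapolates each machine's power draw to one exaflop. The linear scaling is an assumption made purely for illustration, and the function and variable names are hypothetical.

```python
# Performance (petaflops) and power (megawatts) as quoted in the article.
# These are rounded press figures, not measured benchmark results.
MACHINES = {
    "Sunway BlueLight": {"petaflops": 1.0, "megawatts": 1.0},
    "Tianhe-1A": {"petaflops": 2.5, "megawatts": 4.0},
    "Jaguar": {"petaflops": 1.7, "megawatts": 7.0},
}

EXAFLOP_IN_PETAFLOPS = 1000.0  # 1 exaflop = 1,000 petaflops


def gigaflops_per_watt(petaflops, megawatts):
    """Energy efficiency in gigaflops per watt."""
    # 1 petaflop = 1e6 gigaflops and 1 MW = 1e6 W.
    return (petaflops * 1e6) / (megawatts * 1e6)


def exaflop_power_mw(petaflops, megawatts):
    """Power needed for one exaflop, ASSUMING power scales linearly with speed."""
    return megawatts / petaflops * EXAFLOP_IN_PETAFLOPS


def summarize(machines):
    """Return {name: (gigaflops per watt, extrapolated exaflop power in MW)}."""
    return {
        name: (gigaflops_per_watt(**m), exaflop_power_mw(**m))
        for name, m in machines.items()
    }


if __name__ == "__main__":
    for name, (eff, power) in summarize(MACHINES).items():
        print(f"{name:18s} {eff:.2f} GFlops/W, about {power:,.0f} MW at one exaflop")
```

Under these assumptions, an exaflop machine would draw roughly 1,000 MW at Sunway's reported efficiency and about 4,100 MW at Jaguar's, which is the scale behind the power-plant comparison above.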
The ShenWei microprocessor appears to be based on some of the same design principles that are favored by Intel’s most advanced microprocessors, according to several supercomputer experts in the United States.
But there is disagreement over whether the machine’s cooling technology is appropriate for designs that will be required by the exaflop-class supercomputers of the future.
China is also committed to employing its latest Godson processors in supercomputers. In February, at the International Solid State Circuits Conference (ISSCC), Godson lead engineer Weiwu Hu said the Godson-3B will power the 300-teraflop Dawning machine that was scheduled to be deployed over the summer.
There may be even more of this kind of news on the horizon. According to a report from CPU World back in March, besides the Godson-based and ShenWei-based systems, another design based on something called “Yinhe” will be used in a supercomputer before the end of 2011. The CPU World report attributes both the ShenWei and Yinhe designs to the Jiangnan Institute of Computing Technology and National University of Defense Technology.
Photos of the new Sunway supercomputer reveal an elaborate water-cooling system that may be a significant advance in the design of the very fastest machines.
China’s goal is to be able to deploy an exaflop-class system, using China-built chips, by 2020. The U.S. is hoping to reach an exaflop by 2019 through upgrades to its Jaguar – soon-to-be “Titan” – supercomputer, and Europe expects to reach its exaflop goal within a similar timeframe.
Ranking – World’s Top Ten Fastest Supercomputers (with the period each held the No. 1 spot)
1) Fujitsu K computer (Japan, June 2011 – present)
2) NUDT Tianhe-1A (China, November 2010 – June 2011)
3) Cray Jaguar (United States, November 2009 – November 2010)
4) IBM Roadrunner (United States, June 2008 – November 2009)
5) IBM Blue Gene/L (United States, November 2004 – June 2008)
6) NEC Earth Simulator (Japan, June 2002 – November 2004)
7) IBM ASCI White (United States, November 2000 – June 2002)
8) Intel ASCI Red (United States, June 1997 – November 2000)
9) Hitachi CP-PACS (Japan, November 1996 – June 1997)
10) Hitachi SR2201 (Japan, June 1996 – November 1996)
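Sunway BlueLight: A Fully Domestically Built Supercomputer Debuts
In the newly released 2011 China High-Performance Computer TOP100 list, the second-ranked Sunway BlueLight MPP (神威蓝光) drew wide attention from the industry experts in attendance. The machine was supported by the Ministry of Science and Technology's 863 Program, built by the National Research Center of Parallel Computer Engineering and Technology, and installed at the National Supercomputing Center in Jinan in September 2011. It uses only independently designed and produced CPUs (ShenWei processor SW1600) and follows an MPP architecture designed for the 10-petaflop class. The system has 8,704 CPUs in total, a peak of 1.07016 PFlops, sustained performance of 795.9 TFlops, a Linpack efficiency of 74.37%, and total power consumption of 1,074 kW. Its most notable feature is that every core processor is the domestically made ShenWei 1600 CPU. The National Supercomputing Center in Jinan is one of three petaflop-class supercomputing centers in China approved by the Ministry of Science and Technology; the Computing Center of the Shandong Academy of Sciences is responsible for building, managing and operating it.
The Sunway BlueLight high-productivity computer, housed at the National Supercomputing Center in Jinan, is China's first petaflop-class computer system built entirely with domestically made central processing units (CPUs) and system software. It makes China the third country, after the United States and Japan, able to build a petaflop-class computer with its own CPUs.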
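The headline ratios in the TOP100 entry follow directly from its raw figures. The short Python sketch below recomputes the Linpack efficiency, the peak rate per CPU and the sustained performance per watt; the input numbers are the ones quoted above (plus the 16 cores per CPU stated elsewhere in the article), while the function names are hypothetical and chosen only for this illustration.

```python
# Figures quoted from the 2011 China HPC TOP100 entry.
PEAK_TFLOPS = 1070.16      # theoretical peak (1.07016 PFlops)
SUSTAINED_TFLOPS = 795.9   # sustained Linpack performance
CPU_COUNT = 8704           # ShenWei SW1600 processors
CORES_PER_CPU = 16         # per the article's processor description
TOTAL_POWER_KW = 1074.0    # total power consumption


def linpack_efficiency(sustained_tflops, peak_tflops):
    """Fraction of theoretical peak achieved on Linpack."""
    return sustained_tflops / peak_tflops


def peak_gflops_per_cpu(peak_tflops, cpu_count):
    """Theoretical peak per processor, in gigaflops."""
    return peak_tflops * 1000.0 / cpu_count


def sustained_mflops_per_watt(sustained_tflops, power_kw):
    """Sustained Linpack megaflops per watt (1 TFlops = 1e6 MFlops, 1 kW = 1e3 W)."""
    return sustained_tflops * 1e6 / (power_kw * 1e3)


if __name__ == "__main__":
    eff = linpack_efficiency(SUSTAINED_TFLOPS, PEAK_TFLOPS)
    per_cpu = peak_gflops_per_cpu(PEAK_TFLOPS, CPU_COUNT)
    per_watt = sustained_mflops_per_watt(SUSTAINED_TFLOPS, TOTAL_POWER_KW)
    print(f"Linpack efficiency: {eff:.2%}")
    print(f"Peak per CPU: {per_cpu:.2f} GFlops ({per_cpu / CORES_PER_CPU:.2f} per core)")
    print(f"Sustained efficiency: {per_watt:.0f} MFlops/W")
```

The first ratio reproduces the 74.37% efficiency stated in the entry, which is a useful consistency check on the peak and sustained figures.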

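Sunway BlueLight has four main features. First, it uses only domestically made CPUs. Second, its Linpack efficiency reaches 74.4%, whereas typical petaflop-class machines achieve around 50%. Third, it uses liquid cooling, which saves energy. Fourth, it is very dense: a single cabinet can hold 1,024 CPUs, so a petaflop-scale system needs only 9 such cabinets.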
In Sunway BlueLight's compute nodes, a 1U-high chassis holds four CPU boards, and each board carries two 16-core CPUs.
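The packaging figures can be cross-checked with simple arithmetic. The Python sketch below uses only numbers given in the article (8,704 CPUs, 1,024 CPUs per cabinet, four boards per 1U chassis, two CPUs per board, 16 cores per CPU). The derived chassis count per cabinet is plain division and is an assumption about the layout, since the article does not describe how a cabinet is physically divided; all names are hypothetical.

```python
import math

# Figures stated in the article.
CPU_COUNT = 8704
CPUS_PER_CABINET = 1024
BOARDS_PER_CHASSIS = 4   # CPU boards in one 1U chassis
CPUS_PER_BOARD = 2
CORES_PER_CPU = 16


def cpus_per_chassis():
    """CPUs packed into a single 1U chassis."""
    return BOARDS_PER_CHASSIS * CPUS_PER_BOARD


def chassis_per_cabinet():
    """1U chassis implied per cabinet (derived; physical layout is an assumption)."""
    return CPUS_PER_CABINET // cpus_per_chassis()


def cabinets_needed(cpu_count=CPU_COUNT):
    """Whole cabinets required to house the given number of CPUs."""
    return math.ceil(cpu_count / CPUS_PER_CABINET)


if __name__ == "__main__":
    print(f"CPUs per 1U chassis: {cpus_per_chassis()}")
    print(f"Chassis per cabinet (derived): {chassis_per_cabinet()}")
    print(f"Cabinets needed: {cabinets_needed()}")
    print(f"Total cores: {CPU_COUNT * CORES_PER_CPU:,}")
```

Since 8,704 CPUs fill eight and a half cabinets, rounding up gives the 9 cabinets quoted for the petaflop-scale system.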
The CPU used in Sunway BlueLight is called the ShenWei 1600. It has 16 cores, uses a RISC architecture and runs at a clock frequency of around 1 GHz.
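A CPU board carrying two ShenWei 1600 processors
High-density design: a single cabinet can hold 1,024 CPUs, so a petaflop-scale system needs only 9 such cabinets
Liquid cooling in the compute nodes (reportedly using purified water that costs 500 yuan per ton) is another technical highlight of Sunway BlueLight; the aluminum liquid-cooling plate sits in the middle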









