Tag Archives: supercomputer

World’s fastest supercomputer identifies 77 chemicals that might stop the coronavirus

The Summit supercomputer. Credit: Wikimedia Commons.

The coronavirus crisis is deepening by the day as the virus spreads across the world at an exponential rate. However, research is also moving at an accelerated pace: virtually all of the world’s foremost scientists and public health experts are rallying to find solutions to the pandemic.

Using IBM’s Summit — the world’s fastest supercomputer capable of 200 quadrillion calculations per second — researchers at Oak Ridge National Laboratory looked for chemical substances that might interact with the novel coronavirus and stop it from spreading.

Finding what works

Coronaviruses like SARS-CoV-2 — the virus that causes the COVID-19 respiratory illness — infect individuals by hijacking their cells. The commandeered cells then start replicating viral material. Each infected cell can release millions of copies of the virus before the cell finally breaks down and dies. Then, these viruses can infect nearby cells or end up in droplets that escape the lungs (the main site of infection for SARS-CoV-2) through sneezing or coughing, thereby potentially infecting other people.

The coronavirus is named after the crownlike spikes that protrude from its surface. It is through these spikes that the virus injects genetic material into host cells in order to replicate.

Researchers at Oak Ridge led by Micholas Smith employed Summit’s phenomenal computing power to simulate how various atoms and particles in the coronavirus spike chemically react to different compounds.

After thousands of simulations, the researchers identified 77 candidate chemicals that could bind to the spike protein of the coronavirus and block it from hijacking cells.
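The screening step the researchers performed, ranking thousands of docked compounds by predicted binding strength and keeping the best, can be sketched in miniature. The compound names and scores below are purely illustrative, not numbers from the study:

```python
# A miniature of the ranking step in a virtual screen. Docking scores
# (more negative = stronger predicted binding) are illustrative values.

def shortlist(scores, cutoff=-7.0):
    """Keep compounds whose predicted binding beats the cutoff, best first."""
    hits = {name: s for name, s in scores.items() if s <= cutoff}
    return sorted(hits, key=hits.get)

docking_scores = {
    "compound_A": -9.2,  # strong predicted binder
    "compound_B": -5.1,  # too weak, filtered out
    "compound_C": -7.8,
    "compound_D": -3.4,
}

print(shortlist(docking_scores))  # ['compound_A', 'compound_C']
```

The real study scored millions of docked conformations this way before humans ever looked at the shortlist.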

These chemicals, which were reported in the pre-print server ChemRxiv, could be employed in novel vaccines or antiviral treatments meant to curtail the spread of COVID-19 or even cure the disease.

The researchers plan on performing more simulations as they receive more information about the coronavirus’ proteins and how the virus spreads. A more accurate model of the coronavirus’ spike protein appeared this month, and the researchers at Oak Ridge plan on including it in their next simulations.

However, these chemicals will still have to be incorporated into a vaccine or antiviral drug, and then tested in clinical trials — which can last at least six months.

So, despite researchers’ best efforts, it will take time to find treatments that are both efficient and safe. In the meantime, it is important to curb the spread of the virus by practicing social distancing and good hygiene. This whole situation might take a while to unfold, so brace yourself.


Researchers simulate millions of virtual universes to study star formation

Researchers have turned to a massive supercomputer — dubbed the ‘UniverseMachine’ — to model the formation of stars and galaxies. In the process, they created a staggering 8 million ‘virtual universes’ with almost 10¹⁴ galaxies.

A UA-led team of scientists generated millions of different universes on a supercomputer, each of which obeyed different physical theories for how galaxies should form. (Image: NASA, ESA, and J. Lotz and the HFF Team/STScI)

To say that the origins and evolution of galaxies and the stars they host have been an enigma that scientists have sought to explore for decades is the ultimate understatement.

In fact, the desire to understand how stars form and why they cluster the way they do predates science, religion, and possibly civilisation itself. As long as humans could think and reason, long before we knew what a ‘star’ or a ‘galaxy’ was, we have looked to the heavens with a desire to understand their nature.

We now know more than we ever have, but the heavens and their creation still hold mysteries for us. Observing real galaxies can only provide researchers with a ‘snapshot’ of how they appear at one moment. Time is simply too vast and we exist for far too brief a spell to observe galaxies as they evolve.

Now a team of researchers led by the University of Arizona have turned to supercomputer simulations to bring us closer to an answer for these most ancient of questions.

Astronomers have used such computer simulations for many years to develop and test models of galactic creation and evolution, but conventional simulations can only model one galaxy at a time, failing to provide a more ‘universal’ picture.

To overcome this hurdle, Peter Behroozi, an assistant professor at the UA Steward Observatory, and his team generated millions of different universes on a supercomputer. Each universe was programmed to develop with a separate set of physical theories and parameters.

As such, the team developed their own system, the UniverseMachine, as the researchers call it, to create a virtual ‘multiverse’ of over 8 million universes and at least 9.6 × 10¹³ galaxies.

The results could solve a longstanding quirk of galaxy formation: why galaxies cease forming new stars even when their raw material, hydrogen, is not yet exhausted.

The study seems to show that supermassive black holes, dark matter and supernovas are far less efficient at stemming star-formation than currently theorised.

The team’s findings, published in the journal Monthly Notices of the Royal Astronomical Society, challenge many of the current ideas science holds about galaxy formation. In particular, the results urge a rethink of how galaxies form, how they birth stars, and the role of dark matter, the mysterious substance that makes up 80% of the universe’s matter content.

Behroozi, the study’s lead author, says: “On the computer, we can create many different universes and compare them to the actual one, and that lets us infer which rules lead to the one we see.”

What makes the study notable is that it is the first time each simulated universe has contained 12 million galaxies, spanning the period from 400 million years after the Big Bang to the present day. As such, the researchers have succeeded in creating self-consistent universes that closely resemble our own.

Putting the multiverse to the test — how the universe is supposed to work

To compare each simulated universe to the actual one, each was put through a series of tests evaluating how the galaxies it hosts appear in comparison to those in the real universe.

Common theories of how galaxies form stars involve a complex interplay: cold gas collapses under gravity into dense pockets that give rise to stars, while other processes act to counteract star formation.

The Hubble Space Telescope took this image of Abell 370, a galaxy cluster 4 billion light-years from Earth. Several hundred galaxies are tied together by gravity. The arcs of blue light are distorted images of galaxies far behind the cluster, too faint for Hubble to see directly. (Image: NASA, ESA, and J. Lotz and the HFF Team/STScI)

For example, we believe that most galaxies harbour supermassive black holes in their centres. Matter forming accretion discs around these black holes, before eventually being ‘fed’ into them, radiates tremendous energy. Such systems act almost as ‘cosmic blowtorches’, heating gas and preventing it from cooling down enough to collapse into stellar nurseries.

Supernova explosions — the massive eruption of dying stars — also contribute to this process. In addition to this, dark matter provides most of the gravitational force acting on the visible matter in a galaxy — thus, pulling in cold gas from the galaxy’s surroundings and heating it up in the process.

Behroozi elaborates: “As we go back earlier and earlier in the universe, we would expect the dark matter to be denser, and therefore the gas to be getting hotter and hotter.

“This is bad for star formation, so we had thought that many galaxies in the early universe should have stopped forming stars a long time ago.”

But what the team found was the opposite.

Behroozi says: “Galaxies of a given size were more likely to form stars at a higher rate, contrary to the expectation.”

Bending the rules with bizarro universes

In order to match observations of actual galaxies, the team had to create virtual universes in which the opposite was the case — universes in which galaxies continued to birth stars for much longer.

Had the researchers created universes based on current theories of galaxy formation (universes in which the galaxies stopped forming stars early on), those galaxies would have appeared much redder than the galaxies we see in the sky.

Ancient galaxies such as z8_GND_5296 appear red for two reasons; the lack of young blue stars and the stretching in the wavelength of emitted light due to cosmic redshift. (V. Tilvi, Texas A&M University/S.L. Finkelstein, University of Texas at Austin/C. Papovich, Texas A&M University/CANDELS Team and Hubble Space Telescope/NASA)

Galaxies appear red for two major reasons. First, if a galaxy formed earlier in the history of the universe, cosmic expansion (the Hubble flow) means it is moving away from us more rapidly, significantly stretching the wavelength of the light it emits toward the red end of the electromagnetic spectrum, a process referred to as redshift.

In addition to this, another reason an older galaxy may appear red is intrinsic to that galaxy and not an outside effect like redshift. If a galaxy has stopped forming stars, it will contain fewer blue stars, which typically die out sooner, and therefore be left with older — redder — stars.
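The stretching is easy to quantify: the observed wavelength is the emitted wavelength multiplied by (1 + z), where z is the redshift. A quick sketch using z ≈ 7.51, the measured redshift of z8_GND_5296:

```python
# Cosmological redshift in one line: the emitted wavelength is
# stretched by a factor of (1 + z).

def observed_wavelength(emitted_nm, z):
    return emitted_nm * (1 + z)

# z8_GND_5296 sits at z of about 7.51, so hydrogen's ultraviolet
# Lyman-alpha line (121.6 nm) arrives stretched deep into the infrared.
print(observed_wavelength(121.6, 7.51))  # about 1034.8 nm
```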

Behroozi points out that this isn’t what the team saw in their simulations, however. He says: “If galaxies behaved as we thought and stopped forming stars earlier, our actual universe would be coloured all wrong.

“In other words, we are forced to conclude that galaxies formed stars more efficiently in the early times than we thought. And what this tells us is that the energy created by supermassive black holes and exploding stars is less efficient at stifling star formation than our theories predicted.”

Computing the multiverse is as difficult as it sounds

Creating mock universes of unprecedented complexity required an entirely new approach that was not limited by computing power and memory, and provided enough resolution to span the scales from the “small” — individual objects such as supernovae — to a sizeable chunk of the observable universe.

Behroozi explains the computing challenge the team had to overcome: “Simulating a single galaxy requires 10 to the 48th computing operations. All computers on Earth combined could not do this in a hundred years. So to just simulate a single galaxy, let alone 12 million, we had to do this differently.”
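The scale of that claim is easy to sanity-check. Assuming, generously, that all computers on Earth could deliver a combined 10²¹ operations per second (an illustrative figure, not one from the study):

```python
# Back-of-envelope check of the quoted numbers: even granting all
# computers on Earth a combined 1e21 operations per second (an assumed,
# generous figure), 1e48 operations would take roughly 1e27 seconds.

TOTAL_OPS = 1e48          # operations to simulate one galaxy (quoted)
EARTH_RATE = 1e21         # assumed combined ops/second of all computers
SECONDS_PER_YEAR = 3.15e7

years = TOTAL_OPS / EARTH_RATE / SECONDS_PER_YEAR
print(f"{years:.1e} years")  # about 3.2e19 years, far beyond a century
```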

In addition to utilizing computing resources at NASA Ames Research Center and the Leibniz-Rechenzentrum in Garching, Germany, the team used the Ocelote supercomputer at the UA High-Performance Computing cluster.

Two thousand processors crunched the data simultaneously for three weeks. Over the course of the research project, Behroozi and his colleagues generated more than 8 million universes.

He explains: “We took the past 20 years of astronomical observations and compared them to the millions of mock universes we generated.

“We pieced together thousands of pieces of information to see which ones matched. Did the universe we created look right? If not, we’d go back and make modifications, and check again.”
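That generate-compare-adjust loop can be caricatured in a few lines. Here a whole “universe” is stood in for by a single toy statistic produced by a stand-in model, and a random search looks for the parameter value that best matches a pretend observation; every number and the model itself are purely illustrative:

```python
import random

random.seed(0)  # deterministic toy run

OBSERVED = 1.6  # pretend observational statistic to match

def mock_universe(efficiency):
    """Stand-in for a full simulation: statistic scales with efficiency."""
    return 2.0 * efficiency

best_param, best_error = None, float("inf")
for _ in range(10_000):
    efficiency = random.uniform(0.0, 2.0)              # generate a universe
    error = abs(mock_universe(efficiency) - OBSERVED)  # compare to data
    if error < best_error:                             # keep the best match
        best_param, best_error = efficiency, error

print(f"best star-formation efficiency: {best_param:.2f}")  # about 0.80
```

The real UniverseMachine explores many parameters at once and scores each mock universe against decades of observations, but the loop has the same shape.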

Behroozi and his colleagues now plan to expand the UniverseMachine to include the morphology of individual galaxies and how their shapes evolve over time.

As such they stand to deepen our understanding of how the galaxies, stars and eventually, life came to be.

Original research: https://academic.oup.com/mnras/article/488/3/3143/5484868


Hewlett Packard supercomputer to be delivered to the ISS next Monday

The ISS is set to get a massive PC upgrade. SpaceX and Hewlett Packard Enterprise are sending a supercomputer up to the station on SpaceX’s next resupply mission, set for Monday.


Image via Pixabay.

As far as opportunities go, the ISS certainly delivers. This space-borne orbital laboratory has allowed government and private groups to test technology and perform research in microgravity, served as a testbed for astronaut health in space, and given NASA a good toehold for proving technology that future deep-space missions will need.

Processing power

There is one field of technology, however, that hasn’t received that much love on the ISS: computers. Currently, the station is run by computers relying mostly on i386 processors, which are, to put it mildly, absolute rubbish. It’s not much of a problem, however, since all of the station’s critical systems are monitored by ground control, who can work with astronauts in real time to fix any problems that appear.

It starts to become a problem the farther away you go from the Earth, though. If we want to have any chance of sending a human crew beyond the Moon, we’ll need computers powerful enough to operate in a deep space environment without backup from ground control. For starters, because of the longer distances involved, communications will start experiencing delays in excess of half an hour at the more remote points of the mission. When that happens, the crew and its computers will have to be able to deal with any issue that arises.
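The delay itself is just distance divided by the speed of light. A quick sketch with a typical Earth-Moon distance and a near-worst-case Earth-Mars distance (the latter varies widely with orbital geometry):

```python
C_KM_S = 299_792  # speed of light in km/s

def one_way_delay_minutes(distance_km):
    return distance_km / C_KM_S / 60

moon = one_way_delay_minutes(384_400)  # average Earth-Moon distance
mars = one_way_delay_minutes(4.0e8)    # Earth-Mars near maximum separation

print(f"Moon, one way: {moon * 60:.1f} s")      # about 1.3 s
print(f"Mars, one way: {mars:.1f} min")         # about 22 min
print(f"Mars, round trip: {2 * mars:.1f} min")  # about 44 min
```

The round-trip figure is what matters for asking ground control a question, and it is well over the half-hour mark quoted above.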

We’re talking about a lot more processing power than a few i386s can churn out. That’s why NASA and Hewlett Packard Enterprise (HPE) are launching the supercomputer to the ISS: to see how it fares in the cold, zero-g environment of outer space. The device will be shuttled up Monday aboard SpaceX’s next supply mission to the station.

The 1-teraflop supercomputer isn’t that powerful by planetside standards, but it is the most powerful computer ever sent into space. It will stay there for one year, installed in a rack in the station’s Destiny module, powering through an endless series of benchmarks designed to detect whether, and how, its performance degrades in space. An identical copy of the computer will run the same tests in a lab down on Earth to serve as a control.

If everything works out fine, the supercomputer might even stay on the ISS after the experiment to help astronauts with their data-crunching needs, saving a lot of bandwidth. Let’s hope the experiment works, so NASA will soon have the computers it needs to send people further into the solar system.


China set to take the lead in supercomputers with ridiculously powerful ‘exascale’ machine slated for 2018

China hopes to usher in breakthroughs in the field of high-performance processors and other key systems by building the first exascale supercomputer, Meng Xiangfei, director of applications at the National Super Computer Tianjin Centre, said on Monday.


When they say supercomputer, they mean it. Dubbed Tianhe-3 (meaning ‘Heavenriver-3’), the supercomputer would make any device you’re reading this on cry digital tears of shame. Operating at the ‘exascale’ means that the system will be able to handle a quintillion (10¹⁸) calculations each second.
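To put “exascale” in perspective, compare it with the fastest machine mentioned below and with an ordinary desktop (the desktop figure is an assumed ballpark of about 100 gigaflops, not a number from the article):

```python
# Scale check: "exascale" means 1e18 operations per second.

EXA = 1e18
SUNWAY = 1.25e17  # Sunway TaihuLight: 125 quadrillion calculations/s
DESKTOP = 1e11    # assumed ~100-gigaflop desktop, for scale only

print(EXA / SUNWAY)   # 8.0 -- eight times the current champion
print(EXA / DESKTOP)  # 10,000,000 desktops' worth of compute
```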

The prototype device is expected to be ready in 2018, with full operational capability scheduled for 2020.

One computer to rule them all

There has been something of a supercomputer ‘arms race’ going on, with China and the US both vying for supremacy in the computational arena. In 2013, the US could boast 252 of the most powerful 500 supercomputers in the world, dwarfing China’s 66. But the same year, China’s Tianhe-2 wrested the title of most powerful supercomputer from Oak Ridge’s Titan, outperforming it by an almost 2-to-1 factor.

Image credits Oak Ridge National Laboratory.

Still, the system was not without its limitations, the first being its huge power bill. Critics also pointed to the lack of usable software as its Achilles’ heel: since the main drive was on developing the hardware, users would often have to write their own programs.

“It is at the world’s frontier in terms of calculation capacity, but the function of the supercomputer is still way behind the ones in the US and Japan,” Chi Xuebin, deputy director of the Computer Network and Information Centre under the Chinese Academy of Sciences, said about Tianhe-2 in 2014.

“Some users would need years or even a decade to write the necessary code”, he added.

Since then, China has made huge progress. It ranked first for the total number of top supercomputers in June 2016, holding 168 of the top 500 systems and briefly overtaking the US. By November 2016, the two contenders were evenly matched, with 171 systems each. China has the most powerful machine in the world, the Sunway TaihuLight, the country’s first supercomputer built with domestically designed processors, clocking in at 125 quadrillion calculations per second, but it is second by total computational power. You can check out the ebb and flow between the two countries on Top500. It’s pretty cool.

Tianhe-3 is expected to crown the country’s achievements. The delivery date puts China first in the exascale race, ahead of the US’s Exascale Computing Project (ECP), which aims to produce its first such device by 2021: a lead of a full three years at the prototype stage and one year to full operational capability.

“Its computing power is on the next level, cementing China as the world leader in supercomputer hardware,” said Meng Xiangfei.

Scientific applications

Sunway Taihulight.
Image credits Top500.

But it’s not only about flexing industrial muscle. Because supercomputers can crunch calculations that would make mere computers give up and blue-screen, access to this class of devices opens up a lot of possibilities for researchers. Tianhe-1, for example, the first Chinese system to pass the one-quadrillion (10¹⁵) calculations-per-second mark, is now busy solving more than 1,400 tasks each day, furthering research in fields from biology to astronomy.

Tianhe-3 is expected to be 100 times faster than its grandfather, and ten times as powerful as the Sunway. It will be available for public use and will “help us tackle some of the world’s toughest scientific challenges with greater speed, precision, and scope”, Meng added. It’s already been earmarked to analyze smog distribution throughout China, as current systems can only handle the models on a district-level, China Daily reported.

It will also be powerful enough to simulate earthquake and epidemic patterns in greater detail than ever before, improving the government’s ability to respond to such events, Meng added. Alternatively, it can be used to unravel genetic sequences and protein structures with unprecedented scale and speed — data which can be used to create more efficient medicine in the future, Meng said.

Tianhe-3 will be produced using only domestic sources, with Chinese industry and know-how supplying everything from the processors to the operating system.

OpenAI will use Reddit and a new supercomputer to teach artificial intelligence how to speak

OpenAI, Elon Musk’s artificial intelligence research company, just became the proud owner of the first ever DGX-1 supercomputer. Made by NVIDIA, the rig boasts a whopping 170 teraflops of computing power, equivalent to 250 conventional servers, and OpenAI is going to use it all to read Reddit comments.

OpenAI’s researchers gather around the first AI supercomputer in a box, NVIDIA DGX-1.
Image credits NVIDIA.

OpenAI is a non-profit AI research company whose purpose is to “advance digital intelligence in the way that is most likely to benefit humanity as a whole, unconstrained by a need to generate financial return.” And now, NVIDIA CEO Jen-Hsun Huang has delivered the most powerful tool the company has ever had at its disposal: a supercomputer that NVIDIA spent some US$2 billion in research and development to create.

This “AI supercomputer in a box” isn’t much bigger than a large-ish desktop PC, but it packs a huge punch. Its 170 teraflops of computing power make it roughly equivalent to 250 conventional servers working together, and all that oomph is being put to good use for a worthy cause.

“The world’s leading non-profit artificial intelligence research team needs the world’s fastest AI system,” NVIDIA said in a statement.

“I thought it was incredibly appropriate that the world’s first supercomputer dedicated to artificial intelligence would go to the laboratory that was dedicated to open artificial intelligence,” Huang added.

But OpenAI’s system needs to do more than just process things very fast. It needs to learn, and Musk, unlike my parents, believes that the best place to do so is on the Internet: specifically, on Reddit. The forum’s huge size makes it an ideal training ground for DGX-1, which will spend the next few months processing nearly two billion comments to learn how to better chat with human beings.

“Deep learning is a very special class of models because as you scale up, they always work better,” says OpenAI researcher Andrej Karpathy.

The $129,000 supercomputer relies on eight NVIDIA Tesla P100 GPUs (graphics processing units), 7 terabytes of SSD storage, and two Xeon processors (on top of the aforementioned 170 teraflops of performance) to go through this data and make sense of it all.

“You can take a large amount of data that would help people talk to each other on the internet, and you can train, basically, a chatbot, but you can do it in a way that the computer learns how language works and how people interact,” Karpathy added.

Even better, the new supercomputer is designed to function with OpenAI’s existing software. All that’s needed is to scale it up.

“We won’t need to write any new code, we’ll take our existing code and we’ll just increase the size of the model,” says OpenAI scientist Ilya Sutskever. “And we’ll get much better results than we have right now.”

NVIDIA CEO Jen-Hsun Huang slides open the DGX-1’s GPU tray at OpenAI’s headquarters in San Francisco.
Image credits NVIDIA


Book-sized biological supercomputer is powered by ATP

A revolutionary new supercomputer powered by adenosine triphosphate (ATP), the energy source for every living cell in your body, is ridiculously small and much more efficient than a traditional supercomputer. That’s because instead of electricity, this computer is powered by biological agents. As a result, it needs little to no cooling and can be scaled down to the size of a book.

The breakthrough was made by a team led by Prof. Nicolau, the Chair of the Department of Bioengineering at McGill, in collaboration with researchers from Germany, Sweden and Holland.

“We’ve managed to create a very complex network in a very small area,” says Dan Nicolau, Sr. with a laugh. He began working on the idea with his son, Dan Jr., more than a decade ago and was then joined by colleagues from Germany, Sweden and The Netherlands, some 7 years ago. “This started as a back of an envelope idea, after too much rum I think, with drawings of what looked like small worms exploring mazes.”

Today, supercomputers can be as big as a warehouse and cost in the range of hundreds of millions of dollars. They also suck a lot of energy, which seems to run directly counter to calls for energy efficiency and conservation. The National Security Agency, for example, has created a mammoth 1.2 million square foot facility in the deserts of Utah that cost at least $1.5 billion and consumes 65 MW of power and 1.7 million gallons of water per day. But are they worth it? You bet.

The biochip is like a busy city. Image: PNAS

All those silicon powerhouses work in tandem, processing data in parallel computations. Some operations might take your computer years to complete, but only minutes on a supercomputer. Take the simulations of the interactions between the 64 million atoms that form the HIV capsid, the protein shell that protects the virus’ genetic material. The computations required are simply staggering, but once complete, scientists could discover how the atoms work together to protect the HIV genetic material. Then you can design drugs, probably modeled using the same supercomputer, that interfere with that cooperation, and the virus can be neutralized. This process could be significantly streamlined by a biological computer, if it can ever be fully scaled.
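The parallel pattern described above, whether implemented in silicon or in protein-filled channels, boils down to splitting a task into independent chunks and combining the partial results. A minimal sketch:

```python
# Divide-and-combine, the pattern supercomputers rely on: the task is
# split into independent chunks, each chunk could be computed on a
# separate processor, and the partial results are combined at the end.

def partial_sum(lo, hi):
    """Sum of squares over [lo, hi) -- one chunk of the full task."""
    return sum(i * i for i in range(lo, hi))

N = 1_000_000
chunks = [(i, i + N // 4) for i in range(0, N, N // 4)]

# Each call below is independent of the others, so on a real cluster
# the four chunks would run simultaneously on four processors.
total = sum(partial_sum(lo, hi) for lo, hi in chunks)

print(total == sum(i * i for i in range(N)))  # True
```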

Stripped to its bare essence, the biological supercomputer looks like the road map of a busy city, where vehicles of different sizes and power move along designated roads (channels). In this case, the city is a 1.5-cm-square chip with etched channels through which short strings of proteins, not electrons, zip past. This movement is powered by the chemical ATP, the juice of life.

This circuit was used to solve a complex classical mathematical problem using parallel computing of the kind employed by supercomputers. Of course, its capabilities are light-years away from those of a conventional supercomputer; these are just the first baby steps. Moreover, Nicolau reckons some very interesting things could happen if you combine the two, biological and conventional computing, in a hybrid model. “Right now we’re working on a variety of ways to push the research further.”

Findings appeared in PNAS.

When supercomputers start to cook: meet Chef Watson

There are probably a million cooking apps out there, but none of them are backed by a supercomputer. Meet Chef Watson: a “cognitive computing app” that promises to revolutionize the way you cook and expand your gastronomic comfort zone.

Bon Appetit announced a new collaboration with IBM, called “Chef Watson,” which invites amateur and professional chefs to try new, unexpected flavor combinations, based on Watson’s analysis of over 10,000 recipes. The supercomputer analyzed what ingredients go together well in different styles of cooking and recommends ingredients based on what you are already using. For each ingredient, Chef Watson suggests three other ingredients (in various quantities) that probably complement its flavor, as well as dish suggestions. Now, the Web app is open to everyone – log on, tell it what you want to cook and try its suggestions.
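A toy version of the pairing idea (not IBM’s actual method, which also models flavor chemistry) can be sketched as a co-occurrence count over a recipe corpus; the mini-corpus below is made up for illustration:

```python
from collections import Counter
from itertools import combinations

# Illustrative mini-corpus of recipes, each a set of ingredients.
recipes = [
    {"tomato", "basil", "garlic", "olive oil"},
    {"tomato", "garlic", "onion"},
    {"tomato", "basil", "mozzarella"},
    {"garlic", "onion", "olive oil"},
]

# Count how often each pair of ingredients appears in the same recipe.
pairs = Counter()
for recipe in recipes:
    for a, b in combinations(sorted(recipe), 2):
        pairs[(a, b)] += 1

def suggest(ingredient, k=3):
    """Return the k most frequent companions of the given ingredient."""
    companions = Counter()
    for (a, b), n in pairs.items():
        if a == ingredient:
            companions[b] += n
        elif b == ingredient:
            companions[a] += n
    return [name for name, _ in companions.most_common(k)]

print(suggest("tomato"))  # top companions, e.g. basil and garlic first
```

Chef Watson does this over 10,000 Bon Appetit recipes and adds quantity and dish suggestions on top.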

Programmers also made it easier to implement food restrictions (such as vegetarianism or allergies). It also suggests dishes based on what ingredients you already have in your kitchen.

While this probably won’t revolutionize cooking, it’s definitely a great example of how high-end technology can help people in day to day tasks.

“Chef Watson demonstrates how smart machines can help people explore the world around them and discover new possibilities and new ways of getting things done, whether it’s finding promising treatment pathways to fight diseases or helping law firms build courtroom strategies by discovering connections between their cases and earlier precedents,” IBM Chief Storyteller Stephen Hamm wrote in a blog post. “It also signals that there will be a wide variety of uses for cognitive technologies designed to help individuals live better and have more fun.”



The most powerful supercomputer of tomorrow: Aurora (180 petaflop/s)

One of Tianhe-2’s corridors – the current most powerful supercomputer in the world. Image: Intel

The US Department of Energy (DoE) has sealed a deal with Intel worth $200 million to build what’s supposed to be the world’s most powerful computer in 2018: the Aurora. The behemoth will be based on a next-generation Cray supercomputer, code-named “Shasta,” and will use Intel’s HPC scalable system framework. Aurora will likely reach a peak performance of 180 petaflop/s, or 180 quadrillion floating-point operations per second (completed arithmetic operations, not just instructions). For comparison, a 2.5 GHz processor has a theoretical performance of around 10 billion FLOPS.
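Those figures are easy to put side by side; the desktop number is the theoretical 10-billion-FLOPS figure quoted above:

```python
# Putting 180 petaflop/s in context.

AURORA = 180e15          # Aurora's target peak, flop/s
DESKTOP = 10e9           # 2.5 GHz processor, theoretical flop/s
TIANHE2_RMAX = 33.86e15  # Tianhe-2's sustained (Rmax) performance

print(AURORA / DESKTOP)       # 18,000,000 -- eighteen million desktops
print(AURORA / TIANHE2_RMAX)  # roughly 5.3x today's fastest machine
```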

According to the DoE, Aurora will be open to use by any scientific effort. Mainly it will serve in:

  • Materials science: Designing new classes of materials that will lead to more powerful, efficient and durable batteries and solar panels.
  • Biological science: Gaining the ability to understand the capabilities and vulnerabilities of organisms that can result in improved biofuels and more effective disease control.
  • Transportation efficiency: Collaborating with industry to improve transportation systems with enhanced aerodynamics features, as well as enable production of better, more highly-efficient and quieter engines.
  • Renewable energy: Engineering wind turbine design and placement to greatly improve efficiency and reduce noise.

The supercomputer crown is currently held by Tianhe-2, a machine developed by China’s National University of Defense Technology, with a peak performance of 55 petaflop/s and a sustained (Rmax) performance of 33.86 petaflop/s. Coincidentally or not, the news of Aurora broke on the same day the Feds announced they had refused to grant Intel a licence to supply Xeon chips for Tianhe-2, on the grounds that the Chinese supercomputer is used to develop nuclear weapons.


Breakthrough in computing: brain-like chip features 4096 cores, 1 million neurons, 5.4 billion transistors


Image: IBM

The brain of complex organisms, be they humans, other primates, or even mice, is very difficult to emulate with today’s technology. IBM is moving things further in this direction after announcing the whopping specs of its new brain-like chip: one million programmable neurons and 256 million programmable synapses across 4096 individual neurosynaptic cores, all made possible using 5.4 billion transistors. TrueNorth, as it’s been dubbed, looks amazing not just because of its raw computing power (after all, this kind of thing was possible before; you just had to build more muscle and put more cash and resources into the project), but also because of the tremendous leap in efficiency. The chip, possibly the most advanced of its kind, operates at max load using only 72 milliwatts. That’s 176,000 times more efficient than a modern CPU running the same brain-like workload, and 769 times more efficient than other state-of-the-art neuromorphic approaches. Enter the world of neuroprogramming.

[ALSO READ] The most complex human artificial brain yet

Main components of IBM’s TrueNorth (SyNAPSE) chip. Image: IBM

Main components of IBM’s TrueNorth (SyNAPSE) chip. Image: IBM

The coronation of a six-year-old IBM project partially funded by DARPA, TrueNorth made its first baby steps in an earlier prototype. The 2011 version had only 256 neurons, but in the meantime the developers made some drastic improvements, like switching to Samsung’s 28 nm transistor process. Each TrueNorth chip consists of 4096 neurosynaptic cores arranged in a 64×64 grid. Like a small brain network that communicates with other networks, each core bundles 256 inputs (axons), 256 outputs (neurons), SRAM (neuron data storage), and a router that allows any neuron to transmit to any axon up to 255 cores away. In total, 256×256 means each core is capable of processing 65,536 synapses, and if that wasn’t crazy enough, IBM has already built a 16-chip TrueNorth system with 16 million neurons and 4 billion synapses.
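The headline numbers are internally consistent and simple to verify from the chip’s layout:

```python
# Checking TrueNorth's headline figures against its stated layout.

cores = 64 * 64               # 64x64 grid of neurosynaptic cores
neurons = cores * 256         # 256 neurons (outputs) per core
synapses = cores * 256 * 256  # 256 axons x 256 neurons per core

print(cores)     # 4096
print(neurons)   # 1048576 -- the "one million programmable neurons"
print(synapses)  # 268435456 -- the "256 million programmable synapses"
```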

[ALSO] New circuitboard is 9,000 times more efficient at simulating the human brain than your PC

By now, some of you may be confused by all these technicalities. What do they mean? Why should you care, for that matter? The ultimate goal is to reach a complete and absolute understanding of how the human brain works. We’re far from that goal, but we need to start somewhere. To run complex simulations of deep neural networks you need dedicated hardware that is up to the job, preferably hardware that closely matches the brain’s parallel computation. Then you need software, but that’s a story for another time.

Of course, there’s also a commercial interest. IBM is in with the big boys: the company has been at the forefront of technology for decades, and the people running it know that big data interpretation is a huge slice of the global information pie. Watson, the supercomputer that beat Jeopardy’s top veterans, is just one of IBM’s big projects in this direction – semantic data retrieval. Watson’s nephews will be ubiquitous in every important institution, be it hospitals or banks. Expect TrueNorth to play a big part in all of this, running on the inside to help the world grow faster on the outside.

More details can be found in the paper published in the journal Science.



Robot passes the Turing Test for the first time in history

The 65-year-old iconic Turing Test was passed for the very first time by a supercomputer program named Eugene Goostman. Eugene managed to convince 33% of the human judges that it, too, was human.

The Turing Test

The Turing test is a test of a machine’s ability to exhibit intelligent behaviour equivalent to, or indistinguishable from, that of a human (via Wikipedia). In this test, a human judge engages in natural-language conversations with a human and a machine. If the judge can’t tell which is the human and which is the program, the machine passes the test. The test doesn’t check the machine’s ability to give straight or correct answers, but rather its human-like behavior.
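The pass criterion itself is simple enough to sketch. In the snippet below, the 30% threshold is Turing’s own benchmark, and the vote data is invented purely for illustration:

```python
# Minimal sketch of the pass criterion: a machine passes if it fools
# more than 30% of judges (Turing's own benchmark) in five-minute chats.
# The vote data below is invented purely for illustration.

def turing_test_passed(judge_votes, threshold=0.30):
    """judge_votes: one boolean per judge, True if that judge
    believed the machine was human."""
    fooled = sum(judge_votes) / len(judge_votes)
    return fooled > threshold, fooled

# e.g. 10 of 30 judges fooled -> 33%, clearing the 30% bar
passed, score = turing_test_passed([True] * 10 + [False] * 20)
print(passed, round(score, 2))   # True 0.33
```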

The test was introduced by Alan Turing in his 1950 paper “Computing Machinery and Intelligence,” which opens with the words:

“I propose to consider the question, ‘Can machines think?'” Because “thinking” is difficult to define, Turing chooses to “replace the question by another, which is closely related to it and is expressed in relatively unambiguous words.”

Turing, who was a genius way ahead of his time, believed that sooner or later his test would be passed, but he was a bit off. He estimated that by the year 2000, machines with around 100 MB of storage would be able to fool 30% of human judges in a five-minute test. Futurist Ray Kurzweil estimated in 1990 that a machine would pass it by 2020, then in 2005 moved the date to 2029. He even made a bet with Mitch Kapor about when the Turing test would be passed – but nobody was expecting it to happen so soon.

Turing, meet Eugene

Eugene is a computer programme that simulates a 13-year-old boy, developed in Saint Petersburg by Vladimir Veselov, who was born in Russia and now lives in the United States, and the Ukrainian-born Eugene Demchenko, who now lives in Russia. Eugene managed to pass the test, convincing 33% of the judges that he was human in a five-minute chat (a score above 30% is needed to pass). The event was organized by the University of Reading’s School of Systems Engineering in partnership with RoboLaw, an EU-funded organisation focused on the development of robotics. Professor Kevin Warwick, a Visiting Professor at the University of Reading and Deputy Vice-Chancellor for Research at Coventry University, said:

“In the field of Artificial Intelligence there is no more iconic and controversial milestone than the Turing Test, when a computer convinces a sufficient number of interrogators into believing that it is not a machine but rather is a human. It is fitting that such an important landmark has been reached at the Royal Society in London, the home of British Science and the scene of many great advances in human understanding over the centuries. This milestone will go down in history as one of the most exciting.”

Still, there is of course some controversy around this achievement. Some claim that the program’s task was made easier by the fact that it mimics a 13-year-old boy from Odessa – who can’t be expected to have the same level of knowledge as a grown man, and can be excused for small grammar errors. Veselov stated:

“Eugene was ‘born’ in 2001. Our main idea was that he can claim that he knows anything, but his age also makes it perfectly reasonable that he doesn’t know everything. We spent a lot of time developing a character with a believable personality. This year we improved the ‘dialog controller’ which makes the conversation far more human-like when compared to programs that just answer questions. Going forward we plan to make Eugene smarter and continue working on improving what we refer to as ‘conversation logic’.”


Scientists show for the first time that climate change will cause more intense summer storms in Britain

The British summer has always been a subject of fascination and annoyance for its fickle, rainy nature. But a new study using supercomputers has shown that climate change will bring even more intense storms and rainfall, with flash flooding becoming a common occurrence.

The study published in the journal Nature Climate Change shows the first evidence that summer downpours in the UK could become heavier with climate change.

“We used a very high-resolution model more typically used for weather forecasting to study changes in hourly rainfall. Unlike current climate models, this has a fine resolution and is able to realistically represent hourly rainfall, so this allows us to make these future projections with some confidence,” the study reads.

What they found was that summers are likely to become drier overall – but before you get too excited about this, remember that warmer air can hold more moisture, so when rain does come, it tends to fall in much more intense bursts.

As always with climate models, there will be some debate regarding how it was developed. The researchers used the model to simulate two 13-year periods, one based on the current climate and one based on the climate at the end of the century under a high-emissions scenario (the IPCC’s RCP8.5 scenario). This model is not meant to be a reliable forecast of what the weather or climate will be like in 2100 – especially as large-scale models should always be taken with a grain of salt; it simply shows how the weather and climate will tend to behave compared to today’s conditions. The trends are very clear.

They reported some differences when they changed the resolution of the model, and only reached these conclusions at the highest resolution.

“The simulation showed increased hourly rainfall intensity during winter, consistent with the simulations for the future provided by coarser resolution models and previous studies looking at changes on daily timescales. However the finely grained model also revealed that short-duration rain will become more intense during summer, something that the coarser model was unable to simulate.”

So, again, this is not trying to create panic or anything, and the model shouldn’t be taken as absolute. But, according to the physics and climatology we know so far, climate change will cause these negative effects in Great Britain.

Scientific Reference: Heavier summer downpours with climate change revealed by weather forecast resolution model

Klaus Schulten, professor of physics; and Juan Perilla, postdoc with Theoretical and Computational Biophysics Group at the Beckman Institute with projection of atomic-level detail of the structure of the HIV capsid (outer shell).

Largest supercomputer bio-simulation ever reveals key HIV protective shell structure


One big obstacle scientists face in their efforts to develop effective drugs against HIV is the virus’ capsid – the viral protein shell, nested inside the virus’ outer cell membrane-derived envelope, that protects HIV’s essential proteins and genetic information. Current drugs have a hard time breaching this structure; however, this might change. Using a supercomputer that crunched immense amounts of data, scientists recently reported that they have decoded the structure that contains and protects HIV’s genetic material.

“The capsid is critically important for HIV replication, so knowing its structure in detail could lead us to new drugs that can treat or prevent the infection,” said senior author Peijun Zhang, associate professor at the University of Pittsburgh School of Medicine. “This approach has the potential to be a powerful alternative to our current HIV therapies, which work by targeting certain enzymes, but drug resistance is an enormous challenge due to the virus’ high mutation rate.”

The capsid is one tricky fellow, though, and accurately describing its structure was no easy task by any means. For one, the shell is composed of nonuniform combinations of five- and six-subunit protein structures that link together to form an asymmetric shape. To model it, scientists had to piece together each of the 3 to 4 million atoms comprising it, while also accounting for all the water molecules and salt ions present, bringing the total to some 64 million atoms.
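To get a rough feel for why a supercomputer was needed, here is a back-of-the-envelope memory estimate; the bytes-per-atom figure is an illustrative assumption (position, velocity and force as three doubles each), not a number from the study:

```python
# Rough memory footprint of a 64-million-atom molecular dynamics run.
# Bytes-per-atom is an illustrative assumption: position, velocity and
# force each stored as three 8-byte doubles (ignoring neighbor lists etc.).
n_atoms = 64_000_000
bytes_per_atom = 3 * 8 * 3   # 3 vectors x 3 components x 8 bytes

state_gb = n_atoms * bytes_per_atom / 1e9
print(round(state_gb, 1))    # 4.6 -- GB for a single snapshot of raw state alone
```

And that is just one snapshot; a full simulation evaluates forces between those atoms millions of times over.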

The University of Pittsburgh researchers first used electron microscopy to see in incredible detail the protein molecules that comprise the capsid, then used an imaging technique to visualize how these molecules connect to each other to form the general shape of the shell. The data was then sent to University of Illinois physicists, who fed it into computer models running on Blue Waters, their new supercomputer at the National Center for Supercomputing Applications capable of processing 1 quadrillion operations per second. The data was run through the computer using a process called molecular dynamics flexible fitting, which output the minute details of the capsid’s structure.

 The researchers used the supercomputer Blue Waters to determine the complete HIV capsid structure, a simulation that accounted for the interactions of 64 million atoms.


The process revealed a three-helix bundle with critical molecular interactions at the seams of the capsid, areas that are necessary for the shell’s assembly and stability, which represent vulnerabilities in the protective coat of the viral genome.

“This is a big structure, one of the biggest structures ever solved,” said University of Illinois physics professor Klaus Schulten. “It was very clear that it would require a huge amount of simulation – the largest simulation ever published. You basically simulate the physical characteristics and behavior of large biological molecules but you also incorporate the data into the simulation so that the model actually drives itself toward agreement with the data.”

If capsid assembly or disassembly is disrupted, viral replication, and consequently transmission, can be stopped. Now armed with this new found information, researchers have opened up a new front in their ongoing war with HIV.

“The capsid is very sensitive to mutation, so if we can disrupt those interfaces, we could interfere with capsid function,” Zhang said. “The capsid has to remain intact to protect the HIV genome and get it into the human cell, but once inside it has to come apart to release its content so that the virus can replicate. Developing drugs that cause capsid dysfunction by preventing its assembly or disassembly might stop the virus from reproducing.”

The findings appeared in Nature.

The tiny neurosynaptic core produced by IBM. (c) IBM

Cognitive computing milestone: IBM simulates 530 billion neurons and 100 trillion synapses

First initiated in 2008 by IBM, the Systems of Neuromorphic Adaptive Plastic Scalable Electronics (SyNAPSE) program has as its final goal the development of a new cognitive computer architecture based on the human brain. Recently, IBM announced it has reached an important milestone for the program, after the company successfully simulated 530 billion neurons and 100 trillion synapses on one of the world’s most powerful supercomputers.

It’s worth noting, however, before you get too excited, that the IBM researchers have not built a biologically realistic simulation of the complete human brain – that goal is still many years away. Instead, the scientists devised a cognitive computing architecture called TrueNorth with 10^10 neurons (10 billion) and 10^14 synapses (100 trillion), numbers inspired by the human brain; the architecture is modular, scalable, non-von Neumann, and ultra-low power. The researchers hope that in the future this essential step might allow them to build an electronic neuromorphic machine technology that scales to biological levels.

 “Computation (‘neurons’), memory (‘synapses’), and communication (‘axons,’ ‘dendrites’) are mathematically abstracted away from biological detail toward engineering goals of maximizing function (utility, applications) and minimizing cost (power, area, delay) and design complexity of hardware implementation,” reads the abstract for the Supercomputing 2012 (SC12) paper (full paper link).

Steps towards mimicking the full-power of the human brain

Authors of the IBM paper (left to right): Theodore M. Wong, Pallab Datta, Steven K. Esser, Robert Preissl, Myron D. Flickner, Rathinakumar Appuswamy, William P. Risk, Horst D. Simon, Emmett McQuinn, Dharmendra S. Modha. (Photo Credit: Hita Bambhania-Modha)


IBM simulated the TrueNorth system on one of the world’s fastest operating supercomputers, the Lawrence Livermore National Lab (LLNL) Blue Gene/Q Sequoia, using 96 racks (1,572,864 processor cores, 1.5 PB of memory, 98,304 MPI processes, and 6,291,456 threads).

IBM and LLNL achieved an unprecedented scale of 2.084 billion neurosynaptic cores containing 53×10^10 (530 billion) neurons and 1.37×10^14 (137 trillion) synapses, running only 1,542 times slower than real time.
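Those figures are self-consistent with the TrueNorth core design, as a quick check shows:

```python
# Cross-checking the Sequoia simulation figures quoted above.
cores = 2.084e9        # neurosynaptic cores simulated
neurons = 53e10        # 530 billion neurons
synapses = 1.37e14     # synapses

print(round(neurons / cores))          # 254 -- neurons per simulated core (~256)
print(f"{round(synapses / cores):,}")  # 65,739 -- synapses per core, close to
                                       # the 256 x 256 = 65,536 of the chip design
```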


“Previously, we have demonstrated a neurosynaptic core and some of its applications,” continues the abstract. “We have also compiled the largest long-distance wiring diagram of the monkey brain. Now, imagine a network with over 2 billion of these neurosynaptic cores that are divided into 77 brain-inspired regions with probabilistic intra-region (“gray matter”) connectivity and monkey-brain-inspired inter-region (“white matter”) connectivity.

“This fulfills a core vision of the DARPA SyNAPSE project to bring together nanotechnology, neuroscience, and supercomputing to lay the foundation of a novel cognitive computing architecture that complements today’s von Neumann machines.”

According to Dr. Dharmendra S. Modha, IBM’s cognitive computing manager, his team’s goal is to mimic the processes of the human brain. While IBM’s competitors focus on computing systems that mimic the left side of the brain, processing information sequentially, Modha is working on replicating functions of the right side, where information is processed in parallel and where incredibly complex brain functions lie. To this end, the researchers combine neuroscience and supercomputing.

Consider that the room-sized, cutting-edge, billion-dollar technology used by IBM to scratch the surface of artificial human cognition still doesn’t come near the capabilities of our brain, which occupies a volume comparable to a 2L bottle of water and needs less power than a light bulb to work. The video below features Dr. Modha explaining his project in an easy-to-understand manner, and it’s only 5 minutes long.

source: KurzweilAI


US takes fastest supercomputer crown

In October 2010, China unveiled the fastest computer of the day, beating the previous record holder by 30% – quite an impressive feat. But the US didn’t just sit back.


Titan, which resides at the Oak Ridge National Laboratory in Tennessee, is an upgrade of the 2009 record holder and works at 17.59 petaflops – meaning it can make 17,590 trillion calculations per second. That’s about as much as the entire population of the US working together would make in a gazillion years.
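For the curious, the human-calculator comparison can be made concrete with some rough arithmetic; the one-calculation-per-person-per-second rate and the ~315 million population figure are illustrative assumptions, not numbers from the article:

```python
# Making the "entire US population" comparison concrete.
# Assumptions (illustrative): ~315 million people, each performing
# one calculation per second without rest.
titan_calcs_per_sec = 17.59e15
us_population = 315e6

person_seconds = titan_calcs_per_sec / us_population  # to match ONE Titan-second
years = person_seconds / (3600 * 24 * 365)
print(round(years, 2))  # 1.77 -- years of everyone calculating, per second of Titan time
```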

Titan leapfrogged the previous champion, IBM’s Sequoia (which, sadly, is working on how to extend the life of nuclear weapons), by mixing together AMD CPUs and Nvidia Tesla GPUs. This is a different approach from other supercomputers, which relied solely on CPUs. GPUs, despite being slower at individual calculations, make up for it by performing many more of them at the same time.

“Basing Titan on Tesla GPUs allows Oak Ridge to run phenomenally complex applications at scale, and validates the use of ‘accelerated computing’ to address our most pressing scientific problems,” said Steve Scott, chief technology officer of the GPU accelerated computing business at Nvidia.

Yellowstone ncar supercomputer

Most powerful supercomputer dedicated to geosciences is now live


Some of the Yellowstone supercomputer’s racks. A mosaic of the Yellowstone National Park was put in place as a tribute. (c) CARLYE CALVIN / NCAR

While climate change may be a subject of intense debate, with equally enthusiastic supporters on both sides of the fence, one thing no one, no matter their side, should argue against is allocating resources for its study. Just recently, one of the most powerful tools for studying the planet’s climate in great detail was powered up – the “Yellowstone” 1.5-petaflop supercomputer, which has already entered the list of the top 20 supercomputers in the world.

The system went live at the NCAR-Wyoming Supercomputing Center in Cheyenne, Wyoming, where it was met with enthusiasm by the meteorologists and geoscientists stationed there – and by the rest of the world, for that matter. Yellowstone promises to help scientists run complex climate models, allowing them to study anything from hurricanes and tornadoes to geomagnetic storms, tsunamis and wildfires, as well as to locate resources such as oil miles beneath the Earth’s crust.

People “want to know what [climate change] is going to do to precipitation in Spain or in Kansas,” said Rich Loft, the director of technology development at the center.

The supercomputer can perform computations at 1.5 petaflops, which translates into a staggering 1,500 teraflops, or 1.5 quadrillion calculations per second. To get an idea of both the upgrade Yellowstone represents and the pace of technological advancement in the past few years, consider that NCAR’s previous supercomputer, Bluefire, commissioned in 2008, peaked at 76 teraflops – and yet it was one of the most powerful supercomputers of its day.
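That works out to a roughly 20-fold jump in four years:

```python
# Yellowstone vs. its 2008 predecessor, Bluefire.
yellowstone_tflops = 1_500   # 1.5 petaflops
bluefire_tflops = 76
print(round(yellowstone_tflops / bluefire_tflops, 1))  # 19.7 -- nearly a 20x jump
```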

The $70 million data center comprises 100 racks with 72,288 compute cores built from Intel Sandy Bridge processors, a massive 144.6-terabyte storage farm, and a system for visualizing all of its data.

A powerful tool for predicting our planet’s climate

All these numbers might not mean much on their own, but put the machine’s tasks into context and they become impressive. For instance, a short-term weather forecast that typically took Bluefire a few hours to complete can be rendered by Yellowstone in mere minutes. But it’s not speed where Yellowstone shines most – it’s the complexity of the tasks it can undertake. Scientists typically build climate models of a region on a grid of 100-km-wide cells, yet Yellowstone can refine that resolution to as fine as 10 km. This significant improvement allows for a far more detailed and accurate assessment of climate change.
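The resolution jump is bigger than it sounds: refining the grid tenfold in each horizontal dimension multiplies the number of grid cells a hundredfold (and the real cost is higher still, since the model’s time step must shrink to match):

```python
# Why a 10 km grid is such a leap over a 100 km grid.
coarse_km, fine_km = 100, 10
cells_factor = (coarse_km / fine_km) ** 2   # 2-D horizontal refinement
print(cells_factor)   # 100.0 -- a hundred times more grid cells over the same region
```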

“Scientists will be able to simulate these small but dangerous systems in remarkable detail, zooming in on the movement of winds, raindrops, and other features at different points and times within an individual storm. By learning more about the structure and evolution of severe weather, researchers will be able to help forecasters deliver more accurate and specific predictions, such as which locations within a county are most likely to experience a tornado within the next hour,” according to a NCAR statement.

Currently, 11 research projects have already been planned to make use of Yellowstone “to try to do some breakthrough science straight away and try to shake the machine,” according to NCAR officials.


IBM to develop world’s most powerful computing system tasked with finding origins of Universe

Backed by an international consortium, the world’s largest and most sensitive radio telescope – the Square Kilometer Array (SKA) – will be built over the next ten years. The project will consist of thousands of antennas spread across thousands of miles, with a collecting area equivalent to one square kilometer (hence the name), that will hopefully help astronomers take a peek at the Universe’s earliest moments after the Big Bang. However, such a grand scientific effort requires an equally humongous computing power – one that only seven million of today’s fastest computers combined could match. Recently, IBM was granted the privilege of researching the exascale supercomputing system to be integrated with the SKA, after it won a $42 million contract to work with the Netherlands Institute for Radio Astronomy (ASTRON).

IBM has thus set out on the Herculean task of developing a solution that can match SKA’s need to read, store and process one exabyte of raw data per day. An exabyte is the equivalent of 1,000,000 terabytes, or 12,000,000 fully loaded latest-generation iPods. If that doesn’t convey the scale involved, consider that one exabyte roughly equals two days’ worth of global internet traffic. Massive!
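Translated into more familiar units, with straightforward arithmetic on the one-exabyte-per-day figure above:

```python
# The SKA data rates quoted above, in more familiar units.
exabyte_bytes = 1e18

print(f"{exabyte_bytes / 1e12:,.0f}")  # 1,000,000 -- terabytes per day

per_second_gb = exabyte_bytes / 86_400 / 1e9   # 86,400 seconds in a day
print(f"{round(per_second_gb):,}")     # 11,574 -- gigabytes of raw data every second
```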

In Drenthe, Netherlands, ASTRON and IBM will look at energy-efficient exascale computing, data transport at light speed, storage processes and streaming analytics technology. “We have to decrease power consumption by a factor of 10 to 100 to be able to pay the power bill for such a machine,” said Andreas Wicenec, head of computing at the International Centre for Radio Astronomy Research in the state of Western Australia.

With this purpose in mind, the researchers are currently investigating advanced accelerators and 3D-stacked chips, architectures already proven highly energy-efficient in IBM’s labs. They will also look at how to optimize huge data transfers using novel optical interconnect technologies and nanophotonics. For the next five years, a team of 50 people, along with astronomers from 20 countries, will work on building the most complex supercomputing system in the world.

Artist impression of the SKA radio telescope were it to be built in Australia. (c) SKA Program Development Office


“To detect the signals, you really need a good antenna,” said Ronald Luitjen, an IBM scientist and data motion architect on the project. “It would be the equivalent of 3 million TV antennae dishes. This will be a unique instrument. Nothing else can do this kind of science.”

Radio telescopes in operation today are very powerful, but SKA will be in a whole different league. It will provide a real-time all-sky radio survey, on the lookout for some of the Universe’s strangest phenomena, unexplored with today’s technology. The telescope will be used to explore evolving galaxies and dark matter, look for complex organic molecules in interstellar space, and study data from the Big Bang, the primordial cosmic event that gave birth to all the matter and anti-matter in the Universe more than 13 billion years ago. All of this, you guessed it, requires a huge computing effort – one that will hopefully be ready in the coming years, before the SKA’s completion in 2024.

The $2 billion SKA will be located either in Australia/New Zealand or in South Africa, with the latter currently favored. These regions were selected for their low radio pollution. Nevertheless, the scientists involved in the project are looking at the bright side of the lengthy completion time. “It is really relying on the fact that technology is improving at a certain rate,” said Wicenec. Well, how about quantum computing?

The SKA might hold the key to unlocking some of the Universe’s well kept secrets today, and, if anything, it will open a new era of computing, with ramifications in all spheres of science.

Stars create gaps devoid of gas giants, supercomputer simulation shows – contradicted by our own solar system

Gas giants might just be the most whimsical planets of all: they don’t simply settle at any old point in their orbit – instead, they favor certain regions and steer clear of others, at least according to a new supercomputer simulation.

A new study recently revealed that the orbital deserts and pile-ups caused by these preferences might actually be carved by starlight itself. Using supercomputer simulations of young solar systems, astronomers Richard Alexander of the University of Leicester in the United Kingdom and Ilaria Pascucci of the University of Arizona’s Lunar and Planetary Laboratory found that powerful ultraviolet and X-ray emissions from the star tend to carve out empty spaces.

When planetary systems form, they start out as spinning disks of dust and gas; some of the material clumps together to form planets or moons, while some only lives on as comets, asteroids, or other such bodies.

“The disk material that is very close to the star is very hot, but it is held in place by the star’s strong gravity,” said Alexander in a press release from the University of Arizona. “Further out in the disk where gravity is much weaker, the heated gas evaporates into space.”

Around a star like our Sun, these gasless gaps seem to form 100 million to 200 million miles from the star.

“The planets either stop right before or behind the gap, creating a pile-up,” said Pascucci in the press release. “The local concentration of planets leaves behind regions elsewhere in the disk that are devoid of any planets. This uneven distribution is exactly what we see in many newly discovered solar systems.”

However, while this model seems correct and has been validated against other solar systems, our own appears to stand in contradiction. Earth orbits the sun at a distance of about 93 million miles, where the void should begin, while Jupiter, the closest gas giant to the Sun, orbits at about 500 million miles. Time, however, will tell whether Alexander and Pascucci’s model is correct, as telescopes discover more and more solar systems.

Shorties: IBM sets up supercomputer to fight climate change

IBM has recently developed a new 1.6-petaflop high-performance computer for the National Center for Atmospheric Research, with the purpose of adding new supercomputing capacity and helping the center’s research into the atmosphere and climate change.

A petaflop is a unit of measure of a supercomputer’s performance; it is the ability to perform a quadrillion floating-point operations per second (FLOPS).

GPU upgrade makes Jaguar the fastest computer in the world again

No, not the sports car, nor the predatory feline, but Oak Ridge National Lab’s Jaguar – a supercomputer of immense computing capabilities set to top the ranks of the fastest computers in the world for the second time after a GPU (graphics processing unit) upgrade. Capable of simulating physical systems with heretofore unfeasible speed and accuracy – from the explosions of stars to the building blocks of matter – the upgraded Jaguar will reach an incredible peak speed of 20 petaflops (20,000 trillion computations per second). The speedy computer will be renamed “Titan” after its overhaul.

RELATED: Supercomputer simulation confirms Universe formation model

This is the second time the ORNL supercomputer will top the Top500 list of the world’s supercomputers, after being surpassed by Japan’s K Computer and China’s Tianhe-1A last year. The title will be earned as a result of a deal inked between ORNL and Cray Inc., the manufacturer of the XT5-HE supercomputer at the heart of Jaguar, which will overhaul the machine with thousands of graphics processors from NVIDIA as well as chips from Advanced Micro Devices.

“All areas of science can benefit from this substantial increase in computing power, opening the doors for new discoveries that so far have been out of reach,” said associate lab director for computing Jeff Nichols.

“Titan will be used for a variety of important research projects, including the development of more commercially viable biofuels, cleaner burning engines, safer nuclear energy and more efficient solar power.”

The multi-year contract, valued at more than $97 million, will make Titan at least twice as fast and three times as energy-efficient as today’s fastest supercomputer, which is located in Japan.


Data Center

IBM is building the largest data array in the world – 120 petabytes of storage


IBM recently made public its intention to develop what will be, upon completion, the world’s largest data array, consisting of 200,000 conventional hard disk drives working together to provide 120 petabytes of storage space. This massive data array, 10 times bigger than any other data center in the world to date, has been ordered by an “unnamed client”, whose intentions have yet to be disclosed. IBM says the huge storage space will be used for complex computations, like those used to model weather and climate.

To put things into perspective, 120 petabytes, or 120 million gigabytes, would account for 24 billion typical five-megabyte MP3 files, or 60 copies of the entire internet, which currently spans some 150 billion web pages. And while 120 petabytes might sound outrageous by any sane standard today, at the rate technology is advancing it might become fairly common to encounter similarly sized data centers in the near future.

“This 120 petabyte system is on the lunatic fringe now, but in a few years it may be that all cloud computing systems are like it,” says Bruce Hillsberg, IBM’s director of storage research. Just keeping track of the names, types, and other attributes of the files stored in the system will consume around two petabytes of its capacity.

I know some of you tech enthusiasts out there are already grinding your teeth at these fairly dubious numbers. I know I have: 120 petabytes divided by 200,000 drives equals 600 GB per drive. Does this mean IBM is using only 600 GB hard drives? I’m willing to bet they’re not going that cheap – it would be extremely counter-productive in the first place. Firstly, it’s worth pointing out that we’re not talking about your usual consumer hard drives. Most likely, the drives used will be of the 15K RPM Fibre Channel sort, at the very least – which beats the heck out of the SATA drive currently powering your computer’s storage. These kinds of drives currently don’t offer as much capacity as SATA ones, which might be one explanation. There’s also the issue of redundancy found in data centers, which decreases the amount of real available storage space – and the overhead grows as the data center gets larger. So the drives used could actually be somewhere between 1.5 and 3 TB, all running at cutting-edge data transfer speeds.
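The skeptical math above can be redone with redundancy factored in; the overhead factors here are illustrative assumptions, not IBM’s actual figures:

```python
# Redoing the drive-size math, with redundancy factored in.
# The overhead factors below are illustrative assumptions, not IBM's numbers.
total_pb = 120
n_drives = 200_000

raw_gb_per_drive = total_pb * 1e6 / n_drives   # 1 PB = 1,000,000 GB
print(raw_gb_per_drive)   # 600.0 -- GB per drive if there were zero overhead

for overhead in (1.5, 2.5, 5.0):   # replication / parity / spare-capacity factors
    print(overhead, round(raw_gb_per_drive * overhead))  # 900, 1500, 3000 GB drives
```

Even modest overhead factors land the per-drive capacity squarely in the 1.5 to 3 TB range the paragraph above speculates about.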

Steve Conway, a vice president of research with the analyst firm IDC who specializes in high-performance computing (HPC), says IBM’s repository is significantly bigger than previous storage systems. “A 120-petabye storage array would easily be the largest I’ve encountered,” he says.

To house this massive number of hard drives, IBM placed them horizontally in drawers, as in any other data center, but made these spaces wider in order to accommodate more disks within smaller confines. Engineers also implemented a new data-backup mechanism, whereby information from dying disks is slowly reproduced on a replacement drive, allowing the system to continue running without any slowdown. A system called GPFS, meanwhile, spreads stored files over multiple disks, allowing the machine to read or write different parts of a given file at once, while indexing the entire collection at breakneck speed.

Last month a team from IBM used GPFS to index 10 billion files in 43 minutes, effortlessly breaking the previous record of one billion files scanned in three hours. Now, that’s something!
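That record-breaking is easy to quantify from the two figures quoted:

```python
# Comparing the two GPFS indexing runs mentioned above.
old_rate = 1e9 / (3 * 3600)     # 1 billion files in three hours  -> files/sec
new_rate = 10e9 / (43 * 60)     # 10 billion files in 43 minutes  -> files/sec
print(round(new_rate / old_rate))  # 42 -- roughly a 42x speedup
```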

Fast access to huge storage is crucial for supercomputers, which need humongous amounts of bytes for the various complicated models they’re assigned, be it weather simulations or the decoding of the human genome. Of course, such systems can also be used – and most likely already are – to store identities and human biometric data. I’ll take this opportunity to remind you of a frightful fact we published a while ago: every six hours, the NSA collects data the size of the Library of Congress.

As quantum computing gains ground and the first quantum computer is eventually developed, these kinds of data centers will become far more common.

UPDATE: The facility did indeed open in 2012.

MIT Technology Review