Monday, September 24, 2012

Power, pollution and the internet: article on data centres


A very informative New York Times article on data centres, carried in today's Mint (link at the end).

In most fields, the transition from 99% reliability to 99.99999...% reliability is not considered worth the effort. It wastes too many resources that could be used productively elsewhere. The human brain, which is the most efficient computer imaginable, runs on a minuscule amount of electricity but is never 100% reliable. It does, though, have an inherent capacity to sift through loads of information and focus on the bits that are most relevant.

The modern craze for big data, the internet of things, and digitization goes against this fundamental maxim. Google is storing all the data in all the books of the world, all the data on continuously changing landscapes and streetscapes, and generally all the data it can lay its hands on, indexing, classifying and sorting it, and making it available through increasingly intelligent search engines. Social media sites such as Facebook derive their valuations from all the data they hold about trillions of human interactions, which are moving more and more online and generating more and more data. Companies and other networks where humans interact want to store all the data that they generate, for all time, and some of this is even mandated by law. Storage moved on from simple text to pictures, then to videos, and from there to 3D movies. One does not know where it will head from here, only that it will be more data-intensive and that growth will accelerate further.

If all the people in the world were given a unique number or identity, all their characteristics identified and stored, all their interactions indexed and logged; if all the things in the world were similarly indexed; if one took a photograph every second of the state of the world through a zillion cameras; if all the communications between all these elements were recorded and stored; if there were algorithms capable of extracting threads of meaningful information out of them; then there would be no end to the amount of data that could be generated. That is where we are heading: the amount of data sloshing around the world's computers now is nothing compared to what is in store in the future.

And you need to store all this information, of course. Maybe in future, data centres will come with their own nuclear power plants attached. With the enormous amounts of electricity needed to power them, there will be no other option.

We don't need this much data, of course. Maybe at an individual level we can take a philosophical call to resist this explosion. But as with all advances in technology, the group or nation that pursues the technology enjoys too much of an advantage over those that don't. Equalization does not come from everyone agreeing to moderate their pace (which will never happen); rather, everyone competes with everyone else in a race to the finish line. The finish line, in many cases, is a sheer drop over a cliff - but then, lemmings are not the only animals known for collective insanity.


Dinesh