Do we know the real resource costs and benefits of the data cloud?

Unique site visitors, number of clicks, downloads and newsletter subscriptions; customer demographics; water and fuel usage; costs of production compared across factories. Data.

Photo credit: Fat Cow (free data center photos)

In today’s business world, data has become central to strategies for remaining competitive and gaining that elusive market edge. By driving gains in efficiency and informing predictive analytics, data has become key to accelerating innovation. Combined with our rapidly expanding capacity to collect, track, process, analyze, synthesize and apply massive amounts of information, the result is a frenzy to hoard more and more.

As the collection and storage of data grow, it’s a good time to weigh the potential benefits of using that data against the resource costs of managing it. Our shift to a virtual world has all but eliminated the visible and cumbersome relics of historic data collection: musty folios of hand-written measurements or reams of reports stacked in a basement. Sure, each piece of digital data is a barely visible package of tiny bits and bytes; but it is matter - it has mass and takes up space. The more we collect and squirrel away into the vast “cloud,” the greater the demand for resources to power the equipment and systems that make this possible. The problem is not any one individual, business or agency collecting data; it is everyone holding on to everything indefinitely, often in duplicate, triplicate or more. It all adds up, and although the growing mass is virtually invisible, out of sight and largely out of mind, everything must go somewhere.

And that somewhere is the enormously energy-intensive data centers multiplying worldwide. These roughly three million centers bring immense energy budgets, with consequent emissions and strain on the existing power grid, as well as high capital costs and physical resources for expansion and construction. Together they draw an estimated 30 billion watts of electricity, roughly the output of 30 nuclear power plants. This is according to the New York Times report published last September, which thoroughly dispelled any myths about the elegant efficiency and environmental benignity of virtual data. It’s no surprise that a single data center can use more power than a medium-sized town when about a million gigabytes of data must be processed and stored to create a single 3D animated movie, or when as much as 2,000 gigabytes of data produced each day by the New York Stock Exchange alone must be not only stored but also retained for years. According to the US EPA, at least in the United States, the federal government is a foremost driver of increased demand for data storage through its requirements for retention of digital records; its information and national security operations; and its provision of digital services such as e-filing for taxes. Data is piling up and staying put.
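To put those figures in perspective, here is a rough back-of-the-envelope sketch. A watt measures a rate of energy use rather than an amount, so the 30 billion watt estimate can be converted into energy consumed over a year, and the stock exchange's daily volume can be added up over a retention period. The function names are made up for illustration, and both the assumption of a constant year-round draw and the five-year retention period are ours, not figures from the report.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours, ignoring leap years


def annual_energy_twh(power_watts: float) -> float:
    """Convert a constant power draw in watts to terawatt-hours per year."""
    watt_hours = power_watts * HOURS_PER_YEAR
    return watt_hours / 1e12  # 1 TWh = 1e12 Wh


def retained_terabytes(gb_per_day: float, years: float) -> float:
    """Data accumulated when nothing is deleted, in terabytes (1 TB = 1,000 GB)."""
    return gb_per_day * 365 * years / 1000


data_center_power_w = 30e9  # 30 billion watts, the estimate cited above
nuclear_plants = 30         # the comparison cited above

# Implied average output per plant, in gigawatts
gw_per_plant = data_center_power_w / nuclear_plants / 1e9

# Energy used in a year if the draw were constant (an assumption)
twh_per_year = annual_energy_twh(data_center_power_w)

# 2,000 GB per day kept for a hypothetical five years
nyse_tb = retained_terabytes(2000, 5)

print(f"{gw_per_plant:.1f} GW per plant")
print(f"{twh_per_year:.1f} TWh per year")
print(f"{nyse_tb:,.0f} TB retained")
```

Under these assumptions the comparison implies about one gigawatt per plant, and a constant 30 billion watt draw works out to a little over 260 terawatt-hours in a year.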

There has been some movement to temper the very real footprint of the phantasmal cloud through measures to “green” up operations, including greater efficiency in equipment design, repurposing empty department stores as data centers, and the use of more renewable power sources (with questionable results). But these interventions address only the symptoms of our data obsession and ignore the elephant in the room: how can we improve and re-design our relationship to data itself? In our zeal to know, to gain power, insight, and that elusive edge over the competition, have we stopped to weigh the costs and benefits?

We need methods to understand the value of the information we’re collecting and storing relative to the resources being expended to hold on to it. These methods should take into account the benefits of today’s data systems, including our relatively new abilities to search across disparate information sources to make connections and gain important insights. At the same time, they must acknowledge the very real resource costs associated with collecting and storing data: the energy, water, and other material resources tied up in our data systems. By understanding the relationship between these costs and benefits, we may be able to better prioritize data collection and management, for example by creating new criteria for what data is collected and stored, and ultimately improve the resource performance of our data systems. HP, which just announced new tools to help companies deal with “digital landfills,” estimates that upwards of 50 percent of stored data is not only useless but also costs businesses significant money to maintain and puts companies at risk from potential leaks. While improving the efficiency of data centers is important, imagine the resource savings if we were able to halve the need for data storage.