How to Understand the Data Explosion

The Future of Everything covers the innovation and technology transforming the way we live, work and play, with monthly issues on transportation, health, education and more. This month is Data, online starting Dec. 2 and in print Dec. 9.
Remember getting your first digital camera? Suddenly you could take (and keep) hundreds of photographs. Today, it’s likely that you’re keeping thousands of photos—as well as video, audio and other files—on the phone in your pocket.
The files on your phone are only a microcosm of the global increase in data creation and storage. Individuals, businesses and governments are generating an almost astronomical amount of data, one that requires strings of zeros and arcane terms just to describe its sheer volume. About 64 zettabytes was created or copied last year, according to IDC, a technology market research firm. In bytes, that’s 64 followed by 21 zeros. A zettabyte is 1 billion terabytes or 1 trillion gigabytes.
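For readers who want to check the unit arithmetic, here is a minimal Python sketch. It assumes the decimal (SI) definitions of the units, in which a gigabyte is 10^9 bytes, a terabyte is 10^12 bytes and a zettabyte is 10^21 bytes. The helper name `zettabytes_to` is made up for this illustration.

```python
# Decimal (SI) byte units; assumed here, since the article does not say
# whether it means decimal or binary units.
GIGABYTE = 10**9
TERABYTE = 10**12
ZETTABYTE = 10**21


def zettabytes_to(unit_bytes: int, zettabytes: int) -> int:
    """Convert a whole number of zettabytes into a smaller unit (hypothetical helper)."""
    return zettabytes * ZETTABYTE // unit_bytes


created_zb = 64  # IDC's estimate for last year, as quoted in the article
total_bytes = created_zb * ZETTABYTE

print(f"{total_bytes:,} bytes")  # 64 followed by 21 zeros
print(f"{zettabytes_to(TERABYTE, 1):,} terabytes per zettabyte")
print(f"{zettabytes_to(GIGABYTE, 1):,} gigabytes per zettabyte")
```

Under those definitions, one zettabyte works out to 1 billion terabytes and 1 trillion gigabytes, matching the figures above.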
It isn’t only hard to wrap your head around. All that data is a challenge to store, process and retrieve, and it will become more difficult as the volume surges and data stewards confront the sustainability of using immense amounts of electricity and water to power and cool data centers. Data-center construction will grow at compound annual rates of 5% to 10%, market researchers estimate.
The prospect has researchers seeking radical alternatives, including synthetic DNA. Its information density far exceeds what’s possible with storage media like magnetic tape or optical discs. One project, sponsored by the federal Office of the Director of National Intelligence, is funding several teams with the short-term goal of producing DNA technology that can encode and retrieve up to 10 terabytes of information a day. The goal is to lay groundwork to shrink what now takes a full-size data center into a machine that sits on a desk.
To be sure, consumers and businesses throw away most of the data they create every year.
The digital master of a movie might be just a few gigabytes, and relatively little storage is needed to place copies on servers scattered around the world. But millions of viewers means making millions of copies. Even if that data isn’t stored, industries have grown up around the infrastructure to distribute all that data smoothly. Or think of deleting six episodes of “The Voice” from the family video recorder just before the Super Bowl. Likewise, millions of cameras monitor assembly lines, stores, front porches and what the dog’s doing when no one is home. Much of it isn’t needed and is soon overwritten.
Almost two-thirds of last year’s data existed only briefly, according to John Rydning, IDC’s research vice president. Nearly all of the remaining third was stored but then overwritten or deleted within the year. Only a tiny fraction, less than 2% of the total by IDC’s estimate, survived into this year.
But even 2% of 64 zettabytes is a huge amount.
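To put a number on "huge," the sketch below works out the upper bound implied by those figures. It treats the surviving share as exactly 2%, although IDC says only that it is less than 2%, and it again assumes decimal units, so the result is a ceiling and not an estimate.

```python
from fractions import Fraction

CREATED_ZB = 64                     # zettabytes created or copied last year (IDC)
SURVIVING_SHARE = Fraction(2, 100)  # upper bound: IDC says "less than 2%"
TB_PER_ZB = 10**9                   # 1 zettabyte = 1 billion terabytes (decimal units)

# Exact arithmetic with Fraction avoids floating-point rounding noise.
surviving_zb = CREATED_ZB * SURVIVING_SHARE
surviving_tb = surviving_zb * TB_PER_ZB

print(float(surviving_zb), "zettabytes at most")  # 1.28
print(f"{int(surviving_tb):,} terabytes at most")
```

Even that sliver comes to as much as 1.28 zettabytes, or 1.28 billion terabytes.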
To understand the data explosion, it may be easier to look at it in reduced form—as the amount of data that piles up daily as the result of activities that involve individuals, directly or indirectly.
To help visualize amounts, we’ve translated digital storage to rice. A grain of rice represents 1 megabyte, about the size of a low-resolution digital photo. A quarter-cup of rice—a typical serving—is about 2,000 grains or 2 gigabytes, about the amount that would fit on a small thumb drive.
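The rice conversion can be written out as a short Python sketch. The constants come straight from the description above; the function names and the 128-gigabyte phone are hypothetical examples that do not appear in the article.

```python
MB_PER_GRAIN = 1            # one grain of rice stands for 1 megabyte
GRAINS_PER_SERVING = 2_000  # a quarter-cup serving is about 2,000 grains
MB_PER_GB = 1_000           # decimal units, consistent with 2,000 grains = 2 GB


def grains_for_gb(gigabytes: float) -> float:
    """Number of rice grains representing the given number of gigabytes."""
    return gigabytes * MB_PER_GB / MB_PER_GRAIN


def servings_for_gb(gigabytes: float) -> float:
    """Number of quarter-cup servings representing the given number of gigabytes."""
    return grains_for_gb(gigabytes) / GRAINS_PER_SERVING


print(servings_for_gb(2))    # the small thumb drive: one serving
print(servings_for_gb(128))  # a hypothetical 128 GB phone
```

On this scale the thumb drive is a single serving, and the hypothetical 128-gigabyte phone would hold 64 servings, or 16 cups of rice.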
In 2019, E-ZPass Group transponders handled 3.8 billion toll transactions on highways, bridges and tunnels, or more than 10 million a day. The consortium includes agencies in 19 Eastern U.S. states. Each agency acts as a clearinghouse so that a driver with a single account can use the pass anywhere.


