Thirty years ago, back in 1983, the biggest hard drives stored about 10MB of data. That’s barely enough to store two or three .mp3 tracks. Now, a typical notebook has one terabyte of storage, or roughly 100,000 times more, but even this figure is laughable when you consider how much data we’re generating. According to IBM, every day we’re creating 2.5 quintillion bytes of data and 90% of today’s digital data was created in the last two years.
Even those who are computer savvy still look at data at the gigabyte or terabyte scale, but it’s clear we’re moving well past this point. It can get confusing and dizzying, so let’s take a brief overview of how we quantify data and put some context on some of the more obscure units of digital information like the petabyte or yottabyte.
About digital storage or memory
We humans perceive information in analog. For instance, what we see or hear is processed in the brain from a continuous stream. In contrast, a computer is digital and approximates such information using 1s and 0s.
Communicating only in 1s and 0s may sound limiting at first but people have been using sequences of on and off to transmit messages for a long time. In Victorian times, for instance, people used the telegraph to send ‘dots’ (short signal) or ‘dashes’ (a longer signal) by changing the length of time a switch was on. The person listening on the other end would then decipher the binary data written in Morse code into plain English. Transmitting a message over telegraph could take a while, much longer than a message relayed over the telephone for instance, but in today’s digital age this is not a problem because digital data can be decoded in an instant by computers. In binary, 01100001 could be the number 97, or it could represent the letter ‘a’.
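To make that last point concrete, here is a minimal Python sketch showing that the same eight bits can be read either as a number or as a character. It assumes the character is encoded in ASCII, which the text above does not state explicitly.

```python
bits = "01100001"

# Read the bit string as a base-2 integer.
value = int(bits, 2)

# Read the same value as a character code (ASCII is assumed here).
letter = chr(value)

print(value)   # 97
print(letter)  # a

# Going the other way: write a character's code as eight binary digits.
print(format(ord("a"), "08b"))  # 01100001
```

The bits themselves carry no meaning; it is the program reading them that decides whether they stand for a number or a letter.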
Digital storage has several advantages over analog, much in the same way digital communication of information holds advantages over analog communication. Perhaps the clearest example of why digital storage is superior to analog is resistance to data corruption. Let’s look at audio or video tapes for a moment. To store data, a thin plastic tape is impregnated with particles of iron oxide which become magnetized or demagnetized in the presence of a magnetic field from an electromagnet coil. Data is then retrieved from the tape by moving it past another coil of wire, in which the magnetized spots on the tape induce a voltage.
If we were to use analog techniques to store data, like representing a signal by the strength of magnetization of the various spots on the tape, we’d run into a lot of trouble. As the tape ages and magnetization fades, the analog signal will be altered from its original state when the data was first recorded. Moreover, any magnetic field can alter the magnetization on the tape. Since analog signals have infinite resolution, the smallest degree of change will have an impact on the integrity of the data storage.
This is no longer a problem in binary digital form because the strength of magnetization on the tape will be considered in two discrete levels: either ‘high’ or ‘low’. It makes no difference what the in-between states are. Even if the tape experiences slight alterations from magnetic fields, the data is safe from corruption because the discrete levels are still there.
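The idea of reading only two discrete levels can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a model of a real tape drive: the function name `read_bits`, the levels 0.0 and 1.0, the 0.5 threshold, and the maximum drift of 0.3 are all hypothetical values chosen for the example.

```python
import random

def read_bits(levels, threshold=0.5):
    """Classify each measured magnetization level as 1 ('high') or 0 ('low')."""
    return [1 if level > threshold else 0 for level in levels]

original = [0, 1, 1, 0, 0, 0, 0, 1]

# Ideal recorded levels: 1.0 for 'high', 0.0 for 'low' (illustrative values).
recorded = [float(bit) for bit in original]

# Simulate mild degradation: each level drifts by at most 0.3 either way.
rng = random.Random(42)
degraded = [level + rng.uniform(-0.3, 0.3) for level in recorded]

# The analog values have drifted, but the decoded bits are unchanged.
print(read_bits(degraded) == original)  # True
```

As long as the drift stays smaller than the distance to the threshold, every spot is still read correctly. An analog recording would have kept every one of those small changes as part of the signal.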
Units of data
The bit, short for BInary digiT, is the smallest unit of data a computer can read. Simply put, it can be either a 1 or 0.
The byte is composed of eight bits.
- 0.125 bytes (1 bit): A binary decision
- 1 byte: A single character
- 10 bytes: A single word
- 100 bytes: A telegram OR A punched card
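A short Python sketch of what eight bits per byte implies. It assumes a single-byte encoding such as ASCII, and the sample word is an arbitrary choice for illustration.

```python
BITS_PER_BYTE = 8

# Each extra bit doubles the number of possible patterns,
# so one byte can take 2**8 distinct values.
patterns_per_byte = 2 ** BITS_PER_BYTE
print(patterns_per_byte)  # 256

# In a single-byte encoding such as ASCII, one character occupies one byte.
print(len("a".encode("ascii")))  # 1

# A ten-letter word therefore takes ten bytes.
print(len("technology".encode("ascii")))  # 10
```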
Kilobyte (1,024 Bytes)
- 1 Kilobyte: A very short story
- 2 Kilobytes: A Typewritten page
- 10 Kilobytes: An encyclopaedic page OR A deck of punched cards
- 50 Kilobytes: A compressed document image page
- 100 Kilobytes: A low-resolution photograph
- 200 Kilobytes: A box of punched cards
- 500 Kilobytes: A very heavy box of punched cards
Megabyte (1,024 Kilobytes)
- 1 Megabyte: 4 books (873 pages of plain text) OR A 3.5-inch floppy disk
- 2 Megabytes: A high-resolution photograph
- 5 Megabytes: The complete works of Shakespeare OR 30 seconds of TV-quality video
- 10 Megabytes: A minute of high-fidelity sound OR A digital chest X-ray
- 20 Megabytes: A box of floppy disks
- 50 Megabytes: A digital mammogram
- 100 Megabytes: 1 meter of shelved books OR A two-volume encyclopedic book
- 200 Megabytes: A reel of 9-track tape OR An IBM 3480 cartridge tape
- 500 Megabytes: A CD-ROM OR The hard disk of a PC
Gigabyte (1,024 Megabytes, or 1,048,576 Kilobytes)
- 1 Gigabyte: A pickup truck filled with paper OR A symphony in high-fidelity sound OR A movie at TV quality. 1 Gigabyte could hold the contents of about 10 yards of books on a shelf.
- 2 Gigabytes: 20 meters of shelved books
- 5 Gigabytes: An 8mm Exabyte tape
- 20 Gigabytes: A high-quality audio collection of the works of Beethoven OR A VHS tape used for digital data
- 50 Gigabytes: A floor of books OR Hundreds of 9-track tapes
- 100 Gigabytes: A floor of academic journals OR A large ID-1 digital tape
Terabyte (1,024 Gigabytes)
- 1 Terabyte: An automated tape robot OR All the X-ray films in a large technological hospital OR 50,000 trees made into paper and printed.
- 1 Terabyte: 1,613 650MB CDs or 4,581,298 books.
- 1 Terabyte: 1,000 copies of the Encyclopedia Britannica.
- 2 Terabytes: An academic research library OR A cabinet full of Exabyte tapes
- 10 Terabytes: The printed collection of the US Library of Congress
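The CD and book figures in the terabyte list can be checked with a little arithmetic. The sketch below assumes the 1,024-based definitions used throughout this article; the per-book size it prints is only what the quoted book count implies, not a figure from an independent source.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB
TB = 1024 * GB

# How many whole 650MB CDs fit in one terabyte?
cd_capacity = 650 * MB
print(TB // cd_capacity)  # 1613

# Average size per book implied by "4,581,298 books" per terabyte.
books_per_tb = 4_581_298
print(round(TB / books_per_tb))  # 240000
```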
Petabyte (1,024 Terabytes, or 1,048,576 Gigabytes)
- 1 Petabyte: 5 years of Earth Observing System (EOS) (at 46 Mbps)
- 1 Petabyte: 20 million 4-door filing cabinets full of text or 500 billion pages of standard printed text.
- 2 Petabytes: All US academic research libraries.
- 20 Petabytes: Production of hard-disk drives in 1995
- 200 Petabytes: All printed material ever OR Production of digital magnetic tape in 1995
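As a rough sanity check on the Earth Observing System figure, the Python sketch below works out how much data a constant stream produces over five years. It assumes the quoted rate means 46 megabits per second (decimal megabits), uninterrupted transmission, and 365-day years; none of these details are stated in the list itself.

```python
BITS_PER_MEGABIT = 1_000_000
SECONDS_PER_YEAR = 365 * 24 * 60 * 60
PB = 1024 ** 5

rate_bits_per_second = 46 * BITS_PER_MEGABIT

# Total bits over five years, converted to bytes (8 bits per byte).
total_bytes = rate_bits_per_second * SECONDS_PER_YEAR * 5 // 8

print(total_bytes)                 # 906660000000000
print(round(total_bytes / PB, 2))  # 0.81
```

Under these assumptions the result is a little over 0.8 petabytes, which agrees with the list entry to within its order-of-magnitude precision.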
Exabyte (1,024 Petabytes)
- 1 Exabyte: The amount of data created on the Internet each day in 2012, or about 250 million DVDs’ worth of information.
- 5 Exabytes: All words ever spoken by human beings.
Zettabyte (1,024 Exabytes)
- 1.3 Zettabytes: Cisco’s estimate of annual traffic over the internet in 2016
Yottabyte (1,024 Zettabytes, or 1,208,925,819,614,629,174,706,176 bytes)
- It’s equal to one septillion (10^24) or, strictly, 2^80 bytes.
- Its name comes from the prefix ‘Yotta’ derived from the Ancient Greek οκτώ (októ), meaning “eight”, because it is equal to 1,000^8.
- In 2010, it would have cost $100 trillion to make a yottabyte storage system made out of the day’s hard drives.
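The whole ladder of units can be generated from a single rule, each unit being 1,024 times the previous one. The Python sketch below builds that table and confirms that a yottabyte defined this way equals 2^80 bytes. The names `UNITS` and `unit_sizes` are made up for this example.

```python
UNITS = ["Kilobyte", "Megabyte", "Gigabyte", "Terabyte",
         "Petabyte", "Exabyte", "Zettabyte", "Yottabyte"]

def unit_sizes():
    """Return {unit name: size in bytes}, each unit 1,024 times the previous."""
    return {name: 1024 ** power for power, name in enumerate(UNITS, start=1)}

sizes = unit_sizes()

# Print every unit with its size in bytes, using thousands separators.
for name, size in sizes.items():
    print(f"{name:<10} {size:,}")

# 1024 is 2**10, so eight steps up the ladder gives 2**80.
print(sizes["Yottabyte"] == 2 ** 80)  # True
```

The decimal figure of one septillion (10^24) is about 17% smaller than 2^80, which is why the two definitions are worth keeping apart at this scale.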
After ‘Yotta’, the officially recognized prefix system comes to a halt, likely because humans haven’t had the need to work with larger quantities of… anything really. There are some other measurement units, however, which go well beyond the Yotta and which are recognized by some experts in their fields. For instance, the brontobyte is 1 followed by 27 zeros, a scale of data that some believe will be reached thanks to the internet of things (smart devices from toasters to fridges to home sensors that constantly transmit and receive data). The Gegobyte is 10 to the power 30, a quantity that by now is futile to count in DVDs or anything like it.