Harvard team turns bacteria into living hard drives

A research team from Harvard University, led by Seth Shipman and Jeff Nivala, has developed a novel method of writing information into the genetic code of living bacterial cells. The cells pass the information on to their descendants, and it can later be read back by genotyping the bacteria.


Storing information in DNA isn’t a new idea; for starters, nature has been doing it for a long, long time. Researchers at the University of Washington have also shown that DNA can be synthesized in the lab with any information they want written into it, and to prove it, they encoded a whole book and some images into DNA strands. But combining the two approaches into an efficient data storage process has been beyond our grasp until now.

“Rather than synthesizing DNA and cutting it into a living cell, we wanted to know if we could use nature’s own methods to write directly onto the genome of a bacterial cell, so it gets copied and pasted into every subsequent generation,” says Shipman. “But working within a living cell is an entirely different story and challenge.”

The team exploited an immune response certain bacteria use to protect themselves from viral infection, called the CRISPR/Cas system. When the bacteria are attacked by viruses, they physically cut out a segment of the invaders’ DNA and paste it into a specific region of their own genome. This way, if the same virus attacks again, the bacteria can recognize it and respond accordingly. The cell also passes this information on to its progeny, transferring the viral immunity to future generations.

The geneticists found that if you introduce a piece of genetic data that looks like viral DNA into a colony of bacteria that have the CRISPR/Cas system, the bacteria will incorporate it into their own genetic code. So Shipman and Nivala flooded a colony of E. coli bacteria that have this system with loose segments of viral-looking DNA, and the cells gulped them all up, essentially becoming tiny, living hard drives.

The segments used were arbitrary strings of A, T, C, G nucleotides with chunks of viral DNA at the end. Shipman introduced one segment of information at a time and let the bacteria do the rest, storing away information like fastidious librarians.
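To make the idea of data written as A, T, C, G strings concrete, here is a minimal Python sketch. The article does not say how the team mapped information onto nucleotides, so the two-bits-per-base table below and the function names bytes_to_dna and dna_to_bytes are assumptions for illustration only, not the researchers' scheme. The sketch also leaves out the chunk of viral DNA attached to the end of each segment.

```python
# Hypothetical 2-bit mapping (A=00, C=01, G=10, T=11).
# The source does not specify the encoding the team actually used.
BASES = "ACGT"


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four nucleotides, most significant bits first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def dna_to_bytes(strand: str) -> bytes:
    """Invert bytes_to_dna; the strand length must be a multiple of four."""
    if len(strand) % 4:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for base in strand[i:i + 4]:
            # BASES.index raises ValueError on anything other than A, C, G, T.
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)
```

Under this assumed mapping every byte costs four nucleotides, so a short message such as b"Hi" becomes an eight-letter strand that dna_to_bytes turns back into the original bytes.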

Conveniently enough, the bacteria store new immune system entries sequentially, with earlier viral DNA recorded before that of more recent infections.

“That’s quite important,” Shipman says. “If the new information was just stored randomly, that wouldn’t be nearly as informative. You’d have to have tags on each piece of information to know when it was introduced into the cell. Here it’s ordered sequentially, like the way you write down the words in a sentence.”

Bugs with the bugs

One issue the team ran into is that not all of the bacteria record every strand of DNA introduced to the culture. So even if you introduce the information step by step, let’s say the numbers from 1 to 5, some bacteria would have “12345” but others may only have “12” or “245” and so on. But Shipman thinks that because you can rapidly genotype thousands or millions of bacteria in a colony and because the data is always stored sequentially, you’ll be able to clearly deduce the full message even with these errors.
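The reasoning above, that many incomplete but correctly ordered records can be pooled into one full message, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the team's analysis pipeline: it assumes every stored segment is distinct (like the numbers 1 to 5), that cells only ever skip segments and never reorder or corrupt them, and the function name reconstruct is made up for the example. Each read contributes "this came before that" constraints, and a topological sort recovers the order if the reads pin it down uniquely.

```python
from collections import defaultdict


def reconstruct(reads):
    """Merge order-preserving partial reads into one full sequence.

    Returns the single ordering consistent with every read, or None if the
    reads contradict each other or leave the order ambiguous.
    """
    successors = defaultdict(set)
    symbols = set()
    for read in reads:
        symbols.update(read)
        # Neighbors within a read give "earlier precedes later" constraints.
        for earlier, later in zip(read, read[1:]):
            successors[earlier].add(later)

    indegree = {s: 0 for s in symbols}
    for earlier in list(successors):
        for later in successors[earlier]:
            indegree[later] += 1

    order = []
    ready = [s for s in symbols if indegree[s] == 0]
    while ready:
        if len(ready) > 1:
            return None  # two candidates for the next slot: ambiguous
        current = ready.pop()
        order.append(current)
        for later in successors[current]:
            indegree[later] -= 1
            if indegree[later] == 0:
                ready.append(later)
    # A leftover symbol means the constraints formed a cycle.
    return order if len(order) == len(symbols) else None
```

For instance, no single read in ["12", "245", "135", "234"] holds the whole message, yet together they force the order 1, 2, 3, 4, 5. With only ["12", "13"] the function returns None, because nothing says whether 2 or 3 came first; that is why sampling thousands of cells matters.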

Shipman adds that the 100 bytes his team demonstrated are nowhere near the limit. Cells like the microorganism Sulfolobus tokodaii could potentially store more than 3,000 bytes of data. And with synthetic engineering, you could design or program specialized hard-drive bacteria with vastly expanded regions of their genetic code, able to rapidly upload vast amounts of data.