99 percent of the microbes inside us are unknown to science

A survey of DNA fragments circulating in the blood suggests the microbes living within us are vastly more diverse than previously known. In fact, 99 percent of that DNA has never been seen before.

Of all the non-human DNA fragments the team gathered, 99 percent failed to match anything in the existing genetic databases the researchers examined.

With that in mind, Mark Kowarsky, a graduate student in Stephen Quake’s lab and the paper’s first author, set about characterizing all of that mystery DNA.

The “vast majority” of it belonged to a phylum called Proteobacteria, which includes, among many other species, pathogens such as E. coli and Salmonella. Previously unidentified viruses in the torque teno family, which are generally not associated with disease but are often found in immunocompromised patients, made up the largest group of viruses.

“We’ve doubled the number of known viruses in that family through this work,” Quake said. Perhaps more important, they’ve found an entirely new group of torque teno viruses. Among the known torque teno viruses, one group infects humans and another infects animals, but many of the ones the researchers found didn’t fit in either group. “We’ve now found a whole new class of human-infecting ones that are closer to the animal class than to the previously known human ones, so quite divergent on the evolutionary scale,” he said.

Human Microbiome Contains an Unexpected Diversity of Novel Phages and Viruses

Of the 2,917 placed novel contigs, 276 (9%) correspond to novel viral sequences, which are predominantly either phages or torque teno viruses (TTVs). Distinguishing between a phage and its bacterial host is difficult with short sequences, as they both are prone to incorporate each other’s genes. Indeed, of the 523 contigs containing phage genes, 333 also have bacterial genes. Nonetheless, identifying these is important, as the contigs with the most predicted genes are all phage or prophage candidates. For the top 15 such contigs, half of the genes have no homology.

Deep sequencing of cfDNA from a large patient cohort revealed previously unknown and highly prevalent microbial and viral diversity in humans. This demonstrates the power of alternative assays for discovery and shows that interesting discoveries may lurk in the shadows of data acquired for other purposes. Many megabases of new sequences were assembled and placed in distant sectors of the tree of life. With deeper sequencing and targeted sample collection, we expect numerous new viral and bacterial species to be discovered in the circulating nucleic acids of organisms that will complement existing efforts to characterize the life within us. Novel taxa of microbes inhabiting humans, while of interest in their own right, also have potential consequences for human health. They may prove to be the cause of acute or chronic diseases that, to date, have unknown etiology and may have predictive associations that permit presymptomatic identification of disease.

PNAS – Numerous uncharacterized and highly divergent microbes which colonize humans are revealed by circulating cell-free DNA


Through massive shotgun sequencing of circulating cell-free DNA from the blood of more than 1,000 independent samples, we identified hundreds of new bacteria and viruses which represent previously unidentified members of the human microbiome. Previous studies targeted specific niches such as feces, skin, or the oral cavity, whereas our approach of using blood effectively enables sampling of the entire body and reveals the colonization of niches which have been previously inaccessible. We were thus able to discover that the human body contains a vast and unexpected diversity of microbes, many of which have highly divergent relationships to the known tree of life.


Blood circulates throughout the human body and contains molecules drawn from virtually every tissue, including the microbes and viruses which colonize the body. Through massive shotgun sequencing of circulating cell-free DNA from the blood, we identified hundreds of new bacteria and viruses which represent previously unidentified members of the human microbiome. Analyzing cumulative sequence data from 1,351 blood samples collected from 188 patients enabled us to assemble 7,190 contiguous regions (contigs) larger than 1 kbp, of which 3,761 are novel with little or no sequence homology in any existing databases. The vast majority of these novel contigs possess coding sequences, and we have validated their existence both by finding their presence in independent experiments and by performing direct PCR amplification. When their nearest neighbors are located in the tree of life, many of the organisms represent entirely novel taxa, showing that microbial diversity within the human body is substantially broader than previously appreciated.
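As a quick sanity check on the counts quoted in this article, the proportions can be recomputed directly. The sketch below is only arithmetic on the published figures; the helper name `share` and the variable names are illustrative choices, not anything from the paper.

```python
def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal place."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole, 1)


# Counts as quoted in the article text (variable names are hypothetical).
total_contigs = 7190         # assembled contigs larger than 1 kbp
novel_contigs = 3761         # contigs with little or no database homology
placed_novel_contigs = 2917  # novel contigs placed in the tree of life
novel_viral_contigs = 276    # placed novel contigs that are viral
phage_gene_contigs = 523     # contigs containing phage genes
phage_and_bacterial = 333    # of those, contigs that also carry bacterial genes

print("novel share of assembled contigs:", share(novel_contigs, total_contigs), "%")
print("viral share of placed novel contigs:", share(novel_viral_contigs, placed_novel_contigs), "%")
print("phage-gene contigs with bacterial genes:", share(phage_and_bacterial, phage_gene_contigs), "%")
```

Roughly half of the assembled contigs are novel, and the viral share agrees with the 9% figure reported in the paper. Nearly two thirds of the contigs carrying phage genes also carry bacterial genes, which is why separating a phage from its host is hard with short sequences.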