CrAss-like phages (crassviruses) are an order of bacterial viruses (bacteriophages) that represent the most abundant viruses in the human gut, discovered in 2014 by cross-assembling reads in human fecal metagenomes. In silico comparative genomics and taxonomic analysis have found that crAss-like phages represent a highly abundant and diverse group of viruses. CrAss-like phages were predicted to infect bacteria of the phylum Bacteroidota, and the prediction was confirmed when the first crAss-like phage (crAss001) was isolated on a Bacteroidota host (Bacteroides intestinalis) in 2018. Crassviruses were formerly classified in the now-abolished family of podoviruses; they possess short non-contractile tails and icosahedral capsids. The first 3D structure of a crassvirus was determined by cryo-EM in 2023. Although the presence of crAss-like phages in the human gut has not yet been linked to any specific health condition, they are generally associated with a healthy gut microbiome and likely have a significant impact on gut Bacteroidota.
== Discovery ==
The crAss (cross-assembly) software used to discover the first crAss-like phage, p-crAssphage (prototypical-crAssphage), relies on cross-assembling reads from multiple metagenomes obtained from the same environment. The goal of cross-assembly is for unknown reads from one metagenome to align with known reads, or with reads that are similar to known reads, in another metagenome, thereby increasing the total number of usable reads in each metagenome. The crAss software is an analysis tool for cross-assemblies that specializes in reference-independent comparative metagenomics. CrAss assumes that a contig made up of reads from different metagenomes (a cross-contig) represents a biological entity present in each of those metagenomes. P-crAssphage was discovered when crAss was used to analyze the cross-assembly of twelve human fecal metagenomes. Several cross-contigs consisting of unknown reads were identified in all twelve individuals, and through re-assembly techniques the p-crAssphage genome was reconstructed. P-crAssphage has a ~97 kbp circular DNA genome that contains 80 predicted open reading frames. Using co-occurrence analysis and CRISPR spacer similarities, the phage was predicted to infect Bacteroidota bacteria, which are dominant members of the gut microbiome in most individuals.
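The idea of a cross-contig can be illustrated with a minimal Python sketch. It is not the crAss software itself or its interface: the function name `find_cross_contigs`, the `min_samples` parameter, and the toy contig and sample identifiers are all hypothetical, and the input is assumed to be a list recording which contig each read was assembled into and which metagenome the read came from.

```python
from collections import defaultdict


def find_cross_contigs(read_assignments, min_samples=2):
    """Return contigs whose reads come from at least `min_samples` metagenomes.

    read_assignments: iterable of (contig_id, sample_id) pairs, one per read.
    Hypothetical helper for illustration; not part of the crAss tool.
    """
    # Count reads per sample for every contig.
    counts = defaultdict(lambda: defaultdict(int))
    for contig_id, sample_id in read_assignments:
        counts[contig_id][sample_id] += 1

    # Keep only contigs built from reads of several different metagenomes.
    return {
        contig: dict(per_sample)
        for contig, per_sample in counts.items()
        if len(per_sample) >= min_samples
    }


# Toy data (invented): each tuple is one read placed in a contig.
reads = [
    ("contig_1", "sample_A"),
    ("contig_1", "sample_B"),
    ("contig_1", "sample_B"),
    ("contig_2", "sample_A"),
    ("contig_3", "sample_A"),
    ("contig_3", "sample_B"),
    ("contig_3", "sample_C"),
]

cross = find_cross_contigs(reads, min_samples=2)
for contig, per_sample in sorted(cross.items()):
    print(contig, per_sample)
```

In this toy data, `contig_2` is excluded because all of its reads come from a single metagenome. Setting `min_samples` to the total number of metagenomes would select only contigs present in every sample, analogous to the cross-contigs found in all twelve individuals.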