"There are humans walking around - and you and I are probably among them - that have missing genes or extra copies of genes, and are fine," Evan Eichler, associate professor of genome sciences at the University of Washington, told BioWorld Today.
That is his conclusion from data, to be published in Nature Genetics and now available via advance online publication, on structural variation of the human genome at an intermediate scale.
"In the past, there were two big ways of looking at the genome," Eichler explained. One of them, the cytogenetic approach, looked at whole chromosomes and their banding patterns. While it is useful for detecting genetic disorders at the chromosomal level, such as trisomy 21, it's certainly not what one might call detailed.
The other method, currently receiving the most research attention, is the ultimate detailed method: SNP analysis. That detects genomic variations consisting of just a single base pair.
"With this method, we are peering into the genome at a new level that was previously almost impossible to see," Eichler said.
"This method" is the use of fosmid DNA to find structural variation at an intermediate scale. There is plenty of intermediate space between a chromosome and a base pair, of course; fosmids are cloning vectors that produce clones of about 40 kilobases in length.
The researchers, from the University of Washington School of Medicine in Seattle, Case Western Reserve University in Cleveland and the University of California at San Francisco, compared the human genome reference sequence with a second sequence, from a different individual, that had been cloned into such fosmids. For each fosmid, they checked whether its two ends matched up with the reference sequence. If they did, the sequence in the middle, which can and presumably does contain single nucleotide polymorphisms, cannot differ from the reference in the gross amount of DNA that it contains.
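The end-matching check described above can be sketched in a few lines of Python. This is only an illustration of the general idea, not the researchers' actual pipeline: the `EndPair` record, the `classify` function and the 32,000- and 48,000-base cutoffs are all hypothetical and were chosen simply to bracket the roughly 40-kilobase clone length mentioned in the article.

```python
from dataclasses import dataclass

# Hypothetical cutoffs bracketing the ~40 kb fosmid length; not from the article.
MIN_SPAN = 32_000
MAX_SPAN = 48_000


@dataclass
class EndPair:
    """Where the two end sequences of one fosmid land on the reference."""
    left_pos: int       # reference coordinate of one end
    right_pos: int      # reference coordinate of the other end
    left_strand: str    # '+' or '-'
    right_strand: str   # '+' or '-'


def classify(pair: EndPair) -> str:
    """Label one fosmid by how its ends sit on the reference sequence."""
    # The two ends of a clone normally land on opposite strands.
    # If they land on the same strand, the clone may span an inversion.
    if pair.left_strand == pair.right_strand:
        return "possible inversion"

    span = abs(pair.right_pos - pair.left_pos)
    if span > MAX_SPAN:
        # Ends are too far apart on the reference: the reference holds
        # DNA that the cloned individual lacks.
        return "possible deletion"
    if span < MIN_SPAN:
        # Ends are too close together on the reference: the clone holds
        # DNA that the reference lacks.
        return "possible insertion"
    return "concordant"


if __name__ == "__main__":
    print(classify(EndPair(100_000, 140_000, "+", "-")))
    print(classify(EndPair(100_000, 160_000, "+", "-")))
    print(classify(EndPair(100_000, 120_000, "+", "-")))
    print(classify(EndPair(100_000, 140_000, "+", "+")))
```

Note that the labels are relative to the reference: a fosmid whose ends map too far apart on the reference suggests a deletion in the cloned individual, while ends that map too close together suggest an insertion.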
"In 99 percent of cases, they do [match up]," Eichler said. "But in a subset, they are inserted wrong, or [the fosmid DNA] is too small, or too big."
In the second part of the paper, the researchers compared the fosmid mismatches they had identified to a larger reference dataset to validate them. Eichler said, "About 75 percent of them were real and polymorphic" - a total of almost 300 insertions, deletions and inversions, ranging from 8 to 40 kilobases in size, more than 580 megabases of sequence.
Asked about the respective functional significance of insertions, deletions and inversions, Eichler said that while the functional significance differs from gene to gene, overall, "extra copies are probably less deleterious than deletions. Inversions are tricky, because you may or may not change the expression level of that gene."
About half of the variations Eichler and his colleagues identified were in what he terms "environmental sensor genes" - genes that are associated with functions such as drug detoxification, innate immunity and inflammation, among others. Eichler and his team wrote in the Nature Genetics article that "although many of these 'environmental sensor' genes may not be essential for viability, they may be an important component of adaptability. Gains and losses of several of these genes are known risk factors for disease."
To further nail down the association between the variations they identified and disease, Eichler and his colleagues plan, among other things, to investigate larger populations for the prevalence of the variation they have uncovered and possible correlations with specific diseases.
"This could solve a lot of problems - not all of them, but a lot," Eichler said, adding that to make the technology more widely used, "the main thing is that the price has to come down. Right now it's a million dollars a pop, and that has to come down by at least two orders of magnitude."