Releasing the Truth

Digging for knowledge…

Human–chimp DNA similarity re-evaluated

The common claim that the human and chimpanzee (chimp) genomes are nearly identical was found to be highly questionable, based solely on an analysis of the methodology and data outlined in an assortment of key research publications. Reported high DNA sequence similarity estimates are primarily based on prescreened biological samples and/or data. Data too dissimilar to be conveniently aligned was typically omitted, masked and/or not reported. Furthermore, gap data from final alignments was also often discarded, further inflating final similarity estimates. It is these highly selective data-omission processes, driven by Darwinian dogma, that produce the commonly touted 98% similarity figure for human–chimp DNA comparisons. Based on the analysis of data provided in various publications, including the often cited 2005 chimpanzee genome report, it is safe to conclude that human–chimp genome similarity is not more than ~87%, and possibly not higher than 81%.



A common claim is that the DNA of chimpanzees (Pan troglodytes) and humans (Homo sapiens) is about 98% similar. This oversimplified and often-touted estimate can actually involve two completely separate concepts: 1) gene content (the comparative counts of similar types of coding sequences present or absent between different species) and 2) similarities between the actual base pairs of DNA sequences in alignments. For the most part, the modern similarity paradigm refers to DNA sequence alignment research. Biological sequence data often goes through several levels of prescreening, filtering and selection before being summarized and discussed.

One of the major problems with overall research in the field of comparative genetics, as we will show, is that in most studies there is a great deal of preselection applied to the available biological samples and data before the final analysis is undertaken. Only the most promising data from a larger pool is typically extracted for a final analysis.


Early human–chimp studies with reassociation kinetics

The initial estimates of high human-chimp DNA similarity came from a field of study called reassociation kinetics. These initial reports fueled early claims by such popular evolutionary luminaries as Oxford Professor Richard Dawkins, who stated “Chimpanzees and we share more than 99 per cent of our genes.” At the time, this statement was presumptuous, because gene numbers for humans and chimps were not known. The initial drafts of the human and chimp genomes were not announced until 2001 and 2005, respectively.

The supposed gene data Dawkins referred to in 1986 was an indirect estimate based on the reassociation kinetics of mixed human and chimp DNA, not clearly defined genes.1 In reassociation kinetics, heat and/or chemistry are used to separate double-stranded DNA into single strands. When the DNA is allowed to reassociate in a controlled manner, it can be fractionated using various protocols. The slower the reassociation, the more complex and gene-dense the DNA is thought to be. In general, three types of DNA can be recovered: high-copy (highly repetitive, gene poor), low-copy (moderately repetitive, low levels of genes), and single copy (gene-rich). For comparative studies, the single copy fraction of DNA is collected from two species, mixed together, disassociated and allowed to reassociate so that human and chimp DNA can recombine. The level of complementary base matching between strands is then inferred indirectly, using a variety of methods that measure rates/levels of reassociation.

The caveat is that only the single-copy fractions of the human and chimp genomes were utilized to obtain early estimates of similarity. Scientists focused on the single-copy fraction because of the high gene content. However, many genes are located in the other genome fractions and were thus left out of the analysis. Another problem is that virtually the entire genome is now known to be functional in some aspect and the non-coding regions have been shown to provide many critical control features and nucleotide templates.


Genomics research—affirming the myth

Subsequent research using sequenced DNA built upon the early high similarity dogma established by reassociation kinetics. In a companion to this paper, we discuss the possibility that an unspoken dogma-based ‘Gold Standard’ regarding the human–chimp similarity issue was established during the initial studies involving reassociation kinetics.

A review paper written by creationist Todd Wood on biological similarity between human and chimp highlighted and supposedly confirmed evolutionary similarity claims, yet ignored the important bioinformatic issues surrounding widespread data omission and selective analyses. Wood’s review did little to support creationist claims that humans were uniquely created in the image of God rather than being a few DNA base pairs from a chimp. Therefore, our focus on DNA sequence similarity will address the same publications listed in Wood’s review in addition to several more recent papers.


| Study | Total genomic bases analyzed | Aligned bases | Reported DNA identity | Actual DNA identity* |
|---|---|---|---|---|
| Britten, 2002 | | | | ~87% |
| Ebersberger et al., 2002 | | | | < 65% |
| Liu et al., 2003 | 10,600,000 (total for human, chimp, baboon, and marmoset) | 4,968,069 (human–chimp) | 98.9% no indels | |
| Wildman et al., 2003 | ~90,000 (exons from 97 genes) | | | |
| Chimp. Chrom. 22 Consort. | | | 98.5% excluding indels | 80–85% including indels |
| Nielsen et al., 2005 | | | 99.4% selected gene regions | |
| Chimp. Seq. Consort. 2005 | Whole genome (5X redundant coverage) | 2.4 Gb | | |

* Based on the amount of omitted DNA sequence in the alignments

** Compared to data from The International Human Genome Sequencing Consortium (2004): ((0.9577 × 2.4 Gb) / 2.85 Gb) × 100

? Cannot calculate actual percent identity because data was not provided.

Roy Britten, one of the early pioneers in DNA reassociation kinetics, compared the genomic sequence from five chimp large-insert DNA clones (Bacterial Artificial Chromosomes, or BACs) to human genomic sequence using an atypical Fortran-based computer program. These five chimp BAC sequences were chosen because they were the only ones then available. Researchers typically choose initial seed BACs for genome sequencing because of their single-copy DNA content, which makes them easier to assemble and compare to other species. The total length of the DNA sequence for all five BACs was 846,016 bases. However, only 92% of this was alignable to human DNA, so the final statistics were reported on only 779,132 bases. To his credit, Britten included the alignment data on insertions and deletions (indels) and reported a human–chimp similarity of ~95%. However, a more realistic figure would include the complete high-quality sequence of all five BACs, which is just as legitimate as the indels within the alignments, giving a final DNA similarity of 87%.
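The arithmetic behind the 87% figure can be sketched as follows. This is a minimal illustration, not Britten's own calculation: it uses only the base counts and the ~95% identity quoted above, and it assumes that every base outside the aligned regions is counted as a non-match. The function name is hypothetical.

```python
# Figures quoted in the text for the five chimp BACs (Britten, 2002).
TOTAL_BAC_BASES = 846_016      # total sequence in all five BACs
ALIGNED_BASES = 779_132        # bases that could be aligned to human DNA
REPORTED_IDENTITY = 0.95       # "~95%" similarity in aligned regions, indels included


def whole_sequence_similarity(aligned, total, identity):
    """Scale the identity of the aligned regions by the share of sequence aligned.

    Assumption for illustration: non-aligned bases contribute no matches.
    """
    return identity * aligned / total


alignable_fraction = ALIGNED_BASES / TOTAL_BAC_BASES
overall = whole_sequence_similarity(ALIGNED_BASES, TOTAL_BAC_BASES, REPORTED_IDENTITY)

print(f"Alignable fraction: {alignable_fraction:.1%}")
print(f"Similarity over all five BACs: {overall:.1%}")
```

Under that assumption the alignable fraction comes out at about 92% and the overall figure falls between 87% and 88%, in line with the numbers given above.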


Figure 1. Illustration showing the caveats of a hypothetical pairwise alignment between homologous sequences from two different species.

Another notable study, published by Ebersberger et al. the same year as Britten’s paper, utilized chimp genome sequence obtained from randomly sheared, size-selected fragments in the 300 to 600 base range. These DNA sequences were aligned to an early version of the human genome assembly using the BLAT (BLAST-Like Alignment Tool) algorithm. Researchers selected two-thirds of the total sequence for more detailed analyses. One-third of the chimp sequence would not align to the human genome and was discarded. The methods section in the paper19 describes how the subset of prescreened data was further filtered to obtain only the very best alignments. The resulting data was then subjected to a variety of comparative analyses that, for all practical purposes, are completely meaningless given the extremely high level of selection, data masking, and filtering applied. Not surprisingly, they report only a 1.24% difference in only highly similar aligned areas between human and chimp. A more realistic sequence similarity is not more than 65%.

Shortly after these initial human–chimp comparison papers, a disturbing trend quickly emerged. This trend involved only reporting final alignment results and omitting the specific details of how such data was filtered, masked and selected. Key data to allow critical readers of human–chimp similarity papers to calculate a more accurate overall similarity began to be consistently omitted. For example, Liu et al. reported on the alignment of human genomic sequence with chimp, baboon, and marmoset. Important information concerning the starting set of sequences and specific data for the alignments was omitted. They state only that they used a total amount of 10.6 Mb of sequence for all species combined. Their similarity estimate on the final alignment, omitting indels and non-aligned areas, was 98.9%. Including indels, we derived a value of 95.6% for the alignments, similar to Britten’s research. Important data outside the aligned areas was impossible to evaluate because of the omitted sequence data.
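The gap between an identity figure that omits indels and one that includes them can be illustrated with a toy alignment. The sketch below is only an assumption-laden example: the two sequences are invented for illustration and come from no published data set, and counting each gap column as one difference is just one of several possible conventions. The function name is hypothetical.

```python
def percent_identity(seq_a, seq_b, count_gaps):
    """Percent identity of two pre-aligned sequences of equal length.

    Gaps are written as '-'. With count_gaps=False, gap columns are skipped
    entirely; with count_gaps=True, each gap column counts as a mismatch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    columns = 0
    for a, b in zip(seq_a, seq_b):
        is_gap = a == "-" or b == "-"
        if is_gap and not count_gaps:
            continue  # column ignored, as when indels are omitted
        columns += 1
        if not is_gap and a == b:
            matches += 1
    if columns == 0:
        raise ValueError("no columns left to compare")
    return 100.0 * matches / columns


# Hypothetical toy alignment: one substitution and two 2-base indels.
human = "ACGTACGTAC--GTACGTAC"
chimp = "ACGTACGAACTTGTAC--AC"

print(percent_identity(human, chimp, count_gaps=False))
print(percent_identity(human, chimp, count_gaps=True))
```

The same alignment yields a noticeably lower figure once gap columns are counted, and sequence that never entered the alignment is not reflected in either number.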

Another disturbing trend is that only highly conserved protein-coding sequences (exons) are often utilized to report genome-wide similarity. We now know that non-protein-coding sequences, which comprise greater than 95% of the genome, are critical to all aspects of genetics and genome function. Typical of the trend to only align exonic sequences, Wildman et al. reported on a study that compared only human and chimp protein-coding regions: exon fragments from 97 genes, for a total of 90,000 bases.

In 2004, Watanabe et al. used a variety of BAC libraries to select clones for DNA sequencing representing chimp chromosome 22. The sequence was then compared to its human homolog. The caveat is that the individual chimp BAC clones were only selected if they each contained 6 to 10 human DNA markers. Unfortunately, critical overall DNA alignment statistics are not given in the paper or in the supplemental information. The authors state a nucleotide substitution rate of 1.44% in aligned areas, but do not give similarity estimates to include indels. While indels are omitted from the alignment similarity, the authors indicate that there were 82,000 of them and provide a histogram that graphically shows the size distribution based on binned data groupings. Oddly, no data for average indel size or total indel length was provided. Likewise, the number of sequence gaps was given, but nothing about cumulative gap size. Using the limited graphical data provided regarding base substitutions and indels, an overall similarity of about 80 to 85% can be inferred.

One of the most ambiguous of all human–chimp studies was published by Nielsen et al. In keeping with the established obfuscational trend, only highly conserved exons were used and no data were given to allow one to calculate any type of real overall similarity. Of the total starting number of gene sequences in the analysis (20,361), the researchers decided to throw out 33% (6,630) in an ambiguously stated “very conservative quality control”. In other words, one-third of the initial chimp data did not align to human, so it got tossed out. In fact, no hard data was actually given.


Chimpanzee rough draft genome assembly data—81% similarity?


The major milestone publication regarding human–chimp genome comparisons was the 2005 Nature paper from the International Chimpanzee Genome Sequencing Consortium.4 Unfortunately, this paper followed the previously established trend where most of the comparative data was given in a highly selective and obfuscated format and detailed information about the alignments was absent. The majority of the paper was primarily concerned with a variety of hypothetical evolutionary analyses for various divergence rates and selective forces. Hence, the critical issue of overall similarity was carefully avoided.

However, based on the numbers given in the chimp genome paper, one can determine a rough overall genome similarity between humans and chimp by including published concurrent information from the human genome project. In regards to the overall alignment, the authors state, “Best reciprocal nucleotide-level alignments of the chimpanzee and human genomes cover ~2.4 gigabases (Gb) of high-quality sequence”. At this time, the human euchromatic assembly was estimated to be 99% complete at 2.85 Gb and had an error rate of 1 in 100,000 bases. The chimp genome authors state, “The indel differences between the genomes thus total ~90 Mb. This difference corresponds to ~3% of both genomes and dwarfs the 1.23% difference resulting from nucleotide substitutions.”

In summary, only 2.4 Gb of chimp sequence aligned onto the highly accurate and complete human genome (2.85 Gb), an operation that included the masking of low complexity sequences. For the chimp sequence that aligned, the data for substitutions and indels indicates 95.8% similarity, a biased figure which excludes the masked regions. Using these numbers, a conservative estimate of genome-wide similarity between chimp and human DNA is 80.6%.
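The genome-wide estimate can be reproduced from the quoted figures alone. The sketch below assumes that the similarity within aligned sequence is obtained by simply subtracting the quoted substitution (1.23%) and indel (~3%) differences from 100%, and that the ~2.4 Gb of aligned sequence is scaled against the 2.85 Gb human euchromatic assembly, following the formula in the table footnote. The variable names are illustrative.

```python
# Figures quoted from the 2005 chimp genome paper and the human genome project.
SUBSTITUTION_DIFF = 0.0123   # 1.23% difference from nucleotide substitutions
INDEL_DIFF = 0.03            # ~3% difference from indels (~90 Mb)
ALIGNED_GB = 2.4             # best reciprocal alignments, in gigabases
HUMAN_ASSEMBLY_GB = 2.85     # human euchromatic assembly, in gigabases

# Similarity within the aligned portion only (assumed simple subtraction).
aligned_identity = 1 - SUBSTITUTION_DIFF - INDEL_DIFF

# Scale by the share of the human assembly covered by the alignments.
genome_wide = aligned_identity * ALIGNED_GB / HUMAN_ASSEMBLY_GB

print(f"Similarity in aligned sequence: {aligned_identity:.2%}")
print(f"Genome-wide estimate: {genome_wide:.1%}")
```

With these inputs the aligned-sequence similarity is 95.77% and the genome-wide estimate is 80.6%.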


The paradigm starts to crumble


A study by Ebersberger et al. used a large pool of human, chimp, orangutan, rhesus and gorilla genomic sequences to construct phylogenies (multiple alignments analyzed in evolutionary tree format). The original pool of DNA sequences actually went through several levels of selection to preanalyze, trim and filter them for optimal alignment. First, a set of 30,112 sequences was selected that shared homology (overlapping similarity) between the five species. These sequences were aligned, and only those which produced ≥ 300 base alignments were retained for another series of alignments. Of these, only the sequences that produced superior statistical probabilities (> 95%) were used in the final analysis. This filtering process removed over 22% of already-known, pre-selected homologous sequence. Despite all of this data filtering designed to produce the most favourable evolutionary alignment and trees, the results did not show any clear path of ancestry for humans with chimps or any of the great apes. What emerged was a true mosaic of unique human and primate DNA sequences, discounting any clear path of common ancestry. Perhaps the best summary of the research can be found in the authors’ own words.

“For about 23% of our genome, we share no immediate genetic ancestry with our closest living relative, the chimpanzee.

“Thus, in two-thirds of the cases a genealogy results in which humans and chimpanzees are not each other’s closest genetic relatives. The corresponding genealogies are incongruent with the species tree. In accordance with the experimental evidences, this implies that there is no such thing as a unique evolutionary history of the human genome. Rather, it resembles a patchwork of individual regions following their own genealogy.”


The Y-chromosome

One of the most intriguing studies is the Y-chromosome comparison between humans and chimps. In this study, the male-specific region of the Y-chromosome (MSY) was compared between human and chimp. The result was 25,800,000 bases of highly accurate chimp Y-chromosome sequence distributed among eight contiguous segments. When compared to the human Y-chromosome, the differences were enormous. The authors state, “About half of the chimpanzee ampliconic sequence has no homologous, alignable counterpart in the human MSY, and vice versa.”

The ampliconic sequence contains ornate repeat units (called palindromes) that read the same forwards as they do backwards. Dispersed within these palindromes are families of genes that are expressed primarily in the male testes. Not only did 50% of this type of sequence fail to align between human and chimp in the Y-chromosome, humans had over twice as many total genes (60 in humans vs 25 in chimp). There were also three complete categories of genes (gene families) found in humans that were not even present in chimps. Related to this large difference in gene content, the authors note, “Despite the elaborate structure of the chimpanzee MSY, its gene repertoire is considerably smaller and simpler than that of the human MSY,” and “the chimpanzee MSY contains only two-thirds as many distinct genes or gene families as the human MSY, and only half as many protein-coding transcription units.”

A comparison of the so-called X-degenerate gene regions between humans and chimps also showed distinct organizational and locational differences in addition to differences in gene content. In fact, humans have three types (classes) of X-degenerate genes that are not even present in chimps.

Besides the large differences in gene content between human and chimp MSY regions, the overall structural differences were enormous. Take note of some of the additional comments from the authors:

“Moreover, the MSY sequences retained in both lineages have been extraordinarily subject to rearrangement: whole chromosome dot-plot comparison of chimpanzee and human MSYs shows marked differences in gross structure.

“The chimpanzee ampliconic regions are particularly massive (44% larger than in human) and architecturally ornate, with 19 palindromes (compared to eight in human) and elaborate mirroring of nucleotide sequences between the short and long arms of the chromosome, a feature not found in the human MSY.

“Of the 19 chimpanzee palindromes, only 7 are also found in the human MSY; the other 12 are chimpanzee-specific. Unlike the human MSY, nearly all of the chimpanzee MSY palindromes exist in multiple copies.”

The large differences in both structural arrangements of unique DNA features and gene content described in the Y-chromosome study are particularly damaging to the human–chimp DNA similarity mythos and the dogma of primate evolution. In fact, the authors shockingly note that given “ … 6 million years of separation, the difference in MSY gene content in chimpanzee and human is more comparable to the difference in autosomal gene content in chicken and human, at 310 million years of separation.”

A large study of genetic variation in the human genome showed that the Y-chromosome was exceptionally stable and had five times less genetic variation than the autosomes. This data makes perfect sense because the Y-chromosome has no similar homolog in the genome and undergoes very little recombination with the X-chromosome during meiosis. Given this lack of recombination and sequence diversity on the Y-chromosome, the primate evolution model encounters a serious problem, because the human and chimp Y-chromosomes should be considerably more similar to each other.

Some cases of high similarity may be due to contamination

Another factor to consider in the human–chimp similarity debate is that some cases of high sequence similarity may be due to contamination. Not only is the chimpanzee genome assembly still largely based on the human genomic framework, it also now appears that the widespread contamination of non-primate databases with human DNA is a serious problem and can run as high as 10% in some cases. Human contamination arises during the cloning of DNA fragments in the lab for sequencing, where human cells are introduced by coughing, sneezing, and physical contact with contaminated fingers.

A recent webpage at the Ensembl database (a joint bioinformatics project between EMBL-EBI and the Wellcome Trust Sanger Institute), titled ‘Chimp Genebuild’, provides the following information as to one of the ways in which the human genome is used as a guide to assemble and annotate the chimp genome:

“Owing to the small number of proteins (many of which aligned in the same location) an additional layer of gene structures was added by projection of human genes. The high-quality annotation of the human genome and the high degree of similarity between the human and chimpanzee genomes enables us to identify genes in chimpanzee by transfer of human genes to the corresponding location in chimp.

“The protein-coding transcripts of the human gene structures are projected through the WGA [whole genome alignment] onto the chromosomes in the chimp genome. Small insertions/deletions that disrupt the reading-frame of the resultant transcripts are corrected for by inserting ‘frame-shift’ introns into the structure.”





5 responses to “Human–chimp DNA similarity re-evaluated”

  2. Ryan June 5, 2013 at 16:34

    Good article.
    Any info on the similarities to other animals?
    In other words is there any info saying something like “Hey we are really no closer related to chimps than we are ducks.”

    • adonizedek June 10, 2013 at 15:45

      Hello! Well, I haven’t any specific number in regard to other animals, it’s said that we share ~50% of similarity with bananas lol

      However, it’s a complicated subject, there are species, like wheat, certain onions, bacteria that have much more genetic material than humans, but this doesn’t cause them to be more complex, advanced organisms, nor similar to us. And most gene similarity is only due to biochemical, organic functions common to any living being, like protein productions, cell maintenance, reproduction, energy storing, and so on..

      To use genetic, functional similarities as “evidence” for common origin is wrong. God bless!

  3. Sam Manning June 5, 2013 at 15:57

    The fib in all this is that data were omitted. A simple Google Scholar search returns many papers in which various types of noncoding DNA have been used in these analyses. For that reason alone, I find Tomkins to be unreliable. He is, after all, trying to save his soul, not follow the evidence.

    • adonizedek June 10, 2013 at 15:56

      Yes, at least Nature presented proper data, so that other researchers like him could make the correct analysis, thereby showing the reality, i.e., that the alleged high similarity between humans and chimps is nothing but bias.
