AT and CG: What are their percentages?

DNA is written with four bases: A, T, C and G. A pairs with T, so in double-stranded DNA the two occur in equal amounts. C pairs with G, so those two also occur in equal amounts. What is the average ratio of AT to CG?

Intuitively, or from information theory, one would expect the average ratio of AT to CG to be one, since using all four bases equally often would maximize the efficiency of information storage. However, the observed ratio does not seem to be one.

There is a standard term for this quantity: GC-content. It is defined as (G+C)/(G+C+A+T). GC-content varies from species to species. The GC-content of the human genome is around 40%, and the genomes of many other species are also around 40%. Why 40%? Why not 50%?

An AT pair is held together by two hydrogen bonds, while a CG pair is held together by three. We might say that AT pairs are cheaper to make than CG pairs, and hence expect more AT than CG. This is what we see. Indeed, the ratio of AT to CG in the human genome is roughly 3:2 (60% to 40%), the inverse of the 2:3 ratio of hydrogen bonds between AT and CG pairs.

Of course, GC-content varies with species. Randomness could be one factor. Environment could be another. For example, in a very volatile environment, the stronger bonding of GC pairs, with their three hydrogen bonds, could be an advantage. In that case, higher GC-content might be expected.

Reference
GC-content https://en.m.wikipedia.org/wiki/GC-content
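As a small illustration of the GC-content definition, (G+C)/(G+C+A+T), and of the AT to CG ratio discussed in the text, here is a minimal Python sketch. The function names and the toy sequence are made up for this example; the sequence is not real genomic data and was chosen only so that its GC-content comes out at 40%. Characters other than A, T, C and G are ignored, which is an assumption of this sketch rather than part of the definition.

```python
from collections import Counter


def base_counts(sequence: str) -> tuple:
    """Return (A+T count, G+C count); any other character is ignored."""
    counts = Counter(sequence.upper())
    return counts["A"] + counts["T"], counts["G"] + counts["C"]


def gc_content(sequence: str) -> float:
    """GC-content as defined in the text: (G+C)/(G+C+A+T)."""
    at, gc = base_counts(sequence)
    if at + gc == 0:
        raise ValueError("sequence contains no A, T, C or G")
    return gc / (at + gc)


def at_to_gc_ratio(sequence: str) -> float:
    """Number of A and T bases divided by number of G and C bases."""
    at, gc = base_counts(sequence)
    if gc == 0:
        raise ValueError("sequence contains no G or C")
    return at / gc


if __name__ == "__main__":
    # Hypothetical toy sequence: 3 A, 3 T, 2 G, 2 C.
    toy = "AAATTTGGCC"
    print(f"GC-content: {gc_content(toy):.2f}")       # prints "GC-content: 0.40"
    print(f"AT:GC ratio: {at_to_gc_ratio(toy):.2f}")  # prints "AT:GC ratio: 1.50"
```

With a GC-content of 40%, the AT to CG ratio is 60/40 = 1.5, which is the 3:2 figure quoted for the human genome.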