Talk:Human genome/Archive 1

This is an archive of past discussions about Human genome. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

Number of Base Pairs

At present the page does not state the size of the human mitochondrial DNA. It's needed. —Preceding unsigned comment added by 171.64.168.252 (talk) 21:35, 31 January 2011 (UTC)

The size of the Homo Sapiens Genome is often quoted to be "3 billion base pairs", and this includes many of my Biochemistry Lectures, but apparently it is more like 3.9 billion (cf. http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?mode=Info&id=9606&lvl=3&lin=f&keep=1&srchmode=1&unlock ). Personally, I heard either at any level of academia. Should we change it to 3.9?

I'm not exactly sure what you're referring to on the page you linked, but if it's the first row of the table entitled "Entrez records" which says "3,924,784" in the "Nucleotide" row, that number refers to the number of Entrez-Nucleotide records in NCBI's database. An Entrez-Nucleotide record is typically the cDNA sequence of some mRNA. This has nothing at all to do with genome size.

AFAIK, the only major source of uncertainty is whether one counts heterochromatin and gaps in the HGP reference sequence or not, but I think they are included in the 3 billion bp number (i.e. the alternative number would be smaller). Can you find another source giving a higher number?

--Mike Lin (talk) 21:20, 23 November 2008 (UTC)

All chromosomes in the human genome are now considered "finished", and the assembly hasn't changed in recent NCBI releases. The standard for what's considered "finished" is available here, and is pretty strict. Barring further investigation (say, by crawling through the individual chromosome finishing papers), it looks to me like the only gaps that remain are in ambiguities due to tandem repeats, which aren't really that important. The number as given is accurate, although if you compute the total I think it comes to 3.1 billion. Anyone interested in verifying can do so via the UCSC genome browser. Graft | talk 20:25, 24 November 2008 (UTC)

Error in the Chromosomes section

"It was well with in the boundrys of the five minute rule. No its the five second rule!? No no its the five minute rule I looked it up on wikipedia!" appears in the beginning of the text of the "Chromosomes" section. I couldn't find it in the edit window. Vandalism, to be sure -- but I don't know how to fix it. —Preceding unsigned comment added by 71.94.86.197 (talk) 15:32, 29 March 2008 (UTC)

DNA expression

can we explain DNA as the "edge" flutterings between dark energy and the light of reality, understanding that the reality created is expressing delicately controlled energy above H bomb level output ? ... ommmmmmmmmm mane 69.121.221.97 (talk) 04:05, 20 May 2011 (UTC)

Size of the human genome

I'm confused by the number of coding genes. HGP gives 30,000 CODING genes. Wiki gives 20,000 - 25,000 TOTAL genes with only 1.5% and 2.0 % coding. Can someone reconcile these discrepancies?

Thanks, Norm

The article says there are 3 billion base pairs. If each base pair has 4 possible values, at 2 bits per base pair, this adds up to about 715MB of digital data, just a bit more than is storable on a CD (you could probably store it on a CD if you WinZipped it). Are my calculations correct, and is this interesting/relevant enough (as a more comprehensible size reference) to put in the article?

Edit: correction, you might only need 1 bit per base pair (since there are only 2 possible values). Is this correct? If so, the human genome would be about 360MB.

Indeed, with a parsimonious encoding, one could store the human genome in well under 1GB. Moreover, it compresses very well, due to the high fraction of repetitive content. Conversely, PBS/Discovery Channel documentaries are always fond of the statement that if you printed out the human genome in telephone books, the stack would reach as high as the Washington Monument. The Human Genome Project cost a total of $4.6B or about $1.50 per base pair. These are all interesting tidbits of trivia, but I don't think they are really relevant to this scientifically-oriented article. But I could see a "trivia" section in Human Genome Project. --Mike Lin 04:11, 19 April 2006 (UTC)

No, there are 4 possible values, not just 2. Even though they come in matching pairs, a G base has a different meaning than a C base, even though they are a matching pair. -- 88.74.42.123 (talk) 10:41, 1 April 2008 (UTC)
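The arithmetic in this thread can be checked with a short sketch. It assumes the round figure of 3 billion base pairs quoted above, a fixed-width encoding with no compression or metadata, and binary megabytes (2^20 bytes). The function name is made up for illustration.

```python
import math

def packed_size_bytes(num_bases: int, alphabet_size: int = 4) -> int:
    """Bytes needed to store num_bases symbols at a fixed bit width per symbol."""
    bits_per_base = math.ceil(math.log2(alphabet_size))  # 4 symbols -> 2 bits
    return math.ceil(num_bases * bits_per_base / 8)

GENOME_BASES = 3_000_000_000  # round figure from the discussion (assumption)
MIB = 2 ** 20                 # binary megabyte

two_bit = packed_size_bytes(GENOME_BASES, alphabet_size=4)
# The 1-bit figure would only hold for a two-letter alphabet, which DNA is not.
one_bit = packed_size_bytes(GENOME_BASES, alphabet_size=2)

print(f"2 bits per base: {two_bit / MIB:.0f} MiB")
print(f"1 bit per base:  {one_bit / MIB:.0f} MiB")
```

With four possible bases the 2-bit figure is the relevant one, which matches the roughly 715MB estimate. Only one strand needs to be stored, since the complementary strand is determined by it.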

I don't know why, but I think that this is a neat little fact. The human genome (in document form) is 1/33 of a terabyte. In comparison, the whole of Wikipedia would take up only 1/113 of a terabyte. I think that fact puts it in perspective for us computer geeks out there.

That sentence is kind of odd: it is trying to express how big the genome is information-wise, which is a different number from that in the "inefficient" digital format and is pointless biologically due to the C-value paradox. Regarding the former: OK, 1 bit per base for 3 billion gives you a CD, but there are several standards for encoding genetic material, which are not only ASCII-coded (1 base is a letter, a byte), but some, such as GenBank, also have spaces and line numbers (10 and 60 bases). One can download the human genome here if anyone cares to see how small (the genome is repeat-rich) you can store it with metadata (annotations), as it is a gzipped tarball (btw, the user who wrote WinZip should be embarrassed of himself for his OS). It is cool trivia, but it needs a bit of specification. --Squidonius (talk) 12:59, 5 May 2008 (UTC)

The CD thing is indeed a little screwy, but I think it's helpful to give a rough idea of the size of the genome in terms that the average Wikipedia reader is likely to understand. The idealization underlying this calculation, of DNA as a two-bit information storage medium, is simplistic but not unreasonable. I'd be open to rephrasing the thought to clarify the fact that it is intended as a rough analogy, not a scientifically precise statement. --Mike Lin (talk) 22:59, 6 May 2008 (UTC)

Talking about its future

I think you can talk about what you can do with it in the future.

Good point! — Preceding unsigned comment added by Maurice Carbonaro (talk • contribs) 11:15, 18 October 2006‎ (UTC)

Illustrations

The content of the page is pretty good and not too technical. I think that it could benefit from some illustrations. A nice chromosome painting picture would be best, at least for now. I think I'll go to NCBI and UCSC to see if I can make something using the genome browsers.--Plociam 18:05, 9 August 2005 (UTC)

That would be awesome. My efforts so far have been to create a solid technical/scientific base for the article. It certainly needs illustration and some more introductory/"pop science" information. BTW please vote for This Week's Improvement Drive :o) --Mike Lin 18:48, 9 August 2005 (UTC)

Expansion Ideas

The human genome and disease
The future of human genome research
- $1000 genome sequencing

--Plociam 07:14, 10 August 2005 (UTC)

Qualifications

Plociam asked for clarification of the cop-out last part of this sentence:

Thus follows the popular statement that "all humans are at least 99% genetically identical", although this would be somewhat qualified by most geneticists.

This thought should definitely be expanded upon. What I really wanted to do here was to explain this excerpt from Bill Clinton's 2000 State of the Union address:

I just want to say one more thing about this, and I want every one of you to think about this the next time you get mad at one of your colleagues on the other side of the aisle. This fall, at the White House, Hillary had one of her millennium dinners, and we had this very distinguished scientist there, who is an expert in this whole work in the human genome. And he said that we are all, regardless of race, genetically 99.9 percent the same.

Now, you may find that uncomfortable when you look around here. (Laughter.) But it is worth remembering. We can laugh about this, but you think about it. Modern science has confirmed what ancient faiths has always taught: the most important fact of life is our common humanity. Therefore, we should do more than just tolerate our diversity -- we should honor it and celebrate it. (Applause.)

The "qualifications" that I think a geneticist would attach to this statement would go as follows: if you look at SNPs, they cover much less than 1% of the genome. But SNPs have a very specific, technical definition. We can get a lot more SNPs by saying that the base substitution has to be present in only 0.1% of the population instead of 1%. Going further, if you also look at repeats or heterochromatin, for example, you can actually have a fair bit of different stuff going on from person to person. But that stuff doesn't really seem to matter to the phenotype...basically, this is just recognizing that a statement like "we are all, regardless of race, genetically 99.9 percent the same" is a nice soundbite for John Q. Public, but there are a lot of technical caveats underneath in how you define that percentage.

There is a related issue, also in the article, in saying "the species ABC genome is XYZ% identical to the human genome". What does that mean? There are two parts to the "real" answer: how much of the ABC genome aligns to the human genome, and of those portions that align, what percentage of base identity do you have? But there are arbitrary cutoffs made in genome alignments, of whether something aligns or not...so the simplified statement is really just a rough approximation, and it should be presented as such.
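The two-part answer described above can be sketched as follows. This is only an illustration: the `AlignedBlock` type, the function name, and the toy numbers are hypothetical, and a real genome alignment depends on the cutoffs mentioned in the comment.

```python
from dataclasses import dataclass

@dataclass
class AlignedBlock:
    length: int   # number of aligned positions in this block
    matches: int  # positions where both genomes carry the same base

def similarity_summary(blocks, genome_length):
    """Return (fraction of the genome that aligns, identity within aligned parts)."""
    aligned = sum(b.length for b in blocks)
    matches = sum(b.matches for b in blocks)
    aligned_fraction = aligned / genome_length
    identity_in_aligned = matches / aligned if aligned else 0.0
    return aligned_fraction, identity_in_aligned

# Toy numbers, not real data: a 1000-base genome with two aligned blocks.
blocks = [AlignedBlock(length=400, matches=380), AlignedBlock(length=200, matches=150)]
frac, ident = similarity_summary(blocks, genome_length=1000)
print(f"aligned fraction: {frac:.1%}, identity within aligned regions: {ident:.1%}")
```

Quoting only the second number hides how much of the genome failed to align at all, which is the caveat raised here.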

None of this is really captured in the article text because I worried about introducing distracting technical minutiae...but we should try to convey it somehow.

--Mike Lin 04:50, 16 August 2005 (UTC)

Mike Lin has a good point. While this is a pretty technical caveat, I agree that this article should not make the blanket statement that humans are "genetically 99.9% identical." For now, I suggest that the article qualifies the above statement by replacing "although this would be somewhat qualified by most geneticists" with something simple but more specific, such as "however, this estimate depends on the precise definition of a SNP, which must underestimate the total variation within the genome." In the future, there may be a place for the complete explanation, perhaps in a "criticism" or "controversy" section, particularly if a citation to that viewpoint can be provided. In the meantime, let's leave the complete explanation in the talk section.

--Plociam 00:41, 19 August 2005 (UTC)

Someone should just look this up in the HAPMAP paper. I dunno if they reported on it yet, since it might be reserved for the later ENCODE paper on human variation, but they did complete resequencing in something like 5MB of human genome sequence for a number of populations, which should be more than enough to provide a reasonable estimate of true heterozygosity in humans and avoids the whole "definition of a SNP" debate. Anyway the data is published, so I'm sure they've at least said something about it... Graft 22:30, 27 January 2006 (UTC)

It's very shady to say "we're 99.9% identical" because many of the sequences are introns (junk DNA), there are other factors that affect gene expression (phenotype), and it undermines a principle of biology, namely the extreme diversity within a species, even if the genome is highly repetitive. For example, someone who has been infected with HIV has had their DNA changed in under a hundred sequences. That would equate to a vanishingly small percentage difference using the same math rules. I favor leaving this out, as it will inadvertently miseducate. For a deeper response, and references to support this, please use my talk page. It takes a deep understanding of genetics to not misinterpret that misnomer of a quote. Sentriclecub (talk) 12:38, 24 April 2008 (UTC)

Genetic basis of human intelligence

Here's a rough draft of a section that I'm working on, although it is probably more appropriate for a broader article on Human genetics, which I'm also working on at a meta-level. That aside, any comments appreciated. --Mike Lin 13:34, 31 August 2005 (UTC)

The remarkable overall similarity of the human genome to those of other mammals has given rise to a scientific debate over the genetic basis of human intelligence. The central issue is whether the recent evolution of human intelligence was relatively typical or extraordinary. If it was typical (that is, if the relevant genes evolved at an average rate), then our capacity for abstract reasoning, arts, and science are most likely the result of relatively modest changes to a small number of genes, simply since so little evolutionary time has passed between humans and other primates. If, on the other hand, uniquely intense selective pressures led to extraordinarily fast evolution of the relevant portions of the human genome, then human intelligence could be the result of rapid evolution of many genes.

A recent study has found that more than one hundred key genes thought to govern the development of the brain have evolved significantly faster in humans than in other mammals, providing some evidence for extraordinary selective pressures and large numbers of genes governing intelligence. However, the issue remains far from settled, since the evolutionary context giving rise to such extraordinary selective pressure has not been convincingly explained.

This debate, which is likely to continue for some time, could ultimately have significant ethical and societal implications. If human intelligence is, in fact, guided by a small number of genes, then it is foreseeable that, in the reasonably near future, geneticists might be able to determine or even engineer a person's natural predisposition towards particular intellectual pursuits (such as mathematics or music) on the basis of their genes.

See also: Race and intelligence

Dorus, S. et al. Accelerated evolution of nervous system genes in the origin of Homo sapiens. Cell 119(7):1027-40, December 2004.

Why are we to "see also" Race and intelligence?

I do not have a copy of the referenced article by Dorus et al, I only read the abstract. I doubt that anyone has identified "more than one hundred key genes thought to govern the development of the brain" unless you take "key gene" to have a meaning like "is involved in". I'd also like to know the quantitative data that stand behind the claim of: "extraordinary selective pressures".

The main genetic changes that account for most of the differences in brain function between humans and chimps could have originally involved only a small number of key regulatory genes. For example, alterations in a few transcription factor genes might theoretically account for the greater post-natal brain growth in humans. After such "genetically small" initial changes, hundreds of other genes that have lesser roles in brain development and function could have been modified during subsequent evolution as a secondary response to the initial changes.

Even "If human intelligence is, in fact, guided by a small number of genes" it does not follow that "in the reasonably near future, geneticists might be able to determine or even engineer a person's natural predisposition towards particular intellectual pursuits".--JWSchmidt 15:35, 31 August 2005 (UTC)

Thanks. Most of your points are well taken. The quantitative measure used by Dorus et al. is Ka/Ks on human vs. macaque nervous system genes as compared to Ka/Ks on the mouse vs. rat orthologs. (Higher Ka/Ks suggests either faster evolution or loss of function, and presumably our nervous system genes aren't losing function -- yes, they make a better argument than that in the paper.) The set of "nervous system genes" was culled from a manual literature search and known nervous system disease genes. They find 1) significantly higher average Ka/Ks in humans compared to rodents; 2) significantly more individual genes with higher Ka/Ks in human than rodent than the other way around; 3) significantly more biased distribution of Ka/Ks values in human compared to rodents; 4) an even stronger bias in all of the above when the set of "nervous system genes" is whittled down to those specifically implicated in development; 5) (control) statistically indistinguishable Ka/Ks values between humans and rodents on housekeeping genes.
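For readers unfamiliar with the measure, here is a rough sketch of the kind of comparison listed in points 1 and 2. It is not the method of the paper: the gene names and Ka/Ks inputs are invented toy values, the function names are hypothetical, and no significance test is performed.

```python
def ka_ks(ka: float, ks: float) -> float:
    """Ratio of nonsynonymous (Ka) to synonymous (Ks) substitution rates."""
    if ks <= 0:
        raise ValueError("Ks must be positive to form a ratio")
    return ka / ks

def compare_lineages(genes):
    """genes maps a name to ((ka, ks) in lineage A, (ka, ks) in lineage B)."""
    ratios = [(ka_ks(*a), ka_ks(*b)) for a, b in genes.values()]
    mean_a = sum(ra for ra, _ in ratios) / len(ratios)
    mean_b = sum(rb for _, rb in ratios) / len(ratios)
    higher_in_a = sum(1 for ra, rb in ratios if ra > rb)
    higher_in_b = sum(1 for ra, rb in ratios if rb > ra)
    return mean_a, mean_b, higher_in_a, higher_in_b

# Invented toy values: lineage A stands in for primates, lineage B for rodents.
toy_genes = {
    "gene1": ((0.02, 0.10), (0.01, 0.20)),
    "gene2": ((0.03, 0.10), (0.04, 0.20)),
    "gene3": ((0.01, 0.10), (0.03, 0.20)),
}
mean_a, mean_b, higher_in_a, higher_in_b = compare_lineages(toy_genes)
print(f"mean Ka/Ks: A={mean_a:.3f}, B={mean_b:.3f}")
print(f"genes higher in A: {higher_in_a}, higher in B: {higher_in_b}")
```

A real analysis would also need a statistical test on these counts and means, plus the housekeeping-gene control described in point 5.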

With respect to a few small changes leading to "hundreds" of secondary changes -- I think there are issues with the very short timespan since human/chimp divergence. Once we nailed the brain size (and remember we think brain size varied between neanderthalensis and sapiens), has there been enough time for selection on hundreds of secondary changes without something really unusual going on?

The part about determining genetic predisposition should be more heavily qualified (already it's only "foreseeable", "might be", and merely "natural predisposition"), but under the given assumptions and qualifications, it's reasonable to predict that it could be nailed down in the near future, meaning a few decades. I'm getting this from Weinberg [1]. --Mike Lin 17:28, 31 August 2005 (UTC)

version 2:

The remarkable overall similarity of the human genome to those of other mammals, especially primates, has given rise to a scientific debate over the genetic basis of human intelligence; that is, compared to other primates, how extensive are the genomic changes that give us the capacity for abstract reasoning, arts, and science? The central issue in answering this question is whether the recent evolution of whatever genes govern human intelligence (most likely by controlling nervous system development) was a typical or an extraordinary process. If it was typical (that is, if the relevant genes, whatever they are, evolved at a "normal" rate), then our intellectual capacities over other primates are most likely the result of relatively modest changes to a small number of genes, simply because too little time has passed for large-scale concerted evolution of many genes to have taken place.

If, on the other hand, some intense selective pressures led to extraordinarily fast evolution of the relevant portions of the human genome, then human intelligence could be the result of rapid evolution of many genes. A recent study has found that more than one hundred genes involved in the function of the nervous system, and especially some of those thought to control brain development, have evolved significantly faster in humans than in other mammals, providing some evidence for extraordinary selective pressures and large numbers of genes governing intelligence. However, the issue remains far from settled, since the evolutionary context that could give rise to such extraordinary pressure has not been convincingly explained.

This debate, still in its nascent stages and likely to continue for some time, could ultimately have significant ethical and societal implications. If human intelligence is, in fact, guided by a small number of genes, then it is foreseeable that, within the moderately near future, geneticists might be able to estimate a person's natural predisposition towards particular intellectual pursuits, such as mathematics or music, on the basis of their genes.

PMID 15869325 "A scan for positively selected genes in the genomes of humans and chimpanzees." PLoS Biol. 2005 Jun;3(6):e170. Epub 2005 May 3. --JWSchmidt 12:47, 1 September 2005 (UTC)

"If, on the other hand, some intense selective pressures led to extraordinarily fast evolution of the relevant portions of the human genome, then human intelligence could be the result of rapid evolution of many genes." This seems circular to me: If selective pressure for a certain phenotype made the relevant parts of the human genome (i.e., genes) evolve fast, then the resulting phenotype is the result of genes evolving fast. See the problem? Evolver 13:51, 10 September 2005 (UTC)

Size over history

Is there anywhere that shows a graph or something similar to show the growth/shrinkage of the genome of the human ancestors? porges 06:34, 11 January 2006 (UTC)

First find your ancestors' DNA sample. Or do you mean comparing extant species? David D. (Talk) 07:09, 11 January 2006 (UTC)

I meant along the human lineage - of course, most of it would be projected. Is there anything along these lines or would it be purely speculative? porges 08:53, 11 January 2006 (UTC)

That's actually quite difficult to do, since we don't actually have that much complete genome sequence available. What you're proposing would require us to have, more or less, the complete genomes of the entire primate tree, so we could trace major duplication/deletion/fusion/etc. events and reconstruct ancestral genome sizes going up the human lineage. We have, to date, two complete primate genomes, human and chimp. Graft 22:13, 27 January 2006 (UTC)

Where complete should be in inverted commas: "complete". Genomes are usually declared complete when all but the most difficult sections have been successfully sequenced several times over (usually a number between 6 and 8 times). So it is a somewhat arbitrary consensus decision taken by the scientists involved. But this is a minor point. - Samsara _contrib ^talk 14:11, 10 February 2006 (UTC)

I doubt that there would be that much difference in genome size in the human ancestors anyway, since there isn't a lot of difference between humans and primates. The principal differences would be changes in gene regulation, and there is no way that we could measure those kinds of things from bones.--nixie 22:50, 27 January 2006 (UTC)

I beg to differ as far as there being no important differences in size. Also, we can probably reconstruct a substantial fraction of ancestral genomes computationally. As to predicting gene expression computationally... well... maybe not yet, but it's not ridiculous. Graft 05:50, 28 January 2006 (UTC)

Population geneticists would predict that the human genome has grown, at least compared to the earliest primate ancestors, which probably had larger effective population sizes than we do today. Even if the number of genes had not changed, genomes are still thought to expand when there is less selection for them to remain small, by picking up functionally less efficient or useless (junk) DNA. Selection is always less efficient at smaller effective population sizes (all other things being equal). - Samsara _contrib ^talk 14:15, 10 February 2006 (UTC)

global human genome matching global family

Perhaps a paragraph (or paragraphs) should be added discussing the tracking of a single global family (northern hemisphere E & W) vs southern hemisphere (where Neander, bonobo and other ape variations to the "human" genome come in? and further tracking the Adamic man human genome (out of ME) vs chimp argued, Darwin man presently being postulated ab in this article, etc. etc.) Inmah Imig III 69.121.221.97 (talk) 16:05, 18 May 2011 (UTC)

Congratulations on SCOTW

You've succeeded. This article is now Science Collaboration of the Week. Now make it really, really good! - Samsara _contrib ^talk 10:11, 27 January 2006 (UTC)

Moral, social and legal consequences

What do other editors feel about inserting a section devoted to the moral and political aspects of the human genome project? I'm gonna be kinda busy over the next few days, but I should be able to contribute something over the next week or so, if that suits? --Nicholas 10:14, 10 February 2006 (UTC)

That would be better suited to the article about the human genome project, no? - Samsara _contrib ^talk 14:08, 10 February 2006 (UTC)

sugars

I have a concern about "There are an estimated 20,000-25,000 human protein-coding genes. " Aren't there genes coding for various sugars which is now a growth field?

You may be thinking about protein glycosylation, which can produce modified proteins. JWSchmidt 14:12, 10 February 2006 (UTC)

It's worth remembering that the number of proteins does not necessarily equal the number of genes. Apart from glycosylation, there is also alternative splicing to take account of. - Samsara _contrib ^talk 14:17, 10 February 2006 (UTC)

Human genetics

There is a rather short article on Human genetics. I think it probably should be a stand-alone article, but it probably should be incorporated into this article somehow.--Peta 02:50, 18 May 2006 (UTC)

Sorry, I am not sure how to suggest a change.

I believe the part that says the chimp and human genomes are 95% identical should be changed to state that they are "about 98.6% identical (when indels are excluded)." The reference for that is "Divergence between samples of chimpanzee and human DNA sequences is 5%, counting indels", Roy J. Britten, PNAS, October 15, 2002, vol. 99, no. 21, p13633-13635.

If someone feels that an indels-included value should be given too, then I suggest it be changed to something like "about 98.6% identical (when indels are excluded: 96% when they are included)." The reference for this is "Backgrounder: Comparison of Human and Chimp Genomes Reveals Striking Similarities and Differences", August 30, 2005, http://www.broad.mit.edu/news/links/chimp-backgrounder.html —Preceding unsigned comment added by 24.94.224.193 (talk) 06:56, 17 October 2007 (UTC)
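To see why the two figures differ, consider this small sketch. It assumes a pairwise alignment written as two equal-length strings with "-" marking a gap, and it counts each gap column separately, which is a simplification and not necessarily how the cited sources count indels. The sequences and function name are made up.

```python
def percent_identity(seq_a: str, seq_b: str, count_indels: bool) -> float:
    """Percent of compared alignment columns where both sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(seq_a, seq_b):
        is_gap = a == "-" or b == "-"
        if is_gap and not count_indels:
            continue  # skip gap columns when indels are excluded
        compared += 1
        if not is_gap and a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

# Toy alignment: one substitution and a two-base deletion.
a = "ACGTACGTAC"
b = "ACGTTCG--C"
print(percent_identity(a, b, count_indels=False))  # substitutions only
print(percent_identity(a, b, count_indels=True))   # gap columns count as differences
```

The same alignment yields a lower identity once gap columns are counted, which is the direction of the 98.6% vs. 96% difference discussed above.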

Number and distinctiveness of chromosomes

I was always taught that there were 46 chromosomes in most people's genetic complement, which is good because it is true. I changed the intro bit to note that most pairs of chromosomes contain two chromosomes that have similar structure, which may help those who wish to change the 24 to 23.

— Preceding unsigned comment added by 72.194.193.198 (talk) 18:41, 12 July 2006‎ (UTC)

External Link

The National Office of Public Health Genomics offers insight into how human genomic discoveries can be used to improve health & prevent disease. http://www.cdc.gov/genomics/default.htm Lid6 17:35, 15 September 2006 (UTC)

Added. --apers0n 18:41, 15 September 2006 (UTC)

References

I took a stab at cleaning up the references using a nifty tool. One of the web links referenced a Science mag article, so I dug up the Science mag and used the press release from LBL as a summary link on the reference. Note that the tool I used creates HUGE lists of authors that I summarized into the relevant group that did the work. --Chrispounds 02:48, 26 September 2006 (UTC)

Please stop. If you continue to vandalize Wikipedia, you will be blocked from editing. The first reference reads:

International Human Genome Sequencing Consortium (2004). "Finishing the euchromatic sequence of the human gspot."

Obviously, the last word in the reference should be "genome". (Derekjames 20:39, 22 October 2007 (UTC))

X inactivation

It seems rather random to put X inactivation in an introductory section to chromosomes. In fact, is it even needed in an article on genome? If so, shouldn't it be in a section on regulation/dosage compensation? Ted^Talk/_{Contributions} 02:48, 30 September 2006 (UTC)

To me, the second option seems good. X inactivation is important, and it would need its own section. NCurse _work 05:44, 30 September 2006 (UTC)

But, is it important in an article on the human genome? There is discussion about regulatory sequences, but nothing about regulation itself. Before we should talk about X-inactivation, we should discuss multiple-copy genes (such as rRNA genes, or even alpha hemoglobin) -- they have regulatory effects, but are at least genome-related. I don't believe it belongs in this article. Ted^Talk/_{Contributions} 14:56, 1 October 2006 (UTC)

As for "female mosaic," that was a cutesy way of describing X-inactivation from the 1980s. Does anyone really use this anymore? The current use of "female mosaic" refers to mosaicism in the X-chromosome, such as XX/XO mosaics. Ted^Talk/_{Contributions} 16:44, 1 October 2006 (UTC)

I have not heard of the mosaic reference being dropped with respect to females and the expression of their X genotype. What makes you think this? David D. (Talk) 03:29, 6 October 2006 (UTC)

References I see to mosaicism in medical/genetic sources are to XX/XO or XX/XY or some such. I used to see female mosaic referring to X-inactivation in Human Genetics textbooks, but I no longer see that. Where do you see it now? Ted^Talk/_{Contributions} 03:38, 6 October 2006 (UTC)

The Role of X Inactivation and Cellular Mosaicism in Women's Health and Sex-Specific Diseases, Barbara R. Migeon, MD, JAMA. 2006;295:1428-1433, was one I found with Google, and there were more examples. There are three examples that I had heard of before, all referred to as mosaics. The first is the sweat gland phenotype that is X-linked: some patches of epidermis are normal, others are not. Likewise, women who have coloured sectors in their iris, or even two differently coloured irises, I have heard described as mosaic. The final common example I have read described as mosaicism is the rare cases of tetrachromatic females. Most basic textbooks do not even cover X-inactivation. Which book are you referring to where it is conspicuously absent? Do they state the term is not used, or could it just be they didn't know the term has been used for X-inactivation too? David D. (Talk) 04:45, 6 October 2006 (UTC)

Thanks for the examples. All human genetics textbooks cover X-inactivation. I do not normally look at medical genetics textbooks, so it is possible they do not (once again pointing out the difference between human genetics and medical genetics). I have over a dozen human genetics textbooks on my shelf, with a target audience ranging from College Freshmen to Graduate students. They all mention X-inactivation and none of them talk about "female mosaicism". I have some older textbooks from the 1980s that I recall mentioning "female mosaicism", such as Hartl's book from that era. A Google scholar search picked up very few examples, and when it did, they normally talk about "mosaic pattern", which is an OK phrase (I used the search phrase "female mosaic"/"female mosaicism" along with various words associated with X-inactivation: X-inactivation, Barr, Lyonization, etc). It does bring up a good question about terminology. When there is a difference in terminology between professional use and common use, should common use be given any place? My opinion is that an encyclopedia should be "technically correct", although with a nod towards common usage. An article about X-inactivation could mention "female mosaic" in the proper context, but it should be avoided elsewhere. Ted^Talk/_{Contributions} 11:46, 6 October 2006 (UTC)

Fascinating, I had no idea the usage was so different. I am referring to basic texts such as Snustad (upper-level genetics) and below, down to Campbell biology. None have anything significant on X-inactivation. It sounds like your books might be more professional. I assume by common usage you mean usage outside science, but I wonder if this is actually a basic research vs. medical professional difference? I will check my texts in more detail. I agree we need to give a nod to both usages. David D. (Talk) 12:49, 6 October 2006 (UTC)

I just checked Snustad & Simmons. It does mention "...female mammals are genetic mosaics [boldface theirs] containing two types of cell lineages." This usage doesn't fit the definition, "an individual composed of two or more cell lines of different genetic or chromosomal constitution...." (King and Stansfield, Dictionary of Genetics), and is still short of "female mosaicism". Common usage today does not include regulatory differences; if it did, the definition of mosaicism would be so broad as to be meaningless. I'm not sure it even includes such genetic differences as B & T cell diversity (which does involve DNA differences, so would strictly follow the definition). Ted^Talk/_{Contributions} 14:01, 6 October 2006 (UTC)

Good article nomination

My suggestions:

In section Chromosomes, a citation is needed.
"The evolutionary branch between the human and mouse, for example, occurred 70-90 million years ago." (a source would be useful)
~~"Protein-coding sequences (specifically exons) comprise less than 1.5% of the human genome." (Source?)~~

Anyway it seems to be good for me now. NCurse _work 14:49, 1 October 2006 (UTC)

Two done (struck out). - Samsara (talk • contribs) 15:04, 1 October 2006 (UTC)

The citation for X-inactivation is an article dealing with imprinting and inactivation. While it is interesting work, it isn't relevant to the general concept of X-inactivation, and even less relevant to "female mosaicism." Lyon's original work (1961) would be better, but I'm working on getting the paragraph deleted as irrelevant. Or, if you want a more modern reference, then maybe a reference on the XIST gene, possibly referencing the general treatment of Therman from 1974, or, specifically, Brown in 1991 for the XIST gene itself. Ted^Talk/_{Contributions} 16:31, 1 October 2006 (UTC)

I'm not bothered. You know more about it than I. - Samsara (talk • contribs) 23:16, 1 October 2006 (UTC)

The article "Estimation of divergence times from multiprotein sequences for a few mammalian species and several distantly related organisms" advocates a method that estimates the rodent-human divergence at 96 million years ago, and this article discusses other estimates that range from 82 MYBP to over 100 MYBP. --JWSchmidt 15:21, 1 October 2006 (UTC)

It meets good article standards, no question about that. Nearly every statement is referenced. Well done! NCurse _work 16:29, 1 October 2006 (UTC)

New human gene map shows unexpected differences

Story here. Maybe it's interesting material for the article. —msikma <user_talk:msikma> 06:59, 23 November 2006 (UTC)

A better source would be the article which is located here. --WS 12:57, 23 November 2006 (UTC)

Peer-review of DNA

Hi there. I wondered if the contributors to this page might have some input to this article. TimVickers 22:40, 24 December 2006 (UTC)

Change to Chromosomes section

The statement under the Chromosomes section... "Somatic cells usually have one copy of chromosomes 1-22 from each parent, plus an X chromosome from the mother, and either an X or Y chromosome from the father, for a total of 46" does not make sense to me. I am not a biology guy, but I do consider myself capable of understanding things when I read them, and this statement doesn't make sense to me. Perhaps a re-write would be useful. Thoughtbox 18:38, 27 February 2007 (UTC)

LINEs and SINEs both Retrotransposons and Interspersed repeats?

I tried to summarize the different parts that make up the genome. It turned out that LINEs and SINEs appear in both the Retrotransposons section and the Interspersed repeats section. Can this be right? I would appreciate it if somebody arranged the articles so that LINEs and SINEs are placed in only one type. Thank you. Mortsggah 18:13, 6 April 2007 (UTC)

These are clearly self replicating - as such I removed them from the Interspersed repeats. Microtubules (talk) 16:43, 13 September 2012 (UTC)

Chromosome article names

Why are we adding (human) to the article names? If there were articles on the non-human forms that would make sense, but I never like to see a more specific name used when the general one just redirects to it. Besides this, it's pretty obvious we are talking about human chromosomes when we say, e.g. chromosome 12.

While I'm here, I think this should be delisted from GA - there are numerous tags around, and the lead is a little short as well. Richard001 09:48, 26 October 2007 (UTC)

Chromosomes from other organisms are of interest too. Somebody may want to know details about, say, chromosome 2 (mouse). We don't want to be speciesist! :) – Clockwork Soul 19:30, 8 March 2008 (UTC)

Genetic disorders

I have updated the section entitled Genetic Disorders. I tried not to give too much detail, so as not to overlap excessively with the standalone article on that topic. But the previous version was not that clearly written, as noted in a request for an update by someone working in that field. I am a human molecular geneticist myself. Oudguy (talk) 19:18, 8 March 2008 (UTC)

Number of genes

The article currently estimates the number of human protein-coding genes between 20,000 and 25,000. How relevant is the research published by Clamp et al.[2] that shows humans have only 20,500 protein-coding genes? Do other geneticists agree with this lower number? It's interesting that the article Gene mentions a very similar number as the estimate of another recent research:[3] 20,488 plus perhaps 100 more. Should the number given in the article be revised? --Eleassar ^{my talk} 12:09, 1 July 2008 (UTC)

The Science magazine news article refers to the same Clamp et al. study. I wouldn't venture to say that those results are yet widely agreed upon. The major gene catalogs such as Ensembl and RefSeq still list closer to the 25,000 end. And there is still controversy about the extent to which small ORFs from the oodles of short transcripts out there may get translated. So, this will probably still take another couple years to play out before there is a more specific, encyclopedia-worthy number. --Mike Lin (talk) 10:23, 2 July 2008 (UTC)

Where does the estimate of "between 10,000 and 25,000" human protein-coding genes come from? The citation is to a 1969 article. There are lots of better sources. --Nbauman (talk) 21:16, 31 May 2012 (UTC)

While there might be a better citation, I doubt there is a better number - in fact, the number given is certainly not taken from the citation, as nobody thought the number was that low until about the year 2000. About 5 years ago I saw an article about a group that was going through the genome, gene by gene, and trying to figure out which ones were actually functional, and they determined that many of those counted in the original annotation were in fact pseudogenes. At the time, they had it down to no less than 13 and no more than 18 thousand genes, yet you still see numbers in the low 20s batted around. To add to the confusion, some sources just refer to genes, while others specify 'protein-coding'. Agricolae (talk) 00:05, 1 June 2012 (UTC)

Megabytes measure computer data not human data

"The haploid human genome occupies a total of just over 3 billion DNA base pairs and has a data size of approximately 750 megabytes"

I know this is a quote from The New York Times, but it's written by someone that doesn't understand how data is stored in computers. When you use bytes (as in 8 bits) as a unit of measurement you are measuring the amount of 1s and 0s, which is meaningless here since the human-readable information (what he's actually trying to measure) can be stored in a million different formats in a million different ways resulting in huge size variations. It's worse than measuring your computer's memory by the amount of songs it can hold or the length of a book's content by the number of pages. You can store that same information in a lot less or a lot more memory very easily. It means nothing, it's potentially misleading, and should be removed/replaced. 75.4.164.226 (talk) 10:12, 16 January 2009 (UTC)

This doesn't make any sense. DNA base pairs can directly be translated into bits as a matter of course: seeing that there are four different nucleotides, each pair is equivalent to two bits. Of course it would be stupid to measure the size of the genome in terms of the "number of songs" that could be stored in it, but this isn't what has been done here. 3 billion base pairs equals 6 billion bits, or 750 million bytes~~, or about 715 megabyte.~~ This has the very real implication that you can store your (haploid) genome on a single CD ROM, without compression. This is extremely relevant, although the really interesting question would be as to the Kolmogorov complexity of the genome. This question is more relevant, but also rather less trivial to answer. --dab (𒁳) 09:55, 13 August 2009 (UTC)
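The two-bits-per-base-pair arithmetic above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the particular 2-bit code (A=0, C=1, G=2, T=3) is an arbitrary choice, and the genome is taken as exactly 3 billion base pairs, as in the comment above.

```python
# Hypothetical 2-bit code; any fixed assignment of the four bases would do.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}

def pack_bases(seq: str) -> bytes:
    """Pack a DNA string into bytes, four bases per byte (2 bits each)."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        # Left-align a final partial chunk so every byte holds four slots.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)

def raw_size_bytes(n_base_pairs: int) -> int:
    """Bytes needed at 2 bits per base pair, rounded up to a whole byte."""
    return (2 * n_base_pairs + 7) // 8

packed = pack_bases("ACGTACGT")                 # 8 bases fit in 2 bytes
genome_bytes = raw_size_bytes(3_000_000_000)    # 750,000,000 bytes
genome_mib = genome_bytes / 2**20               # same size in units of 2**20 bytes
```

Counted in units of 10^6 bytes this is 750 MB; counted in units of 2^20 bytes the same quantity is a little over 715, so the two figures in the comment differ only in the definition of "megabyte".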

I found a single 2008 paper estimating the entropy rate of the genome. Apparently this is still rather bleeding edge. Unfortunately, the authors do not seem to give a single figure for the entropy rate of the entire genome, they just plot the entropy rate for each chromosome individually. Looking at the figure, "about 1.8 bits per base pair (90% of maximal entropy)" would seem to be about right. I still wonder, has nobody simply downloaded the Human Genome Project data from Project Gutenberg and run it through gzip? The size of the compressed file would already be eminently useful in estimating information content. --dab (𒁳) 16:29, 14 August 2009 (UTC)

The Gutenberg page for chromosome 1 already tells us that the zipped file has 65.14 MB. This encodes 247 million base pairs, or 494 million bits = 61 MB uncompressed. This means that the compressed file is actually larger than the naked uncompressed information (this is probably due to the information being ultimately encoded in ASCII, and the zip algorithm being less than optimal). This shows that entropy is rather large for chromosome 1 (entropy rate near 2).

Here is the Y chromosome: 7.97 MB for 60 million base pairs = 120 million bits = 15 MB uncompressed. As in the 2008 paper I cited, the Y chromosome can be compressed by about 50% (entropy rate below 1). --dab (𒁳) 16:36, 14 August 2009 (UTC)
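For anyone wanting to check the comparison, here is a minimal sketch of the arithmetic, using only the file sizes and base-pair counts quoted above. The function names are hypothetical, and the sketch assumes 1 MB = 10^6 bytes; the quoted file sizes may have been reported in other units.

```python
def raw_megabytes(million_base_pairs: float) -> float:
    """Size at exactly 2 bits per base pair, in units of 10**6 bytes."""
    return million_base_pairs * 2 / 8

def bits_per_base_pair(file_megabytes: float, million_base_pairs: float) -> float:
    """Bits the compressed file spends per base pair (assumes 1 MB = 10**6 bytes)."""
    return file_megabytes * 8 / million_base_pairs

chr1_raw = raw_megabytes(247)                 # 2-bit size of chromosome 1
chr1_bits = bits_per_base_pair(65.14, 247)    # what the zipped file actually uses
chr_y_raw = raw_megabytes(60)                 # 2-bit size of the Y chromosome
chr_y_bits = bits_per_base_pair(7.97, 60)
```

Under that assumption the zipped chromosome 1 file works out to roughly 2.1 bits per base pair, and the zipped Y file to slightly more than 1 bit per base pair (a little over half of its 15 MB 2-bit size). A general-purpose compressor only gives an upper bound on the entropy rate, so the true figure could be lower.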

The table under "Information content"

I'm having difficulty understanding how the total number of base pairs given in the table can be 3,080 million for XY and only 3,022 million for XX. Firstly, I'm given to understand that the Y chromosome is actually smaller, in terms of both size and number of base pairs. (The table corroborates this.) How, then, can the total be larger for males? Secondly, I find no way of adding up the numbers in the chromosome columns to arrive at the numbers 3,080 and 3,022. Would somebody care to explain how it all fits together? — Irrbloss (talk) 12:16, 16 September 2009 (UTC)

I'm having a tough time too... How can the raw data be more than the zipped data? Michi zh (talk) 15:33, 3 November 2009 (UTC)

That's why you zip data: To make it smaller. clacke (talk) 00:31, 20 December 2009 (UTC)

I'm sure Michi meant the opposite: how can the zipped be more than the raw. The answer I suppose is that they are not zip files created directly from binary raw data. It would be more interesting to know what size file you get when you zip, say, a raw binary file of 770 Mb for the XY case. As for the questions of Irrbloss, first of all, you have to realize that they're talking here about a haploid genome for the XX case. If you add up the numbers, you get 3019 million, but if you were to add up the exact numbers (rather than these rounded numbers) it would round to 3022. For the XY case, you add another 58 million for the Y. Eric Kvaalen (talk) 10:33, 18 June 2010 (UTC)
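On the question of what happens when one zips raw binary rather than ASCII, a toy experiment is easy to run. The sketch below is an illustration only: it uses a pseudo-random sequence of 200,000 bases (an assumption; real chromosomes contain repeats and are not random), a hypothetical `pack_2bit` helper, and Python's standard `zlib` module rather than the zip tool used for the files discussed above.

```python
import random
import zlib

CODE = {"A": 0, "C": 1, "G": 2, "T": 3}  # arbitrary 2-bit assignment

def pack_2bit(seq: str) -> bytes:
    """Pack bases four to a byte; len(seq) is assumed to be a multiple of 4."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | CODE[base]
        out.append(byte)
    return bytes(out)

rng = random.Random(0)      # fixed seed so the run is repeatable
n_bases = 200_000           # toy size, far smaller than a chromosome
seq = "".join(rng.choice("ACGT") for _ in range(n_bases))

ascii_zipped = zlib.compress(seq.encode("ascii"), 9)   # one byte per base, then deflate
packed = pack_2bit(seq)                                # 2 bits per base, no compression
packed_zipped = zlib.compress(packed, 9)               # deflate applied to the packed form

bits_per_base_ascii = 8 * len(ascii_zipped) / n_bases
bits_per_base_packed = 8 * len(packed_zipped) / n_bases
```

For a sequence with no structure, neither compressed form should fall meaningfully below 2 bits per base, which is consistent with the suggestion that a zipped ASCII file can end up larger than the plain 2-bit encoding. How far real chromosome data falls below that depends on its repeat content.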

Is the table consistent when it comes to what the values mean? 247 M base pairs for chromosome 1, is that for the pair or for one of the chromosomes? What about the values for the X and Y chromosomes? Are those also for pairs? The sums seem to be for XX, chromosomes 1-23 plus X (3019, table says 3022), for XY, chromosomes 1-23 plus X plus Y (3077, table says 3080). Sampling articles chromosome 1 and X chromosome it seems that the numbers are for a single chromosome, so the real sums should be 6038 for XX, 5941 for XY. This means that the amount of data that encodes a human's complete DNA sequence is around 1.5 GB. A lot of that data is redundant, sure, but the chromosomes in a pair are actually not identical. So either you count the total amount of base pairs or you might as well count the LZ factor too, bringing it down to below 400 MB clacke (talk) 00:31, 20 December 2009 (UTC)

The numbers are for just one chromosome of each pair, except that for the XY case you count both. You're right that theoretically we each have around 6 milliard, or 1.5 GB. But most of that is just two almost identical copies. The Human Genome Project data gives only one of the two. The total entropy (that is, the minimum you could get with compression) is approximately 770×1.70/2 = 655 MB for XY and 756×1.71/2 = 646 MB for XX. The differences between the chromosomes coming from your father and those coming from your mother don't add much to this. Eric Kvaalen (talk) 10:33, 18 June 2010 (UTC)
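The entropy estimate above is just the 2-bit size scaled by (entropy rate / 2). A minimal sketch of that calculation, using the base-pair totals from the table (3,080 and 3,022 million) and the entropy rates quoted in the comment; the function names are hypothetical:

```python
def raw_megabytes(million_base_pairs: float) -> float:
    """Size at 2 bits per base pair, in units of 10**6 bytes."""
    return million_base_pairs * 2 / 8

def entropy_megabytes(raw_mb: float, entropy_rate_bits_per_bp: float) -> float:
    """Scale the 2-bit size by the fraction of the 2-bit maximum actually used."""
    return raw_mb * entropy_rate_bits_per_bp / 2

xy_raw = raw_megabytes(3080)    # 770.0
xx_raw = raw_megabytes(3022)    # 755.5, rounded to 756 in the comment above

xy_entropy = entropy_megabytes(770, 1.70)   # about 654.5
xx_entropy = entropy_megabytes(756, 1.71)   # about 646.4
```

These reproduce the 655 MB and 646 MB figures after rounding.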

I agree with clacke. The totals are meaningless and misleading. Readers will quote 3174Mbp as the total number of base pairs. The answer that "most of that is just two identical copies" is not enough. So are most base pairs in XX for that matter. Either count all base pairs (ie 6044/5947) or else take half of that 3022/2973. Chrystomath (talk) 17:01, 13 February 2011 (UTC)

Variation section: SNPs

The section currently says that "SNPs occur on average somewhere between every 1 in 100 and 1 in 1,000 base pairs in the euchromatic human genome". I have a source that gives a better estimate, of 3 in 1000, for SNP variation in humans.

“The published sequence of the human genome is actually a mosaic obtained from the analysis of DNA isolated from ten different individuals. On average, about 99.7% of the bases in your genome will match perfectly with this published sequence, or with the DNA base sequence of your next-door neighbor. But the remaining 0.3% of the bases vary from person to person, creating features that make us unique individuals. Differences involving single base changes are called single nucleotide polymorphisms, or SNPs (pronounced ‘snips’). Although 0.3% might not sound like very much, 0.3% multiplied by the 3.2 billion bases in the human genome yields a total of roughly 10 million SNPs.”

The source is: "The World of the Cell: Seventh Edition", Wayne M. Becker, et al., Pearson/Benjamin Cummings, 2009, p527) —Preceding unsigned comment added by 70.94.236.188 (talk) 15:11, 19 December 2009 (UTC)
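The arithmetic in the quoted passage, and how it compares with the range currently in the section, can be checked with a short sketch. The variable names are hypothetical; the 0.3% and 3.2 billion figures are taken from the quotation.

```python
genome_bases = 3_200_000_000        # genome size used in the quoted passage
variant_per_thousand = 3            # 0.3% expressed as 3 per 1,000 bases

snp_count = genome_bases * variant_per_thousand // 1000   # 9,600,000
bases_per_snp = 1000 / variant_per_thousand               # about 1 SNP per 333 bases

# The article's current wording: between 1 in 100 and 1 in 1,000 base pairs.
within_current_range = 100 <= bases_per_snp <= 1000
```

So the textbook figure gives 9.6 million, which the passage rounds to "roughly 10 million", and one SNP per roughly 333 bases sits inside the range the section already states.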

Information content - entropy rate in bits per base pair

This sub-section needs a simple explanation for readers who are not up to speed on entropy rate. A formal definition expressed as an equation is not an explanation. Rate usually means time rate. Entropy rate in bits per base pair per what? What interval of time is this based on? Per year? Per million years? What does entropy rate in bits per base pair per year mean, without reference to an equation? Since every base pair can be coded by 2 bits and the chromosome 17 entropy rate is 1.87 which is close to the maximum 2, what does that imply? That the entire human genome mutates about 93.5% per million years? That would be nonsense, because then it would no longer be a human genome. Greensburger (talk) 21:16, 11 August 2010 (UTC)
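On the "per what?" question: in information theory an entropy rate is measured per symbol of the sequence (here, per base pair), not per unit of time, so it says nothing about mutation rates. A minimal sketch of the idea, assuming the simplest possible estimate (Shannon entropy of the base frequencies alone; the estimator in the cited paper may well be more elaborate) and a hypothetical function name:

```python
import math
from collections import Counter

def entropy_bits_per_base(seq: str) -> float:
    """Shannon entropy of the base frequencies, in bits per base."""
    counts = Counter(seq)
    n = len(seq)
    # Each base contributes p * log2(1/p), where p is its observed frequency.
    return sum((c / n) * math.log2(n / c) for c in counts.values())

uniform = entropy_bits_per_base("ACGT" * 100)   # all four bases equally common
skewed = entropy_bits_per_base("AAAC" * 100)    # mostly A, so more predictable
constant = entropy_bits_per_base("A" * 100)     # fully predictable
```

With four equally frequent bases the value is the maximum of 2 bits per base; the more predictable the sequence, the lower the value. A figure such as 1.87 for chromosome 17 would then mean that, on average, each base carries about 1.87 bits of information out of a possible 2, that is, the sequence is only slightly compressible.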

'junk DNA'

Is it really necessary to still refer to DNA we have yet to understand or study as 'junk DNA'? The reference cited is ten years old. —Preceding unsigned comment added by 149.169.224.155 (talk) 20:38, 8 April 2011 (UTC)

Junk DNA is not the same as non-coding DNA. And it is not true that non-coding DNA used to be called junk DNA. Functional non-coding DNA has been known for 50 years! "Junk DNA" is a term still in use. As such, I modified the text to reflect this. Microtubules (talk) 16:46, 13 September 2012 (UTC)

Bottom text removed

Based on this revision. Moved to Discussion until it's properly integrated to the article (if it's relevant):

As everyone knows, there are forty-six chromosomes corresponding to twenty-three pairs of human genes. The double helix filament not only moves, but also acts for some things as some cell genes required and released. From scientific studies and the discovery of genes many years ago, it was found that all DNA consists of the double helix filament with a heredity code, but acted on each other.

Scientific studies reached a conclusion: all animal genes are the same as humans, even if their differences are only within 1-2 % each other. As far as the gene report, all animals including humans had the same double helix filament of genes. But there was a common vibration, transferred information, and a connection between some animals, too. The above opinions are very important: they established a basic, common vibration in genes, including a whole-animal frequency of self vibration in any body. This includes, for example, the transfer of information and connections to each other, even if in difference categories or difference subject fields. So basically, not only is some information transferred, but some goodwill or intentions could also be shown to each other. Humans could use common gene vibrations with some animals connections, such as for training or the transfer some information to the animal in one look or facial expression used.

There are four points to explain regarding the double helix filament of genes;

First of all, longer steel spiral springs could be vibrating in three dimensions in general from a physical view. For example, the vibrations could be changed sharply from top to bottom, back to front, or expansion to compression along the longitudinal axis under some acting force that changes at the horizontal level in hand. But for the double helix filament of genes, it is the same except there is a three-dimensional vibration, too, and the double helix is able to expand under a self-shared vibration toward the outside under a very small idea in the brain, or self-body blood pounding, or earthquake force etc..

Secondly, these small filament vibrations were not only caused in three-dimensional waves, but they were also later able to expand more and more, so it will cause a common vibration by which some information from other animals could be mutually transferred and received. These double helix filaments of genes can more easily transfer or receive any information through three-dimensional waves from others.

Scientists discovered that the double helix filament of the genes is very thin and flat. These characteristic could be to increase the margin of common vibrations with the most sensitive vibrations within each sectional area. It could be used for transferring into space in every direction and expanding the field for reception or connection, too.

Third, static electricity could also be more and more expanded throughout the animal’s body, including the response reaction in the genes of small mammals’ bodies. For example, very small noises or very small vibrations, even light waves of human sight, etc. All things will be caused immediately even if they are far away. The response reaction in their heart of small mammals or humans will be able to transfer genetic information between brothers, sisters, or parents. On the big grasslands of Africa, there are many kinds mammals animals looking for food, and they do not come across each other. This is because of the normal shared vibration of genes------ they get friendly vibrations. As tigers or lions are approaching, they receive common vibrations of genes------ they get unfriendly vibrations, but they immediately run far away.

Finally, as one animal was shaking because it was frightened or very cold, the lives of the cells’ protein was threatened, and the animal will be shaking throughout its whole body, but not under self-control. This is best large common vibration caused by genes within one body.

From the book Life Itself------Exploring The realm of The Living Cell------By Boyce Rensberger》. In every case, the cell behaves in an organized way, responding appropriately to signals it receives from the surrounding environment. …… Each cell can take in information about its circumstances and respond in a purposeful way, carrying out its crawling ability in a manner that seems coordinated throughout the cell. “I say that this is a form of intelligence and that, therefore, the cell must have a control center, a brain,” says Guenter Albrecht-Buehler of Northwestern University’s Medical School in Chicago. …… He finds that cells behave in ways that are uncannily like the behaviors that a whole animal would display. …… He said too: “This is not the behavior of a mere automaton,” Albrecht-Buehler says. “The cell is exercising a form of intelligence, testing each possibility and only then moving on. Somehow the cell is taking in information about its environment and it is processing that information. The cell does not try to go in all directions at the same time.”……“Calls the cell’s brain”……“ called the centrosome or cell center” …… What are the sensory organs of a cell? Does it have eyes and ears, or feelers? How does it transmit the signal, once received, to the centrosphere? How are the signals processed, and then, assuming a “decision” is made to respond, how is that decision carried out?

How would a cell tell its actin-based cytoskeleton to remodel itself? Somehow, it seems obvious, the activity must be coordinated if the cell is to move in a coherent fashion. …… Albrecht-Buehler concedes he has no idea how the cell might then process this information, but he cites a common observation of cell biologists watching cells crawling about the bottom of a culture dish. If two cells come within a certain distance of one another, they often change course, each steering toward the other. Often when cells come within this range, they quickly extend a ruffle or pseudopod toward the other cell. Somehow, it seems, they can detect one another’s presence. The cells meet, touch briefly, and then, most often, recoil and veer apart.

My opinion: The cells that don’t have eyes and ears or other feelers, they are shaken more and more by the double helix filament of these cell genes. Then feelers can receive and release messages to each other. To feel closer, they need to be far away, and it stands only briefly.

A cell has very little, the book says: The double helixical very small filament of cell genes is increasingly tiny, but its length has a lot more than a kilometer. Its common vibration is a name that matches real microwave vibrations. The strong point of the microwave is that energy is much less used, the distance is much further, and the propagation velocity is the fastest. General mobile telephones use microwaves transferred from man-made satellites. This distance is more than any location on the earth that we live. So, through the common vibration of the double helixical very small filaments of cell genes, it should be very easy for any mammals or humans to contact or transfer any message.

Some example to show some transferred and connecting notices between animals are as follows:

1, The best escape actions such as an earthquake warning in some small animals;

Because the earth’s crust moves slightly up and down, etc. in the earth’s depth layer, this shaking was received by the double helix filament of the genes of these small animals, such as frogs, crocodiles, mice, snakes, etc., and they all knew danger was coming, so they immediately organized the best option to escape or hide. They were not chaotic, running here and there to escape to every direction, but they had a coordinated action into one path------the way road------ less and less the way which could be passed by the common vibrations of the double helix very small filament of each brain gene.

2, Migratory birds, some fish or sea tortoises always go back to their breeding ground for some reason each year;

These animals used to transfer and connect by their brain genes, which each have double helix filaments. It could be received or noticed through common vibrations. As they go back to their breeding grounds during some periods for breeding or reproduction. This is an explanation as to why all these animals are able to go back to their breeding ground together.

3, How do non-seeing small bats touch flying insects?

It is provided by a wave from the common vibrations of brain genes while flying to look for food. Some flying insects receive common vibrations of the genes from the bats. The flying insect was shaking with fear because the double helix filaments in its genes were vibrating in common. The flying insect was touched by the bat that followed the increasing vibrating wave. Practically, it is microwave common vibration, not the radar wave that was previously said to be in the brains of bats. There is microwave common vibration in the bodies of many kinds of animals, but it is not used to find food.

4, Sharks looking around for food often use their smell or the vibrating transfer wave of their double helix filaments of genes from something shaking with fear;

The shark’s sight is poor, but its sense of smell and touch of vibrating genes is very strong. Should you go swimming in the sea, some experienced navigators always say: don’t be afraid of the sharks; if you are slightly afraid, the sharks will come here quickly. If someone is afraid with their genes vibrating in common in the sea, the sharks will receive it immediately and come immediately. It uses the common shaking of genes to follow the increasing vibrations to where it could find the food.

5, Some baby dolphins and whales were often lost from their groups;

When the baby dolphins and whales general are playing as a part of groups, if their cells don’t release, and the double helix filament doesn’t show out, too. As the masses go out to sea, they don’t connect to these babies, so they are easy lost near the sea.

6, Weather changes could cause someone arthritic pains;

As there is with static electricity and far away thunder, a person with existing joint arthritis of the legs very often feels trouble, and there is unsatisfactory pain in whole body. This was electromagnetic energy causing common vibrations in double helix filaments in people with arthritis. There is static electricity in cells themselves, it has been proven biophysically. Depending on whether each neurosis system is normal or not, the static electricity is changed more or less in different body parts. This produces mutual static electricity induction between some with natural fields in varied parts.

7, If relatives were injured in a traffic accident, the genes of their closes relatives fled;

If parents or relatives have an accident in traffic, the one who is wounded is thinking of closes relatives at a far away place. This time, there was common vibration of the double helix filament, and it was transferred to the genes of the close relatives. The common vibration shall be the cause of the state of mind to feel disturbed at far away close relatives. Sometimes, the double helix filaments are not manifested, so it cannot transfer or connect to closes relatives’ genes or common vibrations.

8, What clever respond called between relations or friend each other?

A lot of overseas students or workers always have the same experience: when they want to dial the telephone to their parents or their daughters or sons, the telephone rings. This telephone call is their mother or son. Sometimes, the phone is even busy. After one or two minutes passes, the phone rings again and the other person said: mother! who were you ringing just a minute before? Mother said: for you! According to the mathematics of probability, calculating the free time for each other, the rate should be one or two-hundredth in spare time per day. But now in general for meeting a phone call it is about one or two- tenth or twentieth times. This is where there are common vibrations in the double helix filaments between the parents, closes relations, or their daughters or sons. The clever respond transfer called causes them to want to ring each other. Even if distance is half across the earth, microwaves seem to have the same quick reaction of an electric current. The common vibrations of the double helix filaments and microwave propagated character to transfer and connect to each other under the same basis of self-shaking frequency.

9, Qigong, the resonance of genetic filaments;

     Qigong masters who have managed 60-70% of the techniques, can already open all the cells in the palm and make genetic filament resonate to  promote health and cure diseases. The palm heat during the process (due to the microwave) inspires and arranges the lesion’s genetic filament resonance, thus further inspires the localized resonance and heat to palliate the pain. After several circles of treatment, genetic filaments in the lesions regain their order and cells regain their normal function, the disease is then cured. When the master manages 90% of the techniques, he can detect the lesions inside the patients by vision. By scanning the patient’s eye sight wave, the master inspires the genetic filament resonance and cures the acute disease. For chronic diseases, the progress goes along two directions: malignant tumors----cancer, resonance can only constrain it to buy some time, since the inspiration can no longer inspire genetic filament resonance. Ordinary chronic lesions will regain filaments order and normal function after several circles of treatment. It is the level of the Qigong masters that matters.    — Preceding unsigned comment added by 116.49.10.217 (talk) 11:41, 11 March 2012 (UTC)

Darwin’s theory of evolution had said: Biological evolution is often using some acquired characteristics where there could be progress forward used but non-used left behind, even if there is some functional degeneration. Because the capacity of the human brain increases more and more with scientific knowledge through the struggle for existence and development, the common vibration function was degenerated gradually. So the common vibration function of gene filaments will be less than other animals in their transferring and connecting effects.

B, Some none felt heredity function in Human Gene: Dominant genes exert influence from the stages of embryo, infant, child to adult and senior age. The combination of chromosomes being “turned on” differs according to different time schedules in different period, thus the combination of the dual-spiral gene filaments differs as well. Therefore, we can see different characteristics shown in childhood and teenage. Some girls were naughty as boys in childhood, but grow to be clever and quiet ladies who love dressing up; some boys were quite unfortunate-looking in childhood, but turn to be sunny teenagers and reliable middle-aged men later in their life stages. It is the continual changes of the dominant genes in different period making us become more and more mature when we grow. Recessive genes not only come from our ancestors, but mostly also from the thoughts of our parents before giving birth to us. Parents' anxiety and worries about their financial status can easily be inherited as recessive genes. Some pathogenic recessive genes get repelled by “receptor” before egg-sperm fusion, but some parts of them might still be passed to the next generation. When the environment fits, these evil recessive genes may become dominant. With suitable external environment or a flash of thought, these recessive gene can be turned on to be dominant and express themselves.

Siblings can have different personalities and temper. This is associated with the surroundings, the financial status and even the political environment our parents situated in before giving birth to us. Parents transform the information from the environment and their thoughts into acquired experience or memory which then pass to their next generation. For example:

1.　Parents who live with fewer anxieties always conceive pure and　energetic children who are less prone to cry, less naughty, less　demanding for hugs and bolder.

They can play all by themselves instead　of sticking to their parents and are more independent when they grow　up and are willing to study or work abroad.

2. Parents who live with more anxieties or under financial stress always conceive nervous and unhappy children who are prone to crying, naughty, demanding of hugs, and easily frightened.

We often see one- to two-year-old children with long faces; in fact they do not know what sadness is but have inherited the “unhappy genes” from their parents.

3. If one parent lives in an unfamiliar place and always feels lonely, their children will be desperate for hugs; the children will not feel secure unless they are held in their parents' arms.

When parents try to train them to be independent by leaving them alone, they cry very loudly and without stopping, even after they have thrown up everything they have eaten. This is due to genetic inheritance of the lonely and insecure life that their parent went through. Later in childhood, they may become so worried that they fall ill when their parents have to be away from them for several days.

4. Parents who think often and have a reading habit and those who don't give birth to different children.

During the Chinese Cultural Revolution, the development of many schools and offices stagnated. Political study and discussion sessions were held very often, and people were not required to think much. Some people were even sent away to do manual labor for twelve years. Some educated people of that period have children who are reluctant to read, but some who kept their reading habit before and after marriage have children who were keen on studying and read a large number of extracurricular books.

5. Parents who like to invest or play Mahjong may have children eager to gamble.

Eagerness to gamble is a relative concept, comparing gamblers with those who never gamble. A couple might have children with different degrees of eagerness to gamble: some may feel lost and empty when they cannot find Mahjong companions for several days, while others are willing to play when there are companions but are also content to watch others play when they do not get the chance to play themselves.

Parents with Mahjong-loving children should recall whether they sometimes played Mahjong or invested in stocks before having babies. Many people lose all their money or even their lives because of gambling. Parents who blame their children for gambling should bear in mind that the situation can actually be attributed to the genes they passed on to them. As an old saying goes, “Raising a child takes a hundred years”; every step taken by parents directly affects their children. Still, there are exceptional cases in which children grow into pathological gamblers because they are influenced by the external environment.

6. Young people's brain activities before and after marriage, including flashes of thought, can be memorized by genes.

It is common to see good-looking gentlemen sitting around beautiful ladies on various occasions. If we explore further, we may find that their parents are not always as good-looking as the children. Looking deeper for the cause, we may find that the parents were attracted by the appearance of good-looking people before giving birth, and that these memories were recorded by their genes and then passed to the next generation for evolution. Pondering or evaluating others' appearance will also be recorded and passed to the children. In the past, newlyweds might stick photos of pretty women or little boys on the wall; this is quite reasonable, as it might trigger gene mutation if pregnant women often look at these pictures and make their genes memorize them. If the photos are put up after the babies are born, this becomes an acquired learning process and the effect is smaller. Nowadays, young couples always stick pictures of models or celebrities from all over the world on the wall. In these photos the models and celebrities are very charming but do not smile much, and such information is all passed to the next generation. That is why we see fewer and fewer true smiles these days, leaving only cold smiles and jokes around.

Genes can be divided into recessive and dominant ones. This can be understood easily: genes that are used often are expressed as dominant, while those that are not used often are recessive. We bargain in the vegetable market and swarm to the shops for products on sale; love of taking advantage is common, and this gene becomes dominant as our love for it increases. When gaining an extra advantage becomes a hobby, people may, conversely, lose more. When we think less about taking advantage, this gene turns recessive, and people may, conversely, gain more. This is the dialectical relation in the philosophy that “the best gain is to lose”.

Children who are reluctant to read and perform poorly at school will be fond of gambling and cheating, or will even become involved in bribery, when they grow up. Very often this is all determined by what the parents did before they had children. The parents' actions are passed to the children as inherited recessive genes. So parents should think about how they lived during the mother's pregnancy before they blame their children for performing poorly. They should then patiently guide their children in the right direction, without being afraid of nagging.

C. Why does a biracial person often have more strength and intelligence than others? The chemical (enzyme) function of genes: The form of the genes points to an analysis of skin allergies, because some dominant or recessive genes have been alternated or changed. If the father comes from the interior of the continent, where milk is commonly drunk, and the mother comes from a coastal area, where sea fish and other seafood are always eaten, their son or daughter will generally have skin allergies. Their cell genes are not well matched to certain milk or certain seafood. So we often hear of these two kinds of food in connection with skin allergies. A child may be able to drink a certain brand of milk at 2-3 years of age, yet find it unsuitable by 7-8 years of age. This is the hiding and display of the effect of the genes. A general skin allergy can be cured by the body's own release of, or by medication containing, an enzyme system, namely adenosine triphosphatase (ATPase). This enzyme is the main ingredient of one kind of ginseng, a Chinese herb. It can keep the human body strong, intelligent, and young. Because such children often have skin allergies to certain foods, their bodies constantly increase this enzyme, so they will have strong bodies and be very intelligent when they grow up.

Some dominant or recessive genes can often be changed, which affects a person's mind and brain. This causes slight changes in the direction favored by natural selection, for example in the sharpness of facial features, in character and temperament, and even in mind and intelligence, in children and in both females and males.

( Talker;Q.Y.Zhang, Hong Kong 張其澐116.49.9.210 (talk) 13:18, 19 May 2011 (UTC)) （Q.Y.Zhang, Hong Kong 　--116.49.10.217 (talk) 01:59, 11 March 2012 (UTC)） ^That's the text.--Sisyphos23 (talk) 18:04, 23 May 2011 (UTC),Addition one sentence in 3 item.--119.237.48.218 (talk) 22:56, 3 June 2012 (UTC))

karyotype

The illustration is not a karyotype, per se. It is a graphical representation of an idealized karyotype. (Yes, graphical: Graphics = "visual presentations on some surface, such as a . . . computer screen, paper, . . . to . . . inform, illustrate". That is exactly what this is doing.) I don't understand the objection to this phrasing, which distinguishes it from the illustration seen at karyotype, a photographic representation (and the original data from which this graphical representation was prepared). And no, we don't need a WP:RS to know that unlike the illustration on the other page, this is clearly not a photograph of actual chromosomes, any more than we need an RS to tell us that this is just an artistic representation of a car, and not an actual car. Agricolae (talk) 00:36, 16 January 2012 (UTC)

The objection to the phrasing is that I once worked for a magazine editor who told me never to begin a caption with "Illustration of..." because that would apply to every caption, and it would be ridiculous to have a magazine issue with every caption beginning with "Illustration of...." Have you ever seen a magazine issue like that? This is what professional editors do, and if I had a University of Chicago or other style manual I could look it up. It's equally ridiculous to start a caption with "Graphical representation..." since every image is a graphical representation, including the pie chart in this entry. I have never seen a caption in a WP:RS publication that began with "Illustration of..." or "Graphical representation..." Can you show me one?

In fact, what is the actual caption that was used in the source of this illustration? Does the source call it an "Illustration"? Does the source call it "idealized"? Or is that your WP:OR and WP:SYNTH? The image is so poorly sourced that I can't track it back to the source. I don't know if "idealized" is the right word. Science has changed a lot since Plato and Aristotle. --Nbauman (talk) 22:37, 19 January 2012 (UTC)

OK, I see where you are coming from. Here is the problem. A karyotype is not what is being shown. In (very) common terms, the karyotype is the DNA that an individual has. That illustration is representing the banding pattern seen in karyograms when using the most common staining - however most people have minor differences from the 'typical' pattern at one place or another (particularly when viewed at a higher resolution). This figure represents the banding pattern seen most commonly in each particular part of each chromosome across the population, not what is seen in any specific individual, per se. The karyogram is the most common way of visualizing a person's karyotype, but is not a karyotype itself. Thus this idealized . . . (yes, that is a word commonly used scientifically to describe such a pattern that doesn't exist in reality due to individual variation) . . . banding pattern is not only not a karyotype, it isn't even a karyogram (which is the actual photo of the stained chromosomes). It is just an illustrative figure representing a karyogram/karyotype. To call this simply a karyotype because it would make your old editor happy is rather missing the point. Agricolae (talk) 23:13, 19 January 2012 (UTC)

Answer for above questions:

For some cells in a culture dish: "When two cells come within a certain distance of each other, they usually change direction." Why can they turn, go back, and move away?
There is cell fluid in the brain; it behaves as a hard or soft material under certain conditions, much as air seems hard inside a car tire. When it is hard within the framing ducts of the karyotype, it acts as a spiral rope along the double-helix filament of the genes, and it has a physical function.
The karyotype seems unrelated to this physical function, except that it uses the double-helix filament of the genes as framing. But it is related to the hereditary function of genes.
How else, except by a common vibration through the double-helix filament, can one explain the so-called clever response between relatives or friends?

( Talker; Q. Y. Zhang H. K. 香港. 張其澐--116.49.10.217 (talk) 08:28, 9 March 2012 (UTC))

ENCODE data makes some info. out-of-date

Some of the information in this article (e.g., statements of the type "97% of the human genome has unknown function") is out of date following the 5-Sep-2012 release of data from the ENCODE project. Most major newspapers around the world reported these results yesterday (5-Sep-2012). We need to update this article in many locations. When I find some time, I shall help (and add more references). --Thorwald (talk) 18:20, 6 September 2012 (UTC)

I have fixed some of the material that has been rendered obsolete, but the article still needs a paragraph on the ENCODE project and its findings, as this is a fundamental characterization of the genome, and entirely relevant to the article. That being said, there are also parts of the article that are showing their age, or seem to be given undue weight. The table of information content and entropy for each chromosome is completely opaque, and not something that gets significant coverage so as to merit such a table, or maybe I just don't understand what it is supposed to be showing. Is the memory equivalent in ASCII really that notable? Likewise, the section about the genome sequence and personal genomics seems to have been frozen in time about 2008. There have been over 1000 personal genomes completed, including Desmond Tutu and a frozen Inuit, and it is just no longer worth listing individuals. Likewise the concept that there is a 'Korean genome' is quite out of touch with recent population genomics (can't believe I just said that - recent population genomics, like 2008 represents ancient times, but in effect it does). Maybe someone with more time than I currently have can give these a look. Agricolae (talk) 19:54, 6 September 2012 (UTC)

Please be careful. Much of what ENCODE found was that much of the genome is transcribed; however, it is likely that much of this represents transcriptional noise. By their own estimates the actual percent that influences the phenotype of the organism is somewhere between 8 and 20%. Microtubules (talk) 16:21, 9 September 2012 (UTC)

Is there something specific that you think required more care? I tried to report it as it appeared in the secondary source cited. Agricolae (talk) 00:07, 10 September 2012 (UTC)

I have now extensively revised the non-coding section. As for the 20% figure, I will reread the papers. Ewan Birney, head of ENCODE, on his own blog states that they are confident that 8% of the human genome contains physiologically relevant sequences, and they project that this represents 50% of the total number of such elements. When you add protein-coding genes the total percent of the genome that influences phenotype should be ~20%. (http://genomeinformatician.blogspot.ca/2012/09/encode-my-own-thoughts.html) Microtubules (talk) 16:37, 13 September 2012 (UTC)

The 20% figure comes directly from a table in the Science article describing the ENCODE results (an independent reliable secondary source), which says "80% - functional portion of the human genome". That is pretty straightforward. What is not straightforward is what this number represents. Certainly, transcription is not necessarily an indication of function, while one could also question how to classify some functions, for example those of SINEs, which are transcribed and have a function, but only a selfish one hardly relevant to the cell. In general, though, I think we should be careful about being too precise in these numbers, as history shows them very much subject to change, and basing the article on a researcher's blog is problematic as well (although the Science summary mentions similar numbers to what you are drawing from Birney's blog). Agricolae (talk) 17:40, 13 September 2012 (UTC)

Please see Talk:ENCODE. The way the results were presented was quite controversial and has resulted in several articles. 86.121.137.227 (talk) 01:47, 17 September 2012 (UTC)

Stamatoyannopoulos mused they could expect 40% regulatory space ("It is thus not unreasonable to expect that 40% and perhaps more of the genome sequence encodes regulatory information—a number that would have been considered heretical at the outset of the ENCODE project") in a Genome Research review. Anyway, if we want that ~20% number we ought to cite the blog it comes from instead of just keeping a citation to a paper where it isn't mentioned. Might be better to give a range of guesses though. Narayanese (talk) 07:49, 21 September 2012 (UTC)
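
The extrapolation described in Microtubules' comment above can be written out as a small calculation. This is only a sketch of the arithmetic as stated in that comment (8% confidently identified, taken to be about 50% of all such elements); the function name and the 1.5% protein-coding share are assumptions added for illustration and are not taken from the ENCODE papers or the blog.

```python
def projected_functional_fraction(confident_fraction, share_found, coding_fraction=0.0):
    """Scale up the confidently identified fraction of the genome, then add coding sequence.

    confident_fraction: fraction of the genome confidently identified (0.08 in the comment)
    share_found:        share of all such elements this is assumed to represent (0.5)
    coding_fraction:    protein-coding share of the genome (hypothetical input)
    """
    if not 0 < share_found <= 1:
        raise ValueError("share_found must be in (0, 1]")
    return confident_fraction / share_found + coding_fraction


# Figures quoted in the discussion: 8% found, projected to be half of the total.
regulatory_only = projected_functional_fraction(0.08, 0.5)

# Assumed protein-coding share of 1.5% (illustrative value, not from the source).
with_coding = projected_functional_fraction(0.08, 0.5, coding_fraction=0.015)

print(f"regulatory only: {regulatory_only:.1%}")
print(f"with assumed coding share: {with_coding:.1%}")
```

With these inputs the regulatory part alone extrapolates to 16%; how close the total comes to the ~20% quoted depends on the coding share one assumes, which is one reason a range of estimates may be safer than a single number.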

Plans for future edits of Human genome article

I've started work on what I hope will be a fairly thorough rewrite of the article, with the goals of bringing it up to date, organizing it more clearly, and improving readability. Comments and assistance are most welcome! --John Mackenzie Burke (talk) 14:17, 21 September 2012 (UTC)

Introduce myself:

  My article on the human genome was translated from the Chinese original into English with the help of a translation company in Hong Kong. I am a civil, building, structural, and construction engineer in Hong Kong, so improving the article above would be difficult for me. (I am now living in Boston, MA, USA, and will return to Hong Kong in mid-November.) Please refer to the Chinese article. I apologize that I cannot do more to improve the article.
  My main work is structural calculation and design, construction, and planning. My engineering licenses:

1979, China; 1990 or 1991, Hong Kong.

  I have been interested in biology for over forty years. Structural calculation is my regular work, and I pursue this interest in my spare time.

(Talker: Q. Y. Zhang; 香港張其澐--173.76.134.128 (talk) 14:14, 29 September 2012 (UTC))

Length in table

I have two issues with the length data in the chromosome table. First, it is unclear if this information is found in the original source or if the calculation has been done by the Wikipedia editor compiling it. If the latter, then it is Original Research. If the former, it should perhaps be made more clear what the source is. The second issue relates to the actual number used in the calculation. The table of DNA dimensions (Comparison Geometries of the Most Common DNA Forms) given at Z-DNA, A-DNA and Nucleic acid double helix give the per-base distance for B-DNA as 3.32 Å in two of the tables (Z & B) and 3.4 Å in the third, a discrepancy that has been around since 2008. We really shouldn't be doing our own analysis, but if we do, I don't know that 3.4 is the most accurate number to use. Agricolae (talk) 18:04, 29 September 2012 (UTC)

While we are at it, I would suggest that tRNAs deserve their own column, rather than being relegated to 'Misc ncRNA' . Agricolae (talk) 18:09, 29 September 2012 (UTC)

Thanks for those thoughts and suggestions. (1) All info except length of chromosomes is taken directly from the linked EBI database entries, and was not calculated. (2) The actual value of Å/bp will vary according to the sequence, ionic conditions, and locally-bound proteins, so there is no single correct value. I used the standard textbook value of 3.4 angstroms/bp, and I've changed "calculated" to "estimated" in the article to reflect the fact that these are not measured values and may be off by a little. (3) Agree on tRNAs, but have not found a source where those are published, so I'd have to commit the sin of original research to break them out. John Mackenzie Burke (talk) 18:46, 7 December 2012 (UTC)

About point 2, I know all about the actual science - my comment was about consistency and reflecting the most common given estimate (which my textbooks are reporting differently than yours). In this table, we need to use the 3.4 of the source, but for the other tables, they should at least have the same number as each other. We should figure out the 'best' number to use, and use it everywhere rather than giving different numbers in the same table on different pages. Too bad about the tRNAs. It does say a lot about how much things have changed as it used to be taught as rRNA, mRNA & tRNA, while all the little ones known even then, snRNAs, gRNAs, etc., didn't even get mentioned and now tRNA is just another miscRNA. It would be great if something usable could be found (I don't remember seeing anything on tRNAs in the ENCODE papers, but I wasn't looking either). Agricolae (talk) 19:15, 7 December 2012 (UTC)
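
For anyone wanting to see how much the choice of per-base rise matters for the table, the length estimate discussed above is just base pairs multiplied by Å/bp. The sketch below compares the two values mentioned in this thread (3.32 and 3.4 Å/bp). The function name and the 249,000,000 bp example size are assumptions chosen for illustration, not figures taken from the article's table.

```python
ANGSTROM_IN_METERS = 1e-10  # 1 Å = 1e-10 m


def estimated_length_cm(base_pairs, rise_angstrom_per_bp=3.4):
    """Estimate the contour length of a DNA molecule in centimeters.

    Assumes a single uniform rise per base pair; as noted in the thread,
    the real value varies with sequence, ionic conditions, and bound proteins.
    """
    return base_pairs * rise_angstrom_per_bp * ANGSTROM_IN_METERS * 100.0


# Hypothetical chromosome size used only to illustrate the comparison.
EXAMPLE_BP = 249_000_000

for rise in (3.32, 3.4):
    length = estimated_length_cm(EXAMPLE_BP, rise)
    print(f"{rise} Å/bp -> {length:.2f} cm")

# Relative difference between the two estimates (independent of chromosome size).
relative_difference = (3.4 - 3.32) / 3.4
print(f"relative difference: {relative_difference:.1%}")
```

The two values differ by a little over 2%, so whichever is chosen, labelling the column as "estimated" rather than "calculated" seems appropriate.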

Repetitive DNA section contains major errors

I have signed up for an account to bring the mistakes in this article to the attention of editors. Apparently 'About 8%' of the human genome is repetitive DNA. This is completely false. It is almost certainly upwards of 50% depending upon the definition of repetitive. Here is a recent estimate: http://www.plosgenetics.org/article/info%3Adoi%2F10.1371%2Fjournal.pgen.1002384

Please can someone more familiar with the citation procedures here amend the statement?

Repeats may also be kilobases in length, in disagreement with the current article. — Preceding unsigned comment added by Bedeabc (talk • contribs) 22:46, 17 November 2012 (UTC)