Wikipedia talk:WikiProject Molecular Biology/Gene Wiki/Archive 4
This is an archive of past discussions about Wikipedia:WikiProject Molecular Biology. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Split out Interactions
Discussion moved from Talk:Mammalian target of rapamycin (see also Talk:BRCA1#Interactions):
The Interactions section has a huge number of references. Can we split the list (and references) into a separate mTOR interactions article or similar ? Rod57 (talk) 18:26, 6 December 2010 (UTC)
- The idea behind the protein interaction section of Gene Wiki articles is to cross-link articles about proteins that are known to interact with each other. These sections were added in part to reduce the number of orphan Gene Wiki articles. There are a large number of Gene Wiki articles (~10,000) and a significant percentage of these contain interaction sections (so there is a total of several thousand interaction sections). Hence it would be a lot of work to split these out. In addition, this would create a large number of stubby articles, and each one of these would be essentially a WP:ORPHAN (each only linked from one article). Furthermore, these proposed stubs would be magnets for WP:AFD discussions and proposals to WP:MERGE the stubs back into the parent protein article.
- The problem is not the text in these interaction sections, which is normally quite short, but rather the number of citations. One way to reduce the number of citations is to keep only the most recent one for each pairwise interaction (in the hope that the most recent citation will in turn cite the earlier papers). Another way might be to instead provide a single link to the database from which the list of interactions (and citations) was obtained. For example, a citation for the mTOR interactions might look like:
- "mTOR protein interactors". Human Protein Reference Database. Johns Hopkins University and the Institute of Bioinformatics. Retrieved 2010-12-06.
- One advantage of this second approach is that the reader is provided a link to the most up-to-date information. Thoughts? Boghog (talk) 22:23, 6 December 2010 (UTC)
- Yes, the number of citations is definitely an issue. But I am personally against special pages only for interactions. Scientific articles usually provide at most 2-3 citations per statement since the article is limited in size. This could be a rule of thumb for us as well: provide 2-3 citations (either very classical or new ones) per interaction partner, as their purpose is to inform the reader and provide a link to evidence rather than to collect all knowledge in one place. --Mashin6 (talk) 01:06, 7 December 2010 (UTC)
- I agree that the number of citations should be cut down to 2 to 3 per protein (and that the interactions should be kept in the main page). There are also some papers that cover multiple interaction partners; keeping those would also cut down the number of citations. Citing reviews would also be helpful, as they should in turn link to the original papers. MichaK (talk) 10:28, 7 December 2010 (UTC)
- I agree with 2-3 refs per protein on the main page. How about we fix the most egregious offenders by hand now and then do a mass programmatic fix later on. (Still trying to find a willing student to do some of these housekeeping projects!). I've added it to the ideas page so we don't forget about it... (diff) Cheers, AndrewGNF (talk) 13:59, 7 December 2010 (UTC)
- Fixing these by hand is a real pain even if we are only talking about a few examples. The most egregious offenders are by definition very long lists of citations. It is very tedious to go through a long list of citations, sort them by date, identify if any are review articles, and then make a decision about which to retain. Also, regardless of what is done with the long citation list, I think it would be very useful to add a link to the Human Protein Reference Database (HPRD) entry. Finally, why not replace all the interaction citations with a single link to the appropriate HPRD entry? This link would provide the full list of interacting proteins and citations. Furthermore, the HPRD is continuously updated while there currently is no mechanism in place to update the Gene Wiki interaction sections. This solution would also be much easier to implement and would reduce or even eliminate the need to maintain the interaction lists. Boghog (talk) 06:06, 8 December 2010 (UTC)
- Maybe you are right, since the HPRD list also contains links to PubMed. But for the interaction partners not listed there, we have to provide citations and restrict them to a maximum of 3. --Mashin6 (talk) 00:08, 9 December 2010 (UTC)
- Good point about interaction partners not listed there that have been added by human editors. These citations (maximum 3 per interaction) should be retained. Boghog (talk) 04:23, 9 December 2010 (UTC)
- Keep in mind that HPRD is not the best source. The most complete list of experimentally supported human protein-protein interactions can be found here. Hodja Nasreddin (talk) 23:13, 12 March 2011 (UTC)
- I have not looked deeply into which interaction database is the best to use, but since CPDB is a metadatabase that includes data from HPRD, that would seem to favor CPDB. On the other hand, CPDB does not contain the latest data that is stored in HPRD and some of the back links from CPDB to HPRD do not appear to be working. Similar to the issue that we had with links from the Pfam infobox to PDB databases, to make this work, we need the appropriate query links to CPDB and/or HPRD. These are the closest that I have been able to come up with:
- CPDB: entrez:2475 (close, but one still needs to select the exact hit from a list)
- HPRD: hprd_id:03134 (not clear how to map entrez_id → hprd_id)
- Unless these databases already have the appropriate hooks or are willing to implement them, it will be difficult to link to them. Boghog (talk) 10:17, 13 March 2011 (UTC)
- Right. When I looked six months ago, they did not provide convenient links. Another place to look might be Pathway Commons. Hodja Nasreddin (talk) 03:24, 14 March 2011 (UTC)
I think the actual problem is the poor presentation of interactions as tediously long lists, and propose converting them to prose. See discussion at: talk:Converting interactions-sections to prose. Mikael Häggström (talk) 07:00, 15 May 2011 (UTC)
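The "mass programmatic fix" mentioned in this thread could start from something like the following. This is only a minimal sketch, assuming each interaction partner already comes with a list of citations carrying a year and a review flag; the `Citation` class, the function names and the ranking rule (reviews first, then newest first) are illustrative assumptions, not how ProteinBoxBot or any existing tool works.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Citation:
    pmid: str               # identifier of the paper (illustrative values only)
    year: int               # publication year
    is_review: bool = False


def trim_citations(citations, limit=3):
    """Keep at most `limit` citations: reviews first, then newest first."""
    ranked = sorted(citations, key=lambda c: (not c.is_review, -c.year))
    return ranked[:limit]


def trim_interactions(interactions, limit=3):
    """Apply the same cap to every interaction partner in a mapping."""
    return {
        partner: trim_citations(cites, limit)
        for partner, cites in interactions.items()
    }
```

Citations added by human editors for partners that are missing from the external database would go through the same cap, in line with the "maximum 3 per interaction" point made above.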
en-dash
This template causes this to appear:
- 46.66 - 46.73 Mb
But obviously it should instead have said this:
- 46.66 – 46.73 Mb
with an en-dash rather than a hyphen (see WP:MOS). As far as I can tell I can't edit this thing. Could whoever tends to this thing fix the punctuation error? Michael Hardy (talk) 23:15, 12 February 2011 (UTC)
- Yes, you are right. I have edited the sandbox version of the {{GNF_Protein_box}} template to replace the hyphen with an en-dash. You can see the result in the test cases (current version to the left, sandbox version to the right). If this looks OK, I can make a request for an administrator to update the production version of the template. Thanks for the heads-up. Cheers. Boghog (talk) 12:48, 13 February 2011 (UTC)
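For text that is generated outside the template, the same punctuation fix can be sketched as a small substitution. This is an illustration only: the function name and the regular expression are assumptions, and the spacing around the en-dash simply follows the example given above.

```python
import re

# Two numbers (optionally with a decimal part) separated by a hyphen.
_RANGE = re.compile(r"(\d+(?:\.\d+)?)\s*-\s*(\d+(?:\.\d+)?)")


def hyphen_to_en_dash(text):
    """Replace the hyphen in a numeric range with a spaced en-dash."""
    return _RANGE.sub("\\1 \u2013 \\2", text)
```

The pattern is deliberately naive and would also rewrite ISO dates such as 2010-12-06, so it should only be applied to the chromosome location field, not to whole pages.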
Complementarity determining region
Could somebody sort out Complementarity determining region? Part of it was deleted some time ago (including refs), then it was expanded a bit, and I just can't sort out what is constructive and what not. Thanks, ἀνυπόδητος (talk) 10:59, 13 February 2011 (UTC)
- Good grief! That is a pretty confusing edit history. Immunology is not my field, but I made an attempt to restore some of the previous material that was deleted without explanation. I hope that this is an improvement. Cheers. Boghog (talk) 12:34, 13 February 2011 (UTC)
- Thanks for the help. I added a bit more and hope the CDR locations are more or less clear now. Still can't find out what/where a hypervariable region is in that context. Is it part of CDR1, or of all three, or is it a synonym for CDR as an old version seems to suggest? --ἀνυπόδητος (talk) 14:31, 13 February 2011 (UTC)
- According to this (see especially figure 3), the hypervariable region is synonymous with the three CDR locations (CDR1 + CDR2 + CDR3), as suggested in the old version. Boghog (talk) 16:29, 13 February 2011 (UTC)
- Even better sources here and here that state that the complementarity determining region is synonymous with the hypervariable region. Boghog (talk) 16:33, 13 February 2011 (UTC)
- Thanks --ἀνυπόδητος (talk) 17:52, 13 February 2011 (UTC)
Gene Duplication
What is the best way to handle cases where one is referring to several isoforms of a gene? This came up in the FIG4 page where a link to yeast Atg18 was pointed to one of four mammalian WIPI proteins. For protein families, should a new page be generated? Davebridges (talk) 22:57, 8 March 2011 (UTC)
- In my opinion, discussion about isoforms derived from the same gene should generally be contained within the same article unless there is something particularly notable about a specific isoform, in which case a separate article about that isoform may be warranted. Isoforms derived from different genes should be covered in separate articles. The scope of Gene Wiki articles is the human gene/protein (including all splice variants derived from that gene) as well as orthologs (as listed in HomoloGene) that exist in other species. If there are paralogs in humans (and by extension other species), then a family article (see for example dopamine receptor) would be appropriate. Does this sound reasonable? Boghog (talk) 05:02, 9 March 2011 (UTC)
PREP (gene) is an article that is completely bot-edited; it seems equivalent to, but different from, Prolyl endopeptidase. Can someone have a look at those two articles? --Havang(nl) (talk) 10:36, 12 March 2011 (UTC)
- Done Seeing as there is only one human enzyme with this enzymatic activity, I went ahead and merged the two articles. Thanks for the heads up. Cheers. Boghog (talk) 16:51, 12 March 2011 (UTC)
APRIL
What is "A Proliferation Inducing Ligand (APRIL)" mentioned in B-cell activating factor and Belimumab, among other pages? HGNC knows two proteins with the synonym APRIL: ANP32B and TNFSF13. I suspect the latter, but could someone (i. e. most likely Boghog) confirm this? Thanks, ἀνυπόδητος (talk) 19:53, 13 March 2011 (UTC)
- According to Q92688, an alternative name for the protein encoded by the ANP32B gene is "acidic protein rich in leucines" (APRIL) whereas O75888 lists "a proliferation-inducing ligand" (APRIL) as an alternate name for TNFSF13. Hence as you suspect, the correct link appears to be TNFSF13. Boghog (talk) 20:13, 13 March 2011 (UTC)
- Thanks, as ever! --ἀνυπόδητος (talk) 20:38, 13 March 2011 (UTC)
File:PBB Protein MRPL11 image.jpg listed for deletion
A file that you uploaded or altered, File:PBB Protein MRPL11 image.jpg, has been listed at Wikipedia:Files for deletion. Please see the discussion to see why this is (you may have to search for the title of the image to find its entry), if you are interested in it not being deleted. Thank you. Skier Dude (talk) 03:04, 28 April 2011 (UTC)
- Replied at the discussion page. Boghog (talk) 03:55, 28 April 2011 (UTC)
Style guide
A draft MCB Style Guide for gene/protein articles has been written. Since this style guide in principle applies to all Gene Wiki articles, I thought I should mention it here. Contributions and comments are welcome. Boghog (talk) 20:23, 19 May 2011 (UTC)
BioGPS GeneWikiGenerator Bug
There's a small escaping bug in the GeneWikiGenerator: "<" is shown as "& l t ;" (sans spaces), e.g. currently http://biogps.gnf.org/GeneWikiGenerator/#goto=genereport&id=29911 . MichaK (talk) 15:54, 31 May 2011 (UTC)
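The symptom described here is what double escaping typically looks like. The cause inside the GeneWikiGenerator is not stated in this thread, so the following is only a sketch of one plausible mechanism, using Python's standard `html` module rather than the generator's own code.

```python
import html

raw = "a < b"

# Escaping once is correct: a browser renders "&lt;" as "<".
once = html.escape(raw)

# Escaping an already-escaped string also escapes the ampersand,
# so the browser now renders the literal text "&lt;".
twice = html.escape(once)

# What a browser would display for each version.
shown_once = html.unescape(once)
shown_twice = html.unescape(twice)
```

Here `shown_once` is the original text again, while `shown_twice` still contains the visible entity, which matches the reported behaviour.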
Lead sentence
Sorry to dig up an old issue, but I recently went through a bunch of gene/protein articles and changed the wording of all the lead sentences (for example), not knowing that there was an agreed style for that first sentence. I think the problem was that it's not clear from the lead:
- <recommended UniProt name> is a protein that in humans is encoded by the <approved HUGO gene symbol> gene.
that the article is meant to address both the gene and the protein.
I think it would be a good idea to tweak this slightly to prevent another editor from coming through and making the same mistake. I was thinking maybe a hidden notice at the top of the page, like <!-- The lead sentence should be in the format: <recommended UniProt name> is a protein that in humans is encoded by the <approved HUGO gene symbol> gene. --> or adding a sentence at the top similar to a disambiguation notice, along the lines of "This article addresses both <protein> and <gene>." I don't know, though. Thoughts? Kerowyn Leave a note 06:33, 24 June 2011 (UTC)
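If the hidden notice were added by a script rather than by hand, both the notice and the lead sentence could be generated from one template string so that they cannot drift apart. This is a minimal sketch; the function names are hypothetical and the protein name and gene symbol in the usage note are made-up placeholders.

```python
LEAD_TEMPLATE = "{protein} is a protein that in humans is encoded by the {symbol} gene."


def lead_sentence(protein, symbol):
    """Fill the agreed lead-sentence format for one article."""
    return LEAD_TEMPLATE.format(protein=protein, symbol=symbol)


def hidden_notice():
    """Build the HTML comment that reminds editors of the format."""
    pattern = LEAD_TEMPLATE.format(
        protein="<recommended UniProt name>",
        symbol="<approved HUGO gene symbol>",
    )
    return "<!-- The lead sentence should be in the format: " + pattern + " -->"
```

For instance, `lead_sentence("Example protein 1", "EXMPL1")` gives "Example protein 1 is a protein that in humans is encoded by the EXMPL1 gene."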
Main subunit of cytochrome c oxidase
Main subunit of cytochrome c oxidase states that COX1 and MT-CO1 contain this peptide. Does COX1 refer to PTGS1 here? Furthermore, COI (which is also a synonym of MT-CO1) links to a disambig and CO1 (??) to carbon monoxide which is rather unlikely to contain a polypeptide. Could someone clarify? Thanks --ἀνυπόδητος (talk) 06:45, 20 July 2011 (UTC)
- The "list of human proteins containing this domain" in Main subunit of cytochrome c oxidase were all synonyms of the same gene symbol and several as you point out were linked to inappropriate pages. I have fixed the problem in this edit. Part of the confusion is that the same symbol COX1 refers to two completely different genes (MT-CO1 and PTGS1). Cheers. Boghog (talk) 10:57, 20 July 2011 (UTC)
- Thanks --ἀνυπόδητος (talk) 12:44, 20 July 2011 (UTC)
Signaltransduction-stub
(cross-post from the semi-active WP:WikiProject Cell Signaling) Any objections if I propose deletion of the signal transduction stub type? Category:Signal transduction stubs contains a rather arbitrary collection of 13 pages, most of which could go into Category:Receptor stubs or Category:Transmembrane receptor stubs. Thanks, ἀνυπόδητος (talk) 17:15, 12 September 2011 (UTC)
- I support deletion. There are a couple of signal transduction core articles that are long past being stubs and a virtually endless list of articles that theoretically could be considered signal transduction stubs. Therefore this category seems neither informative nor useful. Cheers. Boghog (talk) 19:24, 12 September 2011 (UTC)
- Nominated for deletion. --ἀνυπόδητος (talk) 07:38, 13 September 2011 (UTC)
{{PBB}} and the edit links it produces
It's come to my attention that ProteinBoxBot, which maintains this project's infoboxes, uses a curious template known as {{PBB}} to add them to articles. For some reason this template insists on adding a wrapping table around the infobox and sticking an [edit] link above it, which not only pushes the infobox down articles but is also completely unnecessary as the infobox code supports an edit link anyway. ProteinBoxBot should instead be adding the name = attribute to the infobox pages themselves. Adding | name = PBB/9353 to {{PBB/9353}}, for instance, results in the output shown at {{PBB/testcases}} (an unobtrusive link at the bottom of the template).
user talk:ProteinBoxBot redirects here, so if the ProteinBoxBot maintainers need any further details then please point them at my talk page. This is a trivial change which should positively impact the appearance and accessibility of thousands of articles.
Chris Cunningham (user:thumperward) - talk 19:58, 14 September 2011 (UTC)
- I also think this change is a good idea. The present way these boxes are displayed places an [edit] link at the upper right hand side of the page right below the [edit] link for the lead. I think this has confused some editors since I have seen a number of cases where text has been added to the template where text should have been added to the lead. Moving the template edit link to inside the box would reduce the confusion. My own preference would be to place the link inside the upper right hand side of the infobox as is done in Lisinopril. This makes the edit button somewhat easier to find and more closely follows the convention for section edit links (links placed above the section). Boghog (talk) 04:22, 15 September 2011 (UTC)
- One additional suggestion: in the {{GNF Protein box}} template, the name parameter should be set to "| name = {{Hs_EntrezGene}}" instead of "| name = {{PAGENAME}}" since these special purpose templates are currently named after the Hs_EntrezGene ID. Furthermore, using the Hs_EntrezGene ID should require less maintenance. The page name of the parent article that transcludes the special purpose template might change over time, but the Hs_EntrezGene ID will never change. Boghog (talk) 05:51, 15 September 2011 (UTC)
- There's a much stronger convention for infobox templates, where separately editable, to place the navbar at the bottom of the template. I'm averse to changing this based on what would appear to be an entirely undiscussed (and indeed, undeclared) change to the drugbox template, which hasn't been updated to use the {{infobox}} system yet like this template has. As for the {{{name}}} parameter, I think you misunderstand its purpose: it is not the same as {{{Name}}} (note the change in capitalisation) which gives the subject title, but in fact needs to refer to the current page title in order for the navbox magic to link to the correct view / talk / edit pages. Anyway, the first thing to do is to get ProteinBoxBot to add the links in question: once that's done, I'll update {{PBB}} to take them into account. I've pinged the bot's owners with a link to this discussion. Chris Cunningham (user:thumperward) - talk 09:27, 15 September 2011 (UTC)
- I agree with the change to move the edit link within the infobox itself, and also to move it to the bottom. Now that we have real infrastructure to maintain the infoboxes (we promise, Boghog!), it should be quite rare for the average user to want to edit the infobox. So the more subtle location sounds great to me. As to where to put the PBB/9353 field, I'm a little hesitant to have both 'Name' and 'name' fields that do very different things. What if we put it under 'path' or 'PageLocation', or something similar? Boghog, I'm still not 100% sure I understand your suggestion regarding PAGENAME -- can you clarify? Cheers, AndrewGNF (talk) 16:38, 15 September 2011 (UTC)
- As far as Boghog's suggestion, when I put "| name = {{Hs_EntrezGene}}" on the template it works identically to the solution Chris suggested. I assume using this gives flexibility in the future if the page name changes (which seems unlikely, since the page name for the template is based off the Entrez gene id?). I also agree that two identical (save for capitalization) fields would be confusing. If the field has to be "name", would we be okay with putting it at the bottom of the template, preceded by an HTML comment saying something like "field used for infobox navbar"? Pleiotrope (talk) 18:48, 15 September 2011 (UTC)
- (edit conflict) The proposal to place editing links at the top of the {{Drugbox}} was made here. Widespread implementation was put on hold not because of the location of the link but rather because there were reservations about creating special purpose drugbox templates that would be transcluded back into drug articles. The proposal to use Hs_EntrezGene ID was to eliminate the need for the {{PBB}} template. But if the PAGENAME parameter is needed to link back from the special purpose template to the parent article that transcludes the special purpose template, then I take back my suggestion. I must admit I do not completely understand how linking in the {{infobox}} system works. Cheers. Boghog (talk) 19:11, 15 September 2011 (UTC)
- OK, I guess I was really mixed up. The geneid (in the PBB template) = Hs_EntrezGene (in the special purpose template). In the PBB template, it doesn't matter if the parameter is called geneid or Hs_EntrezGene. My initial confusion was caused by trying to compare how Lisinopril links to its special purpose template using the PAGENAME parameter whereas the locations of the special purpose templates for GNF_Protein_boxes are based on the geneid. Hence I support Chris' initial suggestion. I apologize for the confusion. Boghog (talk) 19:36, 15 September 2011 (UTC)
AndrewGNF: I agree that having both {{{name}}} and {{{Name}}} is confusing. If you want to go with {{{path}}} for the parameter then I'll update {{GNF_Protein_box}} to use that for the navbar links instead. Chris Cunningham (user:thumperward) - talk 06:10, 16 September 2011 (UTC)
- Great! No promises on when, but Pleiotrope will be the one handling it... We'll let you know how it goes... Cheers, AndrewGNF (talk) 16:34, 16 September 2011 (UTC)
- I've gone ahead and modified the bot to add the {{{path}}} parameter on its next run, which should be sometime in the next few days. I'll let Chris know so he can change {{GNF_Protein_box}} at that time. Pleiotrope (talk) 18:08, 16 September 2011 (UTC)
- I see that ProteinBoxBot is now adding {{{path}}} parameters and these are working fine, so I've updated {{PBB}}. See the new, cleaner look on one of the new pages such as TTLL3. Thanks, folks. Chris Cunningham (user:thumperward) - talk 11:41, 23 September 2011 (UTC)
Albondin
Albondin (gp60) is a sub-stub that might profit from a gene infobox. Or there already might be a gene article to which it could be merged. Thanks, as ever, ἀνυπόδητος (talk) 08:08, 15 September 2011 (UTC)
- Albondin (which might be related to megalin) is a rather obscure protein whose gene has apparently not been cloned. What gene or genes encode this receptor is far from clear. Hence at the present time, I don't think it will be possible to add an infobox to this article. Sorry. Boghog (talk) 19:48, 15 September 2011 (UTC)
- No problem. Thanks for looking into it. Cheers, ἀνυπόδητος (talk) 19:51, 15 September 2011 (UTC)
Discussion
Hello, I dropped a message with question here : http://en.wikipedia.org/wiki/Template_talk:GNF_Protein_box#IDs. Thanks :) Anthere (talk) 07:12, 20 September 2011 (UTC)
GeneWikiGenerator down
The GeneWikiGenerator is down (504 Gateway Time-out). There has been a recent flurry of interest. Also is there any chance that the bugs first described here could be fixed? Thanks. Boghog (talk) 17:15, 23 September 2011 (UTC)
- Hi Boghog, it's back up now and better than ever. I've done a complete re-write so the encoding bugs and other issues shouldn't be a problem, and also incorporated the suggestions you made here. Hopefully you find it easier to use. As it's a re-write, I'm aware that there are a brand-new set of bugs and I'm actively fixing the ones I can find. I would appreciate it if you let me know when you find any. Please let me know if you have any questions or suggestions. -- Pleiotrope (talk) 21:42, 28 September 2011 (UTC)
- Hi Pleiotrope. Fantastic! The new GeneWikiGenerator is a big improvement and it will make creating new pages much easier. Cheers. Boghog (talk) 04:09, 30 September 2011 (UTC)
Semantic link proposal over at the Village Pump
We're soliciting feedback for the use of a new template that should be pretty useful on GeneWiki pages over at the village pump and we'd like your opinions. The example we use there is targeted to a broader audience, but we have some GeneWiki specific examples and motivation here: WP:Gene Wiki/SWL proposal. Thanks, Pleiotrope (talk) 17:15, 19 October 2011 (UTC)
PBB's text could benefit from some wikilinks
Since wikipedia is supposed to be useful to non-specialists, I went and added a whole bunch of wikilinks to DLX1: humans homeobox transcription factor Drosophila TGF-β craniofacial differentiation neurons forebrain chromosome 2 Alternatively spliced isoforms. Then I noticed that PBB updates the text and my wikilinks could get deleted pretty soon.
Is it worth adding some wikilinking ability to PBB so it can do this automatically to (the first appearance of) biology buzzwords? (Then you could get rid of the "require_manual_inspection" flag I added.)
I'm not quite sure what The Right Thing is here. Could PBB at least be changed to preserve the wikilink if the linked word is still in the replacement text? 71.41.210.146 (talk) 16:56, 7 November 2011 (UTC)
- Thanks for your additions to the DLX1 article. Don't worry about your edits being overwritten by the PBB. The original plan was that the bot was going to update the summary text, but this plan was never implemented (see WP:Gene_Wiki/Discussion/Archive3#PBB_Summary). Concerning bot additions of wikilinks, this I think would be incredibly difficult. The first issue is deciding what terms should be linked. The second is finding an appropriate target for the link. This is often far from trivial. I think this task really requires a human editor. Boghog (talk) 17:14, 7 November 2011 (UTC)
- Well, I was thinking, here's a really simple technique: when updating some text from old to new, for each wikilink in the original, find the first instance of the same text in the new, and install the matching wikilink. (Thus, each summary would have to be hand-edited once to insert the links, although you could have the bot make a guess if there is no old text.) If the linked text disappears, you can do one of:
- Delete the wikilink
- Place a comment somewhere with a few words on either side of the old link text and put the page into a category for manual review
- Hold the entire update for manual review.
- Option 2 looks fairly practical. In the meantime, I'll get rid of the PBBControl template.
- Thanks for the response! 71.41.210.146 (talk) 03:51, 8 November 2011 (UTC)
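The technique proposed above, combined with option 2, can be sketched as follows. This is an illustration under stated assumptions, not ProteinBoxBot code: the function names, the simplified wikilink pattern (plain `[[target]]` and `[[target|label]]` links only) and the format of the review notes are all made up for the example.

```python
import re

# Matches [[target]] and [[target|label]]; nested or templated links are ignored.
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]+))?\]\]")


def _find_unlinked(text, label):
    """Return the index of the first occurrence of label outside any wikilink."""
    spans = [m.span() for m in WIKILINK.finditer(text)]
    start = text.find(label)
    while start != -1:
        end = start + len(label)
        if not any(s < end and start < e for s, e in spans):
            return start
        start = text.find(label, start + 1)
    return None


def carry_over_wikilinks(old_wikitext, new_text, context_words=3):
    """Re-install the old wikilinks in new_text; return (text, review_notes)."""
    result = new_text
    notes = []
    for match in WIKILINK.finditer(old_wikitext):
        label = match.group(2) or match.group(1)
        pos = _find_unlinked(result, label)
        if pos is None:
            # Option 2: record a few words either side for manual review.
            before = old_wikitext[:match.start()].split()[-context_words:]
            after = old_wikitext[match.end():].split()[:context_words]
            notes.append(" ".join(before + [match.group(0)] + after))
        else:
            result = result[:pos] + match.group(0) + result[pos + len(label):]
    return result, notes
```

A non-empty `notes` list would be the signal to leave a comment on the page and put it into a manual-review category. Choosing link targets for terms that were never linked by hand remains a job for a human editor, as noted earlier in the thread.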
Speedy deletion nomination of Template:PBB/HP1BP3
A tag has been placed on Template:PBB/HP1BP3, requesting that it be speedily deleted from Wikipedia. This has been done for the following reason:
Under the criteria for speedy deletion, articles that do not meet basic Wikipedia criteria may be deleted at any time.
If you think that this notice was placed here in error, contest the deletion by clicking on the button labelled "Click here to contest this speedy deletion". Doing so will take you to the talk page where you will find a pre-formatted place for you to explain why you believe the page should not be deleted. You can also visit the page's talk page directly to give your reasons, but be aware that once tagged for speedy deletion, if the page meets the criterion, it may be deleted without delay. Please do not remove the speedy deletion tag yourself, but don't hesitate to add information to the page that would render it more in conformance with Wikipedia's policies and guidelines. If the page is deleted, you can contact one of these administrators to request that the administrator userfy the page or email a copy to you. Bulwersator (talk) 08:40, 19 December 2011 (UTC)
Creation request: Phosphacan and NG2 (protein)
Linked from / mentioned in a number of articles [1] [2]. NG2 is about a Puerto Rican Salsa duo which is probably unrelated. Anyone willing to write a bit about these? --ἀνυπόδητος (talk) 14:01, 20 December 2011 (UTC)
- Articles about both of these genes already exist, but the aliases you list were not linked. I have added a redirect to PTPRZ1 (Phosphacan) and an other uses banner linking to CSPG4 article. Boghog (talk) 07:09, 21 December 2011 (UTC)
- Thanks. Where can I look up such things? A search on HGNC didn't yield any results for phosphacan, and the UniProt results are rather ambiguous. --ἀνυπόδητος (talk) 08:22, 21 December 2011 (UTC)
- As you mentioned, the first thing I check is HGNC and UniProt. If that doesn't work, I fall back to Google. For example, googling NG2+proteoglycan+gene led me to the GeneCards entry for CSPG4. For Phosphacan, I used UniProt. I agree that the search results are a little ambiguous, but in both rat and mouse (but not human), phosphacan is listed as an alias of PTPRZ1. So it appears phosphacan is a fairly obscure name that someone used to name the rodent version of PTPRZ1. Boghog (talk) 08:40, 21 December 2011 (UTC)
Possibly unfree File:PBB Protein GSTA2 image.jpg
A file that you uploaded or altered, File:PBB Protein GSTA2 image.jpg, has been listed at Wikipedia:Possibly unfree files because its copyright status is unclear or disputed. If the file's copyright status cannot be verified, it may be deleted. You may find more information on the file description page. You are welcome to add comments to its entry at the discussion if you are interested in it not being deleted. Thank you. Stefan2 (talk) 23:58, 16 January 2012 (UTC)
Possibly unfree File:PBB Protein GSTA1 image.jpg
A file that you uploaded or altered, File:PBB Protein GSTA1 image.jpg, has been listed at Wikipedia:Possibly unfree files because its copyright status is unclear or disputed. If the file's copyright status cannot be verified, it may be deleted. You may find more information on the file description page. You are welcome to add comments to its entry at the discussion if you are interested in it not being deleted. Thank you. Stefan2 (talk) 23:58, 16 January 2012 (UTC)
Model organisms
Further to the suggestion by Boghog that a discussion on my talk page (see here) would benefit from wider input, I'm bringing it here. In summary, I have started on a project to populate as many of the Gene Wiki pages as possible with "functional" information gleaned from model organisms. I guess the wider goal is to bring phenotypes to the Gene Wiki in as standardized a manner as possible. I think this is important, because much of the content we have on these genes is from various model organisms, and currently the information on the pages is a bit of a mishmash of references to mouse and human, and occasionally fly and yeast, orthologues. Orthologous genes do not necessarily share all the same characteristics between species, so there is a potential to mislead readers unless we are careful with these distinctions, and generally we haven't been.
So I have been creating "model organism" sections where that information, when present, can be sectioned off from the information on human genes. I have also been including new content pulled from the set of standardized mouse KO lines, generated by the International Knockout Mouse Consortium, and where available the standardized phenotypic analysis carried out on these animals by the various international phenotyping centres (accompanied by a collapsed summary table). A nice example of this is SLX4, though there are many more genes with much less interesting phenotypes (e.g. Ciliary neurotrophic factor receptor). There are currently around 500 genes with this information available in the mouse, and I plan to add them all to their Gene Wiki pages, and many more should be generated over the next few years. I should point out that I work at one of the phenotyping centres (The Wellcome Trust Sanger Institute). Though I am not directly involved in the phenotyping project, I have been able to get the support of their informatics team. So I am starting with their data because I can generate it easily in a wiki-accessible format. It is my intention, however, to move on to the lines from the other centres (and try to convince them to provide me with Wiki-friendly data dumps).
The bigger picture, of course, is to try to encourage the contribution of other model organism information for each gene, thereby increasing the knowledge base on Wikipedia. I'm content to continue adding these by hand over the coming months (and probably years) as I think there is value in a human parsing the information to identify the model-organism-derived content, and then writing the section in a way that suits the article. However, there is also scope - as Boghog suggests - for bot assistance or merging some content to the infobox. I'm happy to defer to the consensus on that and any suggestions on how to proceed. But if there are no major objections to this, in the meantime I will continue to work on these one at a time. Rockpocket 00:33, 27 January 2012 (UTC)
- Thanks Rockpocket for bringing up the issue here. I have left request-for-comment messages on several other WikiProjects redirecting to this discussion so hopefully we will get wider input. My opinion is that model organism data is very valuable since it gives clues as to the function and potential significance of a protein. Given that you will be adding data relatively slowly over the course of time and the number of articles involved is still relatively small, I don't think there is any urgency in adding it directly to the Gene Wiki info box. As the number of involved articles grows, we might come back and reconsider how to better integrate the data within the articles. I would be interested to know what others think. Cheers. Boghog (talk) 21:09, 28 January 2012 (UTC) I just noticed this highly relevant discussion: User_talk:AndrewGNF#Phenotypic_annotation_of_Gene_wiki_genes. Boghog (talk) 21:53, 28 January 2012 (UTC)
- I agree that these additions are highly welcome. Just a small thing: the red and blue backgrounds are rather dark depending on the screen and ambient light. And what about creating a template? --ἀνυπόδητος (talk) 21:19, 28 January 2012 (UTC)
- Yes, a template is probably a good idea - though somewhat beyond my technical capabilities! I'll look into it. Thanks for your comments. Rockpocket 12:48, 4 February 2012 (UTC)
- Sorry to be late to the party. I'll chime in my support for adding these data to the relevant Gene Wiki page, and for creating a template for better maintainability. Cheers, AndrewGNF (talk) 19:12, 16 February 2012 (UTC)
- I agree that a template is the way forward. My plan is to get the first 250 genes out in the current format because A) I've already created them this way in my userspace and B) we want to have them all live by the time we submit a corresponding paper next week. Then I'll turn my attention to creating a template to replace them all, and use that for the subsequent genes. Rockpocket 13:32, 18 February 2012 (UTC)
- I like the model organism section in the SLX4 article, but the phenotypes section is blank - why not include phenotypes from MGI (e.g. Slx4tm1a(EUCOMM)Wtsi phenotypes)? Without getting into the what-is-a-phenotype debate, I think characteristics and phenotypes could be merged. See also User:Frog21's mockups. I would also like to see the human phenotypes displayed for a gene (using HPO annotations as a source). Cmungall (talk) 19:47, 10 October 2012 (UTC)
Improving descriptions in PDBbot image files
The message here pointed out some issues with the description of images uploaded by PDBbot, for example the previous description for Protein_CREBBP_PDB_1f81.png: "Structure of the CREBBP protein. Based on PyMOL rendering of PDB 1f81." The main issue is that PDB 1f81 corresponds to a fragment of the CREBBP protein, and a statement like "Structure of the CREBBP protein" could plausibly be interpreted to say that the image shows the entire protein. Also, there is nothing to indicate that 1f81 is from a mouse protein, and is not from a human -- here again, the omission of a detail could lead the reader to assume the image shows something it does not.
These issues potentially exist with many of the ~2,500 images uploaded by PDBbot. I updated the description of the image in question here; it now reads: "Structure of the TAZ2 domain (amino acids 1764-1850) of the CREBBP protein, as expressed in the common house mouse (Mus musculus). Based on PyMOL rendering of PDB 1f81". The information needed for that expanded description exists as structured data in the PDB entry for 1f81. I think most of the information needed for similarly expanded descriptions for all PDBbot images also exists in corresponding PDB files -- i.e., the range of residues depicted in the structure, the common and scientific name of the source organism.
Are there any issues with the revised description above? Does a description close to that form seem applicable to all PDB files? Thanks in advance for any feedback. Emw (talk) 04:06, 13 March 2012 (UTC)
- This description looks good. ProteinBoxBot also does some image generation and uploading akin to PDBbot, so I will go ahead and update our caption/description strings to mimic your proposal. Pleiotrope (talk) 17:46, 13 March 2012 (UTC)
- In the linked Commons thread it was pointed out that the italicized image description above uses "express" ambiguously. For example, the protein in PDB 1f81 is from a mouse, but the actual expression organism for the sample was E. coli. So I think a better form for the description would be: "Structure of the TAZ2 domain (amino acids 1764-1850) of the CREBBP protein from the common house mouse (Mus musculus)".
- On a related note, another point I took away from that thread was that the captions for PDB images in Gene Wiki articles (e.g. "PDB rendering based on 1f81") should be improved so that they don't lead readers to assume that the image is the structure of the full, human protein. I think there's some merit to the critique. Should the image description in the associated media file also replace the image caption in the article, or would the description be too long and/or detailed? Emw (talk) 17:33, 25 March 2012 (UTC)
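As a rough illustration of the description form discussed above, the sketch below assembles such a string from structured fields. The `StructureInfo` class and its field names are hypothetical stand-ins for whatever PDBbot or ProteinBoxBot would parse out of a PDB entry; they are not part of either bot's real code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StructureInfo:
    """Hypothetical container for metadata read from a PDB entry."""
    pdb_id: str
    protein: str
    organism_common: str
    organism_scientific: str
    domain: Optional[str] = None
    first_residue: Optional[int] = None
    last_residue: Optional[int] = None


def describe(info: StructureInfo) -> str:
    """Build an image description that states the fragment and source organism."""
    has_range = info.first_residue is not None and info.last_residue is not None
    if has_range:
        residues = f"(amino acids {info.first_residue}-{info.last_residue})"
        if info.domain:
            subject = f"the {info.domain} domain {residues} of the {info.protein} protein"
        else:
            # No named domain: still make clear that only a fragment is shown.
            subject = f"a fragment {residues} of the {info.protein} protein"
    else:
        subject = f"the {info.protein} protein"
    return (
        f"Structure of {subject} from the {info.organism_common} "
        f"({info.organism_scientific}). "
        f"Based on PyMOL rendering of PDB {info.pdb_id}."
    )
```

With the 1f81 values quoted in this thread (TAZ2, residues 1764-1850, CREBBP, common house mouse, Mus musculus), `describe` would reproduce the proposed wording.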
- Sorry to chime in here, actually the caption "PDB rendering based on 1f81" is IMHO good enough if the 1f81 is a link, so people interested in the source organism can get that if wanted. Rather, I'd like to see articles in en-WP have less human bias. If I may bore you with how it's done in de-WP, most protein articles start with "...is an enzyme/protein found in vertebrata/mammals/all organisms (etc) in the cytosol. ... The human protein is ..." so you can see from the start what the article is talking about, namely all homologs of the protein. Of course, humans are most important so most information is about the human homolog, but you get the idea. This way, people also see that evolution happened, and we're not the only ones who have eg hemoglobin. --Ayacop (talk) 16:11, 7 May 2012 (UTC)
Hi! Concerning the lead sentence in Gene Wiki articles, as discussed here and here, we have tried to make clear that these articles are not only about the human gene/protein, but also orthologs that exist in other species. The wording that was reached through consensus is perhaps a little awkward, but it is both accurate and concise:
- The "that" in the above sentence is non-limiting implying that the protein (and gene) exists in other species besides human. Boghog (talk) 20:15, 7 May 2012 (UTC)
- I see, and I can guess that where this isn't found it will be at some time. I also don't want to nit pick in a language I wasn't educated in before my first ten years. However, I have the notion that your sentence above not only implies what you say; it can also imply that in other organisms something completely different happens which we atm don't know. This possible implication is wrong, as you know. We know much much better. And we want to share this knowledge, not merely imply that there might be something which may well be controversial to the feelings of some. Knowledge is political and should not be censored. --Ayacop (talk) 07:54, 12 May 2012 (UTC)
- It is certainly not controversial that orthologs of human proteins exist in other species and we certainly are not trying to hide that fact. Quite to the contrary, if there is a mouse ortholog, it will be displayed prominently in the {{GNF Protein box}}. If there are orthologs in other species, they will be listed in the HomoloGene link in the protein box. See for example, HomoloGene: 469 in Hemoglobin, alpha 1. (Compare with de-WP: Hämoglobin, alpha 1 for which no information about the orthologs in other species is provided). Again, the lead sentence in Gene Wiki articles was reached by consensus and was by design meant to be simple enough for a bot to implement. Of course, editors are free to change the lead sentence if they wish. However, mass changes to the lead sentence in all Gene Wiki articles would require a new consensus and I doubt there would be much support for such a change. Boghog (talk) 09:31, 12 May 2012 (UTC)
Enabling molecular structure manipulation with WebGL
I recently stumbled upon GLmol, which enables in-browser 3D manipulation of chemical structures as shown here, and was thinking to myself how neat it would be if something like that could be integrated into Wikipedia's protein infoboxes. GLmol uses WebGL, a new web technology that puts OpenGL (the 3D engine used by PyMOL) into modern browsers. Ideas for using Jmol to do this have been brewing for a while (see here and here), but WebGL might be a worthwhile alternative to consider.
I see a few obstacles for implementing interactive WebGL-enabled PDB structures in Wikipedia infoboxes, along with possible solutions:
- 1) WebGL support isn't universal among browsers. Internet Explorer doesn't currently support it, and it looks like IE10 won't either. How WebGL would be handled on mobile browsers would also have to be considered. This might be solvable by detecting whether WebGL exists in the given browser, and if not showing a static image like we do now.
- 2) Code for this would need to get into MediaWiki, and ultimately be enabled in Wikipedia's MediaWiki deployment, assuming it is deemed important enough to include. I've done some initial work and set up a local MediaWiki deployment for development, and think there's a decent chance I can figure out how to actually get WebGL working in protein infoboxes with corresponding PDB structure. I think it will be more difficult to make the case to WMF that a feature like this is worth incorporating into WP-en's MediaWiki deployment.
Does this seem like something worth pursuing? What are others' thoughts? Emw (talk) 04:06, 13 March 2012 (UTC)
- I've got a proof of concept of GLmol integrated into a local MediaWiki deployment as a media handling extension. This means that PDB files can be uploaded through the standard MediaWiki upload form, and included on a page similar to how a JPG would be -- for example, [[File:2A07.pdb]]. The fact that the Jmol extension was not a media handler seems like it was the main obstacle to integration with MediaWiki (see https://bugzilla.wikimedia.org/show_bug.cgi?id=16491), so it's convenient that this has already been addressed in the GLmol extension. That potential roadblock was pointed out by helpful MediaWiki developers on the #mediawiki IRC channel, who seemed to think that it would be feasible to incorporate a GLmol media handler into Wikipedia's MediaWiki deployment in time. If so, this would bode well for potential obstacle #2 in my post above.
- Writing the GLmol extension as a media handler also implies a solution for potential obstacle #1 above -- how to gracefully degrade the feature for browsers that don't support WebGL. Upon upload of a PDB file, a static image representation of the 3D structure would be generated by the server (or, possibly, retrieved from a remote server). When a page containing a PDB file link were requested, the extension would respond with either that pre-generated static image or an interactive 3D model depending on the requesting agent's support for WebGL.
- Given that, I would like to gather thoughts from Gene Wiki participants on a default method of getting that static image from an uploaded PDB file. The software underlying PDBbot seems like a natural candidate for this, but I'm not familiar with any alternative methods that might have been developed in the last two and a half years which might be better. Could folks please chime in with their thoughts on this? Emw (talk) 02:02, 4 April 2012 (UTC)
- While in principle I like the idea of providing interactive 3D images, I am still very worried about the page load times and the quality of the images. One possible solution is to by default only download and display a ray traced static image. But in addition, provide a button so if a user wants an interactive image, the 3D coordinates and the GLmol code are also downloaded. The interactive image would then replace the static image in the original web browser window. Thoughts? Boghog (talk) 04:57, 4 April 2012 (UTC)
- At least in Safari, WebGL is not enabled by default. One first has to turn on the display of the "develop" pull down menu in Safari preferences and then enable WebGL in the "develop" menu. Finally, while I really like the GLmol viewer example, it takes several seconds before the coordinates are downloaded and the display becomes active (and I have a reasonably fast internet connection and computer). This would be an unacceptably long wait for a default display of a Gene Wiki web page. Boghog (talk) 05:12, 4 April 2012 (UTC)
- That's a good point about latency. With a fresh cache, it took about 10 seconds for all the GLmol demo resources to download and render for me. Fortunately, it looks like there's a ton of low-hanging optimization fruit there. I think the Mediawiki GLmol extension could probably cut the time-to-render down from roughly 10 seconds in the linked demo on Sourceforge.jp to 0.5 - 1.5 seconds for most users once it's deployed on Wikipedia. Do you think that would be fast enough to show the interactive model by default?
- Regarding the visual quality of the interactive 3D models compared to ray-traced static images, it's worth pointing out that the crispness of the models varies noticeably between browsers. On Windows 7, Firefox 11 shows slight pixelization in the model and manipulations have some minor lag; however, in Chrome 18, the model is as visually high-resolution and fluidly manipulable as it is in PyMOL. If it were deemed important for an initial release, it would probably also be possible to ray-trace the initial display of the interactive models; otherwise this could be moved into a subsequent release.
- Independent of the approach taken, it seems like a static image would be needed for at least browsers that don't support WebGL by default. Would it be best to do that static image generation using adapted PDBbot code, or to use some other method? I'd like to post a basic public demonstration in 3 to 4 weeks, but would need to begin working on a method for getting static images from PDB files soon. Thanks for the input so far, Emw (talk) 01:52, 5 April 2012 (UTC)
- Render times on the order of one second are probably acceptable, but we would need to make sure that the total load time for a Gene Wiki page with an interactive 3D image is not significantly increased for those with slow internet connections. Concerning the static images, the PDBe has already generated PyMOL ray traced graphics for the entire PDB and has offered to make them available to us via a server. (see this discussion; toward the bottom, it is a very long thread). Boghog (talk) 02:33, 5 April 2012 (UTC)
Prototype of new MediaWiki extension available
A prototype of this feature is available at http://pdbhandler.wmflabs.org. I've developed the feature as a MediaWiki media handling extension named "PDBHandler". It enables interactive 3D models of proteins and DNA on MediaWiki deployments like Wikipedia without the need for browser plug-ins. For browsers that don't support WebGL or don't have it enabled, a static image of the structure is shown instead of an interactive model. Some background that informed the development of the extension is in a discussion thread on the wikitech-l mailing list here.
Any feedback would be appreciated. I'll be working on this extension all day Tuesday and/or Wednesday at the hackathon in Wikimania 2012, so any input today or tomorrow is particularly welcome.
This feature loads 3D models only after the article's text and other media is ready and displayed on the user's browser, so users don't need to wait for the 3D model to finish loading to begin reading the article. In this sense, PDBHandler does not increase articles' page load time. To minimize the time needed to load and render the 3D models themselves, I've developed a custom compression scheme for PDB files that decreases their gzipped file size by roughly 30-70%.
The static images of structures shown for users that don't have WebGL enabled in their browser are identical to the kind produced by PDBbot. The static image is generated immediately after the PDB file's upload by modified PDBbot code that runs on a MediaWiki server. This approach doesn't have the disadvantage of relying on a remote third-party web service, and got favorable feedback from MediaWiki developers on IRC and the mailing list (e.g. here) relative to the remote-server approach.
An early prototype had a favorable user experience review from Wikimedia Foundation staff. It was decided that, for the extension's initial release to Wikipedia, interactive 3D models would be an opt-in feature -- i.e., they would need to be manually enabled by users in their account preferences/'My preferences' page. The default behavior for the initial release, where users had not explicitly opted in, would be to show the static image representations of structures where an article had a PDB file wikilinked, e.g. [[File:1MBO.pdb]].
The feature has some loose ends, but I hope to mostly tie those up next week. I'll update here with news on the extension. Thanks in advance for any feedback on the prototype. Emw (talk) 18:33, 8 July 2012 (UTC)
Duplicated GO annotations
For {{PBB/6790}}, several GO annotations appear multiple times in the list: e.g. centrosome, spindle, mitotic cell cycle. What's going on there? As an aside, how often are the GO annotations updated? MichaK (talk) 10:36, 22 May 2012 (UTC)
Bug in {{PBB}} template
There seems to be a major bug in the {{PBB}} template. See, for example, CD20 -- the beginning of the article now reads: , {{{1}}}, {{{1}}}, Hide Hide, </a> </a>, , Membrane-spanning 4-domains, subfamily A, member 1 Rendering based on PDB 1S8B. Available structures PDB Ortholog search: PDBe, RCSB , {{{1}}}, {{{1}}}, {{{1}}}, {{{1}}}, {{{1}}}, , {{{1}}}, {{{1}}}, {{{1}}}, {{{1}}}, {{{1}}}, MyPDB MyPDB, , {{{1}}}, {{{1}}}, Hide Hide, </a> </a>, , [show]List of PDB id codes , , , {{{1}}}, {{{1}}}, {{{1}}}, {{{1}}}, {{{1}}}, {{{1}}}, {{{1}}}, {{{1}}}, {{{1}}}, , {{{1}}}, , {{{1}}}, Username: Username: , {{{1}}}, Password: Password: ...etc., etc. Pjbradle (talk) 16:42, 30 May 2012 (UTC)
- Thanks for the heads up. The parent template itself is OK, but there seems to be a bug in the bot that updates these templates. I have undone the problematic edit. Cheers. Boghog (talk) 18:54, 30 May 2012 (UTC)
- I looked through some of PBB's contributions, in total 21 templates had bugs: {{PBB/23385}}, {{PBB/2263}}, {{PBB/5473}}, {{PBB/860}}, {{PBB/51206}}, {{PBB/1832}}, {{PBB/170302}}, {{PBB/6822}}, {{PBB/931}}, {{PBB/3725}}, {{PBB/2931}}, {{PBB/3718}}, {{PBB/3661}}, {{PBB/3717}}, {{PBB/7518}}, {{PBB/4985}}, {{PBB/643}}, {{PBB/23643}}, {{PBB/4987}}, {{PBB/951}}, {{PBB/967}}. I undid the revisions that had not been corrected by others. May I kindly suggest a sanity check that PBB may only add or remove 10kb? :-) MichaK (talk) 19:09, 30 May 2012 (UTC)
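The sanity check suggested above could be as small as the sketch below. It assumes "10kb" means 10 × 1024 bytes of UTF-8 wikitext and that the bot has both the current and the proposed template text in hand before saving. The function name and the default limit are illustrative, not part of ProteinBoxBot.

```python
MAX_DELTA_BYTES = 10 * 1024  # assumed reading of the "10kb" limit suggested above


def edit_size_ok(old_text: str, new_text: str, limit: int = MAX_DELTA_BYTES) -> bool:
    """Return True if the edit adds or removes no more than `limit` bytes."""
    old_size = len(old_text.encode("utf-8"))
    new_size = len(new_text.encode("utf-8"))
    return abs(new_size - old_size) <= limit
```

A bot could skip and log any update for which `edit_size_ok` returns False, leaving the page for a human to review.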
Collaboration with NCBI?
I am in contact with the National Center for Biotechnology Information (who run web services like PubMed Central) over them providing references in a way that allows for easy copy-pasting into Wikipedia articles (similar to what Europeana does or the Biomedical citation maker). Where would be the best place to discuss what Wikipedia template formats (e.g. {{Cite web}}, {{Cite journal}}, {{Citation}}, {{Cite book}}) would be best to implement at what NCBI projects? Thanks for any pointers. Please reply at WikiProject NIH.
Another point of interest is how NCBI content could be used to update infoboxes (or even article text), similar to what PBB does - what is the best place to discuss such matters? -- Daniel Mietchen - WiR/OS (talk) 04:00, 18 July 2012 (UTC)
- The initial discussion for the second aspect is also over at WP:NIH. -- Daniel Mietchen - WiR/OS (talk) 02:13, 19 July 2012 (UTC)
MIR941-1 - problem with ProteinBoxBot, and an interesting article
Hi. I saw this Nature Communications article (as widely reported), and so searched for MIR941-1, which was created today - that article has some sort of a {{PBB}} problem at the top, (maybe just a remnant that the bot is leaving behind?) and too much whitespace under the intro-sentence - both should probably be fixed at the source-level. This is mostly just a note for the bot maintainer (User talk:ProteinBoxBot redirects to here), but also a pointer towards interesting articles (both ours and Nature's). HTH. —Quiddity (talk) 18:59, 23 November 2012 (UTC)
Adding systematic redirects
A bot request to create systematic redirects for Gene Wiki articles so that they are easier to locate has been submitted here. Comments and suggestions are welcome. Boghog (talk) 05:02, 2 January 2013 (UTC)
PBB problem
it's not working for ARGLU1. the gene id is correct, so no clue as to what's happening. regards, FoCuSandLeArN (talk) 13:51, 5 February 2013 (UTC)
- Done. Thanks for the heads up. The template needs to be created before it can be transcluded. The easiest way to create these templates is to use this template generator. I went ahead and created the required template for this gene. Cheers. Boghog (talk) 14:44, 5 February 2013 (UTC)
Similar problem over at C9orf16 (gene). Brightgalrs (/braɪtˈɡæl.ərˌɛs/)[1] 17:28, 17 May 2013 (UTC)
- Add ARHGAP44 (gene) to the list. utcursch | talk 00:32, 25 May 2013 (UTC)
- Done. Cheers, Andrew Su (talk) 14:00, 25 May 2013 (UTC)
Another ATXN7L2 (gene) Brightgalrs (/braɪtˈɡæl.ərˌɛs/)[1] 01:25, 8 June 2013 (UTC)
- Done Boghog (talk) 06:46, 8 June 2013 (UTC)
Wrong namespace?
Having just discovered this portal, it seems like it belongs in the Project namespace. According to WP:Portal, "The idea of a portal is to help readers and/or editors navigate their way through Wikipedia topic areas through pages similar to the Main Page." This page seems to be more of a WikiProject. -- Ypnypn (talk) 15:19, 26 May 2013 (UTC)
- I agree. It's always seemed inconsistent, to me. Klortho (talk) 19:27, 1 June 2013 (UTC)
- Yes, you could be right. It is intended to help readers and editors navigate through related WP pages, but it is true that it is not like the Main Page. In any case, if anyone is sufficiently motivated to move everything, no objection from me... Cheers, Andrew Su (talk) 01:02, 5 June 2013 (UTC)
- Alas, I would do it, but still don't have move subpages permissions. Sigh. Klortho (talk) 02:19, 5 June 2013 (UTC)
- Wow, bizarre. Huh, I doubt I have the permissions either then... Cheers, Andrew Su (talk) 04:03, 5 June 2013 (UTC)
How to link from wikipedia to wikigenes.org?
Gang -- Look at this page as an example: https://en.wikipedia.org/wiki/Ubiquitin#Human_proteins_containing_ubiquitin_domain In that section "Human proteins containing ubiquitin domain" I would like to link to wikigenes.org for the genes that do not have wikipedia pages. For example, UBL7 has no wikipedia page but does have a page here: http://www.wikigenes.org/e/gene/e/84993.html
Is the goal to link to wikigenes.org? Or some other gene wiki? How to do so in wiki code? JesseAlanGordon (talk) 14:59, 9 July 2013 (UTC) Jesse Gordon, July 9, 2013
- Hi there. I think the preferred method of handling that case would be to simply create the Wikipedia page for that missing gene. We have a tool to do that at http://biogps.org/GeneWikiGenerator. Once the Wikipedia page is created, you are welcome to link to wikigenes.org if you think it adds something... Make sense? I went ahead and created the page for UBL7. Feel free to create any of the others if you think they are noteworthy! (If you have any problems with that tool, you are welcome to just post page creation requests here and someone will take care of it. But in the interest of teaching a man to fish....) Cheers, Andrew Su (talk) 18:37, 9 July 2013 (UTC)
NCBI traffic data
If you have ideas or suggestions as to which data NCBI should make available about traffic they get from Wikimedia servers, please list them here. Thanks! -- Daniel Mietchen (talk) 21:21, 25 October 2013 (UTC)
ProteinBoxBot using non-existent templates
Article AAVS1 (gene) was recently created by User:ProteinBoxBot and it includes two templates that don't exist: Template:PBB/17 and Template:Gene-17-stub. --Derek Andrews (talk) 01:51, 11 November 2013 (UTC)
- Fixed. Thanks for the heads up. Boghog (talk) 07:41, 11 November 2013 (UTC)
BioGPS site down
The BioGPS site for generating new Gene Wiki articles has been down for some time. There are still ~10,000 genes to go. :-) Boghog (talk) 07:45, 11 November 2013 (UTC)
- OK, I now see the site is still up, but the url has changed. Boghog (talk) 14:05, 14 November 2013 (UTC)
ProteinBoxBot updates to UniProt links
Hi. ProteinBoxBot is making a large number of updates that seem to be in error. See for example diff. Regards Boghog (talk) 10:42, 15 November 2013 (UTC)
This change has been corrected. Uniprot had a misconfigured mirror which accounted for the request bodies being injected in their response. I updated our bot to fix the incorrect pages that were introduced and all seems back to normal. Cheers x0xMaximus (talk) 21:39, 16 November 2013 (UTC)
Adipose triglyceride lipase
This is mentioned in Pirinixic acid. Is this the same as PNPLA2, as the link in Serine hydrolase suggests? --ἀνυπόδητος (talk) 12:28, 25 November 2013 (UTC)
- Q96AD5 indicates that it is. I therefore have been bold and moved PNPLA2 to adipose triglyceride lipase. Cheers. Boghog (talk) 16:27, 25 November 2013 (UTC)
- Thanks --ἀνυπόδητος (talk) 18:48, 27 November 2013 (UTC)
PBB link
In the course of editing underlinked articles, I've discovered that the PBB template for HMGCS2 isn't working, and I don't know how to fix it...thought I should let you know. WQUlrich (talk) 00:25, 26 November 2013 (UTC)
- Fixed. Thanks for the heads up. Cheers. Boghog (talk) 05:28, 26 November 2013 (UTC)
Is this also encoded by DRD2? If yes, should it be merged to Dopamine receptor D2? If no, should it have an infobox? --ἀνυπόδητος (talk) 08:40, 9 February 2014 (UTC)
- Thanks for pointing this out. In my opinion, as D2sh is a splice variant of Dopamine receptor D2, the former should be merged into the latter. I have been bold and gone ahead with the merger. Boghog (talk) 10:12, 9 February 2014 (UTC)
MGI data pages moved
Well, they are redirected, but it would be quicker and cause less traffic if the URL http://www.informatics.jax.org/searches/accession_report.cgi?id=MGI:{{{1}}} in {{MGI}} were changed to http://www.informatics.jax.org/marker/MGI:{{{1}}}. I haven't found evidence that this is true for all pages, and given the high visibility of the template, I'd rather leave the change to somebody else, i.e. Boghog. --ἀνυπόδητος (talk) 08:40, 9 February 2014 (UTC)
- Fixed in this edit. Thanks for the heads up. Boghog (talk) 10:30, 9 February 2014 (UTC)
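- For illustration only, the URL change discussed above can be sketched in Python. This is a minimal sketch, not the template's actual wikitext: the names OLD_PATTERN, NEW_PATTERN and mgi_url are hypothetical, the accession number in the usage line is a made-up placeholder, and only the two URL patterns come from the discussion. The `{id}` field stands in for the template's {{{1}}} parameter.

```python
# Hypothetical names; only the two URL patterns are taken from the thread above.
OLD_PATTERN = "http://www.informatics.jax.org/searches/accession_report.cgi?id=MGI:{id}"
NEW_PATTERN = "http://www.informatics.jax.org/marker/MGI:{id}"


def mgi_url(mgi_id: str, pattern: str = NEW_PATTERN) -> str:
    """Fill an MGI accession number into a URL pattern.

    Mirrors what the {{MGI}} template does with its first parameter;
    defaults to the newer /marker/ form so no redirect is needed.
    """
    return pattern.format(id=mgi_id)


if __name__ == "__main__":
    # "12345" is a placeholder accession number, not a real example from the thread.
    print(mgi_url("12345", OLD_PATTERN))
    print(mgi_url("12345"))
```

Defaulting to the newer pattern reflects the point made in the thread: the old URL still resolves through a redirect, but linking directly saves a round trip.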
Mab-21 domain containing 2 is broken
This article's trying to transclude a non-existent template (that doesn't seem to have been deleted previously, so maybe it never existed?), but it's not some sort of vandalism or accidental on-page breakage (no edits since shortly after the bot created it). Facing the Sky (talk) 16:36, 1 March 2014 (UTC)
- Fixed. Thanks for the heads up. The bot that created the article should have also simultaneously created the template. Apparently there was some sort of glitch. The template has now been created by the bot. Boghog (talk) 18:32, 1 March 2014 (UTC)
What is Leukocyte-promoting factor?
I can't find anything about this cytokine(?). The refs are about macrophage migration inhibitory factor and platelet activating factor as far as I can see. --ἀνυπόδητος (talk) 19:46, 2 December 2014 (UTC)
- Not sure either. The supplied references do not seem to directly talk about leukocyte-promoting factor. I did find a recent book that suggests that the factor is an obsolete term that encompasses a variety of factors that stimulate leukopoiesis. I have modified the article accordingly. Boghog (talk) 20:39, 2 December 2014 (UTC)
Broken copies of Template:PBB
CACNA2D3 and CAMK1D both contain nonexistent subtemplates of Template:PBB. I can see how I would create those subpages using Template:GNF Protein box, but if ProteinBoxBot can instead that would be even better.—Neil P. Quinn (talk) 17:33, 11 January 2015 (UTC)
BioGPS no longer creating GeneWiki pages
For several weeks now, the BioGPS GeneWikiGenerator has not been able to create Wikipedia pages (failed creating both PBB template and GeneWiki article, Status: creation failed). One could manually create the pages by copying and pasting the output of GeneWikiGenerator into Wikipedia. Can someone take a look at this? Thanks. Boghog (talk) 08:08, 3 April 2015 (UTC)
- The Gene Wiki team has been working on using Wikidata as the source for the infoboxes. I heard in a presentation last week that fixing this is on the radar. Gtsulab (talk) 16:59, 23 October 2015 (UTC)
Links/Dynamic content from other sources
I am not sure where to post this request, but I saw on the Ideas Page that additional links are planned in the future for the protein box. I wanted to ask if there was a possibility to add another source to this list. Specifically I would like to suggest Genevisible which is a free resource for finding the top 5 expressing tissues/cancers/perturbations for a given gene. I will be honest here and state that I am one of the maintainers of that resource and am willing to help get a dynamic plot working (something akin to the GeneAtlas plots already in the protein box) should there be interest for it. However, I would already be grateful if a direct link from a protein/gene page to its relevant entry on our resource was added to the protein box. Here is an example for HP1BP3. I am ready to answer any questions you might have regarding this and would gladly help in implementing/testing the feature should it be accepted. Ovoggen (talk) 11:45, 30 July 2015 (UTC)
- I think the Gene Wiki team is moving to feed the infoboxes with info from Wikidata, now that the required info (and then some) has been imported into Wikidata. Gtsulab (talk) 17:05, 23 October 2015 (UTC)
Proteasome subunit α8
Hi! Just wanted to note that PSMA8 is the only one of the {{Proteasome subunits}} that hasn't got an article. --ἀνυπόδητος (talk) 08:50, 18 October 2015 (UTC)
- Done. Thanks for the heads up and for creating the {{Proteasome subunits}} navbox. Boghog (talk) 09:31, 18 October 2015 (UTC)
- Thanks for creating the article. Here are some more if you've got the time: ALDH5A1 is a redirect, and ALDH8A1 has a protein box instead of a gene infobox. Cheers --ἀνυπόδητος (talk) 11:42, 23 October 2015 (UTC)
Uniprot KB
Hi, what do I have to do to get parts of the reviewed items from UniProtKB (accession number, protein name, gene name, organism, GO - molecular and biological function, keywords, length, mass and sequence) into Wikidata? A lady at UniProtKB told me that you already have their permission. Please see the discussion there. All the best, --Ghilt (talk) 21:59, 9 May 2016 (UTC)
Importing drug-protein interaction data from DrugBank
Hi,
I would like to import drug-protein (drug-target, drug-enzyme, etc.) interactions to gene_wiki from DrugBank. Would the gene_wiki community be interested in this, and, if so, what is the best way to import this data?
Thank you for your help.
Crowegian (talk) 04:33, 24 May 2016 (UTC)
- Hello! Unfortunately, DrugBank's license is not compatible with Wikipedia: "The DrugBank database is a freely available resource for non-commercial use." while Wikipedia/Wikidata have no such restrictions on commercial use. Therefore, it's not possible to import interactions from DrugBank on a large scale. MichaK (talk) 06:26, 24 May 2016 (UTC)
- MichaK Thank you for the quick response. That's too bad. I've found another data source here, where the data is licensed under CC-BY. I understand that Wikidata licenses structured data under CC0, which is what I'd like to do with the data. Do you know if there would be any licensing issue going from CC-BY to CC0? Crowegian (talk) 17:29, 25 May 2016 (UTC)
- I'm glad you found STITCH, because I maintain this database. ;-) On one hand, I'm not sure if importing STITCH into Wikidata is a good idea: you have very many interactions with varying confidence levels, whereas I think of Wikidata as being more black and white. On the other hand, CC-BY and CC0 are also not compatible: e.g. with CC-BY, you have to cite STITCH as the source for the information. In CC0 there's no such restriction. MichaK (talk) 11:29, 30 May 2016 (UTC)
C11orf52
C11orf52 is an orphaned protein/gene article without an infobox. Thanks as always, Boghog! --ἀνυπόδητος (talk) 09:49, 10 September 2016 (UTC)
WikiFactMine
We are delighted that we have been awarded a WMF grant for WikiFactMine - a project to link Wikidata to the primary bioscience literature. We are really keen to integrate this with GeneWiki activities. The latest (Databases) paper is very impressive and echoes much of our thinking. This is just to reach out and make contact - we already know several of you. Marti Johnson is the WMF coordinator of the project and she was very keen for us to make contact. I think she will be coordinating meetups in the near future.
We've already prototyped the machine identification of genes in EuropePMC content and are optimistic of very useful precision-recall. We will shortly have a (probably weekly) trawl of EPMC Open Access content and we can work retrospectively later if required. Initially this could be bidirectional: "What Wikidata genes are in this paper?", and "what genes are in this paper that Wikidata might be interested in?" (possibly including a WikiCite addition). Petermr (talk) 10:36, 11 November 2016 (UTC)
Proposal of using full size image for RNA expression pattern
Hello Gene Wiki! In infoboxes of gene/protein articles, there are small thumbnail images of the RNA expression pattern. I have proposed switching these small thumbnails to full-size images, because the Wikipedia/MediaWiki image system was changed. Since the image data are retrieved from Wikidata, I posted the proposal on the Wikidata page (wikidata:Property talk:P692#How about using full size image instead of small thumbnail?). I would like to get your thoughts on that. Thank you. --Was a bee (talk) 06:31, 18 March 2017 (UTC)
Dihydropyrimidine dehydrogenase
DPYD is the gene, and Dihydropyrimidine dehydrogenase is the enzyme. I keep forgetting if and how such pages should be merged. Thanks, ἀνυπόδητος (talk) 15:41, 14 April 2017 (UTC)