User talk:Citation bot/Archive 8
This is an archive of past discussions about User:Citation bot. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Updated release now published
With apologies for the delay, I've now finished wrangling with various new credential protocols and have pulled the latest version of the bot -- with many long-anticipated bug fixes -- onto the production site. Hopefully this will work for all with no glitches, but being realistic, please do raise any issues either here as usual, or (if the issue relates to the implementation, i.e. the service being unavailable) try raising a GitHub issue, which may catch my attention more punctually. Please do let me know how yous all get on! In particular, if a reported bug is now fixed, please do mark it as such by setting its status to {{fixed}}. Martin (Smith609 – Talk) 07:09, 23 July 2018 (UTC)
{{notabug}} we have already moved on. Flag for archiving. AManWithNoPlan (talk) 17:28, 24 July 2018 (UTC)
Bot replaced translator-first with unrecognized and incorrect parameter
- Status
- new bug
- Reported by
- (t) Josve05a (c) 20:26, 23 July 2018 (UTC)
- Type of bug
- Deleterious
- What happens
- The bot replaced |translator-first= and |translator-last= with |inventor-first= and |inventor-last=, which are not recognized by {{cite book}} and are not correct in this situation.
- What should happen
- The bot should not replace human-added |translator-first= and |translator-last= with other parameters.
- Relevant diffs/links
- Special:Diff/851668101&oldid=851545607
- Replication instructions
- Run bot on England
- We can't proceed until
- Bot operator's feedback on what is feasible
- This is because the citation template people add parameters like candy. https://github.com/ms609/citation-bot/pull/428 AManWithNoPlan (talk) 23:08, 23 July 2018 (UTC)
- Should probably double check Module:Citation/CS1/Whitelist/sandbox against the list in the code. --Izno (talk) 00:50, 24 July 2018 (UTC)
- Added a bunch more. AManWithNoPlan (talk) 03:37, 24 July 2018 (UTC)
{{fixed}}
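The fix discussed in this thread comes down to checking a parameter name against the list of names the templates accept before trying to "correct" it. The following is a minimal sketch of that idea, not the bot's actual code: the abbreviated `KNOWN_PARAMETERS` set, the function name `correct_parameter`, and the use of `difflib` for fuzzy matching are all assumptions made for illustration. The real list lives in Module:Citation/CS1/Whitelist and is much longer.

```python
import difflib

# Hypothetical, abbreviated whitelist for illustration only.
KNOWN_PARAMETERS = {
    "title", "url", "access-date", "accessdate", "dead-url", "deadurl",
    "translator-first", "translator-last", "vauthors", "authors",
    "doi-access", "hdl-access", "url-access", "orig-year", "origyear",
}


def correct_parameter(name, known=KNOWN_PARAMETERS):
    """Return `name` unchanged if it is recognized.

    Only an unrecognized name is fuzzy-matched; the closest known name is
    returned, or None when nothing is close enough to suggest.
    """
    if name in known:
        return name  # never rewrite a parameter the templates accept
    matches = difflib.get_close_matches(name, sorted(known), n=1, cutoff=0.8)
    return matches[0] if matches else None
```

With a check like this, a recognized name such as `translator-first` passes through untouched, and only a genuine misspelling is a candidate for replacement.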
vauthors replaced with deprecated authors
- Status
- new bug
- Reported by
- Boghog (talk) 06:05, 24 July 2018 (UTC)
- Type of bug
- Deleterious
- What happens
- |vauthors= is replaced with |authors=
- What should happen
- Should not touch |vauthors=
- Relevant diffs/links
- diff
- Replication instructions
- Run bot on Antioxidant
- We can't proceed until
- Bot operator's feedback on what is feasible
The long-supported |vauthors= produces clean metadata while the deprecated |authors= does not. Boghog (talk) 06:05, 24 July 2018 (UTC)
I note a recent discussion where this behavior was mentioned with a question about whether this is the desired behavior. Boghog (talk) 06:29, 24 July 2018 (UTC)
- I think this will fix it: https://github.com/ms609/citation-bot/pull/428. AManWithNoPlan (talk) 12:32, 24 July 2018 (UTC)
{{fixed}}
Bot should not replace access-date and dead-url
- Status
- {{fixed}}
- Reported by
- (t) Josve05a (c) 07:18, 24 July 2018 (UTC)
- Type of bug
- Cosmetic
- What happens
- The bot replaces |access-date= and |dead-url= with |accessdate= and |deadurl=. Both are accepted; however, access-date and dead-url are preferred per the template documentation.
- What should happen
- The bot should not swap hyphenated parameter names for unhyphenated ones, or vice versa.
- Relevant diffs/links
- Special:Diff/851731112&oldid=828984182
- Replication instructions
- Run the bot on a page with |access-date= in {{cite web}}
- We can't proceed until
- Bot operator's feedback on what is feasible
The templates have added so many things. https://github.com/ms609/citation-bot/pull/428 AManWithNoPlan (talk) 12:41, 24 July 2018 (UTC)
- The RFC on hyphenated parameter names was four years ago. – Jonesey95 (talk) 15:44, 24 July 2018 (UTC)
hdl-access
- Status
- {{fixed}}
- Reported by
- (t) Josve05a (c) 10:54, 24 July 2018 (UTC)
- Type of bug
- Deleterious
- What happens
- The bot replaces |doi-access= with |hdl-access= for no reason
- What should happen
- Do not replace acceptable parameters that have content without a guarantee that the replacement does not cause an error
- Relevant diffs/links
- Special:Diff/851749497&oldid=851648489
- Replication instructions
- Run the bot on Reptile
- We can't proceed until
- Bot operator's feedback on what is feasible
See also this edit where the bot replaced |url-access= with |hdl-access=.
—Trappist the monk (talk) 11:18, 24 July 2018 (UTC)
Added to white list https://github.com/ms609/citation-bot/pull/428 AManWithNoPlan (talk) 12:39, 24 July 2018 (UTC)
Support for new parameters
User:AManWithNoPlan has kindly added new parameters to the bot's dictionary. I've pulled through this update now, so hopefully replacement of unrecognized parameters will no longer be an issue. Martin (Smith609 – Talk) 16:29, 24 July 2018 (UTC)
{{fixed}} flagged for archiving.
Redundant europepmc.org URLs added
- Status
- new bug
- Reported by
- Boghog (talk) 06:05, 24 July 2018 (UTC)
- Type of bug
- Inconvenience
- What happens
- Redundant europepmc.org URLs are added to templates containing |pmc=
- What should happen
- should not add redundant URLs
- Relevant diffs/links
- diff
- Replication instructions
- Run bot on Antioxidant
- We can't proceed until
- Bot operator's feedback on what is feasible
Europe PubMed Central is a mirror of PubMed Central. |pmc= links the title of the article to the relevant page on PubMed Central. Adding the redundant |url= replaces the already linked title with a link to a mirror site. Boghog (talk) 06:21, 24 July 2018 (UTC)
{{fixed}} https://github.com/ms609/citation-bot/pull/430 AManWithNoPlan (talk) 15:24, 24 July 2018 (UTC)
Should recognize HDL
- Also in that same edit, the handle system has its own cs1|2 parameter, |hdl=; instead of:
- write: |hdl=10397/34754
https://github.com/ms609/citation-bot/pull/433 AManWithNoPlan (talk) 17:26, 24 July 2018 (UTC)
{{fixed}}
Bot adds arxiv urls rather than use the arxiv parameter
- Status
- new bug
- Reported by
- Headbomb {t · c · p · b} 13:10, 24 July 2018 (UTC)
- Type of bug
- Inconvenience
- What happens
- [1]
- What should happen
- No arxiv urls added. Use |arxiv= for this.
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=LOBPCG&diff=prev&oldid=851764808
- Replication instructions
- Run on LOBPCG
- We can't proceed until
- Agreement on the best solution
- Requested action from maintainer
- Fix this.
https://github.com/ms609/citation-bot/pull/430 AManWithNoPlan (talk) 17:00, 24 July 2018 (UTC)
{{fixed}}
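The requested behaviour (use |arxiv= rather than an arXiv |url=) can be sketched as follows. This is an illustrative assumption about how such a conversion might look, not the code in the pull request above: the regular expression, the function name `move_arxiv_url`, and the dict-of-parameters representation are all hypothetical.

```python
import re

# Matches new-style (1010.0278) and old-style (hep-th/9901001) identifiers
# in arxiv.org abs/pdf URLs; an optional version suffix (v2) is dropped.
ARXIV_URL = re.compile(
    r"^https?://(?:www\.)?arxiv\.org/(?:abs|pdf)/"
    r"(?P<id>\d{4}\.\d{4,5}|[a-z\-]+(?:\.[A-Z]{2})?/\d{7})(?:v\d+)?(?:\.pdf)?$",
    re.IGNORECASE,
)


def move_arxiv_url(params):
    """Return a copy of `params` with an arXiv |url= expressed as |arxiv=."""
    result = dict(params)
    match = ARXIV_URL.match(result.get("url", ""))
    if match:
        # Keep an existing |arxiv= value; only fill it in when missing.
        if not result.get("arxiv"):
            result["arxiv"] = match.group("id")
        del result["url"]  # the identifier already links to arXiv
    return result
```

Any |url= that does not point at arXiv is left untouched, so links added by editors to other hosts survive.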
bot replaced |doi-access= with |hdl-access=
- Status
- {{Fixed}}
- Reported by
- Trappist the monk (talk) 10:49, 24 July 2018 (UTC)
- Relevant diffs/links
- this edit
- We can't proceed until
- Agreement on the best solution
Added to white list. https://github.com/ms609/citation-bot/pull/428 AManWithNoPlan (talk) 12:36, 24 July 2018 (UTC)
Improperly adds journal to citation template with contribution/title/series parameters
- Status
- new bug
- Reported by
- David Eppstein (talk) 05:11, 21 June 2018 (UTC)
- Type of bug
- Inconvenience: Humans must occasionally make immediate edits to clean up after the bot
- What happens
- Some academic journals are also simultaneously book series. When a citation is made to a book in such a series using the citation template with the contribution/title/series parameters (for the title of the paper, title of the book, and title of the series) it is incorrect to add a duplicate journal parameter with the same value as the series. This creates a faulty citation, because the citation template does not allow both contribution and title in citations with nonempty journal parameters, and also because the series parameter means something different in citations with a journal. In the linked case, the citation was already correct as it stands. It would also work to use title/department/journal instead of contribution/title/series, but the bot's choice of contribution/title/journal is just broken.
- Relevant diffs/links
- Special:Diff/846802069
- We can't proceed until
- Agreement on the best solution
https://github.com/ms609/citation-bot/pull/435 AManWithNoPlan (talk) 20:48, 24 July 2018 (UTC)
{{fixed}}
Google Books
- Linking a title to Googlebooks is ok when the link leads to a preview; when it doesn't, as in this:
- it is better to omit |url= so that user expectation (that the citation title links to a source that can be read) is not confounded; users can get to Googlebooks through |isbn=978-3-527-30673-2 and its link through Special:BookSources.
- —Trappist the monk (talk) 10:12, 24 July 2018 (UTC)
- Is there a cross-Wikipedia consensus on this? I can see editors becoming upset if links that they have added are removed by an automatic process. Martin (Smith609 – Talk) 16:35, 24 July 2018 (UTC)
- There is no consensus to remove urls to google books information page. However, the bot should not add the links to all cite books without a url either. (t) Josve05a (c) 16:40, 24 July 2018 (UTC)
- Why does the bot suddenly add links to google books out of nowhere? That should not be done. Headbomb {t · c · p · b} 11:42, 24 July 2018 (UTC)
- I have created a pull request. https://github.com/ms609/citation-bot/pull/431 Probably a good idea until we almost all agree and until we verify that the hundred other link types do not exist AManWithNoPlan (talk) 17:06, 24 July 2018 (UTC)
Adding |url= when the cs1|2 template has |title-link= will produce the same undesirable results. I have not seen this, but when fixing this bug, you might check to make sure that the bot does not add |url= when |title-link= is set.
—Trappist the monk (talk) 10:21, 24 July 2018 (UTC)
- I assume Smith is sleeping right now. I know his and my time zone are not the same! AManWithNoPlan (talk) 23:30, 24 July 2018 (UTC)
- Deployed. Martin (Smith609 – Talk) 06:10, 25 July 2018 (UTC)
- Either the deploy failed or the issue is not resolved correctly. This bot edit, three hours after the above deployment notice, adds superfluous google books links; one of which broke an existing citation template.
- —Trappist the monk (talk) 09:58, 25 July 2018 (UTC)
- I will look at it again. It worked for my test cases but not these. Half fixed but still broken. AManWithNoPlan (talk) 12:57, 25 July 2018 (UTC)
Found other case https://github.com/ms609/citation-bot/pull/438 AManWithNoPlan (talk) 14:04, 25 July 2018 (UTC)
{{fixed}}
some open access links are dead urls
diff This edit added a link to http://digitallibrary.amnh.org/bitstream/handle/2246/5906/v3/dspace/updateIngest/pdfs/N3610.pdf%3Bjsessionid%3D23866600E2892FD54861C9246EBA1DBB?sequence%3D1 which was dead. (t) Josve05a (c) 14:40, 25 July 2018 (UTC)
- that does suck that the author of the journal article explicitly tells us to use a dead URL AManWithNoPlan (talk) 16:03, 25 July 2018 (UTC)
- I will look into adding some code to test the url AManWithNoPlan (talk) 16:32, 25 July 2018 (UTC)
https://github.com/ms609/citation-bot/pull/440 AManWithNoPlan (talk) 04:50, 26 July 2018 (UTC)
{{fixed}}
When converting cite arxiv to cite journal, update the year/date
- Status
- new bug
- Reported by
- Headbomb {t · c · p · b} 13:19, 24 July 2018 (UTC)
- Type of bug
- Inconvenience
- What happens
- When converting a cite arxiv to a cite journal, the bot keeps the original date
- What should happen
- The bot should use the date as can be determined via bibcode/doi/pmids/other versions of records
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=LOBPCG&diff=prev&oldid=851764808
- We can't proceed until
- Agreement on the best solution
How about this: https://github.com/ms609/citation-bot/pull/436/ AManWithNoPlan (talk) 20:44, 24 July 2018 (UTC)
- @AManWithNoPlan: not sure what that does exactly, but the net should be cast as wide as possible for anything that trigger an upgrade from cite arxiv to cite journal/cite conference/cite book (ISBN, Bibcodes, PMID, PMC, etc... if those apply) Headbomb {t · c · p · b} 21:25, 24 July 2018 (UTC)
- It catches all cite webs and cite arxiv that do not already have a doi. AManWithNoPlan (talk) 23:28, 24 July 2018 (UTC)
- @AManWithNoPlan: what happens if the preprint is published, but without a doi but other identifiers, like bibcodes? Headbomb {t · c · p · b} 12:56, 25 July 2018 (UTC)
- Not sure. Do you have an example to test. I think that you have to go through the DOI database first. AManWithNoPlan (talk) 13:25, 25 July 2018 (UTC)
Here's possibly a case
- Arnold, Douglas N.; Fowler, Kristine K. (2011). "Nefarious Numbers". Notices of the American Mathematical Society. 58 (3): 434–437. arXiv:1010.0278. Bibcode:2010arXiv1010.0278A.
- arXiv:1010.0278 says it's published in "Notices Amer. Math. Soc., 58(3):434-437, 2011" The metadata is poor, and the upgrade from arxiv to journal is messy [3], but it's an example of where it could be done in theory. There are better examples out there, with better metadata, so I'll keep looking for those. Headbomb {t · c · p · b} 13:43, 25 July 2018 (UTC)
- Neither one of those cases has a DOI to be found using the ARXIV database AManWithNoPlan (talk) 15:26, 25 July 2018 (UTC)
{{fixed}} code merged
Some DOI data is junk
- Status
- new bug
- Reported by
- 65.94.42.168 (talk) 05:33, 25 July 2018 (UTC)
- Type of bug
- Deleterious: Human-input data is deleted or articles are otherwise significantly affected.
- What happens
- BOT assisted edit at M32p deleted the journal article name and replaced it with a nonsense journal article name, deleted the authors, deleted the journal volume, issue, publication date
- What should happen
- It should have been an author correction; the information for the publication journal date, volume, issue, etc is available via http://adsabs.harvard.edu/abs/2018MNRAS.475.2754H
I suggest that the bot crosscheck PMID, arXiv and bibcode against the DOI to see if the DOI is faulty. If all the other identifiers agree with each other and the DOI does not, then the DOI is in error.
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=M32p&action=historysubmit&type=revision&diff=851774001&oldid=851769468
- We can't proceed until
- Agreement on the best solution
- I think this is very local to OUP manuscripts, and it's probably just simpler to check that the DOI info does not resolve to a pre-production placeholder thing. Headbomb {t · c · p · b} 13:00, 25 July 2018 (UTC)
https://github.com/ms609/citation-bot/pull/439/files AManWithNoPlan (talk) 14:03, 25 July 2018 (UTC)
- Just to clarify. I deleted the title and the authors, everything in fact, since it was poorly-formatted and generating CS1 errors. Then used the bot to recreate the citation. So the bot didn't do anything too radical like overwriting good info with bad, but it did pick up the wrong title as described. Lithopsian (talk) 20:03, 25 July 2018 (UTC)
{{fixed}} we will add more checking as more oddities are found AManWithNoPlan (talk) 12:49, 26 July 2018 (UTC)
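The cross-check suggested by the reporter (trust the other identifiers when they agree with each other and the DOI disagrees) could be sketched like this. It is not what the linked pull request does, which per the discussion looks for placeholder data instead; the function names, the source keys, and the rule of requiring at least two agreeing non-DOI sources are assumptions made for this sketch.

```python
def normalize(title):
    """Lower-case a title and collapse punctuation and whitespace for comparison."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in title)
    return " ".join(cleaned.split())


def doi_looks_faulty(titles):
    """Decide whether the DOI record disagrees with every other source.

    `titles` maps a source name ('doi', 'pmid', 'arxiv', 'bibcode') to the
    title that source returned. The DOI is flagged only when at least two
    other sources exist, they all agree, and the DOI title differs.
    """
    if "doi" not in titles or len(titles) < 3:
        return False
    others = {normalize(t) for source, t in titles.items() if source != "doi"}
    if len(others) != 1:
        return False  # the other sources disagree among themselves
    return normalize(titles["doi"]) not in others
```

A flagged DOI would be a reason to skip the DOI metadata for that citation rather than to overwrite anything automatically.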
Bot moves parameters for no reason
- Status
- {{fixed}}
- Reported by
- Headbomb {t · c · p · b} 19:29, 26 July 2018 (UTC)
- Type of bug
- Cosmetic
- What happens
- The bot takes existing parameters and puts them in new locations
- What should happen
- Leave things where they are
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=Quark&diff=prev&oldid=852114776
- We can't proceed until
- Agreement on the best solution
This is because of the new code that allows DOI information to override Arxiv information. I know how to fix this. The citation forgets and then remembers the year. I need to change it to a placeholder and then change it back or delete it. AManWithNoPlan (talk) 19:48, 26 July 2018 (UTC)
Converts empty coauthors into empty vauthors
- Status
- {{fixed}}
- Reported by
- Headbomb {t · c · p · b} 02:23, 27 July 2018 (UTC)
- Type of bug
- Improvement
- What happens
- The bot converts |coauthors= to |vauthors=
- What should happen
- If |coauthors= is non-empty, leave it alone. If |coauthors= is empty, remove it.
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=Biological_neuron_model&diff=prev&oldid=852162733
- We can't proceed until
- Agreement on the best solution
https://github.com/ms609/citation-bot/pull/445 typo fixing is hard AManWithNoPlan (talk) 02:45, 27 July 2018 (UTC)
Bot converts orig-year to origyear
- What happens
- Bot converts |orig-year= to |origyear=
- What should happen
- Leave it alone (or convert |origyear= to |orig-year=), since |orig-year= is the canonical use.
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=Absolute_zero&diff=prev&oldid=852171128
- We can't proceed until
- Agreement on the best solution
https://github.com/ms609/citation-bot/pull/445/files#diff-bb37b1a3125b1a812ed46c7dfdccae3b Added
- https://github.com/ms609/citation-bot/pull/446 Check out this. Adding code to automatically generate the 1-99 stuff and added ability to split out parameters that should be recognized but not changed to AManWithNoPlan (talk) 04:40, 27 July 2018 (UTC)
cite web handling improvements
- What happens
- Does not convert cite web to cite journal when pmc is set
- What should happen
- convert and improve cite web to cite journal when it makes sense to do so
- Relevant diffs/links
- When running on [4] it misses one improvement. When I convert a cite web (with pmc) to cite journal [5], the bot can then kick in on that cite [6].
- We can't proceed until
- Agreement on the best solution
That’s been wrong forever. Good catch. Also pmid too. https://github.com/ms609/citation-bot/pull/447 AManWithNoPlan (talk) 13:37, 27 July 2018 (UTC)
Do not remove the publisher
- Status
- Reported by
- (t) Josve05a (c) 06:46, 24 July 2018 (UTC)
- Type of bug
- Deleterious
- What happens
- The bot removes all |publisher= in {{cite journal}}
- What should happen
- It should not remove human inputted fields.
- Relevant diffs/links
- Special:Diff/851728907&oldid=833759254
- Replication instructions
- Run the bot on Paul Ashbee
- We can't proceed until
- Agreement on the best solution
Personally I love the new functionality. I'll be very sad to see it go. Headbomb {t · c · p · b} 14:27, 28 July 2018 (UTC)
- @Headbomb: You want the bot to remove publisher fields from the citation if manually provided? Why? (t) Josve05a (c) 21:02, 28 July 2018 (UTC)
- This is NOT a new feature; it has been highly regarded for a long time. People seem to think that providing a publisher is too much information. Also, the publisher changes over time and is generally not useful. I have written the code, but it is not in because of lack of agreement.
https://github.com/ms609/citation-bot/pull/432 AManWithNoPlan (talk) 22:39, 28 July 2018 (UTC)
- Well, it is a manually entered field, and the cite template has been changed to allow for both journal and publisher now, so consensus over at the template's talk page seems to be to allow both fields. (t) Josve05a (c) 09:27, 29 July 2018 (UTC)
{{notabug}}
redundant page range
- Status
- new bug
- Reported by
- Headbomb {t · c · p · b} 13:45, 27 July 2018 (UTC)
- Type of bug
- Improvement
- What happens
- leaves the citation as {{cite book ... |pages=23–23 ...}}
- What should happen
- should convert to {{cite book ... |page=23 ...}}
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=Marie_Curie&diff=prev&oldid=852228914
- We can't proceed until
- Agreement on the best solution
- Requested action from maintainer
- write code
This should also be reported to the CS1 people too so they can have the templates do this just like they convert dashes? AManWithNoPlan (talk) 13:56, 27 July 2018 (UTC)
https://github.com/ms609/citation-bot/pull/454 Does it to years & pages but not issues. AManWithNoPlan (talk) 02:29, 29 July 2018 (UTC)
- Should do it to issues too. I'll post a notice at Help talk:CS1 too. Headbomb {t · c · p · b} 14:16, 30 July 2018 (UTC)
- Issues added. AManWithNoPlan (talk) 14:47, 30 July 2018 (UTC)
{{fixed}}
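The clean-up agreed in this thread (collapse a range whose two ends are identical, for pages, years and issues) might look roughly like the sketch below. The function names and the dict-of-parameters representation are hypothetical, and the actual change in the pull request above may differ.

```python
import re

# A "range" here is two tokens separated by a hyphen or an en dash.
RANGE = re.compile(r"^\s*(\w+)\s*[-\u2013]\s*(\w+)\s*$")


def collapse(value):
    """Return the single value if `value` is a redundant range like 23-23, else None."""
    match = RANGE.match(value)
    if match and match.group(1) == match.group(2):
        return match.group(1)
    return None


def collapse_redundant_ranges(params):
    """Return a copy of `params` with redundant ranges collapsed."""
    result = dict(params)
    single = collapse(result.get("pages", ""))
    if single is not None:
        # |pages=23-23 becomes |page=23 unless |page= is already set.
        del result["pages"]
        result.setdefault("page", single)
    for key in ("issue", "year"):
        single = collapse(result.get(key, ""))
        if single is not None:
            result[key] = single
    return result
```

Genuine ranges such as 23–29 do not match the equality check and are left exactly as written.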
Leave journal capitalization after : alone
- Status
- new bug
- Reported by
- Headbomb {t · c · p · b} 02:20, 28 July 2018 (UTC)
- Type of bug
- Inconvenience/Cosmetic
- What happens
- converts |journal=Historical Biology: An International Journal of Paleobiology to |journal=Historical Biology: an International Journal of Paleobiology
- What should happen
- leave ": An" as is
- Relevant diffs/links
- [7]
- We can't proceed until
- Agreement on the best solution
This is always an ongoing battle of styles. Added this one: https://github.com/ms609/citation-bot/pull/448 and a more generic fix https://github.com/ms609/citation-bot/pull/453 AManWithNoPlan (talk) 02:36, 29 July 2018 (UTC)
{{fixed}}
If there's an isbn, don't convert amazon link to isbn, just remove it
- Status
- new bug
- Reported by
- Headbomb {t · c · p · b} 02:25, 28 July 2018 (UTC)
- Type of bug
- Improvement
- What happens
- If an amazon link is given and |isbn= exists, the amazon link is converted to |asin=
- What should happen
- If an amazon link is given and |isbn= exists, the amazon link is removed
- Relevant diffs/links
- [8]
- We can't proceed until
- Agreement on the best solution
- Requested action from maintainer
- code added
The bot is making the page better, but you are right that it could do more, especially if the ASIN is an ISBN. AManWithNoPlan (talk) 02:32, 28 July 2018 (UTC)
- It's better yes, but then another edit needs to be made (User:CitationCleanerBot will cleanup what it can every now and then). The bot should also remove asin when isbn is present in general, the link-->asin is just an intermediate step. Headbomb {t · c · p · b} 02:53, 28 July 2018 (UTC)
- It seems to me that perhaps only if the asin is the same as the isbn. AManWithNoPlan (talk) 02:34, 29 July 2018 (UTC)
- It should straight up be removed. ASIN / amazon links should only be used when there's nothing else. See Help:CS1#Identifiers, ASIN section, or CitationCleanerBot 3. Headbomb {t · c · p · b} 04:27, 30 July 2018 (UTC)
- But my retirement savings are all invested in Amazon Stock!!!!. Just joking. https://github.com/ms609/citation-bot/pull/468 AManWithNoPlan (talk) 15:25, 30 July 2018 (UTC)
- A few subtleties here. Links with ASINs starting with letters / ASINs starting with letters should also be removed when ISBNs exist, or converted to |ASIN= when no ISBNs are set. If there is no ISBN, ASINs starting with numbers should be converted to ISBNs when possible (however those starting with |asin=630... aren't ISBNs). Headbomb {t · c · p · b} 15:44, 30 July 2018 (UTC)
- I updated the code. If there is an ISBN, then ignore ASIN. If the ASIN is an ISBN then add as ISBN, if not then add as ASIN. AManWithNoPlan (talk) 17:32, 30 July 2018 (UTC)
- That doesn't sound right. I think it should be: if there is an ISBN or OCLC, remove the ASIN. If there is no ISBN and the ASIN starts with a letter or 630, leave the ASIN alone. If there is no ISBN and the ASIN is a valid ISBN, move the ASIN to |ISBN=. – Jonesey95 (talk) 17:40, 30 July 2018 (UTC)
- It looks like it is all good now. AManWithNoPlan (talk) 19:32, 30 July 2018 (UTC)
- Do we know for certain that 630-series numbers are not isbns? Have the isbn people given that series over to amazon? If there is some sort of official acknowledgement that 630-series numbers are not isbns (even though they validate as isbn numbers) then perhaps cs1|2 should stop adding articles to Category:CS1 maint: ASIN uses ISBN when |asin= holds a 630-series number. Similarly, the documentation for |asin= should be updated to recognize the 630 series.
- —Trappist the monk (talk) 11:08, 31 July 2018 (UTC)
- Not that I'm aware. Doesn't mean that such a thing doesn't exist though, just that I never found it. There is List of ISBN identifier groups, however. Headbomb {t · c · p · b} 11:25, 31 July 2018 (UTC)
- Plug into https://www.isbn.org/ISBN_converter the ASIN 6303007759 and see that it is invalid. AManWithNoPlan (talk) 15:19, 31 July 2018 (UTC)
{{fixed}}
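The rule set proposed by Jonesey95 in this thread can be written down as a small decision function. This is a sketch of the rules as stated in the discussion, not the code that was merged; the function names and the dict-of-parameters representation are hypothetical, and the ASIN `B00005N5PF` in the usage note is a made-up value. The ISBN-10 check is the standard weighted checksum (weights 10 down to 1, sum divisible by 11).

```python
def is_valid_isbn10(value):
    """Standard ISBN-10 checksum: weights 10..1, total divisible by 11."""
    if len(value) != 10:
        return False
    if any(c not in "0123456789" for c in value[:9]):
        return False
    if value[9] not in "0123456789Xx":
        return False
    digits = [int(c) for c in value[:9]]
    digits.append(10 if value[9] in "Xx" else int(value[9]))
    return sum(w * d for w, d in zip(range(10, 0, -1), digits)) % 11 == 0


def tidy_asin(params):
    """Apply the ASIN rules from the discussion to a copy of `params`."""
    result = dict(params)
    asin = result.get("asin", "").strip()
    if not asin:
        return result
    if result.get("isbn") or result.get("oclc"):
        del result["asin"]          # a better identifier exists
    elif asin[0].isalpha() or asin.startswith("630"):
        pass                        # not an ISBN: leave the ASIN alone
    elif is_valid_isbn10(asin):
        result["isbn"] = asin       # the ASIN is really an ISBN
        del result["asin"]
    return result
```

Note that `is_valid_isbn10("6303007759")` returns `True`: the 630-series number passes the checksum, which is why the explicit `630` exclusion is needed. For example, `tidy_asin({"asin": "B00005N5PF"})` leaves the citation unchanged, while `tidy_asin({"asin": "0306406152"})` moves the value to `isbn`.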
Don't capitalize "De" / Capitalize FASEB
- Status
- new bug
- Reported by
- Headbomb {t · c · p · b} 04:29, 4 August 2018 (UTC)
- Type of bug
- improvement
- What happens
- Bot capitalizes "De"
- What should happen
- should be "de"
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=Regulation_of_electronic_cigarettes&diff=prev&oldid=853352540
- We can't proceed until
- Agreement on the best solution
- Requested action from maintainer
- add more to the list
- Status
- new bug
- Reported by
- Headbomb {t · c · p · b} 04:48, 4 August 2018 (UTC)
- Type of bug
- improvement
- What happens
- Faseb
- What should happen
- FASEB
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=W._Mark_Saltzman&diff=853353920&oldid=853353896
- We can't proceed until
- Agreement on the best solution
- Requested action from maintainer
- add more to the list
Gonna anticipate a few more here
Uppercase
- AJHG
- BBA
- BMC
- BMJ
- DNA
- EMBO
- FASEB
- FEBS
- FEMS
- JAMA
- MNRAS
- NEJM
- NYT
- PCR
- PLOS/PLoS
- PNAS
- UK
- USA
Lowercase (but first-letter capital allowed after a . or :)
- a
- an
- el
- de
- la
- le
- für
- of
- on
- the
- van
- von
Some of the lowercase ones can be confused with abbreviations/other words. Headbomb {t · c · p · b} 05:08, 4 August 2018 (UTC)
- Upon further review, I think one of the main issues is when the journal is wikilinked, the bot goes cray with capitalization. Headbomb {t · c · p · b} 06:01, 4 August 2018 (UTC)
- Do you have an example of Wikilinks? We do not touch those. I really wish the databases we query actually formatted the titles right. AManWithNoPlan (talk) 13:15, 4 August 2018 (UTC)
https://github.com/ms609/citation-bot/pull/504 AManWithNoPlan (talk) 14:58, 4 August 2018 (UTC)
{{fixed}}
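The two lists above translate naturally into a pair of lookup sets. The sketch below shows one way the capitalization rules could be applied to a journal name; it is an illustration under stated assumptions, not the bot's implementation. The function name `fix_journal_case` is hypothetical, only the PLOS spelling of PLOS/PLoS is kept, and wikilinked names are not handled.

```python
# Words taken from the lists in this thread.
UPPERCASE = {
    "AJHG", "BBA", "BMC", "BMJ", "DNA", "EMBO", "FASEB", "FEBS", "FEMS",
    "JAMA", "MNRAS", "NEJM", "NYT", "PCR", "PLOS", "PNAS", "UK", "USA",
}
LOWERCASE = {
    "a", "an", "el", "de", "la", "le", "für", "of", "on", "the", "van", "von",
}


def fix_journal_case(name):
    """Upper-case known acronyms and lower-case small words in a journal name.

    A small word keeps its capital when it starts the name or follows
    a word ending in '.' or ':'.
    """
    out = []
    for i, word in enumerate(name.split()):
        core = word.rstrip(".:,;")
        tail = word[len(core):]
        starts_segment = i == 0 or out[-1].endswith((".", ":"))
        if core.upper() in UPPERCASE:
            fixed = core.upper()
        elif core.lower() in LOWERCASE and not starts_segment:
            fixed = core.lower()
        else:
            fixed = core[:1].upper() + core[1:]  # leave the rest untouched
        out.append(fixed + tail)
    return " ".join(out)
```

As noted in the thread, some of the lowercase words can be confused with abbreviations or other words, so a production version would need exceptions; abbreviated titles in which every word ends in a full stop are one such case.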
do not touch wikilinks
Examples of wikilinks: [9] (at the very bottom) and [10] (look for Agricultural and Forest Meteorology and Proceedings of the National Academy of Sciences of the USA). Headbomb {t · c · p · b} 15:56, 4 August 2018 (UTC)
- that is a regression. https://github.com/ms609/citation-bot/pull/506 AManWithNoPlan (talk) 16:13, 4 August 2018 (UTC)
- the above also covers links too. AManWithNoPlan (talk) 20:25, 4 August 2018 (UTC)
{{fixed}}
bot broke citation template by leaving |work= in the template
- Status
- {{fixed}} enough
- Reported by
- Trappist the monk (talk) 14:53, 2 August 2018 (UTC)
- Type of bug
- Inconvenience
- What happens
- With this edit, citation bot converted this somewhat correct template:
{{Citation|title=Reauthorizing the Elementary and Secondary Education Act|url=https://dx.doi.org/10.1057/9781137030931.0011|work=President Obama and Education Reform|publisher=Palgrave Macmillan|isbn=9781137030931|access-date=2018-07-09}}
- "Reauthorizing the Elementary and Secondary Education Act", President Obama and Education Reform, Palgrave Macmillan, ISBN 9781137030931, retrieved 2018-07-09
to this broken template:
{{Citation|work=President Obama and Education Reform|publisher=Palgrave Macmillan|isbn=9781137030931|doi=10.1057/9781137030931.0011|chapter=Reauthorizing the Elementary and Secondary Education Act|title = President Obama and Education Reform|year = 2012}}
- "President Obama and Education Reform", President Obama and Education Reform, Palgrave Macmillan, 2012, doi:10.1057/9781137030931.0011, ISBN 9781137030931
{{citation}}: |chapter= ignored (help)
The bot should have removed |work= when it added |chapter= because |work= (and its alias) is the mechanism that switches {{citation}} from 'book style' to 'periodical style'.
- We can't proceed until
- Agreement on the best solution
Perhaps just delete |work= when empty or when has chapter and work is equal to series, journal, title, chapter, or publisher. AManWithNoPlan (talk) 16:39, 2 August 2018 (UTC)
- Another option is to change to {{cite book}} AManWithNoPlan (talk) 16:46, 2 August 2018 (UTC)
- But in this case, |work= wasn't empty ...
- Changing to {{cite book}} wouldn't fix the problem for two reasons:
- the bot created a new |title= by copying content from |work= and retained |work= so now we have redundant information in the rendered citation:
{{Cite book|work=President Obama and Education Reform|publisher=Palgrave Macmillan|isbn=9781137030931|doi=10.1057/9781137030931.0011|chapter=Reauthorizing the Elementary and Secondary Education Act|title = President Obama and Education Reform|year = 2012}}
- "Reauthorizing the Elementary and Secondary Education Act". President Obama and Education Reform. Palgrave Macmillan. 2012. doi:10.1057/9781137030931.0011. ISBN 9781137030931. {{cite book}}: |work= ignored (help)
- style change from cs2 to cs1; and if there were short-form references depending on the automatic CITEREF links created by {{citation}}, those links are now broken
- —Trappist the monk (talk) 17:02, 2 August 2018 (UTC)
- Good points. The real problem is that citation templates have so many parameters that are almost the same but not the same. We cannot fix that. It seems that we could implement code that checks for |work= and if the new title/chapter/publisher/journal matches it then drop it. AManWithNoPlan (talk) 17:09, 2 August 2018 (UTC)
- In cs1|2 the internal parameter is Periodical. Any of |journal=, |newspaper=, |magazine=, |work=, |website=, |periodical=, |encyclopedia=, |encyclopaedia=, |dictionary=, |mailinglist= are aliases that feed into that internal parameter so all of them generally act the same. Module:Citation/CS1 does look at the names that were used in the template source because for {{citation}} the name of the parameter gives a clue to how the citation should be rendered. For example, when the source for Periodical is |journal=, Module:Citation/CS1 knows to render |volume=, |issue=, and |page(s)= using academic journal style and to emit the journal style COinS metadata. {{citation}} balks at the combination of any Periodical parameter in the presence of any Chapter alias. In the example template, copying the content of a Periodical alias to |title= should blank the Periodical alias so that {{citation}} isn't confused.
- —Trappist the monk (talk) 00:12, 3 August 2018 (UTC)
- Just for the record, the bot is copying nothing: it just finds the same string again in its database search. AManWithNoPlan (talk) 00:27, 3 August 2018 (UTC)
- Just need some code that notices if work===title and such and then deletes work. Case insensitive of course. AManWithNoPlan (talk) 00:30, 3 August 2018 (UTC)
- Really? What if work and title are off by one character because of a typo or whatever? If the bot is correcting a malformed citation, as it attempted to do in this example, and ends up with a configuration that is not supported then perhaps the correct response is to do nothing.
- —Trappist the monk (talk) 13:20, 3 August 2018 (UTC)
- Not sure exactly what is best, but this is a good first step https://github.com/ms609/citation-bot/pull/507 AManWithNoPlan (talk) 00:33, 5 August 2018 (UTC)
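The check discussed in this thread could be sketched roughly as follows. This is Python for illustration only, not the bot's own code, and the linked pull request may well do it differently. The function name `drop_redundant_work` and the dict representation of a template are hypothetical. The comparison is exact apart from case and surrounding whitespace, so a one-character typo leaves the template untouched, in line with the "do nothing" fallback suggested above.

```python
def drop_redundant_work(params):
    """Return a copy of params without |work= when it duplicates another field.

    params is a hypothetical dict mapping template parameter names to values.
    """
    work = params.get("work", "").strip().casefold()
    if not work:
        return dict(params)
    # Fields named in the discussion: title, chapter, publisher, journal.
    for name in ("title", "chapter", "publisher", "journal"):
        if params.get(name, "").strip().casefold() == work:
            # Exact case-insensitive match: |work= adds nothing, so drop it.
            return {k: v for k, v in params.items() if k != "work"}
    # No exact match (for example a typo): leave the template alone.
    return dict(params)


cite = {
    "work": "President Obama and Education Reform",
    "title": "President Obama and Education Reform",
    "chapter": "Reauthorizing the Elementary and Secondary Education Act",
}
cleaned = drop_redundant_work(cite)
```

Here `cleaned` keeps |title= and |chapter= but no longer carries |work=.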
bot added url for a different article
- Status
- {{fixed}}
- Reported by
- Trappist the monk (talk) 14:47, 6 August 2018 (UTC)
- Relevant diffs/links
- diff
- We can't proceed until
- Agreement on the best solution
I noticed this because the referenced edit caused a url–wikilink conflict error. The original template has an inappropriate wikilink in |title=:
{{cite journal | doi = 10.1671/0272-4634(2002)022[0058:ADATDF]2.0.CO;2 | last1 = Lamanna | first1 = M.C. | last2 = Martinez | first2 = R.D. | last3 = Smith | first3 = J.B. | year = 2002 | title = A definitive abelisaurid theropod dinosaur from the early Late Cretaceous of [[Patagonia]]". | url = | journal = Journal of Vertebrate Paleontology | volume = 22 | issue = 1| pages = 58–69 }}
- Lamanna, M.C.; Martinez, R.D.; Smith, J.B. (2002). "A definitive abelisaurid theropod dinosaur from the early Late Cretaceous of Patagonia"". Journal of Vertebrate Paleontology. 22 (1): 58–69. doi:10.1671/0272-4634(2002)022[0058:ADATDF]2.0.CO;2.
From that, the bot made this:
{{cite journal | doi = 10.1671/0272-4634(2002)022[0058:ADATDF]2.0.CO;2 | last1 = Lamanna | first1 = M.C. | last2 = Martinez | first2 = R.D. | last3 = Smith | first3 = J.B. | year = 2002 | title = A definitive abelisaurid theropod dinosaur from the early Late Cretaceous of [[Patagonia]]" | url = http://www.bioone.org/doi/pdf/10.4202/app.00132.2014| journal = Journal of Vertebrate Paleontology | volume = 22 | issue = 1| pages = 58–69 | format = Full text }}
- Lamanna, M.C.; Martinez, R.D.; Smith, J.B. (2002). "A definitive abelisaurid theropod dinosaur from the early Late Cretaceous of [[Patagonia]]"" (Full text). Journal of Vertebrate Paleontology. 22 (1): 58–69. doi:10.1671/0272-4634(2002)022[0058:ADATDF]2.0.CO;2.
{{cite journal}}: URL–wikilink conflict (help)
If you follow the doi you get to the article that matches the bibliographic data. If you follow the title-link you end up at a vaguely related article (they are both about abelisaurids) that does not match the bibliographic data.
The value in the original |title= is malformed: it has a wikilink (it shouldn't) and it has extraneous punctuation (the single unmatched double quote mark and a period – neither of which belongs there). Still, the bot should not be adding a url when |title= is wikilinked either explicitly (has wikilink markup) or indirectly by |title-link=, or has wikilinks (which are almost always inappropriate). It could be argued that, for |title= parameters with single-word wikilink markup, the markup should be removed. It is more difficult to know what to do with wikilinks in the form [[target|label]] because this form of wikilink is commonly used when linking to sources at, for example, wikisource.
—Trappist the monk (talk) 14:47, 6 August 2018 (UTC)
Bad link: That is bad data in the database, but I have improved the code and the specific example will not occur https://github.com/ms609/citation-bot/pull/512 AManWithNoPlan (talk) 16:27, 6 August 2018 (UTC)
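A guard along the lines requested above might look like this. It is a minimal Python sketch, not what the linked pull request does. The name `may_add_url` and the dict of parameters are hypothetical. It refuses to add |url= whenever |title= contains wikilink markup or |title-link= is set.

```python
import re

# Matches [[target]] and [[target|label]] markup.
WIKILINK = re.compile(r"\[\[[^\[\]]+\]\]")


def may_add_url(params):
    """Return True only when adding |url= cannot clash with a linked title."""
    if params.get("url", "").strip():
        return False  # a URL is already present; nothing to add
    if params.get("title-link", "").strip():
        return False  # title is linked indirectly
    if WIKILINK.search(params.get("title", "")):
        return False  # title carries explicit wikilink markup
    return True


reported = {
    "title": "A definitive abelisaurid theropod dinosaur from the early "
             "Late Cretaceous of [[Patagonia]]\".",
    "url": "",
}
allowed = may_add_url(reported)
```

For the template reported in this thread `allowed` is False, so no url would be added and the URL–wikilink conflict would not arise.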
caps again
- [11] (or anti-bug diff for what I fixed after the bot.)
Touches 'zu', 'des', 'aus', 'dem', 'del', 'dei', 'of', 'di', 'ed', 'du', 'de', 'dans', 'les', 'e'. Headbomb {t · c · p · b} 03:09, 7 August 2018 (UTC)
{{fixed}}
Open access links that duplicate existing data links
- What happens
- Adds links to handle.net
- What should happen
- use |hdl= instead
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=Fermi_paradox&diff=prev&oldid=853920938
- We can't proceed until
- Agreement on the best solution
The problem is that it tries to add the link as a |hdl= and fails since |hdl= is already set. The solution is to view that as a success. This bug means that if you run the bot once you will get hdl set, and then a second time it will add the link as a url. https://github.com/ms609/citation-bot/pull/517 AManWithNoPlan (talk) 21:43, 7 August 2018 (UTC)
- You are working my butt off by the way. Which is good. AManWithNoPlan (talk) 21:43, 7 August 2018 (UTC)
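The "view that as a success" fix can be illustrated with a short Python sketch. It is not the bot's code. The function name, the dict of parameters, the `hdl.handle.net` host check and the handle value used in the example are all assumptions made for illustration.

```python
from urllib.parse import urlparse, unquote


def add_handle_from_url(params, url):
    """Try to record a handle URL as |hdl=; return True when no |url= is needed."""
    parsed = urlparse(url)
    if parsed.netloc.lower() != "hdl.handle.net":  # assumed resolver host
        return False
    handle = unquote(parsed.path.lstrip("/"))
    if not handle:
        return False
    existing = params.get("hdl", "").strip()
    if not existing:
        params["hdl"] = handle
        return True
    # Already set: the same handle counts as success, so the caller
    # does not fall back to adding the link as |url= on a second run.
    return existing.lower() == handle.lower()


cite = {}
link = "https://hdl.handle.net/2027.42/12345"  # made-up handle for the example
first_run = add_handle_from_url(cite, link)
second_run = add_handle_from_url(cite, link)
```

Both runs report success, which is what stops the second run from adding the same link as a url. A different handle already in |hdl= still returns False here; whether the bot should then add a url is a separate design question.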
.pdf at the end of dois
- Status
- {{fixed}}
- Reported by
- Headbomb {t · c · p · b} 13:01, 9 August 2018 (UTC)
- Type of bug
- Inconvenience
- What happens
- bot adds |doi=10.1007/BF00428580.pdf based on |url=https://link.springer.com/content/pdf/10.1007/BF00428580.pdf
- What should happen
- Bot should be smart and strip .pdf at the end of dois.
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=Otto_Kandler&diff=prev&oldid=854171791
- We can't proceed until
- Agreement on the best solution
https://github.com/ms609/citation-bot/pull/523 AManWithNoPlan (talk) 21:10, 9 August 2018 (UTC)
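Stripping the suffix could be done roughly like this. It is a Python sketch under stated assumptions, not the code from the pull request. The regular expression is a loose DOI pattern chosen for the example, and `doi_from_url` is a hypothetical name.

```python
import re

# Loose pattern: "10.", a registrant code, a slash, then non-space characters.
DOI_IN_URL = re.compile(r"10\.\d{4,9}/\S+")


def doi_from_url(url):
    """Extract a DOI-like string from a URL, dropping a trailing '.pdf'."""
    match = DOI_IN_URL.search(url)
    if match is None:
        return None
    doi = match.group(0)
    if doi.lower().endswith(".pdf"):
        doi = doi[:-4]  # the file extension is not part of the DOI
    return doi


found = doi_from_url("https://link.springer.com/content/pdf/10.1007/BF00428580.pdf")
```

For the reported URL this gives `10.1007/BF00428580`. A DOI could in principle end in ".pdf" legitimately, so a more careful version might check that the trimmed DOI resolves before using it.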
Better cross-checking against NLM/NIH databases
Running CitationBot on doi:10.1073/pnas.171325998 finds PMC 58796, but not PMID 11573006. Basically, the bot should query both PubMed and PubMed Central in every possible way until each of doi/pmid/pmc is found, and iterate when new identifiers are found.
- PubMed
  - |doi= (e.g. PubMed doi query)
  - |pmc= (e.g. PubMed PMC query)
  - |pmid= (e.g. PubMed PMID query)
- PubMed Central
  - |doi= (e.g. PubMed Central doi query)
  - |pmc= (e.g. PubMed Central PMC query)
of citation templates in the NLM/NIH databases, and cross-reference things with each other.
The bot should also not assume the queries return 'complete' results. Very often, a PMID entry won't list the PMC, even if a PMC exists and could be discoverable by a DOI query (and vice-versa for PMCs listing a DOI, but not a PMID, or a PMID, but not doi, or every other such combination). Headbomb {t · c · p · b} 04:42, 9 August 2018 (UTC)
- I noticed that years ago. But, there were so many other issues to deal with that I forgot about it. AManWithNoPlan (talk) 14:11, 9 August 2018 (UTC)
- they changed their xml output. https://github.com/ms609/citation-bot/pull/530 https://github.com/ms609/citation-bot/pull/533 AManWithNoPlan (talk) 22:04, 9 August 2018 (UTC)
- They changed the DOI search method https://github.com/ms609/citation-bot/pull/534 This also includes tests so if they change it again we will see it. AManWithNoPlan (talk) 17:39, 10 August 2018 (UTC)
{{fixed}}
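The "query everything, then iterate" idea from this thread can be sketched as a fixed-point loop. This Python sketch makes no real network calls. `complete_identifiers` and the two stand-in lookup functions are hypothetical, and what the stand-ins return is invented purely to mimic the incomplete results described above, using the identifiers from the report.

```python
def complete_identifiers(known, lookups):
    """Query every source with every known identifier until nothing new turns up.

    known   -- dict such as {"doi": "10.1073/pnas.171325998"}
    lookups -- callables taking (kind, value) and returning a dict of the
               identifiers they found, possibly incomplete or empty
    """
    ids = dict(known)
    tried = set()
    progress = True
    while progress:
        progress = False
        for index, lookup in enumerate(lookups):
            for kind, value in list(ids.items()):
                if (index, kind, value) in tried:
                    continue  # this source was already asked about this id
                tried.add((index, kind, value))
                for new_kind, new_value in lookup(kind, value).items():
                    if new_kind not in ids:
                        ids[new_kind] = new_value
                        progress = True  # a new id means another pass
    return ids


def fake_pubmed(kind, value):
    # Stand-in for a PubMed query: only answers when asked by PMC.
    return {"pmid": "11573006"} if (kind, value) == ("pmc", "58796") else {}


def fake_pmc(kind, value):
    # Stand-in for a PubMed Central query: only answers when asked by DOI.
    if (kind, value) == ("doi", "10.1073/pnas.171325998"):
        return {"pmc": "58796"}
    return {}


result = complete_identifiers({"doi": "10.1073/pnas.171325998"},
                              [fake_pubmed, fake_pmc])
```

With these stand-ins the DOI yields the PMC on the first pass, and the PMC then yields the PMID on the second, so `result` holds all three identifiers.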
citeseerx links
- Status
- Fixed in GitHub Pull 526
- Reported by
- Headbomb {t · c · p · b} 13:09, 9 August 2018 (UTC)
- Type of bug
- Improvement
- What happens
- Bot adds |url=http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.752.4896
- What should happen
- Bot adds |citeseerx=10.1.1.752.4896
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=Quantum_nonlocality&diff=prev&oldid=854173273
- We can't proceed until
- new code
- Requested action from maintainer
- new code
See also the hdl issue above. Headbomb {t · c · p · b} 13:10, 9 August 2018 (UTC)
- completely unrelated to hdl issue. AManWithNoPlan (talk) 13:54, 9 August 2018 (UTC)
- Seems exactly the same type of issue to me: failing to use |citeseerx=, just like it failed to use |hdl=, but you're the coder here. Headbomb {t · c · p · b} 14:05, 9 August 2018 (UTC)
- The difference is that in the case of hdl, it already had the hdl set, so it failed to add it and then fell back on adding it as a url. In the case of citeseerx, the bot has no code to even add one. AManWithNoPlan (talk) 14:11, 9 August 2018 (UTC)
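The requested conversion could be sketched as below. This is Python for illustration and not the code in GitHub Pull 526. The function name is hypothetical, and the sketch only handles URLs of the shape shown in the report, where the identifier is carried in the `doi` query parameter.

```python
from urllib.parse import urlparse, parse_qs


def citeseerx_id_from_url(url):
    """Return the CiteSeerX identifier carried in a viewdoc URL, or None."""
    parsed = urlparse(url)
    if parsed.netloc.lower() != "citeseerx.ist.psu.edu":
        return None
    values = parse_qs(parsed.query).get("doi")
    return values[0] if values else None


ident = citeseerx_id_from_url(
    "http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.752.4896"
)
```

For the reported URL `ident` is `10.1.1.752.4896`, the value that would go into |citeseerx= instead of |url=.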
French words that have internal apostrophes
- [12] (or anti-bug diff for what I fixed after the bot.)
Touches 'l'', 'd''. Headbomb {t · c · p · b} 03:09, 7 August 2018 (UTC)
- I need to think about 'l'' and 'd'' in words like d'Évaporation AManWithNoPlan (talk) 16:00, 7 August 2018 (UTC)
HORRIBLE to fix, but {{fixed}} AManWithNoPlan (talk) 02:31, 12 August 2018 (UTC)
In cite journal, if work is set, publisher isn't removed, but if journal is set, publisher is removed
Work is such a poorly used parameter that removing the publisher based upon it is dubious. I have added this code https://github.com/ms609/citation-bot/pull/545 so that if |work= is set and the journal title happens to be the same, then |work= is changed to |journal=. AManWithNoPlan (talk) 17:59, 11 August 2018 (UTC)
wikilinked titles
- Status
- {{fixed}}
- Reported by
- Trappist the monk (talk) 14:47, 6 August 2018 (UTC)
- Relevant diffs/links
- diff
- We can't proceed until
- Agreement on the best solution
- https://github.com/ms609/citation-bot/pull/525 AManWithNoPlan (talk) 21:09, 9 August 2018 (UTC)
- @AManWithNoPlan: not sure I know what's being done in that exactly, but will this strip |journal=[[Journal of Foobar]] to |journal=Journal of Foobar? Because if so, it shouldn't. Headbomb {t · c · p · b} 21:42, 9 August 2018 (UTC)
- It removes all wikilinks from |title=. It removes all wikilinks from |journal= UNLESS the link is the entire name of the journal. AManWithNoPlan (talk) 21:49, 9 August 2018 (UTC)
- If you look at the changed files, one of them is a test suite and you can see the changes. AManWithNoPlan (talk) 22:59, 11 August 2018 (UTC)
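The behaviour described in the replies above can be sketched in Python as follows. This is an illustration of the stated rule, not the code in the pull request. The function names are hypothetical, and piped links are reduced to their label, which is an assumption.

```python
import re

# [[target]] or [[target|label]]; group 1 is the visible text.
WIKILINK = re.compile(r"\[\[(?:[^\[\]|]*\|)?([^\[\]|]*)\]\]")


def strip_wikilinks(text):
    """Replace every wikilink with its visible text."""
    return WIKILINK.sub(r"\1", text)


def clean_title(value):
    """|title=: remove all wikilinks."""
    return strip_wikilinks(value)


def clean_journal(value):
    """|journal=: keep a link that is the entire journal name, else strip."""
    if WIKILINK.fullmatch(value.strip()):
        return value
    return strip_wikilinks(value)


kept = clean_journal("[[Journal of Foobar]]")
stripped = clean_journal("Journal of [[Foobar]]")
```

Here `kept` is unchanged because the link is the whole journal name, while `stripped` becomes plain "Journal of Foobar".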
Researchgate links
The bot should trim ResearchGate links like
|url=https://www.researchgate.net/publication/320041870_Analysis_of_References_Across_Wikipedia_Languages
to the simpler
Headbomb {t · c · p · b} 13:31, 11 August 2018 (UTC)
- And upgrade http to https AManWithNoPlan (talk) 18:00, 11 August 2018 (UTC)
{{fixed}}
Physical Review E → Physical Review e
- What happens
- Physical Review E → Physical Review e
- What should happen
- leave Physical Review E alone
- Relevant diffs/links
- [16]
- We can't proceed until
- Agreement on the best solution
This should apply to every single-letter word at the end of a string, or before a ':'. E.g. Journal of Physics E: Blah BLah BLuh or Chemical Physics A. Headbomb {t · c · p · b} 18:07, 14 August 2018 (UTC)
- https://github.com/ms609/citation-bot/pull/560 AManWithNoPlan (talk) 19:40, 14 August 2018 (UTC)
- This happened when we added support for the Spanish word "e" ("and"). That fixed a lot of Spanish things, but we forgot about "j chem phys e" type stuff. But come on, who splits their journals five ways? Obviously physics people do. AManWithNoPlan (talk) 19:49, 14 August 2018 (UTC)
- That's because you haven't seen Proceedings of the Institution of Mechanical Engineers, parts A through P. Headbomb {t · c · p · b} 19:53, 14 August 2018 (UTC)
- Those organic chemists just need part H, O, N, and C. :-) AManWithNoPlan (talk) 20:20, 14 August 2018 (UTC)
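The rule proposed in this thread (leave a single letter capitalised at the end of the name or before a ':') might be sketched like this. It is Python for illustration only. The small-word list is a tiny invented subset, not the bot's actual list, and `fix_series_letter` is a hypothetical name.

```python
SMALL_WORDS = {"a", "e", "de", "of", "the", "and"}  # illustrative subset only


def fix_series_letter(title):
    """Lowercase small words, but keep a lone series letter capitalised."""
    words = title.split()
    out = []
    for i, word in enumerate(words):
        core = word.rstrip(":")
        is_last = i == len(words) - 1
        before_colon = word.endswith(":")
        if len(core) == 1 and core.isalpha() and (is_last or before_colon):
            # Single letter at the end or before ':' is a series letter.
            out.append(core.upper() + word[len(core):])
        elif i > 0 and core.lower() in SMALL_WORDS:
            out.append(word.lower())
        else:
            out.append(word)
    return " ".join(out)


repaired = fix_series_letter("Physical Review e")
```

`repaired` is "Physical Review E", and a name that is already correct, such as "Chemical Physics A", passes through unchanged.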
Academia.edu links
- Status
- {{fixed}}
- Reported by
- Headbomb {t · c · p · b} 21:54, 14 August 2018 (UTC)
- Type of bug
- Improvement: The bot would be much better if ...
- What should happen
- simplify
|url=http://www.academia.edu/25456862/Theropod_dinosaurs_from_the_Late_Jurassic_of_Tendaguru_Tanzania
to
See also User_talk:Citation bot#Researchgate links
- We can't proceed until
- Agreement on the best solution
https://github.com/ms609/citation-bot/pull/564 Learned some things too. AManWithNoPlan (talk) 00:21, 15 August 2018 (UTC)