User talk:Citation bot/Archive 22
This is an archive of past discussions about User:Citation bot. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page. |
Better PMID url cleanup
- What should happen
- [1]
- We can't proceed until
- Feedback from maintainers
I finally looked at these. When these links work, they redirect to publisher, so they are actually a duplicate of the DOI, not pubmed ID. Curious. Will work on. AManWithNoPlan (talk) 13:50, 21 July 2020 (UTC)
Caps: ecancermedicalscience
what should happen = [2]
- We can't proceed until
- Feedback from maintainers
https://github.com/ms609/citation-bot/pull/3351 AManWithNoPlan (talk) 20:00, 29 July 2020 (UTC)
strong tags
- What happens
- Since the title didn't match, I TNT'd it. This is what I got [3]
- What should happen
- Remove strong tags. And the stray dots. If not automatically, it should at least be removed for purposes of title matching.
- We can't proceed until
- Feedback from maintainers
Caps: PRZ / PRZ.
- What should happen
- [4]
- We can't proceed until
- Feedback from maintainers
This is the ISO 4 abbreviation for "Przegląd" Headbomb {t · c · p · b} 20:14, 31 July 2020 (UTC)
The Citation Bot is currently blocked because of disagreement over its usage. When will it be back up?
When will it be back up?
The Citation Bot is currently blocked because of disagreement over its usage. 101Fake101 (talk) 14:51, 4 August 2020 (UTC)
- please see above. {{Duplicate Issue}} AManWithNoPlan (talk) 16:01, 4 August 2020 (UTC)
Remove duplicate citations
- Status
- {{wontfix}} beyond scope of this bot. I am afraid that we will become the kitchen sink of bots, and we do not have the time to keep up with yet another possible vector for bugs
- Reported by
- RayScript (talk) 22:28, 11 May 2020 (UTC)
- What happens
- I noticed there are articles with duplicate citations. It seems like it would make sense to merge these duplicate citations (as the ReFill bot does) instead of leaving them separate. There are some cases (such as different pages of books) where it makes sense to cite one source many times. However, I can't think of a case where it is useful to have the exact same citation two times. Here's an example where number 32 and 33 were identical and could be collapsed. https://en.wikipedia.org/w/index.php?title=Patrick_Combs&diff=955932647&oldid=955932567&diffmode=visual
- We can't proceed until
- Feedback from maintainers
- I think everyone is in favor of doing this. If we implement it, then we would have to get a lot of test cases. Have it only run in tool mode to start. Phase 2: combine citations that have the same parameters but in a different order, including blank ones. Phase 3: existing refs with different names. AManWithNoPlan (talk) 11:49, 21 May 2020 (UTC)
- Hi, anybody out here who knows what happened to reFill? Thank you for your time. Lotje (talk) 11:22, 11 July 2020 (UTC)
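The phased matching described in this thread (identical citations first, then citations with the same parameters in a different order, ignoring blank ones) could be sketched roughly as below. This is only an illustration, not the bot's code: `parse_params` and `citation_key` are hypothetical helper names, and the naive split on `|` assumes parameter values contain no nested templates or piped wikilinks.

```python
def parse_params(template: str) -> dict[str, str]:
    """Split '{{cite journal |a=1 |b=2}}' into a name -> value mapping.

    Hypothetical helper: assumes no '|' appears inside any value.
    """
    inner = template.strip()[2:-2]          # drop the surrounding {{ }}
    name, *parts = inner.split("|")
    params = {"_template": name.strip().lower()}
    for part in parts:
        key, _, value = part.partition("=")
        params[key.strip().lower()] = value.strip()
    return params


def citation_key(template: str) -> tuple:
    """Order-insensitive key that ignores blank parameters (the 'Phase 2' idea)."""
    params = parse_params(template)
    return tuple(sorted((k, v) for k, v in params.items() if v))


def find_duplicates(templates: list[str]) -> list[list[int]]:
    """Group indices of citations that share the same normalized key."""
    groups: dict[tuple, list[int]] = {}
    for index, template in enumerate(templates):
        groups.setdefault(citation_key(template), []).append(index)
    return [indices for indices in groups.values() if len(indices) > 1]
```

Comparing a normalized key rather than raw wikitext is what lets reordered or blank parameters still count as the same citation; refs with different names (Phase 3) would need handling outside this sketch.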
Removing links from title
Why is the bot removing links from titles of articles? There is clear consensus that editors want titles to be linked to the best available online source, especially when that is free to read. --RexxS (talk) 17:57, 7 June 2020 (UTC)
- Agreed. The bot is removing other editors' work and is preventing users from freely accessing information. If nobody corrects this quickly, I suggest that editors remove doi's from citations that have direct links to titles. Corker1 (talk) 20:07, 29 July 2020 (UTC)
It is very discouraging to have to go through this all over again, after We Just Did This. It is also discouraging to get answers non-BOT people can't always decipher.[5] I only want to provide links to full articles, when possible, for the benefit of our readers. Can anyone explain to me why Semantic Scholar is in a position to be dictating how we link on Wikipedia? SandyGeorgia (Talk) 22:07, 7 June 2020 (UTC)
The copyright problem has many facets. S2 has copies of licensed papers and papers scraped from the web. Wikipedia rules say not to link to copyright-infringing works (there are exceptions); that covers the scraped ones. This bot won't add links to CiteSeerX for that reason. Similarly, it won't add S2 links. There are editors who actively remove these links if they cannot verify a license. S2 now has an API for determining licenses (see discussion above), and so there is a debate on adding those. The argument against is that all legal S2 links are also free from the publisher. AManWithNoPlan (talk) 11:26, 8 June 2020 (UTC)
{{fixed}} for S2CID and we can start this discussion anew as needed. AManWithNoPlan (talk) 13:58, 17 August 2020 (UTC)
June 2020
{{unblock|reason=Your reason here ~~~~}}
. RexxS (talk) 00:27, 8 June 2020 (UTC)
lol… here we go again — Chris Capoccia 💬 01:36, 8 June 2020 (UTC)
Where is the official conversation occurring? —¿philoserf? (talk) 03:38, 8 June 2020 (UTC)
I didn’t want to say anything, but since this account has been blocked for related issues: There were two issues with this edit:
While I have repaired the damage, these changes did reduce the quality of the article. SkylabField (talk) 00:09, 11 June 2020 (UTC)
Arbitrary break
- When can we expect the unblock of Citation Bot? Grimes2 (talk) 15:37, 14 June 2020 (UTC)
- I have been waiting for the big OAuth changeover to be done before asking, which is now done. Also, I have been busy doing a conversion from PHP 5.6 to 7.2, which is now done. The bot has been changed so that the S2 title-link will only be removed if one of these two is true:
- There is an active auto-link (currently PMC, but soon there will be more, I have heard, via |doi-access=free and such).
- The S2 page is not licensed by S2, and was just a web-scrape.
It is worth noting that the bot currently does not work since the OAuth tokens have not yet been updated by the bot operator (he has been asked). Similarly, the URL expansion part is completely down, and many tools will not work because of the DNS changes. AManWithNoPlan (talk) 14:48, 15 June 2020 (UTC)
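Read as a decision rule, the two conditions above amount to a simple "either/or" check. The following is a minimal sketch under stated assumptions, not the bot's actual PHP code: the function name, the `params` mapping, and the `s2_copy_is_licensed` flag are hypothetical, and "active auto-link" is modelled only as a non-empty `pmc` parameter, since PMC is the only auto-link named as currently active.

```python
def should_remove_s2_title_url(params: dict[str, str], s2_copy_is_licensed: bool) -> bool:
    """Return True if the Semantic Scholar title URL may be dropped.

    Assumptions (illustrative only):
    - params maps citation parameter names to values;
    - a non-empty 'pmc' value means the title is auto-linked;
    - s2_copy_is_licensed reflects what Semantic Scholar reports about the copy.
    """
    has_active_autolink = bool(params.get("pmc", "").strip())
    return has_active_autolink or not s2_copy_is_licensed
```

For example, a citation with no PMC identifier whose S2 copy is licensed keeps its URL under this rule, while an unlicensed web-scrape loses it regardless of other parameters.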
- Can you please translate this for us lesser mortals who have no idea what the hey S2 and PMC actually are? Oh, and I don't like the sound of your first condition, but that may be because you choose to use cryptic descriptors. - Nick Thorne talk 13:20, 17 June 2020 (UTC)
- S2 is Semantic Scholar. PMC is PubMed Central. Headbomb {t · c · p · b} 13:28, 17 June 2020 (UTC)
Citation bot (block log • active blocks • global blocks • contribs • deleted contribs • filter log • creation log • change block settings • unblock • checkuser (log))
Request reason:
bot changed to only remove URL if a PMC link will take its place or if Semantic Scholar link is unlicensed. If other auto-linking flags such as doi-access=free create a link, we will recognize that in the future too. AManWithNoPlan (talk) 15:25, 19 June 2020 (UTC)
Decline reason:
Despite several requests, there has been no evidence shown of bot approval for the removal of links from citation titles. If such approval does actually exist, or it is sought and granted, please show the evidence in a new unblock request. If the bot is changed to remove these link removals, please make a new unblock request. Boing! said Zebedee (talk) 21:52, 25 June 2020 (UTC)
If you want to make any further unblock requests, please read the guide to appealing blocks first, then use the {{unblock}} template again. If you make too many unconvincing or disruptive unblock requests, you may be prevented from editing this page until your block has expired. Do not remove this unblock review while you are blocked.
@RexxS: Seeing as you blocked, your thoughts? The issue appears to have been dealt with. CaptainEek Edits Ho Cap'n!⚓ 23:25, 24 June 2020 (UTC)
- @CaptainEek: The issue I blocked for was the removal of links from citation titles, a task for which the bot is not authorised, nor is there any consensus for it. Has that issue now been resolved unambiguously? and will the bot be editing only within its authorisation in future? --RexxS (talk) 23:38, 24 June 2020 (UTC)
- The bot will only delink S2 from the title during conversion to |S2CID= if one of these is true: the PMC is linked in the title (there was a general liking of this idea) OR the S2 URL is not a publisher-approved copy (this will catch people off guard at times, but those links do violate WP policy). If auto-linking via other things like doi-access=free starts working and is available, I assume that removing the URL during S2 conversion makes sense just like PMC auto-linking does now. AManWithNoPlan (talk) 23:57, 24 June 2020 (UTC)
- Thanks for the update, AManWithNoPlan. Three questions then:
- One of the bot edits I complained about removing citation title links was this one, which removed the link from the citation title, and was not related to s2cid. Are we now certain that it won't remove any more links from citation titles, with the possible exception of links pointing to copyvios at Semantic Scholar?
- You state that it will remove the link from a citation title when it points to a copyvio at Semantic Scholar. Where is the bot approval for that task?
- If the answer to 1 is in the negative, where is the bot approval for that task?
- I think that satisfactory answers to those questions should be essential prerequisites to any unblocking. --RexxS (talk) 00:25, 25 June 2020 (UTC)
The JSTOR link does not link to a full free copy, so it would be removed. We have had consensus for converting URLs to IDs for a long time. As for removal of copyvio links, since we have consensus to convert links that do not link to full and open copies, the copyvio S2 links fall into the "not full and open" pile of URLs. Copyvio is a big deal on Wikipedia, a much bigger deal than linking non-free copies. AManWithNoPlan (talk) 00:38, 25 June 2020 (UTC)
- Removing copyvio S2 links seems like a big win to me. I don't see why the bot should be blocked for doing this. —David Eppstein (talk) 00:57, 25 June 2020 (UTC)
- @AManWithNoPlan: No, that's untrue. You have no consensus to remove links from citation titles, and the bot has no authorisation to do so. If you believe you have authorisation, then please quote and link the text of it. Without consensus and bot approval, I strongly oppose any unblocking.
- Copyright is indeed a big deal on Wikipedia, and is far too important to leave to a bot's judgement. --RexxS (talk) 01:06, 25 June 2020 (UTC)
- It is actually S2 that provides the judgement on the copyvio status of their pages. There is no judgement, just clear facts straight from the horse's mouth. AManWithNoPlan (talk) 01:46, 25 June 2020 (UTC)
- There is consensus to remove non-free links redundant with identifiers, yes, and it is authorized to do so as well as many other bots. Headbomb {t · c · p · b} 02:51, 25 June 2020 (UTC)
- So there's no authorisation for Citation bot to remove any links and you cannot supply text or link to its authorisation. There's no consensus for it either, and you are unable to present a link for where any consensus was reached. This bot has been used irresponsibly by a small self-selected group to impose their view of how citations should be presented. --RexxS (talk) 16:17, 25 June 2020 (UTC)
- Wikipedia:Bots/Requests_for_approval/DOI_bot_2, from 2008 [see also [6]]. And template documentation, since pretty much time immemorial: Use parameters, not URLs when specific parameters are available, because URLs should be used for freely accessible versions. Headbomb {t · c · p · b} 17:27, 25 June 2020 (UTC)
- You see, this is a perfect example of the FUD produced when the bot's approval is questioned. From Wikipedia:Bots/Requests_for_approval/DOI_bot_2, we read:
Function Summary: Add missing parameters to citations from CrossRef database, and tidy citations
Function Details: ... Consensus appears to be that specifying a URL parameter is also useful; the bot can specify the URL that the DOI redirects to and in some cases make an intelligent guess as to its nature (abstract, fulltext etc) which can be recorded in the "format" parameter.
There have also been requests for the bot to correct common mistakes, such as replacing "id = PMID 123" with "pmid=123", percent-encoding parameters within dois so they link correctly, and replacing erroneously capitalised parameters (example: "Journal=Science" with "journal=Science"). Since these seemed uncontroversial I implemented these as I went, but my sense is that an official approval would placate some of Wikipedia's administrators.
In cases where there is more than one instance of a parameter, the bot will remove: If one or more are empty, the empty one; Any identical duplicates ...
Adding URLs to nonfree articles? One question: the usual style in articles I edit is that url= is reserved for articles where the entire text is freely readable, and that url= is not used for articles where just the abstract is readable (for that, you can just live with the DOI or PMID or whatever). Will the bot support this convention? That is, on such articles will it refuse to add URLs to articles that aren't entirely readable? ... I envision this being a possible bone of contention. I envision the bot providing a link where only an abstract is visible, but marking the URL as "abstract" or "subscription required" (using the "format" parameter). The rationale for this is that casual readers may not understand that a DOI or PMID provides a link to the article, and that a title link is intuitive to follow. The bot can't really tell whether editors have only chosen to provide URLs to free texts, you see.
In the majority of articles I edit (which tend to be scientific rather than medical), the convention seems to be to provide a link, whatever - but then I guess that DOIs are rarely specified. I guess the crux of the matter is whether the title being linked is a genuine help to users, which was the sense I got from discussions on my talk page - I guess each of us has our own entrenched opinion that we're unlikely to change, so it would be helpful to get some views from the wider community!
That's what we were promised by Smith609, and that's what was authorised: absolutely nothing about removing links from titles; a request to remove more than one instance of a parameter (not to remove one parameter when it points to the same place as different parameter); a clear recognition that opinions differ about what the title link may point to; and a suggestion that wider community input would be helpful. All of that has gone out of the window.
- The next part of that BRFA is really instructive:
Proposal from Wikipedia:AN
2. The bot must not remove or alter an existing URL.
The second limitation was discussed at length on Wikipedia:AN. The third, fourth, and fifth items are the only things the bot should be doing.
So we had an AN complaint brought by MCB at Wikipedia:Administrators' noticeboard/Archive143 #DOI bot blocked for policy reconsideration for "implementing a major policy change in the way Wikipedia makes web references, without large-scale community consensus and buy-in". Read it. There's nothing in there that indicates any consensus for your use of the bot to systematically strip links from citation titles, and plenty of evidence of just the opposite. The bot should respect the judgement of the editor who links the title and not impose your vision as a fait accompli. You can certainly make a case for removing links that point to copyvios, and there would be support for that, but it would require approval, because there is no approval whatsoever for the bot to remove links from citation titles. --RexxS (talk) 21:33, 25 June 2020 (UTC)
- From that same BRFA The bot replaces "url=http://dx.doi.org/#" with "doi=#". Also from the bot description in May 2008 at the time of approval. Headbomb {t · c · p · b} 22:09, 25 June 2020 (UTC)
- Emphasis mine:
The bot replaces "url=http://dx.doi.org/#" with "doi=#" - I think this was the one URL manipulation deemed okay.
Levivich [dubious – discuss] 22:20, 25 June 2020 (UTC)
- Yes, because back then, the bot was touching other non-identifier-based URLs like this. That's the context for that RFC. The DOI function has since been expanded to other identifiers. Headbomb {t · c · p · b} 22:26, 25 June 2020 (UTC)
- The DOI function has since been expanded to other identifiers without approval, which is why the bot is blocked right now. Let's just move on to the next part where the code that removes |url= is commented out, the bot is unblocked, and approval for removing |url= is sought. Levivich [dubious – discuss] 22:32, 25 June 2020 (UTC)
- It's been expanded in line with consensus. Bots do not need re-approval for the same tasks with minor changes in scope. There is nothing different about removing a PMID url to a PMID parameter, or a JSTOR url to a JSTOR parameter than from a DOI url to a DOI parameter. Headbomb {t · c · p · b} 22:41, 25 June 2020 (UTC)
- If it's been expanded in line with consensus, then it'll be a quick and easy BRFA. Levivich [dubious – discuss] 22:49, 25 June 2020 (UTC)
- There's no need for a BRFA when there already is a valid one and that the expansion is in line with consensus. Headbomb {t · c · p · b} 22:51, 25 June 2020 (UTC)
- There's a need when multiple editors are challenging whether or not the expansion is in line with consensus. The extreme hesitancy to seek explicit community approval is how I know, that you know, that the community will not approve. Anyone confident that consensus already exists would have started the discussion weeks ago. Levivich [dubious – discuss] 22:53, 25 June 2020 (UTC)
- There's one editor with an axe to grind. This does not undo 12+ years of smooth operation concerning this exact function, nor does it warrant holding the entire community hostage to the whims of that person. Headbomb {t · c · p · b} 23:07, 25 June 2020 (UTC)
- If that's true, it'll be a quick BRFA, and you'll get to say "I told you so". (But of course it's not just one editor.) Levivich [dubious – discuss] 23:11, 25 June 2020 (UTC)
- "is how I know" spoken by an arrogant mind-reading jerk AManWithNoPlan (talk) 23:18, 25 June 2020 (UTC)
- Mind WP:CIVIL. There's no need for this. Headbomb {t · c · p · b} 23:38, 25 June 2020 (UTC)
- I apologize, there was no need for Levivich to claim to read minds and no need for me to strike back. AManWithNoPlan (talk) 23:39, 25 June 2020 (UTC)
- Eh, I thought that was fair, my comment was jerk-ish, but we are simply past the point where anyone can credibly claim to hold a good faith belief that the bot is operating with clear consensus. This is not one user with an axe to grind; consensus for removing the url parameter is, at best, murky. Levivich [dubious – discuss] 06:03, 26 June 2020 (UTC)
- I think a lot of people were just surprised that the bot went from first mention of the problem on this page to being blocked in under 7 hours. People have been actively discussing this instead of just jumping straight to requesting the unblock. Plus, I personally was using the time to upgrade the bot to PHP 7.3. AManWithNoPlan (talk) 23:30, 25 June 2020 (UTC)
- Is there a list somewhere of the specific circumstances under which the bot deletes |url= from a citation template, currently? I see two such circumstances in the unblock request (and a third potential future circumstance), is that list complete? Levivich [dubious – discuss] 03:06, 25 June 2020 (UTC)
- @Levivich: I believe it currently replaces |url= with a specific identifier (e.g. |url=https://www.jstor.org/ with |jstor=...) when specific identifiers are available (this goes back to 2008 or so, and is in line with template documentation/standard usage). S2CID URLs currently remain untouched when there are free full versions, but will be removed once the CS1/CS2 templates are updated to support autolinking when |S2CID-access=free is set. AManWithNoPlan or Martin609 would know more though. Headbomb {t · c · p · b} 17:40, 25 June 2020 (UTC)
- That is correct about replacing |url= with a specific identifier. S2CID is a fairly unique case in that it often includes a full copy. So, those |url= will only be removed during the conversion IF something else will turn the title into a blue link (such as PMC and hopefully soon things like |S2CID-access=free). One other exception in the current code is the copyright-violating pages on S2, for which the |url= will be removed (but the |S2CID= will stay) in accordance with Wikipedia's "don't link to copyright violations" policy. AManWithNoPlan (talk) 18:05, 25 June 2020 (UTC)
- Where's the link and text of the approval for removing "|url= [and replacing] with specific identifier"? Where's the consensus for doing that? The clear answer is that neither of those exist. --RexxS (talk) 21:40, 25 June 2020 (UTC)
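For readers unfamiliar with the conversion being argued over, a rough sketch of replacing |url= with a specific identifier follows. It is an illustration only, not Citation bot's code: `URL_PATTERNS` and `url_to_identifier` are hypothetical names, the URL shapes are assumed examples of DOI, JSTOR and PubMed links, and the free-access and copyright exceptions discussed in this thread are deliberately left out.

```python
import re

# Assumed URL shapes; real links come in more variants than these.
URL_PATTERNS = [
    ("doi", re.compile(r"^https?://(?:dx\.)?doi\.org/(10\.\S+)$")),
    ("jstor", re.compile(r"^https?://www\.jstor\.org/stable/(\d+)$")),
    ("pmid", re.compile(r"^https?://pubmed\.ncbi\.nlm\.nih\.gov/(\d+)/?$")),
]


def url_to_identifier(params: dict[str, str]) -> dict[str, str]:
    """Return a copy of params with an identifier-style url moved to its parameter."""
    url = params.get("url", "").strip()
    for name, pattern in URL_PATTERNS:
        match = pattern.match(url)
        if match:
            converted = {k: v for k, v in params.items() if k != "url"}
            # Keep an identifier the editor already supplied.
            converted.setdefault(name, match.group(1))
            return converted
    return dict(params)  # unrecognised URLs are left untouched
```

Note that after such a conversion the title is no longer linked unless the template auto-links from the identifier, which is exactly the point in dispute above.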
@Boing! said Zebedee: "Despite several requests, there has been no evidence shown of bot approval for the removal of links from citation titles." That's patently untrue. See Wikipedia:Bots/Requests for approval/DOI bot 2 where conversion of |url= to |doi= is explicitly approved (search for The bot replaces "url=http://dx.doi.org/#" with "doi=#" at the bottom of the BRFA). This was explicitly trialled (e.g. https://en.wikipedia.org/w/index.php?title=Hubble_Space_Telescope&diff=prev&oldid=211876538). Also from the bot description in May 2008 at the time of approval. This is a function that's never been controversial since its BRFA in 2008, which also has been approved in multiple other bots, such as Wikipedia:Bots/Requests for approval/CitationCleanerBot, and which is fully in line with template documentation (e.g. Template:Cite_journal#Identifiers): use identifier parameters instead of URL parameters. Headbomb {t · c · p · b} 22:06, 25 June 2020 (UTC)
- What an unbelievable piece of selective quoting! This is what was actually written:
"The bot replaces "url=http://dx.doi.org/#" with "doi=#" - I think this was the one URL manipulation deemed okay."
- "I think this was the one url manipulation deemed okay". No other url 'manipulation' has ever been approved. We already have recent overwhelming consensus that the citation title should be linked when a free |doi= is present at Wikipedia:Village pump (proposals)/Archive 167 #Auto-linking titles in citations of works with free-to-read DOIs, so replacing url with doi is an irrelevance, a settled issue, because it won't delink the citation title. You don't have the right to unilaterally and arbitrarily extend the bot's approval from "the one url manipulation deemed okay" to unlinking the citation title when any one of a dozen or more unspecified parameters are present. Fix that first. --RexxS (talk) 23:24, 25 June 2020 (UTC)
- @Headbomb: Sorry if you disagree, but I have read all of this carefully and it's the only conclusion I can come to. The Village Pump consensus also influenced my unblock review (and I forgot to include it in my review comments, apologies - but I'm saying it here now). I do not see authorisation for what the bot is currently doing, and I see a consensus against what it is doing. I suggest the best thing to do at this point might be to make another WP:BAG request to clarify/confirm what the bot is authorised to do and what it is not - though it might be better to clarify the consensus as to how citation titles should be treated first. Boing! said Zebedee (talk) 05:25, 26 June 2020 (UTC)
- The thing that concerns me most in this whole sorry mess is that both AMWNP and Headbomb seem unable or unwilling to operate/maintain this bot within its authorisation. This is not acceptable, and how it has not ended up at AN/I is beyond me. - Nick Thorne talk 05:52, 26 June 2020 (UTC)
- It ended up there already. Levivich [dubious – discuss] 06:03, 26 June 2020 (UTC)
- 1) That is perfectly within the terms of its approval, and explicitly so. 2) I'm neither maintainer, nor operator of this bot. Headbomb {t · c · p · b} 07:04, 26 June 2020 (UTC)
- The bot did not go to AN/I. It was the blocking of users of the bot instead of the bot itself that went there. AManWithNoPlan (talk) 11:46, 26 June 2020 (UTC)
- This is one of the most important and widely used bots on the Wiki. This needs to be back up and running soon. I propose that AMWNP and Headbomb remove the disputed functionality, restore the bot to operational status, and then we can argue about its DOI and URL functions while the old version of the bot works. CaptainEek Edits Ho Cap'n!⚓ 18:58, 27 June 2020 (UTC)
- Perhaps a new WP:BRFA is needed here, though I still support unblocking an old version of the bot while the months long BRFA goes through. CaptainEek Edits Ho Cap'n!⚓ 19:09, 27 June 2020 (UTC)
- Again, I neither code, nor operate Citation bot. Headbomb {t · c · p · b} 19:24, 27 June 2020 (UTC)
- Oh gosh, I'm sorry Headbomb, I didn't read my post carefully enough. CaptainEek Edits Ho Cap'n!⚓ 19:48, 27 June 2020 (UTC)
- Why is this blocked? Yes, I see there's a dispute about some minor details of URLs. But guys, please keep scope in mind. URL links within the title are pretty rare, and while you argue, many citations are languishing as just a DOI. :( I don't care whether the controversial code is removed and the bot is unblocked or the bot is unblocked as is while a consensus is reached, I just wish you wouldn't drag every Wikipedian who wants to fill a citation into this dispute. Iamnotabunny (talk) 15:32, 1 July 2020 (UTC)
- it's because the regular editors using citation bot don't see title URLs as any big deal but the people who actually pushed the block button are like OMG THE SKY IS FALLING!!111 NEED TITLE URLS BECAUSE NO ONE KNOWS HOW TO CLICK!! — Chris Capoccia 💬 14:53, 6 July 2020 (UTC)
- I look forward to the unblocking of this tool. I hope soon.--Dthomsen8 (talk) 14:27, 10 July 2020 (UTC)
- looks like CS1 has started title linking |doi-access=free … hopefully |S2CID-access=free and all the other similar ones. are things ready to revisit this blocking and reactivate? — Chris Capoccia 💬 13:45, 12 July 2020 (UTC)
- I hope so, but Wikipedia:Bots/Noticeboard#Citation_bot was not withdrawn yet. I think incremental fixes are better but maybe the proposers still prefer a full-scale review of everything under the sun. Nemo 15:17, 12 July 2020 (UTC)
- Sorry, I have been busy with other things. I have been busy teaching a college seminar and preparing the bot for PHP 7.4 which has found a few bugs (all the ones so far have no effect on output or crash the bot) AManWithNoPlan (talk) 17:27, 12 July 2020 (UTC)
- OK well it looks like |S2CID-access=free is not making title links, so we're probably not ready to go anyway. Maybe by August :( — Chris Capoccia 💬 23:26, 12 July 2020 (UTC)
Restart?
really i was only joking up above when i suggested the bot might be out through august.... are we any closer to a restart? — Chris Capoccia 💬 20:40, 3 August 2020 (UTC)
- Since this appears to be down for a while, is there a way to remove the "Expand citations" tool (from the left side of each page) until it is available again? DougHill (talk) 18:14, 5 August 2020 (UTC)
- Please restart this useful tool, or make something that does the same thing, but better. A bunch of pointless bickering about relatively small issues has completely stalled what could have improved thousands of articles in the downtime. Shameful. --Animalparty! (talk) 01:27, 8 August 2020 (UTC)
- Anyone who wants to see a return of Citation bot needs to add their comments under Wikipedia:Village_pump_(proposals)#Issues_raised_by_Citation_bot. But right now it doesn't look too promising. — Chris Capoccia 💬 15:42, 8 August 2020 (UTC)
- @Chris Capoccia: I expect everybody wants to see Citation Bot restarted, myself included. But, as far as I'm aware, there has not been a single statement from the bot operator indicating any intention to address the many concerns raised over the bot's editing. I'm pessimistic about the chances of seeing the bot restarted if it's just going to cause the same concerns again. --RexxS (talk) 15:59, 8 August 2020 (UTC)
- I'm not seeing it. For ages the bot has deleted URLs that were duplicated by parameters. This is part of core functionality. There are some very different ideas of what the bot is supposed to be doing and I don't see the sides getting any closer. So the bot is going to stay blocked forever and not come back. — Chris Capoccia 💬 16:08, 8 August 2020 (UTC)
- Well, perhaps the community will conclude that all of the concerns that folks like myself have expressed are without value, and issues like the removal of links from citation titles are part of its remit with approval and broad consensus. Then it can be restarted as it is. However, if the community agrees that valid concerns exist, then either the operator will bring the functionality into line, or it will sadly stay blocked. I'm just disappointed that there has not been a shred of compromise on the part of the bot operator that might have met the concerns half-way and made possible a restart under mutually acceptable conditions months ago. --RexxS (talk) 16:21, 8 August 2020 (UTC)
- Turned off code that removed title links that violate wikipedia linking policy, so that a title link will stay in the case of S2 links. AManWithNoPlan (talk) 00:26, 13 August 2020 (UTC)
- RexxS, Does that address your issue for the time being? CaptainEek Edits Ho Cap'n!⚓ 05:58, 13 August 2020 (UTC)
- @CaptainEek: I'm not sure whether it will meet all of my concerns about leaving citation titles unlinked, but I guess we can't tell until we see the results of restarting. I'm certainly overjoyed that one of the bot programmers has now made an effort to address the concerns raised, and as I'm keen to see the bot restarted, I'd have no objection to seeing the bot restarted at present. I do expect that the present RfC will have significant implications for how the bot will operate in future, and I therefore expect the bot programmers to take heed of the implications of the consensuses forming there. --RexxS (talk) 19:56, 13 August 2020 (UTC)
- I would unblock, but consider myself to have gotten involved, though I would encourage any passing admin to unblock. CaptainEek Edits Ho Cap'n!⚓ 03:51, 14 August 2020 (UTC)
Citation bot (unblock request)
Request reason:
bot changed to only remove URL if a PMC link will take its place. If other auto-linking flags such as doi-access=free create a link, we might recognize that in the future too, but there would always stay a title link for the S2 links when converting to S2CID parameter.
Accept reason:
Accepting unblock request Salvio 16:51, 16 August 2020 (UTC)
{{fixed}} AManWithNoPlan (talk) 01:16, 17 August 2020 (UTC)
Merge italics/bold
- What should happen
- [7]
- We can't proceed until
- Feedback from maintainers
With care to ensure that something like '''Bold''' ''italics'' is handled properly and not converted to say '''Bold' italics''. Headbomb {t · c · p · b} 18:53, 31 July 2020 (UTC)
- {{wontfix}}, since it looks like a rat's nest of possible non-conforming inventive editors. AManWithNoPlan (talk) 13:49, 17 August 2020 (UTC)
Process pages in Category not working
- Status
- {{fixed}}
- Reported by
- Grimes2 (talk) 17:27, 16 August 2020 (UTC)
- What happens
- "Process pages in Category" not working, message: Category appears to be empty
- We can't proceed until
- Feedback from maintainers
That is really weird. I will have to look into that. AManWithNoPlan (talk) 14:17, 17 August 2020 (UTC)
- https://github.com/ms609/citation-bot/pull/3364/files Ooops. That was a stupid bug. AManWithNoPlan (talk) 14:21, 17 August 2020 (UTC)
Caps i / I
- Status
- {{fixed}}
- Reported by
- Redalert2fan (talk) 21:28, 16 August 2020 (UTC)
- What happens
- journal= Elektriceskaja I Teplovoznaja Tjaga
- What should happen
- keep as journal= Elektriceskaja i Teplovoznaja Tjaga
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=VL11&diff=prev&oldid=973372345
- We can't proceed until
- Feedback from maintainers
https://github.com/ms609/citation-bot/pull/3364 AManWithNoPlan (talk) 13:52, 17 August 2020 (UTC)
Useless capitalization?
- Status
- {{fixed}}
- Reported by
- Redalert2fan (talk) 21:36, 16 August 2020 (UTC)
- What happens
- Useless capitalization of www to WWW
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=EP10&diff=prev&oldid=973373301
- We can't proceed until
- Feedback from maintainers
- Doesn't seem helpful to me. Redalert2fan (talk) 21:36, 16 August 2020 (UTC)
- Probably best to avoid any string with www or http in them https://github.com/ms609/citation-bot/pull/3363 AManWithNoPlan (talk) 13:45, 17 August 2020 (UTC)
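For illustration only, the guard described in the reply above might look roughly like the following Python sketch. The function name, the regular expression, and the naive word-capitalization rule are all assumptions made for this example; none of it is taken from the bot's code or from the linked pull request.

```python
import re

# Hypothetical guard: any value containing "www" or "http" is treated as
# URL-like and is left exactly as the editor wrote it.
URL_LIKE = re.compile(r"www|http", re.IGNORECASE)


def capitalize_unless_url(text: str) -> str:
    """Capitalize the first letter of each word, skipping URL-like strings."""
    if URL_LIKE.search(text):
        return text  # do not touch anything that looks like a web address
    return " ".join(word[:1].upper() + word[1:] for word in text.split(" "))
```

With a check like this, a value such as www.example.com would pass through unchanged, while an ordinary lower-case title would still be capitalized.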
Lots of JSON errors
I'm getting a lot of errors of the following kind: ! Could not parse JSON for URL <urls here> Requests must have a user agent. - Redalert2fan (talk) 21:39, 16 August 2020 (UTC)
- https://github.com/ms609/citation-bot/pull/3362 This once deployed will fix that, but I think URL expansion is still down, but the error should be better at least. AManWithNoPlan (talk) 13:49, 17 August 2020 (UTC)
- This is {{fixed}}, but URL expansion is still down. AManWithNoPlan (talk) 14:14, 17 August 2020 (UTC)
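As a general illustration of the "Requests must have a user agent" message, the sketch below shows one way a client can attach a User-Agent header to an outgoing request using Python's standard library. The agent string and the function name are hypothetical, and this is not a description of how the bot itself was fixed.

```python
import urllib.request

# Hypothetical identifier; a real tool would use its own name and contact details.
DEFAULT_USER_AGENT = "ExampleCitationTool/0.1 (https://example.org/contact)"


def build_request(url: str, user_agent: str = DEFAULT_USER_AGENT) -> urllib.request.Request:
    """Build a request object that carries an explicit User-Agent header."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})
```

The returned object can then be passed to urllib.request.urlopen, so the remote service sees an identifying agent string rather than an empty or default one.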
Google Books API error
Multiple errors of:
! Google Books API reported error: Array
(
[0] => stdClass Object
(
[message] => The provided API key has an IP address restriction. The originating IP address of the call (IP adres here) violates this restriction.
[domain] => global
[reason] => forbidden
)
)
are showing up. The section "IP adres here" shows an actual IP address which I have removed for this post. -Redalert2fan (talk) 21:43, 16 August 2020 (UTC)
- Note to self {{fixed}} at https://console.developers.google.com/apis/credentials?project=wikipediacitationbot AManWithNoPlan (talk) 01:17, 17 August 2020 (UTC)
"|chapter= ignored" error caused in cite web
- Status
- {{fixed}}
- Reported by
- Grimes2 (talk) 19:15, 16 August 2020 (UTC)
- What happens
- |chapter= ignored error caused in {{cite web}} and most others that are not {{cite book}}
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=Julian_Bream&diff=973350505&oldid=973331691
- We can't proceed until
- Feedback from maintainers
https://github.com/ms609/citation-bot/pull/3365 AManWithNoPlan (talk) 18:15, 17 August 2020 (UTC)
bot adds author names that are not author names
- Status
- {{fixed}}
- Reported by
- Trappist the monk (talk) 12:43, 19 August 2020 (UTC)
- What happens
|last1=7 |first1=Völlig neu Bearbeitete und Erweiterte Auflage
- Relevant diffs/links
- diff
- We can't proceed until
- Feedback from maintainers
when bot changes |work= alias to |encyclopedia=
- Status
- {{fixed}}
- Reported by
- Trappist the monk (talk) 22:56, 19 August 2020 (UTC)
- What happens
- |encyclopedia= (and aliases) are constrained to {{cite encyclopedia}}, {{cite dictionary}}, and {{citation}} (discussion at wt:cs1)
- Relevant diffs/links
- diff
- We can't proceed until
- Feedback from maintainers
https://github.com/ms609/citation-bot/pull/3377
Chapter ignored error
- Status
- {{fixed}}
- Reported by
- Lithopsian (talk) 20:41, 19 August 2020 (UTC)
- What happens
- CS1 errors: chapter ignored category applied incorrectly
- What should happen
- Category should not be applied to citation template when work field (or aliases) is not included
- Relevant diffs/links
- Rho Persei
- We can't proceed until
- Feedback from maintainers
- FYI: this appears to relate to this diff from last March in which Citation bot saw a {{cite book}} template with correctly-formatted chapter=, title=, and edition= parameters but also with incorrect content in the chapter= parameter (not actually a chapter) and with an incorrect journal= parameter and decided to change the template to {{cite journal}}, without changing the parameters, leaving a broken citation. I hope this is long fixed by now. —David Eppstein (talk) 23:24, 19 August 2020 (UTC)
Down?
The Citations button seems to hang. Expand citations too. I tried a few different things, nothing... Abductive (reasoning) 23:35, 19 August 2020 (UTC)
- under very heavy load AManWithNoPlan (talk) 23:48, 19 August 2020 (UTC)
OAuth tasks done
- All URLs must be updated in GitHub (done)
- Gadget and sidebar button code updated on Wikipedia (done)
- Dev code, if anyone has it (done - that's their problem, and that bot is down anyway)
- other people with their own scripts (done - that's their problem, and the ones I know about told)
- DNS moved (done)
- Update Bot wiki pages (done)
- Update https://en.wikipedia.org/wiki/Template:Automated_tools (done)
AManWithNoPlan (talk) 16:39, 10 June 2020 (UTC)
- Do we already have a permissive CORS rule as suggested in https://wikitech.wikimedia.org/wiki/News/Toolforge.org#Cross-Origin_Resource_Sharing_(CORS)_requests_broken ? I'm currently getting errors on that front. Nemo 11:28, 24 June 2020 (UTC)
- I now know that is not relevant here. But, thanks for the link, that was a good idea to check out. AManWithNoPlan (talk) 18:06, 25 June 2020 (UTC)
{{fixed}} flag to archive. AManWithNoPlan (talk) 02:03, 21 August 2020 (UTC)
we skipped this step : Once action taken or determined as not required, mark off as 'done' at Here
s2cid towards end with rest of identifiers
It's great that Citation bot is adding all these S2CID entries, but is there some reason why they are being added between authors and title instead of towards the end with the rest of the identifiers? — Chris Capoccia 💬 20:13, 21 August 2020 (UTC)
- I see. The "2" in the name confuses it. I will fix that. AManWithNoPlan (talk) 21:23, 21 August 2020 (UTC)
- {{fixed}}
Convert wrong citeseerx/doi
- Status
- {{wontfix}} way too rare to do.
- Reported by
- Headbomb {t · c · p · b} 17:25, 21 August 2020 (UTC)
- What should happen
- [8]
- We can't proceed until
- Feedback from maintainers
Basically if you have 10.1.1... in a |doi= it should be converted to a |citeseerx=, and if you have a valid DOI in a |citeseerx=, then that too should be converted to a |doi=. Headbomb {t · c · p · b} 17:25, 21 August 2020 (UTC)
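The requested rule was marked wontfix, but as a rough sketch of what it describes, the Python below swaps the two parameters when their values look misplaced. The regular expressions are simplified assumptions (a CiteSeerX identifier is taken to be dotted digits starting with 10.1.1, a DOI to be "10." followed by at least four digits and a slash), and the function name is invented for this example.

```python
import re

# Simplified, assumed patterns; real identifier validation is stricter.
CITESEERX_ID = re.compile(r"^10\.1\.1\.\d+(\.\d+)*$")
DOI = re.compile(r"^10\.\d{4,9}/\S+$")


def fix_identifier_params(params: dict) -> dict:
    """Move a misplaced CiteSeerX id out of |doi=, and a DOI out of |citeseerx=."""
    fixed = dict(params)  # work on a copy so the caller's dict is untouched
    if CITESEERX_ID.match(fixed.get("doi", "")) and "citeseerx" not in fixed:
        fixed["citeseerx"] = fixed.pop("doi")
    if DOI.match(fixed.get("citeseerx", "")) and "doi" not in fixed:
        fixed["doi"] = fixed.pop("citeseerx")
    return fixed
```

The sketch deliberately does nothing when both parameters are already present, since deciding which value to keep would need a human.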
Processing of JSTOR citations
- Status
- {{notabug}}
- Reported by
- 凰兰时罗 (talk) 18:46, 22 August 2020 (UTC)
- What happens
- (1) Removal of the level of access from JSTOR links is wrong: different JSTOR materials have different levels of access. (2) Adding "issue=110" is just wrong.
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=John_Leary_(politician)&curid=56263000&diff=974379039&oldid=963559061
- We can't proceed until
- Feedback from maintainers
The issue=110 is actually correct, it is the original volume=110 that was wrong. Also, "different JSTOR materials have different levels of access": this is very rare, which is why |jstor-access=free is the only option allowed for |jstor-access=. By definition |jstor-access=closed-off is assumed, until proven otherwise. AManWithNoPlan (talk) 19:04, 22 August 2020 (UTC)
- Yes, my bad – you're actually correct on both points :). Thanks! 凰兰时罗 (talk) 23:45, 22 August 2020 (UTC)
The Nation
- Status
- {{fixed}}
- Reported by
- Kaltenmeyer (talk) 17:59, 23 August 2020 (UTC)
- What happens
- what is added
|journal=The Nation : A Weekly Journal Devoted to Politics, Literature, Science, Drama, Music, Art, and Finance
- What should happen
- I believe the usual name of the journal should be added; add
|journal=The Nation
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=News_media_endorsements_in_the_2020_United_States_presidential_primaries&type=revision&diff=974337281&oldid=953468038
- We can't proceed until
- Feedback from maintainers
I have added some code for that ISSN specifically. Should be deployed after all tests are passed. https://github.com/ms609/citation-bot/pull/3404 AManWithNoPlan (talk) 18:40, 23 August 2020 (UTC)
Down again
The Citations button is hanging again. Expand citations too. Abductive (reasoning) 07:53, 21 August 2020 (UTC)
- When it gets slow, it can start to result in people trying again and again, which is like putting a fire out with gasoline. AManWithNoPlan (talk) 11:25, 21 August 2020 (UTC)
- seems like it's inactive right now for over an hour https://en.wikipedia.org/wiki/Special:Contributions/Citation_bot AManWithNoPlan (talk) 11:25, 21 August 2020 (UTC)
- Got an operator to reboot it AManWithNoPlan (talk) 12:05, 21 August 2020 (UTC)
- {{fixed}} for now. AManWithNoPlan (talk) 13:59, 24 August 2020 (UTC)
Question about interwiki links
At WP:ANI#Citation bot someone said "The bot probably doesn't recognize the interwiki prefix ..." – is that so? Same question for interlanguage links. See Wikipedia:Namespace#Interwiki and interlanguage links. If the bot doesn't understand, it shouldn't mess with it, right? At least a lame excuse: it's the bot's task to understand, and not to remove legit links because it doesn't understand, thus filing a bug report:
- Status
- {{fixed}}
- Reported by
- Francis Schonken (talk) 06:01, 24 August 2020 (UTC)
- What happens
- bot de-links an interwiki link, i.e. it changed the interwiki-linked |title=Pianoforte zu vier Händen to the unlinked |title=Pianoforte zu vier Händen (for clarity: without providing a replacement link)
- What should happen
- leave legit link alone: removing the link is in no universe helpful to the reader, and certainly not when such reader would want to verify Wikipedia's content which is referenced to this.
- Relevant diffs/links
- [9]
- We can't proceed until
- Feedback from maintainers
https://github.com/ms609/citation-bot/pull/3411 AManWithNoPlan (talk) 13:55, 24 August 2020 (UTC)
when creating |author-link= from author name parameters ...
- Status
- {{fixed}}
- Reported by
- Trappist the monk (talk) 13:25, 22 August 2020 (UTC)
- What happens
- bot apparently skips
|first=
- Relevant diffs/links
- diff
- We can't proceed until
- Feedback from maintainers
bot changes this:
{{cite web|last1=[[Matt Welch|Welch]]|first1=[[Matt Welch|Matt]]|date=March 4, 2020|url=https://reason.com/2020/03/04/libertarian-super-tuesday-big-night-for-jacob-hornberger-nota-john-mcafee-drops-out-and-backs-vermin-supreme/|title=Libertarian Super Tuesday: Big Night for Jacob Hornberger, NOTA; John McAfee Drops Out and Backs Vermin Supreme|website=Reason|accessdate=March 4, 2020}}
- Welch, Matt (March 4, 2020). "Libertarian Super Tuesday: Big Night for Jacob Hornberger, NOTA; John McAfee Drops Out and Backs Vermin Supreme". Reason. Retrieved March 4, 2020. {{cite web}}: Check |first1= value (help)
to this:
{{cite web|last1=Welch|first1=[[Matt Welch|Matt]]|date=March 4, 2020|url=https://reason.com/2020/03/04/libertarian-super-tuesday-big-night-for-jacob-hornberger-nota-john-mcafee-drops-out-and-backs-vermin-supreme/|title=Libertarian Super Tuesday: Big Night for Jacob Hornberger, NOTA; John McAfee Drops Out and Backs Vermin Supreme|website=Reason|accessdate=March 4, 2020|author1-link=Matt Welch}}
- Welch, Matt (March 4, 2020). "Libertarian Super Tuesday: Big Night for Jacob Hornberger, NOTA; John McAfee Drops Out and Backs Vermin Supreme". Reason. Retrieved March 4, 2020. {{cite web}}: Check |first1= value (help)
should also strip wikilink from |firstn=.
Same should apply to other name parameters (and their aliases): |contributor-firstn=, |editor-firstn=, |interviewer-firstn=, |translator-firstn= —Trappist the monk (talk) 13:27, 22 August 2020 (UTC)
- this should be an improvement. https://github.com/ms609/citation-bot/pull/3412 AManWithNoPlan (talk) 14:38, 24 August 2020 (UTC)
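To make the requested behaviour concrete, here is a small Python sketch that strips wikilinks from both |lastn= and |firstn= and records the link target in |authorn-link=. It only illustrates the report above: the helper names and the regular expression are assumptions, and it does not reproduce the code in the linked pull request.

```python
import re

# Matches [[Target]] or [[Target|Label]]; nested or malformed links are not handled.
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(?:\|([^\[\]]*))?\]\]")


def unlink(value: str):
    """Return (text with wikilinks replaced by their labels, first link target or None)."""
    match = WIKILINK.search(value)
    if match is None:
        return value, None
    plain = WIKILINK.sub(lambda m: m.group(2) or m.group(1), value)
    return plain, match.group(1)


def delink_author(params: dict, n: int) -> dict:
    """Strip wikilinks from |lastn= and |firstn=, keeping the target as |authorn-link=."""
    fixed = dict(params)
    target = None
    for key in (f"last{n}", f"first{n}"):
        if key in fixed:
            fixed[key], found = unlink(fixed[key])
            target = target or found
    if target and f"author{n}-link" not in fixed:
        fixed[f"author{n}-link"] = target
    return fixed
```

Applied to the example in this report, both name parameters end up as plain text and the single |author1-link=Matt Welch carries the link. The same loop could be pointed at the contributor, editor, interviewer, and translator parameters.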
Unauthorised date format change
- Status
- {{fixed}}
- Reported by
- Francis Schonken (talk) 06:50, 24 August 2020 (UTC)
- What happens
- bot changes |...date=28 March 2020 to |...date=2020-08-24, despite the fact that the article has a {{Use dmy dates}} tag in its header section
- What should happen
- conversion to dmy dates is OK, not the other way around
- Relevant diffs/links
- [10]
- We can't proceed until
- Feedback from maintainers
Interestingly enough, the {{Use dmy dates}} template controls the display of the date to humans. That is really cool. This will enforce the data style in the meta-data https://github.com/ms609/citation-bot/pull/3410 AManWithNoPlan (talk) 13:35, 24 August 2020 (UTC)
Inventing first name
- Status
- {{fixed}}
- Reported by
- Francis Schonken (talk) 08:08, 24 August 2020 (UTC)
- What happens
- The bot "invents" a first name, in this case "D. K." – as it happens the "D." in that name refers to a last name (Dorling) as does the "K." (Kindersley)
- What should happen
- The bot should not try to invent authors: the series is called DK Eyewitness: even if, agreed, google books, erroneously, marks "DK Eyewitness" as the book's author, converting that to |author=DK Eyewitness would be bad enough, let alone trying to extract a "first name" from that. Applicable policy: WP:OR – original research by human editors is bad enough, it being programmed into a bot should absolutely be avoided.
- Relevant diffs/links
- [11] (under "Line 160:")
- We can't proceed until
- Feedback from maintainers
I think this will help https://github.com/ms609/citation-bot/pull/3409 AManWithNoPlan (talk) 12:55, 24 August 2020 (UTC)
3RR
Your recent editing history at Red Fort shows that you are currently engaged in an edit war; that means that you are repeatedly changing content back to how you think it should be, when you have seen that other editors disagree. To resolve the content dispute, please do not revert or change the edits of others when you are reverted. Instead of reverting, please use the talk page to work toward making a version that represents consensus among editors. The best practice at this stage is to discuss, not edit-war. See the bold, revert, discuss cycle for how this is done. If discussions reach an impasse, you can then post a request for help at a relevant noticeboard or seek dispute resolution. In some cases, you may wish to request temporary page protection.
Being involved in an edit war can result in you being blocked from editing—especially if you violate the three-revert rule, which states that an editor must not perform more than three reverts on a single page within a 24-hour period. Undoing another editor's work—whether in whole or in part, whether involving the same or different material each time—counts as a revert. Also keep in mind that while violating the three-revert rule often leads to a block, you can still be blocked for edit warring—even if you do not violate the three-revert rule—should your behavior indicate that you intend to continue reverting repeatedly.
@AManWithNoPlan: after four reports above (#Unauthorised date format change, #Non-repair repair, #Equal treatment and #Inventing first name) about the same Citation bot edit to the Red Fort article, an edit that was, according to its edit summary, "Suggested by AManWithNoPlan", after which I reverted that edit, it seems hardly a good idea for you to suggest to the bot to edit-war over it, before even engaging in the bug reports above. --Francis Schonken (talk) 11:31, 24 August 2020 (UTC)
- My internet was flaky and my web browser must have tried to reconnect to the category page multiple times, each time launching the DOI fixing run. That's super annoying for several reasons: one, it creates the appearance of an edit war. Two, it means the bot wasted resources processing pages multiple times. AManWithNoPlan (talk) 12:18, 24 August 2020 (UTC)
- A self-revert on this revert would be welcome then. @AManWithNoPlan: could you revert, if instructing the bot to do a self-revert would not be possible? --Francis Schonken (talk) 12:31, 24 August 2020 (UTC)
- {{fixed}} with a revert and made another fix while I was there. AManWithNoPlan (talk) 13:39, 24 August 2020 (UTC)
- Thanks. --Francis Schonken (talk) 13:41, 24 August 2020 (UTC)
Non-repair repair
- Status
- {{not a bug}}
- Reported by
- Francis Schonken (talk) 07:41, 24 August 2020 (UTC)
- What happens
- bot changed |website=britannica.co, to |website=britannica.co – neither url is valid, nor the original one, nor the "repaired" one. Both |website=britannica.com and |website=britannica.co.uk work (obviously the first was intended, while that's the one appearing in the |url=https://www.britannica.com/... parameter). |website=britannica.co, on the other hand, doesn't work.
- What should happen
- avoid non-repairs that give the impression that something was repaired.
- Relevant diffs/links
- [12]
- We can't proceed until
- Feedback from maintainers
Removing the trailing comma is a fix. A very small one, but a fix. |website= is not supposed to be a URL, but human text: it is simply the name of the website, not a full URL. AManWithNoPlan (talk) 13:38, 24 August 2020 (UTC)
- In the case you didn't understand "Non-repair repair": the edit was a "non-fixing fix". For human readers "britannica.co" makes no sense either (without it being really clear that something is wrong), while the "britannica.co," form is at least clearer that something is wrong and needs fixing. The discussion whether it was a "repair" or a "fix" is irrelevant: that part of the edit was unhelpful on all levels. --Francis Schonken (talk) 13:52, 24 August 2020 (UTC)
- We do convert website to URL if it has the http in it. Otherwise, we usually assume that the website entered by a human is correct. The problem is that a website might be called IamCool.co, but redirect to IamCool.cheaphosting.com. So, the website might be correctly "IamCool.co" and the URL be IamCool.cheaphosting.com at the same time. It is hard to fix such things automatically. We use blacklists and whitelists and otherwise leave those to the humans to fix. AManWithNoPlan (talk) 14:54, 24 August 2020 (UTC)
- Re. "It is hard to fix such things automatically" – that is the crux of the matter: if the bot can't fix it in a reasonable manner, then the bot shouldn't touch it, and leave it to human editors, and not implement a pseudo-fix, which is many times less helpful than not touching it.
- Other than that, you gave a perfect explanation of what I said above: "For human readers "britannica.co" makes no sense either (without it being really clear that something is wrong), while the "britannica.co," form is at least clearer that something is wrong and needs fixing." --Francis Schonken (talk) 15:45, 24 August 2020 (UTC)
- And the bot fixed what it could: the stray comma. That additional fixes needed to be done is inconsequential. Headbomb {t · c · p · b} 15:55, 24 August 2020 (UTC)
- Nah, the bot shouldn't have touched it. In no universe should removing the stray comma, without anything else, be considered a helpful fix. --Francis Schonken (talk) 15:56, 24 August 2020 (UTC)
- The bot fixes the error "CS1 maint: extra punctuation" and this is one of two errors. The other error is not fixable by a bot. Grimes2 (talk) 16:04, 24 August 2020 (UTC)
- This one wasn't fixable either (as evidenced above), so the bot should not engage in it: it is a bug while the bot tries to fix something it can't fix, and makes the situation worse. --Francis Schonken (talk) 16:22, 24 August 2020 (UTC)
- The stray punctuation was fixed. That the bot doesn't fix everything is known, and 'fixing everything' is impossible to do by bot. This is also known. The bot fixes what it can. Headbomb {t · c · p · b} 16:26, 24 August 2020 (UTC)
- Again, it was an unhelpful fix: seems best to remove the feature from the bot's code. --Francis Schonken (talk) 16:28, 24 August 2020 (UTC)
In short, these sort of fixes are not suitable for a bot running in automatic mode: assisted, or with a human checking before a proposed fix is saved would be far more effective for such fixes that need some interpretation that can't be delivered by bot. --Francis Schonken (talk) 16:33, 24 August 2020 (UTC)
- The bot's failure to only do some things is not a bug. It did not do anything wrong, it just failed to do something right. A typo that existed in three pages on all of wikipedia is hardly a shortcoming of the bot. AManWithNoPlan (talk) 20:47, 24 August 2020 (UTC)
This discussion is still open. --Francis Schonken (talk) 02:52, 25 August 2020 (UTC)
On the ground of the matter, the fix applied by the bot goes against WP:BOTPOL, see WP:CONTEXTBOT: "Examples of context-sensitive changes include ... punctuation mistakes" – which, per the policy, should not be performed by unsupervised bots. That's why this is a bug that needs to be fixed. --Francis Schonken (talk) 03:19, 25 August 2020 (UTC)
- I can't tell whether you're serious. That sentence is about natural language, where punctuation is rarely black and white. Removing one stray character from an URL is not a "punctuation fix" in that sense. Nemo 06:15, 25 August 2020 (UTC)
- See above, AManWithNoPlan's first reply after the bug report box: "|website= is not supposed to be a URL, but human text: it is simply the name of the website, not a full URL." (my emphasis), so, indeed this falls under the WP:CONTEXTBOT policy.
- Your "I can't tell whether you're serious" comment is quite unhelpful at this stage. Care to retract it? --Francis Schonken (talk) 07:00, 25 August 2020 (UTC)
- I find that the "I can't tell whether you're serious" makes it clear that you come across as a troll to some editors. I think it was a very kind way to express that sentiment. AManWithNoPlan (talk) 13:02, 25 August 2020 (UTC)
- This is getting into WP:TE territory. Removing punctuation in general is a context sensitive change, yes. Here the context is clear. This is the removal of stray punctuation in a template parameter which should not have stray punctuation. There is no WP:CONTEXTBOT violation here. The bot is not changing Firstly, we should attempt... to Firstly we should attempt. This is no different than AWB enforcing WP:REFPUNCT. Headbomb {t · c · p · b} 20:40, 25 August 2020 (UTC)
removing wikilink from title
- Status
- {{fixed}}
- Reported by
- Francis Schonken (talk) 02:56, 25 August 2020 (UTC)
- What happens
- bot changes |title=Art through the Ages to |title=Art through the Ages
- What should happen
- title should not be de-linked
- Relevant diffs/links
- [13]
- We can't proceed until
- Feedback from maintainers
Equal treatment
- Status
- {{notabug}}
- Reported by
- Francis Schonken (talk) 07:49, 24 August 2020 (UTC)
- What happens
- On the same page, in the same edit, the bot removes |website=books.google.ca from one {{cite book}} template, while it is left alone in another.
- What should happen
- should be handled similarly in both instances
- Relevant diffs/links
- [14] (the "removed" one is under the "Line 41:" part of the diff, the "unmodified" one under "Line 160:")
- We can't proceed until
- Feedback from maintainers
google.books.ca is a spam site, not books.google.ca. I just removed all references to google.books.ca from wikipedia. Interesting typo. AManWithNoPlan (talk) 14:02, 24 August 2020 (UTC)
- Removed the "not a bug" assessment: if it is a rogue website it shouldn't be left alone in some cases, while it is removed in other cases. Either we can depend on the bot to remove it when it has gone through an article, or it leaves it to human assessment: randomly removing it and not removing it in a same update by the bot is untrustworthy behaviour of the bot, and should be addressed. --Francis Schonken (talk) 14:36, 24 August 2020 (UTC)
- We have a small list of websites that are removed. Google Books is one of them, since that is simply incorrect. The source of the information is not Google, but a book. Google is just a library and they have no say in the material. We assume that the information in |website= is good, unless it is on the blacklist. AManWithNoPlan (talk) 14:46, 24 August 2020 (UTC)
- Again, the problem is that the bot went through the article, removing the |website=books.google.ca in one instance, and leaving it untouched in another {{cite book}} template (in which it did other changes, but not the removal of that website parameter): that is undependable random behaviour which should be repaired. --Francis Schonken (talk) 15:52, 24 August 2020 (UTC)
- Failure to do everything useful is not a bug. If that were the case, then every edit on Wikipedia would be wrong. AManWithNoPlan (talk) 16:10, 24 August 2020 (UTC)
- Then it seems better to remove the undependable feature from the bot, and admit that the bot can't fix everything. --Francis Schonken (talk) 16:26, 24 August 2020 (UTC)
See also my suggestion about the "automatic" mode being the real problem for such fixes that need some human interpretation, in the #Non-repair repair section above. --Francis Schonken (talk) 16:36, 24 August 2020 (UTC)
- Again, comma removal and clutter removal is dependable. That it doesn't fix everything you want it to fix is a case of WP:SOFIXIT. If you have specific suggestions that can be dependable, do make them though. The above is not one, for the reasons mentioned by AManWithNoPlan. Headbomb {t · c · p · b} 17:40, 24 August 2020 (UTC)
This discussion is still open. --Francis Schonken (talk) 02:52, 25 August 2020 (UTC)
The undependable feature should probably best be removed from the bot's code. --Francis Schonken (talk) 03:24, 25 August 2020 (UTC)
Equal treatment RfC
Is it acceptable behaviour in an unsupervised process (automatic bot) to randomly remove a parameter from one cite template, and keep the same parameter, with the same content, in an identical cite template on the same page, without the bot's maintainers being able to explain why the bot behaves thus? 03:42, 25 August 2020 (UTC)
- No, unacceptable behaviour for the bot: the random feature should be removed from the bot's code. The bot's maintainers should at least be able to explain why the bot behaves thus. --Francis Schonken (talk) 03:42, 25 August 2020 (UTC)
- This supposed "RfC" does not comply with RfC guidelines, due to a ridiculously partisan introductory text, and should be ignored. Nemo 06:17, 25 August 2020 (UTC)
- If I understand the complaint, Editor Francis Schonken is arguing that |website=books.google.ca (line 41) is exactly the same as |website=google.books.ca (line 160) (diff). Superficially, to a human, perhaps they are the same; to a computer they are not – in the former, books is a second-level subdomain of google.ca; in the latter, google is a second-level subdomain of books.ca. No doubt |website=google.books.ca should be added to the bot's code so that the bot can remove it. That does not require an rfc. —Trappist the monk (talk) 10:07, 25 August 2020 (UTC)
- Re. "That does not require an rfc" – apparently it did: the first assistant maintainer of the bot saw no way to address the issue. Hopefully now they can. --Francis Schonken (talk) 10:25, 25 August 2020 (UTC)
- I cannot address non-existent stupid issues. AManWithNoPlan (talk) 13:06, 25 August 2020 (UTC)
- This rfc introduction seems a little biased, but AManWithNoPlan's explanation here and in the non-repair seemed fine to me. books.google.ca and google.books.ca are clearly not the same text, even though they might be similar. It seems a bit harsh to immediately request removal of features and it does not look like a bug. Not doing something is not a bug if the intention wasn't to fix it. Now a bot does not have intentions, but it didn't "fix" it because the site was not included. This can easily be rectified if wanted, but would be an addition, not a bug fix. Also if it did in fact make at least a good edit, but missed something, why remove the feature? On the one side it is suggested to remove a feature while on the other the bot should be able to fix everything? It seems like the bot maintainer(s) did in fact explain how the bot works; whether someone thinks it was good enough or not does not seem like a reason for an RFC. Redalert2fan (talk) 20:22, 25 August 2020 (UTC)
- I've removed the RFC templates as a horribly, hopelessly biased leading question without any sort of example, based on a flawed premise (the behaviour is neither random, nor are the parameters the same). The behavior has been explained multiple times now. |website=books.google.ca is not the same as |website=google.books.ca. You might argue that they should be treated as equivalent (they are not), but you don't need an RFC for this. Headbomb {t · c · p · b} 20:32, 25 August 2020 (UTC)
- Agree, not a bug. Presumably google.books.ca was a typo and books.google.ca was intended, but we can't expect bots to automatically realize that incorrectly entered "website" parameters should match its patterns for bad and removable "website" parameters. —David Eppstein (talk) 21:15, 25 August 2020 (UTC)
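The behaviour described in this thread (a small list of |website= values that get removed, matched as exact text) can be illustrated with a minimal Python sketch. This is not the bot's actual code; the list contents and the function name `should_remove_website` are hypothetical and only show why a transposed hostname is not caught.

```python
# Hypothetical blacklist; the bot's real list is not reproduced here.
BLOCKED_WEBSITES = {"books.google.com", "books.google.ca"}


def should_remove_website(value):
    """Return True only when the |website= value exactly matches a listed host."""
    return value.strip().lower() in BLOCKED_WEBSITES


# Exact matching treats the two strings as unrelated hosts.
print(should_remove_website("books.google.ca"))   # listed value
print(should_remove_website("google.books.ca"))   # transposed labels, not listed
```

Under this assumption, handling google.books.ca would mean adding a new entry to the list rather than repairing existing logic.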
Unauthorised date format change (2)
- Status
- {{fixed}} - legacy redirects added
- Reported by
- Francis Schonken (talk) 06:50, 24 August 2020 (UTC), modified/corrected by Matthiaspaul
- What happens
- bot changes |...date=28 March 2020 to |...date=2020-08-24, despite the fact that the article has a {{Use dmy dates}} tag in its header section
- What should happen
- conversion to the date format specified by the Use dmy/mdy dates template's |cs1-dates= parameter (if present), or (only if the |cs1-dates= parameter is not present) to the format according to the Use dmy/mdy dates template's name is OK, otherwise the format must not be changed
- Relevant diffs/links
- [15]
- We can't proceed until
- Feedback from maintainers
Interestingly enough, the {{Use dmy dates}} template controls the display of the date to humans. That is really cool. This will enforce the data style in the meta-data https://github.com/ms609/citation-bot/pull/3410 AManWithNoPlan (talk) 13:35, 24 August 2020 (UTC)
- I couldn't find this in the code (but only had a cursory look), therefore:
- Does it adhere to the setting of the optional |cs1-dates= parameter of the {{Use dmy/mdy dates}} template(s) as well (see Template:Use_dmy_dates#Auto-formatting_citation_template_dates)? This setting, if present, takes precedence over the setting derived from the template's name. If the code does not deal with this, the date format should not be changed at all.
- This is particularly important in conjunction with the |cs1-dates=y setting because something like {{Use dmy/mdy dates|date=August 2020|cs1-dates=y}} means that the dates in the citation should be in ymd format, not dmy/mdy format.
- Also, does it check for the various aliases of the {{Use dmy/mdy dates}} templates as well? If it doesn't, it would miss the presence of the template if it's redirected.
- FYI, these are the patterns searched for by CS1/CS2 citation templates:
- '{{ *[Uu]se dmy dates *[|}]'
- '{{ *[Uu]se mdy dates *[|}]'
- '{{ *[Uu]se DMY dates *[|}]'
- '{{ *[Uu]se MDY dates *[|}]'
- '{{ *[Uu]se *dmy *[|}]'
- '{{ *[Uu]se *mdy *[|}]'
- '{{ *[Uu]se MDY *[|}]'
- '{{ *[Uu]se DMY *[|}]'
- '{{ *[Dd]my *[|}]'
- '{{ *[Mm]dy *[|}]'
- '{{ *[Dd]MY *[|}]'
- '{{ *[Mm]DY *[|}]'
- --Matthiaspaul (talk) 04:15, 25 August 2020 (UTC)
- Our checking is case-insensitive. I will add the shorter MDY type ones. AManWithNoPlan (talk) 13:12, 25 August 2020 (UTC)
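The precedence rule discussed above (an explicit |cs1-dates= value wins over the format implied by the template's name) could be sketched in Python roughly as follows. This is an illustration only, not the bot's PHP code or the CS1 module's Lua patterns; the regular expressions are a case-insensitive approximation that is slightly more permissive than the list quoted in the thread, and `date_format_hint` is a hypothetical name.

```python
import re

# Approximates the template names quoted above: "Use dmy dates", "Use dmy",
# "dmy" and the mdy equivalents, matched case-insensitively.
TEMPLATE_RE = re.compile(
    r"\{\{\s*(?:use\s*)?(dmy|mdy)(?:\s+dates)?\s*(\|[^{}]*)?\}\}",
    re.IGNORECASE,
)
CS1_DATES_RE = re.compile(r"\|\s*cs1-dates\s*=\s*([^|}\s]+)", re.IGNORECASE)


def date_format_hint(wikitext):
    """Return (format_from_name, cs1_dates_value_or_None), or None if no template."""
    match = TEMPLATE_RE.search(wikitext)
    if match is None:
        return None
    params = match.group(2) or ""
    cs1 = CS1_DATES_RE.search(params)
    # The raw cs1-dates value is returned uninterpreted; a caller that does
    # not understand it should leave the citation dates untouched.
    return match.group(1).lower(), (cs1.group(1) if cs1 else None)


print(date_format_hint("{{Use dmy dates|date=August 2020|cs1-dates=y}}"))
print(date_format_hint("{{MDY}}"))
```

Returning the raw |cs1-dates= value keeps the sketch conservative: when the value is present but not understood, the safe choice described in the thread is to make no date change at all.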
Regular expression failure
- Status
- {{fixed}}
- Reported by
- Whywhenwhohow (talk) 03:41, 28 August 2020 (UTC)
- What happens
- Regular expression failure
- Relevant diffs/links
- https://citations.toolforge.org/process_page.php?slow=on&edit=webform&page=Ranitidine&cat=
- We can't proceed until
- Feedback from maintainers
Fixed on the page: https://en.wikipedia.org/w/index.php?title=Ranitidine&type=revision&diff=975428346&oldid=975367091 Also, added some debug output to the bot so that you can find these yourself. AManWithNoPlan (talk) 13:13, 28 August 2020 (UTC)
ResearchGate is not a publisher
- Status
- {{fixed}}
- Reported by
- Nemo 13:58, 28 August 2020 (UTC)
- What happens
- Nothing
- What should happen
- special:diff/975434945
- We can't proceed until
- Feedback from maintainers
Researchgate.net and ResearchGat and wikilinked and of course case-insensitive. AManWithNoPlan (talk) 15:20, 28 August 2020 (UTC)
- https://github.com/ms609/citation-bot/pull/3433 soon. AManWithNoPlan (talk) 15:25, 28 August 2020 (UTC)
Adds journal=Report to cite book template
- Status
- {{fixed}}
- Reported by
- Whywhenwhohow (talk) 18:22, 29 August 2020 (UTC)
- What happens
- The bot adds journal=Report to a cite book
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=Ledipasvir/sofosbuvir&diff=975654426&oldid=975654215
- We can't proceed until
- Feedback from maintainers
> Checking AdsAbs database no record retrieved. + Adding journal: Report
> Remedial work to clean up templates
! Citation should probably not have journal = Report as well as chapter / ISBN 9789241209946
- https://github.com/ms609/citation-bot/pull/3439 should be live soon. AManWithNoPlan (talk) 19:17, 29 August 2020 (UTC)
Fix biorxiv parameter
- What should happen
- [16]
- We can't proceed until
- Feedback from maintainers
https://github.com/ms609/citation-bot/pull/3466 AManWithNoPlan (talk) 23:29, 31 August 2020 (UTC)
Archive
- Status
- {{notabug}}
- Reported by
- Emir of Wikipedia (talk) 22:01, 31 August 2020 (UTC) (please mention me on reply; thanks!)
- What happens
- Removes archive url and archive date parameters.
- Relevant diffs/links
- Special:diff/976046997
- We can't proceed until
- Feedback from maintainers
@Emir of Wikipedia: Archives are supposed to be copies of the original URL at the wayback machine or someplace similar. Not the original URL. AManWithNoPlan (talk) 22:34, 31 August 2020 (UTC)
What is the point of pmid, bibcode?
The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.
What is the point of adding entries pmid and bibcode to references? The DOI already provides all information needed to locate the entry. --Jorge Stolfi (talk) 15:01, 1 September 2020 (UTC)
- Not all papers have DOIs. Also bibcode links often contain free full versions and additional bibliographic information (like how often something is cited, related publications), and PMIDs are most useful to health professionals and will have supplementary bibliographic information (much like bibcodes do). DOIs also sometimes go bad, and it's good to have backup links to confirm what is being cited. Headbomb {t · c · p · b} 15:26, 1 September 2020 (UTC)
Also since this does not concern Citation bot, this is {{not a bug}}. Further discussion about the purpose of identifiers can continue at Help talk:CS1 if you want.
Wiki-linked book titles are needlessly piped
- Status
- {{fixed}}
- Reported by
- XOR'easter (talk) 18:36, 1 September 2020 (UTC)
Thanks. Will be fixed soon. AManWithNoPlan (talk) 21:47, 1 September 2020 (UTC)
- also added code that will fix these when it finds them - there is at least one other bot that does this already. AManWithNoPlan (talk) 02:07, 2 September 2020 (UTC)
Bibcode lookup issues
- Status
- {{notabug}}
- Reported by
- Lithopsian (talk) 18:41, 1 September 2020 (UTC)
- What happens
- The Bot fails to add bibcode or arxiv fields to citations
- What should happen
- The bot normally (used to) add the bibcode and arxiv fields to citations that have them, for example when looking up a citation by doi. This isn't happening now. Probably related, citations with a bibcode but no doi are not expanded at all although there is no warning from the bot.
- Relevant diffs/links
- Try at User:Lithopsian/sandbox or GCIRS 16SW (reference 5)
- We can't proceed until
- Feedback from maintainers
The bot has used up its allocation of bibcodes for the time period. Should start working again soon. AManWithNoPlan (talk) 21:45, 1 September 2020 (UTC)
|first= and |author-link=
- Status
- {{fixed}}
- Reported by
- Trappist the monk (talk) 00:25, 2 September 2020 (UTC)
- What happens
- leaves behind wikilinks in |first= when adding |author-link=
- diff
- We can't proceed until
- Feedback from maintainers
https://github.com/ms609/citation-bot/pull/3472 AManWithNoPlan (talk) 01:37, 2 September 2020 (UTC)
Adds journal=Report to cite book template
- Status
- {{fixed}}
- Reported by
- Whywhenwhohow (talk) 02:26, 2 September 2020 (UTC)
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=Ledipasvir/sofosbuvir&diff=976269033&oldid=976093612
- We can't proceed until
- Feedback from maintainers
It would be useful for the bot to display its version number and/or build date in its output when run via the web UI.
This bug was discussed and closed at https://en.wikipedia.org/wiki/User_talk:Citation_bot/Archive_22#Adds_journal=Report_to_cite_book_template but still appears to be a bug.
> Checking CrossRef database for doi.
> Searching PubMed... nothing found.
> Checking AdsAbs database no record retrieved. + Adding journal: Report
> Remedial work to clean up templates
! Citation should probably not have journal = Report as well as chapter / ISBN 9789241209946
Written to Ledipasvir/sofosbuvir
- Fixing at a much deeper level this time to catch them all. There is no simple way to display a version with git in any meaningful manner. https://github.com/ms609/citation-bot/pull/3473 AManWithNoPlan (talk) 12:22, 2 September 2020 (UTC)
- The Canadian Catholic Historical Association published a journal with the title Report 🤪 I think we can live with rejecting that. AManWithNoPlan (talk) 14:33, 2 September 2020 (UTC)
Changed hyphenated page number to dashed page range.
- Status
- {{fixed}} for |page=
- Reported by
- User-duck (talk) 00:16, 2 September 2020 (UTC)
- What happens
- |page=3-159 was changed to |page=3–159.
- What should happen
- A single hyphenated page should not be changed.
- Relevant diffs/links
- Iroquois
- We can't proceed until
- Feedback from maintainers
I realize one of the common mistakes is to use a hyphen in a range. The bot should change |pages=3-159 to |pages=3–159. User-duck (talk) 00:16, 2 September 2020 (UTC)
- 99.9% of the time the change is correct. Also, it is the kind of thing that human editors do all the time. Lastly, this change actually changes the wiki text to match what is displayed. People get this wrong often enough that the bots page and the citation template docs point this out too. AManWithNoPlan (talk) 01:11, 2 September 2020 (UTC)
- 99.9 percent is very far from good enough for a bot (a machine that can be programmed to be deterministic instead of doing guesswork). Such errors introduced by bots are difficult to find and therefore harmful. Don't let the bot carry out such changes unless you are sure the bot has actually spotted an error and the edit is correct ("sure" = 100%, because you have actually checked the citation). An alternative could be to collect such occurrences and publish a list of them for humans to look over. Or add an HTML comment so that later human editors editing the article find the note and can have an extra eye on this. But don't let the bot introduce such changes itself.
- Regarding humans doing similar mistakes, yes, this happens, but they don't hammer their mistakes down into thousands of articles in no time, like bots do. This is why a mistake done by a human may be annoying, but the same mistake done by a bot is harmful and not acceptable.
- --Matthiaspaul (talk) 11:34, 2 September 2020 (UTC)
- 99.9% is a phenomenal success rate. All AWB bots make those changes with their genfixes. No reason why Citation Bot should be the special exception here. Plus, the CS1 documentation Help:CS1#Pages is clear that hyphenated pages are the exception, rather than the norm, and that people should use {{hyphen}} to indicate that this is intentional, and not to be converted to an ndash. Headbomb {t · c · p · b} 14:11, 2 September 2020 (UTC)
- Yes, 99.9% is a phenomenal success rate, but I doubt Citation Bot is that successful because I doubt it detects 1000 different errors. In any case, the success rate can be increased.
- In this case, when the bot decides to make the change from 'hyphen' to 'en dash' it assumes |page= is a mistake. In this case, it should also change |page= to |pages=.
- Maybe the bots should use {{en dash}} or – when they make the change. It makes more sense for a bot to add extra characters than a human.
- Finally, I did not imply that "Citation Bot should be the special exception". This is where I discovered the error. Of course, all the bots should be fixed.
- — User-duck (talk) 15:38, 3 September 2020 (UTC)
- The statement "Lastly, this change actually changes the wiki text to match what is displayed." confused me until I found these tidbits: (It took me a while to find the source code.)
- page: The number of a single page in the source that supports the content. Use either |page= or |pages=, but not both. Displays preceded by p. unless |nopp=y. If hyphenated, use {{hyphen}} to indicate this is intentional (e.g. |page=3{{hyphen}}12), otherwise several editors and semi-automated tools will assume this was a misuse of the parameter to indicate a page range and will convert |page=3-12 to |pages=3{{ndash}}12.
- OR: pages: A range of pages in the source that supports the content. Use either |page= or |pages=, but not both. Separate using an en dash (–); separate non-sequential pages with a comma (,); do not use to indicate the total number of pages in the source. Displays preceded by pp. unless |nopp=y. Hyphens are automatically converted to en dashes; if hyphens are appropriate because individual page numbers contain hyphens, for example: pp. 3-1–3-15, use double parentheses to tell the template to display the value of |pages= without processing it, and use {{hyphen}} to indicate to editors that a hyphen is really intended: |pages=((3{{hyphen}}1{{ndash}}3{{hyphen}}15)). Alternatively, use |at=, like this: |at=pp. 3-1–3-15.
- I do not remember reading about this use of {{hyphen}} and {{ndash}}. But it has been a while since I have read the cite templates' documentation completely.
- Am I correct that the cite templates will convert |pages=3–1–3–15 to pp. 3–1–3–15 (pp. 3–1–3–15)? Which is obviously wrong.
- It is frustrating that I need to incorporate two different work-arounds for the templates and editors. |page=3-12 and |pages=3–1–3–15 are not ambiguous. (Unless |page= and |pages= are aliases for the same parameter.)
- It is gratifying to know that some tool did translate |page=3-12 to |pages=3{{ndash}}12. Or at least a documentation editor thought they should. (And the Citation Bot does not.)
- Finally, what is done with |pages=325? This is obviously wrong and from my experience a common misuse of |pages= for the total number of pages.
- — User-duck (talk) 17:08, 3 September 2020 (UTC)
- (edit-conflict) If AWB makes the same mistakes that's not an excuse but a reason to either improve the tools or ditch them. I know, we all are volunteers here, but that should not keep us from applying professional standards to the work we do. Citation Bot's many questionable edits are an ongoing community-wide annoyance and distraction from actual article work. It is such a waste of time to be forced to clean up the mess this bot creates all over the place. I can tolerate an occasional glitch, if it will be fixed soon (I'm even willing to help), but a bot failing in one go and its operators/maintainers/users trying to defend its weaknesses by denying the errors it creates cannot be accepted. That's wasting even more precious time and energy. As sad as it is, at the present low success rate, the project would be better off without this bot. Therefore, if this bot should have any future in Wikipedia, it must be significantly improved in two areas: First, it must not carry out any edits for which there is no broad community-consensus. Second, it must not carry out any edits based on guesswork or likelihoods or assumptions instead of verified facts. Also, the attitude must change to a conservative approach about what kind of edits a bot is realistically able to carry out reliably and what is carrying a risk of messing up something, and to always stay on the safe side. If that can't be guaranteed, don't make the edit.
- It is perfectly fine for a bot or other tool to assist humans in detecting, collecting or marking spots which may require careful further investigation by a human. Even heuristics can be used for this with good success. Machines can be very good in screening huge amounts of data in no time. It is also perfectly fine for a bot to carry out "deterministic" actions for as long as they are backed up by consensus, not only by what a few militant citation warriors want to enforce as their citation standard. So, in this example, it would be okay if the bot, running into what could be a valid page or incorrectly a page range in a single |page= parameter, would actually retrieve the cited document and check the type of page numbering used in there and see if the page number exists or not, or, simpler, to ask a human to check the page numbering. If the bot is not capable of determining this with 100% reliability, it could still leave an HTML comment in the citation so that later human editors are alerted on the possible situation and can check the facts. So, the bot can still be useful, even if it does not carry out such edits by itself. It is also fine for a bot to add, f.e., an identifier or other missing information that can be retrieved with 100% accuracy (but it should not use unreliable channels to retrieve such information). What is not acceptable is to base edits on guesswork and likelihoods instead of actually verifying it in the source. That's harmful and must stop. Bots exist to assist humans, not the other way around. --Matthiaspaul (talk) 17:54, 3 September 2020 (UTC)
- |page= vs. |pages=. Grrrr. These template citations are like contradiction wrapped in an enigma. Coming soon: https://github.com/ms609/citation-bot/pull/3474 AManWithNoPlan (talk) 19:08, 3 September 2020 (UTC)
Please do not convert plain references to Cite templates
Please DO NOT convert plain references to {{cite}} templates, as was done here. They are worse in every respect -- length, readability of the source and produced text, ease of entering and editing (especially by new editors), bug resistance, ... and have NO redeeming features. In fact, Wikipedia would be immensely better if their uses were all SUBSTed and the templates deleted. And I believe that there was a WP policy that the conversion (either way) should NOT be done without a good reason.--Jorge Stolfi (talk) 14:54, 1 September 2020 (UTC)
You are heavily mistaken. Citation templates greatly facilitate the long-term maintenance of the Encyclopedia, and present the information in a consistent, uniform way. That you can't think of a "redeeming feature" doesn't mean they aren't there. I count 5 style errors in your first citation alone
- M. A. Casado-Rodriguez, M. Sanchez-Molina, A. Lucena-Serrano, C. Lucena-Serrano, B. Rodriguez-Gonalez, Manuel Algarra, Amelia Diaz, M. Valpuesta, J. M. Lopez-Romero, J. PerezJuste, and R. Contreras-Caceres (2016): "Synthesis of vinyl-terminated Au nanoprisms and nanooctahedra mediated by 3-butenoic acid: Direct Au@pNIPAM fabrication with improved SERS capabilities". Nanoscale, volume 2016, issue 8, pages 4557-4564. doi:10.1039/C5NR08054A
and 2 more in
- David B. Bigley and Michael J. Clarke (1982): "Studies in decarboxylation. Part 14. The gas-phase decarboxylation of but-3-enoic acid and the intermediacy of isocrotonic (cis-but-2-enoic) acid in its isomerisation to crotonic (trans-but-2-enoic) acid". Journal of the Chemical Society, Perkin Transactions 2, volume 2, issue 1, pages 1-6. doi:10.1039/P29820000001
There are additional features beyond just internal consistency, but those usually kick in on bigger articles. Suffice to say that it is much, much easier to maintain citation templates than it is to maintain manual citations. Having CS1/2 templates will also emit COinS metadata, and several tools will only work if citation templates are used. Headbomb {t · c · p · b} 15:38, 1 September 2020 (UTC)
- (1) There were indeed typos in those references (thanks for pointing that out), but they do not justify converting them to cite templates (which by itself would not fix the typos).
(2) Inconsistency in abbreviation of first names is not a "style error"! Giving the name in full when known (especially if that is how it appears on the article itself) causes no harm to anyone, and may be useful to identify the author and distinguish homonyms. The practice of abbreviating first names (and journal names, and writing "9(11)23-5" instead of "volume 9, issue 11, pages 23-25") was developed by journal publishers only to save paper; not because the God of Bibliography mandated it. Academics are used to these shorthands, but for the general reader of Wikipedia they are inscrutable hieroglyphs. That is one of the many reasons why the Cite templates are BAD.
(3) Insisting on an n-dash instead of hyphen to separate page numbers is not only ridiculous finnickery (which is not espoused by many authors and publishers), but in fact goes against the spirit of Wikipedia: that editors should spend their time on contents rather than appearance. That is the reason, by the way, for its early option for straight quotes and apostrophes, instead of paired open-close quotes. Demanding or implying adherence to elaborate typographical standards discourages new editors and wastes the time of old ones, without bringing any measurable benefit to readers.
All the best, --Jorge Stolfi (talk) 18:09, 1 September 2020 (UTC)
- (4) And it is not at all true that "cite templates are much, much easier to maintain". Quite the opposite! Just finding the year or title in a Cite template entry takes careful scanning of the whole entry.
From your comment, I infer that there are "projects" that intend to use the Wikipedia references as some sort of database, with query tools etc. Such a project would only have merit if it was explicitly defined, justified, and included in the official Wikipedia goals. You cannot demand that editors help such a project if they do not know about it (and if it has no visible benefit for them or for Wikipedia readers).
Actually I myself got tired of requesting that reference bodies be removed from the articles and placed in a separate unified database, so that entries do not have to be typed again and again -- like images are unified in Wikipedia Commons. I would support such a project (but not using the Cite templates as they are). Until then, please stop converting perfectly good references to Cite templates.
All the best, --Jorge Stolfi (talk) 18:26, 1 September 2020 (UTC)
- This discussion IS about ongoing disruption from the citation bot during an RFC, and those who activate it, so please refrain from closing threads. I will now start over on the post I lost to edit conflict, and @RexxS and Salvio giuliano: SandyGeorgia (Talk) 15:57, 1 September 2020 (UTC)
Starting over on post lost to edit conflict when thread was prematurely closed.
- This post IS about citation bot and the broad disruption caused by it and those who activate it.
- Jorge Stolfi is correct about WP:CITEVAR, and existing style issues in a citation are not a reason to convert the entire article.
- Because of the ongoing vagaries and problems associated with this bot, I regret having recently converted FA Tourette syndrome from manual citations to citation templates, as now I must deal with constant disruption.
- As soon as Salvio guiliano unblocked the bot, and in spite of an ongoing RFC, the bot resumed removing free full link URLs, and installing non-free full link URLs, even after I followed the bot operators' advice to add inline comments. At this point, this is simply disruptive and vandalistic, and I should not have to continue correcting such issues, just because someone or some group are anxious to add yet another identifier to every citation, resulting in even more clutter for editors and readers. See disruptive edit at a Featured article here, and take note of inline comments and addition of a non-free URL. Nor should the bot operators be disrespecting CITEVAR as raised by Jorge Stolfi. Salvio giuliano, is it time to reblock? SandyGeorgia (Talk) 16:06, 1 September 2020 (UTC)
- This has nothing to do with Citation Bot. This is all about me. I manually changed the page after fixing the bad doi. That was my bad. AManWithNoPlan (talk) 16:17, 1 September 2020 (UTC)
- OK, thanks for stating that, but once again ... regular editors who just want to get some work done are having to deal with bot problems. What happened at the link I gave above for dementia with Lewy bodies? Do you agree I am justified now in simply reverting the lot, as it appears some operators can't help themselves, and I should not have to do continuous corrections? And by the way, who activated the bot at DLB, and how am I supposed to track that down, other than reporting here as earlier advised? SandyGeorgia (Talk) 16:20, 1 September 2020 (UTC)
- The word "table" in the PMC url is now magic and the bot will see that. AManWithNoPlan (talk) 16:52, 1 September 2020 (UTC)
- Thanks for whatever magic you did there, but three things. 1. GreenC below is saying we need to somehow flag these issues, which I did as instructed with the inline comment, and yet someone activated the bot and ignored the inline. 2. Besides deleting the URL to the Table, a non-free URL was added ... weird. 3. How can I tell who activated the bot there? Regards, SandyGeorgia (Talk) 18:14, 1 September 2020 (UTC)
- What non-free URL? I don't see any in your example, please be specific. Nemo 06:56, 2 September 2020 (UTC)
- SandyGeorgia, you can see who activated the bot by "Suggested by (username)". Or in older versions it was "Activated by (username)". Redalert2fan (talk) 09:36, 2 September 2020 (UTC)
Since the non-template articles are few and far between (in the corpus of 6+ million), the onus is on those articles to flag this somehow, because citation bot is not alone. IABot, WaybackMedic and ReFill are only a few that come to mind that also convert to CS1|2, under some conditions. Like it or not, CS1|2 has become a standard. A template flag such as {{nocs1}} could work, or some other method. Automated tools need to be told what you want done; they can't magically distinguish a non-templated article from a single cite that happened to be non-templated for no reason. -- GreenC 16:42, 1 September 2020 (UTC)
- Indeed. Nobody is forced to write their references with templates, but at the same time nobody is forced to do without templates when a reference needs to be fixed. References go rotten very quickly, so they need frequent maintenance and no human editor can be expected to keep up with it: automation is the only way to respect the second pillar. Nemo 06:56, 2 September 2020 (UTC)
- {{notabug}} but maybe a {{personalproblem}} of mine. AManWithNoPlan (talk) 12:18, 6 September 2020 (UTC)
504 Gateway Time-out for citations.toolforge.org
Looks like something is wrong and I can't run the bot on any pages. Just times out. — Chris Capoccia 💬 15:43, 4 September 2020 (UTC)
- I'm getting a 503 Service Not Available. Abductive (reasoning) 18:39, 4 September 2020 (UTC)
- yep. that's all i'm getting now too. — Chris Capoccia 💬 22:19, 4 September 2020 (UTC)
Fixed now and working again. — Chris Capoccia 💬 02:20, 5 September 2020 (UTC)
- {{fixed}}, but I need to figure out why this occurs. AManWithNoPlan (talk) 12:20, 6 September 2020 (UTC)
Bot makes additional edits on second pass
How is it possible, as can be seen at Animal latrine, that the bot can find more things to fix only a few days later, with no intervening edits? Abductive (reasoning) 12:10, 6 September 2020 (UTC)
- There are a few rare cases where this can happen, but more than 99% of the time, the cause is a database being down. In this case, the bibcode database was down. AManWithNoPlan (talk) 12:22, 6 September 2020 (UTC)
- Maybe the bot should let folks know when databases are down? And is it checking downed databases for each and every article, even if it just encountered the downed database? Might this be partly responsible for the slowness? Abductive (reasoning) 12:49, 6 September 2020 (UTC)
- The edit summaries are often too long as they are. I think people would generally be annoyed and find it not useful to have edit summaries that said "...tune in next week to see how this edit ends up." AManWithNoPlan (talk) 13:34, 6 September 2020 (UTC)
- Also, if you don't really need the bibcodes, you can disable slow mode: that will make the edits faster and more reliable. You'd also leave more of the query quota for users who actually care about the bibcodes (mostly astrophysics folks, I think). Nemo 17:26, 6 September 2020 (UTC)
- Interesting. Abductive (reasoning) 21:57, 6 September 2020 (UTC)
{{notabug}}, but annoying. AManWithNoPlan (talk) 00:50, 9 September 2020 (UTC)
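On the question above of whether a downed database gets re-queried for every article: one common way to avoid that is to remember a failure for a short cooldown and skip the service until the cooldown expires. The following is only a minimal sketch in Python under that assumption. It is not the bot's actual code, and the names `DownTracker`, `lookup`, and the service label `"bibcode"` are hypothetical.

```python
import time


class DownTracker:
    """Remember which lookup services failed recently so they can be skipped."""

    def __init__(self, cooldown_seconds=300.0, clock=time.monotonic):
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock          # injectable so the behaviour can be tested
        self._failed_at = {}        # service name -> time of last failure

    def is_down(self, service):
        failed = self._failed_at.get(service)
        if failed is None:
            return False
        if self.clock() - failed >= self.cooldown_seconds:
            # Cooldown expired: forget the failure and allow a retry.
            del self._failed_at[service]
            return False
        return True

    def mark_down(self, service):
        self._failed_at[service] = self.clock()


def lookup(tracker, service, fetch):
    """Call fetch() unless the service failed recently; return None on failure."""
    if tracker.is_down(service):
        return None                 # skip the slow, failing request entirely
    try:
        return fetch()
    except OSError:                 # covers ConnectionError and TimeoutError
        tracker.mark_down(service)
        return None
```

With something like this, the first failed request against a service costs one timeout, and later articles in the same run skip that service until the cooldown has passed.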
url vs chapter-url again
- Status
- {{fixed}}
- Reported by
- Kanguole 11:20, 7 September 2020 (UTC)
- What happens
- The link shows two errors related to chapters in books:
- In the first change, the URL added to |url= actually points at the chapter (and is equivalent to the one already given in |chapter-url=).
- In the second change, the URL points to the whole book, so changing it from |url= to |chapter-url= is erroneous.
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=Japonic_languages&curid=501569&diff=977180142&oldid=973737202
- We can't proceed until
- Feedback from maintainers
It seems the first bug is still present (diff). Kanguole 17:09, 8 September 2020 (UTC)
- Sorry, I missed that one. AManWithNoPlan (talk) 17:26, 8 September 2020 (UTC)
added volume and issue; left malformed alias number
- Status
- {{fixed}}
- Reported by
- Trappist the monk (talk) 23:45, 7 September 2020 (UTC)
- What happens
- |number= is an alias of |issue=
- Relevant diffs/links
- [17]
- We can't proceed until
- Feedback from maintainers
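As an illustration of the alias point in this report, a tool that adds |issue= can first check whether |number= is already set, treating the two as the same parameter. This is a minimal Python sketch under that assumption, not the bot's implementation. The `ALIASES` table and `add_param` are hypothetical names, only the one alias named in the report is listed, and what to do with a malformed existing value is left open.

```python
# Assumption: only the alias named in the report is listed here.
ALIASES = {"number": "issue"}   # alias -> canonical parameter name


def add_param(params, name, value):
    """Add name=value unless the parameter or one of its aliases is already set.

    params is a dict of template parameters; returns True if a value was added.
    """
    canonical = ALIASES.get(name, name)
    # Every name that means the same thing as the canonical parameter.
    group = {canonical} | {a for a, c in ALIASES.items() if c == canonical}
    if any(params.get(p, "").strip() for p in group):
        return False            # already present under some name; do not duplicate
    params[canonical] = value
    return True
```

For example, calling `add_param({"number": "3"}, "issue", "3")` leaves the template unchanged, because |number= already supplies the issue.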