User talk:Citation bot/Archive 30

This is an archive of past discussions about User:Citation bot. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

Category runs skip Draft/Template pages

This is super annoying. There's no reason for those pages to be skipped. Headbomb {t · c · p · b} 03:28, 27 December 2021 (UTC)

Template would be dangerous, but drafts probably need the bot more than anything else. Will do. AManWithNoPlan (talk) 12:37, 27 December 2021 (UTC)

Why would allowing the bot to run on templates be dangerous? It seems to me, possibly somewhat naively, that it would be harmless at worst (since it would make no changes to the vast majority of templates that contain no references within themselves), and beneficial for those templates that do contain their own references. Whoop whoop pull up ^{Bitching Betty ⚧️ Averted crashes} 22:18, 27 December 2021 (UTC)

Many templates are cite book etc wrappers with all sorts of weird stuff in them. AManWithNoPlan (talk) 21:16, 28 December 2021 (UTC)

Yes, but the bot fails to edit on them when there's weird stuff. I ran it on countless templates, it never once had any issue with weird stuff. Headbomb {t · c · p · b} 21:24, 28 December 2021 (UTC)

And really weird ones should have a nope to bots template - wrapped in no includes of course. Over the years I have added paranoid checks. Template now allowed. /doc now blocked, since those often have errors on purpose. AManWithNoPlan (talk) 12:17, 29 December 2021 (UTC)

Volume cleanup

Status: Fixed
Reported by: Headbomb {t · c · p · b} 19:33, 29 December 2021 (UTC)

What should happen: [1] [2]

Basically find

|volume=Something (Foobar) |issue=Foobar

Replace

|volume=Something |issue=Foobar

Headbomb {t · c · p · b} 19:56, 29 December 2021 (UTC)
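The find/replace above could be sketched roughly as follows. This is only an illustration in Python, not the bot's actual code; the function name `strip_issue_from_volume` is made up for the example, and it assumes the volume and issue values have already been extracted from the template.

```python
def strip_issue_from_volume(volume: str, issue: str) -> str:
    """Drop a trailing "(issue)" from the volume when it repeats the issue value.

    Hypothetical helper: not taken from the Citation bot source.
    """
    suffix = f"({issue})"
    stripped = volume.rstrip()
    # Only act when the parenthetical exactly duplicates a non-empty issue.
    if issue and stripped.endswith(suffix):
        return stripped[: -len(suffix)].rstrip()
    return volume
```

With `|volume=Something (Foobar) |issue=Foobar` this yields `Something`; a volume whose parenthetical does not match the issue is returned unchanged.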

Caps: Novye i Maloizvestnye Vidy Fauny Sibiri

Status: Fixed
Reported by: Headbomb {t · c · p · b} 20:00, 29 December 2021 (UTC)

What should happen: [3]
Relevant diffs/links: [4]

bot edits template code

Status: Fixed once deployed
Reported by: Trappist the monk (talk) 01:39, 30 December 2021 (UTC)

What happens: these two templates are used to translate Polish and French cs1|2-equivalent templates to cs1|2; the translations are poor enough please keep the bot away; Editor Oculi, please don't suggest template space as an arena for the bot to play in.
Relevant diffs/links: diff and diff
We can't proceed until: Feedback from maintainers

@User:Whoop whoop pull up @User:Headbomb

Double edit

Status: Fixed
Reported by: Headbomb {t · c · p · b} 04:33, 30 December 2021 (UTC)

What happens: [5] + [6]
What should happen: One edit

did you just click the button twice? — Chris Capoccia 💬 16:14, 30 December 2021 (UTC)

Can gadgets complete the format using the parameter of oclc?

Is this because too many people use gadgets? example; . OCLC 14003250. {{cite book}}: Missing or empty |title= (help). --SilverMatsu (talk) 02:14, 2 January 2022 (UTC)

Now I was able to activate the gadget. The result seems that the gadget can't do anything with the parameter of oclc.--SilverMatsu (talk) 02:36, 2 January 2022 (UTC)

no reliable way to expand them easily. AManWithNoPlan (talk) 03:14, 2 January 2022 (UTC)

If we're talking about the RefToolbar, just put the whole worldcat URL in the URL box and try that. Most of the time it's good enough. Then you can manually replace the URL with OCLC parameter and be done. — Chris Capoccia 💬 13:45, 4 January 2022 (UTC)

Won't fix, since "most of the time" is not often enough for a fully automatic bot. AManWithNoPlan (talk) 13:28, 8 January 2022 (UTC)

The New York Times is a newspaper

Status: Fixed
Reported by: BrownHairedGirl (talk) • (contribs) 18:42, 6 January 2022 (UTC)

What happens: |publisher=nytimes.com → |work=nytimes.com
What should happen: |publisher=nytimes.com → |newspaper=[[The New York Times]], and
|work=nytimes.com → |newspaper=[[The New York Times]], and
|website=nytimes.com → |newspaper=[[The New York Times]]
Note that this should be done only where the URL is www.nytimes.com. URLs of the form blogs.nytimes.com are not part of the newspaper
Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=Richard_Abels&diff=prev&oldid=1064126667
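One way to express the requested rule, as a hedged Python sketch rather than the bot's real implementation: the function name `normalize_nyt` and the dict representation of the template parameters are assumptions made for illustration. It rewrites |publisher=, |work= or |website= only when the value is nytimes.com and the URL host is exactly www.nytimes.com, so blogs.nytimes.com is left alone.

```python
from urllib.parse import urlsplit

NYT_HOST = "www.nytimes.com"
NYT_LINK = "[[The New York Times]]"
PERIODICAL_PARAMS = ("publisher", "work", "website")


def normalize_nyt(params: dict) -> dict:
    """Return a copy of params with nytimes.com values moved to |newspaper=.

    Hypothetical helper: params maps parameter names to values.
    """
    host = (urlsplit(params.get("url", "")).hostname or "").lower()
    if host != NYT_HOST:
        # blogs.nytimes.com and other hosts are not the newspaper itself.
        return dict(params)
    out = {}
    for name, value in params.items():
        if name in PERIODICAL_PARAMS and value.strip().lower() == "nytimes.com":
            out["newspaper"] = NYT_LINK
        else:
            out[name] = value
    return out
```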

Official Charts Company

When citations already include Official Charts Company as the publication name, there is no need for this bot to add "OfficialCharts.com" into them. Not only is it repetitive to include, but that form is also malformatted as ".com" isn't technically part of the organization's name, even when used in their URL. SNUGGUMS (talk / edits) 16:26, 8 January 2022 (UTC)

I have updated the bot to add the work/website/journal/newspaper/etc parameter of Official Charts instead of the hostname redirect page. Can you point me to some examples of the trouble you see. AManWithNoPlan (talk) 17:26, 8 January 2022 (UTC)

Here is the most recent instance I found (and I've since reverted it). As you can see, both refs already contained "Official Charts Company" within them. SNUGGUMS (talk / edits) 17:43, 8 January 2022 (UTC)

{{fixed}} AManWithNoPlan (talk) 18:00, 8 January 2022 (UTC)

(edit conflict)

|agency= is the wrong parameter. That parameter is generally for organizations like Associated Press, Reuters, etc when work from the agency is redistributed by another source (typically a newspaper). Use |website=[[Official Charts Company]].

—Trappist the monk (talk) 18:03, 8 January 2022 (UTC)

It should be |publisher=[[Official Charts Company]]. The Official Charts Company is not a website. The website/work here is Official Charts. Headbomb {t · c · p · b} 19:31, 8 January 2022 (UTC)

I've done work on this site in the past. It used to be called "Chart Archive|ChartArchive|Chart Stats|ChartStats" (ie. "chartarchive[.]org|chartstats[.]org"). The companies merged and redirects were fixed. I also tried to normalize the metadata but it's messy as the strings are variable in many places, lots of edge. -- GreenC 19:57, 8 January 2022 (UTC)

"Journal = undefined"

Status: {{fixed}}
Reported by: * Pppery * _{it has begun...} 21:40, 10 January 2022 (UTC)

What happens: Citation bot adds |journal=Undefined
What should happen: It adds the proper journal title
Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=Strike_action&diff=prev&oldid=1064918648
We can't proceed until: Feedback from maintainers

Anschluss gesucht?!.

Hallo suche Personen die mich unterstützen können. Ich mache Psychiarieerfahrungen. An wen wendet man sich bei staatlicher Willkür wie bekommt man Aufklärung und öffentliches Interesse? Meine Emailadresse ist [redacted] Lg Andreas 2A02:3033:41B:8EC0:D954:CA9:734B:C6E7 (talk) 16:02, 12 January 2022 (UTC)

Which Google translate means Hello I am looking for people who can support me. I have psychiatric experiences. Who do you turn to in the event of arbitrary state decisions, how do you get information and public interest?.

Delete? --John Maynard Friedman (talk) 16:42, 12 January 2022 (UTC)

The bot inserted "An error occured!" instead of page title

Status: {{fixed}}
Reported by: LimaMario (talk) 22:27, 11 January 2022 (UTC)

What happens: The bot inserted |title = An error occured! into the template.
What should happen: The bot should insert the correct page title.
Relevant diffs/links: Special:Diff/1063849399
Replication instructions: curl -A " " -s "https://globaldatalab.org/shdi/shdi/POL/?levels=1%2B4&years=2019%2B2015%2B2010%2B2005%2B2000%2B1995" | grep -oP "(?<=<title>).*(?=</title>)"
We can't proceed until: Feedback from maintainers

The grepping of the title is a terrible idea. That leads to all the citations with titles of "Wiley's Online Library" and similar. The incorrect spelling of "occured" is now recognized. AManWithNoPlan (talk) 14:02, 12 January 2022 (UTC)
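For illustration only, recognising this kind of junk title (with either spelling) might look like the Python sketch below. The pattern and the name `is_junk_title` are assumptions for the example and say nothing about how the bot stores its own list of bad titles.

```python
import re

# Matches both the misspelled "occured" and the correct "occurred".
JUNK_TITLE = re.compile(r"^\s*an error occurr?ed\b", re.IGNORECASE)


def is_junk_title(title: str) -> bool:
    """Return True when a scraped page title is really an error message."""
    return bool(JUNK_TITLE.search(title))
```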

Series treated as journal

Status: Fixed
Reported by: Kanguole 09:08, 14 January 2022 (UTC)

What happens: treats Oceanic Linguistics Special Publications (a series) as a journal
Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=Japonic_languages&type=revision&diff=1065544233&oldid=1061977244

Added to list of series. On that specific page I added a comment to prevent the adding of the page range. I will also add code to reject page ranges that start with "i" and end with something over 100. AManWithNoPlan (talk) 14:05, 14 January 2022 (UTC)
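A rough sketch of such a check, in Python and under stated assumptions: it treats "start with i" as a start page that is literally the roman numeral "i", accepts hyphen, en dash or em dash as the separator, and uses a made-up name, `is_suspicious_page_range`. The real bot code may well differ.

```python
import re


def is_suspicious_page_range(pages: str) -> bool:
    """Flag ranges such as "i-250" that look like a whole-volume span."""
    m = re.fullmatch(r"\s*(\S+?)\s*[-\u2013\u2014]\s*(\d+)\s*", pages)
    if m is None:
        return False
    start, end = m.group(1), int(m.group(2))
    # Front matter page "i" running to a page over 100 is the rejected shape.
    return start.lower() == "i" and end > 100
```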

Could/should citation bot have made a better job of this?

Status: unfortunately Won't fix
Reported by: John Maynard Friedman (talk) 11:45, 14 January 2022 (UTC)

What happens: Only abbreviated name of publisher changed
What should happen: Fix other errors at the same time
Relevant diffs/links: see discussion below
We can't proceed until: Feedback from maintainers

At some time in the past, an editor added this feeble citation to Swastika: [https://www.dw.com/en/germany-wont-seek-eu-wide-ban-on-swastikas/a-2330716 DW 29 January 2007]
At 04:29, 14 January 2022 UTC‎, Rlink2 changed it to {{cite web| url = https://www.dw.com/en/germany-wont-seek-eu-wide-ban-on-swastikas/a-2330716| title = DW 29 January 2007}} (diff)
- (Their misunderstanding of bare URLs and WP:CITEVAR has been raised at their talk page and can be regarded as off-topic here. The counter-helpful insertion of spaces at the pipes is a "feature" of AutoWikiBrowser, apparently. This info mentioned for a complete picture, otherwise irrelevant.)
At 04:36, 14 January 2022‎ UTC, CitationBot "Suggested by Anas1712" changed it again to {{cite web| url = https://www.dw.com/en/germany-wont-seek-eu-wide-ban-on-swastikas/a-2330716| title = DW 29 January 2007| website = [[Deutsche Welle]]}} (diff)
I have further corrected it to read {{cite web |url = https://www.dw.com/en/germany-wont-seek-eu-wide-ban-on-swastikas/a-2330716 |title = Germany Won't Seek EU-Wide Ban on Swastikas |date= 29 January 2007 |work = [[Deutsche Welle]]}} (diff)

So my question is whether, accepting how poorly fed it was, could or should CitationBot have got closer to the right answer? --John Maynard Friedman (talk) 11:45, 14 January 2022 (UTC)

Do you have diffs? Headbomb {t · c · p · b} 11:48, 14 January 2022 (UTC)

Retrospectively inserted above. --John Maynard Friedman (talk) 12:01, 14 January 2022 (UTC)

No, not really. The bot would need to recognize the existing human added title as being worthy of blowing away. The bot does have some title styles that it does deem to be rubbish, but unless there are a huge number of references with the junk title of style DW date, I do not see any reason to add that to the bot. AManWithNoPlan (talk) 14:03, 14 January 2022 (UTC)

I guessed as much but thought it worth asking, given that it wasn't a big effort to find the right answer. Looks like we are safe from being replaced by AI for another wee while. Thank you for taking a look. --John Maynard Friedman (talk) 16:13, 14 January 2022 (UTC)

Job stalled

My big job has stalled after 1433/2195 pages. The last edit[7] was an hour ago. BrownHairedGirl (talk) • (contribs) 04:18, 16 January 2022 (UTC)

Resolved. The job has resumed. BrownHairedGirl (talk) • (contribs) 05:00, 16 January 2022 (UTC)

The bot seems "choppy" at times where one job just sleeps and then comes back as if nothing happened. AManWithNoPlan (talk) 13:22, 16 January 2022 (UTC)

Carnarvonshire Railway

Status: Fixed
Reported by: Xenophon Philosopher (talk) 22:33, 19 January 2022 (UTC)

In the Wikipedia article on Carnarvonshire Railway, below all the expected text matter, is a compilation of what I imagine to be an attempt to produce a line route-map that is unfinished. Can you investigate?

Xenophon Philosopher (talk) 22:33, 19 January 2022 (UTC)

bad archive titles

Status: Fixed
Reported by: * Pppery * _{it has begun...} 05:02, 20 January 2022 (UTC)

What happens: Adds title of live version of page when converting bare URL to web.archive.org. For example, it adds |title=UCLA - International Institute ..::.. Error, |title=404, and (this is especially egregious) |title=Compare Payday Loans | Find the Best Loan Deal even though the archived links are valid.
Relevant diffs/links: Special:Diff/1066790773

Hesperia (journal)

Removing the link to Hesperia (journal) on Crocus was presumably an error. I restored it, but the bot might need tweaking to prevent this happening again - presumably it didn't like the link being to the name of the journal, not to its description - which I linked to this time --Michael Goodyear ✐ ✉ 18:26, 17 January 2022 (UTC)

diff, [8].

Basically the bot did as expected, since journals should be fully linked, not partly linked. Don't put [[Hesperia (journal)|Hesperia]]: The Journal of the American School of Classical Studies at Athens, but rather [[Hesperia (journal)|Hesperia: The Journal of the American School of Classical Studies at Athens]]. Though you could also simply have [[Hesperia: The Journal of the American School of Classical Studies at Athens]]. Headbomb {t · c · p · b} 18:37, 17 January 2022 (UTC)

Wrong replacement for RIGHT SINGLE QUOTATION MARK

Status: {{fixed}} with special case code
Reported by: GA-RT-22 (talk) 19:45, 20 January 2022 (UTC)

What happens: Hawai’i (with U+2019 ’ RIGHT SINGLE QUOTATION MARK) changed to Hawai'i (with apostrophe)
What should happen: RIGHT SINGLE QUOTATION MARK should have been left alone or changed to OKINA
Relevant diffs/links: [9]
We can't proceed until: Feedback from maintainers

I realize the original text was wrong, but apostrophe is equally wrong. I also realize this might be hard to fix. GA-RT-22 (talk) 19:45, 20 January 2022 (UTC)

Fixed on the page with the correct thing. https://en.wikipedia.org/w/index.php?title=NCIS%3A_Hawai%CA%BBi&type=revision&diff=1066920058&oldid=1066914647 AManWithNoPlan (talk) 20:02, 20 January 2022 (UTC)

And reverted. We can discuss it on the talk page if you want. This fix is not needed and does nothing to fix the problem I'm reporting here. Wikipedia:Manual of Style/Hawaii-related articles says "[the template] is discouraged as this creates unnecessary characters in the editing box."

Thank you. I will look at adding a special case for Hawai'i . AManWithNoPlan (talk) 20:28, 20 January 2022 (UTC)

It's a difficult problem in general and doing nothing would be a reasonable response. GA-RT-22 (talk) 20:44, 20 January 2022 (UTC)

"Published, John Doe"

Status: Fixed
Reported by: IceWelder [✉] 11:03, 22 January 2022 (UTC)

What happens: Interprets "John Doe published" (in "By [John Doe] published January 1, 1970") as author
What should happen: Should only interpret "John Doe"
Relevant diffs/links: [10]

http://www.pcgamer.com/nidhogg-devs-flywrench-is-finally-nearly-out/
    "author" : [
       "Tom Sykes",
       "published"
     ]
   ],

I will add a fix for this. AManWithNoPlan (talk) 13:04, 22 January 2022 (UTC)
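As a minimal sketch of what such a fix could look like (Python, not the bot's code): drop list entries that are byline words rather than names. The function name `clean_authors` and the junk-word set are assumptions; only "published" comes from the metadata shown above.

```python
# Words that show up in scraped bylines but are not author names.
# Only "published" is attested in the report; extend with care.
BYLINE_JUNK = {"published"}


def clean_authors(authors):
    """Return the author list without empty entries or byline junk words."""
    cleaned = (a.strip() for a in authors)
    return [a for a in cleaned if a and a.lower() not in BYLINE_JUNK]
```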

Consistent spacing

Status: new bug
Reported by: Abductive (reasoning) 03:24, 2 August 2021 (UTC)

What happens: bot added a date parameter in a ref with a space before every pipe, but did not include a space
Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=53W53&type=revision&diff=1036681818&oldid=1036681278
We can't proceed until: Feedback from maintainers

I know this is a minor bug, but it bugs me. I know that the bot is written to make an attempt to duplicate the formatting already present in the ref. How it could have failed here, I don't know. But more importantly, it should default to the consensus ref formatting: space,pipe,parametername,=,parametervalue. (Spaces before pipes, no spaces around the equals signs or anywhere else, except perhaps before the curly end brackets if there already was a space there.) Abductive (reasoning) 03:24, 2 August 2021 (UTC)

I agree. The default should be space,pipe,parametername,=,parametervalue. --BrownHairedGirl (talk) • (contribs) 15:27, 2 August 2021 (UTC)

Cannot fix since the bot already uses the existing citation template as a guide. Templates with mixed spacing such as these cannot be done in a way that makes everyone happy. AManWithNoPlan (talk) 16:45, 2 August 2021 (UTC)

But how to explain the example? The bot deviated from the format of the ref it edited? Abductive (reasoning) 16:59, 2 August 2021 (UTC)

I see, you want the bot to add spaces to existing parameters - in particular the last one. Interesting, the bot by default does not in any way modify spacing of existing parameters. That parameter has no trailing spaces. As far as the bot is concerned there are no spaces before pipes, just spaces at the end of parameters. AManWithNoPlan (talk) 17:14, 2 August 2021 (UTC)

The bot must have looked at the lack-of-space of the last parameter (before the end curly braces) to come to the conclusion that the ref was formatted that way. Perhaps it should look after the "cite xxxx" for the cue? Abductive (reasoning) 17:51, 2 August 2021 (UTC)

No, that is not what it did. It simply does not change the spacing of existing parameters. The existing final parameter has no ending space, so the bot does not add one. AManWithNoPlan (talk) 21:14, 2 August 2021 (UTC)

Ah, I see what you are saying. It slotted it in at the end. Well, I had hoped that the bot could have provided a cure to the annoying new habit of users removing all spaces from refs, making a wall of text for editors. Abductive (reasoning) 22:25, 2 August 2021 (UTC)

And creates annoyingly unpredictable line wraps. Does this format really have consensus? If so, bots (any bot) could create a cosmetic function for citations they edit. -- GreenC 17:04, 6 August 2021 (UTC)

There are some people who like the "crammed" format. I started a conversation about the formatting here, but I don't really understand what they were saying. Abductive (reasoning) 02:06, 7 August 2021 (UTC)

As Abductive suggests, what the bot should do ideally is to check if the first parameter's pipe following the template name is preceded by a space (or even better, if at least one of the parameters' pipe symbol is preceded by space) and if it is, it should add a space in front of pipe symbol of newly inserted parameters, no matter where they are inserted into the parameter list. If the template has no parameters yet, the bot should fall back to the "default" format "space, pipe, parameter name, equal sign, parameter value" we consistently use in all CS1/CS2 documentation and examples. (Well, IMO, this latter format would ideally be made the only format used at all, but that's a discussion beyond the scope of CB issues here.)

Yeah, it is only cosmetic, but like Abductive I too find it somewhat annoying when previously perfectly formatted citations become misaligned by bot edits.

--Matthiaspaul (talk) 13:34, 7 August 2021 (UTC)

While I agree, this is actually going to be hard to implement. I will need to think about it. AManWithNoPlan (talk) 18:12, 8 August 2021 (UTC)

Still thinking about how to do this. It will have to deal with figuring out what the last parameter is before adding a parameter to the very end, but not the middle. AManWithNoPlan (talk) 00:51, 4 September 2021 (UTC)

I ran into this same problem with my bot, I solved it by never adding a new parameter in the last position. It requires a function to determine what the second-to-last parameter is and assumes a library that supports placement of parameters. -- GreenC 18:25, 24 October 2021 (UTC)
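The approach discussed in this thread (look for a space before any existing pipe, otherwise fall back to the documented "space, pipe, name, =, value" default) can be sketched as below. This is a simplified Python illustration under clear assumptions: the function name `add_parameter` is invented, the template is passed as a flat string ending in `}}`, and nested templates or piped wikilinks inside values are not handled.

```python
import re


def add_parameter(template: str, name: str, value: str) -> str:
    """Append name=value before the closing braces, matching pipe spacing."""
    if not template.endswith("}}"):
        raise ValueError("expected a template ending in '}}'")
    body = template[:-2]
    core = body.rstrip()
    # Keep whatever whitespace already sat before the closing braces.
    trailing = body[len(core):]
    crammed = "|" in core and not re.search(r"\s\|", core)
    # Crammed templates get "|name=value"; spaced or empty ones get " |name=value".
    sep = "|" if crammed else " |"
    return f"{core}{sep}{name}={value}{trailing}}}}}"
```

Because the new parameter is separated from the previous one by its own leading space, an existing final parameter with no trailing space no longer produces a crammed result in an otherwise spaced template.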

Bot changing work= to newspaper=

Is there a convincing reason for the bot to make this change? template:cite news does not give it as preferred to work=. Whether some modern source is or is not a newspaper can be arguable, so why ask questions when you don't know the answer? Do you keep a massive table of which news sources are [physical] newspapers and which only have a web presence? Or which stories only ever appeared on the website but never made it into print? It seems to me that it is not broken so doesn't need fixing. --John Maynard Friedman (talk) 11:05, 7 January 2022 (UTC)

example diff. --John Maynard Friedman (talk) 20:08, 8 January 2022 (UTC)

@John Maynard Friedman: Who needs a massive table? We already have an entire database called Wikidata. It can easily be determined that The Washington Post is a daily newspaper via d:Special:EntityPage/Q166032#P31 (that said I think you have a valid argument as to whether such edits should be done by a bot). —Uzume (talk) 04:15, 9 January 2022 (UTC)

There actually is a (not massive) table. Wikidata would be too much work I think. Izno (talk) 04:44, 9 January 2022 (UTC)

All of which is no doubt interesting but irrelevant. It doesn't explain the value or purpose of the change. And now I see another one: diff where 'work' has been changed to 'magazine' (which is true, but so what?) but it still says {{cite news}} – not {{cite magazine}} which I suppose just might be more useful metadata sometime. So a potentially useful change is ignored but a fatuous one taken. What is the point of this change? This is the cosmetic edit equivalent of changing eyeshadow. --John Maynard Friedman (talk) 10:08, 9 January 2022 (UTC)

That diff is a bug. Izno (talk) 18:41, 9 January 2022 (UTC)

And consequently we get totally pointless, annoying and counter-policy cosmetic edits like this one. Please stop. --John Maynard Friedman (talk) 10:47, 14 January 2022 (UTC)

I have made some fixes to the magazine code. AManWithNoPlan (talk) 15:26, 15 January 2022 (UTC)

@AManWithNoPlan: now that the bot is reprogrammed with new rules, has there been a new request to the Wikipedia:Bot Approvals Group? —Sladen (talk) 08:06, 20 January 2022 (UTC)

The changes better implement the already approved tasks. AManWithNoPlan (talk) 12:22, 20 January 2022 (UTC)

@AManWithNoPlan: So, "no"? —Sladen (talk) 19:59, 20 January 2022 (UTC)

Correct, no new feature was added. AManWithNoPlan (talk) 20:04, 20 January 2022 (UTC)

User sandboxes

Why has @AManWithNoPlan set the bot loose on a set of 2,850 pages which so far seems to consist entirely of user sandboxes?

It doesn't seem to me to be appropriate for one editor to edit so many sandboxes. WP:USERPAGE#Editing_of_other_editors'_user_and_user_talk_pages discourages such editing.

I also don't see the utility of these edits. I just checked 5 successive edits in this batch, and in each case the last edit was from 2017/2018 ([11] 2018, [12] 2018, [13] 2017, [14] 2017, [15] 2018).

The bot has limited capacity and is usually overloaded. Why is that capacity being squandered on editing stale userspace drafts, apparently without permission? BrownHairedGirl (talk) • (contribs) 20:14, 23 January 2022 (UTC)

I am trying to debug a crash and so ran it on very old and stale sandboxes. That way it would encounter old styles and such. I have since killed the job, since it did not die. AManWithNoPlan (talk) 20:16, 23 January 2022 (UTC)

Fair enough! That makes sense. BrownHairedGirl (talk) • (contribs) 20:38, 23 January 2022 (UTC)

Bad cite type change

Status: Fixed - now when it gets into a bad state, it will reset to initial type, rather than cite document
Reported by: —¿philoserf? (talk) 07:29, 22 January 2022 (UTC)

What happens: replaced cite web with cite document which is a redirect to cite journal without adding the required attribute, journal. cited content appears to be conference proceeding.
What should happen: change to cite journal with required attribute, or leave unchanged
Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=Atmospheric_lake&diff=next&oldid=1067198273&diffmode=source
We can't proceed until: Feedback from maintainers

Billboard refs

Is there a particular reason why the bot is changing references using the cite web templates for articles on Billboard's website to cite magazine? The print magazine is not being cited. Even when I corrected the change on an article, the bot came back and changed it to cite magazine again. -- Carlobunnie (talk) 01:00, 30 November 2021 (UTC)

Online magazines are still magazines. Headbomb {t · c · p · b} 01:07, 30 November 2021 (UTC)

Sure yes, but the cite web template is also still correct/applicable so where is the need to change it? Why doesn't the bot change all Time or Variety refs to the cite mag template also? My thing is that it's weird and inconsistent and unnecessary. -- Carlobunnie (talk) 00:06, 1 December 2021 (UTC)

Converting Cite web to Cite document creates a CS1 error

Status: Fixed a while ago
Reported by: GoingBatty (talk) 13:28, 25 January 2022 (UTC)

What happens: Citation bot converts some {{cite web}} template to {{cite document}} (which is a redirect to {{cite journal}}). This then creates CS1 errors Cite journal requires |journal= and adds the article to Category:CS1 errors: missing periodical.
What should happen: Please stop converting the templates. Consider resuming if {{cite document}} is split from {{cite journal}}.
Relevant diffs/links: See this edit to Nirmatrelvir.
We can't proceed until: Feedback from maintainers

Let us know if there are any new ones like this. AManWithNoPlan (talk) 15:04, 25 January 2022 (UTC)

Probably ALL instances of {{cite document}} that contain |url= and no |work=/aliases should be turned into {{cite web}} AManWithNoPlan (talk) 15:06, 25 January 2022 (UTC)
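The rule proposed in the comment above could be sketched as follows. This is a hedged Python illustration, not the bot's code: the name `retarget_cite_document` is invented, the parameters are assumed to be available as a dict, and the alias tuple is a partial list of |work= aliases chosen for the example.

```python
# Partial list of |work= aliases, assumed for this sketch.
WORK_ALIASES = ("work", "journal", "newspaper", "magazine", "periodical", "website")


def retarget_cite_document(name: str, params: dict) -> str:
    """Return "cite web" for cite document with a URL and no periodical."""
    if name.strip().lower() != "cite document":
        return name
    has_url = bool(params.get("url", "").strip())
    has_work = any(params.get(alias, "").strip() for alias in WORK_ALIASES)
    return "cite web" if has_url and not has_work else name
```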

More bad titles

Status: Fixed
Reported by: * Pppery * _{it has begun...} 01:25, 27 January 2022 (UTC)

What happens: Citation bot adds |title=Folha Online - Especial - 2004 - Eleições - Apuração - Santos (SP) - Prefeito (1º turno) and |title=Pagina inicial. It's unclear what the title should be, but it's definitely not either of those things.
Relevant diffs/links: Special:Diff/1068174855

Converting cite document to cite journal without supporting parameters

The citation bot updated Computer terminal at 08:10, 28 January 2022, changing

{{cite document  |title=G101-A Remote Time Share Terminal with Graphic Terminal
   |author=S. Pardee  |s2cid=27102280
 |date=1971
   |doi=10.1109/T-C.1971.223364
 |quote=Terminal cost is currently about $10,000}}

to

{{cite journal  |title=G101-A Remote Time Share Terminal with Graphic Terminal
   |author=S. Pardee  |s2cid=27102280
 |date=1971
   |doi=10.1109/T-C.1971.223364
 |quote=Terminal cost is currently about $10,000}}

S. Pardee (1971). "G101-A Remote Time Share Terminal with Graphic Terminal". doi:10.1109/T-C.1971.223364. S2CID 27102280. Terminal cost is currently about $10,000 {{cite journal}}: Cite journal requires |journal= (help)

with the preview message

Script warning: One or more {{cite journal}} templates have errors; messages may be hidden (help:CS1 errors|help).

and no error message on the actual citation. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 15:46, 28 January 2022 (UTC)

There is an error message on the citation. Because of drama elsewhere, the Cite journal requires |journal= error messages are hidden. Use your personal css to show hidden error messages. See Help:CS1 errors § Controlling error message display for instructions.

—Trappist the monk (talk) 15:58, 28 January 2022 (UTC)

The title was wrong, so the bot refused to update the other parameters. FYI, the CS1 error was there before the bot did the edit. AManWithNoPlan (talk) 16:14, 28 January 2022 (UTC)

{{fixed}}. The DOI was leading it to thinking that it was a journal. AManWithNoPlan (talk) 17:15, 28 January 2022 (UTC)

Cite document

Would batch runners of this bot hold off converting all instances of {{cite document}} to {{cite journal}} or {{cite web}}, please? There is a discussion at Help talk:Citation Style 1/Archive 81#"Cite document" needs its own template. Redirecting to cite journal is illogical and unfriendly. which I hope will have the effect of giving us a useful 'cite document' template that actually, wait for it... wait for it..., cites documents. Offline documents, online documents, you name it. Most but not all are PDFs. --John Maynard Friedman (talk) 17:17, 28 January 2022 (UTC)

{{fixed}}. Seems like CS1/CS2 people need to figure this out. AManWithNoPlan (talk) 17:41, 28 January 2022 (UTC)

I blame Editor Smith609: Special:Permalink/362305318.

—Trappist the monk (talk) 17:53, 28 January 2022 (UTC)

Soooooooo many template aliases. Ugh. AManWithNoPlan (talk) 18:05, 28 January 2022 (UTC)

.... ", sooooooo little time". Of course if you are feeling lucky [punk], you could propose that we go back to {{Citation}} with type= --John Maynard Friedman (talk) 18:41, 28 January 2022 (UTC)

There are a lot of ways people have been using cite document that need to be converted into many other types of templates including cite thesis or cite report like this diff. — Chris Capoccia 💬 21:05, 28 January 2022 (UTC)

or another mess of situations here. — Chris Capoccia 💬 21:17, 28 January 2022 (UTC)

or here's another that is really a thesis in Greek.

These are just the first few I picked from the "Pages that link to..." list. No idea how a bot is going to muddle its way through this without making a worse mess. — Chris Capoccia 💬 21:24, 28 January 2022 (UTC)

How to make it happen?

Trappist the monk has produced a sandbox version of the new template, but observes that there are 9000 uses of the current cite document in use. So here is my Cunning Plan: establish the new template temporarily as {{cite document2}}. Then our friendly local batch runners (who have already been beavering away 'correcting' instances of cite document into {{cite journal}} or {{cite web}}) now have a third choice: the new template. Simple test: if it is a PDF or has no URL, then it is a genuine document that needs the new template. Existing tests for an actual journal are as present and should come first. Discards go in the cite web bin. 9000 edits later, we rename cite document2 as cite document and another bot run converts all uses to the corrected name. What could possibly go wrong? --John Maynard Friedman (talk) 18:41, 28 January 2022 (UTC)

In view of the number of much bigger problems with cite document being identified above, I am parking this proposal until further notice. My thanks to all who gave it time but I don't think it is fair to push it right now. --John Maynard Friedman (talk) 22:31, 28 January 2022 (UTC)

Adding editor as author twice

Status: new bug
Reported by: GoingBatty (talk) 23:14, 29 January 2022 (UTC)

What happens: Added the editor in the author fields |last=|first= and |last3=|first3=
What should happen: Add editor in |editor= or |editor-last=|editor-first=, don't add |last3=|first3= without |last2=|first2= or |author2=
Relevant diffs/links: This edit to Claire Annabel Caroline Grant Duff
We can't proceed until: Feedback from maintainers

Note that in the same edits it also added |publisher=UNIVERSITY OF ILLINOIS PRESS to books that were already marked as |place=London: Methuen. The correct change there would have been to continue to use the British publisher, |publisher=Methuen |location=London, rather than being schizophrenic and SHOUTY. —David Eppstein (talk) 23:35, 29 January 2022 (UTC)

I will look at the https://books.google.com/books?id=VXQXzUbCSygC meta-data. AManWithNoPlan (talk) 23:36, 29 January 2022 (UTC)

https://github.com/ms609/citation-bot/commit/db0d8ca1946d85077e5f98020b6562bea912b3fb shouty fix. AManWithNoPlan (talk) 00:05, 30 January 2022 (UTC)

https://github.com/ms609/citation-bot/commit/17b72270e94d162b99c3eb1210358433962d6288 location to publisher and vice versa fix. AManWithNoPlan (talk) 00:12, 30 January 2022 (UTC)

https://github.com/ms609/citation-bot/commit/e209b0bb03f197f3d667665fff68e489a13d107a skipping authors fixed. AManWithNoPlan (talk) 00:15, 30 January 2022 (UTC)

Existing bad authors made worse

Status: Fixed in https://github.com/ms609/citation-bot/commit/5e79a93f4e630d732f445467f7dc3acd9816e93d
Reported by: GoingBatty (talk) 22:24, 29 January 2022 (UTC)

What happens: Changed | author = Dinesh Ramde, Todd Richmond | author2 =Associated Press | author2-link =Associated Press to | agency = Dinesh Ramde, Todd Richmond | author2 =Associated Press | author2-link =Associated Press, thereby adding the article to Category:CS1 errors: missing name‎
What should happen: Not edit this reference, or change to | author = Dinesh Ramde, Todd Richmond | agency =[[Associated Press]]
Relevant diffs/links: This edit to Oak Creek, Wisconsin
We can't proceed until: Feedback from maintainers

Unique meta-data leads to author error

Status: Fixed in https://github.com/ms609/citation-bot/commit/476917ca16e6fe68562b5d057f08ac98e0427c03
Reported by: GoingBatty (talk) 22:20, 29 January 2022 (UTC)

What happens: When expanding cite arxiv, the bot added author27 without author26, thereby adding the article to Category:CS1 errors: missing name‎
What should happen: not add author27 without author26
Relevant diffs/links: This edit on BD+60 1417b
We can't proceed until: Feedback from maintainers

One of the authors is literally just the colon character. AManWithNoPlan (talk) 22:58, 29 January 2022 (UTC)

Feature request

Would you like to add a feature, to remove maintenance error: leading equal (==) in CS. https://en.wikipedia.org/w/index.php?title=3D_film&diff=1068666785&oldid=1066977786 This would clean the Category:CS1 maint: extra punctuation. Grimes2 (talk) 18:26, 29 January 2022 (UTC)

https://en.wikipedia.org/wiki/Category:CS1_maint:_extra_punctuation direct link. AManWithNoPlan (talk) 22:51, 29 January 2022 (UTC)

Will be {{fixed}} once deployed to the servers. AManWithNoPlan (talk) 16:55, 30 January 2022 (UTC)

Organizational author split into first/last parameters

Status: {{fixed}}
Reported by: GoingBatty (talk) 01:47, 30 January 2022 (UTC)

What happens: Organizational author incorrectly split into first/last parameters and duplicated, last2/first2 added without last1/first1 or author1, which added the page to Category:CS1 errors: missing name‎
What should happen: Correct author/editor fields
Relevant diffs/links: This edit to Draft:Mabel Kelly (actress); This edit to Operation Lost Trust
We can't proceed until: Feedback from maintainers

Thank you for reporting these things. Dealing with less than perfect refs and meta-data is a pain, but worth fixing. AManWithNoPlan (talk) 16:49, 30 January 2022 (UTC)

Book title added in last2/first2 fields

Status: {{fixed}}
Reported by: GoingBatty (talk) 02:09, 30 January 2022 (UTC)

What happens: Part of the article title added in last2/first2 fields, no last1/first1 fields means this was added to Category:CS1 errors: missing name‎
What should happen: No last/first fields added
Relevant diffs/links: This edit to Maurice King (lawyer)
We can't proceed until: Feedback from maintainers

Changing author to agency without changing author2/author3

Status: {{fixed}}
Reported by: GoingBatty (talk) 02:20, 30 January 2022 (UTC)

What happens: Changed |author=Reuters|author2=Sami Aboudi|author3=Bill Roggio to |agency=Reuters|author2=Sami Aboudi|author3=Bill Roggio, which adds the article to Category:CS1 errors: missing name‎
What should happen: Change author2 to author1, author3 to author2
Relevant diffs/links: This edit to Mohammed Jamal Khalifa
We can't proceed until: Feedback from maintainers
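The renumbering requested in this report (and the related "author27 without author26" report above) can be sketched as follows. This is a minimal illustration in Python, not the bot's actual PHP code: the function name `renumber_authors`, the `NAME_PARAM` pattern, and the small set of parameter names it recognises are all assumptions made for the example.

```python
import re

# Hypothetical subset of CS1 name parameters: author/last/first, optionally
# numbered, optionally followed by "-link". Real CS1 has many more aliases.
NAME_PARAM = re.compile(r"^(author|last|first)(\d*)(-link)?$")


def renumber_authors(params):
    """Close gaps in numbered name parameters, keeping their relative order."""

    def index_of(key):
        match = NAME_PARAM.match(key)
        if match is None:
            return None
        # An unnumbered parameter such as "author" counts as position 1.
        return int(match.group(2) or 1)

    # Distinct positions currently in use, e.g. [2, 3] once |author= has
    # been turned into |agency=.
    used = sorted({index_of(k) for k in params if index_of(k) is not None})
    remap = {old: new for new, old in enumerate(used, start=1)}

    result = {}
    for key, value in params.items():
        old = index_of(key)
        if old is not None and remap[old] != old:
            match = NAME_PARAM.match(key)
            key = f"{match.group(1)}{remap[old]}{match.group(3) or ''}"
        result[key] = value
    return result


fixed = renumber_authors(
    {"agency": "Reuters", "author2": "Sami Aboudi", "author3": "Bill Roggio"}
)
print(fixed)
```

Parameters whose position does not change keep their original spelling, so an existing unnumbered |author= is left alone.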

Two simultaneous batch jobs by one editor

I just noticed that the bot's recent contribs list appears to show two batch jobs being run simultaneously at the request of @AManWithNoPlan. One has 1776 pages, and the other has 1682.

That explains why the bot has been so unresponsive today to single-page requests.

How is this possible? And if it happened accidentally, why did @AManWithNoPlan not use https://citations.toolforge.org/kill_big_job.php to kill one of the batches? BrownHairedGirl (talk) • (contribs) 18:54, 30 January 2022 (UTC)

Usually not possible, but it is not perfect. I have submitted a kill job to the newer one, but the old one is almost done. AManWithNoPlan (talk) 19:41, 30 January 2022 (UTC)

And they are both gone. {{fixed}}. AManWithNoPlan (talk) 02:55, 31 January 2022 (UTC)

Thanks, @AManWithNoPlan.

To avoid this happening again, please can you do some monitoring of what the bot does with a batch you submit. In this case, such monitoring should have alerted you to both the two-jobs problem and the cosmetic edit problem.

And except for short-run testing of the bot's ability to avoid cosmetic edits, it would be advisable to stop using transcludes-template-redirect as the basis for selecting a batch. BrownHairedGirl (talk) • (contribs) 04:11, 31 January 2022 (UTC)

Extra punctuation

Status: Fixed
Reported by: Grimes2 (talk) 05:23, 31 January 2022 (UTC)

What happens: leading equal is not fixed with subsequent link
What should happen: https://en.wikipedia.org/wiki/Category:CS1_maint:_extra_punctuation fix
Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=2007–08_Central_Coast_Mariners_FC_season&diff=1068998583&oldid=1068663079

wiki-link was not taken into account. should work for those too once deployed. AManWithNoPlan (talk) 09:44, 31 January 2022 (UTC)
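A rough sketch of the cleanup discussed here, in Python rather than the bot's PHP: strip any stray leading "=" from a parameter value and leave the rest untouched, so a value that starts with a wikilink is handled the same way as plain text. The function name `strip_leading_equals` is hypothetical, and how the bot actually parses values is not shown in this thread.

```python
import re


def strip_leading_equals(value):
    """Remove stray '=' characters (and surrounding spaces) at the start of a value."""
    return re.sub(r"^\s*=+\s*", "", value)


# For |work==[[The Guardian]] the parsed value is "=[[The Guardian]]".
print(strip_leading_equals("=[[The Guardian]]"))
print(strip_leading_equals("= Some title"))
```

An "=" that appears later in the value is not touched, since only the leading run is matched.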

cite arxiv vs cite arXiv

Status: new bug
Reported by: Headbomb {t · c · p · b} 13:50, 30 January 2022 (UTC)

What happens: change cite arxiv to cite arXiv when nothing else is done
What should happen: Leave those alone, unless making substantive changes elsewhere.
Relevant diffs/links: [16], [17]
We can't proceed until: Feedback from maintainers

Can you explain why Special:Diff/1068764242, whose only effect is to replace "cite arxiv" by "cite arXiv" (twice), does not violate the prohibition against purely-cosmetic bot edits? —David Eppstein (talk) 08:55, 30 January 2022 (UTC)

Was about to come in to report the bug for this edit. And @David Eppstein:, usually it's best to simply report this as a bug, rather than start to ask for 'explanations' of what clearly are bugs. Headbomb {t · c · p · b} 13:47, 30 January 2022 (UTC)

This might be bigger than just cite arXiv - see this edit to Joker (The Dark Knight), where the bot only changed {{cite DVD notes}} to its redirect {{cite AV media notes}}. GoingBatty (talk) 15:31, 30 January 2022 (UTC)

Investigating. Some fixes deployed, but existing runs will not see the changes just yet. AManWithNoPlan (talk) 16:21, 30 January 2022 (UTC)

@AManWithNoPlan: "existing runs will not see the changes just yet" So kill the run and restart it? Headbomb {t · c · p · b} 18:39, 30 January 2022 (UTC)

Cosmetic bot, again

The bot is nearing the end of a batch of 1776 pages requested by @AManWithNoPlan, many of which consist solely of changing {{cite arxiv to {{cite arXiv. The bot's most recent 20 edits include only 14 from this batch, and 4 of those 14 edits make only that capitalisation fix: [18], [19], [20], [21].

These edits are pointless, because {{cite arxiv}} is a redirect to {{cite arXiv}}. So the capitalisation of the template name has zero effect on the rendered article: in each case, the same template is used. There is not even a boost to maintenance, because a difference in capitalisation has no impact on a search for the template.

WP:COSMETICBOT is very clear that such changes should not usually be done on their own, but may be allowed in an edit that also includes a substantive change.
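One way to catch this class of edit before saving can be sketched as follows: map template names through a redirect table, then compare the page text before and after. If the two only differ before that mapping, the edit changes nothing but redirect names. This is an illustration in Python under stated assumptions, not the bot's own check: `REDIRECTS`, `canonicalize` and `is_cosmetic` are hypothetical names, and the table holds only the two redirects mentioned in this thread.

```python
import re

# Hypothetical redirect table: lowercased redirect name -> target template.
REDIRECTS = {
    "cite arxiv": "cite arXiv",
    "cite dvd notes": "cite AV media notes",
}

# A template name is the text between "{{" and the first "|" or "}".
TEMPLATE_NAME = re.compile(r"\{\{\s*([^|{}]+?)\s*(?=[|}])")


def canonicalize(wikitext):
    """Rewrite every template name to its redirect target, if one is known."""

    def repl(match):
        name = match.group(1)
        return "{{" + REDIRECTS.get(name.lower(), name)

    return TEMPLATE_NAME.sub(repl, wikitext)


def is_cosmetic(before, after):
    """True if the edit only swaps template names for their redirect targets."""
    return before != after and canonicalize(before) == canonicalize(after)


old = "{{cite arxiv |eprint=2201.00001}}"
print(is_cosmetic(old, "{{cite arXiv |eprint=2201.00001}}"))
print(is_cosmetic(old, "{{cite arXiv |eprint=2201.00001 |year=2022}}"))
```

The same comparison could be extended to parameter aliases, but that is not attempted here.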

This has been raised many times before, but there has been no fix. Even worse, the prevalence of this cosmetic fix in this batch makes it appear that use of the uncapitalised {{cite arxiv}} was probably a selection criterion for this batch. This seems to be flagrant defiance of WP:COSMETICBOT, and it's only the latest in a series of similar batches. It is very disappointing to see the bot's maintainer again misusing the bot in this way.

@AManWithNoPlan does wonderful hard work in maintaining the bot. But these cosmetic batch jobs are a breach of bot policy, and they also squander the resources of the bot by displacing more productive jobs. Time to stop this. BrownHairedGirl (talk) • (contribs) 18:32, 30 January 2022 (UTC)

The bot job is a mixture of a wide variety of different possible fixes. With complaints, the bot gets better about minor vs. major fixes. Doing this is surprisingly not as easy as one would think for such a complex bot. I have no more pages to submit at this time. I have submitted a request to kill the PHP job. AManWithNoPlan (talk) 19:37, 30 January 2022 (UTC)

@AManWithNoPlan: so the miscapitalised template name was one of the selection criteria? Not good, esp when this is an oft-reported bug, and there seems to have been no effort to monitor whether it was causing a lot of cosmetic edits.

I don't see why it is so difficult to identify most of the edits that are purely cosmetic. I don't know the code, so there may be issues I am unaware of ... but even if the causes are hard to track down, we still have the issue of a long-term problem of the maintainer repeatedly making series of edits which appear to have been designed to exploit the lack of a fix. BrownHairedGirl (talk) • (contribs) 00:03, 31 January 2022 (UTC)

Caps: Current Opinion in HIV and AIDS

Status: Fixed once deployed
Reported by: Headbomb {t · c · p · b} 20:00, 1 February 2022 (UTC)

What should happen: [22]
We can't proceed until: Feedback from maintainers

Caps: Chest

Status: Fixed once deployed
Reported by: Headbomb {t · c · p · b} 20:16, 1 February 2022 (UTC)

What should happen: [23]
We can't proceed until: Feedback from maintainers

TNT volume=Online edition

Status: Fixed -- will replace with better value
Reported by: Headbomb {t · c · p · b} 15:40, 2 February 2022 (UTC)

What should happen: [24]
We can't proceed until: Feedback from maintainers

dx.doi.org is flaky

We are aware of extra doi-broken-date flags being added. Have added extra checks to latest version. Existing runs will not see them. AManWithNoPlan (talk) 00:16, 30 January 2022 (UTC)

hdl.handle.net also improved. AManWithNoPlan (talk) 22:21, 30 January 2022 (UTC)

Still a bit of a problem, but new code is much better. AManWithNoPlan (talk) 20:02, 31 January 2022 (UTC)

Websites violating the generally accepted HTTP header standard.

Fixed. AManWithNoPlan (talk) 19:55, 2 February 2022 (UTC)

Work in Cite News

The bot has changed work to newspaper without any other changes. This is purely a cosmetic change since newspaper is an alias of work. Why is this happening? Guerillero ^{Parlez Moi} 21:12, 28 January 2022 (UTC)

TNT journal = Semantic Scholar

Status: Fixed
Reported by: Headbomb {t · c · p · b} 15:49, 4 February 2022 (UTC)

What should happen: [25]

Feature: usurped title

Hi, Example diff (second change). When a domain has been usurped it is replaced with |url-status=usurped and sometimes the |title= also contains usurped content. My bot is able to detect keywords that indicate a usurped title, but unable to determine a correct replacement title, so it adds the placeholder |title=usurped title. This placeholder was also discussed at CS1|2 help. It would be great if Citation bot was able to recognize the placeholder and fill in a better title. It would require using |archive-url= as the source since the |url= is usurped. There are not a huge number (197) but they are growing indefinitely due to the WP:JUDI case. -- GreenC 19:31, 27 December 2021 (UTC)

Good point, @GreenC.

I had been thinking about a similar issue: InternetArchiveBot's addition of |title=Archived copy when it archives a URL which lacks a title. That usage is categorised in Category:CS1 maint: archived copy as title, which currently contains over 160,000 articles.

In both cases, a remedy will require analysis of the archived copy of the page. Whether the generic title is |title=Archived copy or |title=usurped title, the same remedy is required ... so the two should be treated as one task.

I would much prefer that this was done by a new standalone bot, rather than incorporated into Citation bot. Citation bot is way overloaded even with its current task set. Adding in the huge backlog of generic titles would swamp Citation bot.

And in any case, I don't see any overlap between this task and Citation bot's other functions. This generic titles task does not need to add a cite template or change its type; all it needs to do is to change the value of |title=. The lookup of the archived title is not part of Citation bot's current capabilities.

So there is no benefit to including this in Citation bot, only downsides. This needs a new bot. I suggest a request at WP:BOTREQ. BrownHairedGirl (talk) • (contribs) 05:02, 28 December 2021 (UTC)

Dealing with the backlog could have its own dedicated bot, but there's no reason not to have this covered by Citation bot. Headbomb {t · c · p · b} 05:46, 28 December 2021 (UTC)

@Headbomb:: as I explained above, there are two good reasons not to have this covered by Citation bot:

Citation bot is already way overloaded. Adding another task will make that worse.
Direct lookup of title has never been part of Citation bot's function. It does lookup only indirectly, through the Zotero servers. BrownHairedGirl (talk) • (contribs) 06:58, 28 December 2021 (UTC)

If Citation bot wants to do "Archived copy" as well, great. If the concern is someone will submit a 160k job and swamp the system, blacklist the tracking category as input and let the bot do it incidentally while doing other jobs. While it whittles away. People have talked about a dedicated title bot forever. Citation bot already does titles (I think?), getting it from Wayback Machine is the same: <title>Page Title</title>. -- GreenC 06:19, 28 December 2021 (UTC)

@GreenC: your suggested backgrounding of this task would involve a lot of extra programming to Citation bot, which may not be compatible with its existing queue structure. And even if backgrounded, it would still be adding load to an already overloaded tool. In the last six months, there have been many discussions here about how to reduce that overload; adding to the load only makes the problem worse.

Blacklisting the tracking category would be a truly terrible idea: it would lock Citation bot out of any work needed on ~3% of all en.wp articles.

As above, Citation bot gets titles from the Zotero servers. Direct lookup from the Wayback Machine would be new functionality ... and the Zoteros are also very overloaded, so even if they could be pointed at the Wayback Machine, that would just exacerbate the overload.

There is zero advantage to bolting this function onto Citation bot, because it would all be new functionality; and there are huge downsides. This needs a new standalone bot ... and I think I may know the person who can do it. BrownHairedGirl (talk) • (contribs) 07:16, 28 December 2021 (UTC)

Blacklisting the category means that no one could request that the bot runs on 160K articles at once. This is already done. As for "would involve a lot of extra programming to Citation bot", let's let AManWithNoPlan decide on how feasible this is, since he's the coder. Headbomb {t · c · p · b} 08:46, 28 December 2021 (UTC)

@Headbomb: As above, blacklisting the tracking category would lock Citation bot out of any work needed on ~3% of all en.wp articles. That would be highly disruptive.

In the last 5 months, there have been repeated discussions about how overloaded Citation bot is. The latest such thread (see above: #And failure is the usual option again) was started less than 4 days ago by @John Maynard Friedman, who is understandably miffed at the lack of spare capacity in Citation bot. John wants clones of Citation bot to spread the load; that's a great idea in theory, but the bot maintainer has explained yet again why that is very unlikely to happen. So, dumping a backlog of 180k pages onto Citation bot is a recipe for perma-bottleneck. Even if the backlog was throttled to 500 pages per day, that's 360 days of increased overload.

So why on earth not just give this job to a separate bot? BrownHairedGirl (talk) • (contribs) 09:09, 28 December 2021 (UTC)

"As above, blacklisting the tracking category would lock Citation bot out of any work needed on ~3% of all en.wp articles." No it would not. You do not seem to understand the concept of a category blacklist here: The forbidding of doing a dedicated run on Category:Foobar. Headbomb {t · c · p · b} 09:13, 28 December 2021 (UTC)

@Headbomb: Thanks for finally explaining the narrow meaning which you placed on the concept of "category blacklist" in this context.

Any such crude "category blacklist" would be pointless if implemented as a ban on any dedicated run on Category:Foobar, because:

It is superfluous. The maximum allowed size for category jobs is 550, and tracking categories for these two generic titles already exceed that size.
A ban on simply throwing the category name at Citation bot would be easily circumvented simply by listing the pages in the webform, or by using the "linked pages" feature on the webform.

The bottom line here is very simple. Citation bot is massively overloaded, and adding a big extra task will exacerbate that overload ... so give the job to another bot. BrownHairedGirl (talk) • (contribs) 09:23, 28 December 2021 (UTC)

"adding a big extra task will exacerbate that overload" Tasks are given manually, subject to the usual limits. There's zero reasons why this task should be any lower priority than any other, save for your personal preference that other work be done instead. Things are not zero sum games, if another bot wants to tackle this, great, but there's zero reason to kneecap Citation Bot's usefulness on account of another bot. Headbomb {t · c · p · b} 09:26, 28 December 2021 (UTC)

@Headbomb: this is not complicated, and the relevant facts are not a matter of "personal preference":

Citation bot's capacity is already roughly fixed, and we are at the limit.
The other tasks which Citation bot is doing are almost entirely tasks for which Citation bot is the only available bot.

So this is a sort of zero sum game. If we add an extra task to an overloaded bot, some of the existing tasks will suffer.

That is why I argue for a separate bot for the extra task: because it allows all the tasks to be completed faster, by increasing the total throughput. It is quite bizarre that you choose to dismiss as "personal preference" my call for this extra task to be handled in a way that does not exacerbate a well-documented overload problem. BrownHairedGirl (talk) • (contribs) 09:45, 28 December 2021 (UTC)

Anyone is free to code an additional bot. Which is not an argument to kneecap this one's usefulness because you don't personally want it to do certain types of edits. Headbomb {t · c · p · b} 09:47, 28 December 2021 (UTC)

Sigh. @Headbomb, please drop the aggressive hyperbole and the violent imagery. It is uncivil and disruptive. You have recently poisoned another discussion elsewhere with similar tactics; please refrain from repeating those hyperbolic falsehoods here.

Nobody, least of all me, is proposing to "kneecap" Citation bot. No reduction in its functionality is being proposed by anyone, let alone me. Kneecapping is a form of violent maiming which causes a severe reduction in capability, so labelling my objections as "kneecapping" is nonsense.

I have no objection in principle to adding extra functionality to Citation bot. My strong objection is purely pragmatic: that it would add an extra huge task to an already massively-overloaded bot. That would mean either glacially slow progress on the new task, or a worse bottleneck on the existing tasks. That is why I prefer to address the new task by creating new capacity.

I am surprised that you seem so determined to ignore the fact that Citation bot is already overloaded, or why you express that denialism by using such unpleasant imagery to misrepresent me ... but please stop. BrownHairedGirl (talk) • (contribs) 10:01, 28 December 2021 (UTC)

"violent imagery" You really need to have a major WP:AGF/reality check re-calibration if you think any of what I said is related to the literal meaning, and not, and that should be patently obvious, its metaphorical meaning. Headbomb {t · c · p · b} 10:04, 28 December 2021 (UTC)

On the contrary, @Headbomb: you need to have a dictionary check.

Try e.g. Merriam-Webster, Collins, The Free Dictionary, Dictionary.com, or Lexico: all define the word "kneecapping" as an act of violence against the person.

And it is you who needs a major WP:AGF/reality check re-calibration. You have falsely accused me of seeking to create grave injury to Citation bot. That is patently false: there is no way in which anything I have written in this thread could be reasonably or plausibly interpreted in that way.

Your use of violent imagery to misrepresent me is bad enough. But the fact that you then falsely accuse me of needing a reality check is a form of gaslighting. Please stop your vicious bullying tactics. BrownHairedGirl (talk) • (contribs) 10:28, 28 December 2021 (UTC)

PS your claim of metaphorical usage alters nothing. It is a violent metaphor, which is most unpleasant ... and even in its mildest meaning it is in no way a fair description of what I propose.

Please drop the hyperbole and the violent imagery. BrownHairedGirl (talk) • (contribs) 10:32, 28 December 2021 (UTC)

Without getting into the dubious ethics of using inappropriate choice of language (or choice of inappropriate language), the reality right now is that CitationBot is frequently, almost usually, unusable by single-use requestors. I have argued and continue to argue that the batch runs need to be restrained in some way at least until we can get separate instances of the bot. There may be some resolution in sight, see Wikipedia:Village pump (technical)#Is there a ToolForge doctor in the house? CitationBot could use some help. Headbomb may have an argument that a batch run is a batch run is a batch run, so what makes their batch run any less deserving than anyone else's (apart from being a machine-gun to kill grass-hoppers). Right now, it is simply and wildly unrealistic to add another batch run to the load: it won't achieve its own objectives in any useful timescale; it will mean that the same happens to the other batch runs; and it guarantees that individual article requests will invariably fail rather than just usually. There is a WP article about that attitude: WP:DISRUPTIVE. Maybe it is not fair to be labelled as disruptive for just being the one who loaded the last straw, but tough. Headbomb, you need to find another bot that will do what you need, this is not an argument worth winning. --John Maynard Friedman (talk) 14:00, 28 December 2021 (UTC)

Everyone wins when Citation bots get more useful. People choose what batch run they submit. If you don't want your batch run to be used for 'archived title', simply don't submit one. Headbomb {t · c · p · b} 17:35, 28 December 2021 (UTC)

What if you can't run a batch job at all, or can't get an individual page processed, because someone else has swamped Citation bot with this extra task? What do you do about that? BrownHairedGirl (talk) • (contribs) 17:42, 28 December 2021 (UTC)

You wait a bit, and your task gets processed. Unless you want to get greedy and submit multiple batch runs, then you have to wait till your first batch run is processed. No different than the current situation. There's nothing special about this 'extra task'. Headbomb {t · c · p · b} 17:51, 28 December 2021 (UTC)

Not so. The special thing about this extra task is that unlike Citation bot's existing tasks, it doesn't have to be done using Citation bot.

It is interesting to see that you describe my submission of multiple batches of high-return bare URL cleanup jobs as "greedy". BrownHairedGirl (talk) • (contribs) 18:03, 28 December 2021 (UTC)

(edit conflict) No, Headbomb, that does not happen. You click the gadget, wait for five minutes and you get a message to say that it has failed. So you resubmit, wait, same result. And again. And again. So you stop bothering. It seems that batch runs don't fail, they just smell that way. Batch runs are disruptive to ordinary editors right now and the more of them that run, the more disruptive they are - and just get in each other's way too but of course that doesn't matter when you can fire and forget. Meanwhile in the real world... --John Maynard Friedman (talk) 18:08, 28 December 2021 (UTC)

And it eventually gets processed. See for example this run. I requested it at 5:45am or so, then got the timeout page. And then at 6am it started being processed. Sometimes it takes hours, sometimes it's minutes, but batch runs do get processed. Submitting one over and over and over serves no purpose. Headbomb {t · c · p · b} 18:55, 28 December 2021 (UTC)

Calling the caped crusader

@Rlink2: if we give you a snazzy cape and your own special car, please can you help out here? Your mission is to make a new bot which takes any CS1/CS2 template with |title=Archived copy or |title=usurped title, looks up the linked archived copy, and extracts a meaningful title from the contents of <title>Page Title</title>.

Please come to our rescue! Your reward in gold will be in the usual place --BrownHairedGirl (talk) • (contribs) 07:28, 28 December 2021 (UTC)
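The core of the requested task can be sketched like this, assuming the archived page's HTML has already been fetched (the download step is left out). It uses only Python's standard `html.parser`; the names `GENERIC_TITLES`, `needs_title`, `TitleParser` and `extract_title` are hypothetical and do not describe any existing bot.

```python
from html.parser import HTMLParser

# Placeholder titles named in this thread, lowercased for comparison.
GENERIC_TITLES = {"archived copy", "usurped title"}


def needs_title(value):
    """True if |title= holds one of the generic placeholders."""
    return value.strip().lower() in GENERIC_TITLES


class TitleParser(HTMLParser):
    """Collects the text of the first <title> element in a page."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self._chunks = []
        self.title = None

    def handle_starttag(self, tag, attrs):
        if tag == "title" and self.title is None:
            self._in_title = True

    def handle_data(self, data):
        if self._in_title:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            # Collapse runs of whitespace, including line breaks.
            self.title = " ".join("".join(self._chunks).split())


def extract_title(html):
    """Return the page title, or None if the page has no usable <title>."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.title or None


page = "<html><head><title> Storms lash\n Hightown </title></head></html>"
if needs_title("Archived copy"):
    print(extract_title(page))
```

Deciding whether the extracted string is fit to use as a citation title is a separate problem that this sketch does not address.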

@BrownHairedGirl: @GreenC: Sorry for the delay, and thanks for the humor. I just created a script that can handle this. It will be able to extract the URL titles from web.archive.org and ghostarchive.org. Webcite I can do when the site is back up. There are some edge cases I have yet to code in, but it mostly works. See diffs: Special:Diff/1062447127, Special:Diff/1062447385, Special:Diff/1062447728 and Special:Diff/1062447750.

Note that it would not work for PDF files, such as the one in 2G spectrum case. It will not work with archive.today URLs either (This is not my fault), so GreenC when replacing links always try to use web.archive.org if placing "archived title". Rlink2 (talk) 14:35, 28 December 2021 (UTC)

Holy moley, @Batman, that was fast!

I checked all 4 diffs, and in each case the result looks good. It is not perfect, because some of the websites abuse the title field to advertise the site, like <title>Storms lash Hightown {{!}} ZYX News, the leading local news service for YOUR town!</title> ... which causes the cite template to have |title=Storms lash Hightown | ZYX News, the leading local news service for YOUR town! rather than just |title=Storms lash Hightown.

But coding to strip this sort of junk is a huge job, and in my view it's much better to have such a verbose title than to just have a generic placeholder. Editors can manually trim such fluff if they have the time.

One enhancement would be useful if it's not too much work: add a |website= parameter, unless it (or |work=/|magazine=/|newspaper=) is already present. It seems to me that this should be a relatively easily-coded enhancement, but please ignore this request if it's too much hassle.

Thanks again for the very prompt response. BrownHairedGirl (talk) • (contribs) 14:59, 28 December 2021 (UTC)

@BrownHairedGirl: I can add the website parameter, but I do not know much about it. Does the name of the website go there? Documentation regarding this would be helpful Rlink2 (talk) 15:06, 28 December 2021 (UTC)

@Rlink2: see Template:Cite_web#Website. Basically, a simple and widely-used way of filling it is just to use the domain name, i.e. the text between the 2nd and 3rd slashes in the URL, e.g. in |, just use "www.example.com": |website=www.example.com.

Editors or other bots may later replace that with something more informative (e.g. Citation bot will replace |website=www.washingtonpost.com with |newspaper=[[The Washington Post]]) ... but |website=www.washingtonpost.com is way more useful than no website field. BrownHairedGirl (talk) • (contribs) 15:21, 28 December 2021 (UTC)

Do not use the domain name in the |website= field. I have been admonished by multiple people not to do this. If you are unsure ask at Help talk:Citation Style 1 first. -- GreenC 16:12, 28 December 2021 (UTC)

@GreenC: I have used the domain name in thousands of refs. WP:Reflinks does it automatically. It's better than having nothing to identify the website. BrownHairedGirl (talk) • (contribs) 17:26, 28 December 2021 (UTC)

Actually I think it was on this talk page that someone told me not to do it because I made a feature recommendation that Citation bot could do it. Is reflinks still well maintained? Old tools do things that no longer have good support and no one actively fixing them. Update: here it is: User_talk:Citation_bot/Archive_28#Adding_website_field -- GreenC 18:06, 28 December 2021 (UTC)

Not adding the domain name and thereby leaving the website field as blank or missing is letting the best be the enemy of the good. BrownHairedGirl (talk) • (contribs) 18:14, 28 December 2021 (UTC)

PS your update crossed with my post. So, one editor objected, with no claim of any consensus for their view, let alone evidence. BrownHairedGirl (talk) • (contribs) 18:17, 28 December 2021 (UTC)

No doubt, but there is established controversy so clarification at Help talk:Citation Style 1 would be advisable before making mass edits to avoid the blow back it might cause. Particularly without bot approval. -- GreenC 18:33, 28 December 2021 (UTC)

@GreenC: I understand your desire to avoid drama. But, as a general principle, it is horribly bureaucratic to be pushed to debate every step of incremental progress against those who prefer no progress to an incomplete improvement. BrownHairedGirl (talk) • (contribs) 19:15, 28 December 2021 (UTC)

I agree with @BrownHairedGirl:'s approach to Wikipedia 100 percent. I also believe in the ideas of incremental improvement even if the solution isn't perfect all the time. Rlink2 (talk) 19:26, 28 December 2021 (UTC)

A bot operator would be wise to check out why this admin said don't do it, before proceeding at a mass scale. There might be a consensus discussion somewhere we don't know about. I would not assume the admin was being bureaucratic to avoid progress. -- GreenC 19:33, 28 December 2021 (UTC)

I agree with BHG's analysis and conclusion. Furthermore, the documentation for {{cite web}} places no such constraint on what may be given in website=, though none of the examples use the fully qualified domain. (What they do do is give an example argument that matches more to my conception of work=, giving website=Encyclopedia of Things, which is surely a work.) Take a contrarian example: Amazon Inc has multiple websites, simplistically by language but also in product offering. So it is certainly useful, probably important, to know that the website is amazon.de, amazon.es, or amazon.co.uk. I don't know who decided that website=<domain> is deprecated but it is not policy and if it's a rule, it is certainly one to be ignored when the circumstances suggest otherwise, as they do in this case. --John Maynard Friedman (talk) 19:36, 28 December 2021 (UTC)

Rlink2, given this particular example, the script should catch | and swap it to {{!}}. Izno (talk) 00:21, 29 December 2021 (UTC)

Thanks for the heads up Izno. I believe the script already does this, but I will double check to make sure. Rlink2 (talk) 00:30, 29 December 2021 (UTC)
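The swap suggested here is a one-line substitution. The sketch below is in Python with a hypothetical function name, and is not taken from Rlink2's script. A literal pipe inside a template parameter would be read as a parameter separator, which is why it is replaced with the {{!}} magic word.

```python
def escape_for_template(title):
    """Replace literal pipes so the title is safe inside a cite template."""
    return title.replace("|", "{{!}}")


raw = "Storms lash Hightown | ZYX News, the leading local news service for YOUR town!"
print("|title=" + escape_for_template(raw))
```

Running the function on text that is already escaped leaves it unchanged, because {{!}} itself contains no pipe.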

It is technically trivial to extract the title field from a page. And completely difficult to make sure that title string is appropriate for use on Wikipedia. That's why no one does it. Go slowly. Build up rules on what kind of material to keep and keep-out based on experience. -- GreenC 16:11, 28 December 2021 (UTC)

I will, thanks for the tip. I always go slow at first with every new thing I do. This is no different. Rlink2 (talk) 16:24, 28 December 2021 (UTC)

|website=domain is problematic for a huge variety of reasons. For example, if [26] were archived, we wouldn't want |website=mdpi.com, but rather |journal=Religions. Or if this were archived, we wouldn't want |website=books.google.ca. Headbomb {t · c · p · b} 20:05, 28 December 2021 (UTC)

Hard cases make bad law. url=books.google.abc is an obvious exception that goes in the exclusion list. Ditto Archive.org, archive.is, archive.today etc. Academic publishers like mdpi, Wiley etc - well, yes, what is so terrible about the first pass giving these as the website=, after all that is indeed the relevant website – does the journal Religions have any other (or rather one more relevant)? A 'first pass, low impact' bot can deal with the 95% that are straight-forward and tag the others for CitationBot to give the deep-cleanse treatment, finding the doi etc etc. --John Maynard Friedman (talk) 20:46, 28 December 2021 (UTC)
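The "first pass" with an exclusion list can be sketched as below, using Python's standard `urllib.parse`. The function name and the two tuples are hypothetical and contain only the hosts named in this thread. Whether the domain should go in |website= at all is disputed above, so this shows only the mechanics.

```python
from urllib.parse import urlsplit

# Hosts named in this thread as obvious exceptions (illustrative, not complete).
EXCLUDED_SUFFIXES = ("archive.org", "archive.is", "archive.today")
EXCLUDED_PREFIXES = ("books.google.",)


def website_from_url(url):
    """Return the host part of a URL for |website=, or None if it is excluded."""
    host = urlsplit(url).hostname
    if not host:
        return None
    if host.startswith(EXCLUDED_PREFIXES):
        return None
    if any(host == s or host.endswith("." + s) for s in EXCLUDED_SUFFIXES):
        return None
    return host


print(website_from_url("https://www.example.com/some/page"))
print(website_from_url("https://books.google.ca/books?id=VXQXzUbCSygC"))
```

Returning None for an excluded host leaves the citation for a later, more thorough pass, which matches the two-stage approach proposed in the comment above.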

The point is those 'hard cases' are extremely common. As are cases like

J. I. Friedman. "The Road to the Nobel Prize". Huế University. Archived from the original on 2008-12-25. Retrieved 2008-09-29.

which should not get |website=hueuni.edu.vn. Headbomb {t · c · p · b} 21:31, 28 December 2021 (UTC)

The above should not be a {{cite web}} but rather a different cite template. "Garbage in, garbage out" should not be a determining factor on what to try and not try. Jonatan Svensson Glad (talk) 22:13, 28 December 2021 (UTC)

Cite web is exactly the template appropriate for this. Headbomb {t · c · p · b} 22:49, 28 December 2021 (UTC)

Hmm, didn't actually open the link (that was my bad), I thought it would have been a journal article. However, I don't see the issue with adding |website=hueuni.edu.vn as long as no |publisher= would have existed. Jonatan Svensson Glad (talk) 00:07, 29 December 2021 (UTC)

@Headbomb: On what basis could you say that |website=hueuni.edu.vn is wrong? That is exactly where the story was first posted. [And if we already have that much info in the citation, what is there left for a bot to do?]

So the result would be

J. I. Friedman. "The Road to the Nobel Prize". hueuni.edu.vn. Huế University. Archived from the original on 2008-12-25. Retrieved 2008-09-29.

Apart from the redundant detail, what is so terrible about it? (Personally I have yet to see any value added by the website= option and actually am more worried that it so open to domain spoofing, where the actual url= says myscambank.com and the website= says myfriendlylocalbank.com and we walk an unsuspecting visitor into a trap and get sued, but I guess that is an argument for another place.) --John Maynard Friedman (talk) 01:24, 29 December 2021 (UTC)

TBH, Headbomb, your example strikes me as a straw man. What problem are we trying to solve here? Let's suppose we have a citation like

{{cite web |title=Foundation of Smallville |url=https://www.smallvillehistory.ky.us |website=Smallville History Society }}

which yields

"Foundation of Smallville". Smallville History Society.

but the web page fell into disuse, the domain registration was not renewed and a gambling site reregistered it to redirect to their site. So what we really want to do is mark it as dead and get the last good archived copy (can that be done automatically?) so that visitors only see the archived version and not the redirect site. Yes, we definitely don't want to introduce a website=smallvillehistory.ky.us because that leads to the gambling site, but who says we need to supply one if it is not already present? In fact we need to remove anything like a website=<domain name> because we know it is invalid. --John Maynard Friedman (talk) ~~01:54, 29 December 2021 (UTC)~~ revised 02:13, 29 December 2021 (UTC)

|website=Smallville History Society is dead wrong. The Smallville History Society is not a website. It's the publisher. Headbomb {t · c · p · b} 02:19, 29 December 2021 (UTC)

That is a bizarre statement.

|website=Smallville History Society does not assert that "Smallville History Society" is a website. It asserts that the URL is on the website of the Smallville History Society, which is branded as "Smallville History Society". BrownHairedGirl (talk) • (contribs) 02:31, 29 December 2021 (UTC)

That's very literally what it asserts. Headbomb {t · c · p · b} 10:29, 29 December 2021 (UTC)

I would have used |work=News & events for the Hue University example citation: J. I. Friedman. "The Road to the Nobel Prize". News & events. Huế University. Archived from the original on 2008-12-25. Retrieved 2008-09-29. It's not a particularly helpful part of the citation, but it's not as bad as putting the url as the work and it preempts other editors from doing the wrong thing. —David Eppstein (talk) 02:05, 29 December 2021 (UTC)

Except that News & events is not the work. There is no publication, nor work, called that. It's a section of the main university website. Headbomb {t · c · p · b} 02:18, 29 December 2021 (UTC)

It is the "News & events" section of the main university website. That's why it says "News & events" on the page, right above the title of the individual page, and why it has "News & events" listed as one of the main sections of the university website in the link bar at the top of the page. It is the highest-level point of organization of web content that is in any way useful to distinguish from the organization that published the content. And as I tried to say in my earlier comment that I replied to but you seem to have missed, the point is less to name the whole website and more to find something plausible to use for that slot so that bad editors do not fill it with bad content like the url hostname. —David Eppstein (talk) 07:22, 29 December 2021 (UTC)

"more to find something plausible" which is exactly what it shouldn't do. If you click on [ Structure of Hue university ] instead, the work/website doesn't all of a sudden become Structure of Hue university. There is no larger work here, and we should not shoehorn one simply because a parameter exists in a template. Headbomb {t · c · p · b} 10:28, 29 December 2021 (UTC)

And failure is the usual option again

I really don't see the point of encouraging editors to install the gadget when most uses of it end in failure, which has been my experience every time I've tried to use it this week. At the same time, I have no problems running refill, so it is not toolforge. Can we have a completely separate process instance for gadget users, please, and let the batch runners fight it out between themselves? --John Maynard Friedman (talk) 12:57, 24 December 2021 (UTC)

"it is not toolforge": that is incorrect. When Refill is being used by several people, it fails also. That tool has way too many bugs to run as a bot, which is why it is used so much less. AManWithNoPlan (talk) 14:47, 24 December 2021 (UTC)

a second instance would be very nice just for the gadget. Someone with access to toolforge would have to do it. Once spawned, I could modify the code to refuse non-gadget runs on that interface. AManWithNoPlan (talk) 14:51, 24 December 2021 (UTC)

and a third instance would be very nice for single pages too. AManWithNoPlan (talk) 15:26, 24 December 2021 (UTC)

If you were to create another bot, for argument let's call it CitationBatchBot, code-identical to the current bot, would that de facto create another instance without needing to hack toolforge? The very few batch users could be 'persuaded' to use that one, leaving the original for gadget and command-line (sic?) editors. True? --John Maynard Friedman (talk) 20:06, 24 December 2021 (UTC)

Someone with access would need to do this. I do not have access and my toolforge account is in some weird limbo state and unusable. AManWithNoPlan (talk) 20:10, 24 December 2021 (UTC)

Is there a page where I can request other editors to run the bot for articles I edit? I think it will work if I can put my (our) request in the to-do list of batch runners. --SilverMatsu (talk) 02:56, 7 January 2022 (UTC)

Adding year breaks sfn templates

Status: new bug
Reported by: GoingBatty (talk) 23:08, 29 January 2022 (UTC)

What happens: Adding a year to a reference breaks the functionality of the corresponding {{sfn}} template that doesn't contain the year, placing the article in Category:Harv and Sfn no-target errors
What should happen: Skip the article, or add the year to the corresponding {{sfn}}
Relevant diffs/links: This edit to Los Frailes ignimbrite plateau
We can't proceed until: Feedback from maintainers

This has been brought up before. The conclusions were that having the bot edit {{sfn}} was bad and that the {{sfn}}'s were defective since they contain no year and thus are hyper-non-specific. AManWithNoPlan (talk) 23:34, 29 January 2022 (UTC)

This is GIGO. Bot fixes the citation, trading one invisible error for a visible error. Would be nice if the bot could fix SFNs, but it's not required. Humans can do that cleanup better than bots can usually. Headbomb {t · c · p · b} 15:29, 4 February 2022 (UTC)

Cosmetic bot, again again

Yet another series of cosmetic edits by the bot, which yet again appears to have been chosen by the bot's maintainer on the basis of this cosmetic task

The bot's latest contribs list shows a batch job of 1605 pages suggested by @AManWithNoPlan. All of the edits which I have checked in this batch include the cosmetic change of |p= to |page=, which is a purely cosmetic change: in CS1/CS2 templates |p= is an alias of |page=, so the change has zero effect on the output of the template.

See e.g. these 16 edits which consist solely of |p= to |page=: [27], [28], [29], [30], [31], [32], [33], [34], [35], [36], [37], [38], [39], [40], [41], [42] ... and this edit[43] which is just |p= to |page= plus trimming redundant whitespace

Per WP:COSMETICBOT, such changes may be made as part of a substantive edit, but not on their own.

I understand that avoiding cosmetic-only edits may not be easy to code for. But it's very unhelpful for anyone to use a cosmetic issue as the basis for selecting a batch, and when the bot's maintainer does that it looks like they are exploiting the bug. Not good.

@AManWithNoPlan, please can you kill this cosmetic batch job? BrownHairedGirl (talk) • (contribs) 22:10, 7 February 2022 (UTC)

It's still happening. In the latest set of bot edits, 13 of the first 20 edits are in @AManWithNoPlan's batch. At least 6 of those 13 are only |p= to |page=: [44], [45], [46], [47], [48], [49]. BrownHairedGirl (talk) • (contribs) 03:18, 8 February 2022 (UTC)

Odd, that run list came from dozens of different regex and searches. AManWithNoPlan (talk) 12:15, 8 February 2022 (UTC)
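On the point that cosmetic-only edits "may not be easy to code for": one rough way to flag them is to normalize known parameter aliases and whitespace on both revisions and compare the results. The sketch below is not Citation bot's code. The `ALIASES` table is a deliberately tiny, hypothetical subset of the CS1/CS2 aliases, and the regular expression ignores complications such as pipes inside wikilinks.

```python
import re

# Hypothetical, incomplete alias table (alias -> canonical name).
ALIASES = {"p": "page", "pp": "pages"}

# Matches "|name=" with optional whitespace around the name and the equals sign.
PARAM_RE = re.compile(r"\|\s*([^=|{}]+?)\s*=\s*")


def normalize(wikitext: str) -> str:
    """Rewrite aliased parameter names to canonical ones and collapse whitespace."""
    def canon(match: re.Match) -> str:
        name = match.group(1)
        return "|" + ALIASES.get(name, name) + "="

    text = PARAM_RE.sub(canon, wikitext)
    return re.sub(r"\s+", " ", text).strip()


def is_cosmetic_only(old: str, new: str) -> bool:
    """True when the revisions differ only by alias names and whitespace."""
    return old != new and normalize(old) == normalize(new)


old = "{{cite book |title=X |p=5}}"
new = "{{cite book |title=X |page=5}}"
print(is_cosmetic_only(old, new))
```

A check along these lines could be run before saving, so that an edit is skipped when nothing but aliases and spacing would change.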

Ballotpedia.org

Status: Fixed
Reported by: Fettlemap (talk) 06:08, 8 February 2022 (UTC)

What happens: Changes Ballotpedia.org from cite web to news
What should happen: no change
Relevant diffs/links: https://en.wikipedia.org/wiki/Special:MobileDiff/1070573229

Caps: AIDS Research and Human Retroviruses

Status: Fixed
Reported by: Headbomb {t · c · p · b} 20:29, 8 February 2022 (UTC)

What should happen: [50]

Publication dates for pages on LeatherLicensePlates.com

Status: Not a bug
Reported by: Klondike53226 (talk) 17:08, 17 February 2022 (UTC)

Citation bot is adding publication dates for pages on LeatherLicensePlates.com when, AFAIK, these pages do not state said dates anywhere that is obvious - which makes me wonder *exactly* how this bot is coming up with these dates, and even wonder whether it is simply making these dates up.

On Vehicle registration plates of Nevada, which uses LeatherLicensePlates.com's Nevada page as one of its sources, Citation bot has *twice* added August 27, 2015 as the publication date for this page:

But nowhere obvious on the page does it say that the page was published August 27, 2015:

http://leatherlicenseplates.com/old-nevada-license-plates-vintage-nevada-license-plates/

Citation bot has also added August 27, 2015 as the publication date for LeatherLicensePlates.com's Connecticut, Delaware, District of Columbia, Indiana, Missouri, Nebraska, New Jersey, and New York pages:

Again, however, nowhere obvious on any of these pages is August 27, 2015 stated as the publication date:

Finally, Citation bot has added August 28, 2015 as the publication date for LeatherLicensePlates.com's Oregon and South Dakota pages...

...and *three times* has added this particular date as the publication date for the site's Ohio page.

Once again, though... I can't see this date stated anywhere obvious on any of these pages.

Obviously, I myself am not a bot, and my intelligence isn't artificial - but with respect where respect is due, I would *never* put a date on when a source was published if I didn't know what that date was, that date wasn't stated anywhere obvious on the source, and there was no easy way at all of determining that date.

Hence - and again with all due respect - I would quite like to know exactly how Citation bot is coming up with these particular dates for the pages on LeatherLicensePlates.com. I'd be pretty shocked if it turned out that this bot was making these dates up...

Klondike53226 (talk) 17:08, 17 February 2022 (UTC)

If you right-click on a webpage and select 'View Page Source', you will see a date if the author added it into the html code. But you are correct that the date is not displayed on the webpages above. I myself have occasionally used the embedded/hidden date when manually creating a ref, and am curious if there is a general guideline on this practice. Abductive (reasoning) 17:17, 17 February 2022 (UTC)

More generally, what HTML metadata can a wiki bot legitimately use for {{cite web}}? --Shmuel (Seymour J.) Metz Username:Chatul (talk) 17:49, 17 February 2022 (UTC)

Not the bot's fault. In the wikitext editor from the Cite menu > Templates choose cite web. Put this url in the URL field: http://leatherlicenseplates.com/old-nevada-license-plates-vintage-nevada-license-plates/ and click on the adjacent quizzing glass icon. Wait a bit and when other fields are filled, click Show/hide extra fields and there in Date will be 27 August 2015. Doing these things returns this template:

{{cite web |title=Old Nevada License Plates {{!}} Vintage Nevada License Plates |url=http://leatherlicenseplates.com/old-nevada-license-plates-vintage-nevada-license-plates/ |website=LeatherLicensePlates.com |date=27 August 2015}}

"Old Nevada License Plates | Vintage Nevada License Plates". LeatherLicensePlates.com. 27 August 2015.

which (sigh) is malformed because the website name does not belong in |title=...

—Trappist the monk (talk) 18:07, 17 February 2022 (UTC)

@Abductive, Chatul, and Trappist the monk: Ah, *now* I see. :) I see, too, that the HTML metadata for each of LeatherLicensePlates.com's pages also contains the date on which the page was last modified.

Must admit, though, that I find myself wondering as well if there is a guideline for when the publication date is stated in the page's HTML metadata but not on the page itself. Klondike53226 (talk) 18:15, 17 February 2022 (UTC)
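For anyone wanting to see where such a hidden date can come from, here is a minimal sketch that pulls date-like `<meta>` tags out of a page's HTML using only the Python standard library. It is an illustration, not how Citation bot or the cite tool actually does it. The keys in `DATE_KEYS` are assumptions about commonly used metadata names, and the sample HTML is invented rather than copied from LeatherLicensePlates.com.

```python
from html.parser import HTMLParser

# Assumed metadata keys; real pages may use others (or none at all).
DATE_KEYS = ("article:published_time", "article:modified_time")


class MetaDateParser(HTMLParser):
    """Collects the content of <meta> tags whose key looks like a date field."""

    def __init__(self) -> None:
        super().__init__()
        self.dates: dict[str, str] = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        key = (attr.get("property") or attr.get("name") or "").lower()
        if key in DATE_KEYS and attr.get("content"):
            # Keep the first value seen for each key.
            self.dates.setdefault(key, attr["content"])


def embedded_dates(html: str) -> dict[str, str]:
    parser = MetaDateParser()
    parser.feed(html)
    return parser.dates


# Invented sample page: the date is in the head, not in the visible body.
sample = (
    '<html><head>'
    '<meta property="article:published_time" content="2015-08-27T10:00:00+00:00">'
    '<meta property="article:modified_time" content="2020-01-01T00:00:00+00:00">'
    '</head><body><p>No visible date here.</p></body></html>'
)
print(embedded_dates(sample))
```

Keeping the published and modified values separate matters for the guideline question raised above, since only the former would plausibly belong in |date=.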

Community Wishlist Survey 2022: More capacity for Citation bot

See my proposal at https://meta.wikimedia.org/wiki/Community_Wishlist_Survey_2022/Citations/More_capacity_for_Citation_bot

If you support this idea, or have suggestions for improving the proposal, please post at https://meta.wikimedia.org/wiki/Community_Wishlist_Survey_2022/Citations/More_capacity_for_Citation_bot BrownHairedGirl (talk) • (contribs) 18:45, 10 January 2022 (UTC)

I support the idea, but I'm not sure it's anything the WMF can do anything about. Or would be willing to, given they only filled one relatively trivial request last year. Headbomb {t · c · p · b} 19:49, 10 January 2022 (UTC)

@Headbomb: the WMF certainly can do something. In the year to end of June 2021 the WMF's income was US$163 million and its expenditure was US$112 million, so it had a US$51 million surplus. With net assets of US$231 million, the WMF is in a very strong financial position.

Employing a few skilled programmers on the community tech team wouldn't cost even 0.5% of that 2020/2021 surplus.

It may indeed be the case that the WMF is not willing to spend their money to assist editors. But they could if they wanted to. BrownHairedGirl (talk) • (contribs) 04:50, 11 January 2022 (UTC)

@BrownHairedGirl and @Headbomb: we could also spend Google's money; see WP:VPT#Proposed Google Summer of Code project: expanding citations. Enterprisey (talk!) 08:03, 12 January 2022 (UTC)

italics

Why does citation bot add italicization to Associated Press and Reuters as seen here? Our own articles about those news agencies don't italicize them. — Fourthords | =Λ= | 02:58, 24 December 2021 (UTC)

Hello, fourthords,

You might try asking the bot operator. Being a bot, it won't be replying to inquiries here. Liz ^{Read! Talk!} 03:43, 24 December 2021 (UTC)

Since my inquiry was about this bot's edits, this seemed the most appropriate place to ask. Apparently plenty of editors (and possibly the bot's programmer, somewhere) are watching this page. — Fourthords | =Λ= | 17:22, 24 December 2021 (UTC)

Because |agency= is to be used when the work of Reuters or AP (and other agencies) is republished in another publisher's work (typically a newspaper). When Reuters or AP (and other agencies) is cited directly, then the source is the 'work'. We cite the work not the corporate entity. The en.wiki articles are not italicized because the articles are about the corporate entities. In both of these cases, the corporate entities have eponymous websites that are the sources so those names go in |work= when citing their articles directly.

—Trappist the monk (talk) 03:58, 24 December 2021 (UTC)

Should, then, this script not be performing its edits in contravention of this bot? — Fourthords | =Λ= | 17:22, 24 December 2021 (UTC)

The Associated Press is an organization, not a collection of documents, and should therefore be listed under |via= or |publisher=, not |work=. Organization names are not italicized; periodicals, edited volumes, websites, or other collections of documents are. The current name of the collection of documents that the Associated Press publishes appears to be AP News. If "Associated Press" is being used in the work parameter, it is being used incorrectly there. If the bot is moving "Associated Press" to the work parameter without changing it to "AP News" or some similar name for the work rather than the organization, it is doing the wrong thing and should stop. —David Eppstein (talk) 18:59, 24 December 2021 (UTC)

Ah, that seems to be in contravention of what Trappist the monk (talk · contribs · blocks · protections · deletions · page moves · rights · RfA) said, [AP and Reuters] have eponymous websites that are the sources so those names go in |work= when citing their articles directly. Is there an explicit MOS or guideline that says one way or another, then? — Fourthords | =Λ= | 19:19, 24 December 2021 (UTC)

I presume that Editor David Eppstein did not intend to write: Organization names are not capitalized (emphasis added)

Editor David Eppstein and I rarely agree on anything but in this case, for the most part, I think that we agree. The Associated Press is an organization, my term was 'corporate entity'. We don't cite organizations or corporate entities, we cite their work. The Associated Press has an online presence at AP News (I hadn't bothered to look – Reuters has an eponymous online presence). That name for the collection of documents is italicized when one of the documents that it holds is cited. AP News is sufficiently similar to the corporate name that it is not necessary to write |publisher=The Associated Press (|via=The Associated Press should not be used for work distributed from AP News because AP News is the publisher's outlet).

I do not know of any MOS or guideline covering this though the topic is surprisingly volatile with entrenched camps on both sides of the italic/no-italic divide. There is some, reasonably stable text at Help:Citation Style 1 § Work and publisher.

—Trappist the monk (talk) 20:19, 24 December 2021 (UTC)

Typo fixed; I meant "italicized" not "capitalized". —David Eppstein (talk) 20:23, 24 December 2021 (UTC)

I think news agencies like Reuters and Associated Press were exactly what was intended for use by the agency parameter in cite news and not the via parameter. Could be publisher, website or work if you're actually citing a page on the Reuters or AP website. But if you're citing a news source where some other organization like NBC, Fox News, Washington Post or whatever is publishing an AP story, then agency is the best parameter. — Chris Capoccia 💬 21:31, 10 February 2022 (UTC)

Adds URL instead of Project MUSE parameter

Status: feature request
Reported by: — Chris Capoccia 💬 17:41, 20 November 2021 (UTC)

What happens: expanding doi 10.3751/69.3.12 adds URL, but seems like better choice would be to use Project MUSE template with id parameter, Project MUSE 586504
Relevant diffs/links: diff
We can't proceed until: Feedback from maintainers

URL is better unless the identifier auto-links. Nemo 22:10, 21 November 2021 (UTC)

what do you mean by "auto-links"? — Chris Capoccia 💬 15:22, 27 November 2021 (UTC)

Some identifiers can automatically add themselves to the title, when |muse-access=free is present with the |muse=12345678. AManWithNoPlan (talk) 23:02, 8 December 2021 (UTC)

Simplewiki

Simplewiki is full of thousands of easily fixed CS1 errors, bare urls, and basically all the sorts of stuff that this bot is supposed to fix. Simplewiki uses the same template parameters on the cite templates, so why doesn't this bot run there? Is there something I'm missing here? Please add support for simplewiki? Mako001 (C) (T) 12:26, 20 February 2022 (UTC)

They are not the same citation templates. Similar, but not the same. Several years older, that might be an issue. The gadget mode probably just works if you add it to your common.js file - other than it most likely timing out and failing because the bot is too busy. Might also need different OAuth tokens for non-bot mode. AManWithNoPlan (talk) 12:57, 20 February 2022 (UTC)

"They are not the same citation templates. Similar, but not the same. Several years older, that might be an issue." Are you sure about that? simple:Module:Citation/CS1 was updated 24 January 2022, and simple:Template:cite book, simple:Template:cite web, simple:Template:cite news, simple:Template:cite journal, and simple:Template:cite encyclopedia all appear to be up to date.

—Trappist the monk (talk) 13:07, 20 February 2022 (UTC)

cite arxiv is not. AManWithNoPlan (talk) 13:27, 20 February 2022 (UTC)

Gadget works - just verified. AManWithNoPlan (talk) 14:04, 20 February 2022 (UTC)

Making some changes to make port possible. Considering just how few pages use CS1/CS2 templates, probably no need for a separate bot instance. AManWithNoPlan (talk) 16:50, 20 February 2022 (UTC)

If one believes the {{NUMBEROFARTICLES}} magic word at simple.wiki, there are 204,780 articles. If one believes the results of this search, there are approximately 102,400 articles that use a cs1|2 template. So, roughly half of all articles use cs1|2. I suspect that that number will change as simple.wiki gains more articles.

I was looking at cite journal usage, which is quite low. AManWithNoPlan (talk) 17:27, 20 February 2022 (UTC)

Easily remedied. Editor Djsasso, who imported the module suite and some or all of the mated templates, can simply import the current version of {{cite arxiv}} from en.wiki.

—Trappist the monk (talk) 17:08, 20 February 2022 (UTC)

Code implemented. To avoid usage on incompatible wikis, the code has an array of safe wikis. Currently en and simple only. https://citations.toolforge.org/process_page.php?edit=toolbar&wiki_base=simple&page=Water (so wiki_base defaults to en, but can be simple). But, the bot does not have the ability to actually commit edits (Write error: We encountered a captcha, so can't be properly logged in) and barfs out the wikitext to the user, so they can hand copy and paste. AManWithNoPlan (talk) 21:57, 20 February 2022 (UTC)
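The "array of safe wikis" idea amounts to an allowlist check on the wiki_base parameter. The bot's own implementation is not shown in this thread, so the following is only a sketch of the pattern: `SAFE_WIKIS` and `api_base` are hypothetical names, and the only facts taken from the comment above are the two allowed values and the default of en.

```python
# Allowlist taken from the comment above: currently en and simple only.
SAFE_WIKIS = ("en", "simple")


def api_base(wiki_base: str = "en") -> str:
    """Return the API endpoint for an allowed wiki, defaulting to en.

    Anything outside the allowlist is rejected rather than guessed at, so the
    tool cannot be pointed at a wiki whose templates it has not been checked
    against.
    """
    if wiki_base not in SAFE_WIKIS:
        raise ValueError(f"wiki_base {wiki_base!r} is not in the safe list")
    return f"https://{wiki_base}.wikipedia.org/w/api.php"


print(api_base())          # default wiki
print(api_base("simple"))  # explicitly allowed wiki
```

Rejecting unknown values outright, instead of falling back silently to en, makes a mistyped wiki_base visible to the user instead of editing the wrong wiki.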

https://simple.wikipedia.org/w/index.php?title=Ecology&type=revision&diff=8044067&oldid=7992470
https://simple.wikipedia.org/w/index.php?title=Apple&diff=prev&oldid=8044066
https://simple.wikipedia.org/wiki/User:AManWithNoPlan/common.js

It appears that the Bot does not have an account on simple: https://simple.wikipedia.org/wiki/User:Citation_bot AManWithNoPlan (talk) 22:41, 20 February 2022 (UTC)

Doing some further research, although if some admin could just create account, that would be great. AManWithNoPlan (talk) 23:24, 20 February 2022 (UTC)

Any simplewiki admin can forcibly create the account by going to simple:Special:CreateLocalAccount. I don't think there are any watching this page, though. * Pppery * _{it has begun...} 05:17, 21 February 2022 (UTC)

I'll make a request there and link to this. Mako001 (C) (T) 14:13, 22 February 2022 (UTC)

Ah, it actually does? But no userpage? It has made edits before though, these seem to be typical ref fixes. Mako001 (C) (T) 14:17, 22 February 2022 (UTC)

That fooled me too. Those edits are from imported history from en.wikipedia.org. AManWithNoPlan (talk) 14:28, 22 February 2022 (UTC)

Seems to exist now, but the bot often runs into a captcha. AManWithNoPlan (talk) 15:17, 22 February 2022 (UTC)

Functioning, but people should wait upon final approval before using a lot: https://simple.wikipedia.org/wiki/Wikipedia_talk:Bots#Citation_bot AManWithNoPlan (talk) 19:00, 22 February 2022 (UTC)

Fixed - done. AManWithNoPlan (talk) 19:18, 22 February 2022 (UTC)

Double edit needed during series cleanup

Status: Fixed
Reported by: Headbomb {t · c · p · b} 13:50, 24 February 2022 (UTC)

What happens: [51] + [52]
What should happen: [53] (combined edit)->

Incorrectly changes capitalization of initials

Status: Fixed after git update
Reported by: Umimmak (talk) 00:00, 26 February 2022 (UTC)

What happens: The bot lowercases the initial "E." in periodical title such as Contributions from the E. M. Museum of Geology and Archæology of Princeton College [54]
What should happen: It should leave the capitalization alone
Relevant diffs/links: https://en.wikipedia.org/w/index.php?diff=1073983213&oldid=1072153166&title=Amynodontidae
We can't proceed until: Feedback from maintainers

Hello

Can you edit Draft:Spider-Man (Takuya Yamashiro) please? Blackknight1234567890 (talk) 09:12, 4 March 2022 (UTC)

Job stalled

My big batch job has stalled since this edit[55] at 11:16, and https://citations.toolforge.org/kill_big_job.php doesn't kill it.

@AManWithNoPlan, please can you kill this job? BrownHairedGirl (talk) • (contribs) 13:57, 9 March 2022 (UTC)

All evidence points to a job die-off. My big job died also. AManWithNoPlan (talk) 14:14, 9 March 2022 (UTC)

I have rebooted the bot. AManWithNoPlan (talk) 14:15, 9 March 2022 (UTC)

Many thanks, @AManWithNoPlan ... both for the action and for its promptness. You are a star! BrownHairedGirl (talk) • (contribs) 14:23, 9 March 2022 (UTC)

Changes agency=Reuters to work=Reuters

Status: Not a bug
Reported by: SchreiberBike | ⌨ 00:45, 4 March 2022 (UTC)

What happens: Changes agency=Reuters to work=Reuters
What should happen: Reuters should stay at agency
Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=French_fries&diff=1075100068&oldid=1074815682

In the article French fries, the bot made the change above. Reuters is not a publication, but a news agency. Putting it in a "work" parameter italicizes it in the reference and is an error. SchreiberBike | ⌨ 00:45, 4 March 2022 (UTC)

Not a bug. |agency= is used when citing a source produced by the agency but issued by another news source (very commonly, a newspaper). When citing a source on the agency's own website, |work= (or |website=) is correct.

—Trappist the monk (talk) 00:55, 4 March 2022 (UTC)

It is not correct, Reuters is not a work, it's an agency. Headbomb {t · c · p · b} 01:02, 4 March 2022 (UTC)

It is arguably correct, in Trappist's opinion and the opinion of some editors, but |agency= is just as arguably correct, in other editors' opinions. The bot should not be used to unfairly overpower one editor's opinion with a bulldozer over other editors armed only with hand-trowels; that is an abuse of bot policy and should lead to the bot being blocked from editing. The bot should stick to uncontroversial cleanups. —David Eppstein (talk) 01:03, 4 March 2022 (UTC)

|agency= is categorically incorrect when it is the work being cited. There's no arguing about that. Izno (talk) 04:05, 4 March 2022 (UTC)

Except it is not the work being cited. It's the agency being cited. The work is the website Reuters.com. Headbomb {t · c · p · b} 05:05, 4 March 2022 (UTC)

Why do we have |agency=? Is it to create metadata about the type of news outlet like |tabloid= hah. Rather, the purpose is to be paired with |work= (eg. work=NYT agency=Reuters) so that users can track down alternative sources in case of link rot or WP:V. If Reuters is the sole source, there is no agent involved, even though Reuters is technically a news agency it doesn't matter for our citing purposes. -- GreenC 05:55, 4 March 2022 (UTC)

Whether you add a .com to the end or not, this change is positive. Izno (talk) 07:25, 4 March 2022 (UTC)

This is news to me. You are saying that a news agency is sometimes a work and sometimes not, so sometimes it's Reuters and sometimes it's Reuters, even though it's a news agency in both cases. So, sometimes a news agency is a "major work". Is there a consensus about this somewhere or are these just strongly stated opinions? SchreiberBike | ⌨ 04:27, 4 March 2022 (UTC)

Reuters is not an agent to itself. -- GreenC 04:35, 4 March 2022 (UTC)

^. |agency= is for reprints of material by another agency, such as The NYT publishing a work from Reuters or The LA Times publishing a work from the AP. An agency is not an agency for itself. Izno (talk) 07:24, 4 March 2022 (UTC)

The documentation remained relatively unchanged from its creation in February 2012 until this edit (see this discussion) and then this edit (see this discussion).

The bot's edit was correct.

—Trappist the monk (talk) 12:41, 4 March 2022 (UTC)

There's a consensus at Template:Citation Style documentation/agency, a page which has considerably fewer watchers than Help talk:Citation Style 1 or Wikipedia:Manual of Style. There have been discussions at several places with many strongly stated opinions, but no consensus which led to an RfC and a decision. This is one of Wikipedia's problems; unless there's a consensus and decision, chaos continues to reign and people continue to do what they want. In a capitalization question I once even proposed that we leave the final decision to chance, the result of a public lottery, and that was laughed at, but it left us with chaos (mixed usage), which is worse than either choice. I think we are in the same place here, and I think it's a mistake to use a bot to make changes which do not have a real consensus. SchreiberBike | ⌨ 03:47, 5 March 2022 (UTC)

Have to agree with Trappist, Izno, and GreenC here.

"is just as arguably correct, in other editors' opinions." Well, the official policy is to use the work parameter. If an editor thinks "agency=" is the right one, they need to have a good reason for why that is the case.

Wikipedia is not real life, and as such there is no need to follow everything exactly down to the letter. If the bot is making some sort of improvement that is consistent with the documentation for the template, even if it is slightly controversial, why is it a big deal? There will always be one person who doesn't like something; that's why there are consensus discussions, etc. Rlink2 (talk) 03:57, 5 March 2022 (UTC)

I agree. agency= should only be used when the citation already has a work= or newspaper=. Think of it like a via=. When Reuters etc are publishing in their own right, it is just a work like any other. --John Maynard Friedman (talk) 12:49, 5 March 2022 (UTC)
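The rule as stated in the comment above (agency= only alongside an existing work= or newspaper=) can be written as a small check. This reflects one position in this thread, not a settled consensus, and it is not the bot's code. The function name and the extra periodical parameters in `WORK_PARAMS` are assumptions added for illustration.

```python
# work and newspaper come from the comment above; the rest are assumed aliases.
WORK_PARAMS = ("work", "newspaper", "website", "journal", "magazine", "periodical")


def agency_without_work(params: dict[str, str]) -> bool:
    """True when |agency= is filled in but no work-like parameter is present.

    Under the rule described in this thread, such a citation names the agency
    as the only source, so the value would belong in a work-like parameter.
    """
    has_agency = bool(params.get("agency", "").strip())
    has_work = any(params.get(name, "").strip() for name in WORK_PARAMS)
    return has_agency and not has_work


print(agency_without_work({"title": "Some story", "agency": "Reuters"}))
print(agency_without_work({"title": "Some story", "work": "NYT", "agency": "Reuters"}))
```

The check only flags candidates; whether to move the value is the editorial question being argued above.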

Incorrect change to newspaper

Status: Fixed
Reported by: Whywhenwhohow (talk) 16:41, 14 March 2022 (UTC)

What happens: Changes website to newspaper for the National Institutes of Health
Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=Janssen_COVID-19_vaccine&diff=1077039595&oldid=1077038877

It may be making the change because the string "news" is contained in the URL. It is a press release from the NIH. — Preceding unsigned comment added by Whywhenwhohow (talk • contribs) 16:41, 14 March 2022 (UTC)

Convert cite journal |doi=10.48550/arXiv.####.##### to proper cite arXiv |eprint=####.#####

Status: Fixed
Reported by: Headbomb {t · c · p · b} 09:00, 20 March 2022 (UTC)

What should happen: reset + normal expansion

Bot deitalicizes translations of work titles

Status: Fixed
Reported by: Whoop whoop pull up ^{Bitching Betty ⚧️ Averted crashes} 09:22, 21 March 2022 (UTC)

What happens: Italics are stripped off of translated titles of works in trans-title
What should happen: Translated titles in trans-title should stay italicized, per MOS:TITLECONFORM
Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=Girls_und_Panzer&diff=1078368131&oldid=1077524028

Game Audio Network Guild is not a newspaper

Status: Not a bug
Reported by: IceWelder [✉] 21:12, 2 March 2022 (UTC)

What happens: Bot marks the "Game Audio Network Guild" as a newspaper. The G.A.N.G. is a non-profit organization, not a newspaper, so |publisher= is the correct parameter to use.
Relevant diffs/links: [56]

When the |work= and |publisher= are the same, publisher is not included. AManWithNoPlan (talk) 22:11, 4 March 2022 (UTC)

lincstothepast.com

Status: Fixed
Reported by: Keith D (talk) 16:26, 21 March 2022 (UTC)

What happens: Adds a link to the search page of site with a generic title of "Lincolnshire Archives: CalmView"
What should happen: Probably should ignore http://www.lincstothepast.com links as all seem to go to same place.
Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=Skellingthorpe&curid=14306242&diff=1078391721&oldid=1072879888#cite_ref-9

title links

Status: Fixed
Reported by: Shmuel (Seymour J.) Metz Username:Chatul (talk) 12:39, 21 March 2022 (UTC)

What happens: the bot removed the valid wikilink from |journal=[[c't]] in microcode Revision as of 18:28, 24 February 2022

location in name

Status: Not a bug
Reported by: deisenbe (talk) 11:34, 11 March 2022 (UTC)

What happens: link to newspaper is removed: "journal=The Liberator (Boston, Massachusetts)" is turned into "journal=The Liberator (Boston, Massachusetts)" No link. After all the damn times I typed this out.
What should happen: no change should be made
Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=John_Brown%27s_body&diff=1076462643&oldid=1076302204

GIGO? The Liberator is a newspaper not a scholarly or academic journal so |newspaper= not |journal=. The name of the newspaper is [[The Liberator (newspaper)|The Liberator]] not [[The Liberator (newspaper)|The Liberator]] (Boston, Massachusetts). If it is necessary to disambiguate a source by its location, do this:

|newspaper=[[The Liberator (newspaper)|The Liberator]] |location=Boston

Another problem not mentioned above is this part of the edit where the bot broke the google books link:

|url=https://books.google.com/books?vid=HARVARD:32044092691898&printsec=titlepage#v=onepage (works) → |url=https://books.google.com/books?id=HARVARD (does not work)

—Trappist the monk (talk) 13:07, 11 March 2022 (UTC)

It should be newspaper, not journal. I don't think I did that and I should have caught it.

Is it actually harmful, like messing up counts, if the location of the newspaper follows its name in the newspaper= attribute? Newspapers.com, which is by far my biggest source, serves up this information routinely in its clip ID info. deisenbe (talk) 16:25, 11 March 2022 (UTC)

Surely if Newspapers.com gives you "The Liberator (Boston, Massachusetts)", it can't be too painful to insert a "location=" given that you are adding double square brackets anyway (and maybe disambiguating, as in this case)? I suspect that the MOS mandates this style in any case, --John Maynard Friedman (talk) 16:59, 11 March 2022 (UTC)

Twas you.

The parenthetical (Boston, Massachusetts) is not part of the newspaper's name so does not belong in |newspaper=. When the parenthetical is included in |newspaper=, it is a corruption of the citation's metadata. When the parenthetical is included in |newspaper=, it is rendered in italics as if it were part of the newspaper's name. The location-of-publication is properly rendered in an upright font when used with |location=.

If you are using some sort of automated tool to scrape the clip ID info, you should fix the tool or, if not your tool, report this to the tool maintainers for fixing.

—Trappist the monk (talk) 17:16, 11 March 2022 (UTC)
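
For anyone cleaning up scraped clip info by hand, the split described above (name in |newspaper=, place in |location=) can be sketched as below. This is a hypothetical Python helper, not part of the bot or of any Newspapers.com tool; the name split_newspaper_location is an assumption, and the function simply peels off one trailing parenthetical.

```python
import re

# Assumed shape of the input: a newspaper name, optionally wikilinked,
# followed by exactly one trailing parenthetical holding the location.
TRAILING_PAREN = re.compile(r"^(?P<name>.+?)\s*\((?P<location>[^()]+)\)\s*$")


def split_newspaper_location(value):
    """Split 'Name (Place)' into (name, place); place is None if absent."""
    match = TRAILING_PAREN.match(value)
    if not match:
        return value.strip(), None
    return match.group("name"), match.group("location")
```

Because a trailing parenthetical can also be a genuine disambiguator rather than a place, the output still needs a human check before it goes into a citation.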

While adding incorrect doi and s2cid to valid CS2 conference paper citation, replaces conference title with repeat of paper title

Status: Fixed by adding code to deal with |contribution=
Reported by: —David Eppstein (talk) 05:49, 19 March 2022 (UTC)

What happens: In the linked diff, the bot broke a {{citation}} that had |contribution=No sublogarithmic-time approximation scheme for bipartite vertex cover (the correct title of a conference paper) and |title=26th International Symposium on Distributed Computing (DISC), Salvador, Brazil, October 2012 (a valid form of the title of the conference, although not exactly the form reported by DBLP or by the publisher) by replacing |title= with the paper's title, causing the title to repeat redundantly within the citation, and by removing the conference title. At the same time, it added a doi for a different journal version of the paper (not the conference version that was cited) but neglected to change any of the other parameters to match the journal version. The added s2cid, incidentally, appears to be for the journal version but with garbage metadata using a wrong publication date that is neither correct nor a match for the cited conference version (confirming my general low opinion of SS and of the uselessness of spamming their ids everywhere).
What should happen: Not that. The usual convention for this field would be to manually check that it really is the same paper (or similar enough to contain whatever result is cited from it) before replacing the conference citation with a proper citation to the journal version. But there are often good reasons for citing the conference version instead (for one thing because its dates might be relevant for establishing precedence), so this requires human intelligence. And replacing the citation also requires replacing the dates of article text and templates that refer to it, again requiring human intelligence (because the dates might be in plain text rather than templates). The bot is incapable of making decisions at that level, though, so the correct thing to do would have been to leave the citation alone and/or add the doi for the conference version that was cited (in this case, doi:10.1007/978-3-642-33651-5_13). Possibly it is relevant that the name of the journal is a substring of the name of the conference, but the bot should not be matching names so sloppily.
Relevant diffs/links: Special:Diff/1077967915
We can't proceed until: Feedback from maintainers

Bot fails to extract information from PDF barelinks

Status: Won't fix - would go no better than the battle of techno house
Reported by: Whoop whoop pull up ^{Bitching Betty ⚧️ Averted crashes} 05:18, 22 March 2022 (UTC)

What happens: When it encounters a bare link to a PDF, the bot fails to do any ref-fixing, and leaves the bare link untouched.
What should happen: The bot should extract the title, authorship, date, etc., information from within the PDF and use that to fill out the ref.
Relevant diffs/links: Multiocular_O
We can't proceed until: Feedback from maintainers

The problem with PDF bare links is that there's no way to reliably extract any of the data. Some PDFs are just images of text, and even when it is text the formatting can be inconsistent. With webpages there can be structured data relating to authorship, title, etc. PDF doesn't have any of that. Rlink2 (talk) 13:59, 22 March 2022 (UTC)

It would definitely be an annoyingly infinite whitelist of good data, instead of a short blacklist of bad data. Even when metadata is included, the data is often something horrible like "Accepted Draft" written by "Admin" in "1904". AManWithNoPlan (talk) 15:51, 22 March 2022 (UTC)

For example https://www.unicode.org/wg2/docs/n5170-multiocular-o.pdf is "n5170-multiocular-o.qxp_n3357r2-old-turkic" written by "Michael Everson" on "1/24/2022". The title is wrong, the date is wrong, and the author is thankfully correct. Absolutely strange topic though. A character used one time. AManWithNoPlan (talk) 15:54, 22 March 2022 (UTC)

https://www.unicode.org/charts/PDF/Unicode-15.0/U150-A640.pdf is better, but the only title is the book title, and not the chapter/section title which is what is really wanted. AManWithNoPlan (talk) 15:57, 22 March 2022 (UTC)

If this request can be programmed at all, then I suggest that it be limited to single-shot use, fully attended with the output checked for sanity. It would be grossly irresponsible to leave this to a 200-article batch run. --John Maynard Friedman (talk) 00:15, 23 March 2022 (UTC)

Caps: e-Health

Status: Fixed
Reported by: Headbomb {t · c · p · b} 22:32, 22 March 2022 (UTC)

What should happen: [57]