User:JCW-CleanerBot

From Wikipedia, the free encyclopedia
JCW-CleanerBot (talk · contribs)
This user is a bot.
Operator: Headbomb (talk · contribs)
Author: Headbomb (talk · contribs)
Approved? BRFA
Flagged? Yes
Task(s): See here
Edit period(s): Irregular
Automatic or manual? Automatic/Semi-automatic
Programming language(s): WP:AWB
Exclusion compliant? Yes
Source code published? Just ask
Emergency shutoff-compliant? Yes

Logic

What is the bot approved for?

The bot has approval for the following 5 tasks. Tasks 1–4 are approved for automated editing, while task 5 is approved only for semi-automated editing.

  1. Closing all exposed disambiguators in |work=, |journal=, |magazine=, |publisher=.
    (journal|magazine|publisher|work)(\s*)=(\s*)\[\[([^\|\n]*) \((journal|magazine|publisher)\)\]\]
  2. Fixing non-correct ways to refer to a publication.
    Automated fixing will be done with 'sure shot' code.
    journal(\s*)=(\s*)Journal of Foobar\.? (case insensitive) → journal$1=$2Journal of Foobar
    This would do both capitalization fixing (Journal of foobar → Journal of Foobar) and dot removal (Journal of Foobar. → Journal of Foobar), and be guaranteed to only touch the |journal= parameter.
    Semi-automated fixing will be done with 'sloppy' code.
    Journal of Foobar\. (case insensitive) → Journal of Foobar
    This also does both capitalization fixing and dot removal, but it would also remove the final dot in a sentence like "Bob published an article in Journal of Foobar." It will be used in cases where it is quicker to be sloppy and perform a manual review.
  3. AWB general fixes. The bot will skip edits that consist only of minor/whitespace genfixes, although such fixes may be performed alongside its main task.
  4. Removal of {{Italics title}} from pages with {{Infobox journal}}/{{Infobox magazine}}
  5. AWB typo fixing, which will only be done semi-automatically per WP:CONTEXTBOT.
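
The patterns in tasks 1 and 2 can be tried outside AWB. The following is a minimal Python sketch, not the bot's actual code (the bot runs on AWB). The task 1 pattern is quoted from above, but its replacement is not given on this page, so piping the link (`[[Foo (journal)|Foo]]`) is an assumption. The function names `close_disambiguator`, `sure_shot`, and `sloppy` are hypothetical.

```python
import re

# Task 1 pattern, quoted from the description above.
EXPOSED = re.compile(
    r"(journal|magazine|publisher|work)(\s*)=(\s*)"
    r"\[\[([^\|\n]*) \((journal|magazine|publisher)\)\]\]"
)


def close_disambiguator(wikitext: str) -> str:
    """Pipe the link so the disambiguator is no longer displayed.

    ASSUMPTION: the replacement is not stated on this page; piping is a guess.
    """
    return EXPOSED.sub(r"\1\2=\3[[\4 (\5)|\4]]", wikitext)


def sure_shot(wikitext: str, title: str) -> str:
    """Task 2, automated: only matches text directly after journal=."""
    pattern = re.compile(
        r"journal(\s*)=(\s*)" + re.escape(title) + r"\.?", re.IGNORECASE
    )
    # Mirrors the replacement journal$1=$2<title> given above.
    return pattern.sub(
        lambda m: f"journal{m.group(1)}={m.group(2)}{title}", wikitext
    )


def sloppy(wikitext: str, title: str) -> str:
    """Task 2, semi-automated: matches the dotted title anywhere in the page."""
    pattern = re.compile(re.escape(title) + r"\.", re.IGNORECASE)
    return pattern.sub(lambda m: title, wikitext)


if __name__ == "__main__":
    print(close_disambiguator("|journal=[[Nature (journal)]]"))
    print(sure_shot("|journal=journal of foobar. |year=2000", "Journal of Foobar"))
    print(sloppy("Bob published an article in Journal of Foobar.", "Journal of Foobar"))
```

The last call strips the sentence-final dot from ordinary prose, which is the reason the sloppy form needs a manual review, while `sure_shot` leaves the same sentence untouched because it contains no `journal=`.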

What are non-correct ways to refer to a publication?

Listing every type of 'non-correct' journal name would take too long, but they share 7 common themes:

  • A) Legit typo/misspelling
Paleontology → Palaeontology
  • B) Bad dots
J. Phys. A. → J. Phys. A
J Phys. A → J. Phys. A
This does not touch undotted abbreviation like J Phys A
  • C) Bad capitalizations
J phys A → J Phys A
Journal of physics → Journal of Physics
  • D) Non-standard abbreviations to standard abbreviations. The standard ISO 4 abbreviation list can be found here.
Am J Physics → Am J Phys
Br J Psychiatr → Br J Psychiatry
The bot will respect all field-standard abbreviations, like those from MathSciNet/Mathematical Reviews, the Bluebook, or the National Library of Medicine/PubMed. For example, the ISO 4 abbreviation for Transactions of the American Mathematical Society is Trans Am Math Soc, but the MathSciNet abbreviation is Trans Amer Math Soc, and the bot won't convert from one form to the other.
  • E) Subtitles
Clinical Cancer Research: An official journal of the American Association for Cancer Research → Clinical Cancer Research
This does not cover subtitles like Journal of Physics: Conference Series
  • F) Reduce multi-language entries to match the language of the article cited [semi-automated only, Latin alphabet only].
Canadian Journal of Development Studies/Revue canadienne d'études du développement → Canadian Journal of Development Studies or Revue canadienne d'études du développement
This does not touch entries in a journal with an official title in a different language (e.g. an English article published in Naturwissenschaften) even if the translated title (The Science of Nature) later becomes the official title.
  • G) Database 'leftovers'
|journal=Journal of Immunology (Baltimore, Md. : 1950) → |journal=Journal of Immunology
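
One way to picture themes B, C, D, E and G is as an exact-match lookup table applied only to the value of |journal=. The Python sketch below is illustrative and is not how the bot is implemented (the bot uses AWB rules). The table holds only examples listed above, and the names `FIXES`, `JOURNAL_PARAM`, and `fix_journal` are hypothetical. The simple pattern assumes the value contains no wikilink or nested template.

```python
import re

# Known bad variant -> preferred form, taken from the examples above.
# Matching is exact and case-sensitive, so "J Phys A" (a valid undotted
# abbreviation) is not in the table and is left alone.
FIXES = {
    "J. Phys. A.": "J. Phys. A",          # B) bad dots
    "J Phys. A": "J. Phys. A",            # B) bad dots
    "J phys A": "J Phys A",               # C) bad capitalization
    "Journal of physics": "Journal of Physics",
    "Am J Physics": "Am J Phys",          # D) non-standard abbreviation
    "Br J Psychiatr": "Br J Psychiatry",
    "Clinical Cancer Research: An official journal of the American "
    "Association for Cancer Research": "Clinical Cancer Research",  # E)
    "Journal of Immunology (Baltimore, Md. : 1950)": "Journal of Immunology",  # G)
}

# Captures the raw value of |journal= up to the next pipe, brace or newline.
# ASSUMPTION: the value has no wikilink or nested template inside it.
JOURNAL_PARAM = re.compile(r"(\|\s*journal\s*=)([^|}\n]*)")


def fix_journal(wikitext: str) -> str:
    """Replace known bad journal names, preserving surrounding whitespace."""
    def repl(m: re.Match) -> str:
        raw = m.group(2)
        name = raw.strip()
        fixed = FIXES.get(name)
        if fixed is None:
            return m.group(0)  # unknown name: leave the parameter untouched
        return m.group(1) + raw.replace(name, fixed, 1)

    return JOURNAL_PARAM.sub(repl, wikitext)


if __name__ == "__main__":
    print(fix_journal("{{cite journal |journal=J phys A |year=1999}}"))
    print(fix_journal("{{cite journal |journal=J Phys A |year=1999}}"))
```

Because the lookup is exact, a field-standard form such as Trans Amer Math Soc is never converted unless someone deliberately adds it to the table.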

Under the BRFA, the bot only touches |journal= for 2A, but I plan on eventually covering |work= / |magazine= / |publisher= as well, once we get the equivalent compilation to WP:JCW for magazines and publishers. There will be a specific BRFA for those, although I'm pretty sure it would get speedily approved.

The bot screwed up! What should I do?

Although I take great care to ensure that it doesn't screw up, it's possible that I screwed up the logic somewhere, or that there's a corner case it didn't handle correctly.

If the bot screwed up, follow these few simple steps:

  1. Assume good faith
  2. Leave a message on the bot's talk page. This will stop the bot from editing. Your message should include:
    1. A diff of the problematic edit.
    2. An explanation of why you think the edit is wrong.
  3. We'll discuss the issue. If the bot was wrong, I'll either update the bot, or find some other way to prevent the edits in the future.

If the bot affected more than a handful of pages, I would advise against mass-reverting before discussion. If the bot was right, it'll be annoying to have to re-do those edits later. And if the bot was wrong, it'll likely be easier to use the bot to fix the problematic edits.
