Wikipedia:Bots/Requests for approval/WikiCleanerBot 5
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.
Operator: NicoV
Time filed: 08:49, Saturday, June 15, 2019 (UTC)
Function overview: Fix some WP:WCW errors using WPCleaner
Automatic, Supervised, or Manual: Automatic
Programming language(s): Java (WPCleaner)
Source code available: On GitHub
Links to relevant discussions (where appropriate): Wikipedia:Bots/Requests for approval/PkbwcgsBot
Edit period(s): Twice a month, with the dump analysis that I already perform, see Wikipedia:Bots/Requests for approval/WikiCleanerBot.
Estimated number of pages affected: A few thousand articles for the initial runs spread over a few sessions, then normally only a few dozen or hundreds each time.
Namespace(s): Main
Exclusion compliant (Yes/No): Yes
Function details: As PkbwcgsBot hasn't been run for several months, I'd like to take over some of the tasks that Pkbwcgs was performing with WPCleaner. This request is a part of Wikipedia:Bots/Requests for approval/PkbwcgsBot. It includes automatically fixing part of some WP:WCW errors:
- CW Error #2: tags with incorrect syntax. The list of articles that the bot will check comes from CheckWiki list #2 (currently 617 articles) and from Wikipedia:CHECKWIKI/WPC 002 dump (currently 725 articles): only some articles will be fixed, only the simple ones (like false </br> tags).
- CW Error #16: unicode control characters. The list of articles that the bot will check comes from CheckWiki list #16 (currently 2508 articles): only some articles will be fixed, only the simple ones.
- CW Error #17: category duplication. The list of articles that the bot will check comes from CheckWiki list #17 (currently 6328 articles) and from Wikipedia:CHECKWIKI/WPC 017 dump (currently 8449 articles): only some articles will be fixed, only the simple ones (like exact category duplication with same sort key). For example, on the first 100 articles in Wikipedia:CHECKWIKI/WPC 017 dump, 70 are modified.
- CW Error #85: tags without content. The list of articles that the bot will check comes from CheckWiki list #85 (currently 831 articles): only some articles will be fixed, only the simple ones.
- CW Error #88: DEFAULTSORT with a blank at first position. The list of articles that the bot will check comes from CheckWiki list #88 (currently 349 articles): only some articles will be fixed, only the simple ones.
- CW Error #90: internal link written as an external link. The list of articles that the bot will check comes from CheckWiki list #90 (currently 5715 articles): only some articles will be fixed, only the simple ones.
- CW Error #91: interwiki link written as an external link. The list of articles that the bot will check comes from CheckWiki list #91 (currently 2100 articles): only some articles will be fixed, only the simple ones.
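As an illustration of what a "simple" case can look like, the exact category duplication mentioned for CW Error #17 could be handled roughly as follows. This is a minimal sketch and not WPCleaner's actual code: the class `DuplicateCategoryFixer` and its method are hypothetical names, and the sketch assumes each category link sits alone on its own line and uses the literal `Category:` prefix.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; this is not WPCleaner's actual code.
final class DuplicateCategoryFixer {

    // A line holding exactly one category link, e.g. [[Category:Foo|Bar]].
    // Group 1 is the category name, group 2 the optional sort key.
    private static final Pattern CATEGORY_LINE = Pattern.compile(
            "\\[\\[Category:([^\\]|]+)(?:\\|([^\\]]*))?\\]\\][ \\t]*");

    // Drops a category line only when the same name AND the same sort key
    // were already seen; anything less exact is left for a human editor.
    static String removeExactDuplicates(String wikitext) {
        Set<String> seen = new HashSet<>();
        List<String> kept = new ArrayList<>();
        for (String line : wikitext.split("\n", -1)) {
            Matcher m = CATEGORY_LINE.matcher(line);
            if (m.matches()) {
                String sortKey = m.group(2);
                String key = m.group(1).trim() + (sortKey == null ? "" : "|" + sortKey);
                if (!seen.add(key)) {
                    continue; // exact duplicate: skip this line
                }
            }
            kept.add(line);
        }
        return String.join("\n", kept);
    }

    public static void main(String[] args) {
        String text = "Text\n[[Category:A]]\n[[Category:B|Key]]\n[[Category:A]]\n[[Category:B|Other]]\n";
        System.out.println(removeExactDuplicates(text));
    }
}
```

In this sketch, a category repeated with a different sort key is deliberately kept, since choosing between the two keys is not a simple case.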
Discussion
Approved for trial (140 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Please run 20 edits for each proposed task. Primefac (talk) 12:31, 15 June 2019 (UTC)
- Thanks! Here are the results:
- CW Error #2 (tags with incorrect syntax): 20 edits, no problems detected.
- CW Error #16 (unicode control characters): 20 edits, no problems detected.
- CW Error #17 (category duplication): 20 edits, no problems detected.
- CW Error #85 (tags without content): 20 edits. I'm wondering what to do when there are comments inside the tag without content (gallery tags: Ana Vidjen, Andrews County Veterans Memorial, Battle of Naseby, Catherine Marks; noinclude tags: Barnet Copthall): either keep the automatic fix as it is now, comment out the tag itself, or do nothing. The answer may differ depending on the tag.
- With respect to commented-out markup I'd leave them alone in case the comment markup is ever removed (e.g. if the file(s) is/are restored). Jo-Jo Eumerus (talk, contributions) 14:22, 16 June 2019 (UTC)
- @Jo-Jo Eumerus: To be on the safe side, I've modified WPC not to automatically remove tags without content when there are comments inside them. --NicoV (Talk on frwiki) 17:22, 17 June 2019 (UTC)
- CW Error #88 (DEFAULTSORT with a blank at first position): 20 edits, no problems detected.
- CW Error #90 (internal link written as an external link): 20 edits, no problems detected.
- CW Error #91 (interwiki link written as an external link):
- 4 edits, a problem detected on the 4th edit on Azerbaijan State Philharmonic Hall. I've modified WPC not to automatically replace the external link when it's not surrounded by square brackets.
- 6 edits, a problem detected on the 6th edit on Counties of Norway. I've modified WPC not to automatically replace the external link when there's no text provided.
- 10 edits, no problems detected.
- Trial complete. --NicoV (Talk on frwiki) 14:17, 15 June 2019 (UTC)
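The two restrictions added for CW Error #91 during the trial (only replace an external link that is surrounded by square brackets, and only when link text is provided) can be sketched as follows. This is an illustrative sketch, not WPCleaner's implementation: the class `InterwikiLinkFixer`, the regular expression, and the restriction to `xx.wikipedia.org` hosts are assumptions, and links containing `%`, `?`, `#` or `|` in the title are left untouched to keep the sketch conservative.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; this is not WPCleaner's actual code.
final class InterwikiLinkFixer {

    // Matches only [https://xx.wikipedia.org/wiki/Title some text]:
    //  - a single opening bracket is required (bare URLs never match),
    //  - non-empty link text is required (links without text never match),
    //  - "en" is skipped, assuming those links belong to CW Error #90 instead.
    private static final Pattern BRACKETED_WITH_TEXT = Pattern.compile(
            "(?<!\\[)\\[https?://(?!en\\.)([a-z]{2,3})\\.wikipedia\\.org/wiki/"
            + "([^\\s\\]%?#|]+) +([^\\]\\[|\\n]*[^\\]\\[|\\s])\\]");

    static String fix(String wikitext) {
        Matcher m = BRACKETED_WITH_TEXT.matcher(wikitext);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String title = m.group(2).replace('_', ' ');
            String link = "[[:" + m.group(1) + ":" + title + "|" + m.group(3) + "]]";
            m.appendReplacement(out, Matcher.quoteReplacement(link));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(fix("See [https://fr.wikipedia.org/wiki/Tour_Eiffel Eiffel Tower]."));
        System.out.println(fix("See https://fr.wikipedia.org/wiki/Paris for details."));
        System.out.println(fix("See [https://fr.wikipedia.org/wiki/Paris]."));
    }
}
```

Only the first call in `main` changes its input; the bare URL and the bracketed link without text are returned unchanged, mirroring the two cases excluded after the problems on Azerbaijan State Philharmonic Hall and Counties of Norway.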
- {{BAG assistance needed}} --NicoV (Talk on frwiki) 13:57, 25 July 2019 (UTC)
- @NicoV: Approved for extended trial. Please provide a link to the relevant contributions and/or diffs when the trial is complete. 20 edits each for CW Error #85 and CW Error #91. Headbomb {t · c · p · b} 04:07, 6 August 2019 (UTC)
- Thanks, Headbomb. Here are the results:
- CW Error #85 (tag without content): 20 more edits, no problems detected.
- CW Error #91 (interwiki link written as an external link): 20 more edits, no problems detected.
- Trial complete. --NicoV (Talk on frwiki) 20:48, 6 August 2019 (UTC)
- @NicoV: This would be much better than this. Headbomb {t · c · p · b} 21:01, 6 August 2019 (UTC)
- @Headbomb: I can also remove the carriage return if the empty tag was on the first line and alone on that line, if you want. In other cases (not on the first line), removing the carriage return may have side effects. What do you say? --NicoV (Talk on frwiki) 21:22, 6 August 2019 (UTC)
- Should be for otherwise empty lines only. Headbomb {t · c · p · b} 21:34, 6 August 2019 (UTC)
- @Headbomb: The problem is that it will change the display in some situations, see below. --NicoV (Talk on frwiki) 22:53, 6 August 2019 (UTC)
- @Headbomb: I've modified WPC to remove extra blank lines (if there are 2 or more, or if they are at the beginning or the end of the article). Result on the same article that you reported. --NicoV (Talk on frwiki) 19:32, 7 August 2019 (UTC)
Example:
Line 1 before noinclude tag
<noinclude></noinclude>
Line 2 after noinclude tag
Before removal of the empty tag:
Line 1 before noinclude tag
Line 2 after noinclude tag
After removal of the empty tag (keeping the empty line): same display
Line 1 before noinclude tag
Line 2 after noinclude tag
After removal of the empty tag (removing the empty line): modified display
Line 1 before noinclude tag Line 2 after noinclude tag
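The behaviour discussed in this thread (leave tags that contain comments alone, keep one empty line where a removed tag stood alone on its line, and drop the leftover blank lines at the beginning or end of the article) can be sketched as below. This is a hedged illustration, not WPCleaner's code: the class `EmptyTagFixer`, the list of tag names, and the restriction to tags without attributes are assumptions made for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; this is not WPCleaner's actual code.
final class EmptyTagFixer {

    // An opening tag without attributes, optional whitespace, and its closing tag.
    // A tag pair that contains a comment (<!-- ... -->) does not match, so it is kept.
    private static final Pattern EMPTY_TAG = Pattern.compile(
            "<(noinclude|includeonly|gallery|div|span|center)>\\s*</\\1>",
            Pattern.CASE_INSENSITIVE);

    static String removeEmptyTags(String text) {
        int from = 0;
        while (true) {
            Matcher m = EMPTY_TAG.matcher(text);
            if (!m.find(from)) {
                return text;
            }
            int start = m.start();
            int end = m.end();
            boolean aloneOnLine = (start == 0 || text.charAt(start - 1) == '\n')
                    && (end == text.length() || text.charAt(end) == '\n');
            String glue = "";
            if (aloneOnLine) {
                // Widen the cut to the whole run of newlines around the tag.
                while (start > 0 && text.charAt(start - 1) == '\n') start--;
                while (end < text.length() && text.charAt(end) == '\n') end++;
                boolean atEdge = start == 0 || end == text.length();
                // Inside the article keep exactly one empty line (same display);
                // at the beginning or end of the article keep none.
                glue = atEdge ? "" : "\n\n";
            }
            text = text.substring(0, start) + glue + text.substring(end);
            from = start;
        }
    }

    public static void main(String[] args) {
        System.out.println(removeEmptyTags(
                "Line 1 before noinclude tag\n<noinclude></noinclude>\nLine 2 after noinclude tag"));
    }
}
```

Keeping one empty line in the middle of the article corresponds to the "same display" variant of the example above; a tag in the middle of a sentence is simply cut out without touching any line breaks.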
- Approved for extended trial. Please provide a link to the relevant contributions and/or diffs when the trial is complete. 20 edits to see whether that case is handled correctly and whether the change results in oddities otherwise. Have a mix of that case and others in the trial if possible. Headbomb {t · c · p · b} 20:10, 7 August 2019 (UTC)
- @Headbomb: Here are the new edits:
- Feminism in Sweden: span tags in the middle of a sentence
- FK Dubnica: gallery tags in their own lines
- FC Epfendorf 1929: center tags in table cells
- Eretz Yisrael Shelanu: div tags at the end of table
- Elsa Cladera de Bravo: includeonly tags in the middle of a sentence
- Dominic Fotia: gallery tags at the beginning of the article
- Domadugu: div tags on their own lines
- District of Columbia and United States Territories Quarter: noinclude tags at the beginning of the article
- History of agriculture: div tags spanning on 2 lines
- Hidden message: includeonly tags in the middle of a sentence
- Heritage Day (South Africa): includeonly tags in the middle of a sentence
- Heidi Quante: gallery tags spanning on 2 lines
- H. M. Khoja: gallery tags spanning on 2 lines
- God's Favorite Customer: includeonly tags at the beginning of the article
- Geneva fusillade of 9 November 1932: span tags in the middle of a sentence
- Fredericton shooting: includeonly tags at the beginning of a sentence
- Frank Dorsa: gallery tags at the beginning of the article
- Jacques Delors: includeonly tags at the beginning of the article
- Jakobstad Museum: gallery tags at the end of the article
- Isleworth Mona Lisa: includeonly tags at the beginning of the article
- Trial complete. --NicoV (Talk on frwiki) 21:29, 7 August 2019 (UTC)
Approved. CW Error #85 is technically cosmetic in many cases, but I feel it's editor-hostile enough to deal with it through a bot. Ping me if there's pushback on that task. Headbomb {t · c · p · b} 22:12, 7 August 2019 (UTC)
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.