Wikipedia:Bots/Requests for approval/Qwerfjkl (bot) 12
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at Wikipedia:Bots/Noticeboard. The result of the discussion was Approved.
Operator: Qwerfjkl (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 09:13, Saturday, May 14, 2022 (UTC)
Automatic, Supervised, or Manual: automatic
Programming language(s): AutoWikiBrowser
Source code available: AWB
Function overview: Remove [[Category:(country) films]]
Links to relevant discussions (where appropriate): Wikipedia:Bot requests#Film categories (and the prior discussion linked there)
Edit period(s): One time run
Estimated number of pages affected: <200,000
Exclusion compliant (Yes/No): No
Already has a bot flag (Yes/No): Yes
Function details: The bot will remove [[Category:(country) films]] from pages found by a deepcategory query on the relevant categories, via a regexp. The page count is hard to estimate because of the number of categories removed and the large size of the categories to work on, so I've estimated an upper limit.
The categories I'll run deepcategory on are:
- Category:3D films by country
- Category:Black-and-white films by country
- Category:Direct-to-video films by country
- Category:English-language films by country
- Category:Feminist films by country
- Category:Independent films by country
- Category:Lost films by country
- Category:Multilingual films by country
- Category:Rediscovered films by country
- Category:Short films by country
- Category:Silent films by country
- Category:Television films by country
- Category:Films based on actual events by country
- Category:Lists of films by country of production
- Category:Film series by country
- Category:Crossover films by country
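As a rough illustration of the removal step described above, here is a minimal Python sketch. The bot itself runs in AutoWikiBrowser, so this is not the operator's code; the short DEMONYMS list, the function name and the sample wikitext are hypothetical, and only the overall shape of the pattern follows the regexp quoted in the discussion.

```python
import re

# Hypothetical short list; the regexp quoted in the discussion has many more entries.
DEMONYMS = ["American", "British", "Dutch East Indies", "Dutch", "French"]

# Same shape as the quoted pattern: optional space after "Category:",
# one demonym, " films", and an optional trailing newline.
PATTERN = re.compile(
    r"\[\[Category: ?(" + "|".join(re.escape(d) for d in DEMONYMS) + r") films\]\]\n?"
)


def strip_country_film_categories(wikitext: str) -> str:
    """Remove bare "(country) films" category links, leaving subcategories alone."""
    return PATTERN.sub("", wikitext)


sample = (
    "Article text.\n"
    "[[Category:British films]]\n"
    "[[Category:British silent films]]\n"
    "[[Category:1920s British films]]\n"
)
cleaned = strip_country_film_categories(sample)
print(cleaned)
```

Because the pattern requires " films]]" directly after the demonym, more specific categories such as "British silent films" are left in place. Like the quoted pattern, this sketch does not match category links that carry a sort key.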
Discussion
- Just for a bit of context on why this is warranted, if it would help: WP:FILM formerly had a policy of deeming "(Country) films" categories to be all-inclusive, meaning that they had to directly include all films from that country even if they were already extensively subcategorized for genre or other characteristics. That wasn't necessarily unreasonable 15 to 20 years ago when that rule was first established, as we had far, far fewer articles about films at that time than we do now, but in 2022, a considerable number of the categories are now populated into the thousands or tens of thousands, and would have been deemed too large and in need of diffusion in virtually any other category tree. So the WikiProject has now established a consensus to drop the "all inclusive" rule, but due to the sheer number of articles involved nobody wants to tackle the whole job manually.
So the idea is to use a bot to clean out the redundant category from articles that are already properly subcategorized, so that the human editors can concentrate our efforts on the smaller number of articles that are only filed in the parent while lacking any subcategorization. Bearcat (talk) 12:51, 14 May 2022 (UTC)
- Approved for trial (64 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Ideally try to spread them out over the various categories. Primefac (talk) 08:59, 26 May 2022 (UTC)
- @Primefac, Is there a limit to the length of a regexp? Currently mine is
\[\[Category: ?(Afghan|Albanian|Algerian|American|Andorran|Angolan|Antigua and Barbuda|Argentine|Armenian|Australian|Austrian|Austro-Hungarian|Azerbaijani|Bahamian|Bahraini|Bangladeshi|Belarusian|Belgian|Beninese|Bhutanese|Bolivian|Bosnia and Herzegovina|Botswana|Brazilian|British|Bruneian|Bulgarian|Burkinabé|Burmese|Burundian|Cambodian|Cameroonian|Canadian|Cape Verdean|Chadian|Chilean|Chinese|Colombian|Comorian|Democratic Republic of the Congo|Republic of the Congo|Costa Rican|Croatian|Cuban|Curaçaoan|Cypriot|Czech|Czechoslovak|Danish|Djiboutian|Dominican Republic|Dutch East Indies|Dutch|East Timorese|Ecuadorian|Egyptian|Emirati|Equatoguinean|Estonian|Ethiopian|Faroese|Fijian|Finnish|French|Gabonese|Gambian|German|Ghanaian|Greek|Greenlandic|Guatemalan|Bissau-Guinean|Guinean|Haitian|Honduran|Hong Kong|Hungarian|Icelandic|Indian|Indonesian|Iranian|Iraqi|Irish|Israeli|Italian|Ivorian|Jamaican|Japanese|Jordanian|Kazakhstani|Kenyan|Korean|Kosovan|Kuwaiti|Kyrgyzstani|Laotian|Latvian|Lebanese|Lesotho|Liberian|Libyan|Lithuanian|Luxembourgian|Macedonian|Malagasy|Malawian|Malaysian|Maldivian|Malian|Maltese|Mauritanian|Mauritian|Mexican|Moldovan|Mongolian|Montenegrin|Moroccan|Mozambican|Namibian|Nepalese|New Zealand|Nicaraguan|Nigerian|Nigerien|Norwegian|Pakistani|Palestinian|Panamanian|Paraguayan|Peruvian|Philippine|Polish|Portuguese|Qatari|Romanian|Russian|Rwandan|Sahrawi|Samoan|Saudi Arabian|Senegalese|Serbian|Sierra Leonean|Singaporean|Slovak|Slovenian|Somalian|South African|South Sudanese|Soviet|Spanish|Sri Lankan|Sudanese|Surinamese|Swazi|Swedish|Swiss|Syrian|Taiwanese|Tajikistani|Tanzanian|Thai|Togolese|Tongan|Trinidad and Tobago|Tunisian|Turkish|Turkmenistan|Ugandan|Ukrainian|Uruguayan|Uzbekistani) films\]\]\n?
which might need splitting up. ― Qwerfjkl (talk) 14:29, 26 May 2022 (UTC)
- I have no idea; try it, and if it doesn't work split it up. For what it's worth, you have a lot of Unicode spaces in your copy above (which may or may not be present in your original files) so you might want to check that before you run anything. Primefac (talk) 14:31, 26 May 2022 (UTC)
- Thanks, now removed, and the regex works. I'll have the trial done soon (I've alphabetised the list to try and spread out the categories, not sure how effective it'll be). ― Qwerfjkl (talk) 14:41, 26 May 2022 (UTC)
- Trial complete. See these 64 contributions. ― Qwerfjkl (talk) 14:48, 26 May 2022 (UTC)
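On the stray Unicode spaces mentioned above: one way to spot them before running a pattern is to scan it for space-like characters other than the plain ASCII ones. The sketch below is only an illustration in Python. The function name and the sample pattern are hypothetical, and the two characters planted in it (a no-break space and a zero-width space) are assumptions, since the discussion does not say which characters were actually present.

```python
import unicodedata


def find_odd_spaces(pattern: str) -> list[tuple[int, str]]:
    """Return (index, Unicode name) for space-like characters other than
    the ASCII space, newline and tab."""
    hits = []
    for i, ch in enumerate(pattern):
        if ch in " \n\t":
            continue
        # Zs = space separators (e.g. NO-BREAK SPACE);
        # Cf = format characters (e.g. ZERO WIDTH SPACE).
        if unicodedata.category(ch) in ("Zs", "Cf"):
            hits.append((i, unicodedata.name(ch, "UNKNOWN")))
    return hits


# Hypothetical pattern fragment with two planted characters.
pattern = "(Cape\u00a0Verdean|Hong Kong\u200b) films"
odd = find_odd_spaces(pattern)
print(odd)
```

An empty result suggests the pattern contains only ordinary spaces; any hits give the position to inspect.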
- @Primefac:, requesting update here as this has now been hanging for almost two weeks. Bearcat (talk) 19:44, 7 June 2022 (UTC)
- @Bearcat, you might want to try {{BAG assistance needed}}. ― Qwerfjkl (talk) 19:55, 7 June 2022 (UTC)
- {{BAG assistance needed}} ― Qwerfjkl (talk) 16:33, 10 June 2022 (UTC)
- I've been on holiday the last two weeks, and BRFAs are a bit far down my "catch-up priority" list, but I'll try to get to these as soon as possible. Primefac (talk) 11:43, 15 June 2022 (UTC)
- @Primefac, there's no rush. ― Qwerfjkl (talk) 22:23, 18 June 2022 (UTC)
After reviewing the edits, I don't have any concerns with this. As with all regexes, please be careful and spot check/fix any errors that may arise. Approved. As per usual, if amendments to - or clarifications regarding - this approval are needed, please start a discussion on the talk page and ping. --TheSandDoctor Talk 15:20, 19 June 2022 (UTC)
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at Wikipedia:Bots/Noticeboard.