Wikipedia:Bots/Requests for approval/Snotbot 2
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.
Operator: Snottywong (talk · contribs)
Time filed: 17:53, Tuesday February 8, 2011 (UTC)
Automatic or Manually assisted: Automatic
Programming language(s): Python
Source code available: Pywikipedia script, source code can be made available on request.
Function overview: Tagging image files which are used in video game articles with {{WikiProject Video games}}.
Links to relevant discussions (where appropriate): Wikipedia:Bot requests/Archive 40#Tagging files for WikiProject Video games
Edit period(s): One time run
Estimated number of pages affected: Maximum of about 25,000 images.
Exclusion compliant (Y/N): No
Already has a bot flag (Y/N): Yes
Function details: User:Anomie was kind enough to create a list of all images used in articles which are in WikiProject Video Games (see 1, 2, 3, 4, 5). The bot will go through each of these files and add the WPVG template to the file's talk page if it is not already present. If no talk page exists, the bot will create one. The only problem that remains to be resolved is that Anomie's list contains all images used on the articles, including images transcluded from templates. So, I will talk to Anomie to see if it is possible to generate a list that excludes images that aren't directly linked in articles. If that is not possible, then the bot will need to go through and check that each image is directly linked before adding the WPVG template.
Discussion
Quick update on the scope of the bot: In preparing the bot, I've found that there are exactly 28,060 unique files that are directly linked from WPVG articles (and by "directly linked", I mean that the article uses [[File:...]] or [[Image:...]] in the wikicode, not counting images transcluded from templates, although images used in templates tagged by the WPVG banner are counted). Since approximately 3,200 of the files are already tagged with the WPVG banner, this leaves roughly 25,000 that will likely need to be tagged. The bot is programmed and ready to go, just waiting on approval now. SnottyWong babble 20:17, 11 February 2011 (UTC)
Could you post a list of some 200-300 random files from your list, so we can see how many are not actually relevant to the project? This wasn't answered by anyone in any of the discussions. — HELLKNOWZ ▎TALK 21:06, 11 February 2011 (UTC)
Trout. — HELLKNOWZ ▎TALK 21:06, 11 February 2011 (UTC)
- Well, the false positives seem to be very few and rare. Let them not stand in the way of building the encyclopaedia.
- Approved for trial (150 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. — HELLKNOWZ ▎TALK 21:13, 11 February 2011 (UTC)
- I don't think it's a terrible idea to get a random sample. Here's 300 files from the list, chosen completely at random. Let me know if you find any irrelevant ones, I'll check some of them out as well.
- Looks like there's a bunch of redlinks. I imagine that's because the bot doesn't distinguish between actual links and links that have been commented out in the wikicode (i.e. <!-- Unsourced image removed: [[Image:EndlessSaga1.jpg|250px| ]] -->). In any case, the bot is already programmed to make sure that a file actually exists before posting on its talk page, so those redlinks will be ignored. SnottyWong chat 21:36, 11 February 2011 (UTC)
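The commented-out links described above could be filtered out before the link scan. The following is a minimal sketch, not the bot's actual code (which is not published here): the function name `direct_file_links` and both regular expressions are assumptions made for illustration, and the pattern only covers the `[[File:...]]` and `[[Image:...]]` forms mentioned in the discussion.

```python
import re

# Matches HTML comments, including ones that span several lines.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)

# Matches [[File:Name.ext|...]] or [[Image:Name.ext]] and captures the file name.
FILE_LINK_RE = re.compile(
    r"\[\[\s*(?:File|Image)\s*:\s*([^\]|]+?)\s*(?:\||\]\])",
    re.IGNORECASE,
)


def direct_file_links(wikitext):
    """Return the set of file names linked directly in wikitext, ignoring comments."""
    visible = COMMENT_RE.sub("", wikitext)
    names = set()
    for match in FILE_LINK_RE.finditer(visible):
        # Treat underscores as spaces and capitalize the first letter,
        # as MediaWiki does when it normalizes titles.
        name = match.group(1).replace("_", " ").strip()
        names.add(name[:1].upper() + name[1:])
    return names


sample = (
    "[[File:box_art.png|thumb|Cover]] text "
    "<!-- Unsourced image removed: [[Image:EndlessSaga1.jpg|250px| ]] --> "
    "[[Image:Screenshot.jpg]]"
)
links = direct_file_links(sample)
```

Because the scan only looks at the page's own wikicode, images that arrive through transcluded templates are left out as a side effect.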
- Anomie's lists already had the files, so that's where I looked and previewed a batch of 300 or so. — HELLKNOWZ ▎TALK 21:50, 11 February 2011 (UTC)
- Trial complete.
- 150 edits made with only one problem: The bot is skipping over files that are hosted on Commons. The API is telling the bot that the files don't exist (see this API query), and the bot just skips them. What do you think the bot should do in these cases? Even if it could search Commons to see if the file exists, it still probably wouldn't be useful to create a talk page on en-wiki for the file, right? I suppose the bot should just be skipping over these files anyway? SnottyWong spout 22:41, 11 February 2011 (UTC)
- Argh. It also appears that it blanked the existing talk page of 15 articles. I'll go through and fix them manually. Silly omission in the code. SnottyWong babble 22:49, 11 February 2011 (UTC)
- Problem fixed and code updated to add the template to the top of existing talk pages instead of overwriting them completely (assuming the talk page doesn't already have the WPVG banner or one of its redirects). SnottyWong babble 23:00, 11 February 2011 (UTC)
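The corrected behaviour described above (create the page if missing, prepend the banner if absent, otherwise leave the page alone) might look roughly like this. It is only a sketch under stated assumptions: `tag_talk_page`, `has_banner`, and the alias list are hypothetical names, and `"WPVG"` stands in for whatever redirects the template really has, which the discussion does not list.

```python
import re

BANNER = "{{WikiProject Video games}}"

# "WPVG" is an assumed alias used for illustration; the real redirect
# list is not given in the discussion.
BANNER_NAMES = ("WikiProject Video games", "WPVG")

# Captures the name of each template call, e.g. "Foo" in {{Foo|x=1}}.
TEMPLATE_NAME_RE = re.compile(r"\{\{\s*([^|{}]+?)\s*(?:\||\}\})")


def _normalize(name):
    """Collapse underscores and whitespace, and compare case-insensitively."""
    return " ".join(name.replace("_", " ").split()).casefold()


def has_banner(talk_text, names=BANNER_NAMES):
    """Return True if any template call on the page matches a known banner name."""
    wanted = {_normalize(n) for n in names}
    return any(
        _normalize(m.group(1)) in wanted
        for m in TEMPLATE_NAME_RE.finditer(talk_text)
    )


def tag_talk_page(talk_text):
    """Return the new talk page text, or None if no edit is needed.

    talk_text is None when the talk page does not exist yet.
    """
    if talk_text is None:
        return BANNER
    if has_banner(talk_text):
        return None
    # Prepend instead of overwriting, so existing content is preserved.
    return BANNER + "\n" + talk_text
```

Comparing names case-insensitively is more lenient than MediaWiki, which only ignores the case of the first letter; that trade-off errs toward skipping a page over tagging it twice.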
- Use prop=imageinfo to tell if the image exists (imagerepository in the response will tell you if it's local or Commons); prop=info gets you info on the description page whether or not the file itself exists. Anomie⚔ 18:29, 13 February 2011 (UTC)
- Thanks for the tip, I tried it out and it works. However, I don't think I should be creating talk pages here for images that are on commons. Do you agree? SnottyWong soliloquize 16:38, 14 February 2011 (UTC)
- I don't have a strong opinion one way or the other, but I do lean towards not creating those talk pages. Anomie⚔ 17:56, 14 February 2011 (UTC)
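The prop=imageinfo check suggested above, combined with the preference for not creating local talk pages for Commons files, could be sketched as follows. The helper names are hypothetical, no network request is made, and the sample responses are hand-written to mimic the JSON shape of the API rather than captured output. The sketch assumes `imagerepository` is `"local"` for locally hosted files, `"shared"` for Commons files, and empty when the file does not exist.

```python
def imageinfo_params(title):
    """Build query parameters for a prop=imageinfo lookup of one file title."""
    return {
        "action": "query",
        "prop": "imageinfo",
        "titles": title,
        "format": "json",
    }


def file_repository(response):
    """Return the imagerepository value of the first page in a parsed response."""
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        return page.get("imagerepository", "")
    return ""


def should_tag_locally(response):
    """Only files stored on the local wiki get a local talk page."""
    return file_repository(response) == "local"


# Hand-written examples that mimic the response shape (not real API output).
LOCAL = {"query": {"pages": {"123": {
    "title": "File:Local.png", "imagerepository": "local"}}}}
COMMONS = {"query": {"pages": {"-1": {
    "title": "File:Shared.svg", "missing": "", "imagerepository": "shared"}}}}
MISSING = {"query": {"pages": {"-1": {
    "title": "File:Gone.jpg", "missing": "", "imagerepository": ""}}}}
```

With this split, a Commons-hosted file and a nonexistent file are both skipped, but for different, distinguishable reasons.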
Only non-scope image I found was File talk:Flag of the United Kingdom.svg. What about if talk page uses {{WikiProjectBannerShell}}? I doubt there's many images now, but should you get requested to tag more projects, you may end up tagging the same page multiple times for different projects.
Anyway, Approved for extended trial (5 existing page edits or 50 random edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Let's see correct existing page update. — HELLKNOWZ ▎TALK 09:19, 12 February 2011 (UTC)
- As the majority of our images are hosted locally (at least, all of the non-free ones), the hope was that the bot would skip the Commons images. All the iffy images should be at commons (map of Nebraska, picture of a tiger, or whatever). Free video game images aren't terribly common, are easy to find at Commons, and don't require the maintenance that non-free images do. If any free images are hosted locally, I will move them to commons. ▫ JohnnyMrNinja 18:22, 14 February 2011 (UTC)
- Sounds good. This is the bot's current behavior, so it shouldn't need any updating. I'm not currently at the PC that runs the bot, but I should have the ability to do another test run in a few hours. SnottyWong talk 18:52, 14 February 2011 (UTC)
Trial complete. I made 50 more random edits, and only one of them had an existing talk page, File talk:VentureAfrica.png. It took forever this time because there was a lot of database server lag, so the bot kept pausing. Looks like the previous problems have been resolved. Also, I have changed the bot so that it doesn't mark the edits as minor if it is creating a new page (I did that after the 50 edits were complete, so you won't see that until next time). SnottyWong prattle 21:47, 14 February 2011 (UTC)
What about if talk page uses {{WikiProjectBannerShell}}? — HELLKNOWZ ▎TALK 21:58, 14 February 2011 (UTC)
- Ahh, right. I haven't addressed that yet. My first thought is to just have the bot skip over any pages that use that template, and create a list of such pages. Then, if there are only a handful of pages (and I expect there will be very few, if any at all), then I can just process them manually. If the list ends up being quite large, then I can run a different script to process them on their own. However, I think that creating a complicated regex to deal with that situation is going to be a lot of effort to expend for one or two cases (and increased risk for mistakes), especially considering that less than 150 pages in the File Talk namespace use that template (including its redirects). SnottyWong yak 22:31, 14 February 2011 (UTC)
- Does that work for you? SnottyWong speak 02:28, 16 February 2011 (UTC)
- I was thinking about a future case in which you could run this task for more projects, eventually tagging some files for multiple projects.
Anyway, Approved. — HELLKNOWZ ▎TALK 08:44, 16 February 2011 (UTC)
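The skip-and-list approach proposed above for talk pages that use {{WikiProjectBannerShell}} could be sketched along these lines. This is an illustration only: `split_by_banner_shell` is a hypothetical helper, `"WPBS"` is an assumed redirect name, and the input is taken to be a mapping from talk page title to its current wikitext (or `None` when the page does not exist).

```python
import re

# "WPBS" is an assumed redirect name used for illustration.
SHELL_NAMES = ("WikiProjectBannerShell", "WPBS")

# Matches the opening of a shell template call, e.g. "{{WikiProjectBannerShell|".
SHELL_RE = re.compile(
    r"\{\{\s*(?:" + "|".join(re.escape(n) for n in SHELL_NAMES) + r")\s*[|}]",
    re.IGNORECASE,
)


def split_by_banner_shell(talk_pages):
    """Split talk page titles into those safe to tag and those set aside.

    Pages that use the banner shell are not edited; they are collected
    in a sorted list for manual (or separate scripted) processing.
    """
    to_tag, skipped = [], []
    for title, text in talk_pages.items():
        if SHELL_RE.search(text or ""):
            skipped.append(title)
        else:
            to_tag.append(title)
    return to_tag, sorted(skipped)


pages = {
    "File talk:A.png": "{{WikiProjectBannerShell|1={{WikiProject Film}}}}",
    "File talk:B.png": "Some comment",
    "File talk:C.png": None,
}
to_tag, skipped = split_by_banner_shell(pages)
```

Detecting the shell is much simpler than editing inside it, which is the trade-off the proposal relies on: the bot never has to parse nested banners, and the size of the skipped list shows whether a dedicated script is worth writing.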
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.