Wikipedia:Bots/Requests for approval/Kumi-Taskbot 4
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Revoked.
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.
Operator: Kumioko (talk · contribs)
Time filed: 16:26, Tuesday January 10, 2012 (UTC)
Automatic or Manual: Manual, Semi-auto or Automatic based on the need.
Programming language(s): C#, Regex, AWB
Source code available: Yes
Function overview: Add WikiProject banner or parameter, limited assessment, limited WikiProject parameter addition or removal
Links to relevant discussions (where appropriate):
Edit period(s): As needed
Estimated number of pages affected: About 10,000 a month (may be more or less depending on the requests)
Exclusion compliant (Y/N): Yes
Already has a bot flag (Y/N): Yes
Function details: This task will add WikiProject banner templates as needed and may perform some simple assessment inheritance based on the request and the discussion. To clarify, the discussion may be from the Bot request page, a WikiProject page, the Village pump or other appropriate discussion venue.
This is primarily intended for WikiProject United States or supported projects but may be requested by others as well.
A basic description of the task is as follows and is mostly done using Regex and the Advanced find and replace feature within AWB:
- Based on a list of provided articles, will add the applicable banner or parameter as in the case of a supported project such as adding |WA=Yes to the WikiProject United States banner. Can inherit class if desired from another project if that project appears on the page. I am currently only doing this in the case of 3 projects that I believe are reliable (US Roads, MILHIST and NRHP) but others can be used as well if desired. If class is inherited it will also populate the parameter |auto=Yes. The yes can be changed to Inherit or something else based on the needs of the request.
- As a somewhat separate but related task, I would like to include in this request allowing the bot to perform the following additional task. The bot can scan through an article list for certain items (such as infoboxes, stub templates, or whether the article is a redirect). That list could then be saved and the applicable parameter could be applied (or if need be removed) to the WikiProject template on the talk page. For example, the article list could be scanned for evidence of an infobox and, once the list has been scanned, the bot could add |needs-infobox=Yes to the applicable WikiProject banner, or remove it if necessary based on the request. --Kumioko (talk) 16:26, 10 January 2012 (UTC)
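As an illustration of the first item, the following is a minimal sketch of adding a parameter such as |WA=Yes to an existing banner. It is not the bot's actual code: the class and method names (BannerParams, AddParameter) are hypothetical, and it assumes the banner contains no nested templates.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BannerParams
{
    // Matches {{WikiProject United States}} with or without parameters.
    // Assumption: the banner body contains no "{" or "}" (no nested templates).
    private static readonly Regex Banner = new Regex(
        @"\{\{\s*WikiProject[ _]+United[ _]+States\s*(?<body>\|[^{}]*)?\}\}",
        RegexOptions.IgnoreCase);

    // Appends |name=value to the first banner found, unless a parameter
    // with that name is already present (in which case nothing changes).
    public static string AddParameter(string talkText, string name, string value)
    {
        return Banner.Replace(talkText, m =>
        {
            string body = m.Groups["body"].Value;
            bool present = Regex.IsMatch(
                body,
                @"\|\s*" + Regex.Escape(name) + @"\s*=",
                RegexOptions.IgnoreCase);
            if (present)
                return m.Value;

            // Note: the template name is rewritten in its canonical spelling.
            return "{{WikiProject United States" + body.TrimEnd()
                   + "|" + name + "=" + value + "}}";
        }, 1);
    }
}
```

For example, BannerParams.AddParameter(talkText, "WA", "Yes") would leave a page untouched if the banner already carries a WA parameter, which avoids the repeated edits that a plain find-and-replace could produce.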
Discussion
- Some stuff:
- This task will add WikiProject banner templates as needed and may perform some simple assessment inheritance based on the request and the discussion. To clarify, the discussion may be from the Bot request page, a WikiProject page, the Village pump or other appropriate discussion venue. — This is a bit broad. We typically don't give blanket approvals for arbitrary bot tasks based on arbitrary discussions that might happen in the future, so we would have trouble approving this task request if this portion of it remains.
- Automatic or Manual: Manual, Semi-auto or Automatic based on the need. — Similar concerns as above regarding broadness. You should specify on each task what form of operation is involved.
- This is primarily intended for WikiProject United States or supported projects but may be requested by others as well. Same as above. Furthermore, we also typically don't give blanket approvals based on WikiProject, either, unless there's consensus from the WikiProject itself that a bot necessarily needs to be highly versatile as part of a specific task description.
- The remainder of the task description (i.e., the text starting with "A basic description of the task is as follows") is more what we're after. To clarify, is that the task description you're intending us to review?
- --slakr\ talk / 20:58, 13 January 2012 (UTC)[reply]
- Yes please, the 4 tasks listed; I have also clarified more below. I was trying to keep from submitting a flurry of BRFAs individually, but it's no big deal to me. The reason it's broad is because there are currently about 65 projects supported in some way by WikiProject United States and each may have multiple tasks. For example, I am compiling lists of articles relating to some of them such as the one here that I need to tag with the applicable banner. That's only 1 of several for the United States project alone and that's only one of the projects.
- On the second issue, for the most part it will be automatic, but unfortunately due to a complete lack of standardization within the WikiProject banners it often requires me to do a number of them manually due to unprogrammable variations in the appearance of the banners and talk page content.
- On the last issue, again, it's no problem; I was just trying to save time. I will compile a detailed list of tasks a little later. --Kumioko (talk) 13:57, 14 January 2012 (UTC)
- Based on the request above for clarification I request the bot be allowed to perform the below list of tasks for the WikiProject United States banner.
- Add the WikiProject United States banner to
- Remove the needs infobox parameter from the WikiProject United States banner if the article contains an infobox
- Add the needs infobox parameter to the WikiProject United States banner if the article doesn't contain an infobox
- Add the needs image parameter to the WikiProject United States banner if the article doesn't contain an infobox
- Add the unref parameter to the WikiProject United States banner if the article doesn't contain any references
- Remove the unref parameter from the WikiProject United States banner if the article contains a reference
- Assess as Redirect class if the article is a redirect
- Assess unassessed WikiProject United States articles by inheriting the class assessment from other banners if not assessed.
That should be enough to get started. --Kumioko (talk) 14:34, 14 January 2012 (UTC)
{{BAG assistance needed}}
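The add/remove pairs in the list above all follow one pattern: check the article for a feature, check whether the banner is already flagged, and act only when the two disagree. A rough sketch of that decision follows. The names (ArticleChecks, ParamAction, Decide) are hypothetical, and the infobox and reference checks are simple heuristics that are assumed here, not the bot's actual tests.

```csharp
using System;
using System.Text.RegularExpressions;

public enum ParamAction { None, Add, Remove }

public static class ArticleChecks
{
    // Heuristic: any template whose name starts with "Infobox".
    // Infoboxes with other names (e.g. Taxobox) would be missed.
    public static bool HasInfobox(string articleText)
    {
        return Regex.IsMatch(articleText, @"\{\{\s*Infobox", RegexOptions.IgnoreCase);
    }

    // Heuristic: at least one <ref> or <ref name=...> tag.
    // <references/> alone does not count.
    public static bool HasReference(string articleText)
    {
        return Regex.IsMatch(articleText, @"<ref[\s>]", RegexOptions.IgnoreCase);
    }

    // True if the talk page has |paramName=yes in any banner.
    public static bool IsFlagged(string talkText, string paramName)
    {
        return Regex.IsMatch(
            talkText,
            @"\|\s*" + Regex.Escape(paramName) + @"\s*=\s*yes\b",
            RegexOptions.IgnoreCase);
    }

    // Act only when the flag and the article disagree.
    public static ParamAction Decide(bool flagNeeded, bool flagged)
    {
        if (flagNeeded && !flagged) return ParamAction.Add;
        if (!flagNeeded && flagged) return ParamAction.Remove;
        return ParamAction.None;
    }
}
```

A needs-infobox check would then read ArticleChecks.Decide(!ArticleChecks.HasInfobox(article), ArticleChecks.IsFlagged(talk, "needs-infobox")). Because the article and talk texts are both inputs, the sketch assumes both pages have already been fetched.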
Trial
Approved for trial (7 days). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Regarding the broadness of the original task description, I agree with slakr's comments; however, if this BRFA goes well, similar subsequent tasks could probably be speedily approved (it will just give us a chance to check the discussion on the WikiProject etc.) --Chris 17:03, 20 January 2012 (UTC)
- Assuming that you are letting me do all 8 of the above, I will test them in like groups. I did about 100 edits for 1, 7 and 8. --Kumioko (talk) 02:58, 21 January 2012 (UTC)
- I did 5 and 6. I only posted about 25-30 out of about 500. --Kumioko (talk) 03:46, 21 January 2012 (UTC)
- 2 and 3 are done. There weren't very many for 2 because I just cleaned them up manually with my regular account recently. --Kumioko (talk) 09:25, 21 January 2012 (UTC)
- 4 is tested. I did about 100 edits. I currently have a listing of about 8500 but that may vary as images or articles are added. --Kumioko (talk) 20:40, 21 January 2012 (UTC)[reply]
- Trial complete. All 8 items have been tested and are in the contributions history with the appropriate related edit summary. Please let me know if you have any questions. --Kumioko (talk) 20:40, 21 January 2012 (UTC)[reply]
{{BAGAssistanceNeeded}} - It's only been three days so it's not neglected, just trying to keep things moving. --Kumioko (talk) 17:47, 24 January 2012 (UTC)
Please provide links to the edits performed under this trial, in the format task 4
- Why does Enosburg Falls, Vermont's talk history show two edits? Josh Parris 01:47, 26 January 2012 (UTC)[reply]
- Why are both http://en.wikipedia.org/w/index.php?title=1944_Washington_Huskies_football_team&diff=prev&oldid=472498600 and http://en.wikipedia.org/w/index.php?title=1944_Washington_Huskies_football_team&diff=prev&oldid=472498600 both necessary edits?
- Has a change to prevent this problem with redirects from reoccurring been implemented? Josh Parris 01:47, 26 January 2012 (UTC)[reply]
- Ok, on the first question the simple answer is because, frankly, there is no consistency to how WikiProject templates and their parameters are displayed, and there are quite literally hundreds of variations. I have most of them captured but sometimes new ones slip in. Basically what I do is let the bot run through the list. Then I manually go back through and address the exceptions as they appear. In the case of the Enosburg Falls, Vermont article I moved the auto inherit parameter after the importance parameter so that the logic could process.
- On question number 2 and 3 I didn't factor out redirects initially, after these came up I added logic to factor out redirects from needs image, unref and needs infobox. I will add the numbers groupings as soon as I can. --Kumioko (talk) 02:11, 26 January 2012 (UTC)[reply]
- Add the WikiProject United States banner to - Inherit assessment
- Remove the needs infobox parameter from the WikiProject United States banner if the article contains an infobox - Rmv needs-infobox=Yes
- Add the needs infobox parameter to the WikiProject United States banner if the article doesn't contain an infobox
- Add the needs image parameter to the WikiProject United States banner if the article doesn't contain an infobox - add needs-image=Yes
- Add the unref parameter to the WikiProject United States banner if the article doesn't contain any references - Add unref=yes
- Remove the unref parameter from the WikiProject United States banner if the article contains a reference
- Assess as Redirect class if the article is a redirect - Assess as Redirect if applicable
- Assess unassessed WikiProject United States articles by inheriting the class assessment from other banners if not assessed - Inherit assessment
- Good catch. That happened because I had multiple projects in the same regex. Basically I didn't want to assume that the assessment was correct for just any project and just used some that I knew had a high chance of being accurate. In this case New York and Texas. Since this article had both it added the banner for each. In the future I plan on only running it against one project at a time. So rather than having something that looks like this: (Texas|New York|Utah|etc.) I will only have (Texas). That way it will reduce the chances of things like this happening. For what it's worth I also periodically check for this scenario anyway, so any article that has multiple WPUS templates will be cleared up every couple weeks through routine maintenance I do manually. --Kumioko (talk) 15:18, 31 January 2012 (UTC)
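The periodic check for pages carrying more than one WikiProject United States banner could be approximated as below. This is only a sketch under assumptions: the class name DuplicateBanners is hypothetical, and the pattern counts a banner only when its name is followed directly by "|" or "}", so that other projects whose names merely start with "United States" are not counted.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DuplicateBanners
{
    // The trailing [|}] stops the match from accepting longer template
    // names that only begin with "WikiProject United States".
    private static readonly Regex Banner = new Regex(
        @"\{\{\s*WikiProject[ _]+United[ _]+States\s*[|}]",
        RegexOptions.IgnoreCase);

    // Number of WikiProject United States banners on the talk page.
    public static int Count(string talkText)
    {
        return Banner.Matches(talkText).Count;
    }

    // True when the page needs the manual clean-up described above.
    public static bool HasDuplicates(string talkText)
    {
        return Count(talkText) > 1;
    }
}
```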
- A few notes:
- Task 1: An approved bot task seems to have conflicted with the new bot tasks here and here, here and here. How do you plan on improving the logic to handle this? (E.g. not adding WikiProject United States if WikiProject Massachusetts is already on the page, knowing it'll be converted to WikiProject United States later?)
- Task 2: Is your approved bot task going to fix the seeming redundancy here?
- Task 4: Conflict similar to task 1 here and here. Have you excluded everything but the Talk namespace now per this?
- Task 5: Someone else seems to have done the work of the bot here; is there any way these tasks can be run simultaneously so the bot won't have to make multiple edits?
- Your contribs links for task 6 and 7 are the same, incidentally, but I checked out the task 7 edits and don't see any problem.
- In summary, I think you need to look at consolidating all of your tasks related to WikiProject United States into one AWB module/edit. Perhaps you were already going to do so and the issues above were caused by trialling each task by itself, in which case, disregard that. Also, I wouldn't recommend this task be approved for automatic unsupervised. It can be semi-automatic or automatic supervised but the edits need to be checked at some point; templates are complex enough that even the best regexes can accidentally fail to match and/or break them. — madman 18:42, 2 February 2012 (UTC)[reply]
- On task one, that was just used as an example to show the functionality. The edits were done because the article started with United States and the fact that Massachusetts got included as a supported project was just a coincidence.
- On task 2, there is no redundancy, I added the USMIL=yes parameter to WPUS to indicate the article was in the scope of both projects, just as MILHIST has an Aviation task force at the same time as the Aviation WikiProject tags the article.
- On task 4, the bot does catch a lot but not all, nor can it until we can standardize the layout of how the WikiProjects and parameters display. Not to get on a soap box here but I want to standardize all of the WPUS articles to have a consistent layout: class first, then importance, then the supported projects in alphabetical order, then support parameters and then listas. If I was allowed to do that I could write logic that would make sure that all the parameters were in a certain order and were there (such as articles with no Class or Importance) and then I could easily and cleanly make sure things like this don't happen. Unfortunately I am not allowed to do that so all I can do is try and catch as many as I can of the hundreds of variations as I go along. A lot of the time the bot catches it but sometimes it doesn't. That edit occurred because, as I have mentioned before, there are hundreds of variations of how the WikiProjects and their parameters display. I don't have logic built to account for WikiProject and image needed without class or importance. Additionally, because there are a number of other projects that start with United States I can't just assume that it's the US project, so I have to do a manual cleansing run, where I go through using AWB and manually scan through all the WPUS articles looking for dual banners and then fix them.
- On the 2nd question of 4, yes.
- On task 5, unfortunately no. The problem is that for some things like needs infobox I first pull in all the articles under needs-infobox=yes, then scan through and see which ones have one, and then remove the unneeded parameter. The opposite is also true. Eventually I hope to be able to craft some code that can look at the main page and then make the appropriate edit, but unfortunately neither I nor anyone else I have asked seems to know how to do that.
- For the summary, most of the tasks are consolidated but most of the problems are due to a lack of standardization of the WikiProject banner layout or my ability to do anything about that. Frankly, I am not the greatest programmer and I freely accept and admit that. I only created the bot and started to do the tasks myself because after about a year of asking other people to do them I couldn't get the tasks done otherwise. I continue to make corrections and refine the process as I go along and it continues to get better, but due to the layout of the WikiProject banners and the screwy policies we have in place that prevent me from fixing that, I am limited in what I can do. If you choose to make this a manual or semi-auto bot then there's no point because I am not going to manually do tens of thousands of edits. I also don't intend to sit and go blind staring at my monitor while 10,000 plus edits go by. I do random spot checking, I do a lot of manual checks and balances and cleansing, and as I continue to refine the process and as I repeatedly go through the articles I will catch and fix anything that the bot does wrong. I also fix any errors the bot creates as they are presented to me and I would be glad to post the code for anyone who feels compelled to make it better. For what it's worth, even when I had other bots such as Dodobot or Xenobot do the tagging and assessing they made quite a few errors such as wiping support parameters like needs infobox or image, breaking other templates, or carrying over garbage from other projects as part of the auto assessment as well, so I don't think it's fair to expect perfection from this bot either. --Kumioko (talk) 20:56, 2 February 2012 (UTC)
So after thinking about the problem and playing with the code for the last couple hours I think I worked out a solution that should fix most of these problems and at the same time allow me to simplify the code a bit. In the below examples I move the importance and then the class directly after the banner for the project affected. This will allow the coding to be somewhat more standard and allow the logic to work in a lot more cases.
//Make importance come directly after WikiProject banner X (next section will move class before that)
ArticleText = Regex.Replace(ArticleText, @"{{\s*(WikiProject[ _]+Massachusetts)[ ]*\|(.*?)\|[ ]*importance(.*?)\s*([\|}{<\n])", "{{$1|importance$3|$2$4", RegexOptions.IgnoreCase);
//Make class come directly after WikiProject banner X
ArticleText = Regex.Replace(ArticleText, @"{{\s*(WikiProject[ _]+Massachusetts)[ ]*\|(.*?)class(.*?)\s*([\|}{<\n])", "{{$1|class$3|$2$4", RegexOptions.IgnoreCase);
Please let me know if you see a better or cleaner way but this is what I have come up with in my limited programming ability. --Kumioko (talk) 04:42, 3 February 2012 (UTC)[reply]
Sorry, to clarify, the main problem is parsing complex WikiProject templates? --Chris 10:58, 6 February 2012 (UTC)[reply]
- Partially, but it's not just the complex ones. I have to account for a lot of variations. So consider that the banner allows for just class and importance. I have to account for no parameters, one or the other, both, with and without data, and parameters displaying in a different order, plus some variations (like priority instead of importance) and in some cases upper and lower case characters. When you expand that to include other parameters it starts to get pretty complicated. With that said I think I fixed most of the problems by adding the above code. --Kumioko (talk)
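One alternative to chaining positional regexes, sketched here only as an illustration and not as the approach the bot uses, is to split the banner into its parameters and sort them, so that order and letter case stop mattering. The class name BannerOrder is hypothetical. The sketch assumes parameter values contain no "|", "{" or "}", and it rewrites a multi-line banner onto a single line.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class BannerOrder
{
    // A WikiProject Massachusetts banner with at least one parameter.
    // Assumption: no nested templates or piped links inside the banner.
    private static readonly Regex Banner = new Regex(
        @"\{\{\s*(?<name>WikiProject[ _]+Massachusetts)\s*\|(?<body>[^{}]*)\}\}",
        RegexOptions.IgnoreCase);

    // class sorts first, importance (or its "priority" variant) second,
    // everything else afterwards.
    private static int Rank(string param)
    {
        string key = param.Split('=')[0].Trim().ToLowerInvariant();
        if (key == "class") return 0;
        if (key == "importance" || key == "priority") return 1;
        return 2;
    }

    public static string Reorder(string talkText)
    {
        return Banner.Replace(talkText, m =>
        {
            var parts = m.Groups["body"].Value
                .Split('|')
                .Select(p => p.Trim())
                .Where(p => p.Length > 0)
                .ToList();
            if (parts.Count == 0)
                return m.Value;

            // OrderBy is a stable sort, so the remaining parameters
            // keep their original relative order.
            var ordered = parts.OrderBy(Rank);
            return "{{" + m.Groups["name"].Value + "|"
                   + string.Join("|", ordered) + "}}";
        });
    }
}
```

The trade-off is that this normalizes whitespace inside the banner as a side effect, which may or may not be acceptable for a given run.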
Trial complete.
Trial 2
Approved for trial (100 edits or 5 days), using the new code addition right above this section. Please provide a link to the relevant contributions and/or diffs when the trial is complete. MBisanz talk 15:31, 6 February 2012 (UTC)
- Thank you I'll do that as soon as I get home tonight. --Kumioko (talk) 15:53, 6 February 2012 (UTC)[reply]
Ok, I got a run tested here. It's a bit over 100 because I completely rewrote the section of coding that deals with adding the WikiProject banner when missing. I also included its own edit summary. That code is as follows:
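//Add WikiProject United States banner if missing
//Match the banner with or without parameters, so an already-tagged page is skipped
Regex header = new Regex(@"\{\{\s*WikiProject[ _]+United[ _]+States\s*(\||\}\})", RegexOptions.IgnoreCase);
Summary = "Adding {{WikiProject United States}}";
//Skip pages that already carry the banner, and anything that is not a talk page
Skip = (header.Match(ArticleText).Success || !Namespace.IsTalk(ArticleTitle));
if (!Skip)
    ArticleText = "{{WikiProject United States}}\r\n\r\n" + ArticleText;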
I am still working on the assessment coding which was never part of this BRFA anyway so I will submit that once its built. This just adds the bare template which puts the article in the scope of the project for now. --Kumioko (talk) 21:56, 7 February 2012 (UTC)[reply]
- It appears the bot is adding one blank line after the {{WikiProjectBannerShell}} for each entry it makes. Is this intentional? Josh Parris 04:28, 8 February 2012 (UTC) Preemptive clarification: each entry within the shell causes another blank line after the template: http://en.wikipedia.org/w/index.php?title=Talk:United_States_Post_Office_%28Greenwich,_Connecticut%29&diff=prev&oldid=475649387 and http://en.wikipedia.org/w/index.php?title=Talk:United_States_occupation_of_the_Dominican_Republic_%281965%E2%80%931966%29&diff=prev&oldid=475648389 for example. Josh Parris 04:54, 8 February 2012 (UTC)[reply]
- As for the extra lines, AWB has done that for some time whenever it adds the WikiProjectBannerShell template. I have mentioned it to them before but they haven't had a chance to fix that yet I guess. I can probably add some logic to remove some of those. --Kumioko (talk) 05:00, 8 February 2012 (UTC)[reply]
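The clean-up logic mentioned here could be as small as collapsing runs of blank lines. The following is a sketch under assumptions, not AWB's or the bot's code: the helper name BlankLines.Collapse is hypothetical, and it is applied to the whole page text, so it would also tighten any intentionally wide spacing elsewhere on the page.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BlankLines
{
    // Collapses two or more consecutive blank lines into a single one,
    // reusing whichever line ending (\n or \r\n) starts the run.
    public static string Collapse(string text)
    {
        return Regex.Replace(text, @"(\r?\n)(?:[ \t]*\r?\n){2,}", "$1$1");
    }
}
```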
- Looking at http://en.wikipedia.org/w/index.php?title=Talk:United_States_of_China&diff=prev&oldid=475648480, I take it you're not doing point 4 of the task description. This article is a redirect; where did you get the tasking (what list, or who told you) to edit this redirect? This redirect is about China, and not about the USA except insofar as it has the words United States in the title. Josh Parris 04:39, 8 February 2012 (UTC)[reply]
- The list was articles Starting with United States. I had previously scrubbed a lot out of the list but I guess I missed this one. I can do the Redirect assessment however I have found that in many cases the assessment of the talk page doesn't equal the main page so I am going to scan through the main pages to get a list of redirects then I will populate the assessment. I am going to write some code that does the assessment piece and submit that as a separate BRFA so we can strike 7 and 8 both from this list. --Kumioko (talk) 04:57, 8 February 2012 (UTC)[reply]
- But it's not setting class=Redirect, and task 4 says it will. Josh Parris 05:30, 8 February 2012 (UTC)[reply]
- Yes it can, as written, but in order for me to do the assessment I have to scan the article's main page then switch over, so it can't currently be done at the same time as when I tag the article. I am trying to come up with a way of looking at the main page and applying the assessment or other action from the talk page without having to first scan the main page for each individual problem but I'm not there yet. --Kumioko (talk) 20:01, 9 February 2012 (UTC)
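The main-page scan for redirects comes down to a test on the article's wikitext. A minimal sketch follows, assuming the text of the main page has already been fetched. The class name RedirectCheck is hypothetical, and the pattern is a simplification of the redirect syntax (it accepts leading whitespace and an optional colon), so treat it as an approximation.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RedirectCheck
{
    // A page is treated as a redirect when its text begins with
    // #REDIRECT followed by a wikilink. Matching is case-insensitive.
    private static readonly Regex Redirect = new Regex(
        @"\A\s*#REDIRECT\s*:?\s*\[\[",
        RegexOptions.IgnoreCase);

    public static bool IsRedirect(string articleText)
    {
        return Redirect.IsMatch(articleText);
    }
}
```

Titles for which IsRedirect returns true could then be saved as the list used to set class=Redirect on the matching talk pages, or to exclude those pages from the needs-image, unref and needs-infobox runs.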
From Trial 1, does this seem right to you http://en.wikipedia.org/w/index.php?title=Talk:Gene_La_Rocque&diff=prev&oldid=472377489 ? Josh Parris 12:19, 11 February 2012 (UTC)[reply]
- Yep. It removed the infobox=yes because the article has an infobox and added listas from a previously approved BRFA. With that said, I have stopped doing the listas because I learned that WPBio auto applies the listas to all projects so you shouldn't be seeing that anymore. --Kumioko (talk) 12:44, 11 February 2012 (UTC)
- What about the removal of "|needs-infobox=no" from another template? Josh Parris 14:03, 11 February 2012 (UTC)[reply]
- If it is set to no then there's no need for it to be there. I don't usually do this unless I am doing something more significant, but if it's set to no then there is no need for it to be there. In fact most banners including Bio state it need not be there if set to no or is missing. --Kumioko (talk) 17:22, 11 February 2012 (UTC)
I've run out of things to whine about. MBisanz? Josh Parris 12:28, 12 February 2012 (UTC)[reply]
Approved. MBisanz talk 16:00, 12 February 2012 (UTC)[reply]
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.
Revoked. Per consensus at [1]. — madman 15:21, 18 February 2012 (UTC)[reply]
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.