Wikipedia:Bots/Requests for approval/Plasticbot
- The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Denied.
Automatic or Manually Assisted: Automatic, unsupervised
Programming Language(s): AutoWikiBrowser
Function Summary: Adding non-breaking spaces between numbers and their units per Wikipedia:NBSP, a section of Wikipedia:Manual of Style.
Edit period(s) (e.g. Continuous, daily, one time run): Daily. Obviously these edits are not urgent, so I do not plan on running the bot during high-traffic periods. I have reviewed some traffic rankings and see that traffic drops off significantly during certain periods. Unfortunately none of the rankings (that I found) specified their time zones, so I don't know when these light periods are. I have a flexible schedule and am open to suggestions there.
Off the top of my head (and I am really just winging it here) I was thinking about limiting the edit speed to one edit every 15-20 seconds during the week and one every 8-10 seconds on the weekend.
Already has a bot flag (Y/N):
Function Details: It will use the following regex search and replace:
FIND: (\d) (mph|km|mile|mi|kilometer|mbar|knot|feet|ft|meter|m |m\)|metre|kilometre|inch|million|billion|foot|days|kt|millibar|mm|cm|dollar|USD|inHg|hPa|people|hour|liter|degree|°|year|month|square |sq )
REPLACE: $1&nbsp;$2
Obviously the find and replace will ignore wikilinks, interwiki, nowiki, image, refs, etc. per the AWB "Find and Replace" settings.
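For readers who want to try the pattern outside AWB, the following is a minimal Python sketch of the find-and-replace above. It is not the bot's code: the function name `add_nbsp` is hypothetical, the replacement uses Python's `\1` and `\2` where AWB's .NET syntax uses `$1` and `$2`, and it does not reproduce AWB's option to skip wikilinks, refs, and similar markup.

```python
import re

# Unit alternation copied from the FIND pattern above.
UNITS = (
    r"mph|km|mile|mi|kilometer|mbar|knot|feet|ft|meter|m |m\)|metre|kilometre|"
    r"inch|million|billion|foot|days|kt|millibar|mm|cm|dollar|USD|inHg|hPa|"
    r"people|hour|liter|degree|°|year|month|square |sq "
)

# A digit, a single plain space, then the start of one of the listed units.
FIND = re.compile(r"(\d) (" + UNITS + r")")


def add_nbsp(text: str) -> str:
    """Replace the space between a digit and a listed unit with &nbsp;."""
    return FIND.sub(r"\1&nbsp;\2", text)


print(add_nbsp("Winds reached 120 mph over 5 km."))
```

Because the alternatives carry no word boundary, a short alternative such as `mi` also matches the start of a longer word, so in this sketch `1941 military` becomes `1941&nbsp;military`.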
Discussion
It seems to me that this kind of operation would have a high number of false positives. I'm not going to make any claims since I'm not so hot with regular expressions, so I'll just ask you: have you run the expression yourself on large blocks of text to see if anything turns up that isn't wanted? tj9991 (talk | contribs) 21:19, 26 July 2008 (UTC)
- I started about 2 weeks ago running this on a few Tropical Cyclone articles that I was preparing for FAC. Everything checked out fine, so I ran it on a few dozen more Tropical Cyclone articles with the edit summary Adding non-breaking spaces. Please report errors. Obviously I was checking all of the changes manually, but the only errors that I found were false negatives where non-breaking spaces were required before units that I hadn't thought of. I ran the Find and Replace on several hundred more articles, checking each replacement manually, and as I was doing so I built up that long list of units you see in the FIND section. I still found no false positives in my manual checking, and no one reported any errors to my talk page. I loaded a list of all Tropical Cyclones and, over about a week, filtered through that. I wasn't familiar with Wikipedia:BOT at that point, and in retrospect I probably should have come here first, but c'est la vie. So anyway, about 3000 edits, and neither I nor anyone else has found a false positive. Plasticup T/C 21:34, 26 July 2008 (UTC)
- Here is a typical example if it helps. Plasticup T/C 23:49, 26 July 2008 (UTC)
This type of thing is what AWB's general fixes were made for. Edits that are purely cosmetic and trivial shouldn't be done alone. I would recommend either having this change put into AWB's general fixes or not doing the task at all. Thousands of bot edits that only change the type of a space are silly and unnecessary. --MZMcBride (talk) 13:39, 27 July 2008 (UTC)
- I would hardly call this "cosmetic and trivial". It is the first element of Wikipedia:Manual of Style (dates and numbers), and comes up in about 20% of Featured Article candidacies. The AWB general fixes are ones that do not affect the appearance of the article. Non-breaking spaces appreciably affect readability. Plasticup T/C 14:33, 27 July 2008 (UTC)
As a former BAG member I think this needs to be quickly denied under Wikipedia:BOT and per common sense. BOT states "does not consume resources unnecessarily" as a requirement for a bot. Changing only spaces is a complete waste of editing. If this were part of another, more serious edit then I would not have a problem allowing this bot to run, but as it stands this needs to be denied. βcommand 15:14, 27 July 2008 (UTC)
- You are doing it a disservice by saying that it is "changing only spaces". The changes ensure that numbers are always found with their units. Reviewed articles are always checked for non-breaking spaces. It is a staple of the Manual of Style and adhered to rigidly by the editors who seek to bring articles up to snuff. I can understand that a group which normally deals with functionality (categorization, tags, notifications, etc.) may not appreciate how large a portion of the community actually spends time making articles conform to the Wikipedia:MOS, but for us this bot would make a huge difference. Plasticup T/C 15:50, 27 July 2008 (UTC)
- And the MediaWiki devs who operate the servers would block your bot as a waste of resources. If you can make other productive edits along with these edits I would say go for it; otherwise just have it added to AWB's general fixes and wait for it to be done. This task also violates the usage rules of AWB. βcommand 15:53, 27 July 2008 (UTC)
- I don't think you are appreciating the value of these changes. If this has so little value then why has the community made it the first point of Wikipedia:Manual of Style (dates and numbers)? I understand the aversion to frivolous edits, but if conforming with this MOS requirement is demanded of every peer-reviewed and featured article it is fair to assume that the community has deemed it non-trivial. There is a broad consensus on this. Plasticup T/C 18:28, 27 July 2008 (UTC)
- Yes, there is a 'broad consensus' that this is what an article should look like, but making thousands of edits just to do it is a trivial change. I'd have to agree with the opinions given here by Betacommand and MZMcBride. Q T C 18:51, 27 July 2008 (UTC)
- The alternative is to have them painstakingly performed by various editors over several years. I am of the opinion that if it is worth doing at all it is worth doing right. Plasticup T/C 19:19, 27 July 2008 (UTC)
What is the source for the list to check? BJTalk 18:35, 27 July 2008 (UTC)
- Not sure if I am understanding the question, but are you asking which articles I would apply these changes to? I was planning on going through various Wikiprojects that I think will use lots of units (Geography, Meteorology, Volcanoes, etc.) and selecting their Top/High Importance articles, e.g. Category:High-importance Tropical cyclone articles. I would import the lists to Excel, chop off the "Talk:" prefix, and export to a text file. I have also thought about Category:Wikipedia featured articles and Category:Wikipedia featured article candidates. Plasticup T/C 18:45, 27 July 2008 (UTC)
- While I disagree with Betacommand's assertion that the page writes are wasteful, checking non-targeted lists surely is. BJTalk 18:48, 27 July 2008 (UTC)
- What would you suggest? Scanning a database dump first to identify pages in need of correction? Plasticup T/C 19:17, 27 July 2008 (UTC)
- In my experience, over 95% of the Tropical Cyclone articles were missing non-breaking spaces. If I were to load Biographies of Living Persons I think that wasteful page loads might be an issue, but if I am smart about it I don't see that being a problem. Plasticup T/C 11:39, 28 July 2008 (UTC)
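The list-building step described in this thread (take the talk-page titles from an importance category and drop the "Talk:" prefix) could also be done without a spreadsheet. This is only an illustrative sketch: the function name `talk_to_article_titles` and the sample titles are hypothetical, and it assumes one title per line.

```python
def talk_to_article_titles(lines):
    """Turn talk-page titles into article titles, skipping blank lines."""
    titles = []
    for line in lines:
        title = line.strip()
        if not title:
            continue
        # Category members are talk pages, so drop the namespace prefix.
        if title.startswith("Talk:"):
            title = title[len("Talk:"):]
        titles.append(title)
    return titles


print(talk_to_article_titles(["Talk:Hurricane Katrina", "", "Talk:Typhoon Tip\n"]))
```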
I personally think this is useful, even if minor, and I don't think it's going to strain the servers. (Contrary to some assertions above, the MediaWiki devs will not block an approved bot making changes in line with MOS, even if the benefit seems minimal.) I can't see any harm -- to servers, or articles, or anything else -- for this bot to be approved. However, even if it's not approved, it might be nice to have a list of "pre-approved additional changes" that bots can make if approved for other tasks. And this would certainly be on it. For instance, I could add this change to the minor fixes Polbot #8 does, if it already needs to change an article for other reasons. – Quadell (talk) 13:40, 29 July 2008 (UTC)
- This plan works as well, and actually makes fewer wasteful edits than my plan below. I support it. BJTalk 13:42, 29 July 2008 (UTC)
I am traveling for the next 7 days and will not be available to reply. Hopefully that won't be too much of a disruption. Plasticup T/C 03:20, 31 July 2008 (UTC)
The idea that making a bunch of small edits would "waste server resources" is quite silly. The edits this bot would make are a drop in the bucket compared to what the servers already deal with. I would recommend approving this bot. rspeer / ɹəədsɹ 17:08, 31 July 2008 (UTC)
- I'm gonna have to agree with rspeer here. Approving this only as an addition to other changes is a second choice. – Quadell (talk) 11:58, 4 August 2008 (UTC)
I've been incorporating this logic into Polbot's 8th task, and I've found a difficult false-positive. [[8 mm film]] should not be changed to [[8&nbsp;mm film]]. This will be a challenge to avoid. – Quadell (talk) 18:18, 5 August 2008 (UTC)
- AWB has an option "Ignore templates, ref, link targets and headings" which I enabled to avoid that problem. Plasticup T/C 21:49, 6 August 2008 (UTC)
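For a bot that does not run inside AWB, one way to avoid the [[8 mm film]] case is to apply the replacement only to text outside wikilinks. The sketch below is an assumption about how that could be done, not a description of AWB's option: the names are hypothetical, the unit list is shortened for illustration, and it skips the whole link (target and label), which is more conservative than skipping link targets alone. Templates, refs, and headings are not handled.

```python
import re

# Shortened unit list for illustration; the full list is in the FIND pattern.
FIND = re.compile(r"(\d) (mm|km|mph|ft)")
# A whole wikilink such as [[8 mm film]] or [[target|label]].
WIKILINK = re.compile(r"\[\[.*?\]\]")


def add_nbsp_outside_links(text: str) -> str:
    """Insert &nbsp; between numbers and units, leaving wikilinks untouched."""
    parts = []
    last = 0
    for link in WIKILINK.finditer(text):
        # Fix the plain text before the link, then copy the link verbatim.
        parts.append(FIND.sub(r"\1&nbsp;\2", text[last:link.start()]))
        parts.append(link.group(0))
        last = link.end()
    # Fix whatever follows the final link.
    parts.append(FIND.sub(r"\1&nbsp;\2", text[last:]))
    return "".join(parts)


print(add_nbsp_outside_links("Shot on [[8 mm film]] with a 25 mm lens"))
```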
- {{BAGAssistanceNeeded}}
- It has been a little more than a week since any significant discussion has taken place. If we agree that the changes are useful I would like to ask for some trial edits. Apologies for the {{BAGAssistanceNeeded}} template; I hope it is not too obnoxious. Plasticup T/C 15:10, 8 August 2008 (UTC)
...Honestly, I think I would prefer to see smaller changes like this done at the same time as other smaller/larger changes, something like AWB's 'Minor fixes' or BJ's idea below. Regarding the above, making a list of things like this that any bot may (and is encouraged to) do while going about its normal edits sounds like a good idea. How many articles are we talking about, by the way? Sorry if I'm late getting here. SQLQuery me! 10:19, 9 August 2008 (UTC)
RefSpaceBot was recently denied on the same grounds that people are bringing up here. One solution that came up on IRC was to merge all the 100% safe fixes into a single bot, so with every edit more is actually getting done. It would only edit when a threshold is crossed to avoid making pointless edits. BJTalk 20:39, 27 July 2008 (UTC)
- That sounds like a pretty good compromise. I'm sure that someone familiar with Bots/Requests for approval can think of several such projects. Does AWB include functionality for a minimum number of changes or would this require some external code? Plasticup T/C 20:51, 27 July 2008 (UTC)
- This would require a bot (not AWB); I'm willing to do the coding if it gets support at Wikipedia talk:MoS. BJTalk 20:55, 27 July 2008 (UTC)
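The threshold idea in this thread can be sketched roughly as follows. No such bot is specified in the discussion, so everything here is an assumption made for illustration: the function names, the second fix (trailing whitespace), the shortened unit list, and the threshold value of 3. Each fix reports how many changes it made, and the page is only saved when the total reaches the threshold.

```python
import re
from typing import Callable, List, Optional, Tuple

# A fix takes page text and returns (new text, number of changes made).
Fix = Callable[[str], Tuple[str, int]]


def nbsp_fix(text: str) -> Tuple[str, int]:
    # Shortened unit list for illustration only.
    return re.subn(r"(\d) (mm|km|mph|ft)", r"\1&nbsp;\2", text)


def trailing_whitespace_fix(text: str) -> Tuple[str, int]:
    # Hypothetical second "safe" fix: strip spaces and tabs at line ends.
    return re.subn(r"[ \t]+$", "", text, flags=re.MULTILINE)


def apply_fixes(text: str, fixes: List[Fix], min_changes: int = 3) -> Optional[str]:
    """Return the fixed text, or None if too few changes justify an edit."""
    total = 0
    for fix in fixes:
        text, count = fix(text)
        total += count
    return text if total >= min_changes else None


fixes = [nbsp_fix, trailing_whitespace_fix]
print(apply_fixes("10 km \n5 mph\n3 ft", fixes))
print(apply_fixes("Only 10 km", fixes))
```

Returning `None` below the threshold keeps the decision to skip a page separate from the code that would save it.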
Denied. I can certainly see the benefits of this bot-task, but there seems to be consensus that it doesn't pass all of our requirements when run on its own. This task could be added to any approved bot (so long as it only edits pages that would have been edited for other reasons) but I can't approve running this task on its own. – Quadell (talk) 13:45, 9 August 2008 (UTC)
- The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.