Wikipedia:Bots/Requests for approval/LivingBot 11
- The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.
Operator: Jarry1250
Automatic or Manually Assisted: Supervised automatic
Programming Language(s): PHP, custom, basic framework.
Function Overview: "Fix" illogical headline hierarchies in articles.
Edit period(s): One off run, periodically after that.
Already has a bot flag (Y/N): Y
Function Details: This is a huge problem - over 14,000 illogical hierarchies (i.e. starting at level one or level three; jumping from level 2 to level and so forth) are listed at WP:CHECKWIKI, in various categories. With all the will in the world, some automated solution is needed here, and I think I have the perfect automated solution to this problem:
- Look at each heading in turn. Allow backward jumps (4 -> 2) but lessen any forward jumps > 1 level to 1 level (change 2 -> 4 to 2 -> 3).
- Assume that hierarchies which go 2 -> 4 -> 3 where the level four and level three are intended to be equal are very rare or non-existent and can be safely ignored.
- Hence assume that most hierarchies are for the most part logical, in that the editor wanted one of three things: to indent, to unindent or to stay level, all of which the bot can mimic (i.e. relative operations). It's just the levels that they get wrong (i.e. the absolute operations, such as right at the start of the article).
- Thus, the bot should, most of the time, preserve the ToC (this is "autofixed" by MW, you see).
- The only deviance from this should be when an involved editor in good faith has deviated from the logic to add "See also", "External links", or most likely "References" at the end of an article, giving it a level two regardless of the structure of the article.
- Assume that bumping all remaining level ones up to level twos is unlikely to break anything.
As you can see, a few assumptions there; these, however, should be comparatively minor. I'm happy to drop the whole project if a significant false positive is presented.
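For illustration only, the rules above might be sketched roughly as follows. This is not the bot's PHP source, which is not published in this request. It is a minimal Python sketch that assumes a stack-based renumbering, so that sibling headings stay siblings, and assumes level-one headings are simply treated as level two. The names `HEADING`, `fix_levels` and `fix_wikitext` are hypothetical.

```python
import re

# Matches a wikitext heading line such as "== Title ==" (1 to 6 equals signs).
HEADING = re.compile(r"^(={1,6})(.+?)\1[ \t]*$", re.MULTILINE)


def fix_levels(levels):
    """Renumber heading levels so that no forward jump exceeds one level."""
    stack = []  # (original level, corrected level) for each heading still "open"
    fixed = []
    for level in levels:
        level = max(level, 2)  # assumption: level-one headings become level two
        # A backward jump or a sibling closes every open heading at this depth or deeper.
        while stack and stack[-1][0] >= level:
            stack.pop()
        parent = stack[-1][1] if stack else 1  # level 1 stands for the page title
        new = parent + 1  # indent by exactly one level below the parent
        stack.append((level, new))
        fixed.append(new)
    return fixed


def fix_wikitext(text):
    """Rewrite every heading line in `text` using the corrected levels."""
    matches = list(HEADING.finditer(text))
    new_levels = iter(fix_levels([len(m.group(1)) for m in matches]))

    def rewrite(match):
        marks = "=" * next(new_levels)
        return marks + match.group(2) + marks

    return HEADING.sub(rewrite, text)
```

Under these assumptions, `fix_levels([2, 4, 4])` gives `[2, 3, 3]`, and a trailing level-two "References" after a run of deeper headings stays at level two. A real run would still need the supervision described above, since the sketch cannot tell what an editor actually intended.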
I would, quite frankly, be ashamed to release the sourcecode to this one at the moment, but, in lieu of that, I've turned it temporarily into a sort of tool: (one example). Just change the url to try it out on different pages, and be sure to report any errors to me.
Discussion
Helpful timestamp: - Jarry1250 (t, c) 18:27, 23 May 2009 (UTC)[reply]
- Okay, I've been having a look around, and I find a few people objecting on grounds of style. I don't think this really prohibits changing heading levels, but I thought I'd mention it. There was a VP debate about changing how they look, but until consensus is reached people should, I think, use the logical heading levels and skin it themselves (if they visually prefer level 4 headings to level 3 headings after level 2 headings, for example). - Jarry1250 (t, c) 08:26, 24 May 2009 (UTC)[reply]
- What's a "VP debate"? And what do you mean by "skin it themselves"? – Quadell (talk) 14:41, 24 May 2009 (UTC)[reply]
- Sorry, obviously got out of the wrong side of bed this morning (or hadn't had any caffeine at that time). For VP, I meant a Village Pump discussion (either miscellaneous or proposals, can't remember which). For "skin it themselves", I meant tweak their own monobook.css. I've edited the comment above, as well, to mellow it a bit. - Jarry1250 (t, c) 14:47, 24 May 2009 (UTC)[reply]
- Ah, okay. I was wondering what Vandal Patrol had to do with it, since I figured you didn't mean the other VP debate. :) And I'm glad to see that "skinning it" does not mean "removing the skin from". :) – Quadell (talk) 14:59, 24 May 2009 (UTC)[reply]
Well, I can't see any situations where this would be detrimental. Does anyone else? – Quadell (talk) 14:59, 24 May 2009 (UTC)[reply]
- I'm watching the talk pages of a few editors who are doing this manually at the moment; I'll keep this discussion posted on any concerns and how I would reply. (n.b. I don't want anyone even thinking about trialling this for a few days - we're talking about 16,000 edits here, people!) Oh, and I think I'll start a WP:VPM (That's village pump (miscellaneous) to you Quadell) discussion regarding the topic, not the bot. I don't want to scare people. - Jarry1250 (t, c) 15:02, 24 May 2009 (UTC)[reply]
Well, the discussion at VPM (oh, you kids these days!) seems to be positive. I'd recommend not running this in userspace or usertalkspace, and you might consider excluding all talkspaces, since it could be seen as a bot refactoring someone's comments, which some people will see as tantamount to an insult directed to one's mother. – Quadell (talk) 14:12, 26 May 2009 (UTC)[reply]
Approved for trial (50 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Let's see her in action. – Quadell (talk) 14:12, 26 May 2009 (UTC)[reply]
- Trial complete. - I gave it 50 to look at, and it edited all but 7, which was quite good going as it was battling against some serious lag and was set to exclude disambiguation and some other page. Don't worry, the lists given to the bot will solely refer to mainspace, and I wouldn't dream of running it in any other namespace. Edits are here, give or take; I don't mind adjusting the edit summary - maybe linking to an instruction page? I think I've looked through all of the edits. One I reverted, because it was late and it didn't look constructive. As it turns out, the headings were already messed up, and so the result was neither better nor worse.
The rest look perfect to me; I have left a note for the only person I have found objecting to this sort of automated edit, explaining the process. (S)he hasn't edited since I left that note. - Jarry1250 (t, c) 11:56, 27 May 2009 (UTC)[reply]
- Docu has raised some worrying diffs on my talk page; they shouldn't have happened and I'm going to try to find out what went wrong. - Jarry1250 (t, c) 12:52, 27 May 2009 (UTC)[reply]
- I get the sense a village is missing its idiot *sigh*. I wasn't syncing the files properly, and a number of edits I made to the "preview" tool (linked to above) hadn't been copied over to the "bot" version. At least it's something easily rectifiable. I'll have a word with docu about fixing the mistake the bot made in its first 50 edits; I think another trial may be called for. - Jarry1250 (t, c) 16:32, 27 May 2009 (UTC)[reply]
You didn't say the magic word. – Quadell (talk) 20:06, 27 May 2009 (UTC)[reply]
- Oops, I forgot I wasn't teaching Sunday School. Approved for trial (50 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. – Quadell (talk) 20:17, 27 May 2009 (UTC)[reply]
- Trial complete. A little explanation is needed here. After watching a few edits, I realised the bot was messing up some unicode characters (e.g. [1] and [2]) in addition to correcting the headlines. I immediately stopped the bot, reverted the handful of bad edits and slept on it. This morning, I realised that this was a simple character encoding issue, and promptly fixed it by saving the PHP file as UTF-8. I then ran out the remaining edits, including articles which had been edited and rolled back the night before. Those edits were successfully completed, proving the fix had worked (here and here). Since the trial edits, I have changed the edit summary to point to an information page rather than bugging the instruction in the main edit summary, and tweaked the code to be a little more efficient. - Jarry1250 (t, c) 12:09, 28 May 2009 (UTC)[reply]
I like this bot, but it's a tricky situation. This bot is taking cases that are clearly wrong, and trying to convert them to what the authors would have done had they been paying attention. It usually makes the page better, and sometimes it converts the hierarchy from one bad version to another. It's hard to tell, and takes a basic understanding of each article it touches in order to verify that it's doing the right thing. I don't see it ever making the page worse, but I'm not 100% certain. Could someone else look over the edits and see what you think? – Quadell (talk) 19:09, 28 May 2009 (UTC)[reply]
- I've debated this with a couple of people, and managed to work it out. I'm in the always-a-net-benefit-but-sometimes-more-than-other-times camp, obviously, but I am interested to hear opposing views before going large scale (one correspondent indicated that WP:OPERA might have something of a systematic problem with this, so I contacted them directly). - Jarry1250 (t, c) 19:15, 28 May 2009 (UTC)[reply]
- I've had a look through the edits and they seem good. Nice work :-) Sam Korn (smoddy) 18:58, 29 May 2009 (UTC)[reply]
Approved. Okay. – Quadell (talk) 19:40, 29 May 2009 (UTC)[reply]
- The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.