Wikipedia:Bots/Requests for approval/SharedIPArchiveBot 2
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Request Expired.
Operator: Petrb (talk · contribs)
Time filed: 21:22, Friday November 4, 2011 (UTC)
Automatic or Manual: Automatic
Programming language(s): c#
Source code available: yes http://code.google.com/p/sharp-wikibot/source/browse/branches/bot2/wiki_bot_core/wiki_bot_core/
Function overview: Perform archiving of shared IP talk pages with old messages and update their header templates
Links to relevant discussions (where appropriate): Shared IP talk page archiving proposal on VPR
Edit period(s): daily
Estimated number of pages affected: ~20000
Exclusion compliant (Y/N): Y
Already has a bot flag (Y/N): N
Function details: As part of a short-term A/B test, this bot will run through half of the pages in Category:Wikipedia user talk pages of shared IP addresses and set up an archiving system for old, outdated talk page messages. The frequency of archiving will be decided per the discussion on VPR, and the bot will continue to systematically archive old messages on these pages for the duration of the test. For these pages, it will also replace the current shared IP header templates with a slightly altered version that more prominently encourages new users to create an account (see the redesigned templates here). The bot will not archive:
- pages that do not contain one of the 11 specified header shared IP templates
- pages that have been edited within the time agreed upon by community discussion (i.e., pages where any kind of live discussion is still happening)
- current block notifications
- pages that are already archived
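The exclusion list above can be read as a single eligibility check. The following is a minimal sketch under stated assumptions, not the bot's actual C# code: the `TalkPage` fields, the `should_archive` name, and the `quiet_period` parameter are all hypothetical, and a current block notice is treated here as a reason to skip the whole page.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class TalkPage:
    """Hypothetical summary of a shared IP talk page (not the bot's real model)."""
    has_shared_ip_header: bool      # carries one of the specified header templates
    last_edited: datetime           # timestamp of the most recent edit
    has_current_block_notice: bool  # a block notification that is still current
    already_archived: bool          # the page was archived on an earlier run


def should_archive(page: TalkPage, now: datetime, quiet_period: timedelta) -> bool:
    """Return True only if none of the four exclusions applies."""
    if not page.has_shared_ip_header:
        return False
    if now - page.last_edited < quiet_period:
        return False  # edited too recently; discussion may still be live
    if page.has_current_block_notice:
        return False  # assumption: skip the page rather than archive around the notice
    if page.already_archived:
        return False
    return True
```

The quiet period is left as a parameter because the request defers it to community discussion.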
Discussion
- This bot will be assisting the Wikiproject User warnings testing task force in an A/B test on shared IP talk pages. The purpose of the test is to see if archiving old messages on shared IP talk pages produces any positive effects. Currently, readers who open a Wikipedia page at a coffee shop or library are likely to see the "You have new messages" banner and be directed to a talk page with dozens of old warnings that were not meant for them. We suspect that vigorously archiving old messages sent to shared IPs might reduce the likelihood of users being hit by irrelevant messages, which might prevent good-faith contributors from being discouraged from editing. We also hypothesize that old warnings might encourage rather than discourage vandalism, as per the Broken windows theory. After we run the test, we will be able to present data on what kinds of edits came from those addresses and how many new users registered accounts from them, which will give us some indication if archiving positively affects the community. Maryana (WMF) (talk) 22:11, 4 November 2011 (UTC)[reply]
- FYI, result of the VPR discussion is here. So, the bot would archive every 2 weeks (unless there's a live block on the talk page), and the test would run for 2 months. Maryana (WMF) (talk) 21:32, 14 November 2011 (UTC)[reply]
Ok, there seems to be some consensus for that. Approved for trial. Please provide a link to the relevant contributions and/or diffs when the trial is complete. How does this task relate to Wikipedia:Bots/Requests for approval/SharedIPArchiveBot? Do they need to be approved at the same time as part of the A/B testing, or are they independent of each other? As far as testing this, I'd ask that for the first run or two, you reduce the number of edits to about 50/100, just in case something goes wrong; once any niggling problems have been fixed, you can start going with larger numbers of edits. --Chris 03:31, 17 November 2011 (UTC)[reply]
- It's running now, first: I didn't really understand how many edits am I allowed? What 50/100 means? is that 50 edits? Bot is running as "Petan-Bot" because it's crucial to have bot flag for this task, (it changes talk page). Petrb (talk) 09:54, 17 November 2011 (UTC)[reply]
- You're allowed to have as many edits as you need to trial the bot (within common sense, e.g 50,000 edits within one day would not be appropriate). All I meant by 50/100, was before doing a large trial, you should do a small run of between 50 to 100 edits to make sure there aren't any problems with the code (e.g. you don't want to do 500 edits and then suddenly have to revert all of them). It's mainly just common sense, but I thought I would make sure, just in case.--Chris 10:49, 17 November 2011 (UTC)[reply]
- It's necessary to get a bot flag before it can operate using this account, is it possible? Petrb (talk) 11:01, 17 November 2011 (UTC)[reply]
- Normally, bots are not flagged until approved. However, this is a "special case". It needs to be flagged, so that it does not cause the users to get a "You Have New Messages" alert, and then be faced with no new message. Therefore, I hope we can make an exception.
- Petrb, I'd like to see - and check - a small number of edits, such as 50 to 100 pages - and have the chance to manually review them, before scaling up.
- Also, can you please clarify what it'll do - specifically;
- Does it only act upon pages that have not been edited for <timeframe> (I believe consensus was 2 weeks, is that correct?)
- Does it only either archive the entire page, and put a shared IP header? Does it not remove any partial pages?
- Does it skip users that have been blocked within <timeframe>?
- Sorry to ask so many questions, but I think some of the ideas given in the bot-request might've shifted a bit, following the discussion; for example, the request here says it'll only skip "current block notifications", and I think the discussion shows that several users were concerned about removing block notices immediately after they expire. Chzz ► 13:33, 17 November 2011 (UTC)[reply]
- Although my main concern remains the apparent removal of block notices immediately upon expiry, instead of later [1], I have some other queries about this bot, too; including a) why we're adding miszabot auto-archiving (which AFAIK has never been discussed), b) on the VPR I thought we were going to add an indication that previous warning/messages had been removed (and possibly how many) but that doesn't seem to have happened. I'm also concerned that the test was performed on several hundred pages (and using another bot account) [2] and I think, at this point, it would be best if we could evaluate the edits and resolve these questions before resuming testing. Chzz ► 15:53, 17 November 2011 (UTC)[reply]
- Both was solved. concerning b - I don't know about it / no one requested that either on proposal of bot task (in my userspace), concerning other account it was done because this bot didn't have bot flag in the time Petrb (talk) 16:04, 17 November 2011 (UTC)[reply]
- If you still want to me stop the bot, say it here, I believe that all concerns were solved. Petrb (talk) 16:58, 17 November 2011 (UTC)[reply]
- Yes, STOP - it's supposed to be approved for a trial of 50-100 edits, and we wanted to check them. In addition to the run on the first account, SharedIPArchiveBot (talk · contribs) now seems to have made over 5000 edits! Chzz ► 02:27, 18 November 2011 (UTC)[reply]
- Shutdown for the moment --Chris 03:11, 18 November 2011 (UTC)[reply]
- 100 edits? I didn't notice I was restricted to that count, I specifically asked how many we can do, according to edits: the task isn't that simple it's using algorithm which archive everything what looks like messages + specified elements, since not all elements are defined now, it may happen that bot forget to remove some template etc. however once I define it, it will be removed next time (reply to my answer on TP, and probably question you also wanted to ask). Petrb (talk) 08:45, 18 November 2011 (UTC)[reply]
I do not have time to check the edits in detail right now; I hope to write more soon, but it may be a few days. For now, I will make some brief comments. To be honest, I am annoyed and frustrated - because I, and others, have gone to some effort to discuss/reach agreement upon the operation of this bot for a trial, but our comments seem to have been disregarded. I understand Petrb's desire to get the task going ASAP, and saying you're kinda "just the programmer" and do what you're told to do. However, what you've apparently been asked to do does not match the consensus/agreements. I have asked - of various people, on the VPP thread/on this BRFA/on the 1st-task BRFA - several clear, straight-forward questions - such as, I will try to say it more clearly: Will the trial remove block notices within 'x days' of the block ending [3] - and clearly, that's not what has happened. In addition, we've had another communication breakdown, resulting in the BRFA-trial being performed on thousands of pages, instead of a much smaller number. And, trying it out using another bot account that wasn't approved for this task, just because it was flagged, wasn't a good idea; but let's brush over that one.
- The idea of BRFA, as I understand it, is as follows; <Chris G/BAG, of course feel free to correct me>
- A) A user explains what the bot will do.
- B) People comment, discuss it, and we refine/address concerns
- C) It is approved for trial - typically, for between 50 and 100 edits
- D) Users review the edits, check for problems, and discuss solutions
- Steps C and D are repeated as necessary
- E) It is approved
- In this case, we've had a misunderstanding - possibly because of the two meanings of the word "trial", in this case;
- 1. As part of the BRFA process, the bot goes through a trial, to try and sort out any problems
- 2. This bot is designed to perform a 2-month trial, which was discussed on Wikipedia:VPR#Proposal: Shared IP talk page archiving.
- Unfortunately, some of the things discussed in that VPR thread do not seem to have come through here to BRFA. In particular, users wanted the bot to only archive pages that had not been used for a certain time, and not to remove block notices until some time after they had expired. I think consensus was for two weeks, in both cases.
- Another thing discussed in the trial that has not come through to this BRFA is: The message left on the IP talk pages should indicate that previous warnings had been removed. That helps users looking at the talk-page, to see what has happened. "it could be 90% welcome message, with a small note that the IP has a history, with a link to where the warnings are archived. The note would not even need to mention the history is of warnings. That way we are friendly to new users of the IP, but the information is readily available to those who know what the message means. Monty845 18:26, 28 October 2011 (UTC)"
- Minor point: looking at the source, it seems like the exclusion compliance code only looks for "nobots" - it could be better; see Template:Bots#C.23
- I will try to review some of the edits, ASAP. Chzz ► 12:20, 18 November 2011 (UTC)[reply]
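To illustrate the exclusion-compliance point, here is a simplified sketch of a check that honours both {{nobots}} and the allow/deny parameters of {{bots}}. It is an illustration only, not the reference snippet from Template:Bots and not the bot's own code; the function name `allow_bot_edit` is hypothetical, and the regular expression handles only a single `allow=` or `deny=` parameter.

```python
import re


def allow_bot_edit(wikitext: str, bot_name: str) -> bool:
    """Return False if the page text opts out of edits by bot_name (simplified)."""
    # {{nobots}} bans every bot.
    if re.search(r"\{\{\s*nobots\s*\}\}", wikitext, re.IGNORECASE):
        return False

    # Look for {{bots|allow=...}} or {{bots|deny=...}}.
    match = re.search(
        r"\{\{\s*bots\s*\|\s*(allow|deny)\s*=\s*([^}|]*)\}\}",
        wikitext,
        re.IGNORECASE,
    )
    if match is None:
        return True  # no exclusion template, or a bare {{bots}}

    kind = match.group(1).lower()
    names = {n.strip().lower() for n in match.group(2).split(",") if n.strip()}
    bot = bot_name.lower()

    if kind == "allow":
        # Only listed bots (or "all") may edit; allow=none therefore denies everyone.
        return "all" in names or bot in names
    # kind == "deny": deny=none allows everyone, deny=all bans everyone.
    if "none" in names:
        return True
    return not ("all" in names or bot in names)
```

A check that looks only for "nobots" would miss a page carrying, for example, `{{bots|deny=SharedIPArchiveBot}}`.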
Per the above, it is best that the trial be put On hold for the moment while further discussion takes place. Also, I'd like to apologise to Petrb for the miscommunication regarding the trial details. --Chris 13:00, 18 November 2011 (UTC)[reply]
- @Chzz:
- Sorry, don't take this offensive, but you need days, for what? You didn't address any issue which is happening right now, I updated the bot according to your request asap when you informed me about it on irc, so bot doesn't archive block notices of blocks which expired recently, concerning the edit count, I discussed this with Steven yesterday and he also understood this as we got approval to run unlimited number of edits for two months and yes I run it over 50 edits, examined them and considered it ready for more, the bot is running in debug mode and edit very slowly, so I am checking what is it doing, and I understand that you may have problem with me trying to handle this somehow faster, but keep in mind that we are discussing this over and over for several weeks, and it may surprise, but I also have real work and other stuff to do, which makes me pretty busy, so yes, I do not have a time to read all discussions related to this task, I created page with summary of task in my userspace and asked you, Maryana and Steven to update it according to what is being proposed on discussions, so that I can follow it and create the task according to what is there, instead of reading through huge discussions and finding out what is actually true, I apologize for being such an ignorant, but I really can't read everything everywhere, I made a configuration and shutdown page so you could at least shut it down yourself and tell me exactly what needs to be improved instead of telling everyone that bot is completely broken so we must stop it and that you will tell us what we should fix after few days. Your concerns:
- Bot is editing more than approved:
- false
- users wanted the bot to only archive pages that had not been used for a certain time, and not to remove block notices until some time after they had expired. I think consensus was for two weeks, in both cases:
- that's what happens now and happened even before.
- The message left on the IP talk pages should indicate that previous warnings had been removed. That helps users looking at the talk-page, to see what has happened.:
- is this part of consensus? yes / no? if so, I can insert it in 5 minutes
- Bot is archiving block notices of blocks which expired recently - false (since we discussed this and it didn't even happen for many or maybe for any users since the bot wasn't running for long time until it was fixed)
- So may I ask what is actually problem? Thank you Petrb (talk) 13:15, 18 November 2011 (UTC)[reply]
- The core problem right now is, you do not seem to understand BRFA. Your bot has not been approved for unlimited edits. It was approved for "about 50" edits. You've made over 5000, and that is why I shouted STOP. I, also, have other things to do (as do BAG members) - so I hope you will understand why it might be several days before we can check over some of the edits, and decide if it can be approved.
- It's your bot; you are responsible for what it does. Nobody else.
- You said, "the bot is running in debug mode and edit very slowly" - it was editing about 10 times per minute, so I've no idea how you could possibly be checking those 5000+ edits.
- Chris G/BAG, I'd really appreciate it if you could help me explain things more clearly. Chzz ► 19:37, 18 November 2011 (UTC)[reply]
- I still read in this thread that it was approved for more than 50 edits, Chris told me that 50000 edits in one would be nonsense, and that's not what I did, I don't know where you get the number from. Petrb (talk) 20:26, 18 November 2011 (UTC)[reply]
- I mean this: You're allowed to have as many edits as you need to trial the bot (within common sense, e.g 50,000 edits within one day would not be appropriate)
- I either didn't understand it correctly (if so I apologise for that, but I wasn't only one who didn't) or it was wrong interpreted. I started 50 - 100 edits, in debug mode where I needed to confirm each of them, and after review I started bigger run, since no other concerns than those I already fixed, were mentioned, I continued with run, until it was shutdown, it's true that there are more bugs, but those are very minor and will be fixed. Petrb (talk) 21:10, 18 November 2011 (UTC)[reply]
Hey, uh, why is this bot substing {{archive box}} when it archives shared IP talk pages? That should not be being substed. It is a dynamic template to show all of the archive subpages. And the fact that the bot has done this over 5000 times outside of trial... Logan Talk Contributions 20:02, 18 November 2011 (UTC)[reply]
- Fixed Petrb (talk) 21:36, 18 November 2011 (UTC)[reply]
I don't know if this is the correct place to put this, but anyway...
The Bot somehow overwrote an IP's talkpage so subsequent Warnings are ending up in the empty Hide/Show-Old Warnings section as seen in this edit.
Also, in this edit the bot archived all the posts from July 2008 through February 2011 but left the {{Old IP warnings top| Warnings and IP-Blocks date from July 2008 through January 2010.}} and {{Old IP warnings bottom}} intact. But they're empty now, so there's nothing to hide-show... --Shearonink (talk) 20:06, 18 November 2011 (UTC)[reply]
- Again: the task isn't that simple it's using algorithm which archive everything what looks like messages + specified elements, since not all elements are defined now, it may happen that bot forget to remove some template etc. however once I define it, it will be removed next time (reply to my answer on TP, and probably question you also wanted to ask) - I already noticed that, after update it would archive even this, archiving of such pages is much more complicated than what miszabot is doing, I am not archiving just talk threads here Petrb (talk) 20:31, 18 November 2011 (UTC)[reply]
- Fixed Petrb (talk) 21:36, 18 November 2011 (UTC)[reply]
I just received a nudge in wikipedia-en-help IRC chat channel that one of my warnings was somehow not showing up on the editors talk page, due to what seems to be an issue with the all-new SharedIPArchiveBot2. Since there was some talk in the help channel about the bot, and since i felt that i might be missing something since i have been absent from editing for three months, i decided to have a look around to see what it would be doing. And to be honest - i am rather, if not very concerned about the bot and the process around it.
- Right now the bot seems to have made over 5,000 edits even though it is on trial. I assume this is a miscommunication, but seeing that bug reports (Including my own) have already been posted we may now have anywhere between 2 and a few thousand broken talk pages. Ergo, if user 139.137.244.2 would now receive a warning (As he did), he would never see it due to the faulty trial. I equally notice that part of the edits have been made through user:Petan-Bot instead, which makes cleaning this more difficult - and which should not have been done in the first place for this reason.
- I am also somewhat concerned about the closure of [4] as "Strong consensus" as this is not what i see. While the majority of the editors seems to support this, the amount of editors weighing in on this result is quite marginal, especially if i compare it to the discussion about the (in my eyes) much more trivial "Does Wikipedia need a “share” button?" right above it. Was this discussion even raised on WP:CENT or WP:ANI? I can't seem to find any indication that it was.
Note that i might simply be missing pieces of the proposal here since it is somewhat hard to track every discussion since they are spread on several pages. If i missed anything please do let me know. However, if this was not raised on either ANI or CENT, i would strongly suggest raising it there regardless just to confirm consensus, as i believe that the vast majority of the editors who would comment on this proposal are not aware of it right now. Excirial (Contact me,Contribs) 22:46, 18 November 2011 (UTC)[reply]
- Concerning your report it was already fixed, problem was not on thousands pages but only on 5. Petrb (talk) 22:48, 18 November 2011 (UTC)[reply]
- Concerning Petan-Bot it was used because it was needed to have botflag for this task, and I wasn't sure I can get it without proving that task is working, anyway there are no mistakes done by User:Petan-Bot since I reviewed all edits there, so you don't need to be afraid of cleaning up that. Petrb (talk) 22:52, 18 November 2011 (UTC)[reply]
- Petrb, one thing you seem to have missed/not heard is, that in the VPP discussion, I believe general agreement was that pages should be either entirely archived, or not archived at all. That's because we were concerned about splitting up threads, or misrepresenting subsequent comments. For that reason, I do not understand why we're worried about complicated algorithms; to me, the agreed task seems much more simple: For pages tagged as "shared IP", we check if there's been activity within <2 weeks>; we also need to check if a block has expired within <2 weeks>. If neither of those are true, then the page can be archived (it'd make sense to use User talk:IP/Archive 1 for that, assuming none currently existed). Then it would replace the header with the 'new style' heading, which would include a small note, along the lines of "Stale warnings were automatically [difflink|removed] from this page." or similar. And that's it. At least, that's how I read the discussion/consensus. Chzz ► 23:27, 18 November 2011 (UTC)[reply]
- Really? What if "header" is in middle of page as on many pages? Or what if it's substituted in the middle so it's not just a header, what if there are some other special elements (categories) which are not to be archived. It's not that simple. Petrb (talk) 23:55, 18 November 2011 (UTC)[reply]
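Chzz's reading of the consensus (archive the whole page or nothing, with a two-week window for both recent activity and recently expired blocks) can be sketched roughly as follows. This is an illustration under stated assumptions rather than an agreed implementation: the function name, the `NEW_HEADER` placeholder, and the return shape are hypothetical, indefinite blocks are not modelled, and the sketch does not address the mid-page headers and categories that Petrb raises in his reply.

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple

WINDOW = timedelta(weeks=2)  # two weeks, per Chzz's reading of the discussion
NEW_HEADER = "{{new-style shared IP header}}"  # hypothetical placeholder, not a real template name
NOTE = "Stale warnings were automatically removed from this page."


def plan_whole_page_archive(
    ip: str,
    page_text: str,
    last_activity: datetime,
    block_expiry: Optional[datetime],
    now: datetime,
) -> Optional[Tuple[str, str, str]]:
    """Return (archive_title, archive_text, new_page_text), or None to skip the page."""
    if now - last_activity < WINDOW:
        return None  # activity within the last two weeks
    if block_expiry is not None and now - block_expiry < WINDOW:
        # Covers blocks that expired within two weeks and blocks still running
        # (a future expiry gives a negative difference, which is below WINDOW).
        return None
    archive_title = f"User talk:{ip}/Archive 1"  # assumes no archive exists yet
    new_page_text = f"{NEW_HEADER}\n{NOTE}\n"
    return archive_title, page_text, new_page_text
```

Because the page is moved as a whole, no thread is split, which is the concern the whole-page rule was meant to address.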
Start again?
I'm unhappy with the state of the pages it has acted on, and the archive-pages it has made, for several reasons. I cannot detail those with diffs right now, but we've already discussed some of the concerns above. With that in mind, would it be best if - ASAP - we just undo everything, THEN perhaps we could have a sane conversation (here), get things nice and clear, and come to agreement about what it is doing, and do a test on 50 pages. y'know, like we were supposed to in the first fucking place (expletive struck later, per [5] Chzz ► 01:11, 19 November 2011 (UTC))[under discussion][reply]
I don't think that would be too hard for SharedIPArchiveBot (talk · contribs) right now, as I think we could catch almost everything with special:nuke (5469 edits, 2645 of which are 'archive' pages).
It is slightly complicated because of the 363 edits performed under Petan-Bot (talk · contribs). But it's not that hard to reverse those too.
We've got into a bit of a mess, here; I'm not bothered about blame for why we've got here, but, here we are: I'm looking forwards, and sometimes it's necessary to take one step back before making progress.
Thoughts? BAG opinion? Chzz ► 23:19, 18 November 2011 (UTC)[reply]
- You should give us at least one valid to reason to revert 5000+ edits, since 99% are ok. 1% will be fixed soon Petrb (talk) 23:25, 18 November 2011 (UTC)[reply]
- I hope I don't need to remind you that between nuke and revert is very small difference, so it would be also big load for cluster, apart of that it's completely useless. Especially when you "just don't like it". What if I "just didn't like the way how all pages on wikipedia are written" would you nuke them? Petrb (talk) 23:28, 18 November 2011 (UTC)[reply]
- Please do not claim "I just don't like it" - that is not true, and to claim so is disingenuous; why on Earth do you think I am putting effort into this project, and trying to get it going? Do you actually believe my motivation is some form of deliberate stubbornness, malicious attack upon your character, or...well, what? I am striving - in the face of adversity - to help make this thing work. I resent the accusation that I "just don't like it". I find it startling that, now, you are worried about a "big load for cluster", when you've just completed 5000+ edits with a non-approved bot. Yes, I'm pissed off; because I've tried my best to help make this thing go smoothly, and many of my comments have been disregarded. But my being-pissed-off is beside the point, and not constructive. So,
- Despite that, here is a reason: The bot has created 2,645 archive pages [6]. In all cases, they are called "Archive", and not "Archive 1". In some cases, there were existing archives. In other cases, it partially archived the IP talk, which is in contradiction with the consensus shown at VPP. In some cases, it may have removed recent block notices, again in contradiction with consensus. In replacing the headings on the IP talks, it did not note that archives had been created (which was the apparent agreement at VPP). The current trial data will be distorted by the operations performed and now halted. I thus believe it is easier to "wipe the slate clean", put the problems behind us, and try again. This time, following the Wikipedia policy-based system for bot approval; by testing on a small number of pages, then giving users a chance to evaluate them. Chzz ► 23:53, 18 November 2011 (UTC)[reply]
- I don't really see a reason to nuke all those edits. There was a very low error rate and it did successfully archive almost all the pages without mangling threads, deleting content, adding the wrong templates, or archiving inappropriate things. I think we should spot check the edits it already made for basic technical acceptance, not do something ridiculous like delete thousands of edits in order to start over. And Chzz, please don't swear at Petr. Obviously there was miscommunication about the desired trial of the bot, and he's acting in good faith here to try and answer any concerns. Steven Walling (WMF) • talk 23:57, 18 November 2011 (UTC)[reply]
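The naming objection above (pages created as "Archive" rather than "Archive 1", sometimes where archives already existed) comes down to choosing the next free numbered subpage. A small sketch of that choice follows; the function name and the idea of passing in a list of existing subpage titles are assumptions made for illustration, not a description of the bot's code.

```python
import re
from typing import Iterable


def next_archive_title(base: str, existing_titles: Iterable[str]) -> str:
    """Pick the next numbered archive subpage under base, starting at Archive 1."""
    pattern = re.compile(re.escape(base) + r"/Archive (\d+)$")
    numbers = []
    for title in existing_titles:
        match = pattern.match(title)
        if match:
            numbers.append(int(match.group(1)))
    # Unnumbered pages such as ".../Archive" are ignored, so numbering starts at 1.
    return f"{base}/Archive {max(numbers, default=0) + 1}"
```

For a page with no archives this yields "Archive 1"; where "Archive 1" and "Archive 2" already exist it yields "Archive 3" instead of overwriting or duplicating either.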
What if we have the bot self-revert all but 50-100 of its edits? That seems better to me than nuking, though Petr might have concerns about the workload on the Toolserver. Steven Walling (WMF) • talk 01:00, 19 November 2011 (UTC)[reply]
- I'm not suggesting this measure for any kind of "this is WRONG!!" type reasons; purely, honestly, because it is often easier (with such matters) to just 'undo everything' and start again. We could try to analyse where it's right/wrong, and clean up but experience tells me, that'd be more hassle than a couple of our clever mops clicking a few buttons and BZZT; back to square-one, let's work from there. If I might make the comparison: it's like when a dozen editors have messed about with an article, and added 'something good/mostly bad' - it's just easier, technically, to go back to the last revision before the problem, and add back the good parts, than to try and work on the bad version. But, it's only a humble suggestion; if consensus is not to do that, I will merrily work with where-we-are-at, and try help sort it out. Chzz ► 01:19, 19 November 2011 (UTC)[reply]
- I think the reasons for the request are sound and I agree. It just seems illogical to me to delete test edits in order to make more test edits, when all we'd have to do is look at the ones already made. I'll let other people chime in though - to be honest I just want to move forward as well. Steven Walling (WMF) • talk 01:23, 19 November 2011 (UTC)[reply]
(edit conflict) The unfortunate reality here is that, the longer we wait, the more of the pages may subsequently be edited; so if we're gonna go for the 'nuclear option' we should do it soonest. Chzz ► 01:25, 19 November 2011 (UTC)[reply]
- Steven Walling (WMF), it really isn't illogical; you (we!) want to get meaningful data from this trial. Right now, we have a corrupt sample; we have various pages edited in various ways, which don't correspond to consensus. It will be very difficult to draw any conclusions from that. If we can step back, test, agree, and then press ahead, we could try to evaluate a 2-month trial; where we are NOW makes that real hard, because pages have been amended in a way that consensus doesn't agree to. Chzz ► 02:27, 19 November 2011 (UTC)[reply]
Ok, this is exactly what I didn't want to happen, and well it happened. The intention behind the 50-100 edits part of the trial was that we could have a small trial first to make sure there aren't any problems, and if there were any problems we only had to revert 50 edits not thousands of edits "(e.g. you don't want to do 500 edits and then suddenly have to revert all of them)". Sorry, I should have made myself clearer on that point. {{BAGAssistanceNeeded}}
Now we need to decide, what to do about these edits. On one hand, every edit is slightly broken, because the archive template was substituted when it shouldn't have been. There is also the possibility of other errors such as this which create future problems for warning users. So, is it worth performing a mass revert? Or do we just accept the damage and do our best to mitigate it --Chris 03:21, 19 November 2011 (UTC)[reply]
- Chris G, I am not interested in recriminations at this point, but I think the most expeditious answer to the current situation is to rv all the edits, to clear the decks, then try to move fwd. As I said above. Obviously, the sooner that is done, the easier it is. Chzz ► 05:31, 19 November 2011 (UTC)[reply]
- It's been made clear that every edit has the issue that either they should be using {{Archives}} or they should be edits to /Archive 1 not simply /Archive. Other mistakes have also been pointed out, and there are almost certainly more. However, reverting the bot's edits may be more complicated than simply nuking the pages and then using mass rollback. At the moment I'm reluctant to simply say "go ahead" to reversion, however, as Chzz said, the longer we wait the more complicated it gets. Chris, is the BAGAssistanceNeeded template asking for input from other BAG members about how to proceed, with regard to mass reverting/just continuing? Because my thoughts on that at the moment, would be that although I'm leaning towards mass reverting, the tricky part is figuring out how to do that without causing more damage, and I'm still open to persuasion that that may not be the best method. - Kingpin13 (talk) 07:08, 19 November 2011 (UTC)[reply]
- Right if someone could also take a look on what I am telling I have also some points:
- 1 at Chris There were 5 mistakes like that and all fixed
- 2 at Chzz I told you to update bot config where is defined whether it should be numbered or not, I told you that several times and you didn't do it, you knew it's gonna not use numbered archives several weeks ago, however you were waiting for it to do 5000 edits although you could have stop it yourself just to yell at me here. However move the 2500 pages is less harm than doing extra 10 000 edits nuking whole thing and starting it over
- 3 at Kingpin Is that so big problem? Even some pages archived by other bots did similar mistake, no one reverted it.
- I still don't see any major problem over all 5000+ edits, so could someone tell why is it worth reverting? However revert those edit's would be better, fix them would be best (IMHO), nuke them is non sense (IMHO) - @Chzz if you ever tried how nuke works - it's installed on hgwp where you have enough flags to try it out - you would see it's more than few clicks. Thanks Petrb (talk) 08:26, 19 November 2011 (UTC)[reply]
@Chzz
- 2.11. - I showed you how the proposed task should look: http://test.wikipedia.org/wiki/Special:Contributions/SharedIPArchiveBot
- You obviously didn't notice anything about numbered archives although it was clear it's not going to happen.
- 3. 11. - I linked you to User:Petrb/Proposed_bot_task_iptalk where you commented
- You didn't notice we should use numbered archives, I also told you that there is a config where it could be changed, you didn't do that either
- 17. I started first small trial at 20:59 GMT few minutes after that it was interrupted according to you concerns which I fixed - I told you there is shutdown link you should use in case that you find any issues
- All your concerns were fixed and no one notified me about numbers of archives
- 17. few hours after fix I continued with test and I again told you that there is shutdown button
- you knew it's running, you knew it did hundreds of edits and that it is going to do thousands (I told you that), however you wait until it did 5000 edits (if you stopped it after 500 edits you could hardly have something you could use against us).
And please keep in mind I am not telling you are anyhow responsible for that it has happened - it's my fault, I just tried to tell you that you could have inform me that there is something wrong you disagree with weeks ago, but you didn't
Now you are complaining with "mistakes" which aren't nearly mistakes and repeat over and over that we did too many edits, which is probably true (it was misunderstanding - I am sorry) but it's the only problem you have, still repeat it and you could prevent it from happening. You are constantly trying to find more and more problems in edits where there are nearly no issues, apart of those I already fixed, so you are still repeating the same thing. What is so big deal with that it wasn't numbered? The pages are never going to have large history, some of them are old 5 years and they have few templates - is it necessary to split it to more archives? And even if it was. It can be fixed. I don't see any reason to revert it. Other than that you just don't like it Petrb (talk) 09:14, 19 November 2011 (UTC)[reply]
- So if I summarize it: we have only one (big?) problem, namely that the bot "substituted" (it didn't really substitute it) the navigation template. Could someone please explain to me what is wrong with that, such that we need to revert all edits? Thanks Petrb (talk) 09:33, 19 November 2011 (UTC)[reply]
- And if you tell me what is exactly problem with that it's not being numbered I will be happy to come with some plan to fix that, however according to that histories are really going to be rather short, having one archive makes sense to me. Petrb (talk) 11:26, 19 November 2011 (UTC)[reply]
- I still think that we should keep it archiving to one big page, instead of several smaller, the reason is that people wanted all warnings together instead of splitting them, the pages are usually small and archive would hardly grow too big, and having it all in one would make it much easier to search certain templates or count number of warning templates. Petrb (talk) 19:12, 19 November 2011 (UTC)[reply]
I think it's time to move forward so: Yes, I have done some mistakes there is no doubt about it and I take full responsibility for that, however instead of arguing we should look forward to some solution,
Here is the list of issues I have found reviewing the edits:
- The bot was about to remove block notice from recently expired notices - fixed before it could do that
- The bot have done major error when archiving page due to missing definition of some templates and that it left template which was hiding new templates, this occurred on 5 pages and is fixed (thanks for report @Logan, Matthew)
- The bot substituted navigation template, although it's disputable if that is major issue, I can fix it of course - there are two possibilities:
1 - bot will fix all pages where it happened (2600+ edits which are not needed)
2 - bot will fix it next time when it's archived
- The bot didn't use numbering for archives, it was done purposefully, however there are some complains about it so, I am willing to "fix" that:
1 - the bot will move all (2600 moves which are not needed)
2 - the bot will start using it and leave existing archives until they are full, in that case it move Archive to Archive 1 and start another one.
Please let me know what solutions you like best. Petrb (talk) 09:35, 20 November 2011 (UTC)[reply]
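For illustration only, the second numbering option above (keep the existing unnumbered archive until it is full, then move it to "Archive 1" and start another one) could be sketched roughly as follows. This is not the bot's actual C# code: the function name `plan_archive`, the dictionary of page sizes, and the 100,000-byte limit are all assumptions made for the example.

```python
import re

# Assumed size limit; no threshold has been agreed in this discussion.
MAX_ARCHIVE_BYTES = 100_000


def plan_archive(existing, incoming_bytes, max_bytes=MAX_ARCHIVE_BYTES):
    """Decide where old messages from one talk page should go.

    existing maps archive subpage names ("Archive", "Archive 2", ...) to
    their current size in bytes. Returns (moves, target), where moves is a
    list of (old_name, new_name) renames and target is the subpage to use.
    """
    moves = []
    pages = dict(existing)

    if "Archive" in pages:
        if pages["Archive"] + incoming_bytes <= max_bytes:
            # The unnumbered archive still has room: keep using it.
            return moves, "Archive"
        if "Archive 1" not in pages:
            # The unnumbered archive is full: rename it to "Archive 1".
            moves.append(("Archive", "Archive 1"))
            pages["Archive 1"] = pages.pop("Archive")

    numbers = [
        int(m.group(1))
        for name in pages
        if (m := re.fullmatch(r"Archive (\d+)", name))
    ]
    if not numbers:
        return moves, "Archive 1"

    last = max(numbers)
    if pages[f"Archive {last}"] + incoming_bytes > max_bytes:
        # The newest numbered archive is full: start the next one.
        return moves, f"Archive {last + 1}"
    return moves, f"Archive {last}"
```

Under these assumptions a page with a nearly full "Archive" would get one move plus a fresh "Archive 2", while a page with a small "Archive" would be left untouched, which is the point of option 2: no mass run of moves.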
- Question - I am under the impression that the archiving will cause a "you have messages" bar, regardless of bot flag. Is it definite that it won't as suggested above? Rich Farmbrough, 21:02, 22 November 2011 (UTC).
- Unless things have changed recently, an edit which is marked as both a minor and bot edit does not trigger the message bar - Kingpin13 (talk) 21:13, 22 November 2011 (UTC)[reply]
I am coming to this late (just ran into an instance of the bot on a IP talk page). It's my understanding that the bot is archiving IP talk pages
- so new users assigned to a dynamic IP do not see the "You have new messages" banner. If that is so, then it seems to me the solution would be to turn off the banner if it stays up more than 2 weeks.
- to keep up appearances under the "broken windows theory". Note that the "broken windows" here are not the talk page warnings but the instances of vandalism and link spam that the anti-vandalism editors and bots are continually removing. That's not to say it might not help to keep the talk pages cleaned up but I'm waiting to see the results of the trial.
I think it's likely this bot will increase problems:
- At least for schools, I would hope that there might be institutional interest in monitoring vandalism by students and removing the warnings will lead most teachers or whoever might be monitoring to think everything is fine. To address this, perhaps the bot could summarize the warning and block statistics (a graph would be great) so the information is not hidden away.
- Archiving all but the last 14 days is too aggressive. It makes it a lot harder to detect persistent vandals and spammers, will increase the anti-vandalism workload and will increase the number of vandalism-only and spam-only accounts that go unblocked. I would prefer to see the bot leave six weeks of content or provide a summary analysis of what is being archived.
Jojalozzo 17:59, 12 December 2011 (UTC)[reply]
- Hi Jojalozzo,
- I understand your concerns, and we talked about this on the VPR thread. Consensus was for two-week archiving, and many people wanted even more rapid archiving (some even thought we should delete the old warnings altogether – that's actually what happens on German Wikipedia after 24 hours of no activity on a shared IP talk page). Here are some details to keep in mind:
- Tools like Huggle and Twinkle automatically reset to issuing level 1 warnings if there are no additional warnings issued to an IP talk page after 72 hours. So, if you're using them, you won't see any change at all to your vandalfighting.
- If an IP is blocked or if warnings keep coming in, the bot won't archive its talk page.
- We don't actually know if warnings have any deterring effect whatsoever on persistent vandals. Maybe they just make them more malicious. That's why a test like this is important.
- All of the old warnings will be available at the archive page (prominently linked to on the talk page), so this won't change your ability as a vandalfighter to go back and check for persistent vandalism.
- And, lastly, this is only a test, not a permanent change to the system. If this really does lead to an increase in vandalism and spam – which I highly doubt; otherwise, I wouldn't have suggested doing it :) – we'll have learned something very important about the value of warning messages.
- I hope this puts you a little more at ease. Let me know if you have any more questions. Maryana (WMF) (talk) 18:36, 12 December 2011 (UTC)[reply]
- I don't know if warnings have much effect either though I suspect they do work for some editors. I do think that blocks and school monitoring are effective and because blocking policy and school monitoring depends on being able to see the history of warnings, this bot will make that more difficult. That's why I suggested the bot provide a summary of the archived warnings and blocks. Jojalozzo 17:40, 14 December 2011 (UTC)[reply]
- Oh, we're not touching Category:Shared IP addresses from educational institutions, which has over 60,000 users. We're only testing on Category:Wikipedia user talk pages of shared IP addresses (about 38,000 users), which has some educational institutions, but not very many. It's mostly ISPs and hotspots. So, if the schools turn out to be a problem in our sample, we'll know that they require extra monitoring, and if not, that'll be another test to think about in the future... but first let's try this one! :) Maryana (WMF) (talk) 20:22, 14 December 2011 (UTC)[reply]
- Ok, but that still leaves
- issues with ARV/blocking
- the suggestion that the bot provide summary info about what it archives
- the suggestion that instead of deleting old warnings, the bot turn off old alerts that "you have new messages".
- Jojalozzo 21:06, 14 December 2011 (UTC)[reply]
- issues with ARV/blocking – Not sure what you mean here... there shouldn't be issues with blocking, because the bot won't archive talk pages with active block notices on them. And most shared IPs don't get hardblocked, anyway, so these would be edge cases.
- the suggestion that the bot provide summary info about what it archives – That's quite tricky, technically speaking. How can a bot know what kinds of messages it's removing? It could perhaps give an approximate number of messages archived, but that information wouldn't be very useful to anybody. All you'd have to do is click on the archive link to see them in full.
- the suggestion that instead of deleting old warnings, the bot turn off old alerts that "you have new messages" – A good idea but, again, I don't believe it's technically feasible. Maryana (WMF) (talk)
Where are we?
Status Unknown. This task still seems somewhat heated, and judging consensus is made somewhat difficult by the very fragmented nature of the discussion surrounding the task. At this point I think it will be best to leave the edits from the last trial, and start completely afresh. Petrb, for clarity's sake, could you please restate the details of how the bot will be operating, (when/what/where it will archive etc?). Unless there are any major objections, I would like to move to a smaller scale trial and get this task moving again. --Chris 08:45, 26 November 2011 (UTC)[reply]
- Of course, below is a summary:
- Bot will walk through half of all the pages in the list Category:Wikipedia user talk pages of shared IP addresses
- It will check if the talk page is empty / archived and which template is used at the top
- If no template is present and content can't be archived (vandalised page) it will skip
- Template which matches the list provided by Maryana will be replaced with new one
- Bot will check the last time (if more recently than 14 days) the user's talk page was edited and if user isn't blocked
- Bot would check if there isn't a block notice which expired less than 14 days ago, in that case it would only replace main template (leaving notice) and skip
- If page is not already being archived, has messages older than 14 days, the talk page hasn't been edited in 14 days, and there are no live block notices on the page, the bot will:
- *create Archive N subpage of the user page with {{talk archive navigation}} {if the archive is over a max size (e.g. 100Kb) we should start archive 2,3,4 etc?}
- *cut and paste all old messages onto that page and save the page
- *leave an archive banner {{archives}} at the top of the talk page and save the page
- Bot will recheck all the pages again after 3 days. If the page has not been edited in 14 days and matches other criteria, it will cut and paste all old messages to the archive.
- Bot will continue checking pages every 3 days and archive messages on talk pages that have not been edited in 14 days.
- Bot will not archive anything while a user is blocked {or if block has expired within 14 days}
- Let me know if you needed some details Petrb (talk) 08:52, 26 November 2011 (UTC)[reply]
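As a rough illustration of the per-page decision summarized above, here is a minimal sketch. It is not the bot's real C# implementation; the `TalkPage` fields and the `decide` function are hypothetical names, and the reading that the header template is still replaced when archiving is skipped is an assumption. The "already archived" check is left out because the summary does not say how it interacts with later runs.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

QUIET_PERIOD = timedelta(days=14)  # the 14-day window from the summary


@dataclass
class TalkPage:
    """Hypothetical snapshot of what the bot knows about one talk page."""
    has_known_header: bool              # one of the listed shared IP templates
    last_edited: datetime
    oldest_message: Optional[datetime]  # None if the page has no messages
    currently_blocked: bool = False
    block_expired_at: Optional[datetime] = None


def decide(page, now):
    """Return "skip", "replace_header_only", or "archive"."""
    if not page.has_known_header:
        return "skip"

    cutoff = now - QUIET_PERIOD
    recently_unblocked = (
        page.block_expired_at is not None and page.block_expired_at > cutoff
    )
    # Never archive while a block is live or expired less than 14 days ago.
    if page.currently_blocked or recently_unblocked:
        return "replace_header_only"
    # The page was edited within the last 14 days: leave the messages alone.
    if page.last_edited > cutoff:
        return "replace_header_only"
    # Nothing on the page is older than 14 days, so there is nothing to move.
    if page.oldest_message is None or page.oldest_message > cutoff:
        return "replace_header_only"
    return "archive"
```

Running the same check again every 3 days, as described, would then pick up pages once they have been quiet for the full 14 days.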
- I admit it, I'm not all that tech-savvy and am a little confused about the scope of this next trial. At Category:Wikipedia user talk pages of shared IP addresses that Petrb mentioned, there are subcategories with dynamic IPS of 1229 pages, gov. addresses with 372 pages, .edu addresses with 60,143 and 38,135 pages listed at the bottom as being in the category of "Wikipedia user talk pages of shared IP addresses". Shearonink (talk) 17:09, 26 November 2011 (UTC)[reply]
- Sorry, it's a little messy because of the inconsistent categorization system for flagging and classifying IPs...
- We're only focusing on Category:Wikipedia user talk pages of shared IP addresses, not Category:Shared IP addresses from educational institutions or the other subcategories (though there are some .edu, .gov, and dynamic shared IPs in the former). There's ~38,000 pages in that category, but some of them have templates that fall outside the scope of this test (we're only making changes to talk pages that have one of these 11 templates). And we're halving that 38k because it's an A/B test, so we need a control sample. So, the number of affected talk pages will be a bit under 16,000. If we get good results from this test, we might consider proposing a change to all the shared IP talk pages (including all subcategories), but we won't know if it's worth it until we run the test. Does that make sense? Maryana (WMF) (talk) 16:25, 28 November 2011 (UTC)[reply]
- Makes sense but Chris posted that this would be a small-scale trial and "a bit under 16,000" doesn't quite sound like his "smaller-scale trial". I am also troubled by the date-range being 14 days. In my experience IP-vandals will return to the scene of their Wiki-crimes often and the long-term vandals will do so more than 14 days later. Is there any consideration to the time-limit being something other than 14 days, maybe a month or whatever from the last warning? Shearonink (talk) 18:05, 28 November 2011 (UTC)[reply]
- I think what Chris meant by "small-scale trial" is that the bot should do 50-100 test edits first. As for length of time for archiving, please see the discussion here about why that was the duration we settled on. The short version is that this is meant to be a short-term test, not a systemic change, and since the test will only run for 2 months, archiving every month wouldn't make too much sense :) Maryana (WMF) (talk) 18:20, 29 November 2011 (UTC)[reply]
- Ok, just to make sure I understand this, the way the trial will be done is in the following order:
- 1) Bot will do just a short test of 100 IPs.
- 2) Bot will be stopped.
- 3) That 100-IP test will be evaluated for any possible issues.
- 4) Possible issues will be fixed.
- 5) Full-scale trial will then ensue on the approximately 16000 IP talk pages for a time-period of two months.
- 6) Bot will be stopped after two months and the complete trial will then be empirically evaluated.
- -Shearonink (talk) 19:04, 29 November 2011 (UTC)[reply]
- Typically we do trials with 100 edits or less; this one seemed to be a bit of a fluke / mis-communication as to the nature and extent of the trial. If there's consensus for a continued trial, let's mainly worry about numbers 1-4, and then we'll figure out any further trials, if any are needed, once we get that far. --slakr\ talk / 22:31, 29 November 2011 (UTC)[reply]
- Yes, that sounds like the clearest plan to me. Thanks to both of you for the comments. Steven Walling (WMF) • talk 01:54, 30 November 2011 (UTC)[reply]
- Again I propose/persist that the bot moved the "actual wrong" archived talkpages from /Archive to /Archive 1. mabdul 14:09, 30 November 2011 (UTC)[reply]
- That is no problem but is it needed? Petrb (talk) 15:20, 30 November 2011 (UTC)[reply]
- Personally, I don't think it's really necessary, however it couldn't hurt to have the bot fix the pages as it goes through and edits them again (note, I think a separate run, with the sole purpose of fixing all the pages would be overkill and unnecessary). Once the trial below is over, and if there is support for it, we can look into the best way to include that in the bot. --Chris 06:44, 1 December 2011 (UTC)[reply]
Ok, lets get this moving. Approved for trial (100 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. per what was agreed to above. To be clear, 100 edits, then stop. Then we will review, and decide our next move. --Chris 06:44, 1 December 2011 (UTC)[reply]
BRFA trial part 2
- This worries me, because the "old warnings" 'show' indicated that there were none. I see you removed it "by hand" [7] - has it been fixed for future edits?
- Could the edit-summary please include a link to WP:UWTEST
- The archive box, e.g. {{archive box|[[User talk:109.232.72.10/Archive_1]]}} ([8]) is rather ugly/confusing; it would be better to use a relative path and a space ("/Archive 1") but I think it is easier to just use {{archive box | auto=yes }} instead ([9]) Chzz ► 13:24, 1 December 2011 (UTC)[reply]
- Here, it has left behind an {{old IP warnings bottom}} (with no corresponding 'top') Chzz ► 15:30, 1 December 2011 (UTC)[reply]
- Yes it was fixed
- It's in configuration - User:SharedIPArchiveBot/Config
- Fixed
- Fixed
- Thanks. Trial is running now Petrb (talk) 18:27, 1 December 2011 (UTC)[reply]
- Also it have done other mistakes like leaving part of messages on the page instead of archiving them I will try to update it to be more hard when checking what to archive. Petrb (talk) 18:32, 1 December 2011 (UTC)[reply]
Finished Petrb (talk) 20:36, 1 December 2011 (UTC)[reply]
- I have noticed that it has done few mistakes on beginning, all should be fixed now. Petrb (talk) 20:40, 1 December 2011 (UTC)[reply]
Lets go for another slightly bigger trial to make sure those errors are fixed. Approved for trial (350 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. --Chris 17:16, 4 December 2011 (UTC)[reply]
- Trial complete. Petrb (talk) 21:07, 4 December 2011 (UTC)[reply]
Bug found: So what is with that article? User_talk:130.246.132.26 mabdul 23:32, 4 December 2011 (UTC)[reply]
- Another not-directly-related-bug: The template has to be changed at: User_talk:129.7.35.213/Archive_1. mabdul 23:34, 4 December 2011 (UTC)[reply]
- Could you please report more recent? thanks :) Petrb (talk) 23:34, 4 December 2011 (UTC)[reply]
- Yeah that was my fault since I click on the "see also that IP". Sorry. but let us discuss that cases:
- User_talk:129.65.27.88/Archive_1 <-- should the old Shared IP address template really archived?
- User_talk:129.230.248.1/Archive_1 <-- why are the old templates in an archive in a collapsed box? That is super-dupa-special-hidden.
- mabdul 23:53, 4 December 2011 (UTC)[reply]
- That's a question what it should do if more than 1 template is there. I will make it keep both.
- Second bug is fixed. Petrb (talk) 08:41, 6 December 2011 (UTC)[reply]
- 1 And another bug, why was here the old IP warnings removed but nothing changed? Either archive the old warnings or let the collapseable box there! mabdul 12:39, 6 December 2011 (UTC)[reply]
- 2and another move that needs discussion: [10] should this box/template really archived? Shouldn't that pages get manual investigation and restoring(to the archive) of the old warnings? mabdul 12:42, 6 December 2011 (UTC)[reply]
- Some cosmetica at User talk:128.174.150.43 (and many other talkpages): remove the unneeded whitespace if possible ;) mabdul 12:46, 6 December 2011 (UTC)[reply]
- Can you explain that page http://toolserver.org/~petrb/logs/ you linked on the Userpage of the bot? It is neither up to date nor is it "usable" since it is rather getting long! mabdul 13:19, 6 December 2011 (UTC)[reply]
- 4Why was here a archivebox added? No archive was created! mabdul 13:26, 6 December 2011 (UTC) (oh and by the way: was revert!) mabdul 13:27, 6 December 2011 (UTC)[reply]
- User_talk:125.161.133.239/Archive_1:5 should the IPtalk template really archive? why not simply remove it? mabdul 13:29, 6 December 2011 (UTC)[reply]
- when archiving this, please remove also the __TOC__ (or notoc, depends what is on the page) mabdul 13:33, 6 December 2011 (UTC)[reply]
- 1 it was removed because it's configured to remove it, that will be improved
- 2 all moves that needs discussion should be discussed then, however if you want it not to remove such stuff feel free to insert it to bot ignore or skip list, it's in configuration I will be happy to explain how does it work, just insert it there, or send me what should be skipped, have a discussion and meanwhile bot would skip it
- 3 logs are not displayed because it wasn't running on toolserver, trial is running on another pc
- 4 Archive box was created because page was archived but abuse filter prevented creation of page
- 5 misconfiguration only
Thank you. No idea what's wrong with toc Petrb (talk) 15:28, 6 December 2011 (UTC)[reply]
- The TOC will be displayed as you can see in the archive in the middle of the page where the collapse bottom was. so that shouldn't be in there! mabdul 16:03, 6 December 2011 (UTC)[reply]
- Fixed Petrb (talk) 20:03, 6 December 2011 (UTC)[reply]
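The TOC fix discussed in this thread amounts to removing table-of-contents magic words from text before it is written to the archive. A minimal sketch of that idea follows; it is an illustration rather than the bot's real code, the function name is hypothetical, and matching the words case-insensitively is an assumption.

```python
import re

# Table-of-contents magic words to drop from archived text. Matching them
# case-insensitively is an assumption made for this sketch.
TOC_MAGIC_WORDS = re.compile(
    r"__(?:TOC|FORCETOC|NOTOC)__[ \t]*\n?", re.IGNORECASE
)


def strip_toc_magic_words(wikitext):
    """Return wikitext with TOC magic words (and their line ending) removed."""
    return TOC_MAGIC_WORDS.sub("", wikitext)
```

Removing the trailing newline together with the magic word avoids leaving an empty line behind in the archive.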
Ok, let's have one more small trial. Approved for trial (400 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Then I would like to move towards the 2 month extended trial, however I think it might be a good idea to place a limit on how many edits the bot can do in a day, so that if more bugs are found, they are less damaging and can be fixed easier. --Chris 04:56, 9 December 2011 (UTC)[reply]
BRFA trial part 3
So, what will now happen with the last ~3k pages? Will they get moved? Will they get cleaned up? If so, by whom and how? mabdul 21:00, 9 December 2011 (UTC)[reply]
- Trial complete. Petrb (talk) 00:21, 10 December 2011 (UTC)[reply]
- I did a fast check on the first 50 edits. I see two problems which really should be discussed. The first is at User_talk:142.35.26.32/Archive_1 - clicking on that template shows that the IP 142.35.26.32/Archive_1 has no edits - of course. So I think we really should change many templates.
- The second problem I see is that on User_talk:142.35.51.2/Archive_1 the only notice is that the old comments were deleted. Checking the history at User_talk:142.35.51.2, I see the comment (cur | prev) 03:29, 2 January 2011 BD2412 (talk | contribs) m (53 bytes) (blank ancient IP talk page posts per WT:CSD. using AWB) (undo) - so it seems to me that there was a BRFA(?) for an AWB job. Can we restore the clearing and archive the old edits instead of the template? mabdul 00:36, 10 December 2011 (UTC)[reply]
- I found a minor thing at User talk:142.31.44.81/Archive 1: It archives __FORCETOC__ which is identical to __TOC__. mabdul 19:25, 10 December 2011 (UTC) (added to the config on my own) mabdul 19:27, 10 December 2011 (UTC)[reply]
- Where was the consensus to replace templates like {{ISP}} with {{ISP test}}, for example, with this bot? Even if there was, shouldn't the regular template just be turned into a randomizer so that all of these "tests" don't need to be replaced later on? And, furthermore, what is the point of these encouraging shared IP templates if there is no way to track their impact on account creation? Logan Talk Contributions 22:59, 10 December 2011 (UTC)[reply]
- As far as it was explained to me wmf has access to these data. Petrb (talk) 23:03, 10 December 2011 (UTC)[reply]
- Hi Logan,
- Consensus is here on the original VPR thread, and yes, we can track this information. We can't randomize a transcluded template, but we'll return them back to normal after the test. Please let me know if you have any more questions. Maryana (WMF) (talk) 00:43, 11 December 2011 (UTC)[reply]
- Okay, thanks Maryana. :) Logan Talk Contributions 03:55, 12 December 2011 (UTC)[reply]
At the moment I am considering whether this is ready to be approved for the two month trial or not. If anyone still has strong opposition/concerns to the bot, please speak below (likewise for those supporting the task).
Secondly, we need to deal with the broken pages from the previous trial. Personally, I think the best solution would be to have the bot fix those as it is editing the rest of the pages (as opposed to a mass run to fix all the pages, which I would strongly oppose). As I understand it all the genuinely broken pages have now been fixed, and it is only more cosmetic errors (e.g. "Archive" vs "Archive 1"), that are left. --Chris 19:05, 12 December 2011 (UTC)[reply]
- Undecided: I would really like to see an archival for cleared pages (per the mentioned AWB job) - and thus this would need another trial.
- For the "cosmetic" changes and moves I give a strong support. mabdul 20:47, 12 December 2011 (UTC)[reply]
- Okay guys, let's get this thing worked out. We've had 3 tests and a month and a half to talk it over and work out the kinks – if we leave this sitting any longer, everyone's going to forget what the original idea for the test was in the first place :)
- If there are dire concerns about the test, please voice them. If there are concerns about details that can be worked out as we run the actual test, then let's run the test and fix them along the way. The point of this and all our tests is to get a quick-and-dirty sample, figure out if there are any positive changes we as a community can easily make, and, if not, scrap the idea and move on. I know there's WP:NO DEADLINE, but I'm going to set an arbitrary one, anyway :) Can we say either yes, let's test or no, no test by the end of this week? Don't mean to be pushy, I'd just rather know sooner than later that this isn't a fruitful alley to pursue! Maryana (WMF) (talk) 20:40, 14 December 2011 (UTC)[reply]
- Welcome to the English Wikipedia bot request system. It's long, and it's painful. But, by golly, it works. I should stress though, (historically) we do not do (quick and) dirty anything. That is to say, personally at least, I would not approve any bots knowingly running with faulty code. Now, I'm not saying that's the case here, but if it is, then I find it quite correct that faulty code should be fixed before a test (even if it at present has only resulted in cosmetic issues, these things are often suggestive of the potential for wider problems). Of course, if there are in fact no problems, then I'd support a trial in line with WP:VPR (I haven't read it). - Jarry1250 [Weasel? Discuss.] 09:12, 16 December 2011 (UTC)[reply]
- Hehe, you could've stopped at "Welcome to the English Wikipedia" :) What I meant was: if there are known issues, please bring them to Petrb's attention, and if not, then let's test. I know he's itching to get going on this, too, and the longer it sits around, the likelier it is that people just coming into it will have no idea what's going on and will have to ask a lot of questions and get a lot of redundant explanation to get caught up to speed. Of course, I have exactly zero bot approval experience, so I could be totally wrong on this, but it seems like the people who are familiar with the bot's task should have enough info at this point to judge whether or not it does its job correctly. But again, me no bot herder, so maybe that's a bad assumption :)
- Anyway, thanks for your help and let me know if there's anything you need from my end! Maryana (WMF) (talk) 17:35, 16 December 2011 (UTC)[reply]
- "But, by golly, it works." [citation needed]. --Chris 10:07, 16 December 2011 (UTC)[reply]
- Heh. No, but seriously, it does work. Bots do get approved through it and do end up doing a lot of good work. And a lot of bugs are found and fixed before thousands of edits are made. - Jarry1250 [Weasel? Discuss.] 16:50, 16 December 2011 (UTC)[reply]
Regarding the AWB job, these are all the edits I could find (about 219). I'm rather surprised that this was done, mainly because the discussion cited in the edit summary is from 2006. However, as I understand it, most of these pages will be unaffected by the bot (most of those pages no longer have any shared ip templates), so, at a guess, there'd be less than 50 pages that the bot would actually edit, which is not enough to warrant adding it to the task. On a side note, it appears that some of the edits removed shared ip templates; however, any clean up of that is outside the scope of this task.
Secondly, regarding the cleanup. As I understand it, the two errors that need to be fixed now are the incorrect naming of "Archive" (instead of "Archive 1"), and the incorrect substitution of the archive box template. As I have said previously, I think the best way to deal with those would be to have the bot fix them as it goes through its normal activities; however, if that were to be done, another small trial would still be necessary, to ensure that the fixes don't accidentally break any more pages. --Chris 10:07, 16 December 2011 (UTC)[reply]
- Thank you for reply, it was actually implemented already, but if you want I can of course run another small trial to check if it's correct now. The main reason why bot didn't fix any of pages is that they were edited by last run and bot only edit pages which were not touched for certain period of time. Currently it moves all Archive pages and replace the substituted template Petrb (talk) 10:27, 16 December 2011 (UTC)[reply]
- Yes, I think it would be best just to make sure that the fixes work correctly. Approved for trial (100 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. --Chris 10:33, 16 December 2011 (UTC)[reply]
- Is that a trial of regular task or only for edits which fixes previous edits? Petrb (talk) 16:34, 16 December 2011 (UTC)[reply]
- Only edits which fix previous edits (if that's possible) --Chris 17:20, 16 December 2011 (UTC)[reply]
Trial complete. Petrb (talk) 10:53, 17 December 2011 (UTC)[reply]
- Extra new line was fixed too Petrb (talk) 10:59, 17 December 2011 (UTC)[reply]
Extended Trial
Ok, those look fine. Let's get this moving again. Approved for extended trial (62 days). Please provide a link to the relevant contributions and/or diffs when the trial is complete. --Chris 11:12, 17 December 2011 (UTC)[reply]
- I was unaware of this bot until my watchlist started to fill with these changes yesterday. I raised a question at Wikipedia_talk:WikiProject_user_warnings/Testing#Proposal:_shared_IP_test as the bot appeared to be exceeding what I would expect for a "test". I believe that the mass creation of archive pages on anon IP user pages is naff and unnecessary. It would be just as easy to delete all notices over a year old, they would still be in the history and we have a template for that.
- I recommend a community wide RFC to gain a common understanding of the consensus for this mass change to how we handle anon IP user pages and to ensure a clear description of what this bot is up to and what analysis has been done to demonstrate this is needed.
- I request the bot is stopped in the meantime.
- If I get no reasonable reply on action to be taken here, I will stop this bot later today. --Fæ (talk) 10:33, 18 December 2011 (UTC)[reply]
- Fae, the RfC happened on VPR a month ago. I'm sorry you didn't get a chance to comment on it. To be very clear: this is not a "mass change to how we handle anon IP user pages" – it's a two-month A/B test. After the end of the trial, the bot will revert all its edits, and we'll analyze the data to see if there's any benefit to fast archiving. If it does look like it leads to less vandalism and more anons registering accounts, we'll have an RfC to talk about making any permanent changes. If not, we'll have learned something about the value of warnings. In the meantime, this is only a test, so please don't panic :) Maryana (WMF) (talk) 16:46, 18 December 2011 (UTC)[reply]
- Lot of users prefer their talk pages archived to a page and not to null for a variety of reasons. Most obviously, in a continuous archive process, it is far easier to build up a picture of previous editing activity than by going through history.
- As I see it, this seems to be a shot to nothing. No-one seems to have suggested a way in which this does any harm at all, beyond being a waste of the bot writer's time.
- In this context, I'm not sure why you'd want to stop everything for weeks, hold long discussions, etc. with the hope of being able to modify this task. I mean, feel free to, but it's not clear to me what would be gained from diverting people's time to this issue. (I may well be missing something here, I have to say, since I haven't read *all* the discussions.) - Jarry1250 [Weasel? Discuss.] 14:12, 18 December 2011 (UTC)[reply]
- Stopped Petrb (talk) 15:18, 18 December 2011 (UTC)[reply]
- @Fae I want a reply from you regarding the reason for stopping the bot. There was already an RfC regarding this task and the vast majority of people agreed with the task, so unless you have a good reason to stop it, I will restart the bot in a few hours. We are approved to run the requested 2 months trial and it has already begun; in order to have an accurate result of the task, we need to have the bot running for 2 months, otherwise it would be harder to get valid results from this research. Thank you Petrb (talk) 15:29, 18 December 2011 (UTC)[reply]
- I can now see the archived Village Pump proposal where a number of alternative options were discussed. I remain confused why running this bot on 20,000 pages is considered a "test" (at least, based on the description that half of the 40,000 IP user pages with suitable templates will be handled by the bot). If half of all such pages are changed, then this is not a test as there is no way that we would mass revert all these changes. Was this really understood by the people supporting at the VP discussion? When I look at the "support" comments there were a range of test periods mentioned. None of these seemed to pick up on the suggestion that 20,000 pages would be changed during the test period. --Fæ (talk) 18:03, 18 December 2011 (UTC)[reply]
- BTW, as pointed out elsewhere, I did not actually stop this bot. I am unsure why this research needs to be done on 20,000 pages. Surely testing these stylistic changes could be done on a much smaller sample in order to make a decision for creating user archive pages and ISP header notices standardized compared to any other approach (such as collapsed sections or deleting old notices over a year old; both suggestions brought up on the VP discussion)? --Fæ (talk) 18:08, 18 December 2011 (UTC)[reply]
- Any issues? Next massive run will be started in approximately 1 week and 4 days Petrb (talk) 23:31, 22 December 2011 (UTC)[reply]
The more I keep seeing this, the more I keep thinking that old talk page warnings on IPs (e.g., >1 year) really don't need to be archived. In the past, we've typically just deleted old talk page warnings because they're simply not relevant—most of the warnings are dropped on dynamic IPs, and those that aren't dynamic IPs are typically schools where stale warnings don't matter. Because of the sheer number of IPs with talk pages, I'm not entirely sure archival is the best approach at this point, as prior warnings are in the page history (which vandal fighters check anyway, since IPs like to blank or alter warnings frequently), and making separate archive pages will just contribute to bloat in dump files. That said, the bot seems to otherwise work from a technical standpoint. --slakr\ talk / 20:23, 13 January 2012 (UTC)[reply]
- I completely agree. On the German Wikipedia, an admin runs an authorized bot that deletes all shared IP talk page messages after 24 hours. That might be a bit extreme for en.wiki, but I definitely think year-old messages aren't helping anybody. However, this issue was a somewhat decisive one during the VPR discussion. I'm hoping that once we get some data back from this test, we'll have more concrete quantitative evidence about the aggregate behavior of shared IP editors, which might help placate people's fears and suggest a future course of action. Anyway, yeah, let's definitely keep brainstorming more optimal solutions post-testing :) Maryana (WMF) (talk) 21:18, 13 January 2012 (UTC)[reply]
- Slakr, reason that bot is doing this is that community requested it, not because we wanted to do that. No matter what your opinion is (actually I agree with that), we can not change the task because of that. Perhaps you could join the discussion which would be probably started when this trial finish so that you can explain this to rest of the community and hopefully we would be able to tweak the bot to do it right. Petrb (talk) 17:47, 14 January 2012 (UTC)[reply]
End of extended trial
Now that we're coming to the end of the extended trial, any data that might be useful in giving this bot final approval? MBisanz talk 15:15, 6 February 2012 (UTC)[reply]
- I would like some hard data published as the results. Experimenting on 20,000 pages and (presumably) creating 20,000 additional pages as permanent archive pages of old IP warnings will need some credible justification before letting this bot run wild. If the benefits seem weak, this bot should not go ahead. I still believe this was an unnecessary bot test, with a scope not clearly supported by the RFC mentioned above and seems like a solution looking for a problem rather than the reverse. I do not accept that this now has unstoppable momentum on the "but we've created it now" dubious rationale. --Fæ (talk) 15:25, 6 February 2012 (UTC)[reply]
- First, I agree about publishing the data. That's the whole point of running a proper A/B test instead of just the usual vague "trial". Second, I think you need to show a little more good faith, Fae. Trying new things on Wikipedia is hard enough without an attitude that is unnecessarily skeptical towards new ideas and change. Steven Walling (WMF) • talk 20:54, 6 February 2012 (UTC)[reply]
- Sorry if all my comments appear like bad faith to you. Let me try rephrasing in a way that you will not find personally offensive. I am trying to ask for an unambiguous need for this bot to add archive pages to IP user home pages and mass reformat the way all such pages are formatted. Such mass changes should, in my opinion, have a clear mandate and the test data should be able to demonstrate the benefits of making this part of the default infrastructure of the way we deal with Anon IP accounts across the whole of Wikipedia. If these things are in place then you have my support. --Fæ (talk) 22:30, 6 February 2012 (UTC)[reply]
- There was a clear mandate. We spent a month talking about it on the Village Pump for proposals, and the proposal evolved significantly based on what everyone talking about it wanted. Steven Walling (WMF) • talk 19:54, 7 February 2012 (UTC)[reply]
{{OperatorAssistanceNeeded|D}}
Pinging for data on the trial result. So far, community consensus has been established from the VP discussion linked and discussed above, so a new discussion or indication of change would be needed to alter that. Still waiting on final technical validity of the test to give final approval. MBisanz talk 21:10, 8 February 2012 (UTC)[reply]
- Hi all, it's a little early for data given that the test hasn't ended yet :) The official end date of the two-month period is February 19th (that's 2 months from when we started the trial). While I certainly wish it were possible to have instantaneous numbers and graphs for you, it's going to take Faulkner some time to gather all the samples and run a rigorous analysis on them, especially given that this is only one of a number of tests that are ending around then. We can push it up to the front of the queue in terms of priority, but I'm pretty sure it will still take a week or two. So, if you'd like to have a date in mind for when to expect results, I'd say probably first or second week of March.
- As to what happens on February 19th: the bot will stop archiving talk pages. If people feel really strongly that it should also go back and revert all of its edits to remove the archives it's already created, I'm sure Petrb will be happy to do that. But I don't see a logical reason for that until after we actually look at the results – if we see in the analysis that there is a clear benefit to archiving talk pages, it would be pretty cumbersome to re-revert everything again and put a new archiving system in place. Why not just wait and see what happened and then make the decision? Though we did get BAG's approval, this isn't a traditional BAG test (we already did several of those).
- Finally, I really wish Steven and I didn't have to keep stressing this point, but: this is not some sneaky way of forcing a permanent change on the community without its approval. That's not what our testing is about, previous, now, or ever. We've run eight tests so far as part of our WP:UWTEST project and have never kept a test running past when we said it would end, so please don't make those kinds of accusations. Again, on February 19th, the bot will stop archiving, we'll analyze the data, present it to the community, and ask everyone to come to a new consensus if there's conclusive evidence suggesting a need for it. Maryana (WMF) (talk) 22:27, 8 February 2012 (UTC)[reply]
- Ok, no rush. I do not think Petrb should revert the test edits, as that would be form over substance to the absurd degree. If, as we get closer to the 19th, Faulkner thinks everything is running fine, I would not object to a temporary continuation pending the final data in order to maintain continuity to the other projects you may be working on that don't have time-limited trials. MBisanz talk 06:16, 9 February 2012 (UTC)[reply]
- Comment I don't think this is worth doing. I've been editing from IP addresses for years and have probably been assigned 100's of them. It's quite rare that I get one with any kind of notices on them. I think it's better to leave the talkpages intact, since they sometimes contain info relevant to article development. One thing your bot could do instead is undelete all the IP user and talk pages that one of MZMcBride's unapproved bots deleted a few years ago. A number of those had useful info. 67.117.145.9 (talk) 08:13, 18 February 2012 (UTC)[reply]
Has your data analysis finished yet? Josh Parris 01:02, 24 February 2012 (UTC)[reply]
- No, we're in the process of wrapping up other tests and running analysis on them (which you're welcome to watch in close to real time on Faulkner's journal). As I said above, you should expect results for this test about 2 weeks from now. Thanks for your patience with our tiny 3-person analytics team :) Maryana (WMF) (talk) 23:11, 24 February 2012 (UTC)[reply]
- Well? I'm afraid it's been more than two weeks, this BRFA has gone on since November and surely nobody thinks it's getting out of hand yet? Either mark as expired (my choice), approve or deny, because just sitting around here taking up space has no merits whatsoever. Rcsprinter (state the obvious (or not)) 17:00, 14 March 2012 (UTC)[reply]
My view is, there still needs to be significant discussion, regarding this task, before it can be approved to run "full time". I think at the moment it would be best to mark this particular request as expired. Once the results and analysis from the trial are available, a new discussion can be started on the Village Pump to establish consensus, and then a fresh BRFA can be opened. Any objections? --Chris 11:14, 16 March 2012 (UTC)[reply]
- No objection. I support a well rounded discussion with a full unambiguous conclusion, hopefully one where I do not get repeatedly accused of bad faith when I attempt to ask questions or express an opinion. Thanks --Fæ (talk) 11:19, 16 March 2012 (UTC)[reply]
{{OperatorAssistanceNeeded|D}}
There is agitation for closure; any objection from the operator? Josh Parris 03:36, 17 March 2012 (UTC)[reply]
- Why you want to close it before we get the results? Petrb (talk) 16:55, 17 March 2012 (UTC)[reply]
- Update: Hey all, sorry about the epically long wait. Just to give you an update on what's going on with this: Staeiou ran the numbers, but he noticed right away that something was off in the results. It looks like the bot's randomization wasn't quite right, leading the test and control groups to be disproportionate in size (not an A/B test, but more of an a/BBBBBBBBB test :-P), which in turn makes our usual line of analysis impossible. Faulkner's trying to rescue it by looking just at the archived cohort to try to get some behavioral data out of it. I'll let you know as soon as he has anything (which actually should be by the end of this week, if not later today). As for the BRFA, you can feel free to close it – it wasn't really intended as a test to see if the bot should run permanently full-time, but to see if the task was worth doing. Thanks for your patience with this, and for your openness to experimentation :) Maryana (WMF) (talk) 16:17, 29 March 2012 (UTC)[reply]
- Presumably you've thought of tossing the BBBBBBBB leaving you with a/B? Josh Parris 03:45, 30 March 2012 (UTC)[reply]
- Just wanted to note that my bot did what was in definition of task. It said: archive half of category, nothing about templates and randomization. This is list of all ip addresses which were archived - bots.wmflabs.org/~petrb/list.txt it's not exact half but 50% +- 4% of category Petrb (talk) 14:37, 30 March 2012 (UTC)[reply]
- It's a little more complicated than just paring down the test group, because the test encompassed different kinds of shared IPs with different behaviors, so we have to compare like to like (and once you start splitting those groups into smaller groups, the numbers become less statistically significant). But Faulkner's looking into Stu's tables and thinks he might be able to tell us something about behavior pre- and post-archiving.
- RE:Petr's comment – sorry, I didn't mean to imply that something was wrong with the bot :) It could be any number of other factors. I'm thinking maybe it had to do with the way the category was split into two groups, which could have brought a lot more of one kind of IP into the test group and another kind into the control. That might account for the disproportionate numbers of archived/non-archived IPs. Maryana (WMF) (talk) 17:07, 30 March 2012 (UTC)[reply]
Request Expired. Expired doesn't quite cover it, but the bot is neither denied nor approved, so that is what I am going to go with. Overall I would say that from a technical perspective, the trial was a success. However, while there was a consensus for a two month trial, there is currently no consensus for this bot to run permanently. As I have stated above, this type of task needs more discussion, possibly even an RfC. Once the discussion has taken place, and there is a clear consensus, we can then start another BRFA. Personally, I'd like to thank everyone involved in this discussion, particularly Petrb, as this has been quite a mammoth BRFA, and that's never fun to go through. --Chris 06:28, 7 April 2012 (UTC)[reply]
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.