Wikipedia:Deferred changes/Request for comment 2016
- The following discussion is an archived record of a request for comment. Please do not modify it. No further edits should be made to this discussion. A summary of the conclusions reached follows.
- Thank you to all who participated in this discussion. The consensus seems clear. All three proposals are successful.
- There is consensus to allow edit filters to defer changes either passively or actively in accordance with the edit filter guideline. (Passive deferring places the edit on Special:PendingChanges for human review, but still presents it to readers immediately. Active deferring holds back the edit, showing readers instead the revision prior to the edit, similar to how pending changes protection currently works.)
- Bots may, on approval, also defer edits passively, and bots with rollback rights may, on approval, also defer edits actively. A frequently cited example of a bot that would benefit from active deferred changes was ClueBot – the community expressed a lot of faith in it to catch vandalism, and deferred changes would allow the bot to catch edits it suspects may be vandalism, but isn’t quite sure enough to revert.
- The ORES extension is authorized to defer edits both passively and actively under the condition that the thresholds for doing so are decided beforehand by consensus and are higher than what they are currently. Administrators may, at their discretion, increase the thresholds in the event of backlogs.
- There were concerns in the threaded discussion sections about the backlog and about biting newcomers. To address the backlog, one suggested solution was to implement deferred changes cautiously and passively: start with a high passive threshold then slowly lower it until an optimal threshold is reached before allowing active deferring. When actively deferring changes, a friendly notification should be presented to the user who made the changes, carefully worded to avoid biting. There was some discussion about creating a separate queue for deferred changes (rather than Special:PendingChanges) and changing the standards for accepting them, but there is no consensus to deviate from the way pending changes are currently reviewed. Respectfully, Mz7 (talk) 23:04, 11 November 2016 (UTC)
This request for comment concerns deferred changes, a way to defer for review those edits by non-autoconfirmed users that match certain edit filters, are picked up by a bot (e.g. User:ClueBot NG) as warranting attention, or are considered damaging by ORES. Until reviewed, the revision displayed to readers can be chosen to be either the latest revision as usual ('passive' defer) or the revision prior to the user's edits[1] ('active' defer). Deferred edits appear at Special:PendingChanges. Should we request implementation of deferred changes? Specifically, should we allow it for the edit filter, bots, and ORES, both passively and actively? 08:12, 14 October 2016 (UTC)
- ^ The same revision that rollback would revert to.
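For illustration only, here is a minimal sketch of which revision readers would see under each mode. The function and revision names are hypothetical; this is not an actual MediaWiki implementation.

```python
# Hypothetical illustration of passive vs. active defer (names are invented).

def displayed_revision(latest, prior, defer_mode, reviewed):
    """Return the revision shown to readers for an edit awaiting review.

    defer_mode: None (not deferred), "passive", or "active".
    prior: the revision rollback would revert to (see footnote above).
    """
    if defer_mode == "active" and not reviewed:
        return prior   # readers see the pre-edit revision until a review happens
    return latest      # passive defer (or no defer): latest revision is shown

# An actively deferred, unreviewed edit hides the new revision from readers;
# a passively deferred one stays visible while sitting in the review queue.
assert displayed_revision("r2", "r1", "active", reviewed=False) == "r1"
assert displayed_revision("r2", "r1", "passive", reviewed=False) == "r2"
```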
Edit filter
Should we allow edit filters to defer edits passively and actively? Use would be governed by the edit filter guideline.
Support (Edit filter)
- As proposer. Per rationale at Wikipedia:Deferred changes. Edit filter managers can be trusted to use the new actions adequately. Cenarium (talk) 09:09, 14 October 2016 (UTC)
- Edit filters already flag edits for review, but there is no centralized location for follow-up. (It's also not particularly obvious where to look.) The rationale for active deferral is also sound: the simple kinds of heuristics available in the edit filter are often prone to false positives -- this change will increase the preventative power of such filters. MER-C 03:35, 15 October 2016 (UTC)
- This seems like the most logical use for deferred changes and would be an improvement over the existing tagging system. Kaldari (talk) 22:36, 15 October 2016 (UTC)
- I support deferring edits as it prevents the visibility of likely vandalism. It doesn't stop helpful edits. Chris Troutman (talk) 23:11, 15 October 2016 (UTC)
- Sounds like a good idea and a reasonable use for this system. Enterprisey (talk!) 05:44, 16 October 2016 (UTC)
- I remember seeing a post about this at WP:EFN and thought it would be a good idea, and I still believe it will be. -- The Voidwalker Whispers 23:58, 16 October 2016 (UTC)
- I think this is a good idea as proposed. Prevention is important in stopping vandalism. -- Dane2007 talk 05:35, 17 October 2016 (UTC)
- Makes sense, but see comments below. Ivanvector (Talk/Edits) 17:49, 17 October 2016 (UTC)
- This would definitely help in identifying and removing potentially damaging edits, with minimal impact to the site. Gluons12 ☢|☕ 02:44, 18 October 2016 (UTC).
- I made a similar post at EFN, glad to see it got off the ground. Iazyges Consermonor Opus meum 04:58, 18 October 2016 (UTC)
- Arguments made here seem compelling. Carrite (talk) 03:33, 20 October 2016 (UTC)
- Sounds like a very good idea to me - I would go for 'active' defer or possibly 'active' defer for new users and 'passive' defer for confirmed users if possible. KylieTastic (talk) 22:02, 20 October 2016 (UTC)
- Support per above --TerraCodes (talk to me) 22:26, 20 October 2016 (UTC)
- If edit filters can disallow, then edit filters can defer. — Esquivalience (talk) 02:13, 21 October 2016 (UTC)
- Well, it's better than just denying the edit, which is annoying especially if an anonymous user tries to add something constructive and it's disallowed. epicgenius - (talk) 12:39, 21 October 2016 (UTC)
- Very good idea especially for the bots. KGirlTrucker81 huh? what I'm been doing 01:42, 22 October 2016 (UTC)
- Excellent scheme, less bitey than bot reversion or edit filter disallow. Guy (Help!) 22:42, 22 October 2016 (UTC)
- Support per above, a little extra reviewing couldn't hurt. —Skyllfully (talk | contribs) 00:10, 23 October 2016 (UTC)
- Kevin (aka L235 · t · c) 00:07, 24 October 2016 (UTC)
- This would help Wikipedia eliminate a higher amount of damaging edits from the public eye. I like the bit about "passive" versus "active" deferrals. Dustin (talk) 01:04, 24 October 2016 (UTC)
- Something that would clearly be good for Wikipedia for both its content and reputation as all of the schools I've been to have usually criticised the site as having lots of spam. The Ninja5 Empire (Talk) 08:44, 24 October 2016 (UTC)
- Definitely support - has the potential to increase edit quality across the site. A lot of bad edits fall through the cracks; this could help with that. [Belinrahs|talktome⁄ ididit] 22:42, 24 October 2016 (UTC)
- Support per MER-C. — Gestrid (talk) 15:49, 25 October 2016 (UTC)
- I have seen edits tagged as possible vandalism by edit filters that were only reverted after several hours. Deferred changes would make those edits be reverted more quickly and be less visible. Gulumeemee (talk) 09:03, 28 October 2016 (UTC)
- Support a reasonable proposal, I'm assuming this will involve an update to the Special:AbuseFilter form in the "Actions to take when matched" box with an additional check-box. edit: saw Wikipedia:Deferred changes/Implementation — Andy W. (talk) 17:25, 1 November 2016 (UTC) 17:29, 1 November 2016 (UTC)
- I think it's clear that many potentially embarrassing "joke" type edits often go unnoticed and sit, sometimes for months, before they're caught. Our detection tools/algorithms have grown sophisticated and accurate enough that I think it's a good idea to give this 'active deferral' a try. I'm already selectively picking and choosing which edits to patrol in the recent changes feed anyway. If I can get a prepped list of the most likely malicious edits, it would greatly improve catch time and efficiency. -- Ϫ 07:31, 2 November 2016 (UTC)
- Totally support. I'm excited by how this could radically reduce casual vandalism and spamming in articles. This takes away the "fun" of seeing one's vandalism in lights. Gradually, folks will realize this site is no place to get their kicks, but instead, a serious encyclopedia. Stevie is the man! Talk • Work 19:00, 5 November 2016 (UTC)
- This is an excellent idea. I expect this combination of automation and human review to be a powerful and efficient tool. Ozob (talk) 00:51, 6 November 2016 (UTC)
- This may broaden the amount of room we have to handle questionable-but-not-vandalistic edits. Jo-Jo Eumerus (talk, contributions) 09:45, 6 November 2016 (UTC)
- This is a very good idea. --Tryptofish (talk) 23:38, 7 November 2016 (UTC)
- Strongly support. ~ Rob13Talk 23:51, 7 November 2016 (UTC)
- This sounds very useful in combating likely disruption. This sounds flexible and doesn't overly penalize editors based on their newness, as the worst that can happen is similar to pending changes. NinjaRobotPirate (talk) 04:35, 8 November 2016 (UTC)
- Support: This seems like it would be able to prevent the visibility of likely vandalism. It wouldn't stop helpful edits. - tucoxn\talk 13:26, 11 November 2016 (UTC)
Oppose (Edit filter)
Discussion (Edit filter)
- Unless I'm entirely mistaken, details about edit filters are hidden from non-administrators, other than the most basic info (just the edit filter number, I think, I'm not even sure if we can see when an edit is flagged with a filter). Is there some kind of risk of revealing sensitive information to non-admins by doing this? (I assume that there's not, and that the hiding of this information is just a legacy we-can't-trust-anyone sort of thing). Ivanvector (Talk/Edits) 17:51, 17 October 2016 (UTC)
- @Ivanvector: Actually, most edit filters are publicly visible, but a great many of them are not. The only thing sensitive about edit filters is their rules, which would not be revealed by the filter taking action. The only thing revealed when a private filter makes an action is the description of the filter. -- The Voidwalker Whispers 21:26, 17 October 2016 (UTC)
- The only filters that are private, AFAICR, are those which are targeted against long term abuse. These are generally, but not always, "deny". All the best: Rich Farmbrough, 17:49, 20 October 2016 (UTC).
Bots
Should we allow bots to defer edits passively, and bots with rollback rights actively? Each use would require bot approval.
Support (Bots)
- As proposer. Per rationale at Wikipedia:Deferred changes. I expect that Cluebot would be able to catch many vandalism edits that are otherwise missed. Cenarium (talk) 09:18, 14 October 2016 (UTC)
- As soon as we don't have 100 bots sending an uncountable number of edits to the pending changes backlog, of course. — Esquivalience (talk) 01:57, 15 October 2016 (UTC)
- The rationale given at Wikipedia:Deferred changes is sound. Active deferral can also serve as a deterrence mechanism, and can be used in place of the "don't revert twice" model used by some anti-vandalism bots. MER-C 03:32, 15 October 2016 (UTC)
- I support deferring edits as it prevents the visibility of likely vandalism. It doesn't stop helpful edits. Chris Troutman (talk) 23:11, 15 October 2016 (UTC)
- I support this motion as well. The rationale given at Wikipedia:Deferred changes is not a concern to me and this can help stop vandals. -- Dane2007 talk 05:35, 17 October 2016 (UTC)
- Support per answer to my question below. Seems like another good tool for the chest, and if the bots are going to continue reverting obvious vandalism as they have been, then I'm not concerned about extra reviewing work. Ivanvector (Talk/Edits) 03:07, 18 October 2016 (UTC)
- I support this for Cluebot as its accuracy is proven, but I think each new bot should have to go through a trial period of non-active flagging and review before being accepted, and there should be a relatively easy way to challenge and shut one down. Also, Cluebot could start as 'active' defer (or both 'active' and 'passive' at different confidence levels), but other bots should be 'passive' until proven. Cheers KylieTastic (talk) 22:09, 20 October 2016 (UTC)
- Support what KylieTastic said. --TerraCodes (talk to me) 22:28, 20 October 2016 (UTC)
- Again, it's better than just rejecting a constructive anonymous edit. Also, we have implemented this in some way for vandalism changes: if Cluebot does not have high confidence that something is vandalism, it goes to a semi-automatic program (like STiki or Huggle) where human editors can review. epicgenius - (talk) 12:41, 21 October 2016 (UTC)
- Support per my comments above. KGirlTrucker81 huh? what I'm been doing 01:44, 22 October 2016 (UTC)
- Support, I think. Guy (Help!) 22:43, 22 October 2016 (UTC)
- Support per above, a little extra reviewing couldn't hurt. —Skyllfully (talk | contribs) 00:10, 23 October 2016 (UTC)
- Kevin (aka L235 · t · c) 00:07, 24 October 2016 (UTC)
- Absolutely - I've been impressed with the accuracy of bots like Cluebot NG, and the extra edit review seems like a good idea. [Belinrahs|talktome⁄ ididit] 22:42, 24 October 2016 (UTC)
- Support, as ClueBot NG has already proven time and again that it knows what vandalism looks like. Unlike ORES (which I oppose below), it, from what I've seen, hardly ever has a false positive. In the months that I've been active on Wikipedia, I've only ever seen one false positive by ClueBot NG. — Gestrid (talk) 15:47, 25 October 2016 (UTC)
- Support, I can see this working. -- The Voidwalker Whispers 22:29, 25 October 2016 (UTC)
- Support the option for bots to choose to defer edits for review. — Andy W. (talk) 17:25, 1 November 2016 (UTC)
- Support per KylieTastic. Stevie is the man! Talk • Work 19:20, 5 November 2016 (UTC)
- Support. Even the best bots are obliged to be overcautious. Allowing them to defer edits passively or actively provides another layer of protection. Ozob (talk) 00:51, 6 November 2016 (UTC)
- Mostly per rationale provided in the previous section. Jo-Jo Eumerus (talk, contributions) 09:45, 6 November 2016 (UTC)
- Support, with the caveats stated by KylieTastic. --Tryptofish (talk) 23:39, 7 November 2016 (UTC)
- Strongly support. ~ Rob13Talk 23:51, 7 November 2016 (UTC)
- Support, I agree with KylieTastic that Cluebot has proven its accuracy. It should be the only bot initially approved for active defer. Each new bot should complete a trial period of non-active flagging and be reviewed before being accepted for active defer. - tucoxn\talk 13:24, 11 November 2016 (UTC)
Oppose (Bots)
Discussion (Bots)
Whatever is decided here, I think it is important to make sure that edits deferred by a bot with a neural network/machine learning setup (e.g. ClueBot NG) which are later reviewed by a human are automatically added to Cluebot's dataset. I don't know how technically difficult this is to accomplish, but it seems worthwhile if it is reasonably doable.
I'd also caution whoever first implements this to go very slowly at first. Creating a massive backlog of edits that will never be reviewed is not the goal. Tazerdadog (talk) 08:49, 16 October 2016 (UTC)
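The sort of feedback loop suggested above could, in principle, look like the following sketch: human verdicts on deferred edits are appended to a labeled dataset for later retraining. The CSV format and file name are invented for illustration, and whether ClueBot NG's actual dataset pipeline could ingest such records directly is an open question.

```python
# Hypothetical sketch: store human review outcomes on deferred edits as
# labeled examples that a learning-based bot could later train on.
import csv

def record_review(rev_id: int, is_vandalism: bool,
                  path: str = "review_labels.csv") -> None:
    """Append one human verdict as a (revision, label) training example."""
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow([rev_id, int(is_vandalism)])

# A reviewer accepts revision 123456 and rejects 123457 as vandalism:
record_review(123456, is_vandalism=False)
record_review(123457, is_vandalism=True)
```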
- Assuming that a bot like ClueBotNG would continue to operate as it does now (reverting edits that meet its vandalism scores, or however it works) and only defer edits it is less sure about? Ivanvector (Talk/Edits) 17:56, 17 October 2016 (UTC)
- @Ivanvector: Yes, that's the plan. Cenarium (talk) 21:54, 17 October 2016 (UTC)
ORES
Should we allow ORES to defer edits passively and actively? Thresholds would be determined by consensus initially, and administrators may set higher thresholds in case of backlogs.
Support (ORES)
- As long as the thresholds can be specified onwiki. ORES is still in beta and the false positive ratio is quite high for the current threshold, see this for the kind of edits it picks up. We would need a much higher threshold. Cenarium (talk) 09:35, 14 October 2016 (UTC)
- I support deferring edits as it prevents the visibility of likely vandalism. It doesn't stop helpful edits. Chris Troutman (talk) 23:11, 15 October 2016 (UTC)
- Weak support, as I'm not sure on accuracy - however, as per comments in discussion below, it's probably just about getting the correct settings/levels. So support, with a carefully monitored extended trial period and a way it can be shut down/paused if causing issues. Cheers KylieTastic (talk) 22:13, 20 October 2016 (UTC)
- Support per Halfak's comment in the discussion section and the above. --TerraCodes (talk to me) 22:34, 20 October 2016 (UTC)
- Weak support because it is useful, but misses certain cases that are more likely to be vandalism. I'll have to try ORES more, but this would be a weak support from me based on what I've seen so far. epicgenius - (talk) 21:10, 21 October 2016 (UTC)
- Quite honestly, I'm not sure why this is even being debated. ORES's accuracy can be adjusted; heck, ORES's API can be used by flagged bots. This doesn't mandate use; it only allows use – and provides a native on-wiki queue instead of manual review of automated reverts or ad-hoc IRC scoring solutions, like we have now. Kevin (aka L235 · t · c) 00:07, 24 October 2016 (UTC)
- Support, only at a very high threshold. See comments below. Kaldari (talk) 02:54, 26 October 2016 (UTC)
- Provided that the threshold is higher. Esquivalience (talk) 00:59, 3 November 2016 (UTC)
- Support per Halfak (WMF)'s explanation below. — Gestrid (talk) 19:30, 5 November 2016 (UTC)
- Support but with others' concerns taken into account. Stevie is the man! Talk • Work 20:11, 5 November 2016 (UTC)
- Support. Systems like ORES can be calibrated to different levels of sensitivity. If it's overly aggressive, it can simply be set to a different threshold. Ozob (talk) 00:51, 6 November 2016 (UTC)
- Support, if ORES scoring is set to be sufficiently selective to not clog the queue. -- AntiCompositeNumber (Leave a message) 20:06, 6 November 2016 (UTC)
Oppose (ORES)
- Until the vandalism model of ORES improves in accuracy. Currently, it is pretty inaccurate, with a high false positive rate even on its "High" setting. Also, I believe that the model doesn't factor in linguistic features, which is fatal to its accuracy. I believe that until more diffs are included into the machine learning model (perhaps the ones in CBNG), we should hold off on including ORES. — Esquivalience (talk) 01:45, 15 October 2016 (UTC)
- The high setting actually flags more edits than the low setting, so will have more false positives. It is high in sensitivity but relatively low in specificity. The setting for deferral would be kept very high in specificity (to avoid false positives) and would have a relatively low sensitivity. Cenarium (talk) 13:10, 20 October 2016 (UTC)
- Well said. Thank you. --Halfak (WMF) (talk) 20:27, 20 October 2016 (UTC)
- ORES is still under active development and is not yet as accurate as bots like ClueBot NG. Deferring edits based on ORES would be premature, IMO. Kaldari (talk) 22:34, 15 October 2016 (UTC)
- If we set our threshold at the level of ClueBot, we would be just as, if not more, accurate. --Halfak (WMF) (talk) 18:12, 19 October 2016 (UTC)
- @Halfak (WMF): I didn't think about that. What threshold would you propose in order to be "at the level of ClueBot"? Kaldari (talk) 19:42, 21 October 2016 (UTC)
- The ORES service reports a threshold for minimizing false-positives in its test statistics. E.g. https://ores.wikimedia.org/v2/scores/enwiki/damaging?model_info shows "threshold": 0.959 for recall_at_fpr(max_fpr=0.1). If we wanted to run this as a trial, I'd start there and adjust based on what we learn. --Halfak (WMF) (talk) 15:01, 24 October 2016 (UTC)
- @Halfak (WMF): Thanks for the info. I switched my comments to a support. Kaldari (talk) 02:54, 26 October 2016 (UTC)
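To make the threshold discussion concrete, here is a minimal client-side sketch that scores one revision against the 0.959 cutoff Halfak mentions above, using the public ORES v2 scoring endpoint. The JSON layout assumed in the parsing step is an assumption about that API's response shape, and the revision ID is arbitrary.

```python
# Sketch: query ORES for a "damaging" probability and apply a defer cutoff.
import requests

ORES_URL = "https://ores.wikimedia.org/v2/scores/enwiki/damaging"
DEFER_THRESHOLD = 0.959  # the recall_at_fpr(max_fpr=0.1) threshold cited above

def should_defer(rev_id: int) -> bool:
    """Return True if ORES rates this revision damaging above the cutoff."""
    resp = requests.get(ORES_URL, params={"revids": rev_id}, timeout=10)
    resp.raise_for_status()
    data = resp.json()
    # Response layout assumed from the v2 API of the time; adjust if it differs.
    score = data["scores"]["enwiki"]["damaging"]["scores"][str(rev_id)]
    return score["probability"]["true"] >= DEFER_THRESHOLD

print(should_defer(746187659))  # arbitrary example revision ID
```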
- ORES isn't quite accurate in my view for active deferral. It might be selective enough for passive, but that would probably fill the queue unnecessarily. AntiCompositeNumber (Leave a message) 14:15, 16 October 2016 (UTC)
- Changing to Support. AntiCompositeNumber (Leave a message) 20:06, 6 November 2016 (UTC)
- Oppose per points raised above; this may create too much of an unnecessary workload. —Skyllfully (talk | contribs) 00:11, 23 October 2016 (UTC)
- Until ORES is better at predicting vandalism, I wouldn't trust it with something like this quite yet. It has the potential to flood Pending Changes with way too many false positives until it's been more finely tuned. — Gestrid (talk) 15:42, 25 October 2016 (UTC)
- Changing !vote. — Gestrid (talk) 19:27, 5 November 2016 (UTC)
- Oppose because ORES is still at "release status: beta" according to its page on Meta. After the beta period is over, I would be able to change my position here. - tucoxn\talk 13:18, 11 November 2016 (UTC)
- Also, this sub-section doesn't seem to be counting correctly. - tucoxn\talk 13:20, 11 November 2016 (UTC)
- Fixed numbering -- AntiCompositeNumber (Leave a message) 14:01, 11 November 2016 (UTC)
Discussion (ORES)
There seems to be some confusion about m:ORES' accuracy level. ClueBot NG only reverts a small amount of vandalism because its thresholds are set extremely strictly. Thresholds are set far less strictly in the ORES Review Tool because it is intended to benefit from human review, and the goal in patrolling is to get all of the vandalism -- including the majority of damaging edits that ClueBot NG misses. The nice thing about ORES is that it allows external developers the possibility of tuning the threshold to their needs. --Halfak (WMF) (talk) 20:12, 19 October 2016 (UTC)
I don't understand what ORES is. What does it do? How would it help compared to regular recent changes or m:RTRC? epicgenius - (talk) 12:43, 21 October 2016 (UTC)
- Hi epicgenius. See m:ORES for an overview. ORES is a machine classification service. It can flag edits that are likely damaging/vandalism for review. The m:ORES review tool surfaces predictions on Special:RecentChanges and a few other places. The review tool is installed as a beta feature on English Wikipedia. You can enable it in your preferences. The ORES service, however, provides scores that can be used in all sorts of interesting ways (e.g. User:DataflowBot/output/Popular_low_quality_articles_(id-2)). Quality control is just one of the use-cases of the system. --Halfak (WMF) (talk) 14:52, 21 October 2016 (UTC)
- @Halfak (WMF): Thank you for the explanation. So, are the classifications automatic from the beginning, or do humans classify the edits in order for the machine to provide suggestions? epicgenius - (talk) 15:42, 21 October 2016 (UTC)
- epicgenius, see m:ORES/What and Wikipedia:Labels/Edit quality. If you want more substantial discussion, see this presentation I gave about the system at the Berkman Center.--Halfak (WMF) (talk) 15:49, 21 October 2016 (UTC)
- Thanks. Anyway, I'll be sure to try it out. I'll respond with a support or oppose later. epicgenius - (talk) 15:50, 21 October 2016 (UTC)
Given that there is some dispute about whether ORES is sufficiently specific for this task, with the counter-argument that it can be tuned to decrease positive results, would it be possible for the people behind ORES to provide a specific list of example edits which would be matched (e.g. from some contiguous sample of 10000 edits on enwiki)? ⁓ Hello71 15:35, 25 October 2016 (UTC)
Actually, now that I think about it, given that the edit filter proposal seems overwhelmingly popular, would it not be possible to feed ORES scores into AbuseFilter so that EFMs could tune it based also on other criteria (e.g. user is not confirmed or whatever)? ⁓ Hello71 15:38, 25 October 2016 (UTC)
- Maybe not such a great idea, since it was pointed out that ORES scoring is slow. Maybe it would be worthwhile to split edit filters into immediate and queued types based on the actions taken, but that's a whole different can of worms. ⁓ Hello71 15:54, 25 October 2016 (UTC)
General discussion
- How many edits do you envisage this adding to the queue? If this is going to create a situation like IP edits on de-wiki, where the queue becomes so unmanageably long that the edits never get reviewed, then all this will do is add another unmanageable queue to a project which already hasn't got enough hands on the pump to cope with the existing queue at Special:NewPages; if that does happen, and it creates in practice a situation where any IP edit to a controversial topic never goes live because this has swamped the Pending Changes queue, that potentially has serious editor retention implications—a lot of Wikipedia editors start off as IPs editing their favourite band/politician/company and we don't want to send the "nothing you do is of any potential value to us" message. ‑ Iridescent 10:12, 14 October 2016 (UTC)
- Such a situation should not be allowed to develop. We should start slowly, and passively at first.
- For an edit filter, we start with passive defer, so the revision displayed is still the latest revision; the edit is merely added to the review queue (but note that we can filter out passive defers). If this creates a backlog, we make the criteria more stringent so the filter doesn't pick up as many edits, then we can make it active.
- For ClueBot NG, we also start with passive defer, using a false positive rate slightly lower than the one used for rollback, then check whether this creates a backlog; if not, we go active at this level and try a slightly lower level for passive defer, check whether that creates a backlog, and so on. (Similar for ORES; see the sketch after this list.)
- There are enough editors who deal with vandalism on this project. For various reasons, anti-vandalism work is more 'popular' than NPP and lots of other areas of the project. It won't take people from those other areas, because there are already enough people doing anti-vandalism work. This is just a way to make this work more efficient, by guiding these people toward the edits that are most likely to be vandalism or otherwise damaging (and prevent their visibility to the public). Cenarium (talk) 11:18, 14 October 2016 (UTC)
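The staged rollout sketched in the list above might be expressed as the following loop. Everything here is hypothetical scaffolding: the helper functions, the backlog limit, the step size, and the floor stand in for whatever monitoring and configuration hooks would actually exist, and each iteration would in practice run for a trial period rather than instantly.

```python
# Hypothetical sketch of the cautious rollout: relax the passive-defer
# threshold stepwise, and enable active defer only at a level already
# shown not to create a review backlog.

BACKLOG_LIMIT = 200   # assumed acceptable size for Special:PendingChanges
STEP = 0.005          # how much to relax the threshold per trial round
FLOOR = 0.9           # never defer below this confidence level

def staged_rollout(threshold, set_threshold, fetch_backlog_size, enable_active):
    proven_level = None
    while threshold > FLOOR:
        set_threshold(threshold)            # run passively at this level...
        if fetch_backlog_size() > BACKLOG_LIMIT:
            threshold += STEP               # ...too many deferred edits: tighten
            set_threshold(threshold)
            break
        proven_level = threshold            # this level is sustainable
        threshold -= STEP                   # try deferring slightly more
    if proven_level is not None:
        enable_active(proven_level)         # go active at the proven level
```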
- What kind of message do people receive when their edit is deferred? I suspect that their response (quitting in disgust vs. patiently waiting vs. merrily keep editing) partly depends on the response text. Jo-Jo Eumerus (talk, contributions) 15:51, 14 October 2016 (UTC)
- For passive defers, I don't think the user needs to be notified since the edit is merely added to a queue. For active defers, we may send an echo notification like "An automated computer program has requested review of your edit on <page>." plus the log summary of the defer. The notification could link to WP:Deferred changes. Cenarium (talk) 20:17, 14 October 2016 (UTC)
- How feasible is it to actively defer edits that create a new page i.e. replace the publicly available page text with something like "This newly created page has been flagged by an automated algorithm for human review before being made publicly available."? Such a functionality would be useful against spam, attack pages and copyright violations. MER-C 05:02, 15 October 2016 (UTC)
- Replacing the publicly available text would be difficult. Though userspace is now noindexed by default, and noindexing of new pages in article space is coming right now. Cenarium (talk) 13:34, 15 October 2016 (UTC)
- What's the real need for this? Do we not have enough anti-vandalism tools? I'm concerned that this would make the newcomer's Wiki editing experience even more unfriendly than it already is. SteveStrummer (talk) 02:28, 17 October 2016 (UTC)
- We don't have tools that act before vandalism occurs, except the edit filter prevent action and the spam blacklist, but these are too strong so can only be used on the most extreme kinds of vandalism/spam to avoid false positives. This provides a new way to deal with edits identified as damaging before they can be seen by readers. There is also plenty of vandalism that gets identified by edit filters but gets missed by existing tools for days. As for being unfriendly to newbies, this concern is present in active defer but can be mitigated by watching for false positives and using friendly notifications. Cenarium (talk) 15:42, 17 October 2016 (UTC)
- Given we have reasonable thresholds for active defers, I can see this reducing casual vandalism attempts, thus reducing the need for regular editors to deal with vandals. The problem this solves, to a point, is vandalism that becomes publicly viewable, which, in turn, gives the casual vandal a little "high". Without this high, these folks might well see that the Wikipedia isn't a place where they can get their kicks. Stevie is the man! Talk • Work 20:17, 5 November 2016 (UTC)
- Agreed. Even when we revert them, their edit is still visible to everyone (barring any administrative action) in the page history. — Gestrid (talk) 20:23, 5 November 2016 (UTC)
- Doubts about putting deferred edits in with the Pending changes list. PC currently includes any edit to individually selected articles; the PC patrollers are looking for not only vandalism but BLP violations and copyvios, and have volunteered and been granted the permission to do that job. No-one else who currently patrols for vandalism anywhere, but has not sought and gained the PC-reviewer flag, could patrol the deferred edits, even though anyone at all can currently patrol recent changes homing in on an edit filter tag or group of tags. No, a new queue would be needed for the deferred edits, and new criteria would have to be agreed to review that queue. Even then, I wonder if the proposer and supporters have sufficiently considered the knock-on effects: maybe "there are enough editors who deal with vandalism", but is it expected that many of them will move over to deferred edits from STiki, Huggle and other processes, making these processes ineffective or redundant? Noyster (talk), 10:04, 17 October 2016 (UTC)
- I don't see why they should be treated differently than any other pending edit; BLP violations and copyvios should also be watched for, as well as spam, etc. PC reviewers are chosen specifically for their ability to review this kind of edit, and deferred changes can be seen as a temporary pending changes protection (in fact, that is how it is implemented), so I don't see the need for another queue or usergroup. As for a knock-on effect, rolling back an edit with STiki or Huggle would remove it from the queue, so the processes are complementary. Deferred changes is more of a secondary tool, useful for edits that were missed by the primary recent changes patrolling tools. Cenarium (talk) 15:42, 17 October 2016 (UTC)
- What is the standard for accepting a deferred edit (especially active defer)? Is it the same as pending changes? In my view, it should be an even lower standard. If we wouldn't have reverted that edit on sight before deferred changes existed, then it should be accepted. I don't want this to turn into a way to reduce the voice of IP editors. agtx 16:05, 19 October 2016 (UTC)
- It's a legitimate concern, it would have to be discussed when rewriting WP:Reviewing pending changes for deferred changes. The standard would be de facto lower since there would be no need to "uphold the reason given for protection". Cenarium (talk) 12:48, 20 October 2016 (UTC)
- This is going to have some serious performance considerations. ClueBot NG takes about 5 seconds to score an edit[1]. ORES takes 1.5 seconds. Abuse filter takes long enough that its application is split into a job queue. If we were to defer changes at the time of save, we'd need to delay the save operation substantially. --Halfak (WMF) (talk) 20:34, 19 October 2016 (UTC)
- This doesn't have to be done at the time of save; I've added it in a deferred update (the terminology is a bit confusing here!) for AbuseFilter[2]. (FYI, AbuseFilter actions are not run in a job queue, since they sometimes need to prevent the action; only change tagging is deferred, since it needs the RC saved.) ORES I don't know enough about to implement, but I believe it's run in a job queue, so it would not be done at the time of save. For bots, it's after save, when we get the API request, like rollback. Cenarium (talk) 12:48, 20 October 2016 (UTC)
- Thanks for the clarifications. --Halfak (WMF) (talk) 20:26, 20 October 2016 (UTC)
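A minimal sketch of the point being made in this thread: scoring runs after the save returns, via a queue, so the editor never waits on a classifier that takes seconds per edit. The queue, worker, and callbacks are invented scaffolding, not MediaWiki's actual deferred-update or job-queue API (which is PHP).

```python
# Hypothetical sketch: defer edits off the save path via a background queue.
import queue
import threading

scoring_queue: "queue.Queue[int]" = queue.Queue()

def on_save(rev_id: int) -> None:
    """Called when an edit is saved: enqueue and return immediately."""
    scoring_queue.put(rev_id)  # O(1); the save itself is not delayed

def scoring_worker(score_edit, defer_edit, threshold: float) -> None:
    """Pull revisions off the queue; defer those scored above the cutoff."""
    while True:
        rev_id = scoring_queue.get()
        if score_edit(rev_id) >= threshold:  # may take seconds, off the hot path
            defer_edit(rev_id)               # added to the review queue post-save
        scoring_queue.task_done()

# threading.Thread(target=scoring_worker,
#                  args=(my_scorer, my_defer, 0.959), daemon=True).start()
```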
- Can an RfC even snow-close as passed? Because I think this one might qualify. — Gestrid (talk) 08:00, 8 November 2016 (UTC)
- @Gestrid: In theory, yes, of course. Happens all the time with content discussions. In practice, something as significant as this should probably run 30 days. ~ Rob13Talk 08:24, 8 November 2016 (UTC)
- @Rob13: I figured. Who knows? Once everyone's done watching the election, this RfC could fail miserably. (I said half-kidding.) — Gestrid (talk) 08:27, 8 November 2016 (UTC)
References
- ^ Geiger, R. S., & Halfaker, A. (2013, August). When the levee breaks: without bots, what happens to Wikipedia's quality control processes? In Proceedings of the 9th International Symposium on Open Collaboration (p. 6). ACM. http://grouplens.org/site-content/uploads/2013/09/geiger13levee-preprint.pdf
- ^ https://gerrit.wikimedia.org/r/#/c/218104/22/backend/FlaggedRevs.hooks.php