Wikipedia talk:Flagged revisions/Trial/Archive 2

History of this page

Anyone else not showing the history beyond December 30th? ♫ Melodia Chaconne ♫ (talk) 18:10, 2 January 2009 (UTC)[reply]

Works for me... Happy‑melon 18:21, 2 January 2009 (UTC)[reply]

I'm showing 30Dec then 2 Jan - but not my own post! Peridon (talk) 18:21, 2 January 2009 (UTC)[reply]

Masses of 2 Jans appeared now. Peridon (talk) 18:22, 2 January 2009 (UTC)[reply]

Well there weren't any posts here between 30 December and 2 January, so that explains why you're not seeing those :D. Are the recent posts appearing now? Happy‑melon 18:27, 2 January 2009 (UTC)[reply]

Most of the 2Jans (including mine) have gone again from history. I'm in the UK, if that makes a difference.) Peridon (talk) 18:42, 2 January 2009 (UTC)[reply]

They're back again including my latest (not this, obviously). Peridon (talk) 18:45, 2 January 2009 (UTC)[reply]

There were some history issues on I think WT:RFA a few hours ago, perhaps they are related? davidwr/(talk)/(contribs)/(e-mail) 19:05, 2 January 2009 (UTC)[reply]

Server issues at the moment. See Wikipedia:Help_desk#Servers_getting_out_of_sync. Black Kite 19:22, 2 January 2009 (UTC)[reply]

Aha. There were, it looks like, six edits missing when I posted that message, but it seems fine now. ♫ Melodia Chaconne ♫ (talk) 20:18, 2 January 2009 (UTC)[reply]

Neither test please

We have two suggested tests. Neither is useful; both will be harmful.

One is to sight all semi-protected articles for two months. For articles that need semi-protection, this will restore the obligation to revert vandals that semi-protection solved, and place in fewer hands: the reviewers. Furthermore, when vandalism dies down, as it tends to do, it will leave the article sighted. Sighting interferes much more with normal editing than semi-protection does; this is a net cost to Wikipedia in both states.
The other is to sight all FA's. This is, if possible, worse. Problems with FAs tend to be subtle and complex, requiring sophisticated edits from printed sources. Will the reviewer be able to judge them? I doubt he will even have access to the sources.
- If this one defaulted to making the unsighted the default visible one, it would have real prospects; at present, one editor can watch an article, and review recent changes to see if they need to be reverted; flagging would make this a tag-team effort. How much of an improvement this is over present capabilities will be seen; we seem to be able to handle most articles now. Septentrionalis PMAnderson 19:55, 2 January 2009 (UTC)[reply]
  Will the reviewer be able to judge them? Yes, because reviewers will only need to check the article for obvious vandalism, obvious copy-right violations and obvious libel. Ruslik (talk) 20:30, 2 January 2009 (UTC)[reply]
  Then it will not suffice for BLPs, which we need to protect against inobvious but actionable libels. If it will not do that, it has no purpose. Septentrionalis PMAnderson 18:46, 3 January 2009 (UTC)[reply]

Please take this failed proposal back to the German Wikipedia and drown it there. Septentrionalis PMAnderson 19:33, 2 January 2009 (UTC)[reply]

The first proposal tests whether the amount of vandalism such articles experience reduces in volume once vandals realise that their edits are not being reflected back at them; does the loss of the satisfaction of seeing their work immediately visible on the eight most popular website on the internet discourage them from making such vandalism? I personally feel that testing this on all 3,500-odd semi-protected articles is excessive; I'd rather see a trial on 1,000 such articles so we can compare to the 'control' of the articles that remain semi-protected. I have seen no evidence to support your bare assertion that "Sighting interferes much more with normal editing than semi-protection does"; that is something I would very much like to see some hard numbers on. Numbers that we could best gain by comparing... oh.

You completely misinterpret the role of sighting in the second proposed trial; the suggestion notes in large bold letters that "all revisions are to be sighted unless they are found to contain vandalism, copyright violations, or libel. FlaggedRevisions thereby functions as a low-level filter against obvious vandalism...". There is no "subtle and complex" judgement required; sighting here is to act as a filter against obvious vandalism, to quote verbatim. Of course, whether such a filter is useful is a matter for debate, a debate that we can much better have if we are in posession of some hard data on how effective it is. Data that we can best obtain by comparing... oh. Happy‑melon 23:06, 2 January 2009 (UTC)[reply]

Neither of your proposed tests makes comparison possible. Septentrionalis PMAnderson 18:46, 3 January 2009 (UTC)[reply]

Any data collected from such a test will be invalid. The current proposal admits, "The proposed configuration does not scale well to a full deployment". If it doesn't scale, then neither will our experience. Consequently, interpreting the results becomes a matter of voodoo: Someone will say, "The number of vandalism edits was ..." and someone else will say, "But the proportion of reviewers to edits will be ..." and they'll both be right. The minimum we ought to do is to make the test as similar as possible to the system we hope to deploy. The current proposal doesn't do that, so it doesn't give us the data we need. Ozob (talk) 00:44, 3 January 2009 (UTC)[reply]

If that is the case, then what are we doing now? Astrology? There's no guarrantee that our final deployment won't be a limited deployment; where is it written in stone that any final implementation must cover the entire mainspace? The implementation doesn't scale technically to a full deployment, that's a necessary safeguard. Procedurally and logistically it scales as large or small as we want to make it. If we wanted to conduct a trial over half the mainspace, we could do that with this implementation; it would just be technically very difficult (only in terms of what the surveyors have to do to set it up; for reviewers the situation would be exactly the same as a full deployment). Are you saying you'd support a trial that covered half the mainspace :D ? Happy‑melon 10:55, 3 January 2009 (UTC)[reply]

Yes, astrology about describes it; you are proposing to test on a large number of articles with no way of telling whether they would have been better or worse without the mechanic you propose to introduce. Septentrionalis PMAnderson 20:01, 3 January 2009 (UTC)[reply]

No, the proposal is that we continue to pour time and energy into the discussion of how we can test FlaggedRevisions, including what metrics and mechanisms to use, safe in the knowledge that when we finally get one, two, three, ten, sensible proposals we will have the technical means to implement them. If you want a another analogy, we propose that we build a lab. Happy‑melon 20:37, 3 January 2009 (UTC)[reply]

No, it is a proposal of certain technical means, which are themselves undesirable, to implement a suggestion for testing which themselves require massive rethinking to be implemented. Until we think this through, we do not need any tools; and we do not need these tools at all. If we are going to make sighting ability automatic, we should say so now; if not, we should not bother the developers. Septentrionalis PMAnderson 20:50, 3 January 2009 (UTC)[reply]

No, you see, in astrology, you entrust your life to the stars and the planets, who you can at least rely on to be in certain places at certain times. I think a better analogy for the present discussion is the Magic 8-Ball. :-)

I'm willing to support some sort of test of FRs. I oppose them on principle, but I admit that I don't have the data to back me up; neither does anyone else have the data to support them. But I have objections to the current proposal. I really don't think it's precise enough to decide anything. I mean it: I really don't. When you called it "deliberately vague" above I nearly exploded. My first draft of my comment on that was way over the top. (I'm glad I always click "Show preview"!)

I'm comforted to hear that you don't forsee any procedural or logistical barriers to widespread deployment of the proposed FR system. That's not how I interpreted the caveat in the lead of the proposal—I interpreted it as, "We want to try something, even if we know it won't work!" I've edited the proposal to try to make this clearer.

I almost want to ask you to go through the proposal and put in twice as much detail, but since the poll's already started, it's too late for that. If this poll fails and you want to try again, I'd be willing to look over another proposal before you open a second poll. I'd also be willing to look over specific proposals for tests. If we have to test this thing, then we ought to do it right. Ozob (talk) 22:05, 3 January 2009 (UTC)[reply]

A side question: what percentage of BLP articles (rough guess) are categorized as Category:Living persons? I did a dirty quick test of 10 random articles, so 7 are tagged as living, two not tagged, one does not list year of death but the person was young and active in 1960s and can be alive now. NVO (talk) 20:00, 4 January 2009 (UTC)[reply]

Suggestion for articles to sight

Hey, I'm really excited to see this trial going forward, after all the time that's been spent discussing mechanisms like this without evaluating them. The big problem currently appears to be choosing which articles to evaluate it on. While one can easily debate about the relative benefits and costs to each class of articles, I think the most important thing is that the regular editors watchlisting a particular article are comfortable with having the feature enabled on it.

So here's my suggestion: let's create a template that is placed on the talk page of every semi-protected and featured article, and any other article anyone wishes to manually place it on. It will describe the new feature and ask the editors of that article to come to a consensus regarding whether it should be enabled on that article. If they approve, they'll contact the surveyor specified in the template. This is much less likely to produce a backlash and more likely to get interested users involved with using the feature. Dcoetzee 19:57, 2 January 2009 (UTC)[reply]

Should be active consensus to enroll; that would at least limit this to articles being actively followed. Septentrionalis PMAnderson 20:00, 2 January 2009 (UTC)[reply]

That's a very interesting idea. Instead of picking articles by type or subject, you're essentially saying that we should invite editors to suggest articles that they think could benefit from having FlaggedRevs enabled? That would certainly ensure a wide variety of subjects; I can certainly think of half a dozen articles I've worked on that I would nominate. Why don't you jot something down at WP:Flagged revisions/Trial/Proposed trials?? Happy‑melon 22:57, 2 January 2009 (UTC)[reply]

Sounds like a good idea to me.-Oreo Priest ^talk 14:46, 3 January 2009 (UTC)[reply]

And permit users to protect articles against having FR. I think it not unlikely that the test will be actively harmful to the articles on which it is tried, and would prefer not to spend my time cleaning up my active watchlist from the harm it does. Septentrionalis PMAnderson 18:37, 3 January 2009 (UTC)[reply]

Glad to see this discussed. Pasted from my vote above: there should be a wide selection of articles included in this trial, otherwise the results won't be meaningful. The trial should include at least two of the following: FA article, BLP article, science article, controversial politics article. In each category there should be one high traffic and one low traffic article (traffic with respect to edits/day, not views). Xasodfuih (talk) 11:06, 4 January 2009 (UTC)[reply]

Bots?

Consider the following situation:

There are several revisions after the last sighted version when a bot arrives at an article.

One is a piece of raw vandalism
One is a constructive edit; to make it vivid, let's suppose it is a correction of a falsehood about a living person, in a separate section from the vandalism.

There are four alternatives I can see, if a bot can create sighted versions:

The new sighted version includes the revisions and the vandalism.
The new sighted version contains none of the revisions, thus retaining the BLP error.
The bot can select which revisions to include. How?
This will never happen. Why not?

Septentrionalis PMAnderson 21:44, 2 January 2009 (UTC)[reply]

Bots cannot create sighted versions in this situation, so the question makes no sense. Happy‑melon 22:31, 2 January 2009 (UTC)[reply]

To expand, the permissions given to bots mean that if a bot makes an edit on top of an already sighted revision, then the edit made by the bot is sighted automatically. If the top revisions is not sighted, the edits remain unsighted. So the 'answer' is 4: the situation cannot occur. Happy‑melon 22:32, 2 January 2009 (UTC)[reply]

That should be stated explicitly. It is a necessary condition for this not to lead to disaster.

But I'm not sure it's sufficient; not all bot edits are well-advised. (For example, some bots clean up syntax, and do not stop at direct quotations.) They should wait to be reviewed, like anons. Septentrionalis PMAnderson 18:40, 3 January 2009 (UTC)[reply]

I am not aware of bots that add libel, copyrighted material or vandalism to articles. So it does not make any sense to deny bots the ability to autosight their edits. Even if an edit is ill-advise it still should be sighted. Ruslik (talk) 19:44, 3 January 2009 (UTC)[reply]

But bots can worsen articles. Waiting three weeks for that to be corrected is a harm to Wikipedia. Septentrionalis PMAnderson 19:57, 3 January 2009 (UTC)[reply]

It is stated explicitly: in the PHP code for the specification, bots are given the autoreview permission but not the review permission. It is not technically possible for them to do anything other than the actions given above. I agree that it's not clearly explained in the proposal, but hopefully anyone else who picks up that point will read this thread and see the explanation. Editors can worsen articles too, everyone can. Are you saying that no one should have their edits automatically reviewed? Happy‑melon 20:29, 3 January 2009 (UTC)[reply]

You have identified a central problem with all flagged revision proposals (unless the unflagged version is visible), but there is a difference: Editors are presumed to be intelligent, and we assume good faith; bots are dumb, and will go wrong. Septentrionalis PMAnderson 20:53, 3 January 2009 (UTC)[reply]

I would say that the more fundamental difference is that humans are assumed to have judgement and bots are assumed not to. That's why humans can review and bots can't. The autoreview permission is an assumption of the good faith of the editor in question, which should apply just as much to bots as to their operators. I agree that bots screw up regularly and with varying degrees of spectacle. Just because an edit has been sighted doesn't mean it can't ever be reverted. That an edit is "sighted" will (probably) indicate that it does not contain obvious vandalism, spam or libel, nothing more than that. Since a bot physically can't add such things, I think we're justified in assuming that they're out to help, not to harm. Happy‑melon 21:25, 3 January 2009 (UTC)[reply]

Bots generally have a far lower error rate than human editors. We generally don't review bot edits currently anyway (which is why they have the bot flag). The odds of a bot making an error that's not just a formatting error is near zero. FlaggedRevs is mainly to catch vandalism and other things we really don't want readers to see; formatting errors really don't matter. Mr.Z-man 19:17, 5 January 2009 (UTC)[reply]

Sinebot

Sinebot made a strange edit to this page. As a result two my comments (and several votes, which were later restored) evaporated. Ruslik (talk) 22:21, 2 January 2009 (UTC)[reply]

Should probably contact slakr. §hep • ¡Talk to me! 22:57, 2 January 2009 (UTC)[reply]

Sunset provision

The enabling of any Flagged-revisions functionality (not the individual trials, but the entire schema) should be done only with a clear Sunset provision, which this proposal does not include. This proposal should be rejected on that basis alone. (sdsds - talk) 05:00, 3 January 2009 (UTC)[reply]

The only imperative statement in the whole proposal is that "each trial must have a definite endpoint". This is an extremely clear sunset provision for each trial, as you note. However, without an active trial, there is no "Flagged-revisions functionality"; if there are no active trials, the extension is invisible. Happy‑melon 10:50, 3 January 2009 (UTC)[reply]

On a technicalitly, there is no statement in the 'Future options' section of when the consensus for keeping FR will be reviewed. The proposal can be seen as allowing the beginning of new trials with specific endpoints indefinitely until one finishes with the right result. MickMacNee (talk) 15:29, 3 January 2009 (UTC)[reply]

What is the "right" result? Surely if we can ever agree that a trial has produced the "right" result then the trial period is over? I doubt the community's patience would stretch to an indefinite set of trials, and I trust our bureaucrats to judge the community's mood correctly. If support for continued trials begins to wane, then bureaucrats will be less amenable to beginning them, leading to the "invisible" situation. The sunset provision in this proposal is effectively that the community's willingness to continue is being reassessed at the start of every trial. Happy‑melon 17:55, 3 January 2009 (UTC)[reply]

The way to see whether this has produced the right result is to have a control sample which is not flagged, and which is otherwise comparable. Septentrionalis PMAnderson 18:41, 3 January 2009 (UTC)[reply]

Definitely, such ideas should definitely be a part of any viable trial. Happy‑melon 20:18, 3 January 2009 (UTC)[reply]

I would have said that measuring how much support has waned, given the multitude of different ways a trial can be proposed, is going to be difficult. Why not just say, "if after X months no specific trial config has made it into a roll out discussion, the feature will be removed"? Considering trials are in the order of months, this could be a very long process. MickMacNee (talk) 19:58, 3 January 2009 (UTC)[reply]

Doing so doesn't really mean anything, it's just an olive branch to those who are opposed to FlaggedRevisions in any form, since if support wanes the implementation goes into hibernation anyway. That's not to say it's necessarily a bad idea, of course. It would be rather difficult, however, to agree on how long X months should be, especially since as you say this is likely to be a very long process. I would be surprised if the trial phase was over before the end of this year. I think I have enough faith in our bureaucrats to trust them to make the right calls on this issue. Happy‑melon 20:24, 3 January 2009 (UTC)[reply]

Wikipedia -- The Encyclopedia that invites you to have faith in bureaucrats! Perhaps that model of governance is a big piece of the philosophical framework that allows some people to support flagged revisions.... (sdsds - talk) 21:13, 9 January 2009 (UTC)[reply]

Yes, that sounds about right. Why, is there something we should know about the ability of our bureaucrats to judge community consensus? Happy‑melon 21:44, 9 January 2009 (UTC)[reply]

Please indicate which wording in Wikipedia:Bureaucrats supports the assertion that a task like this is appropriate for bureaucrats. Taken to the extreme, bureaucrats could decide every word that apppears on each Wikipedia article page. We don't do that, though, because authoritarianism isn't the model of governance which has attracted so many editors to Wikipedia. (sdsds - talk) 05:21, 10 January 2009 (UTC)[reply]

Who determines the parameters of trials?

If we go forward with trials, who will determine the parameters of these trials? The proposal says that "A trial begins when there is a consensus on the pages, metrics and procedural details involved. Each trial must have a definite endpoint; either a fixed time duration or some other objective quantity." Are there going to be separate community-wide discussions about each and every trial or will the trials be put into the hands of a smaller, more manageable group held accountable to the larger community?

Until those questions are answered, it seems that the current proposal is entirely too vague. --ElKevbo (talk) 06:16, 3 January 2009 (UTC)[reply]

Each trial will require a separate consensus on which articles to include, how and when the various technical bits are assigned (when to sight, who to make reviewers, etc), when the endpoint is, and (most importantly) whether to go at all. As the bureaucrats are the ones with the technical ability to create 'surveyors', who are the only ones who can actually configure FlaggedRevs on individual articles, judging that consensus is in the hands of the 'crats; if we don't trust them to judge consensus correctly, we have much greater problems than a FlaggedRevs trial :D. While I expect the trial discussions won't attract as wide a community involvement as this discussion, they'll still be open to everyone to contribute. Happy‑melon 10:42, 3 January 2009 (UTC)[reply]

Could you at least amend the proposal to make clear that specficic proposed trials that can be proposed, as the four currently standing, do not have to be two months long, and do not have to require admins to allow reviewer rights. There are a lot of things that people clearly believe will happen in this process if implementation for trials is approved, but aren't mentioned anywhere on the page. The status of new pages for example as mentioned above in the opposes as well. MickMacNee (talk) 15:23, 3 January 2009 (UTC)[reply]

Amending this proposal while it's in a straw poll stage is asking for a riot, but the /Proposed trials page is entirely fair game (indeed I reverted just this morning an attempt to 'enshrine' those two variables there). Why don't you put something down there yourself? It's not 'my' poll :D... Happy‑melon 17:51, 3 January 2009 (UTC)[reply]

That is why WP:VIE strongly recommends against taking a poll until the proposal has been actively discussed, and then only to document an apparent consensus on the result. I have included a list of conditions which might persuade me and some others to support next time; but making proposals which cannot be changed shows a failure to understand that we edit by consensus. Septentrionalis PMAnderson 19:46, 3 January 2009 (UTC)[reply]

You've been asleep for about a month if you think this proposal hasn't been actively discussed, both with those active on WT:FLR and the wider community. Did you miss the RfC? Happy‑melon 20:17, 3 January 2009 (UTC)[reply]

No, I've just missed that obscure page; I will watch it hereafter. Yes, like most editors, I missed the RfC too; please supply a link. (This is why proposed changes to the whole of WP should go on watchlist notice.) Septentrionalis PMAnderson 21:34, 3 January 2009 (UTC)[reply]

If I understand you correctly, it sounds as if each step and parameter of each trial would require another round of community-wide consensus gathering. That, to me, seems unnecessarily bureaucratic and cumbersome. Would it not be more practical for these details to be worked out by a smaller and more manageable group of people? Whatever is done, it still seems that the proposal needs to be fleshed out and made more clear.

In any case, I appreciate your patience! --ElKevbo (talk) 17:57, 3 January 2009 (UTC)[reply]

I wouldn't say we need a separate community-wide consensus for each and every trivial detail. In the same way this proposal has evolved, we'd start with a framework, develop the details, invite community input and refine based on those comments, and then conclude with a consensus-demonstrating poll. That's the timeline of a good proposal. Doing so for each proposed trial separately is important for a number of reasons. Firstly, it acts as a check-and-balance to keep our feet on the ground and make sure that what we're doing has and continues to have community consensus. Secondly, it means we can have more than one parallel proposal in the 'pipeline', without which we'd be trying to shove square pegs into round holes to squash all the myriad possible variations on FlaggedRevisions into one implementation. Finally, we can separate that decision-making process from the two most controversial questions: "who arbitrates said decision-making process?" and "should we allow that process to proceed at all?". This proposal is really only to answer those questions: to say that the 'crats, rather than the devs, should judge our consensus, and that yes, we do want to carry this forward at least one more step. I agree that the individual proposals need considerable "fleshing out"; but that's another job for another day, once we've decided whether to let them continue to grow or slaughter them now :D. Happy‑melon 18:12, 3 January 2009 (UTC)[reply]

"Evolved"? How? Septentrionalis PMAnderson 19:46, 3 January 2009 (UTC)[reply]

If you read the archives, particularly some of the threads at the top of WT:FLR, you can see how this proposal grew out of two individual users' personal configuration ideas; the idea that we should conduct a trial grew organically from that discussion and pretty much snowballed from there. There was never a "Hey guys why don't we do a trial, we could run it exactly like this..." AFAIK. Happy‑melon 20:15, 3 January 2009 (UTC)[reply]

On the contrary, that's exactly what this page is; it leaves out many of the important specifications on what a meaningful trial should be, but includes exactly what keys and triggers should be used. Septentrionalis PMAnderson 21:10, 3 January 2009 (UTC)[reply]

I can see how it could appear to be such for someone who managed to miss all the discussion beforehand, and the first they heard of it was the announcement on various pumps. However, to conclude from that that the item in question must have been created instantaneously by some Designer is rather similar to certain other theories of questionable scientific legitimacy. The edits speak for themselves: this proposal grew from simple starting points into a more complex entity, pretty much the definition of evolution :D. Now we're keen to take everyone's input on how best to progress it further. Happy‑melon 21:40, 3 January 2009 (UTC)[reply]

I've just added some outline trial proposals. In light of the consensus dring a poll issue, if they aren't appropriate on that page, and need to come here, I would not object. Or should I mark them as late additions? MickMacNee (talk) 19:50, 3 January 2009 (UTC)[reply]

I'm glad to see some part of this that can be edited. Some of the discussion above implies that the proposals are not what we are currently being polled on, so modifying them should be fine. Septentrionalis PMAnderson 20:10, 3 January 2009 (UTC)[reply]

These are very interesting points; I think in particular a trial on Unwatched articles will be an absolute necessity. The /Proposed trials page was split out precisely to allow that page to evolve during this period of high interest, so as Pmanderson says, feel free to make whatever edits there you think are helpful. Happy‑melon 20:13, 3 January 2009 (UTC)[reply]

statistics please

Before this trial begins how about posting some stats on article edits, the numbers/percentage of vandalism edits that are true vandalism. How significant is this problem, is this going to address them or create little cabal worlds? Gnan garra 09:39, 3 January 2009 (UTC)[reply]

(Using the rule that 80% of stats are made up ...) looking at a small subset of about two hundred articles on my watchlist, about 15% of the articles have had vandal edits which I have seen (actually about 27 out of 180+) in a two month period. On those pages, about 3 to 5% of the edits are by vandals or very clueless newbies. It would be interesting to see if other editors have found similar ratios, or whether my group of "more contentious than average" articles represents worst-case. Extrapolating - in a test of 10,000 articles, I would expect 1,500 to be attacked within a given period of two months. Collect (talk) 23:51, 3 January 2009 (UTC)[reply]

I have done some research into this this afternoon. Working from the top of the recentchanges list, limiting to the mainspace and excluding log events, I have gathered the following statistics using a short python script, which I'm happy to post if you want it. From the most recent one million revisions:

Roughly a quarter of all edits are from IPs (238100 (24%))
The percentage of all edits that are obvious reversions (rollbacks or undos with things like "vandal", "spam" or "rvv" in the summary) is steady at around 4% (41,425/1,000,000)
Of those obvious reversions, roughly two thirds are reversions of edits by IPs (25770 (62%))
Of the individual users reverted, 75% of them are IPs. (17,941/24,211 (74%))

This does not really answer the most important question, which is "what percentage of edits by X are vandalism?", and note also that the method used to identify reversions will systematically underestimate the total number of reversions (and hence the total amount of vandalism). I do not believe, however, that there is systemic error in the last two figures, so I would be prepared to claim that "Three quarters of 'casual vandals' are IPs" and that "Two thirds of 'obvious' vandalism is from IP editors". Any suggestions for improving the depth and accuracy of these results? Happy‑melon 16:29, 4 January 2009 (UTC)[reply]

In short (and considering one million edits to be a statistically valid sample) almost 10% of IP edits are non-utile (giving them the benefit of the doubt). And about 2% of non-IP user edits are non-utile edits. This is, moreover, similar for IPs to what the German report claimed. Unfortunately we do not have a breakdown of regstered users by number of edits, and we must also be aware that some cases may be legitimate edit disputes as well (calling an edit "vandalism" does not automatically mean it was vandalism). Basically this means this experiment may not be the way to go (much as I wish it were). We would gain quite nearly the same positive result by making sighting only required for IP edits in the first place (basically a variant of "semi-protection") as we would be implementing flagging of all articles. Collect (talk) 16:47, 4 January 2009 (UTC)[reply]

We could test that. If we set up one trial with a bot to run round sighting all non-IP edits to stable versions (we'd have to give the bot 'reviewer' as well as 'bot' rights, because it can't normally do that), we'd be simulating exactly the system you propose. That could be an interesting implementation of FlaggedRevs (a full implementation would give autoreview to all registered (autoconfirmed?) users), certainly worth further consideration. Happy‑melon 17:09, 4 January 2009 (UTC)[reply]

Your result that 4% of edits are vandalism contrasts with a figure of about 10% that was being bandied about in Flagged Revs discussions a few months ago. As you point out somewhere else, WP editing has weekly and seasonal cycles. I suspect that the level of vandalism is much lower than average during school holidays for instance! PaddyLeahy (talk) 10:16, 5 January 2009 (UTC)[reply]

Note also that the count excludes bot edits, ie reversions by our anti-vandal bots; as I said, the percentage of vandalism is a systemic underestimate. Your point about school holidays is interesting though: I'm repeating the analysis taking one million edits from the bottom of the recentchanges table, which is December 6 onwards; at least some of which will be during school term time. Happy‑melon 10:34, 5 January 2009 (UTC)[reply]

Stats from the bottom of the recentchanges table are very much the same as those from the top:

How many revisions to analyse? 1,000,000
Total edits:   1,000,000
IP edits:        274,756 (27%)
Total reverts:    69,615
IP reverts:       45,191  (65%)
Total vandals:    39,309
IP vandals:       29,412  (75%)

I'll try and get the bot edits included. Happy‑melon 11:12, 5 January 2009 (UTC)[reply]

While you're doing that, do you think that you could get an estimate of how long articles spend in a vandalized state? Comparing timestamps between edits and reverts should give the start of an estimate. These stats are very useful, thanks! Lot 49a^talk 17:23, 5 January 2009 (UTC)[reply]

Also, it looks like 7.0% of edits in the older sample are vandalism. So PaddyLeahy might have a point. Lot 49a^talk 00:26, 6 January 2009 (UTC)[reply]

Likely so -- which is a tad more in line with my figures above. Key though is the apparent ratio of IP vandals which is quite nearly the same in each case. Collect (talk) 15:38, 7 January 2009 (UTC)[reply]

There are some links to studies over here. The result all seem pretty preliminary, but the main take home I get from them is that ~3% of page views of BLPs are vandalized views ~7% of article minutes of highly visited pages are in a vandalized state and that currently the median time to fixing is about 6-14 minutes (the mean time is much, much higher due to some vandalism that was missed for a very long time in a few cases).

At this point, given how hard a time we seem to be having coming up with good statistics or information on this problem, I feel like we ought to be spending more time gathering that evidence to make a rock-solid case for how serious a problem this really is.

I'm also open to hearing a compelling argument for whether having ~3% of page views of BLPs be vandalized page views is a compelling enough problem that we need to overhaul a significant portion of our editing and article approval processes. edit My tone is obviously sceptical but I am genuinely open to changing my mind on this. Lot 49a^talk 09:10, 8 January 2009 (UTC)[reply]

Here is another study. Lot 49a^talk 09:18, 8 January 2009 (UTC)[reply]

These are interesting figures and its lower then I thought, I expected to see around 35-40% of edits being vandalism related(edit/rvt) given the wider affect it'll have. To see results in the 3-7% range means that 93-97% of edits are not an issue, it appears to me that maybe we are making mountains in the sky over this. If there was a subset of articles which were attracting a higher rate I'd assess that but I definitely dont see any need based on these. An additional concern would be that complacency will creep into article watching making it harder to address the problems beyond edit/revert stage. As already said maybe a better effort in obtaining stats/measures would be beneficial before considering any trial period. Gnan garra 09:48, 8 January 2009 (UTC)[reply]

The more I read into the stats (and I agree that I think we need more studies) the more it seems like the biggest problem is the stuff that sneaks past the watch tower that is RC patrol etc. and then sticks around for a very long time because no one cares or is watching these pages. The median time for reverts seems really good, it's the nutty outliers where vandalism is unfixed for days that need to be brought down. These make the mean unacceptably high (though in aggregate, we're still talking only ~3% of page views showing bad edits live) It's not clear to me how FR solves this problem. I suppose it means that these unfixed vandalism changes would remain unsighted for all that time, but that brings up the backlog problem, the solution to which has been 'bots would auto sight posts after a certain amount of time" which means that the undetected-in-the-first-place vandals would still not be stopped? Lot 49a^talk 10:32, 8 January 2009 (UTC)[reply]

Is there a list of unwatched articles somewhere? If so, just advertising that fact might reduce the list. I wouldn't be averse to adding a few to my watchlist, and I'd expect that there are many others in the same position. - Hordaland (talk) 15:42, 9 January 2009 (UTC)[reply]

Apparently there is but only Admins can see it (probably to prevent vandals from picking easy targets). I'd sign up for a service where I got some articles assigned to me to watch :) Lot 49a^talk 22:56, 10 January 2009 (UTC)[reply]

User:Dragons flight/Log analysis may be of interest: analysis of the situation up to October 2007, at which point 10% of all edits were reverted (20% of IP edits); in general the problem was growing with time up to then but now we have reached a bit of a steady state (slowly declining edit rate) maybe vandalism has stopped growing or is even declining as well. PaddyLeahy (talk) 21:17, 10 January 2009 (UTC)[reply]

Lots of nice material -- at the time, then, 7% of edits were reverted and made by IPs. About 3% were reverted and made by registered users. Total IP edits have fallen, and the percentage of them which are reverted has fallen, though not as greatly as the drop in reversions of edits by registered users. Thus, vandalism is apparently reduced, especially from its peak, but there is still a substantial problem left. The issue boils down to -- is flagging the optimal way to proceed? Collect (talk) 23:12, 10 January 2009 (UTC)[reply]

But these stats obviously don't include all the vandlaism that would be occurring if lots of pages were not semi-protected? Or am I missing something. They only show the current reversion rate for edits to non protected articles, which misses all the attempted vandalism of the most popular but consequently protected targets. Mfield (talk) 23:22, 10 January 2009 (UTC)[reply]

As per Gnangarra, I would also expect some solid statistics laid down before such noise were made. That is, nobody is here exactly knows the extent of the problem (vandalism) but we're still voting for a trial (aiming to solve that problem) to be implemented or not. How would the results of this trial be interpreted then, be compared against what? It seems to me that some users would like to have some fun time with this trial even not knowing what's going on. I suspect german and russian wikis have had any real improvement after the implementation of that flag thing.Logos5557 (talk) 00:09, 11 January 2009 (UTC)[reply]

The German Wikipedia couldn't find any statistical improvements after 8 months of flagging, so discussion is ongoing to increase the requirements for Sighting de:Wikipedia_Diskussion:Gesichtete_Versionen#Ergebnis_Teil_1_-_Kriterien_f.C3.BCr_die_Sichtung_.28Vorschlag.29, which will increase the current backlog further. Mion (talk) 00:25, 11 January 2009 (UTC)[reply]

Mion, could you point us to any statistics on WP:DE regarding improvements?--Jo (talk) 09:21, 15 January 2009 (UTC)[reply]

Yes I can point to statistics at Wikipedia:Flagged_revisions#Statistics, and de:Wikipedia_Diskussion:Gesichtete_Versionen#ParaDox.27s_Tabellen_und_Diagramme, (with thanks to all who are working hard to provide them). More stats are expected. Mion (talk) 10:04, 15 January 2009 (UTC)[reply]

Statistics

Available statistics are at:

http://stats.wikimedia.org/DE/ (updated September 2008)
User:Hut 8.5/German editing stats (2009)
User:Hut 8.5/DEWP reviewer stats (5 may 2008- 12 dec 2008)
http://s23.org/wikistats/wikipedias_html.php (2009)
http://vs.aka-online.de/cgi-bin/rchiststat.pl?dur=86400&lim=30&wp=DE&hl=&fl= (live)
http://toolserver.org/~aka/cgi-bin/reviewcnt.cgi?lang=english&action=images&project=dewiki (live)
http://toolserver.org/~aka/cgi-bin/reviewcnt.cgi?lang=english&action=outofdatereviews&project=dewiki (live)
http://toolserver.org/~aka/cgi-bin/reviewcnt.cgi?lang=english&action=overview&project=dewiki (live)
de:Spezial:Statistik (live)

No page will ever change unless sighted / "seconded"

There is probably a lot I don't follow regarding the mechanics, but one thing seems clear, no page will ever change unless sighted.

In other words, were all articles on FR, and were no-one to ever flag anything, Wiki would never change, for better as well as for worse. Under FR, change only happens when manual flagging occurs. That is a lot of work, but it is much less than doubling the normal activity we now have. Even were every future edit to be viewed and either passed or declined, that would be less than double work because reading and deciding take less time than sourcing and composing.

However, there is still a lot of work. That means a few people doing a lot or, on average, people sighting as much as they contribute. I guarded support the proposed trial, though I would recommend on purely mathematical grounds that even IPs should be allowed to sight articles. Quality revisions do not depend on sighters, who may know little about a topic, but on the contributors at a page. The advantage of sighting is mainly at low traffic pages where solo operating experimenters have been saying for some time, in their own unique, way that we should adopt a "display only if seconded" policy.

I have a feeling that mathematics will force us to be generous, or we will indeed only drastically slow genuine improvement and growth of the encyclopedia in our attempt to exclude vandalism. It's worth remembering that more sources are published every day than current levels of volunteers could ever keep pace with. "Seconding" edits is the perfect counter to solo experimenters; however, authorising selected volunteers to render judgments outside their areas of expertise will only lead to burn-out of those volunteers, contributors they clash with, and the encyclopedia itself.

Viewed positively, people's "sighting counts" will be a very important contribution to the encyclopedia, and get us all working more closely together.

"If it were done ... 'twere well it were done quickly."

I support FR so long as IPs are authorised as sighters. Alastair Haines (talk) 14:18, 3 January 2009 (UTC)[reply]

Uh, doesn't that defeat the purpose of FR? If were limiting which registered users can sight articles, how does opening it up to IPs help the situation? IPs have no stable or permanent edit record, so even if an IP applys for sighting permission, how do we determine that the next person to edit under that IP is the same person who was given the permission? How do we stop vandals and trolls from using permitted IPs to sight their own edits or the non-productive edits of others? I don't see how FR can ever work unless we know exactly how is doing the sighting, and that they can be held accountable for what they do with the sighting priveleges. - BillCJ (talk) 15:07, 3 January 2009 (UTC)[reply]

Btw, I don't need to know anything about an article's subject to know that "This article is KEWL!!!" has nothing to do with nuclear physics or whatever other subject. The main point of FR is to keep the crap out of public view. - BillCJ (talk) 15:11, 3 January 2009 (UTC)[reply]

It is a point that everybody needs to bear in mind, under the FR system, there is no-one to "blame" for a perfectly good edit not being sighted, it is just an inherent feature of the system, much like there is no-one to blame if nobody investigates an SSP, or responds to an ANI or other noticeboard post. There is as far as I can see, no specific commitment to even investigate the effects of this basic change in the editting process in the proposal. MickMacNee (talk) 15:17, 3 January 2009 (UTC)[reply]

Thanks guys, two good points.

Yes Bill, no expertise is needed to reject "This article is KEWL!!!" which is why IPs can do it. On the other hand, I've been so long out of mathematics, I wouldn't trust myself to know if some "corrections" to the Laplace transform article were cheeky experiments or not. If we are to have "qualified sighters" that's fine, so long as they really are qualified; and that can't be determined within the current Wiki structure. User:BobTheJazzMan sights a correction to the Binary numeral system#Binary arithmetic to ensure it says "1+1=0" and rejects it as vandalism. How long will it take to persuade Bob, who is a trusted Wikipedian with 50,000 edits to music articles, that he's actually making Wiki maths articles less reliable?

I also take Mick's point, we are all to blame if contributors give up donating effort to Wiki because no one gets around to sighting their new Pokemon related article under the strain of having to prioritise where they go sighting, or because they're not a qualified Pokemon expert, or because they think they are.

I'm in favour of FR if it is a way we all take non-expert interest in one another's work, or support the work of others working in similar areas to ourselves. If it's going to introduce a group of "experts" into the system, that's a new thing altogether, and ultimately contributors will want to see verifiable independent, third-party backing for such claims of expertise, and constraints on taking action outside that reliably sourced expertise. That's OK with me, but if I were a maths editor confronted by BobTheJazzMan telling me what's right, because User:WikiRulesOK said he could be trusted, I'd have a good giggle and contribute my future donations to online knowledge via another web-site.

To conclude, I think FR is a brilliant step forward, if it forces us to take a closer interest in one another's work and "second" it for our co-workers. However, if we set up some system where some editors are supposed to be more expert than others, that's another thing altogether. Sighters will regularly be less expert than the contributors whose edits they are reviewing. Perhaps someone can come up with a dozen examples better than my "1+1=0" one. Alastair Haines (talk) 17:39, 3 January 2009 (UTC)[reply]

The only way FlaggedRevisions can work logistically is if the group of 'reviewer's make up a measurable fraction of the active user population. De.wiki has over five thousand sighters; we have three times as many articles as them, so we're looking at a reviewer base of at least ten thousand editors. No one is entirely sure how many active editors there are in total on en.wiki, but that must be at least 10% of the total, and including every one of the top quartile by activity. I don't believe that the formation of a 'cabal' in such a situation is really possible.

As for what will happen in such a situation, which I agree is highly plausible, I think the important thing to realise is that there is more to wiki editing than just sighting revisions. We have to actually make revisions as well :D. If User:Foo reverts a 1+1=0 edit and sights it, that situation is entirely analogous to User:Foo stumbling across such a phrase in an article today and 'fixing' it. When User:Bar, a more "expert" mathematician, sees the change, he'll have a quiet laugh, restore the change with an explanatory edit summary, and either sight the change himself or wait for a sighter to see the change (and the explanation) and sight it for him. The only change from the way it currently works is the "sighting" part; the history would be exactly the same, and it would appear very similarly in people's watchlists. Happy‑melon 17:48, 3 January 2009 (UTC)[reply]

Thank you Happy-melon, which is a really great name, but I won't ask more about it. You do calm my anxiety on key points. Most importantly, a very large number of sighter/reviewers is intended. I note the proposal suggests all who currently have rollback access, which I believe I have without ever having asked for it. And, incidently, I would never have asked for had a request been necessary. Likewise, even now, I would accept the responsibility of sighting if it was given to me, but I would not ask for it if it wasn't...

except for one thing:

you speak of User:Bar "sight[ing] the change himself". This is the odd thing, it seems that effectively the idea is that anyone can apply to sight her own edits, if she's not considered to be a potential risk of vandalism she'll be accepted. Presumably, if she breaches trust, she could lose this access; however, the main thing is, FR seems simply to be about a way of implementing a kind of "probabationary" feature for accounts. That doesn't quite match the figure of 10% of active editors being sighters, and were I a regular but not high volume contributor not granted the option of sighting my own edits, where others were, I might feel irritated that my good faith contributions have not already been noted and new hoops are being imposed on me.

Anyway, I don't mean to be critical of the FR idea, nor of your comments Happy-melon. I think we agree on most matters of principle and although it might seem there's a gap between your 10% and my 100% (sighting can be lost, but is otherwise automatic) I think that gap can be closed by what we learn from trials. You are doing a great job of helping talk nervous people like me and others through this trial process. That's precisely how a community should work, good for you! :) Alastair Haines (talk) 18:30, 3 January 2009 (UTC)[reply]

Thanks for your comments. I think a lot of people misjudge the amount of security we can legitimately expect from FlaggedRevisions, either expecting it to be a panacea to all our ills or to be of no use whatsoever. Of course it's not going to stop all disruptive edits, no technical measure can. What it can do is stem the tide and remove the bulk of the crap that we get deluged with every hour of every day, leaving us humans time to deal with the problems that can't be handled by such processes - and of course, the actual job of writing the encyclopedia :D. I don't see it as a "probation" so much as an expression of good faith: "you've clearly shown some good edits, and no obvious problems, so there's no reason not to give this to you". Obviously if that tool is abused, that indicates that our initial assumption of good faith was incorrect, and so it should be removed. The ultimate position should in fact be the same: all legitimate accounts have the flag, and it has been granted and then removed from all the 'bad' accounts. Of course logistically we're never going to get to that stage, but the principle is indeed the same. Happy‑melon 20:09, 3 January 2009 (UTC)[reply]

I do mean to be critical of the FR idea; I began by thinking this an ill-thought out proposal for testing it, but this discussion has convinced me that the original idea was almost as bad.

The example here makes clear what is wrong with it, besides sheer impracticability: suppose User:Foo with his good-faith fix, is an admin, and sights his own edit; there is no requirement that admins know or understand mathematics. Then we have something worse than the present situation, for User:Bar's fix may well have to wait three weeks or longer to actually take effect on the visible side of WP. Septentrionalis PMAnderson 20:24, 3 January 2009 (UTC)[reply]

Gosh, it would be nice to have some evidence for how long they'd have to wait that wasn't imported from a wiki with a radically different culture and mentality to ours... :D You raise a good point, and if our median sight time is 21 days it will indeed represent a significant problem. We really do need to confirm for ourselves that we can keep that backlog down. There's only one way to find out: to test and see if we can handle a small set of articles. If we can, we can try a larger set, and a larger set, until we are confident that we have either found our limit, or that we can handle an entire namespace. Some number between zero and infinity is the number of articles that we can successfully sight without building up an enormous backlog. No one here has the foggiest idea what that number is, we have no evidence whatsoever that is really applicable here. Why don't we collect some? Happy‑melon 21:45, 3 January 2009 (UTC)[reply]

This proposal admits that this test will not scale. The German Wikipedia as a whole is smaller than we are; tests can only prove that backlogs will exist, they can't disprove it. Septentrionalis PMAnderson 22:30, 3 January 2009 (UTC)[reply]

The proposal does not scale technically: the difficulty of maintaning a trial (ie setting which pages should and should not display FLR behavior) increases with the size of the sample. Of course the difficulty in maintaining the pages themselves also scales with the size of the trial, but that's what we're trying to find out. If we decide to have a trial on 6,914,558 articles, we can do that, it will just be a pig to initiate. The data gathered from such a test would be exactly the same as if we had enabled FLR over the entire mainspace. The point is that this configuration is not a sustainable solution for having large numbers of pages displaying FLR behavior. But then again, that's the whole point. Happy‑melon 16:13, 4 January 2009 (UTC)[reply]

No, the data from such a trial would be utterly useless, because nobody would ever be able to examine it. (The secondary consideration that there would be no control to compare it to is relatively minor.) Septentrionalis PMAnderson 16:56, 4 January 2009 (UTC)[reply]

Oh I quite agree, such a trial would be utterly useless and would have my most strident opposition. My point is that the data gained would be just the same (and equally useless) as if it were gathered from a full deployment. The proposal allows trials of any shape and size we want, with the understanding that larger trials are more difficult to organise. The only restriction is, as you correctly note, our ability to analyse the resulting data, with larger implementations being more difficult to draw accurate conclusions from. As evinced admirably by the mishmash of soundbites and statistics from de.wiki. Happy‑melon 17:06, 4 January 2009 (UTC)[reply]

I think autoreviewing as discussed in the section Autoreview delay below would be an effective way of preventing backlogs, at the cost of letting through a small proportion of vandalism; in this system all revisions that have stood as the current revision for a certain time period (say 24 hours) are automatically flagged as reviewed. The vast majority of vandalism is still detected, and no good edits have to wait longer than a predictable, reasonable period to get through. Dcoetzee 02:07, 6 January 2009 (UTC)[reply]

Questions for somebody with great patience . . .

I gotta tell you, as a technodolt, I'm just barely able to grasp what is being proposed here. I went to the Wikilab thing and made an edit, looked at the history, and now have some questions. Okay, so if the page viewed by the public does not change until it has been "sighted", what happens when multiple editors attempt to edit that page ere anyone takes time to "sight" it? Will the second editor see the page as edited by the previous editor, or the public page? I assume the former, and if that's the case, when I go to "sight" it, will it show me the edits individually or altogether? If there were six edits made since the last sighting, all by different anons, and edits #2 and #5 were vandalism, will I be able to dismiss those and leave the others? Doesn't all this lead inevitably to at least some likelihood of increased edit conflicts?

As an editor who has recently (and reluctantly) come to the conclusion that perhaps we should require registration to edit _{(it feels to me like 90% of anon edits are vandalism, and 90% of vandalism is from anons)}, I probably should jump on this proposal, right? But like someone in the discussion up there mentioned, one of the things that may motivate an editor in the early going of their Wikipedia career—whether they are anon or registered— is getting to see their edits right away. Are we going to be turning off a substantial portion of our potential editorship with this? Finally, can this process be applied exclusively to anon edits? That is, could it be set so that all edits from registered accounts—even those just created—would appear right away?

Just some questions from someone like everyone else here who just wants the best for the project. Un school 17:07, 3 January 2009 (UTC)[reply]

Firstly, I could just as easily imagine someone getting a kick from seeing his/her edits immediately, as from seeing that another human being has agreed that those edits improve the Free Encyclopaedia.

Secondly, have a look at a page that has not yet been checked. The 'edit this page'-link at the top of the page will presumably read 'edit draft', and will allow you to edit the unchecked version, even showing you what (unchecked) edits were made since the last review. Have a look, experiment a bit. I think this should answer several of your questions. -- Ec5618 17:32, 3 January 2009 (UTC)

These are good questions. The first is very simple: the edit box always contains the latest version of the page wikitext, so there is never any conflict there; editors are editing the current version, whether or not they saw it on the 'view' screen. If there are differences, they are shown in a diff above the edit box, so editors will always know that they've either seen the latest version, or what the changes between the version they saw and the version they're editing are. This would allow them to see, for instance, if a change they wanted to make has already been made. If you're able to sight the page, then after you've made your edit you'll be invited to review the latest changes. A diff will show you all the changes made on top of the most recent sighted revision. If some of them are vandalism and others not, you may need to make another edit to revert only those vandal edits, but you'd need to do that with or without FLR (and with the system, you have a diff at the top of the screen to show you exactly what you're looking for). Once you're in a position where you're happy that the diff between the last sighted version and the current revision contains only positive changes, you can click a button to sight it (and by implication, all the edits before it). You don't have to sight each edit individually; they're sighted versions, not sighted edits.

We're all aware that the 'buzz' an IP gets from seeing their edits straight up in blinking lights is one of the things that is going to be lost with FlaggedRevisions. However, remember that that buzz is not always a good thing: I'm sure a large proportion of vandal edits are motivated by exactly the same instant gratification. Although obviously I can't put any hard evidence to it (yet :D), I hope that the two additional low hurdles (creating an account to see current versions, and getting some sort of reputation to get the 'reviewer' flag) will actually encourage people to make the transition from being the IP who makes the once-in-a-blue-moon spelling correction, to the registered user who makes the occasional gnome edit, to the fully-fledged editor who is an active part of the wikipedia community. I think our encouragement of existing editors is quite good; our biggest chokepoint is in getting IPs to register in the first place. I hope that this will actually help to get them on the map.

On a technical level, yes, we could give all registered users the ability to sight revisions. However, I don't believe that this would be beneficial. IIRC the numbers you quote at 90% are actually closer to 60%; certainly there are a staggering number of vandal edits that do not come from IPs. Sockpuppeteers, hardcore vandals and even casual vandals who use accounts purely to hide their IP addresses, all demonstrate that it is not possible to say with any certainty that all, or even an acceptable majority, of registered users are trustworthy. I would be very hesitant even to put my chips down to say that an acceptable majority of autoconfirmed users are trustworthy. In my opinion, this level of 'trust', which is really very low, is the lowest that requires the human touch; a review (however brief) by a real person who we've already determined to have coherent judgement. Certainly 'reviewer' should be granted very liberally indeed, to almost everyone who asks for it. But I think that having to make that step of submitting a request is important, for two reasons. Firstly, it encourages editor development, as noted above; making a foray 'backstage' is an open door to innumerable 'force multiplier' tools and features that will make them a more durable (less likely to drift away) and effective editor. Secondly, it weeds out the casual vandals and blatant undesirables (spammers and SPAs in particular). All in all, I strongly believe that a lightweight human review for 'reviewer' is an essential component of a successful FlaggedRevisions implementation.

I hope this answers your questions, or at least gives my perspectives on them. Feel free to ask if anything needs further clarification. Happy‑melon 17:39, 3 January 2009 (UTC)[reply]

Thanks for your answers, melon. There was a single point which I failed to make clear...I wasn't suggesting that all registered accounts be able to sight edits, I was suggesting that all edits by registered editors not need "sighting". But I can see from your other comments (about people registering to perform vandalism) that you would probably not favor this. One last question that I hadn't thought of before. If I am a "reviewer", are my edits automatically "sighted"? Un school 18:15, 3 January 2009 (UTC)[reply]

If you make an edit to a sighted version (ie there are no changes between the time the last version was sighted and the time you edit) then the version after your edit is sighted automatically. If you make an edit on top of other unsighted revisions, the final version is not automatically sighted. This makes sense since it's versions that are sighted, not edits; if the only change between a new version and a sighted version is an edit by a trusted user, it makes sense to assume that the new version is clean. However, if there are other changes involved, it wouldn't be appropriate to automatically sight the new revision. Happy‑melon 18:19, 3 January 2009 (UTC)[reply]

Just one question... "Who's on first?" Seriously... I could not make heads or tails of your last answer. So let's try again... 1) If I am a "reviewer" are my edits automatically "sited" 2) if not, do I have to get someone else to site them or can I site them myself? Blueboar (talk) 22:01, 5 January 2009 (UTC)[reply]

Let's say User:123.123.123.123 edits foo. Their changes are obviously not autoreviewed; so after their edits the page has "unreviewed changes". Now let User:Rollbacker come and make an edit. They have made changes to an unsighted version, so their edits are not marked as "sighted": if they want the version after their edits to be sighted (and hence display for anons) they have to "sight" it manually (there is an screen for them to do this immediatley after their edit is processed). Once they've done that, the page is 'up to date', and has no unreviewed changes. Now if User:Reviewer comes along and makes an edit, all the changes that are made have been made by an explicitly 'trusted' user, so User:Reviewer's edits in this case are automatically sighted; no one has to manually mark them as sighted. Anyone who can sight edits can sight all edits, including their own; indeed there are subtle encouragements in the software for people to do this as good 'wiki cleanliness'. I hope this clarifies. Happy‑melon 23:10, 5 January 2009 (UTC)[reply]

Question regarding consensus

I'm strongly opposed to this idea, as are many, many others (not to mention, I'm sure, valid and productive IP editors). What are you guys in support going to take as consensus to push this forwards? At the time of writing we're on 93 in support and 38 in oppose, so that's roughly 71% support. In RfA that's considered by most to be the lowest possible percentage to promote an administrator. In RfB, that's not enough. I doubt it'd be anywhere near reasonable enough for ArbCom either. So, what are you guys going to say is the benchmark for you to take this forwards? I'd imagine it should be a lot higher than 70% support. —Cyclonenim (talk · contribs · email) 19:16, 3 January 2009 (UTC)[reply]

In this case, it's not up to us do decide. The developers are responsible for judging this particular consensus. That they are neither selected nor best placed to do so is a reason why we need control of this process in the hands of the bureaucrats. Happy‑melon 20:01, 3 January 2009 (UTC)[reply]

The very idea of placing this proposal up for a poll before discussing the implementation, and then declaring it unamendable, is a violation of the Wiki way, in providing no avenue to reach consensus; it should be dropped now. Fortunately, developers are unlikely to take notice of anything we do here. Septentrionalis PMAnderson 20:06, 3 January 2009 (UTC)[reply]

The point of this straw poll is to see whether there is consensus for taking the idea further, turning on the capability to have a trial and hashing out detailed proposal(s) for trials. As Happy-melon says, if subsequently there is no consensus on any detailed trial proposal, no trial will happen. This straw poll is not going to lead directly to any final decision being taken. Even if we have a trial, and even if there is consensus that the trial was a success, consensus could still change and we could still decide to turn it off again.—greenrd (talk) 10:28, 4 January 2009 (UTC)[reply]

I very strongly object to any tests under this protocol. (I have stated my reasons above.) I therefore object to the protocol in itself, and want it turned off now and permanently; I could support tests under other conditions. Others object to any tests of FR at all, because no results would get them to support such a change to Wikipedia. Both of us are entitled to object in toto now, rather than having to catch and object to each test proposal. Septentrionalis PMAnderson 16:51, 4 January 2009 (UTC)[reply]

This is my point precisely, and the whole reason to conduct the process in this fashion: it is important that we see whether there is consensus for, and for people like yourself to have the opportunity to comment on, the principle of conducting any trials at all before it makes sense to finalise any particular trial in detail.

What "other conditions" would you apply to the trial architecture? Happy‑melon 19:01, 4 January 2009 (UTC)[reply]

Wikipedia:Flagged_revisions/Trial/Proposed_trials#Conditions would be a good start. Not all of those are my ideas, but most of them are good ideas. Septentrionalis PMAnderson 22:19, 4 January 2009 (UTC)[reply]

I can't agree with the presented numbers, in sep 2008, registered editors (with more than 20 edits) made 13971 edits stats anons made 6058 edits ( 30,24 %) stats, we can safely assume that no anon editor would vote in favor of this proposal, so 185 sup + 102 opp makes 287 votes *.30 = +86 anon opposing votes, in effect the proposal has just as many opponents as proponents .

Taking into account that after the implementation on the German Wikipedia one of every 5 editors left the project raises the question, is 49 % enough ?Mion (talk) 03:28, 5 January 2009 (UTC)[reply]

The above arguments for requiring exceptionally high super-majority to represent consensus seem to be, a) I don't like it, b) All anons would vote like I would so we need to make it harder, c) Full implementation on the German Wikipedia resulted in lots of people leaving.

These seem fallible to me. a) "I don't like it" is obviously not a suitable reason. b) How anons would vote is speculation. Anons are simply editors without accounts. A large proportion of them could support it. c) This is more an argument against the idea, not why we need an exceptional super majority. So it devolves to "I don't like it" again.

I should note also that this is not an irreversible change, or even fundamental change at the moment. It's a trial of functionality, that may either succeed or not once it's tested out. Perhaps when it's time to poll on if it should be rolled out across the wiki, if that ever happens, it might require an exceptional supermajority, but not now. I think a simple majority would probably be good enough for a trial implementation. --Barberio (talk) 03:43, 5 January 2009 (UTC)[reply]

For the first A, is not about, I don't like it, the proposal is a change of rule 2. Ability of anyone to edit articles without registering , a change to make visible edits unvisible. One of the five core rules of the foundation. See m:Foundation_issues

For B, i suggest we do roll playing, I do the roll as Sighter and you logout so you become an anon editor, you make an edit and i make the decision about your valuable contribution. Mion (talk) 04:03, 5 January 2009 (UTC)[reply]

As I said, these are arguments against the use of Flagged revisions, not for a stronger super-majority requirement for a limited trial of flagged revisions.

You've had a full chance to convince people that your opinion is correct on it's merits. That people still seem to disagree is not a cause to require a greater super-majority considering it's only a limited trial. --Barberio (talk) 04:08, 5 January 2009 (UTC)[reply]

At the moment, this has less than 65% approval. Does anyone contend that this is a sign of WP:Consensus, even in our peculiar sense? Septentrionalis PMAnderson 06:52, 5 January 2009 (UTC)[reply]

For a trial, yes. --Barberio (talk) 07:01, 5 January 2009 (UTC)[reply]

Incidentally, comparing this to RfA votes is misleading, as this vote has a much larger population involved. The difference between 100 and 200 votes is more people than the difference between 10 and 30 votes. --Barberio (talk) 07:07, 5 January 2009 (UTC)[reply]

Yes, indeed, it is misleading. 65% of a sample of 300 is very likely to be within a percent or so of a sample of all Wikipedians; 4-2 or even 20-10 are much more dubious. But WP:CONSENSUS defines consensus as a compromise everyone can agree on; there is none such here. Even if all those who would support some form of testing were won over, that would only be 70% or so; 30% dispute this absolutely, and many of the supports are weak, depending on the fraudulent promise that FR will produce reliability and the erroneous impression that the delays on the German WP are minimal.

WP:Consensus also says Articles go through many iterations of consensus to achieve a neutral and readable product. If other editors do not immediately accept your ideas, think of a reasonable change that might integrate your ideas with others and make an edit, or discuss those ideas. That applies even more strongly to a change like this to Wikipedia's structure; the thing to do now is to discuss what changes to this proposal might make it broadly acceptable. It is possible that #just a thought... below, might be one such, but I cannot speak for those who find this unconditionally unacceptable. Septentrionalis PMAnderson 16:07, 5 January 2009 (UTC)[reply]

I think you're making a common miss-reading of WP:Consensus, that a single person or a minority, can filibuster any change by preventing an assumption of general consensus. The policy also includes the phrases, "Consensus can only work among reasonable editors who make a good faith effort to work together in a civil manner.", "Consensus is a partnership between interested parties working positively for a common goal.", and "A representative group may make a decision on behalf of the community as a whole.". Filibuster like attempts to set the bar artificially high, or require complete unanimity, are not good-faith efforts to work together. Consensus is not a set of handcuffs that tie the rest of the group to someone who wants to walk off a cliff. --Barberio (talk) 16:43, 5 January 2009 (UTC)[reply]

The correct interpretation of "consensus" is that the people with the bigger sticks can say that consensus exists even when others say it clearly does not. "Consensus" is routinely abused on Wikipedia to force through the wishes of cliques who have sufficient control of Wikipedia's power structures to do so. To push through this change, in light of the arguments presented in the straw poll, would be a clear abuse of process. However, as our soi-disant constitutional monarch has said he wants it, and has said elsewhere that he is prepared to impose policy on BLP issues, debate seems pointless. DuncanHill (talk) 16:49, 5 January 2009 (UTC)[reply]

Your point, Barberio, if I understand it correctly is, that consensus is like a majorite vote and if the majority is in favor, everyone opposing is a bad-faith editor whose point of view should be discarded? I am sorry, but I always understood that consensus is a question of who has the better arguments, not of what more people like. I'd prefer if some people were to judge consensus here who are not involved here and have no preference...if such people exist. Regards So Why 17:06, 5 January 2009 (UTC)[reply]

Everyone always believes that their own view is the better argument. So if we were to judge consensus solely on if people were presenting differing arguments they felt were valid, then one person could prevent anything happening by saying "I disagree because the space lizards say to."

Consensus is about trying to get consensus agreement by allowing for discussions amongst people working together in good faith. This does mean that a minority of dissent may occasionally have to be ignored if they can not convince the majority that they have valid arguments. The wikipedia processes do normally allow for 'well the basic majority want X, but the minority have compelling arguments not addressed by the majority' to be used to prevent finding of consensus. But not 'well an overwhelming majority want X, and a minority have presented arguments addressed and rejected'.--Barberio (talk) 17:21, 5 January 2009 (UTC)[reply]

If "overwhelming" means "slim" (and that with a debate which excludes most of those likely to be most adversely affected by the trial) then there is an overwhelming majority to implement this trial. If, however, "overwhelming" means what most people use it to mean, then there isn't. DuncanHill (talk) 17:54, 5 January 2009 (UTC)[reply]

I just want to point to a relevant past precedent : Wikipedia:Non-administrator_rollback/Poll. It was fully enabled with 304 for and 151 against. Some opposes made apocalyptic predictions much like now. After the poll was closed as implement, some filed a RFAR. Still we have non-administrative rollback now, and Wikipedia still exists. Ruslik (talk) 10:37, 5 January 2009 (UTC)[reply]

Unless the poll results change drastically,implementing this on these results would be a disruptive nuisance; anyone who presumes to do so should, at a minimum, be stripped of the tools they have abused. Septentrionalis PMAnderson 16:07, 5 January 2009 (UTC)[reply]

It appears that this trial will be implemented at the current count. You may, if you wish, take the issue to arbitration. --Barberio (talk) 16:43, 5 January 2009 (UTC)[reply]

Oh, there are so many other stops for disruptive and unsuitable admins before ArbCom; Barberio should feel free to explain his view here. For now, it is the minority even in this section. Septentrionalis PMAnderson 17:51, 5 January 2009 (UTC)[reply]

Barberio, there's no reasonable way this can be interpreted as a consensus. We need more discussion about implementation, not games of chicken where we challenge people to go to the ArbCom. JoshuaZ (talk) 20:09, 5 January 2009 (UTC)[reply]

I feel it is perfectly reasonable to oppose this 'trial' on the basis that one would oppose the full implementation of the flagged revisions. discounting opposes on that basis seems like poor form. Protonk (talk) 20:01, 5 January 2009 (UTC)[reply]

Non-admin rollback was not a change to the fundamental principles of the Foundation, but rather a technical matter. In addition, for NAR, a developer called a consensus, but this was disagreed with by dozens of editors, even those who supported it. Comparing the two is a false analogy. NuclearWarfare ^{contact me}_{My work} 20:04, 5 January 2009 (UTC)[reply]

Discussions of site-wide technical changes like these are always difficult to handle because they cross the least well-defined territory in the balance of power within wikimedia: between the Foundation and the wiki communities. We are a wiki community, which has complicated, sometimes arcane, and outwardly inconsistent means of judging consensus amongst huge groups with vastly differing opinions. The Foundation is a much smaller organisation, with a hierarchy that is both more clearly-defined, and more linear. This technical change, if implemented, will be made by Wikimedia developers who are employed and contracted to the Foundation; they are both entirely obligated to do as the Foundation decides, and entirely immune from action by the community. How, exactly, would we "strip of the tools they have abused" someone with shell access to the database? These are people who could write this entire thread out of the revision tables and make it appear to have never happened.

The Foundation is totally dependent on its communities to create the free content that is its mandate; and yet the communities are totally dependent on the Foundation to organise and finance the environments and structures that 'house' them. The Foundation has to balance the desires of its communities with how well those desires support its constitutional goals, and the communities have to balance what's best for them with what's best for the readers. In this context, the thought that it is possible to put an absolute number on the percentage of support required is even more ludicrous than in most of our other poll situations.

This poll is, I'm sure, attracting attention far beyond the en.wiki community. I would be very surprised if its 'result' was judged solely by a single developer. I would not be surprised if the final outcome is not what anyone expects; in my opinion, it really is completely out in no-man's land. But my point is that this really is not a vote; it's not even a situation where we can apply our customary hodge-podge of vote counting and strength-judging. We could ask a bureaucrat to close it. We could ask all twelve active bureaucrats to present a collaborative analysis. We could ask our ArbCom to adjudicate. But when push comes to shove, it's really not our decision. We're making a request, effectively, for divine intervention, how are we supposed to dictate whether that intervention is granted? Happy‑melon 23:33, 5 January 2009 (UTC)[reply]

A hundred and more of us don't want divine intervention, thanks. Septentrionalis PMAnderson 23:54, 5 January 2009 (UTC)[reply]

I'm well aware of that, thank you. Well over two hundred of us do. Whose voice is 'louder' in the ears of the developers? I don't know. Some people (on both sides) have put forward some truly awful arguments which evidence little more than their inability to do basic research... will the developers weed out those voices and give them less weight as a bureaucrat would do? I don't know. Will certain voices be given more weight than others as a result of their familiarity to the Foundation (ArbCom, Checkuser/Oversight, etc)? I don't know. My point is not that we should be in any way discouraged from making both our opinions and our arguments heard; merely that trying to second-guess the result is doomed to failure. Happy‑melon 11:15, 6 January 2009 (UTC)[reply]

Above in this thread, Barberio wrote: "I should note also that this is not an irreversible change..." That's not the impression I have from other pages on this topic. I'm sure I read that once this gets implemented, it may or may not stay dormant, but that it will never be un-implemented. To me, that suggests many hundreds of hours spent mobilizing for & against dozens of proposals for more-or-less well-thought-out trials. A huge time-waster, in other words. - Hordaland (talk) 22:42, 7 January 2009 (UTC)[reply]

To suggest that it can never be "un-implemented" is simply not correct; like everything else on wikipedia, consensus can change and if there is a consensus to remove the extension, the developers will do so. However, such a consensus would need to be very strong; the "other pages" suggest rather that this would be quite difficult to achieve. However, with this proposal, the job of determining whether to allow another trial is not being made by developers each time, but by our local bureaucrats, who are much better at judging the desires of this community - that's why they were appointed. They can perform a more selective analysis of the actual strength of the arguments for and against, rather than a more-or-less straight vote count. So it will not be necessary to "mobilise" in the same way as would be required for a dev request; merely to clearly and coherently put forward the arguments for and against, in a widely-advertised discussion. I think the thought that there will be "dozens" of trials, or even trial ideas that become fully-fledged proposals, is misleading. Happy‑melon 12:21, 8 January 2009 (UTC)[reply]

On the contrary, requiring consensus against to turn it off is to declare that it will never actually be turned off. There isn't consensus against it now; there never will be, because some people will always see this as a magic pony which would solve all our problems, if we only wish hard enough. If there were a provision that lack of consensus to continue was enough to turn it off - as the German Wikipedia is clearly divided on it- then I would be more willing to experiment. But that's another point for the next draft. Septentrionalis PMAnderson 23:15, 8 January 2009 (UTC)[reply]

The distinction between "turning it off" as in completely uninstalling the extension and burning the extra database tables, and "not using it", is purely semantic; neither will prevent the 'magic pony brigade' from wanting to call in the cavalry, and neither will satisfy those who would only be placated if every copy of the source code was hunted down and destroyed. For the reasonable people in the middle, however, having this configuration installed but there being no consensus to do anything with it, is no different to not having it in the first place. Since each trial requires a positive consensus (and a sunset provision), to suggest anything other than that use of this extension requires continuing positive consensus, is disingenous. Happy‑melon 14:36, 9 January 2009 (UTC)[reply]

OK, so now I'm not one of "the reasonable people in the middle," thanks. I do not agree that the diff between uninstalling (or not installing in the first place) and not using is "purely semantic." If we don't install, we avoid long discussions like this one and get back to writing the encyclopedia. If we do install, there'll again be as many opinions as there are Wikipedians about which trial, how to do it, controlls, tests, evaluation etc. And people will be called unreasonable, as I have been here, and likely give up the discussions. Some will give up Wikipedia altogether.

Let the Germans fight it out for another couple of years. There is no hurry. The offer isn't going to go away. - Hordaland (talk) 06:20, 10 January 2009 (UTC)[reply]

Reversion

This edit is simply unacceptable. Some of us, clearly, believe that defining the necessary conditions for any future test to be the only point to continuing to discuss this page and the possible tools, instead of unconditionally opposing any future implementation of Flagged Revisions at all.

If Happy melon wishes to suppress discussion, it would simplify matters if he would state his chosen venue of WP:Dispute resolution now. Septentrionalis PMAnderson 21:07, 3 January 2009 (UTC)[reply]

This edit should also be mentioned, a few seconds before the one Pmanderson notes. My reasoning is given in the edit summaries; of course I will revert if there is consensus to do so. Happy‑melon 21:49, 3 January 2009 (UTC)[reply]

I agree with PMAnderson that it is important to raise these issues, and I agree with Happy-melon that Wikipedia:Flagged_revisions/Trial/Proposed trials is a better choice of venue. As a compromise, I suggest providing a suitable link or links from this front page to the subpage. Geometry guy 23:04, 3 January 2009 (UTC)[reply]

Huh?

I am probably as aware as most editors, but I have no idea what "Flagged revisions" are, and all of this talk and all of these references make very little sense to a very busy person. Okay . . . what IS a flagged revision? Yours in puzzlement (and please, no putdowns). GeorgeLouis (talk) 05:16, 4 January 2009 (UTC)[reply]

A 'flagged revision' is an edit made by someone who has not yet established themselves as a sincere contributor to Wikipedia. Flagged revisions are saved but not published until someone who has established themselves as a sincere editor reviews the edit and verifies that it is not vandalism. The primary purpose of flagged revisions is to eliminate vandalism, which usually comes from either unregistered users or registered users who have made very few edits. The German Wikipedia implemented an initiative like this several months ago. – SJL 06:46, 4 January 2009 (UTC)[reply]

The first two sentences are completely wrong: flagged revisions are not the unverified versions, but the verified ones. See m:FlaggedRevs. Geometry guy 13:48, 4 January 2009 (UTC)[reply]

Don't be a dick. I misunderstood and reversed the attribution, but the rest of my explanation is accurate. I was trying to be helpful, after quite a while away from Wikipedia, and you just reminded me why I stopped contributing regularly in the first place. – SJL 18:14, 4 January 2009 (UTC)[reply]

A sharper mind and a thicker skin might help. It is kind of important when discussing a trial of flagged revisions that the people voting on it understand what they are voting for, no? Many, apparently, do not. Geometry guy 10:52, 6 January 2009 (UTC)[reply]

Oh, pooh! Quit arguing. ("Can't we all get along?") Thank you, SJL. Sincerely, GeorgeLouis (talk) 08:14, 9 January 2009 (UTC)[reply]

Can trial also experiment with what IP users see?

As many have noted, it's hard to judge the impact of a change without experiment. So would it be possible to experiment with different appearances to the IP user? Here is one example I could imagine - there are many others:

In one case, display the latest revision, with an optional button to view the last sighted version (and perhaps labelled latest revision with seal of approval or something similar, since an IP user most likely will not know what a sighted version is.) And perhaps a note that tells them that if they wish, they can create a login and make this the default, since they would not know that either.
In the other case, display the latest sighted version as in the current proposal.

This could be done across different pools of pages, or sequentially, or randomly, or maybe diliberately set the English wiki up with a different policy from the German one, so the results can be compared.

This could allow us to understand some user behavior that is just speculation now. What percentage of IP editors are discouraged by having their edits not appear right away? What percentage of the time do users wish to see the different (latest, sighted) versions? How many people become registerd, and the first/only thing they do is set the sighted/unsighted preference? Of these users, how many become editors?

If we don't try some experiments like this, we could fall into the traditional computer science trap. A lot of high tech stuff has horrible user interfaces, since developers are selected from those who do not mind mastering arcane interfaces. It's just like the professor who knows the material well, but is a rotten teacher since they can no longer even imagine what it was like to not know the material, much less sympathize with a poor student. Likewise, here we are arguing about the effect on IP users, but since we are all registered users we are not at all typical newbies, and hence our opinions are suspect. So rather than argue about the effect on IP users it would be much better to measure it, if possible. LouScheffer (talk) 06:28, 4 January 2009 (UTC)[reply]

Expanding on your thought a bit. I would recommend that the first test involve only special test pages which could be used to establish the format for the basic messages that would be used in the following live tests. I would hate to have new users have to deal with something that experienced users have not even had the chance to review and test. I remain very concerned as to just what the IP users are going to see and just how they are going to have to interface with it. I know that we can change it, but the current default is totally unacceptable that I would not want to see it used in any live test. Dbiel ^(Talk) 20:20, 9 January 2009 (UTC)[reply]

Limitations

I would be happy with flagged revisions if

Revisions are accepted automatically after a period of time from last edit (say use mean time to vandalism reversion *constant, where the constant is between 1 and 3). I believe I read a paper on wiki which quoted the mean time to reversion. Can't quite remember where though. I may hunt it down later.
Articles are, by default, not flagged. Flagging an article should be considered to be almost as bad as a semi-protect -- these are quite similar.
A large userbase (say all autoconfirmed users, or some other level of trust) are able to sight revisions

Some features that would be useful

A filtering of the article history that is shows only draft & sighted edits. This would make diffing articles that have heavy vandalism a bit easier.

My 2 cents User A1 (talk) 07:14, 4 January 2009 (UTC)[reply]

Seconded. Imo very concise simple practical considerations.

Your #1 stops the FR system from making all change dependent on manual sighting—things'll tick over w/out us, phew!

Your #2 allows us to be sensibly selective about applying FR where it's needed, and spare ourselves unnecessary work elsewhere.

Your #3 seems to protect "anyone can edit", or is simply a practical necessity should FR end up being more draconian than your #1 and #2.

Your #4 sounds practical too, but though I'd be content that a filter defaults to "sighted"+"draft", I'd like to have the option to "see all", including who drafted alleged vandalism, and who deemed it to be so.

I'd be v. interested in both PMAnderson's and Happy-Melon's views on these suggestions. Perhaps they could agree on very specific things like this. :) Alastair Haines (talk) 07:50, 4 January 2009 (UTC)[reply]

Number 1 is currently technically impossible, since FR do not have this feature. However 1 is actually not very important for the limited trials proposed.
Number 2 is already in proposal. In the proposed implementation articles are not flagged unless surveyors manually enable flagging over them on page by page basis.
I agree with number 3. Reviewer permission should be liberally distributed. Actually Flagged Revisions can be configured so that anybody with a certain amount of edits (other conditions can be attached too) is automatically assigned to the reviewer group. However we decided against including this into the trial proposal, because of the lack of agreement about criteria. In future automatic assignment can be implemented, of course. Ruslik (talk) 08:27, 4 January 2009 (UTC)[reply]
As to filtering, if my recollections are accurate, it is already a part of FR (you can check this on en. labs).

Ruslik (talk) 08:27, 4 January 2009 (UTC)[reply]

As a side note #1 could technically be enforced by bots, writing a bot to do this could be done with nasty scraping scripts or with more clever use of the wiki-API (may need modification). But one would hope that such a bot would be located close to the wikimedia network to keep bandwidth down. Of course having it as part of the FR system would probably be most efficient. 121.44.50.78 (talk) 11:00, 4 January 2009 (UTC)[reply]

This seems a reasonable kernel for the next attempt at this. Septentrionalis PMAnderson 17:07, 4 January 2009 (UTC)[reply]

Report from Polish Wikipedia

As many of you know, we've been using flagged editions on pl-wiki for a while. Initially I was very skeptical about it. However, I changed my mind after a couple of times a vandalized edition was simply not visible to common folks - in my view now it really does increase the quality of output to the outer world. However, I have also noted a serious risk: quite often an established editor might edit a part of an article and accidentally add validity to some nonsense. I would not over exaggerate this risk though: after all to some extent this is what we have now. I disagree with some of the editors above that flagging should be "as bad as a semi-protect". I think that unflagging should be just a little proof that an edition is confirmed by an established editor, and thus should be quite widespread, while flagging last editions from IPs and new editors is only a consequence of this approach. Pundit|utter 07:53, 4 January 2009 (UTC)[reply]

Please explain review procedure in pl-wiki: how deep the scrutiny goes? NVO (talk) 08:01, 4 January 2009 (UTC)[reply]
You mean, in everyday practice? My wild guess would be that it does not go very deep. Some editors will certainly make a big deal of "certifying" some revision, while others would perfunctorily correct a typo without caring about simultaneous confirmation of the article's version. My point is, the bottom line is what we have now (an average Joe come to Wiki and perceives whatever is in there as "confirmed", I don't think everyday users see all the nuances, many do not even realize there is a history button on the page). Thus, flagging will serve a good purpose: occasional Wikipedia users will be more likely to see a sensible revision, rather than rubbish. I don't think flagging may resolve all our problems, but it adds some value. On the other hand I do realize it may be a bit confusing for new editors, whose edits will not be immediately visible to everyone. Pundit|utter 09:12, 4 January 2009 (UTC)[reply]

A few questions...

I'm struggling to find a page that explains this idea without techno-babble (no, I don't script antivirus software and design firewalls, so you may as well explain it in Welsh) so I have a few questions-

If an IP makes an edit to a page, can they then see it, or will they have to wait for someone to review it?
If an IP clicks "edit this page", do they see something in the edit box different to the article if another IP has made edits to it since a revision was approved?
1. If not, do they edit the approved version, even if there are other edits waiting to be approved?
  1. If so, must reviewers choose the best possible "new" history, discarding the rest?
  2. What if the same IP edits a page several times, as is all too common? Can a reviewer only choose to save a single one of their edits?
  3. If multiple saved up changes can all be approved, what if they conflict with each other?
  4. Are we not going to lose attribution as reviewers choose to slip changes in under their own name, as they can't approve both changes made to a page?
Are recent change patrollers (I'm thinking particularly of the 12 years old, never wrote an article brigade) really going to continue recent change patrolling when it turns from the romantic notion of zapping vandalism, hunting down trolls and getting them blocked to the positively thrilling art of clicking to "approve" edits?
Does this not effictively mean that we are treating IP editors as vandals by default, meaning that they're guilty until declared innocent?

I'm assuming I've missed something significant here, as I refuse to believe the Flagged Revisions I understand could possibly be supported by anyone who had any respect for Wikipedia. J Milburn (talk) 12:30, 4 January 2009 (UTC)[reply]

You can try many of these things out in the live demonstration. An important point to realise is that we're not talking about verifying edits, but versions. The system works more or less as it does now, you see. When IP users visit a page, they see the stable version of the article. That is new. When they go to edit the page, they edit the latest 'draft' version, which includes all edits made since the 'stable' version was flagged. Indeed, the link 'edit this page' reads 'edit draft', to accentuate the difference, and the edit window shows some text to notify users that they are editing a 'newer' version than the one they saw. On top of that, the changes between the 'stable' version and the current draft version are shown, so that the editor can decide whether those changes are as much of an improvement as the new edits the user wants to make.

But the basic process isn't much different as it is now. Editing a page that has been modified saves a new version that includes both edits. Editing a draft saves a new draft that includes both edits.

Please have a look at the live demonstration. It will answer many of your questions. -- Ec5618 13:46, 4 January 2009 (UTC)

I'll have a go at answering your questions. If an IP edits a page, they will see a notice at the top of the edit screen saying (by default) "edits will be incoporated into the stable version when an authorised user reviews them". Once they've saved their edit they will see the article as it was before their changes, with a short explanatory banner including a link to the current version of the page. So the most recent version is only a click away. Whenever anyone edits a page, the contents of the edit window are the most recent version; if this is different to the version the user saw on the edit screen, a diff is provided to highlight any changes. So whenever a user with sighting ability goes to edit or sight a page, they see a diff of all changes since the last sighted version. It is slightly confusing to think of sighting individual edits, what is really sighted is the version of the page in between edits. So It doesn't matter if those changes were introduced in one edit or a hundred; what matters is the changes those edits cumulatively make. Really that's no different to the way things work currently, except that the process is made a little more streamlined and easy to keep track of.

I obviously can't speak for all or even any RC patrollers, but I can say with certainty that this extension isn't going to stop all vandalism. If used correctly, it can make vandalism invisible to the outside world, which is really just as good, but RC patrol is not going to change overnight. I don't know the answer to your question, but I want to find out. That's why we should test it to see if it works.

Obviously a lot of the 'philosophical' issues surrounding FlaggedRevisions are dependent on your personal point of view; but I think that many people overemphasise the extent to which FlaggedRevs represents a change in 'faith assumption'. We already treat IPs as second-class citizens, limiting their abilities, treating their edits with automatic suspicion. Whenever we see an IP contributing to a discussion we suspect a sockpuppet; whenever we see an IP edit at the top of our watchlists, we suspect vandalism. There's no avoiding the issue: I've just parsed the last ten thousand recent changes, found 1,706 rollbacks of which 1,189 are of IPs; of the 385 distinct users reverted, 292 of them were IPs. More data is (as always) needed, but the initial conclusion is that around 70% of vandal edits and 75% of vandals are IPs. With those two thoughts in mind, what we're 'doing' to IP editors with FlaggedRevs actually begins to look rather mild. Are we treating them with assumption of guilt? Yes, I suppose so. Do we do that already? Most definitely. In my perspective, what's new here is a way to approve IP edits, to remove that assumption of suspicion. That's new, that's not been available before; previously all IP edits were equally suspicious. It's a matter of perspective.

You've piqued my interest in edit statistics now, so I think I'm going to go write a script to analyse recentchanges more thoroughly, to try and get some more accurate numbers to play with. That's what this proposal is all about, after all. Happy‑melon 14:20, 4 January 2009 (UTC)[reply]

"Once they've saved their edit they will see the article as it was before their changes". This is not true, at least not on the demo install. After an IP user has saved their edit, they will see the most recent version of the article with the results of the edit visible. This shows to them that nothing went wrong and confirms the "anyone can edit" slogan. Only when refreshing the page or coming back to it are they shown the last sighted version. --94.192.121.203 (talk) 14:43, 4 January 2009 (UTC)[reply]

That's interesting to know - I never edit from my IP address as an absolute rule because it identifies me very accurately indeed, so I was making my best assumptions from looking at the source code. I presume that there is a note to identify the version as the current version rather than the sighted version? If so, I see that as a rather elegant solution to, as you say, confirm the "anyone can edit" philosophy and to allow them to check that everything went ok. Happy‑melon 16:08, 4 January 2009 (UTC)[reply]

On editing, it shows MediaWiki:Revreview-editnotice when the current version is sighted, and MediaWiki:Revreview-edited when there have been edits after sighting. Saving shows MediaWiki:Revreview-newest-basic-i and a note about an authorised users having to review edits to the page. --93.97.227.226 (talk) 01:25, 5 January 2009 (UTC)[reply]

And how will newbies react when their edit is visibly registered and then disappears? The tests will not register those who go away confused or disgusted. Septentrionalis PMAnderson 17:00, 4 January 2009 (UTC)[reply]

"Hey where's my edit?? Let's do it again!" Clicks on edit like before. "Oh, my edits are visible at the top of the page in green, that must be good. And right above them it says that 1 change needs review, and that Edits to this page will be incorporated into the stable version once an authorised user reviews them." ... "Oh and just to be sure, they're in the edit box too where I wrote them, so there's no way anything could have gone wrong! How long do I have to wait for an authorised user?" Does no answer to the last question make people confused or disgusted?? Haven't accounts of first time Wikipedia experiences shown more surprise that anyone can edit, rather than their attempts at humorous inserts sticking for a longer time? Anyone with any experience in collaborative environments, and heck, experience in working with people, will expect peer review. That's what all you watchlisters are doing already anyway, it's just hidden away to us anons who then have no idea of what's going on behind the scenes and nothing to lure us to participate in it. --94.192.121.203 (talk) 01:25, 5 January 2009 (UTC)[reply]

"all IP edits equally suspicious"? I, and Huggle's edit-assessing algorithm, beg to differ -- Gurch (talk) 17:11, 6 January 2009 (UTC)[reply]

I will agree with Gurch that "all IP edits are not equally suspicious" but speaking for myself they do all get more attention from me as being potentially more suspicious. - Other factors that I take into account include the edit summary (no edit summary = more suspicion) I consider registered users with redlinked user pages and talk pages to be no different that IP users. Dbiel ^(Talk) 20:38, 9 January 2009 (UTC)[reply]

Thankyou Happy-melon, Ec5618 and 94.192..., I do sort of see where I was going wrong while thinking about it. PMAnderson still raises my main concern- this is complicated and bureacratic, and is probably going to scare off potential users. Consider my opposition to this scheme weaker. J Milburn (talk) 23:41, 4 January 2009 (UTC)[reply]

Here's my attempt at answering the above questions. Note that some of the details are configurable options in the software. Also note that the use of flagged versions can be enabled or disabled on a page-by-page level.

If a non-logged-in user makes an edit to a page, they will immediately see the new version. If they return to the page later, then they will see the "flagged" version of the page, and they will also see a message explaining that there is a newer "draft" version available. They will be able to click on a link to see the new "draft" version, and diffs between the latest draft and the earlier "flagged" version. They will have to wait for someone to review the changes before the "draft" version gets promoted to a "flagged" version.
If a non-logged-in user clicks "edit this page", they will be editing the latest version of the page, whether or not that version is a "flagged" version. If the version they are editing is not a "flagged" version, they will also see a diff between the "flagged" version and the latest "draft" version that they are busy editing.
1. They do not edit the approved or "flagged" version version, they always edit the latest version, whether or not it is "flagged".
  1. Reviewers see only one line of history, just as at present. Reviewers must check the diffs between the older "flagged" version and the latest "draft" version before they mark the latest draft as "approved".
  2. If the same user edits a page several times, and a reviewer wants to choose to save some but not all of theyr edits, then the reviewer will have to use the "undo" function to undo bad edits, or manually edit the article to their liking; after that, the reviewer can mark their own latest version as "flagged".
2. There is no way for multiple saved up changes to conflict with each other, because there is only a single line of history, and each change (whether "flagged" or not) builds on the previous change (whether "flagged" or not).
  1. If reviewers make any changes before flagging an article, those changes will show up under the reviewer's name in the history.
Recent change patrollers will be able to continue the exciting job of zapping vandalism, hunting down trolls and getting them blocked; and will also be able to do the less exciting work of clicking to "approve" edits.
This would not mean that we would be treating IP editors as vandals by default.

—AlanBarrett (talk) 19:12, 8 January 2009 (UTC)[reply]

One consideration I think people are missing

Not sure how many people read the rest of the page compared to a hit and run "zomg this is horridz!", but one thing to remember is that anyone CAN see the drafts, even if they are an IP; at least that's how it works on the tests, and how I've always understood it. It's simply a matter of what one sees in a normal browsing -- you can switch to the draft any time you want. Yes, only 'trusted users' (or whatever) can site things, but we also have limitations on moving (and yet we still have such people are the one who moved this page to nonsense last night), limitations on making a new article, and of course semi-protection.
Plus, for now at least, this ISN'T about making the whole of WP FR. It's about making a small subset of articles where having the extra layer of control to what the non-editor sees is only a good thing. Earlier I came across someone posting a name and a phone number. Isn't it good that less people see THAT? ♫ Melodia Chaconne ♫ (talk) 12:50, 4 January 2009 (UTC)[reply]

It would be a good thing if we could certify the absolute bonesty and clue factor of every single editor before they could make an edit, but then this wouldn't be Wikipedia anymore. It would be a publishing house with a staff. MickMacNee (talk) 22:27, 6 January 2009 (UTC)[reply]

possible mathematical analysis to be done?

In "statistics please" supra, an interesting suggestion was made.

“

In short (and considering one million edits to be a statistically valid sample) almost 10% of IP edits are non-utile (giving them the benefit of the doubt). And about 2% of non-IP user edits are non-utile edits. This is, moreover, similar for IPs to what the German report claimed. Unfortunately we do not have a breakdown of regstered users by number of edits, and we must also be aware that some cases may be legitimate edit disputes as well (calling an edit "vandalism" does not automatically mean it was vandalism). Basically this means this experiment may not be the way to go (much as I wish it were). We would gain quite nearly the same positive result by making sighting only required for IP edits in the first place (basically a variant of "semi-protection") as we would be implementing flagging of all articles. Collect (talk) 16:47, 4 January 2009 (UTC)

We could test that. If we set up one trial with a bot to run round sighting all non-IP edits to stable versions (we'd have to give the bot 'reviewer' as well as 'bot' rights, because it can't normally do that), we'd be simulating exactly the system you propose. That could be an interesting implementation of FlaggedRevs (a full implementation would give autoreview to all registered (autoconfirmed?) users), certainly worth further consideration. Happy‑melon 17:09, 4 January 2009 (UTC)

”

Is such a test feasible? Would it delay the current proposed trial, or could it be done fairly quickly? I doubt it requires a great deal of time to run, and I would hope the statistics it furnishes would be of great use in discussing how well or ill the trial fares. Thanks! Collect (talk) 17:18, 4 January 2009 (UTC)[reply]

I meant that this is a trial that could be run under the configuration this proposal is considering. Writing a bot to handle the sighting would be fairly easy. If there was a consensus to run such a trial, it could be done very easily with the proposed configuration. Happy‑melon 18:55, 4 January 2009 (UTC)[reply]

Why would you need a bot, just give autoreviewed right to all users -- Gurch (talk) 19:18, 4 January 2009 (UTC)[reply]

If we concluded that the system was a success and we wanted to do it 'for real', then yes, that's exactly what we'd do. The thing is that implementing that configuration wouldn't allow us to test anything else, whereas by having the configuration proposed, we can with a little more effort simulate this scenario, while also trying other options. Happy‑melon 20:18, 4 January 2009 (UTC)[reply]

The idea is to get some concrete numbers to see whether the test as posited is likely to be as beneficial as hoped. If a smaller change has essentially the same benefits, then we should be apprised of that fact. At this point, absent solid figures, but using a sample of one million edits, it would appear that the whole flagging trial may not be any more effective than modification of what is now called "semi-protection." Alternatively, getting solid figures in this manner might prove to be the greatest argument flagging has in its favor. My personal opinion is that the system which requires the least added volunteer time would have the greatest probability of actually getting sufficient volunteers <g>. Collect (talk) 20:22, 4 January 2009 (UTC)[reply]

My personal opinion is that the system which requires the least added volunteer time would have the greatest probability of actually getting sufficient volunteers. I feel like this should be a named law (if it isn't already). Collect's Law? Lot 49a^talk 21:50, 8 January 2009 (UTC)[reply]

Autoreview period

I realize that this been proposed before, but I was thinking about some of the objections and I think many of them could be addressed by the idea of an autoreview period; any edit that has stood as the current revision for, say, 1 day, will be automatically marked reviewed. In this framework, users with reviewer privileges are solely responsible for expediting this process, so there isn't as much of a scaling problem; users without reviewer privilege would still be able to revert edits to prevent them from becoming autoreviewed. It helps avoid the issue of articles that aren't watched by reviewers never getting reviewed. And statistics show that nearly all vandalism would be caught in this period. Of course it would require an experiment to investigate this option, but it would be nice to at least get the technical support for it.

I also believe this is technically feasible; all that is necessary is that whenever a page is viewed, a check is performed, and the correct revision is autoreviewed if necessary. There is no need to "poll" all articles.

Thoughts? Dcoetzee 19:59, 4 January 2009 (UTC)[reply]

I don't like this much, as several of the articles I have watchlisted are not heavily watchlisted by others, and so there is often more than 24 hrs between times that someone takes a look. --Rocksanddirt (talk) 05:19, 5 January 2009 (UTC)[reply]

"It helps avoid the issue of articles that aren't watched by reviewers never getting reviewed" - please explain. If no one cares to edit them, even watch, why will anyone care to review them? And if the system simply resets the flag in 24 hours, what's the point? NVO (talk) 12:50, 5 January 2009 (UTC)[reply]

The point is that there's a 24 hour window in which to notice and revert the vandalism. Unlike full flagged revisions, it would not catch all vandalism, but it would catch most vandalism (to watched articles); in exchange, good edits would get through without the attention of reviewers. It's sort of a compromise between full flagged revisions and the current system. Dcoetzee 21:39, 5 January 2009 (UTC)[reply]

I like the idea, with a slightly longer timespan. As was noted very early in the discussions there's some php that can "sight" the articles if it stays stale for so long. I'll try to find it and put the quote(s) here. §hep • ¡Talk to me! 21:56, 5 January 2009 (UTC)[reply]

Not sure what this all exactly would do, but this seems its definitely possible. §hep • ¡Talk to me! 22:06, 5 January 2009 (UTC)[reply]

An expiration system, automatically showing revisions that are old enough to IPs, at least for non-blps, would eliminate the harm caused by backlogs and wouldn't decrease significantly the efficiency of flaggedrevs. Cenarium (Talk) 15:37, 20 December 2008 (UTC)

In order to limit or avoid backlogs, I've proposed to automatically 'sight' (more technically, expire) edits after a certain period (for example, 6 hours), with various ways to adapt the system, so that backlogs shouldn't be a problem.
Cenarium (Talk) 19:03, 15 December 2008 (UTC)

However, using an expiration system to show old enough revisions to IPs, at least for non-blps, would solve this problem, and still prevent the quasi-totality of vandalism and other disruption from being seen. Cenarium (Talk) 15:37, 20 December 2008 (UTC)

Watchlist?

When the most recent edit has been reviewed, will it show up differently on a watchlist? If I glance at my watchlist, can I tell which articles have approved edits and which have edits waiting to be approved or otherwise? J Milburn (talk) 23:51, 4 January 2009 (UTC)[reply]

Yes, there is an exclamation mark next to entries that have unreviewed versions, and a "review" link to confirm the latest changes. These also appear in the RC feed. The extension has been quite carefully designed to make it as easy and natural as possible to review things :D Happy‑melon 10:24, 5 January 2009 (UTC)[reply]

I personally think that the default watchlist entries and links are very satisfactory. The current demo can be used to see the impact on the watchlists, you just need to register on the demo wiki, or look at the history pages which have a similar format. It is the article page and edit page messages and links that need to be improved greatly. Dbiel ^(Talk) 20:45, 9 January 2009 (UTC)[reply]

Just a thought...

Sorry if this was already brought up before but, I wonder if anyone considered this idea, reviwer status is given to all users with autoconfirmed rights, and we replace semi-protection with flagged rev if an article is receiving a considerable amount of vandalism. So in theory it will work in a fashion similiar to semi-protection, but instead of blocking out the edits from new accounts and anon's completely, we can flag-rev the article which the autoconfirmed accounts can review anon contributions. Since pretty much anyone who regularly edits Wikipedia with an account pretty much has an autoconfirmed account, hopefully, there is no need to create a new usergroup with this implementation. Y. Ichiro (talk) 02:31, 5 January 2009 (UTC)[reply]

This makes sense to me. Actually call flagged revision "semi-protect" so that IP users can edit semi-protected pages in a sandbox like environment. There seems no reason to make all articles subject to "sighting". In this configuration, we would actually be increasing the ability of the general public to make edits. xschm (talk) 02:39, 5 January 2009 (UTC)[reply]

I started Wikipedia:Flagged revisions/Semi-protection implementation, if anyone's intrested, putting this into to a concrete proposal. Y. Ichiro (talk) 04:25, 5 January 2009 (UTC)[reply]

I'm liking it. Alastair Haines (talk) 08:15, 5 January 2009 (UTC)[reply]

I like the concept too; as noted above somewhere, this is something we could test using this trial configuration. Happy‑melon 10:25, 5 January 2009 (UTC)[reply]

This may be the only implementation I could approve, since the problem of good anon edits being lost or delayed is much less serious. On a semi-protected article, such edits (except for an occasional talk page suggestion) are not even being suggested now. Septentrionalis PMAnderson 15:48, 5 January 2009 (UTC)[reply]

This is also the only FR implementation that I've heard of that I like. And I do like it, at least as far as it's been described. I still would like to know how we would test it, how we would measure success of the test, and so on before supporting it. And also how we would avoid flag creep (the tendency for FRs, once turned on, to stay turned on even after they're no longer necessary to prevent vandalism). Ozob (talk) 19:12, 5 January 2009 (UTC)[reply]

Ditto to the above. This is the first time I've heard an FR trial that made me think "yes, that's an improvement!" I see that User:pmanderson has already added this as a trial to the list of proposed trials. Lot 49a^talk 19:25, 5 January 2009 (UTC)[reply]

I like it too. This is a way in which flagged revisions might actually enhance the principle that Wikipedia is the "encyclopedia that anyone can edit". I would foresee semiprotection being used more widely under such a scheme, but I don't see that as a bad thing: it is better (in my view) to protect more articles, but filter good edits to them, than to protect fewer and forbid all IP edits. This might also contribute to the BLP issue discussed elsewhere. Geometry guy 19:39, 5 January 2009 (UTC)[reply]

How do we test it? My suggestion would be to pick about 600 semi-protected articles, split them randomly into three groups of 200, and enable FlaggedRevs on one of them. As noted, this could be done with the trial implementation under consideration here by creating a bot with the 'reviewer' flag that would scan the RC feed for edits to those articles, and review any edit made by an autoconfirmed user. Once those measures are in place, we unprotect the group of articles that are in the 'FLR' cohort, and one of the other groups. Wait a predetermined period of time before resetting everything to how it was before. Then you analyse the data; the 'semi-protected' cohort is your control group, and you have two other groups which should go in generally opposite directions. Plenty of statistics for people to get their teeth into. This would be a very interesting trial. Happy‑melon 20:00, 5 January 2009 (UTC)[reply]

Hmm, testing this would be rather difficult using this version of FR as of now. Of course if there is a consensus to implement this in full scale, the devs might need to make some major modifications to the current version of Flagged Rev, such as having administrators being able to groups of users having reviewer status for specific articles, simliar to how protection is done by just clicking some textbox. A built-in system created to give reviewer permission for specific type of groups of user might be helpful, especially when if it is article-dependent too. Y. Ichiro (talk) 20:08, 5 January 2009 (UTC)[reply]

Well, as of right now, there's about 63% support for the current proposal. That's almost a two-thirds majority, but it's not exactly consensus. On the other hand, this suggestion managed to get the tentative support of three opponents to the proposal in less than twenty-four hours. This seems to be the way forward.

In order to make a serious proposal out of this idea, it seems like we'd need the following:

Technical implementation: Some PHP code, probably very similar to what's already been proposed.
Procedures:
1. Draft policies for when flagged revisions are to be turned on and turned off. Similar to Wikipedia:Semi-protection and Wikipedia:Rough guide to semi-protection.
2. Draft policy for which edits are to be sighted and which are not to be sighted.
Testing: We need a test that measures whether flagged revisions, semi-protection, or nothing is best for heavily vandalized pages. I'm not sure how easy it would be to do this, because pages get semi-protected and un-semi-protected; we'd need to find 600 (or so) articles that would normally be semi-protected for the entire duration of the test. (It'll depend on the length of the test, too.) We need to specify a duration for the test, criteria for success or failure, and a place to discuss the results.

Most of this is already in Y. Ichiro's proposal. I think it wouldn't be that hard to fill in the few remaining gaps, except maybe for details about testing. Does anyone else think it would be a good idea to drop the present proposal and focus on this one instead? Ozob (talk) 07:22, 6 January 2009 (UTC)[reply]

Y. Ichiro's proposal will require several serious changes to Flagged Revisions extension (and probably to WikiMedia software in general), which are unlikely in the foreseeable future. See Wikipedia_talk:Flagged_revisions/Semi-protection_implementation#Problems. Ruslik (talk) 12:04, 6 January 2009 (UTC)[reply]

That section contains no claim that this is technically unfeasible; but if this is true, it is an excellent reason for opposing implementation of the present system until it is fixed. Septentrionalis PMAnderson 17:44, 7 January 2009 (UTC)[reply]

Just like the core MediaWiki software, Wikimedia wikis run the latest stable versions of all extension software. So as soon as any modifications or improvements are made to the FlaggedRevs code, they will be available within days on all live wikis. Such improvements could also receive a higher priority and so be done more quickly if there is a wikimedia site that would immediately benefit from the changes. Like everything else 'wiki' problems are things to be fixed 'on the run' rather than sitting back and waiting for perfection to materialise. Happy‑melon 21:37, 8 January 2009 (UTC)[reply]

I would just like to say, YES, this is the ONLY flaggedrevs proposal I support, I switched my vote after hearing about this, as it actually HELPS the wiki atmosphere by opening semiprotected pages for anon and new user editing. I am very interested in watching this go forward.--Res 2216 firestar 18:39, 11 January 2009 (UTC)[reply]

Purpose?

Apologies if this is covered above, but I was lead here by a notice I found when I viewed my watchlist. It lead me to the Trial page, but at no point on that page does it tell you clearly what the purpose of Flagged Revisions are, just how it will be implemented. I suggest a short "purpose" paragraph should be added. I won't be doing it, as I'm still trying to work out what the hell is being proposed(!), but I think that it would help many, many others like me... Stephenb (Talk) 13:35, 5 January 2009 (UTC)[reply]

I advise you to start with Wikipedia:Flagged_revisions, which contains necessary definitions and links. Ruslik (talk) 17:23, 5 January 2009 (UTC)[reply]

I did, but that was not the point of my comment. The point was to point out that the link on everyone's watch page leads to something which has very little context, especially for the novice editor, and this will put a lot of people off from reading further and contributing to this discussion. Stephenb (Talk) 17:58, 5 January 2009 (UTC)[reply]

Another thing which should be included in any future draft. This is why we discuss before polling, and why hairline majorities do not prevail; we want the best text, with the greatest practicable agreement before implementing any wideranging proposal. Septentrionalis PMAnderson 18:02, 5 January 2009 (UTC)[reply]

There is a discussion above that started 22 days ago, and that was just on top of all of the older discussions that have already gone 'round. §hep • ¡Talk to me! 21:52, 5 January 2009 (UTC)[reply]

It's not an explanation; and most importantly, it's not in the proposal where people will see it. Septentrionalis PMAnderson 04:07, 6 January 2009 (UTC)[reply]

Legal liability of Reviewers

Here it states that while the legal liability of the Wikipedia foundation would probably be not significantly affected by Flagged revisions, that it could make the reviewer liable. While most people will be prepared to take responsibility for their own edits, this could lead to reviewers being liable for other people's edits, which is a whole different issue. Some sort of clarification of these sorts of legal issues should be provided before we go live with this, even in a trial. I certainly would be unwilling to "sight" articles is I thought there was a significant risk of becoming involved in a court case for liable (or more likely, at least in the EU) privacy breaches.Nigel Ish (talk) 21:04, 5 January 2009 (UTC)[reply]

This sounds like legal paranoia to me. Is the editor of a newspaper held responsible if an editorialist is sued for libel? The newspaper as a corporation may be, just as someone might target WMF, but they would not target individual employees other than the writer. However, I'm not a lawyer (and neither are you) so this is all speculation. Dcoetzee 21:44, 5 January 2009 (UTC)[reply]

I agree with Dcoetzee. You are no more liable for sighting a revision than you are for editing one. So, if you are inclined to feel paranoid about legal threats for actions on wiki, you should just exercise the same discretion reviewing revisions as you would making them. Protonk (talk) 22:22, 5 January 2009 (UTC)[reply]

The, admittedly rather conservative, Opinion from Mike Godwin (the Foundation's General Counsel) is here. It essentially clarifies that, in his opinion, FlaggedRevisions does not make the Foundation any more liable. The explicit statement is that the situation vis avis FLR making sighters liable is a complete legal unknown, suggesting that this acts as a deterrent against potential plaintiffs launching a suit against editors. He also notes that the potential upside for plaintiffs of winning such a suit is fairly minimal in financial terms, a further disincentive. That quote is not the entirety of his e-mail response; the general impression I got from it was that he does not consider the legal ramifications to be a significant issue, at least in the wider context. Happy‑melon 22:42, 5 January 2009 (UTC)[reply]

So, the conclusions are that we don't know what the legal position is (particularly outside the US) but its probably unlikely that individual reviewers will normally be sued as a litigant is unlikely to get large sums of money from an individual. It makes me feel slightly easier, particularly if the reviewer stays within the field of his or her normal areas of editing where they are more likely to not let dangerous edits slip through.

However if Flagged revisions is going to work, particularly if/when it gets rolled out large scale, a lot of reviewers will be needed, and there will be pressure to clear articles that most reviewers won't be familiar with. This could either increase risks that things slip through or lead to longer delays if reviewers are not willing to operate outside their normal area of expertise or in higher risk areas through fear of litigation (whether justified or not).Nigel Ish (talk) 20:05, 6 January 2009 (UTC)[reply]

Two things:

You are legally responsible for your edits, and sighters will be legally responsible for their sights. You are legally responsible for everything you do, there's no reason to expect an exception here.
In many countries, editors in fact get sued and fined if a journalist publishes a libel in their newspapers. Where I live, the legal term for "editor-in-chief" translates as "responsible editor". Zocky | picture popups 05:09, 12 January 2009 (UTC)[reply]

Test page?

Is this the current starting point for the demo to this proposed change? - http://en.labs.wikimedia.org/wiki/FlaggedRevs

If it is, I have found it to be a very poor demo indead, or should I say a very disappointing demo. If this is what an IP user will see, I am totally opposed to this proposal.

The following is what is displayed on the page when there is an unapproved edit (note: the leading icon is missing:

Draft [view page] (compare) (+/-)

If I were a new user, I would have absoluting no idea what this meant

And when editing the page the following is displayed:

The latest sighted revision (list all) was approved on 7 January 2009. 1 change needs review.

As a new user, I would have no idea what this meant.

I can see value in the concept, but the implementation is severly lacking.

I like the idea of being able to easily find the last reviewed page, and the history links are very useful for those who know what they mean, but for a new IP user they seem to make Wikipedia more difficult to use. Dbiel ^(Talk) 04:12, 7 January 2009 (UTC)[reply]

These are the default messages, which no one has been bothered to customise on en.labs. Every part of the user interface is fully customisable through Mediawiki: messages; we can make them louder or quieter, longer or shorter, include links to help pages or shrink them for convenience. Everything is changeable; these are just the 'bog standard' messages. Happy‑melon 10:40, 7 January 2009 (UTC)[reply]

I would be more than willing to support the trial, but only after an initial demo interface has been developed that is at least somewhat acceptable. I am fully opposed to any trial, no matter how limited, that would make use of the current default interface as seen in the demo. EN.Wikipedia is much to widely used to implement any demo that does not fully take into consideration its effects on new users. Dbiel ^(Talk) 14:44, 7 January 2009 (UTC)[reply]

One of the goals of the trials is to find which messages are appropriate and necessary. en.lab Wiki is a separate entity, and we have no direct control over it. Without enabling a working implementation here on en.wiki it will be impossible to customize anything. Customization of the interface is actually the first step in any trial. Ruslik (talk) 14:52, 7 January 2009 (UTC)[reply]

Nobody is going to display default messages on thousands page here. Initially FR will be, probably, enabled over only a few pages, so that all editors can try them in action. After that default messages will customized based on the feedback from the editors who tried FR on those few pages. Only after customization is complete, any serious trial can begin. Ruslik (talk) 15:01, 7 January 2009 (UTC)[reply]

The initial test needs to be done in a sandbox type environment and not on any live pages. No live page should be tested until a basic set of notices can be agreed upon. The first test should not be accessable by new users to Wikipedia (unless they are doing so as part of the testing process and with a full understanding of what the test is about first) This can be easily done by simply creating new test pages, NOT by using any current live pages. Dbiel ^(Talk) 18:41, 7 January 2009 (UTC)[reply]

Revision divergence (forking)

I have tried to find some practical information on how the complete procedure would work, but have not been able to find it. My main worry is about divergence of articles (forking in the history). When a revision has not yet been "sighted", subsequent edits would apply to the last sighted revision. After that there will be two imcompatible unsighted versions. The "sighting" process would need to integrate the modifications in both relative to their common ancestor, which is absolutely non-trivial and a heavy burden on the sighter. −Woodstone (talk) 09:37, 7 January 2009 (UTC)[reply]

Whenever anyone edits an article, in any situation, the contents of the edit window is always the wikimarkup for the current version of the page. If this differs from what was visible on 'view' before there is a diff above the edit box to highlight what's changed. So while traditional edit conflicts are still easily possible, revision 'forking' of this nature is not. Happy‑melon 10:42, 7 January 2009 (UTC)[reply]

So this would mean that I, a born proofreader, would click edit, search for the error I'd just seen, only to discover that it had been fixed, but not yet "sighted" (what a meaningless word!). Meeting that situation a few times, I'll stop bothering.

Worse, as I usually edit a section rather than the whole article, the text I'm looking for may not even be there, having been moved to another section a week or two ago, but not yet "sighted".

There's more than enough rigamarole to learn already for a new editor. I've !voted oppose. - Hordaland (talk) 15:00, 7 January 2009 (UTC)[reply]

The editing window would quite clearly point out to the editor that he/she is editing a draft version, and that changes have been made to it by other users. This objection is unfounded. -- Ec5618 15:16, 7 January 2009 (UTC)

No, it is well founded; that will give Hordaland an explanation of what it going on, but he will still not be able to tell whether his typo is in the current unsighted version or not without looking. Hordaland is assuming that he is not a reviewer. Septentrionalis PMAnderson 18:37, 7 January 2009 (UTC)[reply]

Yes. Thanks. - Hordaland (talk) 20:45, 7 January 2009 (UTC)[reply]

How so? All users would see the same message, and all users would be able to edit the draft in the same way. -- Ec5618 20:51, 8 January 2009 (UTC)

I'll add to that: when editing a page that has been edited since it was last sighted, the user, in all cases, is presented with a difference view at the top of the edit window, showing the changes that have been made to the article since it was last sighted. The user would thus be able to spot quite readily that the error had already been fixed, and be done. The process is quite transparent. Have a look at the demonstration. -- Ec5618 21:07, 8 January 2009 (UTC)

Don't forget, of course, that being a logged-in user, Hordaland will not ever see content that's not in the latest revision anyway. The changes since the last sighted revision are very clearly indicated with a diff display at the top of the edit window, there is no concern over people not knowing what's in the latest version. Happy‑melon 21:14, 8 January 2009 (UTC)[reply]

That's another change of specification since you last described it. It was supposed to be only reviewers who saw the latest version. Please make up your mind and put it in the proposal. Septentrionalis PMAnderson 23:18, 8 January 2009 (UTC)[reply]

Septentrionalis, many of your comments approach the condition of FUD but that goes well over the line. On every proposed implementation of FlaggedRevs, logged-in users see the most recent rev, flagged or not, unless they set a preference to see the flagged version. Read the proposal before commenting, and especially before telling people to add what's already there. PaddyLeahy (talk) 11:06, 9 January 2009 (UTC)[reply]

Thank you; that merely demonstrates why one only discusses one subject in one paragraph. I will divide the statements on logged-in users from those on reviewers for clarity, because I genuinely did not see them, and have had to reread the paragraph to note the change in subject. Hordaland's objection stands, I think; but I oppose all proposals which mean that editors do not see the text we actually display to the rest of the world. That will decrease our rate of corrections and our reliability. Septentrionalis PMAnderson 15:55, 9 January 2009 (UTC)[reply]

Even after this sensible explanation, I can still not find the procedure in the proposal. The procedure for anon's, regular named users and reviewers should be crystal clear. Why don't you add them very prominently to the proposal. I now gather the following:

in a primary article view:

anonymous users will see the latest sighted version
logged-in users will see the latest edited version
all users can switch between the edited and sighted version

in an edit session:

anyone will start from the latest edited version and be shown a "dif" with the latest "sighted" version

One general comment

On reading many of the comments here I think I need to point out that the flagged revisions are already enabled on some Wikipedias. A lot of comments read as if the wheel needs to re-invented at EN-WP. That's not the case: Just go to German or Polish Wikipedia to see the flagged revisions in action.

It's kinda like people from EN-WP don't realize there are Wikipedias in other languages too. Sorry, it sounds like the stereotype American that realizes there are countries other than the US too. Please try to not live up to that clichée of people from the English-speaking world just us Germans are trying to not live up to that clichée of bureaucracy and orders and so on. I know this statement is not fair and but that really is the impression I get by reading many of the comments on this page: I can assure you that everything has been discussed before. Open your eyes and go discover foreign galaxies;-)

Anyways, more comments from users at DE or PL would be great. --X-Weinzar (talk) 15:09, 7 January 2009 (UTC)[reply]

It's rather hard to "open your eyes" if you don't speak German or Polish, you know? Most people don't- in America, at least, the popular second languages (if you learn one) are Spanish and French. --Pres N 14:38, 8 January 2009 (UTC)[reply]

I know. I already apologized while writing this. It was mostly targeted at comments like two threads above me (#Test page? and numerous comments whose authors didn't seem to be aware of the fact that Wikipedia comes in other languages too and that there are projects who have introduced the FlaggedRevisions already. You don't need to speak German or Polish or whatever for this. For example if you wonder about what the layout FlaggedRevs in effect will be as was the question in the above mentioned thread just go to those Wikipedias, not to the en.labs-wiki. --X-Weinzar (talk) 13:53, 14 January 2009 (UTC)[reply]

Comments from DE-WP

Not the first time that I feel the communities from different language versions are way too much apart from each other. (For those interested: Recent examples include the RfA of User:lustiger seth and de-adminship of Gryffindor at Commons) I think there needs to be a lot more communication which is why I decided to share my thoughts from DE.

Some preliminaries:

The whole thing has produced megabytes of discussion on DE (yes, MBs!) and still is being discussed. So you can be sure we've discussed every possible aspect already and the community is deeply divided. Still, opinion ranges from „all in all, this does more harm to Wikipedia than all vandals together. How come the German chapter of Wikimedia foundation waste money on this“ to „Best thing that has ever happened to us“. So be careful with anyone stating „well, it's just perfect“ or „greatest bulls*** ever“ but ask for an explanation.

Success of Flagged Revisions is impossible to measure. You can't measure quality and you can't measure the amount of time spent for checking and approving or disapproving changes (which is, in many cases, time that would be spent when checking your watchlist normally anyway). So everyone is entitled to his own impressions of whether it is useful or not. While it may be useful in some areas, it may do harm on others. It is also almost impossible to tell whether IP editors are discouraged by this or not.

What needs to be clearly defined in case you want to try it out:

Specify an end date. Otherwise the „test“ may just run forever as it probably would have on DE-WP until the community pressure was strong enough to force a further vote to actually vote on the introduction of Flagged Revs.
The criteria for flagging need to be clearly defined. On the German Wikipedia, the criteria was to make sure the edits „don't contain blatant vandalism“ (you could call this a winged word by now). Later, people realized „blatant vandalism“ never was problem because 99.9% of this is caught by RC patrol anyways, almost all the rest by „normal watchers“. So what's the point of flagging revisions then? Also, it has shown that it doesn't keep vandals from vandalizing which was one of the main reasons it was introduced for.

Now, discussion is going on whether to change the criteria for flagging. This would create a lot of extra work for the „flaggers“ and the lag would increase.

So the most fundamental question in my view is: Is all of this worth the amount of extra work? As stated above, there is no way to measure this. However, my personal impression is that it's not. There may be a few articles that are not watched by people who can decide between useful edits and not-so-useful edits where „flaggers“ will revert the edit. But this doesn't justify the amount of work involved. Think about what people could do with their time instead: They could actually be improving articles;-) So all in all, while there are positive aspects, I've come to the conclusion that it's rather a waste of time. --X-Weinzar (talk) 15:17, 7 January 2009 (UTC)[reply]

I agree. I believe that this is reaching for the diminishing returns on vandalism reduction. Vandalism will never be eliminated in any society. As stated, the RC patrol reverts the majority of the changes almost instantaneously anyway; so my impression is that the latency introduced in the system could be better utilized within WP elsewhere. —Archon Magnus^{(Talk | Home)} 15:42, 7 January 2009 (UTC)[reply]

I contest that flagged revisions are ineffective against blatant vandalism; RC patrol and watchlists may result in quick reverts, but the vandalized revision is still visible to some number of readers. On popular articles, or articles that are vandalized often, this can be quite a significant number of readers. Moreover, many vandals will be deterred from introducing vandalism in the first place, once they realize that it's futile. The result is that the overall vandal fighting workload for contributors is much smaller.

Nevertheless, I don't see this as the primary purpose of flagged revisions; their main purpose is to remove the urgency from editing. Blatant vandalism can be left primarily to watchlists instead of RC patrol, because it doesn't need to be reverted that quickly anymore, freeing up RC patrollers to do something else. Edit wars are deterred because changes don't appear immediately, and each editor will have to think a little more about whether their changes will last long enough to be reviewed, encouraging more attention to consensus and discussion. To me "anyone can edit" isn't about instant gratification, it's about anyone having the opportunity to make a contribution with very little effort, and flagged revisions doesn't change that. Dcoetzee 19:41, 7 January 2009 (UTC)[reply]

Instant gratification is very satisfying for a new editor. It's pretty satisfying for me, too. My own suspicion is that anything that takes away from that would slow Wikipedia's improvement. And that's something we should avoid.

I occasionally edit in German WP, but as I have less than 200 edits there, they have to get sighted by another user. After initial skepticism, I found that this considerably adds to satisfaction, at least for me: My edit is still immediately visible - just perhaps one click away - but after a while, my edit gets sighted, which means at least one other user has actually noticed and read my change. Now isn't that great? :-) I cannot judge to what extend the system really helps the RC patrollers, but I think it certainly does no harm. -- Momotaro (talk) 21:15, 21 January 2009 (UTC)[reply]

Regarding X-Weinzar's comments on measurement, I do believe that it's possible to measure these things. There are some proposals for tests. I, for one, will totally oppose any implementation of FRs without some sort of test of its effectiveness. Unfortunately, the present proposal has no provisions for tests. Ozob (talk) 23:25, 7 January 2009 (UTC)[reply]

To my opinion (based on the experience in German WP), it's quite important to implement some tests to evaluate major expectations on impact of FR. Furthermore, it might help to reduce a lot of discussions on interpreting influences of FR. --Septembermorgen (talk) 12:50, 21 January 2009 (UTC)[reply]

Thank you for the detailed report. I think the summary of the de.wiki experience is: FR is a solution looking for a problem. Xasodfuih (talk) 01:49, 9 January 2009 (UTC)[reply]

Something regularly overlooked is indeed the amount of work it creates, work that could have been spent on RC patrol or article writing. Every small fact now has to be checked twice because e.g. if I change a number like 1011 people have died in this incident, the only way to find out whether or not this is blatant vandalism is to re-check the fact. This does make WP more reliable, but at what price? --Pgallert (talk) 10:14, 9 January 2009 (UTC)[reply]

Limiting the revisions to flag for large-scale implementations

See Wikipedia_talk:Flagged_revisions#Limiting_the_revisions_to_flag_for_large-scale_implementations Cenarium (Talk) 14:26, 8 January 2009 (UTC) [reply]

Not a panacea

Most of the support !votes seem to be cast on the assumption that FR will somehow solve the vast majority of our BLP problems. This is plainly false; the writer of this proposal disavows that view in this discussion, as a misunderstanding: FR are only for obvious vandalism and libel.

One supporter cites the case of Taner Akçam, who was detained going through Canadian customs because our article made a very serious accusation against him (see the article). That is a very bad thing; but I doubt FR would have prevented it. The falsehood was inserted by a named account, now permablocked (I note with amusement that the supporter is too); it appears to have had a footnote, citing a website. That is not obvious vandalism or libel; do we really expect all sighters to go to the amount of trouble required to disprove such an edit? How many sighters will we need, if they are expected to do so? (And if we recruit very many of them, I'm sure the libeller would have volunteered.) I gather on de-wikipedia they don't do any such thing; if we do here, what will our backlogs be? Septentrionalis PMAnderson 18:24, 7 January 2009 (UTC)[reply]

Founder thinks they're a good idea for BLPs, says something. Backlogs have been discussed above, we could in theory put in a few bits of code and expire pages after a set timeframe making the backlogs very manageable. Its been discussed of modifying programs like Huggle, Twinkle, and the other counter-vandalism programs to have the ability to sight articles as well. We have an ungodly amount of information to protect, but we also have one of the best taskforces on the planet who could just move around what they do a bit to take in sighting. General comment not towards this section A lot of the comments seem to be missing that this is an RFC for the ability to trial; before the trial would go everything...every little thing would be ironed out and then probably !voted on again before it went live. There's many comments on: there's not enough info on this or that and I won't support unless this is guaranteed; we're not even at that stage yet and a good majority seem to be missing that point. §hep • ¡Talk to me! 06:03, 8 January 2009 (UTC)[reply]

Those who wish to advocate doing what Jimbo says because he says so should consult with him first; he disagrees. Septentrionalis PMAnderson 16:53, 9 January 2009 (UTC)[reply]

I think that the point that a good chunk of us are making is that this is backwards. FIRST figure out what you want to trial and THEN set up to trial it. The current order is more along the lines of a police department saying "we'd like to buy a bunch of tasers please - once we have the tasers, we'll sort out the details of how we want to use them". I'm opposed to this broad "let us have A trial, we'll work out the deets later" mentality, but I'm working with other editors on the proposed trials and restrictions section at the same time. Hope this perspective helps! Lot 49a^talk 07:57, 8 January 2009 (UTC)[reply]

This is, respectfully, a poor analogy. Until a specific trial is approved, no trial will be conducted, regardless of the outcome of this poll; even if the technical feature is turned on, it's not going to be doing anything until a specific trial is approved. Practically speaking, the only effect of this poll is that it will put other polls and discussions on the table if approved. A latent piece of software is not like a box of tasers. Dcoetzee 11:24, 8 January 2009 (UTC)[reply]

I couldn't agree more; that's an awful analogy. If you must compare it to buying tasers, what is actually being proposed is to buy a bunch of tasers and lock them up in a basement, giving the key to the police commissioner, until we decide how to use them. The tasers are there, and when we can present a convincing case for their use, they can be made immediately available, but they are as safely secured against accidental/unauthorised use as they were before they were purchased. The alternative is to spend hundreds of man-hours constructing a detailed plan of action, only for the budgetary committee to say they can't afford them. This proposal is not to say that we want to use tasers in this way or that way or even (and especially) not in any way we like. It's to ask "do we want to try using tasers? If so, we'll need to buy some tasers". Just because we have the tasers in store doesn't mean we have to use them now (without suitable governing policies) or even that we have to use them at all (unlike the real world, it costs us nothing to hold the tasers in store for ten years and then throw them away). It merely notes that if we want to experiment with using tasers in any way, shape or form, we need tasers to experiment with. Finally discarding the taser analogy (:D), Dcoetzee is entirely right: the extension is completely invisible unless and until a specific, complete, thorough and detailed trial proposal gains consensus. Unless and until that happens, it has no effect whatsoever. This really is 'part 1 of 2'. Happy‑melon 21:30, 8 January 2009 (UTC)[reply]

My understanding (please correct me if I am wrong) is that the software is already implemented (it's live on WP-DE after all) so turning it on is near-trivial. If that is the case, a separate poll for turning it on in principle seems pointless without a scaffolding of assurance for HOW it will be used once it is turned on.

I think that a latent tool for crowd (IP vandal) control is a pretty good analogy, actually. One of the problems with non-lethal tools like tasers (FR) as a supplement for things like firearms (protecting and semi protecting pages) is that rather than reduce the harm caused by the lethal tools, it makes police officers more willing to use force. So people who would never have been shot get tased and there is a kind of violence-creep. Am I being paranoid? It's very possible. But I think "we have the tools, so we may as well figure out a way we can use them" is a very probably response to turning this thing on. I'd rather have a plan and then enable the equipment. Lot 49a^talk 21:38, 8 January 2009 (UTC)[reply]

The issues are not technical but social: the huge amount of contention and debate that this proposal has generated (with arguments ranging from the coherent to the absurd) is testament to how "difficult" it is to implement. Technically it is indeed just a case of flicking a switch, but this proposal is not just about turning it on. It's equally a consensus on whether we want to make the social step of conducting trials and testing FlaggedRevisions with an open mind to its implementation - really it's "are we prepared to let go of our individual gut instincts (be they fire-and-brimstone or pot-of-gold-at-the-end-of-the-rainbow) and approach this from a scientific perspective?". The technical configuration is just a tool to facilitate that process. This proposal is not about FlaggedRevisions, it's about testing FlaggedRevisions, and about whether we want to give ourselves the greatest possible freedom to test it as thoroughly as possible while still retaining complete control over the process, as would befit a properly scientific approach.

I still wouldn't say that FlaggedRevs best suits a taser - I'd give that analogy to the Abuse filter extension, which is also in the works. Perhaps FlaggedRevs is more like a riot shield; it works by keeping vandals out, not by kicking them out. Either way, you still need to have them before you can work out how best to use them. Happy‑melon 21:48, 8 January 2009 (UTC)[reply]

Please stop calling me the 'author' of this proposal; not only does it imply an authority and responsibility that I do not have, it demeans the work of the numerous other editors who have contributed to its development. Despite what you might believe, this idea did not jump out of my head one evening and emerge fully-fledged for a vote the following day. Happy‑melon 21:20, 8 January 2009 (UTC)[reply]

My view on stable versions

I don't want to take the time to read this whole discussion, but I'd like to point out my arguments for stable versions. Jason McHuff (talk) 12:48, 8 January 2009 (UTC)[reply]

Question re tagging

One feature of Flagged revisions, is that edits by logged on users (who aren't reviewers) and bots will not be seen by everybody if they are on top of an un-reviewed edit by an IP until the version is reviewed. While for general edits this may merely be a nuisance, what about more time sensitive edits, such as (particularly) prodding an article or adding an Afd tag. In this case the fact that not everyone can see the tag could mean that IP or non-logged on users will not realise that an article needs help until possibly it is too late. Would it be possible to tweak the way we do things so that the clock doesn't start on these processes on a FR page until the tag has been "sighted" and anyone can see it? What happens on de and pl about these issues?Nigel Ish (talk) 18:40, 8 January 2009 (UTC)[reply]

I guess you could hard-code the interface to disallow the addition of tags to unreviewed pages. I guess it is pretty pointles even saying that benign tags like {ref-improve} should potentially not be seen by IPs. MickMacNee (talk) 19:15, 8 January 2009 (UTC)[reply]

A user who is experienced enough to be adding CSD/XfD tags to articles is experienced enough to ask for and receive the 'reviewer' flag in almost all proposed implementations. Happy‑melon 21:32, 8 January 2009 (UTC)[reply]

2005 WP:Block all anonymous edits

A very old discussion (short) where some of the same concerns we have now were raised several years ago. Collect (talk) 19:11, 8 January 2009 (UTC)[reply]

You wouldn't know it from some of the support votes, but Flagged Revisions is being proposed so that we do not have to force all editors to register. MickMacNee (talk) 19:19, 8 January 2009 (UTC)[reply]

As I've commented before, flagged revisions does not make it more difficult for anyone to make a contribution; it only delays the publishing of their contribution. This is the distinct difference between FR and no-anonymous-editing. That's not to say it has no negative effects, but it's a different thing. Dcoetzee 23:24, 8 January 2009 (UTC)[reply]

'More difficult' depends on which set of articles it is applied to. If it is going to be applied to every single BLP, irrespective of whether they were being protected beforehand, then obviously it is going to be more difficult for IPs to edit. If it is going to be applied to all articles, even more so. If applied just to protected articles, then it depends if you think changing from having to ask to edit, to having to wait for an edit to be approved, is making anything any easier. This is why asking for a poll when every supporter has conflicting reasons for believing it of benefit was unwise. Nobody can make a conclusive argument and nobody cna make a conclusive counter argument, because even the basic issues such as which article set to apply it to have not been agreed. 15:18, 9 January 2009 (UTC)

I fail to see why this is the case. The poll needs to be held separately, if nothing else, because the devs don't care about our policies and the way we'll use it. They just want to know if they should turn the software on or not. Since no specific trial is proposed, having the software turned on will not make a blind bit of difference to the daily running of Wikipedia - no pages will be flagged or anything. In that sense then, having this poll is just as good as having a poll on a specific trial. Since no trials can run without another community discussion, in which the different factions of faith in FR will have to come to some consensus about it, I don't really see why a lot of the people above are opposing it (except those who are against FR on principal, which I completely understand) Fritzpoll (talk) 18:23, 9 January 2009 (UTC)[reply]

The logical benefit from deciding which general but incompatible uses have the most general support to go forward for specific trials before you switch it on is just obvious to me. These discussions are already way too complex, convoluted and hard to follow without intentionally making it more so further down the line by not even starting with some basic assumptions, all seemingly just for the convenience of devs. It is horrifying to think what percentage of the community (mainly IPs) are already automatically disenfranchised from just this poll, never mind the future massively complex multi-way multi-aspect discussions that are due to come under your model. MickMacNee (talk) 19:00, 9 January 2009 (UTC)[reply]

I agree, there might have been some aspects that could have been discussed first, but as far as I can see, these are just the parameters for the technical implementation. Everything else that would be needed for a trial can be determined by the parameters of individual trials. A sitenotice announcing this poll might have been a good idea, except that it doesn't really affect them at present. I would oppose the implementation of any trial that wasn't advertised on a sitenotice where all (including IPs) could see it, FWIW. Fritzpoll (talk) 19:19, 9 January 2009 (UTC)[reply]

That's not the only such discussion. Disabling editing by people without accounts is a perennial proposal, also to be found at m:Anonymous users should not be allowed to edit articles and repeatedly at the Village Pump and elsewhere. It is perennially countered by those who point out that people without accounts are actually responsible for most of our content, and that in some cases it is the people without accounts who are quietly fighting the vandalism of the people with accounts. Uncle G (talk) 01:46, 10 January 2009 (UTC)[reply]