Wikipedia:Edit filter/Requested

From Wikipedia, the free encyclopedia
    Requested edit filters

    This page can be used to request edit filters, or changes to existing filters. Edit filters are primarily used to address common patterns of harmful editing.

    Private filters should not be discussed in detail. If you wish to discuss creating an LTA filter, or changing an existing one, please instead email details to wikipedia-en-editfilters@lists.wikimedia.org.

    Otherwise, please add a new section at the bottom using the following format:

    == Brief description of filter ==
    *'''Task''': What is the filter supposed to do? To what pages and editors does it apply?
    *'''Reason''': Why is the filter needed?
    *'''Diffs''': Diffs of sample edits/cases. If the diffs are revdelled, consider emailing their contents to the mailing list.
    ~~~~
    

    Please note the following:

    • Edit filters are used primarily to prevent abuse. Contributors are not expected to have read all 200+ policies, guidelines and style pages before editing. Trivial formatting mistakes and edits that at first glance look fine but go against some obscure style guideline or arbitration ruling are not suitable candidates for an edit filter.
    • Filters are applied to all edits. Problematic changes that apply to a single page are likely not suitable for an edit filter. Page protection may be more appropriate in such cases.
    • Non-essential tasks or those that require access to complex criteria, especially information that the filter does not have access to, may be more appropriate for a bot task or external software.
    • To prevent the creation of pages with certain names, the title blacklist is usually a better way to handle the problem - see MediaWiki talk:Titleblacklist for details.
    • To prevent the addition of problematic external links, please make your request at the spam blacklist.
    • To prevent the registration of accounts with certain names, please make your request at the global title blacklist.
    • To prevent the registration of accounts with certain email addresses, please make your request at the email blacklist.


    Edit filter for copy-paste pagemoves

    • Task: Prevent copying drafts into the article space. This would apply to all editors, and would target the article space.
    • Reason: a very common entry in Category:Candidates for history merging these days is a page that was copy/pasted from the draft space, either because there is an existing redirect in the way or because the page was draftified and the creator (or someone else) likely does not know how to request a redirect be deleted (usually via {{db-move}} or WP:RM/TR).
    • Diffs: Special:Diff/1248536996, Special:Diff/1249173005

    I'll note that this sort of filter will not necessarily stop copy/paste pagemoves from the draft space where the article is a redlink (e.g. Special:Diff/1245946107 or Special:Diff/1249205898), but it will hopefully stop copy/pastes over redirects. Primefac (talk) 21:11, 5 October 2024 (UTC)

    I'm not sure if this has to do with why no one is replying, but I tried looking at the diffs when you first added them and found it hard to understand what type of edit you are asking for a filter about, presumably because you merged the histories of the pages and that changed the diffs. From a general description it also sounds difficult to figure out how detecting copy-paste moves would work, seeing as the filter only has context of what is (and was) on the one page being edited in the action it triggered on.
    Is/was there something specific about these diffs that could be used to detect others like them? – 2804:F1...29:CE67 (talk) 00:20, 19 October 2024 (UTC)
    It basically boils down to "someone overwrites a redirect with a large amount of text and there is a draft at the same title"; from what I have seen that is almost always a copy/paste pagemove that requires a histmerge. Primefac (talk) 11:46, 19 October 2024 (UTC)
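A minimal Python sketch of the detectable half of that description ("someone overwrites a redirect with a large amount of text") may help make the idea concrete. It is only an analogue of what a filter could test, not AbuseFilter code. The function name `overwrites_redirect` and the 2000-byte threshold are assumptions made for illustration, and the sketch deliberately does not check whether a draft exists at the same title.

```python
import re

# Hypothetical threshold for "a large amount of text"; not taken from the thread.
LARGE_ADDITION_BYTES = 2000

# A page counts as a redirect here if its wikitext starts with "#REDIRECT [[".
REDIRECT_RE = re.compile(r"^\s*#REDIRECT\s*\[\[", re.IGNORECASE)


def overwrites_redirect(old_wikitext: str, new_wikitext: str,
                        min_growth: int = LARGE_ADDITION_BYTES) -> bool:
    """Return True when a redirect is replaced by much longer non-redirect text."""
    was_redirect = bool(REDIRECT_RE.match(old_wikitext))
    still_redirect = bool(REDIRECT_RE.match(new_wikitext))
    growth = len(new_wikitext.encode("utf-8")) - len(old_wikitext.encode("utf-8"))
    return was_redirect and not still_redirect and growth >= min_growth
```

For example, `overwrites_redirect("#REDIRECT [[Draft:Foo]]", "x" * 3000)` is true, while retargeting a redirect or expanding an existing article is not flagged.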
    Sorry for taking so long to reply.
    Unfortunately I don't think there is a way to know if an article exists at Draft:ArticleName from a filter action that happened at ArticleName unless there is a link to the draft in the new version (after the big addition) which would allow a search in the new_html for class="new" title="Draft:ArticleName. 1112 (hist · log) ("Notable people" disruption) does this.
    This discussion, about checking whether a link was a disambiguation link for that same filter, assumed it was not possible to retrieve article content from a title until someone brought that up. The variables (mediawiki) only seem to contain information about the page(s) where the action happened and/or about the user doing the action.
    -
    On the other hand, one of the edits did trigger and get tagged by 164 (hist · log) (Possible cut and paste moves). That filter works by checking, for users with fewer than 250 edits creating new pages (page_id 0), if the added content contains "[edit]" or maintenance templates to guess that it was copied from a different page; that's not as narrow as 'copied from the Draft', but it is something detectable at least.
    Now, would people agree with disallowing edits like that? I don't know.
    -
    I say this more to be informative; I hope others share their thoughts/ideas too. – 2804:F1...EE:EFBD (talk) 19:26, 21 October 2024 (UTC)
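The heuristic described for filter 164 can be sketched in Python as follows. This is an illustration of the description above only, not the filter's actual code. The function name and the short template list are hypothetical stand-ins, since the thread does not say which maintenance templates the real filter looks for.

```python
import re

MAX_EDIT_COUNT = 250  # threshold quoted in the discussion

# Hypothetical stand-ins for "maintenance templates"; the real list is not given in the thread.
COPY_MARKERS = re.compile(
    r"\[edit\]|\{\{\s*(?:citation needed|unreferenced|cleanup)\b",
    re.IGNORECASE,
)


def looks_like_cut_and_paste(user_editcount: int, page_id: int, added_text: str) -> bool:
    """Guess whether a new page was pasted from rendered or tagged content elsewhere."""
    is_newer_user = user_editcount < MAX_EDIT_COUNT
    is_new_page = page_id == 0  # page_id 0 means the page did not exist before this edit
    has_marker = bool(COPY_MARKERS.search(added_text))
    return is_newer_user and is_new_page and has_marker
```

A literal "[edit]" in the added text usually means the text was copied from a rendered page rather than typed as wikitext, which is why it works as a cheap signal.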

    Filter unsourced tornado / hurricane rating changes

    Also, I know this can happen with hurricanes; see the edits on Hurricane Beryl from early on July 2 and you'll see why it needed protection. GeorgeMemulous (talk) 13:37, 23 October 2024 (UTC)

    (denied removed) and Deferred to requests for page protection. The first diff you present seems like it was made in good faith (?) based on the edit summary alone, though I'm not too familiar with tornadoes. This seems to be something that pending changes would help with more than a filter, though. EggRoll97 (talk) 23:46, 23 October 2024 (UTC)
    Disruption has been ongoing since 2023 and isn't limited to those four pages, even if they are the most recent targets. Let me assemble a few more diffs from various pages: 2023 Rolling Fork tornado, 2021 Western Kentucky tornado, Tornado outbreak of March 31, 2023, Tornado outbreak of December 10, 2021, Tornadoes of 2020, 2015 Rochelle-Fairdale tornado, Tornadoes of 2014, Tornadoes of 2013, Tornadoes of 2013 again, Tornado outbreak of November 17, 2013, and one, two, three, and four instances on 2013 El Reno tornado. There are probably more out there, and there are certainly more to come, as this is one of the easiest ways to vandalize a tornado article (literally changing one number). Also note that the first diff was a reversion to a clean version after multiple previous disruptive edits, as is at least one of these new examples. All tornado and tornado outbreak articles are vulnerable to this, and disruption often occurs years after the event leaves the news cycle, so protection may not be the way to go in my opinion. GeorgeMemulous (talk) 00:22, 24 October 2024 (UTC)
    Doing... Fair enough. I'll see if I can whip up a preliminary start to this. EggRoll97 (talk) 00:29, 24 October 2024 (UTC)
    I'll summarize a few points as you said you aren't too familiar with the topic:
    • Tornadoes in the US and Canada are rated on the Enhanced Fujita scale, shortened to EF. This scale ranges from 0 to 5.
    • Tornadoes in the rest of the world are often rated on the International Fujita scale, shortened to IF. Again, 0 to 5.
    • Some countries still use the legacy Fujita scale, shortened to F. This goes from 0 to 12, but only 0 to 5 have ever been final.
    • All are formatted similarly: F0, EF1, IF2, F3, EF4, IF5.
    • Citations to verify typically come from the NCEI database or ESWD, but preliminary ratings often come from Twitter or a statement from the local NWS office.
    • The TORRO scale is more or less unused and obscure to the point where it's an unlikely disruption target.
    Cheers! GeorgeMemulous (talk) 00:48, 24 October 2024 (UTC)
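The rating formats summarized above can be made concrete with a small Python sketch. The function name `parse_rating` and the lookup table are hypothetical. Capping the number at 5 is an assumption that follows the note that only 0 to 5 have ever been final, so higher theoretical Fujita values are rejected here.

```python
import re
from typing import Optional, Tuple

# Optional scale prefix (E or I), the letter F, then a single digit from 0 to 5.
RATING_RE = re.compile(r"([EI]?)F([0-5])")

SCALE_NAMES = {
    "": "Fujita",
    "E": "Enhanced Fujita",
    "I": "International Fujita",
}


def parse_rating(label: str) -> Optional[Tuple[str, int]]:
    """Split a label such as 'EF4' into (scale name, number), or return None."""
    match = RATING_RE.fullmatch(label.strip())
    if match is None:
        return None
    return SCALE_NAMES[match.group(1)], int(match.group(2))
```

With this sketch, `parse_rating("EF4")` gives `("Enhanced Fujita", 4)`, while labels such as `"F12"` or a TORRO-style `"T5"` give `None`.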
    Update, Still doing..., though at a fairly slow speed. If anyone wants to take over on coding, absolutely go ahead. Things in the real world have been taking a slight bit of a toll over the last bit. EggRoll97 (talk) 22:34, 30 October 2024 (UTC)
    Update, probably don't see myself working on this, but a filter should be made. Not sure if anyone wants to pick this up by chance. EggRoll97 (talk) 04:55, 9 November 2024 (UTC)
    @EggRoll97 and GeorgeMemulous: Here is some basic filter code we could use:
    !("extendedconfirmed" in user_groups) &
    page_namespace == 0 &
    !(added_lines contains "<ref") & (
      scaleStr := "(?:E|I)?F[0-5]";
      removed_lines contains scaleStr &
      added_lines contains scaleStr
      !(removed_lines = added_lines)
    )
    
    What this should do is check whether anyone is removing a tornado scale rating and adding one in its place without a source. Thanks, – PharyngealImplosive7 (talk) 17:50, 10 November 2024 (UTC)
    Testing at 1324. Looks good for testing. I've been busy over the last bit, but I can toss this in and keep an eye on it (by the way, an & was forgotten at the end of line 6). Thanks! EggRoll97 (talk) 23:44, 10 November 2024 (UTC)

    I think the current filter is broken, since it could not catch the changes, even when tested with FilterDebugger. contains looks for the literal string itself, whereas irlike is recommended for regex. Here's what I wrote instead:

    page_namespace == 0 &
    page_title irlike "tornado" &
    !contains_any(user_groups, "extendedconfirmed", "sysop", "bot") &
    !(added_lines contains "<ref") &
    (
        scaleStr := "[EI]?F[0-5]";
        removed_lines rlike scaleStr &
        added_lines rlike scaleStr &
        !(removed_lines == added_lines)
    )
    

    I am pinging both PharyngealImplosive7 and EggRoll97. Codename Noreste 🤔 Talk 01:30, 11 November 2024 (UTC)
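The difference between a literal substring test and a regex test can be illustrated with a Python analogue. The helper functions below only borrow the names `contains`, `rlike` and `irlike` from the filter syntax. They are written for this sketch and assume Python's `re` behaves like the filter's regex engine for this simple pattern.

```python
import re

SCALE = "[EI]?F[0-5]"


def contains(haystack: str, needle: str) -> bool:
    # Literal substring test: the needle is NOT interpreted as a regex.
    return needle in haystack


def rlike(haystack: str, pattern: str) -> bool:
    # Case-sensitive regex search anywhere in the haystack.
    return re.search(pattern, haystack) is not None


def irlike(haystack: str, pattern: str) -> bool:
    # Case-insensitive variant of rlike.
    return re.search(pattern, haystack, re.IGNORECASE) is not None


line = "| fujitascale = EF4"
print(contains(line, SCALE))             # False: the line has no literal "[EI]?F[0-5]"
print(rlike(line, SCALE))                # True: the regex matches "EF4"
print(rlike("an ef4 tornado", SCALE))    # False: lowercase is not matched
print(irlike("an ef4 tornado", SCALE))   # True: case is ignored
```

This is why a literal test never fires on real rating changes: it would only match text that contains the pattern string itself.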

    I would suggest rlike since the scale ratings are usually marked with capital letters, but otherwise, looks good. Also, do bots really make these changes? Anyway, thanks for the help. – PharyngealImplosive7 (talk) 03:20, 11 November 2024 (UTC)
    Bots make a lot of edits that change a line that doesn't contain '<ref', so excluding bots near the top means the filter doesn't needlessly check all the way to removed_lines or added_lines.
    The last line's comparison seems unfinished. I think you meant to compare whether the scale added is different from the one removed (i.e. to exclude an unrelated change to the same line), but the current check is whether the removed and added lines are different, which is (surely?) always the case. – user usually at 2804:F14::/32, currently 143.208.239.58 (talk) 03:52, 11 November 2024 (UTC)
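One way to express the comparison the IP describes is to extract the ratings from each side and compare those rather than the whole lines. The Python sketch below shows that idea under stated assumptions. The name `rating_changed` is hypothetical, and the sketch says nothing about how this would be written in filter syntax.

```python
import re

SCALE_RE = re.compile(r"[EI]?F[0-5]")


def rating_changed(removed_lines: str, added_lines: str) -> bool:
    """Return True only when both sides carry ratings and the ratings differ."""
    removed = SCALE_RE.findall(removed_lines)
    added = SCALE_RE.findall(added_lines)
    # Comparing the extracted ratings ignores unrelated rewording on the same line.
    return bool(removed) and bool(added) and removed != added
```

For instance, changing "rated EF3 by NWS" to "rated EF5 by NWS" is flagged, while rewording a sentence that keeps the same EF4 rating is not, even though the removed and added lines differ in both cases.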
    Modified the suggested code to use rlike for the regex, and added a condition to only target pages with tornado in the title. Codename Noreste 🤔 Talk 04:14, 11 November 2024 (UTC)
    Also, I noticed that you changed my original regex to (?:E|I)?F[0-5]{1,2}. Numbers above 5 are not used in any scale we are tracking, though they could exist theoretically on the Fujita Scale. As a result, I think you should delete the "{1,2}" part. – PharyngealImplosive7 (talk) 04:52, 11 November 2024 (UTC)
    Looks good, though I've added hurricane to the page_title check, since this appears to occur with hurricane ratings as well. EggRoll97 (talk) 04:53, 11 November 2024 (UTC)
    @EggRoll97: The regex also might need to be fixed, see my comment above. – PharyngealImplosive7 (talk) 04:56, 11 November 2024 (UTC)
    {1,2} means the preceding [0-5] may match one or two digits, but I will remove it from the filter's regex. Codename Noreste 🤔 Talk 05:05, 11 November 2024 (UTC)
    And it's removed, PharyngealImplosive7. Note that I also changed (?:E|I)? to [EI]? as it only denotes a set of these two letters, so I don't think a non-capturing group is needed here. Codename Noreste 🤔 Talk 05:09, 11 November 2024 (UTC)
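Both regex changes can be checked with a short Python sketch, again assuming Python's `re` agrees with the filter's engine for patterns this simple. The variable names and sample labels are made up for illustration. Note that the quantifier only changes the result when the whole label must match. For a search anywhere in a line, anything the longer pattern finds also contains a match for the shorter one.

```python
import re

with_quantifier = re.compile(r"(?:E|I)?F[0-5]{1,2}")
without_quantifier = re.compile(r"(?:E|I)?F[0-5]")
char_class = re.compile(r"[EI]?F[0-5]")

# Hypothetical sample labels, valid and invalid.
samples = ["F0", "EF1", "IF2", "F3", "EF4", "IF5", "EF6", "XF1", "EF", "EF55"]

# Whole-label matching: {1,2} additionally accepts two-digit values such as "EF55".
quantifier_only = [s for s in samples
                   if with_quantifier.fullmatch(s) and not without_quantifier.fullmatch(s)]

# The non-capturing group and the character class accept exactly the same labels.
same_results = all(
    (without_quantifier.fullmatch(s) is None) == (char_class.fullmatch(s) is None)
    for s in samples
)

# Searching within a line: both patterns agree on whether a match exists at all.
search_agrees = all(
    (with_quantifier.search(s) is None) == (without_quantifier.search(s) is None)
    for s in samples
)
```

Here `quantifier_only` ends up as `["EF55"]`, and both `same_results` and `search_agrees` are true.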
    Yes, that looks good. The IP in the conversation suggested we modify the last line of the filter (the check of whether the added lines are the same as the removed lines). Any ideas on how to fix that like the IP said? – PharyngealImplosive7 (talk) 05:12, 11 November 2024 (UTC)
    Maybe changing == to in would work? Codename Noreste 🤔 Talk 05:14, 11 November 2024 (UTC)
    Just saw the comment about needing the regex fixed. Sorry, I was working on the filter with an old version of this page, so I didn't see the comment about fixing it until now. I've just removed the {1,2} from the regex, and changed (?:E|I)? to [EI]?. EggRoll97 (talk) 05:16, 11 November 2024 (UTC)