Wikipedia:Edit filter/Requested/Archive 19

Block addition of __INDEX__ in mainspace

-- Asartea Talk | Contribs 19:33, 21 January 2022 (UTC)

__INDEX__ anywhere is probably weird enough to reasonably trigger a filter. ~ ToBeFree (talk) 15:47, 22 January 2022 (UTC)
...except when discussing __INDEX__! Certes (talk) 16:08, 22 January 2022 (UTC)
Yeah, I could see a case for a filter that warns whenever someone adds __INDEX__ anywhere without surrounding <nowiki> tags, but since in mainspace you aren't discussing it and it doesn't do anything, it's probably fine to just flat-out block it there. -- Asartea Talk | Contribs 16:59, 22 January 2022 (UTC)
About 1000 articles use the word (sample), so we may want a purge once further additions have been prevented. Certes (talk) 17:27, 22 January 2022 (UTC)
Okay, having been poking:
  • There are 187 uses of __INDEX__ in Draft: which aren't nowikied. These should all be wasteful, so it can probably be blocked there.
  • There are 282 uses in User: and User talk:, which I think are all reasonable and should be allowed.
  • There are 697 uses in (Article) which aren't nowikied, and none which are. These are all a waste and probably should be blocked
  • There are 18 uses in all other namespaces which aren't nowikied, at least some of which are reasonable, but also some which aren't.
    • (All of these results can be checked by going to https://en.wikipedia.org/w/index.php?sort=last_edit_desc&search=insource%3Aindex+insource%3A%2F__INDEX__%2F+-insource%3A%22%3Cnowiki%3E+index+nowiki%22&title=Special%3ASearch&profile=advanced&fulltext=1&ns0=1 and selecting the namespace(s). Annoyingly enough, I can't just post the URLs as external links because MediaWiki tries to be smart and breaks them. Also note that I subtracted one from the "all other namespaces" count, since I have an open AWB perm request which contains __INDEX__ in a search URL, which I just realised I also need to fix.)
So overall I think that __INDEX__ can just be flat-out blocked in (Article), Draft: and Draft talk:, allowed in User: and User talk:, and that insertions of __INDEX__ where there are no <nowiki> tags surrounding it in all other namespaces should output a warning message. For bonus fun there is also {{Index}}, but I think that can be handled at a template level with a switch based on namespace. -- Asartea Talk | Contribs 17:28, 22 January 2022 (UTC)
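The per-namespace policy proposed above can be modelled roughly as follows. This is only a Python sketch of the decision logic, not AbuseFilter code; the function name `index_action`, the namespace labels, and the three return values are illustrative assumptions, and mainspace is represented by the empty string.

```python
import re

# Hypothetical namespace groupings taken from the proposal above.
BLOCKED_NAMESPACES = {"", "Draft", "Draft talk"}   # "" stands for (Article)
ALLOWED_NAMESPACES = {"User", "User talk"}

# Matches <nowiki>...</nowiki> spans so they can be ignored elsewhere.
NOWIKI_SPAN = re.compile(r"<nowiki>.*?</nowiki>", re.DOTALL | re.IGNORECASE)


def index_action(namespace: str, added_text: str) -> str:
    """Return 'allow', 'warn' or 'disallow' for an edit adding text."""
    if "__INDEX__" not in added_text:
        return "allow"
    if namespace in ALLOWED_NAMESPACES:
        return "allow"
    if namespace in BLOCKED_NAMESPACES:
        # Flat-out block, whether or not the magic word is nowikied.
        return "disallow"
    # Everywhere else: warn only when the magic word is not inside <nowiki>.
    visible = NOWIKI_SPAN.sub("", added_text)
    return "warn" if "__INDEX__" in visible else "allow"
```

For example, under these assumptions `index_action("Wikipedia", "see <nowiki>__INDEX__</nowiki>")` is allowed, while the same text without the tags would produce a warning.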
Many of the articles have recently been indexed, e.g. List of Acts of the 1st Session of the 42nd Parliament of the United Kingdom and many similar lists of UK acts. I also noticed {{INDEX}} but was unsure whether to mention it. It seems to be unused. I sneaked in a URL by searching for _\_INDEX__. Certes (talk) 17:35, 22 January 2022 (UTC)
Per the doc page of {{INDEX}} __INDEX__ only does anything in User: and User talk:, so BEANS shouldn't be an issue. I'm currently writing a patch to make the template hardfail with warning in all other namespaces. -- Asartea Talk | Contribs 17:39, 22 January 2022 (UTC)
I'd like to see mainspace and draftspace tagged rather than blocked (for now)...let's see what sorts of folks think they should be adding it. GeneralNotability (talk) 19:44, 22 January 2022 (UTC)
@GeneralNotability I get what you mean, but every time it's added it will need to be removed at some point, which costs volunteer time. I'm fine with tagging for now, but long term I'm going to advocate for blocking in the namespaces I mentioned above (Article, Draft, Draft talk). -- Asartea Talk | Contribs 19:51, 22 January 2022 (UTC)
TPER filed at Template talk:INDEX#Template-protected edit request on 22 January 2022, so once that gets merged that fixes the {{INDEX}} issue. -- Asartea Talk | Contribs 19:45, 22 January 2022 (UTC)
@Asartea: See 1183 (hist · log). This isn't really disruptive and it has no effect on the reader, and doesn't actually enable spam, so I don't think we should be disallowing. But I agree with GeneralNotability that it might be worth seeing what sorts of folks think they should be adding it. Right now the filter is really broad; logging all uses of INDEX and NOINDEX in all namespaces, by all users. It might be worth excluding users with (say) more than 1000 edits, but I want to see what it catches for now. Suffusion of Yellow (talk) 00:09, 23 January 2022 (UTC)
@Suffusion of Yellow could you exclude user and user talk? Those are the only two in which it actually has an effect, so I don't think we need to log those, since it's fine to add there. -- Asartea Talk | Contribs 08:22, 23 January 2022 (UTC)
Still makes sense to log them if you assume that a certain group of problematic editors (COI) is using them. Regards SoWhy 10:10, 23 January 2022 (UTC)
I excluded non-extendedconfirmed editors in userspace. That's already covered by 930 (hist · log). Suffusion of Yellow (talk) 19:58, 23 January 2022 (UTC)
Oh also @Suffusion of Yellow alerting you to the existence of MediaWiki talk:Robots.txt#COIBot report, which if done should remove the need for a COIBot exclusion. -- Asartea Talk | Contribs 20:04, 23 January 2022 (UTC)

Persistent addition of incorrect soundtrack credits

  • Task: To prevent adding the name Mohammed Shanooj to Malayalam film articles.
  • Reason: There is an IP hopper who is persistently adding an incorrect soundtrack credit to film articles, probably for self-promotion, because there's no other reason why they would do it. This person managed to obtain circular references for Meppadiyan[1] and Hridayam[2]: sources copied his own hijacked version of the article and were then cited back into the same articles. If you google his name there's actually an IMDb page and other pages for this composer, but if you look deeper you can see that he's a teenager who has uploaded a few amateur music videos on YouTube (which are not original but altered versions of existing works). I guess what he's trying to do is obtain circular references mentioning his name as the composer of notable films so that he can promote himself as a music composer and create a composer's profile (like the one on IMDb) at popular music websites that still need more sources for verification. ToBeFree advised an edit filter since their IP range is too large to block.
  • Diffs: [3][4][5][6][7][8][9][10][11][12].

2409:4073:2094:FF07:3452:920C:D85A:CFEF (talk) 14:03, 22 January 2022 (UTC)

Thank you for requesting this! Looks like a good task for an edit filter, as the person won't be interested in circumventing the filter by using a different name. Catching them using a filter should be simple. ~ ToBeFree (talk) 15:42, 22 January 2022 (UTC)
 Done I have an existing private filter that handles similar types of self-promotion. OhNoitsJamie Talk 16:10, 22 January 2022 (UTC)

Possible strange widespread vandalism of talk pages

In the past day I've seen (and reverted) edits on three talk pages: on Talk:Trivia, on Talk:Messenger, and on Talk:Google Ngram Viewer.

They're unrelated and come from unrelated IP addresses, but all create a new discussion with a single-word title and a single word of content. And if I saw three of them on my watchlist there are probably thousands of others. Any thoughts? Thanks, Dan Bloch (talk) 19:05, 21 December 2021 (UTC)

The only tool which can be relevant here is the edit filter; the more examples you can find, the easier it will be to stop this. 2A03:C5C0:207F:22C2:F8CF:DD87:F2C:49C7 (talk) 19:13, 21 December 2021 (UTC)
Weren't article talk pages enabled for logged-out mobile web editors recently? I wonder if what we're seeing is just the kind of crud that appears in the "comments" section of ... any webpage with a "comments" section. I'm not seeing many high-quality comments here, of any length. Suffusion of Yellow (talk) 20:08, 21 December 2021 (UTC)
Thanks! That's almost certainly it. Many of the edits from your "recent changes" query look like the ones I'm seeing ([13], [14], [15], ...) and it would explain why the articles and IP addresses are all unrelated, and also why the edits have signatures. Dan Bloch (talk) 20:21, 21 December 2021 (UTC)
Found the task, it was phab:T293946. Looks like the talk page link was enabled in mid-November, making the pages easier to find. Wondering if that was such a good idea, given that banners are hidden and edit notices are nonexistent. Suffusion of Yellow (talk) 20:34, 21 December 2021 (UTC)
Yeah, I suspect this is the cause. I'm seeing the typical mix of spam, random one-word comments, and totally off-topic comments (e.g. [16]). -- Asartea Talk | Contribs 20:38, 21 December 2021 (UTC)
If we do want to create a filter for drive-by mobile comments (say, disallowing comments under 25 bytes), remember that the only message that a mobile editor will see is "The topic can't be added due to an unknown error." There's no possibility of a custom message, or indeed any message that mentions that the edit was stopped by a filter. See phab:T281544. Suffusion of Yellow (talk) 20:48, 21 December 2021 (UTC)
Oh dear. Yet another failing of MobileFrontend to adequately display information to editors. firefly ( t · c ) 20:57, 21 December 2021 (UTC)
adds it to the pile of them. But seriously, wasn't AF not displaying supposed to be fixed for the Android app? The fact this is now happening suggests it wasn't that fixed, or whatever they did they special cased it to the (Article) namespace only? -- Asartea Talk | Contribs 08:24, 22 December 2021 (UTC)
I think this is coming from mobile browser clients, not the App. Article talk pages were recently made visible to anonymous mobile web editors. I imagine some readers see the "talk" link, and do what many Internet commenters do - leave an irrelevant message just to show the world they were there. firefly ( t · c ) 09:21, 22 December 2021 (UTC)
I noticed a big increase in such contributions from about 27 November. Earlier examples: [17] [18] (unsigned) [19] [20] [21]. Certes (talk) 22:53, 21 December 2021 (UTC)
... And now we have talk page vandalism? – AssumeGoodWraith (talk | contribs) 01:33, 22 December 2021 (UTC)
Thread convergence: those edits have summaries of "Fixed typo" with a significant size change. Certes (talk) 12:42, 22 December 2021 (UTC)
The "fixed typo" and "added content" summaries are there because it's a suggestion in the mobile edit summary box. Usually not fixing a typo. – AssumeGoodWraith (talk | contribs) 13:15, 22 December 2021 (UTC)
I'll add some examples if this helps:
Completely irrelevant, using as a forum (stuff like "hi"), etc: [22], [23], [24], [25], [26], [27], [28], [29]
Random and nonsensical: [30], [31]
Vandalism of talk page templates: [32], [33]
Creating a random talk page that doesn't have a main article: [34]
Related but not about improvement of article: [35], [36], [37], [38], [39], [40], [41] (thought the article subject owns the article)
Creating an article in a talk page(?): [42] (IPs can freely create talk pages) – AssumeGoodWraith (talk | contribs) 11:53, 24 December 2021 (UTC)
(Non-administrator comment) I agree that the ones that are super-short and/or contain only "hi" messages, like this one from March at Talk:Quantum mechanics, should have an edit filter. Actually, I think I've seen similar edits in mainspace, but I don't recall where. Let's see what an EFM thinks about this. –LaundryPizza03 (d) 03:07, 2 January 2022 (UTC)
Danbloch, Suffusion of Yellow, I'm responsible for the proposal. Some level of crap was expected anyway, but it's not entirely trivial to determine whether it's an epidemic or anything like that. I'm not unsympathetic towards disabling editing for anons altogether (it's hard to collaborate with a fleeting IP), but as long as they are editors they need talk page access. The examples indicate some users think this is Twitter or something. A few appear to be mistaking Discussion Tools for a search engine.
Based on the examples I suggest creating an edit filter for additions from mobile anons in talk where the edit summary contains "new section" and less than 200 bytes were added. Set it to just log first. — Alexis Jazz (talk or ping me) 23:52, 6 January 2022 (UTC)
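The log-only condition suggested above could be modelled roughly like this. It is a Python sketch, not filter code; the function and parameter names are hypothetical, and it assumes the usual MediaWiki convention that talk namespaces have odd, non-negative IDs.

```python
def is_short_mobile_talk_post(is_anon: bool,
                              is_mobile: bool,
                              namespace_id: int,
                              summary: str,
                              bytes_added: int,
                              max_bytes: int = 200) -> bool:
    """True when an edit matches the proposed 'drive-by comment' pattern."""
    # Talk namespaces are odd-numbered; the >= 0 check excludes the
    # special namespaces, which have negative IDs.
    is_talk = namespace_id >= 0 and namespace_id % 2 == 1
    return (
        is_anon
        and is_mobile
        and is_talk
        and "new section" in summary.lower()
        and 0 < bytes_added < max_bytes
    )
```

The 200-byte threshold is the figure from the suggestion above; a lower value such as the 25 bytes mentioned earlier in the thread can be passed as `max_bytes`.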
Hi y'all – it's helpful to see the edits you are encountering on talk pages and to know this trend is pronounced enough for you to consider taking action to mitigate it.
I* recognize I'm a bit late to this discussion. Tho, I thought you would value knowing how the Foundation is thinking about this uptick in destructive behavior.
  1. We share the hypothesis @Suffusion of Yellow noted above that this increase in vandalistic talk page edits is a consequence of exposing the Talk link to anons on mobile on 15 November 2021.
  2. We are in the process of analyzing how making Talk visible to anons on mobile is impacting metrics like talk page revert rate, talk page bounce rate and talk page page views.
  3. In the coming weeks, we will share a summary of the analysis mentioned in "2." so that we can collectively discuss what actions we should consider experimenting with in response. I'm thinking we'll start this discussion on WP:VPR where we last talked about this.
Alright, if any new thoughts/questions/ideas emerge between now and "3." please ping us here.
*I'm Peter. I work as the product manager for the Editing Team who, along with @OVasileva (WMF) (the product manager for the Readers Web Team) is investigating this. PPelberg (WMF) (talk) 17:43, 28 January 2022 (UTC)

Vitaium (talk) 06:39, 2 February 2022 (UTC)

@Vitaium: That would require community consensus, first. And I think such a proposal would have a WP:SNOWball's chance of succeeding, but even if it did, mass protection would be the way to go. Suffusion of Yellow (talk) 23:25, 3 February 2022 (UTC)
OK, I will switch to RfC instead of WP:EFR. Vitaium (talk) 23:47, 3 February 2022 (UTC)

Brief description of filter

  • Task: To stop long-time vandal
  • Reason: Repeatedly posting violent content on biographies of living persons
  • Diffs: They are revision deleted, 1 and 2 but the gist is that he writes, " I will slit your throat [name] I will slit your throat [name]" over and over again. I can't see any future possible use of the phrase "I will slit your throat" that will be impacted by this filter. I'm sure he will probably change his MO but this would prevent a lot of revision deletion that occurs on a daily basis these days. There is also a more violent and sexual graphic phrase that he regularly uses but I'd rather email that to those who create edit filters than repeat it here. Thank you. Liz Read! Talk! 23:18, 16 February 2022 (UTC)
    @Liz: There are already filters targeting this vandal, but this LTA changes their MO repeatedly and I believe has some technical proficiency so would prefer not to discuss much publicly. A rangeblocks approach would probably be better though. ProcrastinatingReader (talk) 00:22, 17 February 2022 (UTC)

Copy and pasted vandalism

  • Task: Disallow copying parts of the interface and pasting them unmodified
  • Reason: this will prevent some vandalism
  • Diffs: [43]

-- lomrjyo (📝) 21:29, 9 February 2022 (UTC)

Another one: [44] CutlassCiera 18:24, 3 March 2022 (UTC)

Stop BLP vandalism against Benjamin Netanyahu

2.55.13.156 (talk) 04:34, 9 March 2022 (UTC)

See also WP:VPT#Is there a better way to stop this?. Certes (talk) 11:39, 9 March 2022 (UTC)

Spamrefs of Riggs, P.J.

DVdm (talk) 22:20, 8 March 2022 (UTC)

Spammer is problematic and continued spamming after multiple warnings. I found a number of previous edits by searching for "riggs, p.j.".
FYI: a "Riggs, Peter, J." was cited on footnote #2 Anne Elk's Theory on Brontosauruses that was added on 26 September 2007 by Pocopocopocopoco (talk · contribs) who has not edited since February 2009. Adakiko (talk) 00:10, 9 March 2022 (UTC)

Additional IP:

Adakiko (talk) 01:26, 9 March 2022 (UTC)

Possibly helpful information: all the IPs geolocate to Canberra, Australia, and there is a Dr Peter J. Riggs based there. Tercer (talk) 08:38, 9 March 2022 (UTC)

Additional new editor:

Adakiko (talk) 22:12, 17 March 2022 (UTC)

The Trudeaus

  • Task: Prevent unregistered editors from adding "Castro" to any article about a member of the Trudeau family (Justin Trudeau, Margaret Trudeau, etc.) or its talk page. Likewise for adding Trudeau to Fidel Castro or its talk page.
  • Reason: Unimaginative and repetitive BLP vandalism.
  • Diffs: See history of articles mentioned above.
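The task above can be sketched in Python as follows. This is a model of the requested condition only, not the code of any actual filter; the page list contains just the two Trudeau articles named in the request, and the function name and parameters are hypothetical. `base_title` is assumed to be the page title without any "Talk:" prefix, so an article and its talk page are treated alike.

```python
# Only the articles named in the request; the real list would be longer.
TRUDEAU_PAGES = {"Justin Trudeau", "Margaret Trudeau"}
CASTRO_PAGE = "Fidel Castro"


def adds_cross_reference(is_registered: bool,
                         base_title: str,
                         removed_text: str,
                         added_text: str) -> bool:
    """True when an unregistered editor newly adds the other family's name."""
    if is_registered:
        return False
    if base_title in TRUDEAU_PAGES:
        word = "castro"
    elif base_title == CASTRO_PAGE:
        word = "trudeau"
    else:
        return False
    # Only count the word if it appears in the added text but was not
    # already present in the text being replaced.
    return word in added_text.lower() and word not in removed_text.lower()
```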

Ivanvector (Talk/Edits) 15:27, 23 February 2022 (UTC)

Testing 2. ProcrastinatingReader (talk) 21:05, 1 March 2022 (UTC)
Doesn't seem like this LTA is very active Ivanvector, see logs for 2 (hist · log) between 1–20 March (and the revision of filter code preceding the current one). Is it missing any hits (due to too narrow filter code or revdel, or otherwise)? ProcrastinatingReader (talk) 11:37, 21 March 2022 (UTC)
Also, all but one of those is a false positive. I haven't noticed any missed vandalism, either. I don't think this is one particular LTA, there's a fairly well known conspiracy theory going around about Castro and Margaret Trudeau. It's demonstrably false (Fidel and Margaret didn't meet until several years after Justin was born) but it's something that Canadians who would wet themselves if they could vote for Trump like to shout about in between rolling coal and whining about how expensive gas is. We can stick to WP:RBI for this. Ivanvector (Talk/Edits) 13:10, 21 March 2022 (UTC)

Strange IP portal talk vandalism

Okay, this is a very specific but limited request for a bot to do some sleuthing. I just discovered an editor, using IPs, mainly from Ontario, Canada but previously from all over, who has a strange kind of vandalism. They create pages and post long stories about riding an elevator. Here is a diff of an example of the same content that just gets reposted. Anyway, they are always posted on Portal talk:Current events pages for different days of the year. I was just finding Portal talk pages from random days in 2007 but the example I just shared was a day in 2013. Going through some pages I found that they were doing this, posting this same content about riding an elevator, back in 2012! Since I doubt any editors have Portal talk pages on their Watchlist, there might be a lot of this nonsense that still exists or it could be that I found all of it today and there is none left.

Could a bot run a check on Portal talk pages for different days of the year and see if there are any pages that have this strange content? So far, I haven't found any reason for there to be a Portal talk page for each day of every year so maybe a mass deletion is in order if there are a lot of these pages that have been created. Since these pages are typically not seen by readers, this is obviously not a high priority task but since the vandal has recently been very active at doing this this month, this might serve to discourage them. Thank you. Liz Read! Talk! 01:18, 31 March 2022 (UTC)

Those who have admin goggles, can see on Portal talk:Current events/2007 September 21 that they have done this repeatedly, at least on this page, going back to 2013. Liz Read! Talk! 01:23, 31 March 2022 (UTC)
A search reveals four shorter travelogues. An edit filter might prevent further journeys. Certes (talk) 10:50, 31 March 2022 (UTC)
No further elevator pitches have appeared for a while, so I've blanked them. Certes (talk) 14:18, 11 April 2022 (UTC)

The Law of One:Ra materials

{{resolved}}

  • Task: prevent addition of references to this fringe book[45]
  • Reason: An editor who states that "I am a scientist, among other things, that has discovered how to interact with the proverbial "akashic" record in this modern era." and "A month or so ago I followed up on messages received via channeling communication that led me to the ra materials on Wikipedia." has said that there are infinite IP addresses they can edit from. I've blocked the editor. Edit filter would have to be more than "the law of one" as that's not an unusual phrase.
  • Diffs: See [46] and [47]

If this isn't feasible/reasonable, no problem. Doug Weller talk 12:01, 16 March 2022 (UTC)

Tracking. ProcrastinatingReader (talk) 11:36, 21 March 2022 (UTC)
@Doug Weller: do you know if this vandal is still active? I'm seeing no filter hits in 2 (hist · log) -- could be the pattern is too narrow. ProcrastinatingReader (talk) 10:11, 13 April 2022 (UTC)
@ProcrastinatingReader: I don't think they are, let's drop this request. Doug Weller talk 10:36, 13 April 2022 (UTC)
@Doug Weller: No worries, let me know if they pop up again. ProcrastinatingReader (talk) 11:17, 13 April 2022 (UTC)

Insulting emoji

{{resolved}}

Quandale Dingle vandalism

{{resolved}}

  • Task: When an IP or non-autoconfirmed editor adds the phrase "Quandale Dingle" into an article, tag it as possible vandalism.
  • Reason: It would make countervandalism easier. Quandale Dingle is an internet meme; because of this, it has often been used for vandalism (similar to 'deez nuts' or 'amogus').
  • Diffs: Special:Diff/1083030694 This is the most recent diff I could find without rummaging through my revert history, but I promise you that this is not the only one I have come across.

Helen(💬📖) 16:21, 16 April 2022 (UTC)

I popped it in 1 (hist · log) for a little while to test. Not sure if we have a general vandalism filter for this; could maybe add it into 11 or (if appropriate, and after further testing for FPs) into the disallow 614. ProcrastinatingReader (talk) 14:07, 17 April 2022 (UTC)
Are there any legitimate uses for this phrase? Also, that log is full of crossed out italic usernames and crossed out IPs, and after going through the links, I don't see anything even remotely constructive. Mako001 (C)  (T)  🇺🇦 05:34, 30 April 2022 (UTC)
@Mako001: Looking through the filter too, I can’t find a single edit in the log that could have been constructive. Off the top of my head, I can’t think of any legitimate uses of “Quandale Dingle.” Signed,The4lines |||| (Talk) (Contributions) 05:47, 30 April 2022 (UTC)
@ProcrastinatingReader: Any objections if I add this to 614 now? Looks like every edit that saved has been reverted. Suffusion of Yellow (talk) 21:42, 11 May 2022 (UTC)
@Suffusion of Yellow: No, go ahead, looks good to me. ProcrastinatingReader (talk) 23:23, 11 May 2022 (UTC)
And  Done. Suffusion of Yellow (talk) 23:34, 11 May 2022 (UTC)
Vandalism ongoing now at Battle of Savage's Station. Certes (talk) 22:19, 11 May 2022 (UTC)

SCP vandalism?

This is infrequent enough that I'm not really sure if an edit filter is justified, but would it be worth it to have a filter that tags additions of strings like "keter", "overseer council", "secure, contain, protect" etc. to pages with SCP in the title other than SCP Foundation and SCP: Containment Breach? See, for example, [49], [50]. casualdejekyll 03:17, 11 March 2022 (UTC)

That's one of the weirdest types of vandalism I've ever seen. Stifle (talk) 14:10, 4 April 2022 (UTC)
It's not all that different to something like what happened with Bishop Auckland. Internet memes are inevitable, and Wikipedia is on the internet. casualdejekyll 23:37, 20 April 2022 (UTC)

Number vandalism

Was helping out another project, anyone got a good tip for this case: They are trying to stop subtle numeric vandalism. I was thinking possibly something along the lines of comparing only the numbers in the old to the numbers in the new to see if any changed. I would expect a ton of FP's here.

Pseudocode:

(user_age == 0) &
(summary == '') &
(action == 'edit') &

(
 (FROM: removed_lines - extract and concatenate just [0-9])
 !=
 (FROM: added_lines - extract and concatenate just [0-9])
)

Think this would be too "expensive" on any busy project as well, any thoughts? — xaosflux Talk 15:25, 12 April 2022 (UTC)
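To make the idea in the pseudocode concrete, here is a minimal Python sketch of the "extract and concatenate just [0-9]" comparison. It only illustrates the intended check outside the filter language; the function names are hypothetical, and the account-age, summary and action conditions from the pseudocode are left out.

```python
import re


def digits_only(text: str) -> str:
    """Drop every character that is not 0-9 and join what remains."""
    return re.sub(r"[^0-9]", "", text)


def numbers_changed(removed_lines: str, added_lines: str) -> bool:
    """True when the digit sequence differs between old and new lines."""
    return digits_only(removed_lines) != digits_only(added_lines)
```

Note that this fires on any edit that adds or removes a number at all, which is one reason to expect many false positives from the check on its own.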

Pinging Crow who has done something useful in this area. Certes (talk) 17:15, 12 April 2022 (UTC)
@Xaosflux: How do we extract and concatenate just [0-9]? I'd love a str_replace_regexp() function; then we could say str_replace_regexp(added_lines, "[^0-9]", ""). But I don't see how to do this with just str_replace(). Suffusion of Yellow (talk) 19:29, 12 April 2022 (UTC)
@Suffusion of Yellow - I think I'm hoping for a function that doesn't exist here too, just checking if I'm missing something! — xaosflux Talk 19:53, 12 April 2022 (UTC)
Does string(get_matches("[0-9]+", text)) suffice? Certes (talk) 19:58, 12 April 2022 (UTC)
That would get the first number. Not necessarily the first number that was changed; just the first to appear in text. Which might be better than nothing, I guess. Suffusion of Yellow (talk) 20:09, 12 April 2022 (UTC)
Ah yes, it's almost useless. I was fooled by the plural name and the description "Looks for matches of the regex needle ... in the haystack"; it actually looks for only one match. Certes (talk) 20:23, 12 April 2022 (UTC)
Not proud of this, but I guess we could say:
parts := "(?s)(\D*)(\d*)(\D*)(\d*)(\D*)(\d*)(\D*)(\d*)(\D*)(\d*)(\D*)(\d*)(\D*)(\d*)(\D*)(\d*)(\D*)(\d*)(\D*)(\d*)(.*)";

old := get_matches(parts, removed_lines);
new := get_matches(parts, added_lines);

text_old := (old[1] + old[3] + old[5] + old[7] + old[9] + old[11] + old[13] + old[15] + old[17] + old[19] + old[21]);
text_new := (new[1] + new[3] + new[5] + new[7] + new[9] + new[11] + new[13] + new[15] + new[17] + new[19] + new[21]);
nums_old := (old[2] + old[4] + old[6] + old[8] + old[10] + old[12] + old[14] + old[16] + old[18] + old[20]);
nums_new := (new[2] + new[4] + new[6] + new[8] + new[10] + new[12] + new[14] + new[16] + new[18] + new[20]);

text_old == text_new & nums_old != nums_new
Which would check the first ten numbers. I don't think that + uses up conditions, so we could even go for more than ten, in theory. No idea about the time this will take. Suffusion of Yellow (talk) 20:38, 12 April 2022 (UTC)
Simplified a bit (now checking 50 numbers):
not_num := "(?s)^(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(\D*)\d*(.*)$\K";

get_matches(not_num, removed_lines) == get_matches(not_num, added_lines) &
added_lines != removed_lines
Tempted to just try this. Suffusion of Yellow (talk) 23:48, 12 April 2022 (UTC)
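The capture-group trick above can be tried outside the filter with a short Python model. This is a sketch under assumptions: the pattern is built programmatically rather than written out, Python's `re` has no `\K`, so the groups are compared directly instead of comparing `get_matches` arrays, and the function names are hypothetical.

```python
import re


def non_digit_parts(text: str, max_numbers: int = 50) -> tuple:
    """Capture the non-digit runs around the first max_numbers numbers."""
    # Each repetition captures one run of non-digits and skips one number;
    # the final group captures whatever is left over, digits included.
    pattern = "(?s)^" + r"(\D*)\d*" * max_numbers + r"(.*)$"
    return re.match(pattern, text).groups()


def only_numbers_changed(removed_lines: str,
                         added_lines: str,
                         max_numbers: int = 50) -> bool:
    """True when the lines differ but all non-digit text is identical."""
    return (
        removed_lines != added_lines
        and non_digit_parts(removed_lines, max_numbers)
        == non_digit_parts(added_lines, max_numbers)
    )
```

Because digits and non-digits never overlap, each group has only one way to match, so the pattern should not backtrack heavily even with many repetitions.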
Well testwiki loves tests (just disable that monster when not actively testing it!) — xaosflux Talk 00:56, 13 April 2022 (UTC)
See testwiki:Special:AbuseFilter/190. I really doubt this will be all that expensive. grep -P scans through a 100MB enwiki database dump in less than a second, while the pattern from filter 384 takes about 8 seconds. Suffusion of Yellow (talk) 03:19, 13 April 2022 (UTC)
I think I suggested str_replace_regexp to Daimona Eaytoy before on IRC and (iirc x2) they weren't opposed, but just mentioned it wasn't too common to see use cases needing it. ProcrastinatingReader (talk) 10:08, 13 April 2022 (UTC)
Indeed, I kinda remember that request. At any rate, I confirm that I would gladly accept that change. I also recognize that no better solution currently exists. --Daimona Eaytoy (Talk) 11:22, 13 April 2022 (UTC)
Seems I popped it into a phab task afterwards (T285468), forgot about that. I submitted a Gerrit patch for this: 783448. ProcrastinatingReader (talk) 15:16, 17 April 2022 (UTC)
@ProcrastinatingReader, has the patch been merged yet? I haven't checked. - Klein Muçi (talk) 11:19, 22 April 2022 (UTC)
Not yet - I had to make a few changes. Hopefully soon though! ProcrastinatingReader (talk) 12:01, 22 April 2022 (UTC)
This is going to catch a lot of drive-bys who think it's amusingly original to replace "123 is an integer..." by "420 is an integer..." or similar. Lots get caught by an existing filter but I still revert those that slip through regularly. If there aren't too many false positives then it could be a real time-saver. Certes (talk) 12:47, 13 April 2022 (UTC)
I suspect it'd be full of FPs, at least initially. Even when I do RCP, this is a common type of vandalism but there are many times when the IP is actually fixing dates, and the current data is bad. Maybe combined with edit summaries like "fixed typo" (633) would narrow down the FPs(?), for example ProcrastinatingReader (talk) 23:25, 13 April 2022 (UTC)
Not even that it seems; I tested against most recent 3 past hits of 633; two were updating data, and one was fixing an actual error. ProcrastinatingReader (talk) 23:27, 13 April 2022 (UTC)
@ProcrastinatingReader, I'd suggest you read this discussion which deals exactly with what you just wrote and which was the inspiration of the discussion that we're having. - Klein Muçi (talk) 23:36, 13 April 2022 (UTC)
Right on cue, here is a typical example. Certes (talk) 22:15, 14 April 2022 (UTC)

Possible username policy violation: Set to 𝐖𝐚𝐫𝐧

  • Task: catch "your mom" or toilet humor usernames. Maybe some profanities, stuff like that
  • Reason: often, people don't realize such usernames are against policy and end up getting an avoidable indef block.
  • Diffs: See WP:UAA, where such usernames are relatively common.

67.21.154.193 (talk) 15:29, 20 April 2022 (UTC)

  • As someone who has worked UAA for over a decade, I agree this is a real problem, but I'm not sure an EF is the solution. I'd rather we provided better advice about the username policy when creating an account. (Although I'd also note that unless they are vandalizing or spamming, most username blocks are "soft" blocks, meaning they can simply try again with a different name.) Beeblebrox (talk) 17:47, 26 April 2022 (UTC)
    Yeah, if it's just a minor username issue, then as Beeblebrox said, they can always try again. Otherwise, it's better if they made accounts with usernames like "YurMom69420" as they can then get indeffed (sometimes before even vandalising, whilst still giggling about their username) and the autoblocker can then make life difficult for them. Any account with a username which would get flagged by that filter will almost certainly be a vandalism-only account anyway, so it's better to get them hardblocked ASAP. Mako001 (C)  (T)  🇺🇦 05:12, 30 April 2022 (UTC)

Is a filter needed to stop abuse of the "big" markup?

This isn't really a proper filter request, but more of a question about if a filter might be needed. The <big></big> markup can be nested indefinitely, and ends up ridiculously large very quickly. I was surprised to note that there isn't an abuse filter preventing non-autoconfirmed users from doing this outside of their own userspace or the sandbox. See diffs below for an example of the sort of abusive use that I am thinking of. With emoji, with text. Basically just replace the text with some array of choice words, and the emoji with something more insulting. These were both me on an alternate account, testing to see if any filters would trip, as whilst I couldn't see any public filters that would trip, I wasn't sure if a private filter might. However, nothing tripped, not even merely to log it. Might this be a problem? Mako001 (C)  (T)  🇺🇦 14:19, 26 April 2022 (UTC)

I guess the question would probably be how big is too big? In article space I see no reason to have more than one, or maybe two in rare cases. (Does MOS have anything to say on that?)
warning, below is newbie nonsense and is likely to be incorrect
page_namespace == 0 & ( added_lines contains "<big><big>" ) would likely do it (not including the non-confirmed check) - I'm not sure if this requires a PST check or not. EDIT: Actually, reversing that and doing the big-big check first would likely save conditions, unless it needs PST. casualdejekyll 16:35, 26 April 2022 (UTC)
I feel like this is not a widespread problem requiring an edit filter. I've seen it misused a few times over the years, but it's not that .....big... of a deal. Beeblebrox (talk) 17:36, 26 April 2022 (UTC)
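On the "how big is too big?" question, a literal "<big><big>" substring check misses nesting with text or whitespace between the tags. The Python sketch below counts nesting depth instead. It models the check only and is not filter code; the function names and the default limit of two levels are assumptions taken from the comment above.

```python
import re

# Matches <big> and </big>, ignoring case and stray spaces before ">".
BIG_TAG = re.compile(r"<(/?)big\s*>", re.IGNORECASE)


def max_big_depth(wikitext: str) -> int:
    """Return the deepest level of nested <big> tags in the text."""
    depth = 0
    deepest = 0
    for match in BIG_TAG.finditer(wikitext):
        if match.group(1):            # closing tag
            depth = max(depth - 1, 0)
        else:                         # opening tag
            depth += 1
            deepest = max(deepest, depth)
    return deepest


def too_much_big(namespace_id: int, added_lines: str, limit: int = 2) -> bool:
    """True for mainspace additions nesting <big> deeper than the limit."""
    return namespace_id == 0 and max_big_depth(added_lines) > limit
```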
Thanks Beeblebrox, if you don't think that it is something needing a filter that's enough for me. I guess we can close this off now. Mako001 (C)  (T)  🇺🇦 03:43, 27 April 2022 (UTC)
@Mako001: This was part of 384 (hist · log) for years; I removed it in 2019 because I couldn't find a single match in 5000 hits. If this is a problem (in mainspace), I can restore it without testing; it hadn't caused any false positives, at least. But I don't like disallowing hypothetical or rare strings; eventually so much cruft accumulates that the filter becomes unmaintainable. Suffusion of Yellow (talk) 18:09, 27 April 2022 (UTC)
It was also in 31 briefly but got removed because it completely prevented some guy from using the Talk namespace casualdejekyll 01:12, 6 May 2022 (UTC)
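
For anyone wanting to experiment offline, the nested-big check discussed above can be roughly sketched in Python. This is only an illustration and not AbuseFilter syntax; the helper names and the threshold of two are assumptions, and it only counts directly nested opening tags.

```python
import re

# Hypothetical helper: matches a run of directly nested <big> opening tags.
BIG_RUN = re.compile(r"(?:<big>\s*)+", re.IGNORECASE)
BIG_TAG = re.compile(r"<big>", re.IGNORECASE)


def max_big_nesting(added_lines: str) -> int:
    """Return the length of the longest run of consecutive <big> opening tags."""
    runs = BIG_RUN.findall(added_lines)
    return max((len(BIG_TAG.findall(run)) for run in runs), default=0)


def would_trip(page_namespace: int, added_lines: str, threshold: int = 2) -> bool:
    """Mainspace only (namespace 0), as suggested in the thread; threshold is an assumption."""
    return page_namespace == 0 and max_big_nesting(added_lines) >= threshold
```

A single pair of big tags stays under the threshold, so ordinary use would not trip this sketch.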

IP user rapidly reverting edits

Would it be possible and useful to extend 249 (non-autoconfirmed user rapidly reverting edits) to IPs with this sort of editing pattern? I'd expect some newly registered accounts to do it ten times to win a prize (allow four days for delivery), but even from an IP it disruptively clogs the page histories. Certes (talk) 20:21, 21 April 2022 (UTC)

obligatory newbie disclaimer - if anything I say here is wrong, the trout is available for extensive usage.
The current filter is based off of edit summaries, which manual reverts aren't going to be covered by in all cases. And I'm pretty sure we can't take the tag. That's done after the filters are all done IIRC. It's not really possible to check what's happened in previous edits either, or if it is, it would be quite slow. So possible is likely a no.
Useful? That's more debatable... how often even IS this sort of thing?
I don't even think it's possible to check for repeated, quick edits to the same page.. or if it is, it would require a filter that trips on literally every edit by a user without the confirmed group, which sounds to me like a bad idea inherently. The "bad idea" factor is compounded by how that category of "repeated, quick edits", depending on how you define it, could easily include good faith stuff.
I could be wrong on literally all of this, though. casualdejekyll 21:09, 21 April 2022 (UTC)
I think (not sure, someone else might know for sure) that this filter was a reaction to a sort of vandalism where a new user or IP just "derps" all recent changes, probably using a huggle based script that also drops a ridiculous level 4im warning onto everyone's talkpage after reverting, I've seen one do something like 150 edits in under two minutes before being blocked. It already works on IPs, as IPs are not autoconfirmed. I used to get it when I did RC patrol as an IP. It's *[bleep]* annoying (and actually got me blocked once, though I never realised at the time as the IP was reallocated before I tried to edit again a few days later). The way to deal with the sort of thing you describe above though is deliver them an editing tests warning, as they shouldn't be doing tests in mainspace. Start with uw-sandbox and then test2, test3, vand4, and then AIV for a bit of time out (if they still don't get the message).
Also, it shouldn't be too difficult to have a filter that trips for repeated, quick edits to the same page, but as you pointed out, the issue would be that it would have a massive number of false positives, or it would only catch the very fastest of bots. Mako001 (C)  (T)  🇺🇦 14:52, 26 April 2022 (UTC)
@Certes, Casualdejekyll, and Mako001: Worth a try. It's not like it's going to take hours of fussing over a tricky regex. See 1199 (hist · log). Anyway, casualdejekyll, mostly correct. The filter can't "see" tags, so there's no way to trip on exactly what that IP was doing. So I'm just logging, as you said, all non-confirmed edits that exceed a rate limit (over 8 edits in 300 seconds). If it's too spammy, it might be worthwhile to limit this to new accounts, and exclude IPs, because anyone editing at that rate is almost certainly not "new". Or, it might be worth only tagging editors who hit N different pages in M seconds. I realize that's exactly what the example IP wasn't doing, but again it's far more suspicious. Sometimes people don't use preview, and take a dozen attempts to fix their own mistakes. Suffusion of Yellow (talk) 18:45, 27 April 2022 (UTC)
Thanks. "Of the last 1,525 actions, this filter has matched 335 (21.97%)": I hope that just means 22% of edits are by the unconfirmed, rather than 22% are rapid fire! Certes (talk) 18:49, 27 April 2022 (UTC)
That is referring to all unconfirmed edits, @Certes. The filter has only actually tripped 1 rapid fire edit so far, Special:AbuseLog/32467726 casualdejekyll 18:51, 27 April 2022 (UTC)
Ooh, you got a hit! An IP is systematically creating Category talk: pages. Certes (talk) 18:51, 27 April 2022 (UTC)
Just a suggestion, but maybe the private "new account suspicious activity" filter may have something in it which could be modified to assist with this? I don't know what that filter is exactly, but I have seen it pick up on stuff like EC gaming and such, but it apparently only looks for edits by registered users. Mako001 (C)  (T)  🇺🇦 10:53, 28 April 2022 (UTC)
@Suffusion of Yellow - Can you remove it from User and User Talk namespaces (lots of The Wikipedia Adventure)? casualdejekyll 01:17, 6 May 2022 (UTC)
@Casualdejekyll: Yeah, limited it to mainspace and templates. Suffusion of Yellow (talk) 21:15, 10 May 2022 (UTC)
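
The "over 8 edits in 300 seconds" rate limit mentioned above can be illustrated with a small sliding-window counter in Python. This is a sketch of the idea only, not how the AbuseFilter throttle is actually implemented; the class and method names are hypothetical.

```python
from collections import defaultdict, deque


class EditRateTracker:
    """Hypothetical sliding-window counter; defaults mirror the numbers quoted in the thread."""

    def __init__(self, max_edits: int = 8, window_seconds: int = 300):
        self.max_edits = max_edits
        self.window_seconds = window_seconds
        self._recent = defaultdict(deque)  # user -> timestamps of recent edits

    def record_edit(self, user: str, timestamp: float) -> bool:
        """Record an edit; return True if the user now exceeds the limit."""
        recent = self._recent[user]
        recent.append(timestamp)
        # Drop edits that have fallen out of the window.
        while recent and timestamp - recent[0] >= self.window_seconds:
            recent.popleft()
        return len(recent) > self.max_edits
```

Counting per (user, page) pair instead of per user would give the "N different pages in M seconds" variant a starting point, though that would need a different data structure.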

Changing height tag needs fixing

This is from the edit history of 95.70.214.241 (talk · contribs), who appears to be a certain LTA. The height of eight athlete BLPs was changed by one or two cm. Only three of the edits had "(Tags: ... changing height and/or weight)". Would it be possible to have all such changes tagged? It could help with wp:RCP. Thank you! Adakiko (talk) 07:31, 20 April 2022 (UTC)

All these FNs are caused by ! ( page_title in added_lines ) in filter 391 (hist · log). Maxim, I realize I'm asking you about something from 11 years ago, but any idea what that's doing? I can't figure it out. It effectively prevents the filter from tagging any time there's already a reference on the same line with article's subject in the title. But people change referenced figures all the time. Suffusion of Yellow (talk) 21:08, 10 May 2022 (UTC)
Hi Suffusion of Yellow. It's interesting that this question arises now, as I've been thinking about the fact that I still have the edit filter manager flag even though it feels like I last made edits to the filters roughly when I did this one, that is, 11 years ago.
At the risk of being pedantic, the original filter didn't have ! ( page_title in added_lines ) but instead ! article_text in added_lines, which was changed by Zzuuzz in 2019 here. But, article_text and page_title have the same meaning per documentation, but the former is deprecated. I don't recall why that line was in there—it could be something as banal as thinking that article_text was something that it is not. I am fairly sure that I used a different filter as a starting point for this one; it's possible that this line was germane to the other filter, but I really don't remember what the other filter could have been, nor why the line is here now.
I've just tested the filter with and without the offending line (line 3), and removing it seems to pick up the untagged edits (which were not caught because the lines in question had a reference which contains the title of the article). I think the offending line can just be deleted, but I'm interested to see whether you have any thoughts on this.
As a final note, I find it very heartening that this filter, generally with the same rules as when it was written, is still useful after 11 years. :-) Maxim(talk) 00:03, 11 May 2022 (UTC)
Thanks,  Done. Suffusion of Yellow (talk) 21:35, 11 May 2022 (UTC)
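
To illustrate why ! ( page_title in added_lines ) caused the false negatives described above, here is a rough Python sketch. The height regex is a hypothetical stand-in and is not the actual logic of filter 391; only the title check mirrors the condition discussed.

```python
import re

# Hypothetical stand-in for "a height value changed"; NOT filter 391's real rules.
HEIGHT = re.compile(r"\|\s*height\s*=\s*([^|\n]*)", re.IGNORECASE)


def height_changed(added_lines: str, removed_lines: str) -> bool:
    added = HEIGHT.search(added_lines)
    removed = HEIGHT.search(removed_lines)
    return bool(added and removed and added.group(1).strip() != removed.group(1).strip())


def would_tag(page_title: str, added_lines: str, removed_lines: str,
              keep_title_check: bool) -> bool:
    # The condition under discussion: skip whenever the title appears in the added line,
    # which happens if a reference on the same line names the article's subject.
    if keep_title_check and page_title in added_lines:
        return False
    return height_changed(added_lines, removed_lines)
```

With the title check kept, a changed height on a line whose reference title contains the article name is skipped; with the check removed, the same edit is tagged.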

How'd the repeat filter not catch this?

this and this seem rather obvious instances... RandomCanadian (talk / contribs) 22:17, 26 April 2022 (UTC)

Does the filter you had in mind (1163?) only check the article namespace? Certes (talk) 22:36, 26 April 2022 (UTC)
@Certes: Looks like it does. Is there any reason not to extend it to talk pages, beyond the obvious "well, they get less traffic"? That kind of edit is still the kind of stuff that's so universally useless that there's no point to allow it or pollute even talk page histories with it. RandomCanadian (talk / contribs) 02:33, 27 April 2022 (UTC)
I'd say extend to everything but sandboxes, also see this for one too. If active in talk pages (at least) it would stop a good deal of nonsense. Mako001 (C)  (T)  🇺🇦 03:48, 27 April 2022 (UTC)
I set 1163 to mainspace only because the filter catches a lot of hits, and because I haven't managed to narrow down FPs to make disallow or DatBot appropriate, it needs to be manually checked to be useful. If it gets tons of hits (due to other namespaces) I figure it will just turn into those log-only filters that never get checked. ProcrastinatingReader (talk) 11:37, 30 April 2022 (UTC)
@ProcrastinatingReader: This probably has a really obvious answer, but would it reduce the FPs enough to disallow and/or Auto-report if another filter existed that only triggered when it repeated more than, say, six times? Mako001 (C)  (T)  🇺🇦 14:12, 14 May 2022 (UTC)

Repeated emojis

I'm surprised ClueBot didn't catch this edit, but seems it didn't. Could we make an edit filter for additions of repeated strings like it? {{u|Sdkb}}talk 02:51, 28 April 2022 (UTC)

Ha, related to my report above... Are emojis also not covered by Special:AbuseFilter/1163? Two improvements for the price of one, I say... RandomCanadian (talk / contribs) 03:24, 28 April 2022 (UTC)
It does, but that's template namespace. It would've caught that edit in mainspace. ProcrastinatingReader (talk) 11:38, 30 April 2022 (UTC)
If it's not too expensive, most mainspace-only filters might benefit from covering templates too. Apart from oddities like Template/Did you know nominations/..., they can do more damage. Certes (talk) 12:01, 30 April 2022 (UTC)

Expand the "poop" filter to include "poo poo"

  • Task: Stop edits adding this string, by adding it to a "disallow" filter (such as the existing "poop" filter)
  • Reason: Because the poop vandalism filter doesn't catch it, and this is quite common.
  • Diffs: [51]

Can this string ("poo poo") be added to the ones prevented by the "poop" filter? I see virtually no legitimate use for this string in mainspace. 💩 Mako001 (C)  (T)  🇺🇦 12:20, 12 May 2022 (UTC)

just another example. Mako001 (C)  (T)  🇺🇦 10:44, 18 May 2022 (UTC)

should we revive Special:AbuseFilter/402 with a warning message similar to MediaWiki:abusefilter-warning-AfC-unsourced-submissions?

The filter here (for unreferenced articles) was deleted back in 2013 [52] because it apparently had no purpose. Now, a warn+tag filter exists for completely unsourced AfC submissions, see here. I think reintroducing filter 402 (with warn) would help against non-notable or spam creations, as well as make new users add more reliable sources.

Also, are the above "pronoun change" and "adding death" filters going to be implemented?

67.21.154.193 (talk) 15:34, 2 May 2022 (UTC)

Perhaps #964 should be extended to mainspace? casualdejekyll 01:11, 6 May 2022 (UTC)

Pinging @Tamzin: since she's an edit filter manager, and no EF manager has responded to any section below this one yet. 67.21.154.193 (talk) 13:41, 30 May 2022 (UTC)

False GA/FA tags

  • Task: Warn when articles are created with Good Article or Featured Article tags already attached.
  • Reason: To prevent editors who don't understand GA/FA procedure from misusing the tags, and to stop malicious use of the same.
  • Diffs: Most recent one I could find (I removed the tag later): Special:Diff/1090058919

Sumanuil. 05:48, 27 May 2022 (UTC)

Is this a common issue? I can't imagine it happens particularly often and these kinds of non-urgent problems are likely to be picked up as part of new page patrol. Sam Walton (talk) 21:25, 30 May 2022 (UTC)
Not sure how common, but it shouldn't be happening at all. Sumanuil. 03:10, 31 May 2022 (UTC)
There is Special:AbuseFilter/716, but it only catches non-autoconfirmed accounts. 67.21.154.193 (talk) 12:16, 2 June 2022 (UTC)

pronoun changes

  • Task: changes of pronouns in articles "he" to "she", "they" to "he", etc
  • Reason: often a MoS violation when transgender BLP subjects are involved. Saw a request in the archives of the pages but with no response
  • Diffs: one example

67.21.154.193 (talk) 15:29, 20 April 2022 (UTC)

Previous comment: Wikipedia:Edit_filter/Requested/Archive_18#Changes_of_pronoun started by User:Valereee. No one responded 67.21.154.193 (talk) 13:52, 27 April 2022 (UTC).
Probably something with similar logic to Special:AbuseFilter/1154 would work. Will work on this. Galobtter (pingó mió) 20:32, 2 May 2022 (UTC)
Tamzin and Firefly already started on this, see private filter 1200 (hist · log). Suffusion of Yellow (talk) 20:35, 2 May 2022 (UTC)
@Suffusion of Yellow Thanks for letting me know. Galobtter (pingó mió) 21:40, 2 May 2022 (UTC)
seems to be public now and getting a good amount of hits. thx! 67.21.154.193 (talk) 13:44, 30 May 2022 (UTC)
@Tamzin: I think there are edits in India Willoughby that should have been hit by the filter but weren't. Maybe someone should fix that? 67.21.154.193 (talk) 13:27, 6 June 2022 (UTC)

Turkey / Türkiye

  • Task: Log only: when a user changes "Turkey" to "T[u|ü]rkiye". Article namespace only.
  • Reason: Turkey has (officially) changed its name to Türkiye, however per Talk:Turkey#Requested_move_3_June_2022, the overwhelming consensus is that COMMONNAME should apply and our article remain at Turkey. There have already been a number of attempts to change the name of the country (and even to move a page) based on the new name, so it would be useful to log such changes when the RFC closes. Log only, as a minority may be valid (i.e. when the actual name of an organization includes the Turkish name). Black Kite (talk) 11:46, 4 June 2022 (UTC)
@Black Kite: Simple initial attempt at this logging at Special:AbuseFilter/1207. Sam Walton (talk) 08:52, 6 June 2022 (UTC)
Thanks! Black Kite (talk) 13:12, 6 June 2022 (UTC)
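
As an illustration of the check requested above (not the actual rules of filter 1207, which are not quoted here), a change from "Turkey" to "Turkiye" or "Türkiye" could be detected along these lines in Python. The function name is hypothetical, and normalizing to NFC first is an assumption so that a decomposed ü (u plus a combining diaeresis) is also caught.

```python
import re
import unicodedata

# Either spelling of the new name; the character class holds just "u" and "ü".
TURKIYE = re.compile(r"T[uü]rkiye")


def turkey_renamed(removed_lines: str, added_lines: str) -> bool:
    """True if 'Turkey' was removed and the new name appears only in the added text."""
    removed = unicodedata.normalize("NFC", removed_lines)
    added = unicodedata.normalize("NFC", added_lines)
    return (
        "Turkey" in removed
        and TURKIYE.search(added) is not None
        and TURKIYE.search(removed) is None
    )
```

Because valid uses exist (for example organization names that include the Turkish name), a check like this suits logging rather than disallowing, as the request says.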

Racist labeling of political leaders and historical figures

Here are the two most recent examples:

This has been happening for several months, perhaps back to last year. I've seen various combinations of the wording "white supremacist" and "racist " edited into political articles, both currently serving individuals and historical figures. These would be edits made after the article was already created. Not limited by geographical area, time period, living or deceased office holders. Can we create a bot that blocks these? And once created, can we update it if a new similar term begins happening? — Maile (talk) 22:40, 30 September 2021 (UTC)

I just blocked Special:Contributions/2600:1700:12E1:A090:0:0:0:0/64 for a year as it appears every edit from there has been junk for a long time. That covers several examples of what you describe although I don't know if there are more from other IPs. Johnuniq (talk) 23:21, 30 September 2021 (UTC)
Well, that's helpful. Thanks. I don't know if it's been this one IP or not. But I can date this phenomenon to beginning after the BLM events of the last year or two. For whatever reason, one or more editors have been motivated to label BLP and deceased individuals, or geographical areas as, racist, by one term or another. — Maile (talk) 23:59, 30 September 2021 (UTC)
@Maile66: Started testing at 1014 (hist · log). Just checking for "racist" or "supremacist" for now. I'll add a check for biographies later. Any other words?
FYI, I doubt this could ever be refined to the point where it's possible to disallow. Yes, all filters have false positives, but I'm worried about what message we'll be perceived as sending if we stop "X was the target of racist taunts on the field", etc. Suffusion of Yellow (talk) 22:38, 11 October 2021 (UTC)
@Suffusion of Yellow: no other words come to mind. I keep hoping this type of editing will fade on its own, but I doubt so in my lifetime, because a lot of it is fed by national-international media reports. Not necessarily limited to the United States, or any other country. — Maile (talk) 22:44, 11 October 2021 (UTC)
@Maile66: I added both words to 189 (hist · log) (tag-only). This is actually really common; see the log of 1014 (hist · log). I still might try to work out a disallowing filter if possible. I just don't feel comfortable with stopping Senator McSenatorface resigned after admitting to sending hundreds of racist texts...<ref><ref><ref>. It looks like we're whitewashing. So maybe I'll just disallow very small edits without refs, e.g. Special:Diff/1053952115 and leave 189 to tag the rest. Suffusion of Yellow (talk) 23:43, 7 November 2021 (UTC)
@Suffusion of Yellow: Understood. My request here is more about the concern the past year or two of a pattern of adding blatant labeling to existing articles, usually in the opening sentences of a lead, and begin more or less, " ...(name) is a racist and white supremacist ... " without any sourcing indicating it as factual. I've never seen, " ...(name) is a racist and black supremacist ... " Or pick any color in between. — Maile (talk) 23:55, 7 November 2021 (UTC)
Well, here's one. But of course it's not as common. Suffusion of Yellow (talk) 18:55, 11 November 2021 (UTC)

Should this section still be pinned? It has been months since the last comment here. 67.21.154.193 (talk) 12:13, 14 June 2022 (UTC)

Unpinned. Sorry, Maile66, I could never work out a filter targeted enough to set to disallow. I moved the contents of my test filter 1208 (hist · log), at least. Suffusion of Yellow (talk) 23:14, 14 June 2022 (UTC)

should characters in the 33xx range and circled letters and numbers be included in the filter?

67.21.154.193 (talk) 15:25, 24 May 2022 (UTC)

see Enclosed Alphanumerics, Enclosed Alphanumeric Supplement and Enclosed CJK Letters and Months unicode blocks. Edit: also CJK Compatibility 67.21.154.193 (talk) 17:01, 24 May 2022 (UTC)
Also some at Dingbats. 67.21.154.193 (talk) 13:18, 2 June 2022 (UTC)


Vaguely related: are "funny" Greek letters being normalised to, er, normal Greek letters? This looks like an FP: Special:AbuseLog/32463333 (06:00, 27 April 2022: Ωχγ triggered filter 1,168). Certes (talk) 16:03, 24 May 2022 (UTC)

It seems like the ohm symbol is now removed bc of this. This is the same filter 67.21.154.193 (talk) 17:00, 24 May 2022 (UTC)
Yes. The Angstrom symbol, Kelvin symbol and ohm symbol all normalised to regular characters, so I had to remove them to avoid false positives. — The Anome (talk) 15:15, 8 June 2022 (UTC)

Also:
Change:ℬ and ℭ should have a pipe | between them. This is on the line that starts with “(accountname rlike "℀|℁|ℂ|℃|℄|℅|℆|ℇ|℈|℉…”
Add (test for false positives by unicode normalization first): ªº (ordinal indicators), ₐₑₒₓₔₕₖₗₘₙ₊₋₌₍₎ₚₛₜⱼ (subscript), ꬲꬽꬾ (blackletter), ⁱⁿ⁺⁻⁼⁽⁾ (superscript), ⱻꜰɢʛʜɪʟɴɶʀꝶꜱʏꭥꞮꟸᶦᶧᶫᶰʶᶸ(small caps/modifier) ᶛᶜᶝᶞᶟᶠᶡᶢᶣᶤᶥᶨˡᶩᶪᵚᶬᶭᶮᶯᶱᶲᶳᶴᶵᶶᶷᶹᶺᶻᶼᶽᶾᶿꟹꭟʰʱʲʳʴʵʷˣʸꭜꭝꭞ (modifiers) ⅠⅡⅢⅣⅤⅥⅦⅧⅨⅩⅪⅫⅬⅭⅮⅯⅰⅱⅲⅳⅴⅵⅶⅷⅸⅹⅺⅻⅼⅽⅾⅿↀↁↂↃↅↆↇↈ (roman numerals)
(more small caps and modifiers --->) ꟲ (unicode A7F2) ꟳ (unicode A7F3) ꟴ (unicode A7F4) 𝼂 (unicode 1DF02) 𝼄 (unicode 1DF04) 𝼐 (unicode 1DF10)
Latin Extended-F unicode block (10780-107BF; bunch of modifier letters) 67.21.154.193 (talk) 12:14, 2 June 2022 (UTC)

Thank you. These strings are a nightmare to edit, because they break text rendering in the online editor. I'll take a look through your list and see what I can incorporate. I suspect some of these might already be caught by higher-level filters at the Mediawiki or global config level, see MediaWiki:Titleblacklist and [53], which are useful, but not comprehensive enough, as hits on Filter 1168 keep on demonstrating. — The Anome (talk) 15:19, 8 June 2022 (UTC)
Yeah, I think this would be better in the global blacklist, but the problem is if there are false positives, by unicode normalization, how would we know? It would also be more difficult to pinpoint which character exactly is causing the false positives. BTW, there are a bunch of likely problematic characters in the 2000-2bff and 1f000-1ffff range. 67.21.154.193 (talk) 14:20, 9 June 2022 (UTC)
Also, this would probably need a custom message if it were to be added to the title blacklist against usernames. 67.21.154.193 (talk) 15:08, 9 June 2022 (UTC)
@The Anome: Haven't looked into the changes suggested here, but I swapped out those literal characters with \x{...} escapes, which should make the filter easier to edit. Suffusion of Yellow (talk) 21:12, 13 June 2022 (UTC)
@Suffusion of Yellow: I'm never quite sure what regex format anything supports, so I'm glad to hear that \x{...} escapes work. Regarding normalization false positives, I can easily check for that with a bit of Python code: the ohm, Angstrom and Kelvin signs came as a bit of a surprise to me. Ultimately, it would be great if we could get these characters pushed into the top level Mediawiki filter, so we don't need to have this filter at all. UAX #31 might be our friend here. But we are already doing pretty well by just blocking the mathematical and IPA characters, as they are so popular with text obfuscators/prettifiers. — The Anome (talk) 22:00, 13 June 2022 (UTC)
Should we include ligatures (except W) in the set of unwanted characters? Certes (talk) 18:26, 14 June 2022 (UTC)
Maybe, but that would need testing first. Otherwise, we might risk blocking valid usernames by unicode normalization. Some ligatures (such as AE and OE) should definitely not be blocked. Also, is it time to get the characters in filter 1168 (as well as maybe circled letters) in the global blacklist (with a custom message)? 67.21.154.193 (talk) 13:22, 16 June 2022 (UTC)
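
The normalization check mentioned above ("a bit of Python code") might look something like the sketch below. It uses the standard unicodedata module; the function name is hypothetical, and the assumption that usernames go through NFC normalization before the filter sees them should be confirmed before relying on it.

```python
import unicodedata


def normalization_collisions(chars: str, form: str = "NFC") -> dict:
    """Map each candidate character to its normalized form, if normalization changes it.

    Any character listed here would make the filter hit usernames typed with the
    ordinary character it normalizes to, or never match at all.
    """
    collisions = {}
    for char in chars:
        normalized = unicodedata.normalize(form, char)
        if normalized != char:
            collisions[char] = normalized
    return collisions


# OHM SIGN, KELVIN SIGN, ANGSTROM SIGN, and a plain Greek capital omega for contrast.
candidates = "\u2126\u212a\u212b\u03a9"
print(normalization_collisions(candidates))
```

Running the same function with form="NFKC" over a proposed character list (ligatures, circled letters, modifier letters) shows which ones fold into ordinary letters under compatibility normalization and so need extra care.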

Claiming the death of an article subject

  • Task: section titled "death". claims in general that the subject is dead
  • Reason: This filter is needed for the same reason Special:AbuseFilter/712 is needed

67.21.154.193 (talk) 15:40, 20 April 2022 (UTC)

Might I add that it should trip to tag if there is a ref provided, but trips to warn or something if no reference is provided. Generally this would be a filter, active on BLP articles, that trips at the addition of "foo died" or "foo [wildcard, to allow for adjectives] passed away" or similar strings. I've seen a bit of this on RC patrol. Mako001 (C)  (T)  🇺🇦 14:34, 26 April 2022 (UTC)
The main reason I want this is to help prevent death hoaxes from appearing. If it gets overlooked early, it could end up being a while before someone notices. 67.21.154.193 (talk) 13:49, 27 April 2022 (UTC)
so.... is there gonna be any discussion? any action? 67.21.154.193 (talk) 15:22, 2 May 2022 (UTC)
Should I ping someone to this discussion? 67.21.154.193 (talk) 12:06, 31 May 2022 (UTC)
I decided that I am going to ping @Ohnoitsjamie: to this page because he's quite active, and numerous sections have not had any response from EF managers. 67.21.154.193 (talk) 14:37, 1 June 2022 (UTC)
I don't have time at the moment to work on that; I have written that kind of filter before; I'd want to do a lot of testing on it first as it's more complex than average. OhNoitsJamie Talk 13:57, 2 June 2022 (UTC)

Came across deleted filter 40, but it seems like it would require significant rework if we were to revive it. 67.21.154.193 (talk) 12:44, 8 June 2022 (UTC)

@Mako001: I really doubt this will be all that useful; there are just too many ways to say, or imply, that someone's dead. In addition to the usual euphemisms, there's also "he was shot by an angry fan", "he overdosed on heroin", "he was eaten by a grizzly bear", and so on and so on. But I'm testing a few common words in 1014 (hist · log); we'll see how much noise there is. Suffusion of Yellow (talk) 19:34, 26 June 2022 (UTC)
@Suffusion of Yellow: Yeah, I'd agree with you on that, it may just prove to be impossible to actually make a filter to check this. I guess it's worth a shot to see what your test filter gets though. Mako001 (C)  (T)  🇺🇦 03:58, 27 June 2022 (UTC)
  • Task: Use a custom warning message to notify a newer user or IP when they attempt to remove a url or ref with an edit summary including "dead url/ref/link" "404 error/message" "doesn't work" or any other string which would indicate that they are removing it because it is a wp:dead link. The message would be more friendly in tone, and would include links to guidelines about dealing with dead links, and a brief summary of those guidelines, something along the lines of "Instead of removing dead links, tag them with {{dead link}}..., try to find an archive yourself at one of these sites (add links to useful archive sites here) or, if you have an account, try using (IABot console link here)."
  • Reason: Many inexperienced users will (in good faith) remove broken links, thinking that they are of no use anymore, and not actually realise that it is a problem. Whilst "references removed" is helpful, this would be a narrower filter than that one, and is designed to offer advice and guidance to users who may not realise that their edits are possibly problematic. If alerted to the appropriate way of dealing with such links, I believe that most of these users would do so, as the responses I have got to uw-dead1 warnings whilst on RC patrol seem to be quite positive along the lines of "oh, thanks, I didn't realise I could fix that, I'll do that now".

Mako001 (C)  (T)  🇺🇦 05:50, 28 May 2022 (UTC)

@Mako001: That could also catch a certain spammer who replaces dead links by links to irrelevant content on a website they promote. Certes (talk) 10:55, 28 May 2022 (UTC)
I think I've seen that one before too, though it would probably need to be more complex than the idea that I had of just "lines removed contains <ref> or </ref>" and "edit summary contains dead link/ref/page/site or 404 or link broken/doesn't work/dead". Mako001 (C)  (T)  🇺🇦 11:05, 28 May 2022 (UTC)
@Mako001: No, the spammer uses an edit summary along the lines of "dead link". They believe (or want us to believe) that a link to their website is an improvement on a dead link. Certes (talk) 11:28, 28 May 2022 (UTC)
@Certes: Do they ever remove the ref tags? If not then it would probably need to be a more complex filter, but the issue you are referring to would be better handled by the spam blacklist anyway (as I understand). Did you want to propose an addition there? Mako001 (C)  (T)  🇺🇦 11:51, 28 May 2022 (UTC)
@Mako001: No, they leave everything unchanged except the URL. It's low volume and we have checks for the text they add, but if it grows then they can go on the blacklist. Certes (talk) 11:53, 28 May 2022 (UTC)
Hmm, if that was a filter, it would have to be a separate filter then, and an LTA one too, so it'd be best to not discuss it here. My idea is to just check if the lines removed contains ref tags or http(s):// and has an edit summary suggesting a dead link was removed, so it wouldn't catch their sort of edits. Mako001 (C)  (T)  🇺🇦 12:03, 28 May 2022 (UTC)
So: Lines removed would contain (or similar function if it would be lighter) "<ref>" and/or "</ref>" and/or "http(s)://" along with an edit summary containing "dead link/ref/page/site" "404" "does not/n't work" (and anything else that whoever hypothetically writes the filter can think of). Mako001 (C)  (T)  🇺🇦 04:08, 27 June 2022 (UTC)
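
The two conditions summarised above (removed lines containing ref tags or a URL, plus an edit summary hinting at a dead link) can be prototyped in Python before anyone writes real filter rules. This is a sketch only; the regexes and the function name are assumptions, and the summary phrases are just the ones listed in this thread.

```python
import re

# Removed text contains an opening/closing ref tag or an http(s) URL.
REF_OR_URL = re.compile(r"</?ref\b|https?://", re.IGNORECASE)

# Edit summary phrases suggested in the thread; extend as needed.
DEAD_SUMMARY = re.compile(
    r"dead\s*(?:link|ref|page|site|url)"
    r"|\b404\b"
    r"|(?:does\s*not|doesn'?t)\s+work"
    r"|link\s+broken|broken\s+link",
    re.IGNORECASE,
)


def should_warn(removed_lines: str, summary: str) -> bool:
    """True when a reference or URL was removed with a dead-link style summary."""
    return bool(REF_OR_URL.search(removed_lines) and DEAD_SUMMARY.search(summary))
```

As noted in the discussion, an edit that only swaps the URL and leaves the ref tags in place would still match on the removed URL, so this sketch does not by itself separate the spam case from the good-faith one.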

Fix <source> tag detection in Filter 432

Currently, filter 432 does a check to ensure that <source lang= is not in the new wikitext. However, this has 2 failures to it that make the detection practically useless.

First of all, <source> is deprecated, and has been superseded by <syntaxhighlight>, which is used instead, so the detection should be at least swapped from <source lang= to <syntaxhighlight lang= (Or both, though judging from the changes in the deprecation tracking category, I don't see source getting used ever).

Second of all, the filter immediately follows the check with a look for lang=, which disregards the possibility of the inline attribute which could come before it (E.g. <syntaxhighlight inline lang=text>).

Side note: I have no idea if this is the correct place to suggest an edit to a filter rather than a new filter, but I don't see any pages anywhere for filter requests other than this, so I'm putting it here. Aidan9382 (talk) 08:16, 16 June 2022 (UTC)

Replacing line 8 !( "<source lang=" in new_wikitext ) & with one of the following should resolve this request:
  • source + syntaxhighlight + inline !( new_wikitext rlike "<(source|syntaxhighlight) (inline )?lang=" ) &
  • syntaxhighlight + inline !( new_wikitext rlike "<syntaxhighlight (inline )?lang=" ) &
Adding conditions using in might have better performance than using rlike. Adding these lines would add the described detection:
  • source without inline exists as line 8
  • source + inline !( "<source inline lang=" in new_wikitext ) &
  • syntaxhighlight + inline !( "<syntaxhighlight inline lang=" in new_wikitext ) &
  • syntaxhighlight without inline !( "<syntaxhighlight lang=" in new_wikitext ) &
I don't have the permission needed to change editfilters, I'm just replying to hopefully save time for someone who can do it. PHANTOMTECH (talk) 20:02, 16 June 2022 (UTC)
@Aidan9382 and PhantomTech:  Done, see Special:AbuseFilter/history/432/diff/prev/27233. Just removed the "lang" check entirely to keep it simple. Suffusion of Yellow (talk) 18:55, 26 June 2022 (UTC)
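
For reference, the first regex proposed above can be exercised in Python to see which openings it covers. This only tests the pattern itself, under the assumption that rlike behaves like an ordinary case-sensitive regex search; it says nothing about what filter 432 now contains.

```python
import re

# The "source + syntaxhighlight + inline" pattern from the suggestion above.
HIGHLIGHT_TAG = re.compile(r"<(source|syntaxhighlight) (inline )?lang=")


def has_highlight_tag(new_wikitext: str) -> bool:
    """True if the wikitext contains one of the tag openings the pattern covers."""
    return HIGHLIGHT_TAG.search(new_wikitext) is not None


samples = [
    "<source lang=python>",
    "<syntaxhighlight lang=text>",
    "<syntaxhighlight inline lang=text>",
    "<syntaxhighlight line lang=text>",  # another attribute before lang
]
for sample in samples:
    print(sample, has_highlight_tag(sample))
```

The last sample shows the limit of enumerating attribute orders: any attribute other than inline placed before lang is missed, which is one argument for dropping the lang check entirely.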

{{unblock reviewed}} template removal while blocked

  • Task: Warn blocked users when removing the template
  • Reason: As per the declined unblock message, the template should not be removed while blocked. When the template is removed, they should be warned about it.

Sheep (talk) 17:21, 18 June 2022 (UTC)

@Sheep8144402 There are some disabled private filters that might be related to this so there may be a reason why this won't be done, possibly due to low occurrence. This isn't something that needs to be caught before it happens so a bot noticing and reverting the change might be a better option since, if I'm remembering correctly, every edit filter slightly slows the time it takes to process every single edit but a bot would not do that.
Edit filters can warn, allowing the editor to make the edit anyway, but they can also block. Just to clarify, are you suggesting that if a filter is made for this it only warns the user or are you suggesting that it blocks the user from doing this? PHANTOMTECH (talk) 06:32, 19 June 2022 (UTC)
I'm suggesting that it warns the user just in case that if it's set to disallow, they add the template and then when removing it, the edit is blocked by the filter. Warn is a useful action in case this happens and to let the editor be aware of the removal of the unblock template while blocked. Sheep (talk) 12:22, 19 June 2022 (UTC)
I will say only this. The "no removing declined unblock messages" is a terrible rule. It amounts to punishment for daring to question a block. Get blocked by trigger-happy admin? Ok, go ahead and blank your talk page, and leave. But appeal the block? Forever badge-of-shame for you! As in, literally, a Wikipedia page, with your name right at the top, where someone says nasty things about you, that you can't do anything about. All because, apparently, it's just too too hard to click on the page history before reviewing a block.
Ok, end rant. But I will take no part in enforcing this rule; though I won't try to stop anyone else who does. Suffusion of Yellow (talk) 19:59, 26 June 2022 (UTC)
This seems like the kind of filter which, despite not being set to disallow, should probably get community consensus before being implemented. Sam Walton (talk) 11:53, 28 June 2022 (UTC)

A filter to warn editors who sign their mainspace contributions

  • Task: Warns users that it is inappropriate to sign their main space contributions
  • Reason: To avoid users signing their mainspace contributions.
  • Diffs: None found; this filter is meant to be pre-emptive.

I propose the following code, but by all means, please check it first:

page_namespace == 0 &
(
 added_lines irlike "(~~~|~~~~|~~~~~)"
)

Thanks, NotReallySoroka (talk) 13:00, 7 July 2022 (UTC)

NotReallySoroka, if a user signs on mainspace they would match Special:AbuseFilter/1090, although ~~~~~ isn't matched. (edit: not sure if edit filters run after signatures are expanded - if they are run before that then this would be useful) 0xDeadbeef 15:10, 7 July 2022 (UTC)
@0xDeadbeef In edit filters, added_lines is evaluated before signature expansion, see traps and pitfalls. added_lines_pst would need to be used for the current pattern in 1090 to catch a signature. PHANTOMTECH (talk) 17:38, 7 July 2022 (UTC)
Thank you for your insights. NotReallySoroka (talk) 00:26, 8 July 2022 (UTC)
@NotReallySoroka: I'd really want to see some diffs demonstrating that this is common and disruptive enough for an edit filter. We don't have limitless resources for filters and need to prioritise those which help prevent disruption to the project. Sam Walton (talk) 09:18, 8 July 2022 (UTC)
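
As a side note on the proposed pattern, the alternation "(~~~|~~~~|~~~~~)" looks redundant for an unanchored match: any text containing four or five tildes also contains three. The sketch below checks this in Python, using the re module only as a stand-in for the filter's regex engine and assuming irlike behaves like a case-insensitive, unanchored regex search. The names PROPOSED, SIMPLIFIED and flags_signature are hypothetical and exist only for this illustration.

```python
import re

# Pattern as proposed in the thread, and a shorter candidate equivalent.
# Assumption: irlike behaves like a case-insensitive, unanchored search.
PROPOSED = re.compile(r"(~~~|~~~~|~~~~~)", re.IGNORECASE)
SIMPLIFIED = re.compile(r"~~~", re.IGNORECASE)


def flags_signature(pattern, added_lines):
    """Return True if the pattern matches anywhere in the added text."""
    return pattern.search(added_lines) is not None


# Hypothetical sample edits: raw wikitext as typed, before any expansion.
SAMPLES = [
    "Plain sentence with no tildes.",
    "Approximately ~5 km, or ~~ two tildes.",
    "Signed with three ~~~",
    "Signed with four ~~~~",
    "Timestamp only ~~~~~",
]

# Both patterns should agree on every sample.
for text in SAMPLES:
    assert flags_signature(PROPOSED, text) == flags_signature(SIMPLIFIED, text)
```

If that assumption holds, the condition could be shortened to a single "~~~" without changing which edits match. As noted above, this only applies to the raw tildes seen by added_lines, not to the expanded signature.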

De-duplication

"niggah" is matched twice in the regex of AbuseFilter/260. AbuseFilter/384 has it as well. They should be unified. 0xDeadbeef 15:29, 16 July 2022 (UTC)

@0xDeadbeef:  Done Already took care of this when responding to your EFN post. Suffusion of Yellow (talk) 19:32, 18 July 2022 (UTC)

N-word edit filter

  • Task: Prevent vandalism that later requires revision deletion by preventing the word "niggerpox" from being accepted in an edit
  • Reason: Because of persistent vandalism, on and off for several months
  • Diffs: The diffs get revision deleted but you can see evidence of their RD status at the contributions list of an IP range where this broke out.
  • There are probably a half dozen or dozen monkeypox articles where I've seen this happen (and also on their talk pages). I would think that there is a general filter that prevents edits containing the N-word but somehow these edits get through. I'm not sure whether it is one persistent IP editor who gets the urge to do this once in a while or it's a "thing" to do but I've seen it go on since the outbreak of monkeypox. It seems like it would be a simple thing to add this term to whatever filters out the N-word. I can't think of a legitimate use for this term that a filter would prevent. Thank you. Liz Read! Talk! 22:41, 22 July 2022 (UTC)
    @Liz and Samwalton9: Restored 1209 (hist · log). "Nigger" (without word boundaries) was already stopped by several filters, at least in article space; likely they were using some tricks to evade the filters. Suffusion of Yellow (talk) 21:18, 29 July 2022 (UTC)
    Suffusion of Yellow, they were likely using an invisible character, such as the zero-width joiner. An example of this can be found here. Note that even though the diff doesn't show anything, if you copy the "Test" I inserted and use the JavaScript console to check its length, it would show as 5. This is because I inserted a zero-width joiner between the "e" and the "s". 0xDeadbeef 18:08, 30 July 2022 (UTC)
    @0xDeadbeef: FWIW, the degree of determination to get around the filter yesterday suggests an LTA, or at least one of those people who try to get around filters to make themselves feel smart. So maybe best to end this public discussion. Suffusion of Yellow (talk) 19:35, 30 July 2022 (UTC)
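
The length check described for the "Test" example can be reproduced outside a browser. The following is a minimal Python sketch, not a description of how any existing filter works: it builds the same five-character string and shows one possible way to normalise text before matching, by dropping characters in the Unicode "Cf" (format) category. The helper name strip_format_chars is hypothetical.

```python
import unicodedata

ZWJ = "\u200d"  # ZERO WIDTH JOINER, Unicode category Cf (format)

visible = "Test"
disguised = "Te" + ZWJ + "st"  # renders like "Test" but holds 5 code points


def strip_format_chars(text):
    """Return text without characters in Unicode category Cf (format)."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


# The disguised string is longer and no longer contains the plain word.
assert len(visible) == 4
assert len(disguised) == 5
assert visible not in disguised

# After normalisation the plain word can be matched again.
assert strip_format_chars(disguised) == visible
```

Format characters such as the zero-width joiner and non-joiner have legitimate uses, for example in emoji sequences and in Persian text, so a normalisation like this would only make sense on a copy of the text used for matching, not on the saved edit itself.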