
User:Chris the speller/SkipTrick

From Wikipedia, the free encyclopedia

Sometimes it would be handy to have more than one "Skip" text field in AWB. You can use Find & Replace expressions to simulate this, though it will not be quite as fast, and will take a minute or two to set up. However, this trick may enable you to avoid looking at many pages and avoid accidentally making erroneous changes. It temporarily changes certain strings, allowing AWB to skip articles that only contain false hits. Save these settings so you can use them periodically without having to look at the same false positives each time.

A simple example


You want to change "infact" to "in fact", but many articles contain URLs with "&infact", "?infact", ".infact", etc. In the table below, the first rule temporarily "moves" these out of the way by changing them to a different string that is extremely unlikely to appear in the article. The second rule performs the desired fix, and the third rule restores the temporarily altered strings. If you have checked "Skip if no replacement" and the second rule found nothing to fix, the article has no net changes and will be skipped. In the rare case where both the first and second rules made changes, only those made by the second rule will be visible in the difference window. But if you have checked "Add replacements to edit summary", it will produce a rather bewildering edit summary. For this reason, you may want to uncheck that box and manually enter the edit summary for that one article.

Find                Replace         RegEx?   Purpose
([?/$+.&])infact    $1inqxqxfact    RegEx    change false-positive targets
\b(I|i)nfact\b      $1n fact        RegEx    actually change all others
inqxqxfact          infact          no       undo all temporary changes
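
To see why the three rules leave a URL-only page with no net change, the sequence can be simulated outside AWB. The following is a minimal Python sketch, not AWB itself: it assumes Python's `re` module behaves like AWB's regex engine for these simple patterns, writes the backreference as `\1` instead of `$1`, and uses made-up sample sentences and a hypothetical `apply_rules` helper.

```python
import re

# The three rules from the table above, in order.
# Python writes the backreference as \1 where the table uses $1.
RULES = [
    (r"([?/$+.&])infact", r"\1inqxqxfact"),  # hide false positives
    (r"\b(I|i)nfact\b", r"\1n fact"),        # the real fix
    (r"inqxqxfact", "infact"),               # restore hidden strings
]

def apply_rules(text):
    """Apply every rule in order and return the resulting text."""
    for pattern, replacement in RULES:
        text = re.sub(pattern, replacement, text)
    return text

# Hypothetical sample texts, not taken from any real article.
url_only = "See http://example.org/page?infact=1 for details."
real_typo = "It was infact true; see http://example.org/page?infact=1."

fixed_url_only = apply_rules(url_only)    # comes back unchanged
fixed_real_typo = apply_rules(real_typo)  # only the prose typo is fixed

print(fixed_url_only == url_only)
print(fixed_real_typo)
```

The first text comes back identical to its input, which corresponds to the "no net changes" case described above; in the second, only the misspelling in the prose is changed and the URL is left intact.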

You may also add other rules before and after these to temporarily change "infact [sic]" and other such strings.

Some similar protection can be obtained by checking "Ignore external/interwiki link..." and "Ignore templates, refs...", but these can often prevent you from finding actual misspellings. The technique shown here lets you dig a little deeper. Take off the training wheels, but be careful!

"infront" example from real life


The first 11 rules clear out known proper names, known image names, links, known reference names and templates. The last two rules make the real changes and then undo all temporary changes.

Find                              Replace                               RegEx?   Purpose
infront of temple nature          inqxqxfront of temple nature          no       part of image name
Infront Motor & Spo               Inqxqxfront Motor & Spo               no       part of proper name
Infront Investments               Inqxqxfront Investments               no       part of proper name
Infront Sports                    Inqxqxfront Sports                    no       part of proper name
infront of the Pearl Roundabout   inqxqxfront of the Pearl Roundabout   no       part of image name
(cannel) infront                  (cannel) inqxqxfront                  no       part of image name
Road infront of lohagara          Road inqxqxfront of lohagara          no       part of image name
publisher=Infront                 publisher=Inqxqxfront                 no       part of cite template
[[Infront                         [[Inqxqxfront                         no       part of a link
name="infront                     name="inqxqxfront                     no       part of a reference name
(I|i)nfront}}                     $1nqxqxfront}}                        RegEx    part of a template such as "Proper name" or "Not a typo"
(?<![-/.])\b(I|i)nfront\b         $1n front                             RegEx    actually change all others
(I|i)nqxqxfront                   $1nfront                              RegEx    undo all temporary changes
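
A longer rule list like this one mixes plain-text and regex rules, and the order matters. The sketch below models a subset of the rules in Python as (find, replace, is_regex) tuples. It is an illustration under assumptions, not a description of AWB's internals: plain rules are modelled with `str.replace`, regex rules with `re.sub` (using `\1` for `$1`), skipping is modelled as "the text is unchanged after all rules", and the helper names and sample sentences are hypothetical.

```python
import re

# A subset of the rules above as (find, replace, is_regex), in table order.
RULES = [
    ("Infront Sports", "Inqxqxfront Sports", False),         # proper name
    ("publisher=Infront", "publisher=Inqxqxfront", False),   # cite template
    ("[[Infront", "[[Inqxqxfront", False),                   # link
    ('name="infront', 'name="inqxqxfront', False),           # reference name
    (r"(I|i)nfront}}", r"\1nqxqxfront}}", True),             # template
    (r"(?<![-/.])\b(I|i)nfront\b", r"\1n front", True),      # the real fix
    (r"(I|i)nqxqxfront", r"\1nfront", True),                 # undo
]

def apply_rules(text, rules=RULES):
    """Apply each rule in order; plain rules are literal substitutions."""
    for find, replace, is_regex in rules:
        if is_regex:
            text = re.sub(find, replace, text)
        else:
            text = text.replace(find, replace)
    return text

def should_skip(text):
    """Assumed model of skipping: no net change after all rules have run."""
    return apply_rules(text) == text

# Hypothetical sample texts.
proper_name = "Rights are held by [[Infront Sports & Media]]."
typo = "He stood infront of the crowd {{not a typo|infront}}."

print(should_skip(proper_name))
print(apply_rules(typo))
```

In this model the proper-name text is skipped, while the second text has only the prose misspelling changed; the copy inside the template is hidden by the fifth rule and restored by the last one.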