Wikipedia talk:WikiProject Check Wikipedia/Archive 2
This is an archive of past discussions about Wikipedia:WikiProject Check Wikipedia. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Randomize the top 50
Naturally some will be more tedious to fix than others, even in the same category. Or there may be a few you don't want to bother with because you've already nominated them for deletion, or maybe you've deliberately skipped one for some other reason, but the same pages shouldn't be stuck at the top of the list every day in the meantime. — CharlotteWebb 17:23, 15 June 2009 (UTC)
- Well right now there aren't any updates... I think that the selection for the 50 changes a fair bit each day normally. –Drilnoth (T • C • L) 20:33, 15 June 2009 (UTC)
- It doesn't change day-to-day (beyond removing those fixed) but I guess this is (and maybe SK will confirm) by design. It finds the first 50, you fix them, it gives you the next 50. In the meantime, it only has to scan down the list for the first 50 errors, rather than working out all possibilities and then randomly choosing 50. Maybe it's a technical limitation in that sense then. - Jarry1250 (t, c) 15:57, 16 June 2009 (UTC)
- I agree on the randomization. Err 028 "Table not correct end" is a great example of this. LilHelpa (talk) 17:43, 16 June 2009 (UTC)
Attention!
Particularly suited to CHECKWIKI, I think:
This a general notice to all AWB users: you can now install the Fronds plugin, and contribute towards improving it. Find/Replace On Demand Services (FRONDS) are collaboratively-created blocks of Find-Replace combinations for AutoWikiBrowser, where knowledge can be shared for maximum efficiency. All AWB users are invited to try them out, and make suggestions. Don't know anything about regular expressions? Fear not, you can still enjoy using the plugin. Fronds is particularly suitable for those collaborating to make repetitive edits. Any questions can be directed to the talk page or my user talk page. Cheers, - Jarry1250 (t, c, rfa) 19:51, 22 June 2009 (UTC)
- I plan to as soon as I'm doing CHECKWIKI work again... taking a bit of a break from it right now (and giving my bot a rest before the massive task that it might be doing in the near future). –Drilnoth (T • C • L) 19:59, 22 June 2009 (UTC)
is an article on my watch list. Today the article was damaged by someone who was editing at a robotic rate, quoting WP Check rule number X. It was not the only article damaged. I have undone the changes that did damage. However, I'm guessing that WikiProject Check Wikipedia is not using robots to make the changes, as it wants to avoid doing damage. Warning to readers: some damage is happening. Is there a team who check the work and undo all wrong changes? Can I get the bot rules changed so that damage is not done? Victuallers (talk) 14:35, 18 June 2009 (UTC)
- In the first instance I suggest you contact the user involved to see what went wrong. Contributors to this project do act in good faith with the intention of zero mistakes and a positive contribution to Wikipedia. Once the cause of the problem is known we can see what measures are needed to prevent it happening again. Rjwilmsi 16:20, 18 June 2009 (UTC)
Sorry I thought I had left a message on the users page ... but I see you have anyway ... thanks Victuallers (talk) 21:48, 18 June 2009 (UTC)
- Those were my edits. Would you please point out the specific damage? I've looked at the changes I made and I'm not sure what you consider damage, but I'm willing to be educated! Thanks! --Auntof6 (talk) 21:53, 18 June 2009 (UTC)
- From the diffs it appears that you commented out an image that had an imagemap with a wikilink to the title of the page. Clearly, commenting out an image, if that's what happened, is not the correct fix. Rjwilmsi 07:59, 19 June 2009 (UTC)
- OK, I see it. I also looked at my other edit that you reverted, to Antoine Polier and I see the same thing. What must have happened is that I removed the wikilinks to the article title, which caused a problem for the image map code (it apparently requires a link, not just text). When I looked at a preview of the page, I just saw the error messages and assumed (yeah, I know...) that the errors had been there before. Mea culpa.
- I'll check the text for the Wikicheck error I was working on and make sure it mentions not to remove links inside image maps, even for the article's own title. Maybe AWB can be changed to recognize that situation as well. Sound good?
- Thanks for pointing this out. --Auntof6 (talk) 08:34, 19 June 2009 (UTC)
- AWB has already been updated not to remove any self links in imagemaps. The latest snapshot has all the fixes for that. Rjwilmsi 08:55, 19 June 2009 (UTC)
- Oops, OK, then I'll go remove the bug for it that I just reported. Thanks for facilitating this discussion. --Auntof6 (talk) 09:18, 19 June 2009 (UTC)
I'd argue that removing the image-maps isn't entirely a bad thing, but that's another matter. — CharlotteWebb 15:48, 4 July 2009 (UTC)
PerfectIt (computer program for easier copyediting)
I have just discovered a computer program for easier copyediting. See Intelligent Editing - Cleaner, Smarter, Better Documents.
-- Wavelength (talk) 02:58, 29 June 2009 (UTC)
Couple suggestions for additions to the check wiki list
I have a couple suggestions to things that could be added to the list of things to look for.
- replace passed away with died
- cleanup related to talk pages (no project or need parameters)
- replace lived with born and died in Infobox Military person articles
- add the width 98% fix to tables that display the scroll bar at the bottom of the page
- replace page= p, page= pp and other variations with just page=. Since page= already renders the "pp" prefix, page= pp displays as "pp pp". --Kumioko (talk) 21:00, 30 June 2009 (UTC)
On #6, are you referring to templates like {{cite book}}? For these the correct input is | page = 109 or | pages = 255–310 so it uses the agreed abbreviation depending on whether it is singular or plural. Conceivably this could be re-written to use something like "start_page", "end_page", shrug… — CharlotteWebb 15:25, 4 July 2009 (UTC)
Pseudo-list using bullet character (•)
See [1]. Lines starting with \n(?:\u2022|&#x0*2022;|&#0*8226;|&bull;)\s* should be changed to use an asterisk. — CharlotteWebb 21:22, 4 July 2009 (UTC)
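As a rough illustration of the suggestion above, here is a minimal Python sketch that rewrites lines starting with a bullet character (or one of its character references) so that they start with an asterisk. The function name `bullets_to_asterisks` is made up for this example, and the pattern is a slight adaptation of the one quoted above: it anchors at line start with `re.MULTILINE` rather than matching `\n`, and it only swallows spaces and tabs after the bullet so that an empty bullet line cannot absorb the following line.

```python
import re

# A line that starts with a literal bullet (U+2022) or with one of its
# hexadecimal, decimal, or named character references.
BULLET_LINE = re.compile(
    r"^(?:\u2022|&#[xX]0*2022;|&#0*8226;|&bull;)[ \t]*",
    re.MULTILINE,
)


def bullets_to_asterisks(wikitext: str) -> str:
    """Replace a leading bullet character on each line with '* '."""
    return BULLET_LINE.sub("* ", wikitext)


if __name__ == "__main__":
    sample = "\u2022 Apple\n&bull;Banana\n&#8226; Cherry\nplain text"
    print(bullets_to_asterisks(sample))
```

Bullets that appear in the middle of a line are left alone, since those are often used as inline separators rather than as list markers.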
Broken character entity references
Any chance you could run a script to find things like [2], [3]. I've been finding a lot of these lately where the semi-colon is missing. Obviously this would be listed as a higher-priority error. — CharlotteWebb 21:07, 6 June 2009 (UTC)
- I'll mention it to sk. –Drilnoth (T • C • L) 23:10, 6 June 2009 (UTC)
Many instances of this error were introduced by certain bots. Take for example [4] (page created by bot), [5] (bot-generated link title). Odd that they would have this bug in common. — CharlotteWebb 21:42, 13 June 2009 (UTC)
- You might want to drop a note to the operators of the two bots. Especially the first one is fairly recent. -- User:Docu
Too bad the does not identify which bot generated the title. I can't be arsed to figure out which bot does this, though I'm sure I knew it once and forgot. Anyone who does remember could kindly pass along this advice as well. — CharlotteWebb 17:09, 15 June 2009 (UTC)
- I’m pretty sure the vast majority of those came from DumZiBoT, as it went through this wiki a few times before we found a way to determine the unknown character encoding (the problem is the website not specifying this information). There have been two other bots to my knowledge, both with slightly different signatures (one with a dash, the other uppercased), and both have refused to implement the fix due to complexity. If you could find a way to detect these, I could patch the Toolserver’s web port of reflinks.py to automatically fix this problem. — Dispenser 17:16, 9 July 2009 (UTC)
Headline ALL CAPS
Is it possible to have an "exceptions" list for this test? About 42 of the top 50 items are valid capitalised items, mainly either abbreviations, keywords or titles. welsh (talk) 19:21, 7 June 2009 (UTC)
- In the near future I will create a whitelist for all errors, so we can solve this problem. -- sk (talk) 08:56, 8 June 2009 (UTC)
- I reckon 48 of the top 50 items are now valid capitalised items and they do not change between toolserver updates. Is there a way to mix them up or create the exceptions list as it has become very restrictive? welsh (talk) 18:29, 8 July 2009 (UTC)
Media: fix
We have external links which could be internally linked with [[media:]]; these typically reference PDF files on Commons. It would be better to link these directly, as the Check Usage utility would then report them. Even if the files have been deleted it would still be better, as the link would then send readers and editors to the deletion log and not a 404 page. Below I provide a sample SQL query and the first ten results.
SELECT page_title, el_to
FROM page JOIN externallinks ON page_id = el_from
WHERE page_namespace=0 AND (
el_to LIKE "http://upload.wikimedia.org/wikipedia/commons/%"
OR el_to LIKE "http://upload.wikimedia.org/wikipedia/en/%");
+-------------------------------+------------------------------------------------------------------------------------------------------+
| page_title                    | el_to                                                                                                |
+-------------------------------+------------------------------------------------------------------------------------------------------+
| Gramsci_Melodic               | http://upload.wikimedia.org/wikipedia/commons/0/03/Pitt_News-Gramsci_Article-1-9-09.png              |
| Acqua_Vergine                 | http://upload.wikimedia.org/wikipedia/commons/0/04/CampoFioriFontana.JPG                             |
| Occitan_cross                 | http://upload.wikimedia.org/wikipedia/commons/0/04/Occitan_and_French_language_signs_in_Toulouse.jpg |
| Radevormwald                  | http://upload.wikimedia.org/wikipedia/commons/0/06/Kreuz-graeber-2004-1280x960.jpg                   |
| Ed_Gein                       | http://upload.wikimedia.org/wikipedia/commons/0/07/1930_census_Gein.jpg                              |
| Occitan_cross                 | http://upload.wikimedia.org/wikipedia/commons/0/07/Blason_Moissac.svg                                |
| St._Mark's_School_of_Texas    | http://upload.wikimedia.org/wikipedia/commons/0/07/SM_Planned_Development.jpg                        |
| Sackville_Street_(Manchester) | http://upload.wikimedia.org/wikipedia/commons/0/09/Map_of_Manchester_1801.PNG                        |
| Kaʻula                        | http://upload.wikimedia.org/wikipedia/commons/0/0b/19380_OAHU_TO_NIIHAU.png                          |
| Area_rule                     | http://upload.wikimedia.org/wikipedia/commons/0/0b/Patent-Area-Rule.pdf                              |
+-------------------------------+------------------------------------------------------------------------------------------------------+
— Dispenser 10:38, 26 June 2009 (UTC)
- You'll want to also look for http://upload.wikimedia.org/wikipedia/en/%. Of course if you have already run the full query for commons images, you might as well post that list somewhere. — CharlotteWebb 15:46, 4 July 2009 (UTC)
- Oops, that's what I meant to type for the second search part; fixed that. It occurs to me that Special:linksearch/http://upload.wikimedia.org/wikipedia/commons/ and Special:linksearch/http://upload.wikimedia.org/wikipedia/en/ are probably easier ways to access this information. I've programmed commonfixes to convert these. — Dispenser 18:14, 9 July 2009 (UTC)
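To make the proposed conversion concrete, here is a small Python sketch that turns an upload.wikimedia.org file URL into a [[media:]] link. It is not the commonfixes code, which is not shown in this discussion; the function name `to_media_link` and the assumed URL layout (project, one hex digit, two hex digits, file name) are illustrative assumptions based on the sample results above.

```python
import re
from urllib.parse import unquote

# Assumed layout: /wikipedia/<commons|en>/<hex>/<hex><hex>/<file name>
UPLOAD_URL = re.compile(
    r"^http://upload\.wikimedia\.org/wikipedia/(?:commons|en)/"
    r"[0-9a-f]/[0-9a-f]{2}/([^/?#]+)$"
)


def to_media_link(url):
    """Return a [[media:...]] link for a direct file URL, or None."""
    match = UPLOAD_URL.match(url)
    if match is None:
        return None
    # Decode percent-escapes and show underscores as spaces.
    filename = unquote(match.group(1)).replace("_", " ")
    return f"[[media:{filename}]]"


if __name__ == "__main__":
    print(to_media_link(
        "http://upload.wikimedia.org/wikipedia/commons/0/0b/Patent-Area-Rule.pdf"
    ))
```

A real fix would also need to carry over any link label from the bracketed external link, which this sketch leaves out.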
Unicode conversions
The conversion of expressions such as &rarr; (→) into unicode is not universally regarded as a good thing, and could be a WP:ACCESS issue for some readers who also edit the encyclopedia: raw text is easier to manipulate than unicode. I know several editors who dislike the unicode characters and find them hard to read. I encourage those contributing here to switch off such wholesale conversions during their clean-up operations. Thanks, Geometry guy 21:56, 22 July 2009 (UTC)
- I've deactivated this listing; you have a good point. It will probably take a few days for the toolserver listing to register the change. –Drilnoth (T • C • L) 22:24, 22 July 2009 (UTC)
Other tools
I have developed some tools; among them is a general fixes library, called commonfixes, used by the "web" set of tools. Commonfixes fixes the errors in "Reference before punctuation", "External link with line break", and "Reference duplication", although the last is still incomplete, as the naming logic hasn't been fully separated from the combining logic. Kühn might want to look through the script for other ideas of things to check for. — Dispenser 10:45, 26 July 2009 (UTC)
Stupid question (sorry)
I'm sorry, this is a noob question, but how do I get AWB to "make a list" using the lists you have on this page?
I read the manual and could not see any obvious way. --Tim1357 (talk) 01:31, 4 August 2009 (UTC)
- I've been doing this to get the lists into AWB:
- Click on "List of all articles with error xxx" that appears with each error on the main project page. A screen will open with a list of all articles that had the error.
- Select all or part of the list with your cursor, then copy what you selected (ctl-C or right-click/copy or however you like to do it)
- Paste into AWB in the box where article titles normally appear. You might want to empty that list first
- By the way, we usually put new talk items at the bottom of the page rather than the top. Have fun! --Auntof6 (talk) 04:16, 4 August 2009 (UTC)
Check the current problems after time again
What about checking the items on the page more often? For example, every 1-3 hours each article listed on Wikipedia:WikiProject Check Wikipedia could be checked again to see whether the problem still exists, and otherwise removed. Or only the top 10 of every problem, so that the queue automatically gets filled again. --213.168.121.154 (talk) 04:49, 11 August 2009 (UTC)
- This isn't possible to my knowledge... this project exists on many different wikis, and is already using up a lot of the toolserver's resources. Scanning each project every few hours just won't work because then more than one or two scans would be running at once, an unneeded drain on Wikimedia's servers. –Drilnoth (T • C • L) 20:26, 11 August 2009 (UTC)
Valid Double Pipe in Sylow theorems
The Sylow theorems article has a valid double pipe in a link to generate the symbols (|Ω|) as a link to Additive_p-adic_valuation like this: νp(|Ω|), with wikisource:
[[Additive_p-adic_valuation|ν<sub>''p''</sub><nowiki>(|Ω|)</nowiki>]]
I've fixed it to use nowiki (here and in the article). If it still comes up in the list after the next run, can someone fix the bot to ignore double pipes inside nowiki tags? twilsonb (talk) 01:55, 12 August 2009 (UTC)
- not in today's version --AwOc 09:37, 12 August 2009 (UTC)
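For anyone wondering how a "double pipe" check could skip pipes inside nowiki tags, here is a minimal Python sketch. It is not the Check Wikipedia script, whose code is not shown here; the function name `links_with_double_pipe` and both patterns are assumptions made for illustration. The idea is simply to drop every nowiki span before counting pipes inside each [[...]] link.

```python
import re

# Anything between <nowiki> and </nowiki> is literal text, not link syntax.
NOWIKI = re.compile(r"<nowiki>.*?</nowiki>", re.DOTALL | re.IGNORECASE)
# A non-nested wikilink: [[ ... ]] with no brackets inside.
LINK = re.compile(r"\[\[([^\[\]]*)\]\]")


def links_with_double_pipe(wikitext):
    """Return links that contain two or more pipes outside nowiki spans."""
    stripped = NOWIKI.sub("", wikitext)
    return [
        match.group(0)
        for match in LINK.finditer(stripped)
        if match.group(1).count("|") >= 2
    ]


if __name__ == "__main__":
    sylow = ("[[Additive_p-adic_valuation|\u03bd<sub>''p''</sub>"
             "<nowiki>(|\u03a9|)</nowiki>]]")
    print(links_with_double_pipe(sylow))            # nothing reported
    print(links_with_double_pipe("[[Foo|bar|baz]]"))  # reported
```

File and image links legitimately take several pipe-separated options, so a fuller version would have to exclude those targets.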
Possible new check - external link should be wikilink
I've just fixed a batch of Error 86 External link with two brackets where several articles had external links which needed to be converted to internal wiki links. I think these could easily be identified as a new check and flagged for fixing. An example would be: [http://en.wikipedia.org/wiki/xyzzy], which becomes [[xyzzy]]. welsh (talk) 06:26, 14 August 2009 (UTC)
- Agree, though not sure whether they could be automatically fixed: maybe move to 'see also' section if present? Rjwilmsi 11:04, 14 August 2009 (UTC)
- The ones I saw were simple in-line replacements. It is conceivable that they could be more complex to change, so I think it has to be a manual intervention once flagged. welsh (talk) 17:00, 14 August 2009 (UTC)
- When I ran this report back on 7/16/2009 there were 9265 articles on en.wiki that used external links to reach an internal page. If you'd like to take a crack at them I can re-run the report and post it for you. --Pascal666 19:36, 14 August 2009 (UTC)
- Hmmm. That's a lot, but if it's easy to run the report it would be good to have a look at least. welsh (talk) 20:13, 14 August 2009 (UTC)
- The list can now be found at User:Pascal666/external. --Pascal666 08:51, 15 August 2009 (UTC)
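The simple in-line case described in this thread could be handled along the following lines. This is only a sketch under stated assumptions: the function name `externals_to_wikilinks` is invented, only `http://en.wikipedia.org/wiki/` URLs are handled, and links with a section anchor or query string are deliberately left untouched for manual review.

```python
import re
from urllib.parse import unquote

# [http://en.wikipedia.org/wiki/Title optional label], with one or two
# enclosing brackets (the doubled form is what error 86 reports).
SELF_LINK = re.compile(
    r"\[\[?http://en\.wikipedia\.org/wiki/([^\s\[\]|#?]+)"
    r"(?:\s+([^\]]+))?\]\]?"
)


def externals_to_wikilinks(wikitext):
    """Rewrite external links to en.wikipedia articles as wikilinks."""
    def replace(match):
        title = unquote(match.group(1)).replace("_", " ")
        label = match.group(2)
        if label is None or label.strip() == title:
            return f"[[{title}]]"
        return f"[[{title}|{label.strip()}]]"

    return SELF_LINK.sub(replace, wikitext)


if __name__ == "__main__":
    print(externals_to_wikilinks("[http://en.wikipedia.org/wiki/xyzzy]"))
    print(externals_to_wikilinks(
        "[http://en.wikipedia.org/wiki/Foo_bar the foo]"
    ))
```

As noted above, some cases will be more complex than a straight replacement, so the output would still need a human look before saving.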
Parsed it for you too: additional pages with errors from the toolserver, parsed to wiki-language. The information about where on the page the error is located is missing, of course, but for some errors that is not that important. --AwOc 04:17, 21 August 2009 (UTC)
- Nice! That will be useful for things that need human-editing (bots can just use the toolserver lists directly). –Drilnoth (T • C • L) 14:55, 21 August 2009 (UTC)
New interface
New interface -- sk (talk) 06:56, 1 September 2009 (UTC)
- Cool! –Drilnoth (T • C • L) 19:49, 1 September 2009 (UTC)
I'm requesting a bot
Hey guys,
I'm requesting a bot to fix this error [6]. See that request here. Tim1357 (talk) 01:40, 2 September 2009 (UTC)
Template value ends with break (AutoEd)
This is not always an error, and is often necessary, e.g. in the honorific_prefix field of Template: Infobox officeholder. It is not always necessary to place a break after the prefix, so editors are permitted discretion. People working on this project might do well to look at why the 'error' was placed there before contributing their 'corrections'. Johnhousefriday (talk) 23:33, 4 September 2009 (UTC)
- This is a known error. The script which generates the lists can't skip these very easily, so human discretion is needed. –Drilnoth (T • C • L) 00:40, 5 September 2009 (UTC)
False positive for error 59
Template:Nihongo, because it is a table and the break is often needed for aesthetics. Tim1357 (talk) 11:19, 7 September 2009 (UTC)
I have what I consider another false positive: Central Ishikari Mountains. I need the break to separate the English name and the long native name. The native name spans two lines in the Geobox template unless I force a line break. Then it lands on its own line. imars (talk) 11:55, 7 September 2009 (UTC)
Another fixing library: Commonfixes
I've been developing commonfixes as a component of all my tools. Some of the fixes overlap with the errors detected here, like the reference punctuation fixes, but it also includes more cosmetic fixes such as converting HTML to CSS and link simplification. It’s a part of my pywikipedia web package. — Dispenser 06:11, 8 September 2009 (UTC)
WikiCleaner
Hi,
I have started working on WikiCleaner to add features for this project. It's not functional yet, and there's still a lot of work to do, but you can see how it's supposed to work with version v0.93. Don't hesitate to try it and write comments (preferably on my page on the French wp).
What I have to do in the next versions :
- Allow editing and saving the page.
- Show detected errors directly in the text and propose fixes.
- Manage errors other than the first two (48 and 80).
- Read complete list on the tool server.
- ...
--NicoV (talk) 13:24, 30 August 2009 (UTC)
- v0.94 is available : the page text is scanned and errors are highlighted directly in the text. Still not functional, since editing and saving are not done. --NicoV (talk) 20:17, 1 September 2009 (UTC)
I have just released v0.95 with the following enhancements :
- Editing the articles and saving them is added.
- The following errors are detected (at least partially) : 2 (Article with false <br/>), 32 (Double pipe in one link), 34 (Template programming element), 43 (Template not correct end), 46 (Square brackets not correct begin), 47 (Template not correct begin), 48 (Title linked in text), 80 (External link with line break), 81 (Reference duplication).
Comments are welcome. --NicoV (talk) 17:22, 4 September 2009 (UTC)
Due to a hosting change, there's a new URL to install WikiCleaner ([http://site4145.mutu.sivit.org/WikiCleaner/WikiCleaner.jnlp here]). It's better to uninstall the old version before going for the new one (0.97). --NicoV (talk) 19:27, 14 September 2009 (UTC)
87: HTML named entities without semicolon
Are we ever going to activate this one? I've been finding more errors like this. — CharlotteWebb 10:02, 20 September 2009 (UTC)
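A check like this one could be approximated as follows. This is a hedged Python sketch, not the project's actual implementation: the function name `entities_missing_semicolon` is invented, and the list of entity names comes from Python's standard `html.entities.name2codepoint` table rather than from whatever list the Check Wikipedia script uses.

```python
import re
from html.entities import name2codepoint

# Longest names first, so that e.g. "thetasym" is tried before "theta".
_NAMES = sorted(name2codepoint, key=len, reverse=True)
# A known entity name after "&" that is NOT followed by ";" or by more
# letters or digits (which would make it a different word).
BROKEN_ENTITY = re.compile(
    r"&(" + "|".join(re.escape(name) for name in _NAMES) + r")(?![A-Za-z0-9;])"
)


def entities_missing_semicolon(wikitext):
    """Return the names of named entities written without ';'."""
    return [match.group(1) for match in BROKEN_ENTITY.finditer(wikitext)]


if __name__ == "__main__":
    print(entities_missing_semicolon("caf&eacute and &nbsp; fine &rarr here"))
```

URLs with query strings (for example a parameter named `copy` after an ampersand) would be flagged by this sketch, so external links would need to be excluded or reviewed by hand.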
Proposed check
I'm not sure if this can be done since it deals with more than one page. Let's say we have {{rainbow colors}} and that it contains links to Red, Orange, Yellow, Green, Blue, Indigo and Violet. Let's say that Orange redirects to Orange (color) *and* that Orange (color) contains the rainbow colors template. Since the link in the template is to Orange rather than Orange (color), the template code won't change the Orange link to a non-link the way that it should. Can this be made into a check, and if not, is finding these something that is generated automatically elsewhere? A slightly weaker, but perhaps more doable, check is making sure that links from templates do not redirect. Links from templates should only redirect if the resultant page doesn't contain the template. Naraht (talk) 18:03, 22 September 2009 (UTC)
- In other words, a search for self-redirects: if there's a link to Foo on Foobar, say, that should show up in this hypothetical check, since Foo redirects to Foobar. Naraht specifically points out the case of self-redirects introduced by templates, which are harder to catch (and therefore more of a problem). I support its addition if that is technically feasible, though it will certainly require resolution by humans rather than by AWB/bots. {{Nihiltres|talk|edits}} 18:49, 26 September 2009 (UTC)
- That sounds about right. I'm still not sure it is possible since all of the checks I've seen have been analysis of a single page...Naraht (talk) 03:10, 27 September 2009 (UTC)
SELECT DISTINCT template.page_title,template.page_namespace
FROM pagelinks AS tplpage
JOIN page AS tplrds ON (tplrds.page_namespace=tplpage.pl_namespace AND tplrds.page_title=tplpage.pl_title)
JOIN redirect ON rd_from=tplrds.page_id
JOIN page AS rd ON (rd.page_namespace=rd_namespace AND rd.page_title=rd_title)
JOIN templatelinks ON tl_from=rd.page_id
JOIN page AS template ON (template.page_namespace=tl_namespace AND template.page_title=tl_title)
WHERE tplpage.pl_namespace=0 AND template.page_id=tplpage.pl_from AND tplrds.page_namespace=0
LIMIT 50;
- You're looking for WP:Database reports; the above query will give results for links from templates which redirect to a page which transcludes said template. I would assume that all templates that transclude {{navbox}} can be done by a bot. — Dispenser 05:18, 27 September 2009 (UTC)
- Transcluded self-redirects in template namespace: tsr-enwiki-ns10.log (viewer), took about an hour to run. These can be fixed using WikEd's "bypass redirects" button, but it would be wiser to get a bot to do this. The results look right, but it's hard to tell with so many JOINs. — Dispenser 16:08, 27 September 2009 (UTC)
Another proposed check
Hatnotes, the slightly self-referential navigational aids in italics at the top of pages and sections, are made using special templates: {{dablink}} or {{rellink}} for the formatting, called in turn by a series of templates like {{about}} or {{see also}} that standardize the text used.
Often newbies or even not-so-newbies will not know about these templates and instead use the semantically incorrect but visually equivalent code :''Example hatnote''. These instances should be caught so that they can be corrected. I suggest perhaps searching for the (per-line) regex ^:+(('')+|('{5})+)[^'] or some such (that's just off the top of my head; haven't tried debugging it). It might not be perfect, but it would certainly find many of the problems. {{Nihiltres|talk|edits}} 18:49, 26 September 2009 (UTC)
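The per-line regex suggested above can be tried out with a few lines of Python. This is just a sketch: the function name `find_pseudo_hatnotes` is made up, and the only change to the pattern is that the groups are written as non-capturing. It flags lines that begin with one or more colons followed by italic (or bold-italic) markup.

```python
import re

# One or more colons, then an italic ('') or bold-italic (''''') opener,
# then any character that is not another apostrophe.
PSEUDO_HATNOTE = re.compile(r"^:+(?:(?:'')+|(?:'{5})+)[^']")


def find_pseudo_hatnotes(wikitext):
    """Return the lines that look like hand-made hatnotes."""
    return [
        line for line in wikitext.splitlines()
        if PSEUDO_HATNOTE.match(line)
    ]


if __name__ == "__main__":
    page = (
        ":''For other uses, see Foo.''\n"
        ":'''Bold only'''\n"
        ":Plain indented reply\n"
        "Body text."
    )
    print(find_pseudo_hatnotes(page))
```

Indented italic lines that are not hatnotes (quotations, for instance) would also be reported, so the hits are candidates for review rather than certain errors.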
Page size
The page is getting a little big; one user had trouble refreshing it from the toolserver page, and I have occasional trouble loading it. Should we maybe reduce the maximum number of errors listed from 50, maybe to 20 or 25? --Auntof6 (talk) 22:36, 18 September 2009 (UTC)
- I don't even see any need to refresh the page with the new interface available. –Drilnoth (T • C • L) 22:55, 18 September 2009 (UTC)
- I mostly agree, although it might be good to have statistics continue being updated. --Auntof6 (talk) 03:50, 19 September 2009 (UTC)
I think it would be better to list more than fifty, but on separate pages. — CharlotteWebb 10:02, 20 September 2009 (UTC)
- That's already available under each individual check. Click on either "List of all articles with error xxx" or "See this error with the new interface" and you'll see all the articles with an error. --Auntof6 (talk) 11:28, 20 September 2009 (UTC)
- I also have this trouble. I can always retrieve the current page successfully, which is all I normally need. Retrieving an old version or comparing versions via View history consistently shows an error screen: Wikimedia Foundation / Error / Our servers are currently experiencing a technical problem ... Submitting a section edit consistently shows the error screen rather than the updated page, but my changes are usually saved anyway. I have not had this problem with any other page, and I suspect the cause is page size. Could it be split into a few pages, perhaps with them all transcluded from a slim master page with the current name? Certes (talk) 17:25, 16 October 2009 (UTC)
en.wikipedia.org in references
(?is)<ref[^</>]*>[^<>]*http://\w+\.wikipedia\.org/wiki/(?!(?:Talk|Help|User|Wikipedia|Portal|MediaWiki)(?:_talk)?:)
Could we have a check for pages which link main space wikipedia articles in references? I've included code above which should match them. — Dispenser 22:06, 8 October 2009 (UTC)
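For those who want to try the pattern above outside the toolserver, a minimal Python wrapper might look like this. The regex is copied from the post; the function name `has_wikipedia_reference` is an assumption made for the example.

```python
import re

# Pattern from the discussion: a <ref ...> whose text contains a link to
# a main-namespace page on any language edition of Wikipedia.
WIKIPEDIA_IN_REF = re.compile(
    r"(?is)<ref[^</>]*>[^<>]*http://\w+\.wikipedia\.org/wiki/"
    r"(?!(?:Talk|Help|User|Wikipedia|Portal|MediaWiki)(?:_talk)?:)"
)


def has_wikipedia_reference(wikitext):
    """True if some reference cites a main-space Wikipedia article."""
    return WIKIPEDIA_IN_REF.search(wikitext) is not None


if __name__ == "__main__":
    print(has_wikipedia_reference(
        '<ref name="a">See http://en.wikipedia.org/wiki/Foo</ref>'
    ))
    print(has_wikipedia_reference(
        "<ref>http://en.wikipedia.org/wiki/Talk:Foo</ref>"
    ))
```

Note that the pattern only looks at text between the opening tag and the next angle bracket, so a link placed after a nested tag inside the reference would be missed.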
Timeout error while updating
Every time I try to update the page from the master copy, I get the technical problem/pledge drive page. I'm told to report it as follows:
- Request: POST http://en.wikipedia.org/w/index.php?title=Wikipedia:WikiProject_Check_Wikipedia&action=submit, from 75.144.111.73 via eiximenis.wikimedia.org (squid/2.7.STABLE6) to 208.80.152.43 (208.80.152.43)
- Error: ERR_READ_TIMEOUT, errno [No Error] at Tue, 20 Oct 2009 18:38:34 GMT
The page is about 300 KB in length; I can see how it would run up against time limits. Do updates succeed if they're done at a particular time of day? (If so, what UTC?) Or does an editor need special privileges (e.g. sysop, bot) in order to extend the timeout periods to push it through? Or has the page finally grown to the point where it needs to be broken up into subpages? --Damian Yerrick (talk | stalk) 18:46, 20 October 2009 (UTC)
- It appears such edits are taking effect anyway. --Damian Yerrick (talk | stalk) 11:28, 22 October 2009 (UTC)
- My edits took effect too, though of course I can't guarantee all will. See #Page size above. Certes (talk) 15:39, 22 October 2009 (UTC)
- If you have problem with the updating of the Wikisite then please use the new interface. -- sk (talk) 10:09, 28 October 2009 (UTC)
False positives in 059
I've been using the "new interface" to fix up 059 (line break at end of template parameter). But as one can see in the "Notice:" column of the new interface page for 059, a supermajority of the first 50 results from 059 ordered by article title are clogged with the known "honorific-prefix" false positive. Either someone needs to exclude articles using this parameter from the filter, or they'll never get rescanned properly. --Damian Yerrick (talk | stalk) 11:36, 22 October 2009 (UTC)
- Why not change the Infobox Officeholder? Please separate design and data. The break at the end of this parameter should be inside the infobox, not inside the value field. -- sk (talk) 10:06, 28 October 2009 (UTC)
- What would you recommend to distinguish situations in which the honorific prefix or honorific suffix is long enough to need its own line vs. situations where it is not? --Damian Yerrick (talk | stalk) 19:08, 31 October 2009 (UTC)
nihongo
I found another false positive for 059 in Berryz Kobo: {{nihongo}}. A line break at the end of the Latin/English puts the Japanese on another line. --Damian Yerrick (talk | stalk) 11:36, 22 October 2009 (UTC)
Suggested check: incorrect template parameters
Should we try to detect incorrect template parameter names, e.g. {{Infobox Actor | job=[[Film director]] | ...}} when the author intended "occupation="? We may need a clever way of ignoring false positives where the value of an unnamed parameter genuinely contains an equals sign. Unfortunately I think the problems found would need to be resolved manually. Certes (talk) 15:38, 15 November 2009 (UTC)
DEFAULTSORT with blank at first position
Is having whitespace before the entry really an error? Articles with the problem appear to list properly in the categories. I'm not sure this is an actual error. Jason Quinn (talk) 17:49, 25 November 2009 (UTC)
Not the same number of right and left parenthesis, brackets or braces
It has been suggested here to use a filter to detect edits that unbalance the number of left vs. right braces, but I thought Check Wikipedia reports would be better. Is it possible to scan for pages that do not have the same number of left and right parentheses, and likewise for brackets and for braces? Cenarium (talk) 19:57, 4 December 2009 (UTC)
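The counting check proposed here is easy to sketch in Python. The function name `unbalanced_delimiters` is invented for this example, and the sketch only compares totals, as the suggestion describes; it does not verify that the delimiters are properly nested.

```python
from collections import Counter

# Opening delimiter -> matching closing delimiter.
PAIRS = {"(": ")", "[": "]", "{": "}"}


def unbalanced_delimiters(wikitext):
    """Map each unbalanced pair to (number of opens - number of closes)."""
    counts = Counter(ch for ch in wikitext if ch in "()[]{}")
    return {
        opener + closer: counts[opener] - counts[closer]
        for opener, closer in PAIRS.items()
        if counts[opener] != counts[closer]
    }


if __name__ == "__main__":
    print(unbalanced_delimiters("{{cite web|url=x} and [[link]] (ok)"))
```

Text such as smileys, numbered items like "1)", or mathematical intervals would produce false positives, so a whitelist or manual review would probably be needed.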
Regarding this error, see Wikipedia:Administrators' noticeboard/Incidents#SmackBot changing referencing style, again (dearchived). Generally, bots should not be making this change as the use of sequential references and named references are equally preferred by the relevant guidelines. Christopher Parham (talk) 13:59, 9 December 2009 (UTC)
Suggested check: Double sections in articles
Please, read the discussion on Wikipedia_talk:AutoWikiBrowser/Feature_requests#Removing_double_sections_in_articles. -- Magioladitis (talk) 22:52, 12 December 2009 (UTC)
- Examples: [7], [8]. -- Magioladitis (talk) 11:38, 13 December 2009 (UTC)
- A list was created by Svick and can be found in User:Svick/Double sections. -- Magioladitis (talk) 12:32, 29 December 2009 (UTC)
Suggested check: Multiple single-item lists
Is it possible to add another item to this list? Some pages have busted list formatting in ways that create accessibility problems for some readers. The blank-lines-between-items approach:
* Apple

* Banana

* Cherry
actually creates three separate, single-item lists in the HTML. The effect is visible but not really disruptive on most screens, but it gets read by screen readers as something like this: "A list of one item. 'Apple'. A list of one item. 'Banana'. A list of one item. 'Cherry'."
If it's written correctly (that is, no blank lines between items, in exactly the format you'd use for a numbered list), then the result is one three-item list (and read as "A list of three items: Apple, Banana, Cherry"). I've been correcting these when I see them, but a systematic search might be very useful -- and there are very likely thousands of instances of this formatting error. WhatamIdoing (talk) 23:26, 9 January 2010 (UTC)
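A systematic fix for the blank-line problem could look roughly like the Python sketch below. The function name `join_split_lists` and the pattern are assumptions for illustration; the idea is to remove blank lines that sit between two consecutive list items (lines starting with * or #), so that the items end up in a single list.

```python
import re

# A list item line, then one or more blank lines, then another list item.
# The lookahead leaves the next item in place so it can match in turn.
SPLIT_LIST = re.compile(
    r"^([*#]+[^\n]*)\n(?:[ \t]*\n)+(?=[*#])",
    re.MULTILINE,
)


def join_split_lists(wikitext):
    """Remove blank lines between consecutive list items."""
    return SPLIT_LIST.sub(r"\1\n", wikitext)


if __name__ == "__main__":
    broken = "* Apple\n\n* Banana\n\n* Cherry\n"
    print(join_split_lists(broken))
```

Occasionally two adjacent lists are meant to be separate, so this kind of change is probably better flagged for a human than applied blindly.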
Images at end
Tag articles with images after the external links section, such as in California Limited. Smallman12q (talk) 15:56, 15 January 2010 (UTC)
Space before ref tag
Link to issue related to this project: Wikipedia:Content noticeboard#User:Stemonitis and space in front of ref tag. Comments are welcomed. --Snek01 (talk) 11:16, 10 February 2010 (UTC)
See also, main, redirect self
I'd like to suggest a check for {{see also| {{main article| and redirects that point to themselves. Smallman12q (talk) 15:38, 11 January 2010 (UTC)
- This is actually useful when the main article link has an anchor that points to another section in the main article. twilsonb (talk) 00:19, 27 June 2010 (UTC)
WikiCleaner 0.99
Hi,
I have done some work on my own tool, WikiCleaner, so that it can help fix errors detected by this project. If you want to use it, [http://site4145.mutu.sivit.org/WikiCleaner/WikiCleaner.jnlp install it] and run it:
- Log in (you can try in demo mode, but you can't save pages)
- Click on the "Project Check Wikipedia" button
- The list of errors is retrieved from the toolserver using the new interface provided
- You can then choose the type of error you are interested in fixing and load a page by double-clicking on it
- The page is analyzed and the errors found are displayed in red
- You can fix them in the window (for some errors, right clicking on the red text gives suggestions on how to fix)
- The "Next Occurrence" button scrolls the window to the next error
- The "Validate" button reanalyzes the text to see what errors are left
- The "Send" button updates the page on Wikipedia and marks the errors as fixed on the toolserver
The development is not complete, but it's already fully functional for some error types (n°2, 8, 10, 19, 26, 28, 32, 34, 38, 39, 43, 46, 47, 48, 50, 64, 80, 81). I'm really interested in comments about the following points:
- First, ergonomics of the tool: ideas for improvement, ...
- Second, for the errors that are currently supported :
- Case A: situations where the tool doesn't find the error in the page
- Case B: situations where the tool reports an error which is a false positive
- Case C: what suggestions could be made by the tool for fixing some errors
- Third, what are the other error types that you would like to see in the tool
Thanks, --NicoV (talk) 18:44, 1 April 2010 (UTC)
- There was a bug in the previous version : the Check Wiki interface was available only if the Experimental features were enabled in the Options. It's fixed in the new version. --NicoV
Blank the 2010-03-07 Content
The outdated errors from 2010-03-07 are confusing to a new arrival at this page. It looks like those errors are current, and it takes a while to find this current page. Should we blank the old content on Wikipedia:WikiProject Check Wikipedia and http://toolserver.org/~sk/checkwiki/enwiki/enwiki_output_for_wikipedia.html? twilsonb (talk) 00:24, 27 June 2010 (UTC)
- I dunno.. the old list may be deprecated.. but whoever blanked all the project information, categories, and participants list wasn't thinking.. -- Ϫ 07:31, 21 August 2010 (UTC)
Manually?
I haven't done much work with this project. Is it realistic or useful to work on these issues manually? I have auto-ed installed, but that's the only tool. Will it still make sense to work on these issues like that? 69.142.154.10 (talk) 10:21, 19 July 2010 (UTC)
Also, why isn't Auto-ed a bot, automatically fixing the Auto-ed indicated errors on Check? 69.142.154.10 (talk) 10:21, 19 July 2010 (UTC)
non-terminal newline within template parameters which compose a link
The citation displayed incorrectly before my edit here. Well-formed input may resemble the following:
*{{cite book | author = Goldstein, Emmanuel | publisher = The Brotherhood | year = 1984 | title = The Theory and Practice of Oligarchical Collectivism | url = http://books.google.com/books?id=w-rb62wiFAwC&pg=PA230 }}
A single newline is usually irrelevant because it is rendered as a space (' ') in most cases. However, in parameters which become part of a link, e.g. [{{{url}}} {{{title}}}], the '\n' first will cause the link to break. The brackets become unmatched and the bare url becomes visible. If (as in this example) the template is part of a list item (<li>), the same line-break also will break that list item. Thirdly it will break the list (<ul>) itself and begin a new paragraph (<p>) because the character following '\n' is something other than an asterisk '*'. Last of all the binding of the italic markup ('') is reversed within said paragraph.
Here is the same input with a carriage-return added:
*{{cite book | author = Goldstein, Emmanuel | publisher = The Brotherhood | year = 1984 | title = The Theory and Practice of Oligarchical Collectivism | url = http://books.google.com/books?id=w-rb62wiFAwC&pg=PA230 }}
Badly-formed input is easy enough to fix given a list of template-name and parameter-name permutations for which it creates a problem. Grepping the template name-space for various instances of /\[+\{\{+/ might be a good start (keeping in mind that some parameters are actually forwarded to and from other templates as in this case), but not sufficient.
Or we could derive a general rule for identifying insignificant single newlines in wiki-text and replacing them by a space. This would be based on the characters adjacent. Something like .replace(/(\w)\n(\w)/g, "$1 $2") (anywhere outside <pre> tags) would be “safe”, but not sufficient. ―cobaltcigs 21:42, 2 September 2010 (UTC)
- rev 7168: I have added logic to AWB to convert newlines to spaces in citation |title= fields when |url= is present. That should be a good start. Rjwilmsi 10:19, 23 September 2010 (UTC)
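For illustration, the general "insignificant single newline" rule suggested above could be sketched as follows. This is not AWB's logic: the function name is hypothetical, only `<pre>` blocks are protected, and the pattern uses a lookahead variant of the regex quoted above so that consecutive one-character lines are also joined.

```javascript
// Hypothetical sketch: replace a single newline between two word characters
// with a space, leaving the contents of <pre>...</pre> blocks untouched.
function collapseSingleNewlines(wikitext) {
  // Splitting on a capturing group keeps the <pre> blocks at odd indices.
  const parts = wikitext.split(/(<pre[\s\S]*?<\/pre>)/i);
  return parts
    .map((part, index) =>
      index % 2 === 1 ? part : part.replace(/(\w)\n(?=\w)/g, "$1 ")
    )
    .join("");
}
```

Because the newline must have a word character on both sides, paragraph breaks (two newlines) and list markers at the start of the next line are left alone. As noted above, that makes the rule cautious rather than complete: a newline next to punctuation or a pipe is not touched.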
Check 48 and check 64
I'm going to run a bot in order to speedily fix Check #48 and reduce some redundancies. Please take a look at Wikipedia:Bots/Requests for approval/FrescoBot 7, your comments are welcome. -- Basilicofresco (msg) 10:36, 20 September 2010 (UTC)
Selflinks in taxoboxes
Some editors have been using selflinks to produce the boldface in Taxoboxes, which isn't normally a problem, since it doesn't function as a link, and so appears identical to using boldface directly. However, User:Yobot has been "correcting" these selflinks, using WP:CHECKWIKI as an excuse. I have tried to stop Yobot, with limited success, with the bot's owner insisting I bring it up here. Selflinks in taxoboxes may be replaced with boldface markup, perhaps, but must not be simply removed as this damages the appearance of the page. Leaving them in does no harm, but taking them out causes damage. --Stemonitis (talk) 04:41, 28 September 2010 (UTC)
- I can ask for a change in AWB's code to handle this case. -- Magioladitis (talk) 19:04, 29 September 2010 (UTC)
Yobot
Can someone point yobot at Category:ÖBB and fix them all? (See Wikipedia:Ani#Please_block_Rich_Farmbrough_-_thousands_of_unnecessary_capitalization_changes for a brief explanation.) At the same time maybe yobot could be programmed to branch and check affected cats to avoid the problem of erratic categorisation - actually I think that might create a hard computational problem - perhaps there's another way.
Can I assume that yobot will be altering all pages, not just recent ones as it seems to be doing now? 94.72.245.124 (talk) 18:42, 29 September 2010 (UTC)
- I am right now. In fact I finished that before stopping my bot with message on its talk page! -- Magioladitis (talk) 18:47, 29 September 2010 (UTC)
- Sooorrry, ouch. Forgot that messages on bot talk pages often stop bots. Sf5xeplus (talk) 18:55, 29 September 2010 (UTC)
I don't get it
I got here from the WikiCleaner page. I figured I might be able to help with this project but I can't figure out what it actually does. Sure, the goals say things about fixing syntax but what do those changes entail? Following the link to Wikipedia:WikiProject Wiki Syntax gives me some idea but not a complete one. The goals description says that this project was "inspired by" Wiki Syntax but doesn't really say what parts of Wiki Syntax were kept and such. I think it would be beneficial to give an example of something that is fixed by this project. Dismas|(talk) 06:48, 13 October 2010 (UTC)
- Click on the "New interface" link, and you'll end up here. From there, you can find lists by priority of the various errors that are scanned for: click on one that looks interesting and that has a number in the "To-do" column, and you'll get a page describing the error and listing articles that have the error. Pick whichever ones you'd like to clean up. Does that help? --Auntof6 (talk) 08:21, 13 October 2010 (UTC)
- Perhaps it would be good to provide a simple list of tasks. I'm happy with what we've done at WP:WikiProject Medicine#How_you_can_help. It might be possible to use it as a model here. WhatamIdoing (talk) 22:56, 15 December 2010 (UTC)
Country demonym navbox generators
If someone knowledgeable about template coding could help fix the country demonym navbox generators, it would be greatly appreciated. At the moment, the existing country demonym navbox generators are Template:African topic, Template:Asian topic, Template:European topic, and Template:South American topic, but Template:North American topic and Template:Oceanian topic should be created as well. Template:Asian topic is currently up for deletion because it duplicates the functionality of Template:Asia topic even though that is not what it is supposed to do. The country demonym navbox generators are supposed to be the demonym counterparts of the country name navbox generators (i.e. Template:Africa topic, Template:Asia topic, Template:North America topic, Template:South America topic, Template:Oceania topic, and Template:Europe topic). While the country name navbox generators create navboxes that create strings like "History of Canada" and "Culture of Iraq", the country demonym navbox generators are supposed to create navboxes that create strings like "Canadian literature" and "Iraqi cuisine". Nonetheless, the country demonym navbox generators have never worked because no one knowledgeable about the code has ever fixed them. These navbox generators have the potential to make a substantial improvement to navigation between related articles on international topics. Neelix (talk) 15:59, 13 December 2010 (UTC)
I forgot to change the name of the source at the start of the article (the Flight International source is not valuable, because Air Armenia itself refuses to say if their An-32 is really operational or not...). I will correct this error. --78.126.134.198 (talk) 14:59, 21 March 2011 (UTC)
I tried, but I can't change the source: why is it impossible to insert <ref name="ARR">Air Armenia (2011)</ref> (that's the updated source) instead of <ref name="FI">[[Flight International]] 27 March 2007</ref>, and <ref name="ARR"/> instead of <ref name="FI"/>? --78.126.134.198 (talk) 15:18, 21 March 2011 (UTC)
Request checks
Could the following checks be considered for addition?
- Page [[XYZ]] contains [[Category:XYZ]] where the sort key on that category is not a blank (in other words, it should be [[Category:XYZ| ]]).
- Page [[List of foo bar baz]] contains no default sort and categories without a sort key. It should have sort key "foo bar baz", I think.
Naraht (talk) 20:44, 5 April 2011 (UTC)
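As a rough illustration of the first requested check, here is a minimal sketch. The function names are hypothetical, and it assumes the page title and the category name are spelled identically in the wikitext (no underscore or first-letter-case variants).

```javascript
// Escape a page title so it can be embedded in a regular expression.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical check: true when the page is in the category of the same
// name but the sort key is anything other than a single blank.
function hasNonBlankEponymousSortKey(title, wikitext) {
  const pattern = new RegExp(
    "\\[\\[\\s*[Cc]ategory\\s*:\\s*" + escapeRegExp(title) +
      "\\s*(\\|([^\\]]*))?\\]\\]"
  );
  const match = pattern.exec(wikitext);
  if (!match) {
    return false; // the page is not in its eponymous category
  }
  const sortKey = match[2]; // undefined when no pipe is present
  return sortKey !== " ";
}
```

The second check (list pages without a default sort) would need the list of categories and a DEFAULTSORT lookup, so it is not covered here.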
Headline checks cleared out
Snotbot has cleared out many of the backlogs in the checks for headline problems (id's 7, 19, 25, and 83). This was done under task 5 for the bot. I have marked them all as done on the toolserver site. Does anyone know if the articles marked as done will be rechecked at some point to see if the changes have been reverted? Or should the bot manually go through the articles again at some point to see if the changes have been reverted and the problems still exist? —SW— comment 18:04, 27 April 2011 (UTC)
- Also, just a minor issue, but the "Done" functionality on the toolserver site doesn't seem to work. There were 20,000+ articles in some of those lists yesterday, and after I marked them all as done, it still shows Done:0. —SW— chat 18:06, 27 April 2011 (UTC)
Two priorities changed (err 81 and 84)
I have changed the priorities of errors #81 (Reference duplication) and #84 (Section without content) from 1 (high) to 2 (middle). (diff) --Ben Ben (talk) 17:10, 12 June 2011 (UTC)
Why are there so few high-priority errors lately?
The number of articles listed under high priority has plummeted in the last couple of weeks or so. Are there bots handling more of these now, or what? Just curious. --Auntof6 (talk) 21:17, 21 June 2011 (UTC)
- Would the previous section provide any clues? –Drilnoth (T • C • L) 00:13, 22 June 2011 (UTC)
- No, because it's most of the high-priority errors, not just those two. --Auntof6 (talk) 00:20, 22 June 2011 (UTC)
- Okay, just a thought. I haven't looked at CHECKWIKI recently, I'm just watching the page :) –Drilnoth (T • C • L) 00:29, 22 June 2011 (UTC)
- Last dump 2011-01-15. The daily updates are only the results of some sort of "mini-scans".--Ben Ben (talk) 09:42, 9 July 2011 (UTC)
From prio 2 to prio 1.
Changed err_010 (Square brackets not correct end), err_046 (Square brackets not correct begin) and err_080 (External link with line break) from prio 2 to prio 1.--Ben Ben (talk) 09:36, 9 July 2011 (UTC)
TOC check
There is a list of longer articles where it could be worth checking the structure, it's at Wikipedia:WikiProject Check Wikipedia/AWB. -- User:Docu
- If you're interested, WPCleaner now has a function for editing TOC on articles, see the button in Wikipedia:WPCleaner/Page analysis. --NicoV (talk) 06:48, 27 August 2012 (UTC)
Improper removal of section titles by Yobot
Don't know if this is the correct place, but I can't find any particular list of "false positives" here... Magioladitis suggested I should raise the problem there. The following is copied from the Yobot talk page: --Matthiaspaul (talk) 11:51, 23 September 2011 (UTC)
In the article List of MS-DOS commands Yobot removed a section title named ":" and changed it to "", so that it no longer showed up in the TOC and was not clickable. This has been reverted but how can we ensure that this won't happen again? --Matthiaspaul (talk) 07:09, 23 September 2011 (UTC)
- WP:CHECKWIKI has a list of false positives. Please contact them as a first step. I'll see what I can do in AWB's too. -- Magioladitis (talk) 09:14, 23 September 2011 (UTC)
- Just for the records, I have changed the title to {{Template:Not a typo|:}} to try and keep the header from being changed again. BTW: {{Template:Not a typo|:}} did not work! --Matthiaspaul (talk) 14:32, 24 September 2011 (UTC)
Removing small syntax
Why is the removal of <small> being included in changes here? The bot just went through and removed it from the notes of several citation templates I was using it in. --Crossmr (talk) 13:55, 14 December 2011 (UTC)
Automated correction of Check Wikipedia ID 2: "Article with false <br/>"
I've been working on an AutoEd module that automatically corrects invalid line breaks. Feel free to use it if you're interested. The bare installation of the module requires the following code to be added to Special:MyPage/common.js.
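// Load the AutoEd core and the custom <br> corrector module.
importScript('Wikipedia:AutoEd/core.js');
importScript('User:Michael Anon/custombrcorrector.js');

// AutoEd calls this function to run the selected modules on the edit box.
function autoEdFunctions() {
    var txt = document.editform.wpTextbox1;
    txt.value = autoEdCustombrCorrector(txt.value);
}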
(This note was added based on a discussion with Magioladitis located at User talk:Magioladitis#.22Article with false .3Cbr.2F.3E.22: checkwiki error fix 2 and.2For general fixes using AWB .288235.29.) Michael Anon 17:50, 8 August 2012 (UTC)
New error fix suggestion: Unneeded prefix
Spot unnecessary template prefix in mainspace. Code must be simplified to: {{Template:foo}}→{{foo}} -- Magioladitis (talk) 11:16, 10 August 2012 (UTC)
- If you're interested, WPCleaner already does this if you activate error n°502 in the Check Wiki configuration, see documentation (example of configuration on frwiki). I've used errors numbers above 500 to add errors that are not detected by sk script. --NicoV (talk) 06:39, 27 August 2012 (UTC)
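A minimal sketch of the suggested simplification, for illustration only. The function name is hypothetical and this is not how AWB or WPCleaner implement it; it only knows the English "Template:" prefix and does not special-case triple-brace parameters.

```javascript
// Hypothetical sketch: turn {{Template:foo}} into {{foo}}.
function stripTemplatePrefix(wikitext) {
  return wikitext.replace(/\{\{\s*Template\s*:\s*/gi, "{{");
}
```

Links such as [[Template:foo]] are left alone because the pattern requires the two opening braces.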
<h1>...<h6>
It seems someone has decided these are illegal now (Error #49). They are still in the Wiki-markup documentation. How do we do list pages now without little blue "[edit]"'s everywhere? -- :- ) Don 01:20, 10 October 2012 (UTC)
I checked the code and it appears that only h2 is affected. I can understand the reasoning, I was afraid all were affected. Having the use of h3 should be sufficient in most cases. I will update the documentation where I find it. -- :- ) Don 04:43, 11 October 2012 (UTC)
Date & list templates
These outstanding BOTREQ may be of interest to this project:
If so, I'm happy to advise further. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:26, 8 November 2012 (UTC)
Other detections
Hi,
If you're interested in other detections than the existing ones currently provided by Stephan Kuhn, I've added a few other detections in WPCleaner (see Wikipedia:WPCleaner/Configuration/Help#Check Wiki configuration) :
- Error 501 : Spelling and typography, based on AWB typo rules
- Error 502 : Unnecessary Template: namespace
- Error 503 : Internal link in title
- Error 504 : Reference in title
- Error 505 : Image without alternative description
- Error 506 : Reference with a numeric name
- Error 507 : Gallery without caption
- Error 508 : Missing template
WPCleaner doesn't create a list for those errors, like the ones you have on the toolserver, but can detect these errors when editing pages with it. If you're interested, you just have to activate those errors in the translation file. --NicoV (talk) 15:18, 19 November 2012 (UTC)
- Great. Thanks! -- Magioladitis (talk) 20:05, 19 November 2012 (UTC)
List of errors fixed by AWB
For those interested, here is a list of errors fixed by AutoWikiBrowser: User:Magioladitis/AWB and CHECKWIKI. -- Magioladitis (talk) 22:41, 28 December 2012 (UTC)
Failure
Hi,
Checkwiki screwed up here. Don't know if there are more like that.
— kwami (talk) 02:49, 4 January 2013 (UTC)
- Um, CheckWiki didn't fail. Checkwiki only detects errors, not fixes them. In the edit above, it was AWB that "fixed the error". In both edits, there is something wrong going on as all I see is blank boxes in yours and mixed characters in AWB's edit. The unicode you are trying to use doesn't show up in most browsers and you have to use the template:math Bgwhite (talk) 03:30, 4 January 2013 (UTC)
Deprecated font attribute
Hi all, Bgwhite made this edit today, but using <span> doesn't work, as you'll see in the diff. I would like the table header to have the normal font size, with the table at 80%. Is that possible without using <font>? Thanks, Ed [talk] [majestic titan] 22:14, 17 January 2013 (UTC)
- Use the <big> tag in this case and it will work. I'm not sure why it doesn't work in wikitable caption lines. Very strange.Bgwhite (talk) 22:27, 17 January 2013 (UTC)
- I thought the same. Thanks so much! Ed [talk] [majestic titan] 22:29, 17 January 2013 (UTC)
Using template xt and !xt in report: ISBN
Are there any objections or consequences to introducing the xt and !xt templates into the report: ISBN subpage, so it looks like this? Mr Stephen (talk) 16:42, 2 February 2013 (UTC)
- Not at all. Very good idea. -- Magioladitis (talk) 18:17, 2 February 2013 (UTC)
- I've gone for that. If anyone sees any problems feel free to revert. Mr Stephen (talk) 22:05, 3 February 2013 (UTC)
Templates that ought to be wikilinks
I just fixed a problem that I don't recall seeing for a long while: someone had used curly braces when they should have used square brackets to create a link. Is there an easy way to search for links in the mainspace to non-existent templates? WhatamIdoing (talk) 18:10, 5 March 2013 (UTC)
- I'm personally unaware of anything. The closest I can think of is at Wikipedia:Database reports under the template section. Maybe MZMcBride could run something? Bgwhite (talk) 18:37, 6 March 2013 (UTC)
Extra errors detected by WPCleaner
Hi, if you're interested, Wikipedia:WPCleaner can detect 11 more types of error. The list of error types and how to activate their detection is described in Wikipedia:WPCleaner/Configuration/Help#Check Wiki configuration : basically, you have to configure errors from 501 to 511 as you would do for other errors in Wikipedia:WikiProject Check Wikipedia/Translation. You have an example in French configuration. --NicoV (Talk on frwiki) 17:03, 15 April 2013 (UTC)
False positive for #46
Hi Bgwhite, someone reported on frwiki that lately there have been false positives for #46 when an image caption contains a link. Today's examples:
- fr:AN/APG-76, radar Norden [[AN/APG-76#AN/APQ-148|AN/APQ-148]]: seems fine in the article: [[File:AN-APQ-148 Radar, Norden, 1972 - National Electronics Museum - DSC00068.JPG|thumb|280px|Un radar Norden [[AN/APG-76#AN/APQ-148|AN/APQ-148]].]]
- fr:Bétail, of sheep.jpg|thumb|Troupeau de [[mouton]]: seems fine in the article: [[Fichier:Flock of sheep.jpg|thumb|Troupeau de [[mouton]]s]]
--NicoV (Talk on frwiki) 08:56, 9 February 2016 (UTC)
- NicoV It's not a false positive, but it is giving the wrong location for the error. Both articles had the broken bracket fixed on the 9th. Bgwhite (talk) 22:58, 11 February 2016 (UTC)
- Bgwhite Yes, but sometimes it seems to be the opposite error that should be reported: currently, fr:Gandhara is reported in #46 with the notice Gandhara Guimet 181171.jpg|thumb|[[Bodhisattva]] while the actual problem is a #10 for [[shivaïsme]. That was also the case for fr:AN/APG-76. --NicoV (Talk on frwiki) 07:44, 12 February 2016 (UTC)
False positives for #3
@Bgwhite: On frwiki, there are some false positives for #3 due to:
- a whitespace in the <references>...</references> tag, like here or here
- a carriage return in the template Références like here
Could this be prevented from being detected ? --NicoV (Talk on frwiki) 20:35, 9 March 2016 (UTC)
- NicoV
- Whitespace: The regex is <references[ ]?\/?>. Change it to <references(\s*\/)?>?
- Carriage return: I only slap {{ onto the front of the regex. You'll need to add a carriage return to your regex.
- Bgwhite (talk) 21:33, 9 March 2016 (UTC)
- @Bgwhite: For the whitespace, yes maybe. For the carriage return, I didn't know it was also a regex for #3, I thought it was only for #78: are you sure? --NicoV (Talk on frwiki) 08:40, 10 March 2016 (UTC)
- NicoV Never mind. I was thinking 78. It's amazing I think at all. Will look more tomorrow. Bgwhite (talk) 08:45, 10 March 2016 (UTC)
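To see what the two patterns quoted above actually accept, a small comparison can help. This only exercises the regexes themselves in JavaScript; how CheckWiki applies them internally is not shown in this discussion, and the helper name is made up.

```javascript
// The two patterns from the discussion, copied as written.
const CURRENT_PATTERN = /<references[ ]?\/?>/;
const PROPOSED_PATTERN = /<references(\s*\/)?>/;

// Hypothetical helper: report which pattern matches a given tag string.
function comparePatterns(tag) {
  return {
    current: CURRENT_PATTERN.test(tag),
    proposed: PROPOSED_PATTERN.test(tag),
  };
}
```

With these definitions, a tag with several spaces before the slash is only matched by the proposed pattern, while a tag with a space and no slash ("<references >") is only matched by the current one, so the change is not a strict superset.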
Commons
@Bgwhite: Could you help set up CHECKWIKI for Commons, so that it will list errors there, which doesn't seem to be working? (t) Josve05a (c) 07:42, 24 May 2016 (UTC)
"Tags without content" screws up a format hack
I wanted to link a name that I inserted in brackets because it was simply "she" in the original quote, i.e. [[[Tammy Baldwin]]]. That displays without any formatting, so I put a blank span in the middle. Your bot just took it out. [9]
What worries me is that I've done this a LOT over time - not always for this reason, but it's amazing how often Wiki syntax fouls up some text with a single quote mark or some other feature for which this has been a workaround.
PLEASE stop removing empty tags and review the bot's edits. Wnt (talk) 10:44, 26 June 2016 (UTC)
- It's logical to use nowiki instead of span. It's really non-obvious to understand what your span means. I think you should form your code as something like that: <nowiki>[</nowiki>[[Tammy Baldwin]]<nowiki>]</nowiki>. In ruwiki, we usually use self-closing nowiki tags for such purposes as this one, but seems like they're going to be deprecated. :( Facenapalm (talk) 11:30, 26 June 2016 (UTC)
- UPD: "but seems like they're going to be deprecated" - hm, seems like not. Then I would write this: [<nowiki />[[Tammy Baldwin]]<nowiki />]. But template is even better, yes. Facenapalm (talk) 11:42, 26 June 2016 (UTC)
Wnt I used Bracket and fixed it for you. -- Magioladitis (talk) 11:37, 26 June 2016 (UTC)
- @Facenapalm and Magioladitis: Sometimes I've used nowiki tags, but I didn't care that much one way or the other and I wasn't sure the bot wouldn't come after those. The Bracket template adds [ to the text (NOTE: I just tried that with nowiki and it didn't work! It just displays [! And [ html comments also do not work for this sequence!) - I'd actually prefer to do that than to add the confusion of a template which you don't know what it is. I think an HTML comment would work also.
- But none of this really matters. My concern isn't trying to write this one sentence - my concern is that the bot is out there churning away, screwing up format kludges (good or bad) that will be very confusing for editors who don't know Wiki/HTML to figure out. It's the changes you don't know about that you need to be concerned about. Some of this stuff could be buried deep in tables and other arcane syntax. If the bot is going to take out empty spans, it should replace them with whatever you would tolerate, like nowiki or HTML comments or whatever, so that the text displays the same way. Wnt (talk) 11:48, 26 June 2016 (UTC)
- IMHO, using an empty span is a dirty hack to trick the parser. I'm not sure I'd understand what it means even if I were editing the code manually. So it's OK that the bot broke this rare case. Usually empty spans are just empty spans, and they shouldn't be replaced by something like <nowiki />. Facenapalm (talk) 11:57, 26 June 2016 (UTC)
This is the reason that the templates were created. They make wikicode clearer and no hacks are needed. -- Magioladitis (talk) 12:14, 26 June 2016 (UTC)
- So what is the template for writing [ without it coming out as a bracket? How do I look it up? (Or them up ... I have a feeling there are probably dozens, each used by one or two editors and unknown to the rest of us) Wnt (talk) 14:16, 26 June 2016 (UTC)
- A few things...
- Facenapalm and others... Self-closing HTML tags are being deprecated because they aren't in the HTML5 spec and they are removing them from the MediaWiki parser. <nowiki /> is not HTML, so it is not being deprecated. <br /> is still in HTML5, but is not mentioned in 5.1 that I could find. They are so common, who knows when it will die.
- Wnt <span></span> is bad HTML and should never be used, period.
- That leaves three options:
- Of the three options, #3 is probably the worst for editors. Not many people know what that means, but it is in common use. Templates are nice because people can look up the doc page for them. I personally use nowiki tags and it is the most common in use. Use whatever option you want. Bgwhite (talk) 05:16, 27 June 2016 (UTC)
- The span element is still valid in HTML5.1, but "doesn't mean anything on its own" (cit.), and it's generally "used to color a part of a text" (cit.); so I agree that using the nowiki tag or the brackets templates in the above problem is preferable. Regarding the br element, nothing changed between HTML5 and 5.1, except that "Content model" has been renamed "Nothing" instead of "Empty". The only correct way to write it is <br>. The fact that the old XHTML <br /> is still in use is because Tidy is outdated; fortunately they are working on it (they mention the Sanitizer in the comments). --79.18.67.110 (talk) 14:31, 27 June 2016 (UTC) PS: I've run a little test and the W3C Validator doesn't see an empty span as an error; also, it has been used as a hack for some other reason (Fahrner Image Replacement#Implementations); so, it's just ugly, but harmless. --79.18.67.110 (talk) 15:43, 27 June 2016 (UTC)
#54
Would you please add {{Break}} and these redirects to list #54? Yamaha5 (talk) 08:48, 7 July 2016 (UTC)
False positives for #105
Hi Bgwhite, CW reports 2 false positives for #105 on frwiki, fr:Tournoi des candidats de Zurich 1953 and fr:Championnat du monde d'échecs 1963, both for the same reason, a table cell filled with several equal signs. Could you ignore those cases as I did with WPC : if the line starts with a pipe, then do not report it as an error as it is most probably a table cell. --NicoV (Talk on frwiki) 15:31, 7 July 2016 (UTC)
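The suggested exception could look roughly like this. It is a sketch under an assumption: that #105 flags lines which end with two or more equals signs but do not begin with one. The function name is hypothetical and this is neither CheckWiki's nor WPCleaner's actual code.

```javascript
// Hypothetical sketch: decide whether a line looks like a heading that is
// missing its leading "=" signs, skipping lines that start with a pipe
// because those are most probably table cells.
function looksLikeBrokenHeading(line) {
  const trimmed = line.trim();
  if (trimmed.startsWith("|")) {
    return false; // table cell, e.g. a cell filled with equals signs
  }
  return /==+$/.test(trimmed) && !trimmed.startsWith("=");
}
```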
#90 and #91 for fa.wiki
Is it possible to deactivate #90 and #91 for fa.wiki (the part which shows an error for using other wikis as references)? Because of the lack of reliable online Farsi sources, at fa.wikipedia we have a consensus to use en.wikipedia and other big wikis as sources for minor articles, so most of the #90 and #91 reports for fa.wiki shouldn't be fixed. Yamaha5 (talk) 19:45, 3 July 2016 (UTC)
- Yamaha5 Unfortunately, no. Keeping #90 on should be fine, but you will have to turn #91 off. Bgwhite (talk) 06:45, 4 July 2016 (UTC)
- How can I turn off #91? Can we control the lists? Or do you mean we should fix the articles on fa.wiki? Yamaha5 (talk) 08:02, 4 July 2016 (UTC)
- Yamaha5 You can turn off #91. You've edited the list before. I generally leave the lists to be maintained by whoever wants to. You know Farsi, I don't, so edit it to your heart's content.— Preceding unsigned comment added by Bgwhite (talk • contribs)
- I believe the stat page doesn't use that page, because as you see we translated many of the labels but we can't see them here. For example, top_priority_script was translated at fa:ویکی‌پدیا:ویکی‌پروژه_تصحیح_ویکی‌پدیا/ترجمه but the fawiki_checkwiki page still shows "high priority". Also, how can I disable #91? Show me on the English page (the line which I should remove). (I found it) Yamaha5 (talk) 08:43, 4 July 2016 (UTC)
- Yamaha5 I think no _script variables are taken into account, you should use _fawiki variables. --NicoV (Talk on frwiki) 16:59, 8 July 2016 (UTC)
- NicoV thanks. Yamaha5 (talk) 19:34, 8 July 2016 (UTC)
- I added two patches here; please merge them to use fawiki's translation and have better support. Yamaha5 (talk) 12:43, 4 July 2016 (UTC)
Id 85 bug
Hello. Id 85 returns a false positive on empty tags (as in "<center> </center>") if there is code inside: "<center> <syntaxhighlight ... </syntaxhighlight> </center>" IKhitron (talk) 12:19, 13 April 2016 (UTC)
- IKhitron The first thing CheckWiki does is to remove various tags and their content, i.e. <syntaxhighlight>, <nowiki>, <pre>... These tags often have bad wikicode or wikicode symbols that aren't wikicode. There's nothing that can be done with the false-positive blank center tags. However, as <center> is obsolete HTML, it's best to replace the tag. Bgwhite (talk) 19:47, 18 April 2016 (UTC)
- Thank you, Bgwhite. But this is a special-case id: it checks for empty text. Can't you replace the tags with something neutral, such as a "qwerty" string, instead of removing them, so that it works properly? IKhitron (talk) 19:57, 18 April 2016 (UTC)
- Well, Bgwhite, I rephrased the template, and the new run did not catch it. But I still do not know, what was the problem. IKhitron (talk) 18:40, 26 July 2016 (UTC)
- IKhitron Wrong discussion. Do you mean #3 down below? Bgwhite (talk) 23:37, 26 July 2016 (UTC)
- Sorry. Bgwhite. It's ##60 possible false positive
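The placeholder idea raised in this thread can be sketched roughly as follows. This is only an illustration in Python, not CheckWiki's actual Perl code: the function names and the `has_empty_center` stand-in for the #85 check are hypothetical, and the tag list is limited to the three tags named above.

```python
import re

# Tags whose content is ignored by the checker (only the three named in this thread).
IGNORED_TAGS = ("syntaxhighlight", "nowiki", "pre")


def blank_ignored_tags(text, placeholder="qwerty"):
    """Replace each ignored tag and its content with a neutral placeholder."""
    for tag in IGNORED_TAGS:
        pattern = re.compile(
            rf"<{tag}\b[^>]*>.*?</{tag}\s*>", re.DOTALL | re.IGNORECASE
        )
        text = pattern.sub(placeholder, text)
    return text


def has_empty_center(text):
    """Hypothetical stand-in for the #85 check: a center tag holding only whitespace."""
    return re.search(r"<center>\s*</center>", text, re.IGNORECASE) is not None
```

With an empty placeholder (plain removal) the center tag looks empty and is reported; with a non-empty placeholder it is not, which is the behaviour IKhitron asked about.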
Self-closing div and span tags to be deprecated
The latest Tech News (dated today) has this notice:
- Future changes
- Using self-closing tags like <div/> and <span/> to mean <div></div> and <span></span> will not work in the future. Templates and pages that use these tags should be fixed. When Phabricator ticket T134423 is fixed these tags will parse as <div> and <span> instead. This is normal in HTML5. [10]
Should a check for these tags be added to Checkwiki? – Jonesey95 (talk) 21:13, 16 May 2016 (UTC)
- Jonesey95 I've already run a list for them. There's a total of 72 in articles. There are <span /> tags in template space and I left a message on Frietjes' talk page about these. I'd rather not touch templates. I'll be adding this to error #2. Bgwhite (talk) 21:22, 16 May 2016 (UTC)
- Thanks. I don't mind editing templates, even if it means the occasional run-in with editors who either can't read or refuse to read and then blame me for their shortcomings. I know that you know what that's like. I'll head over to F's talk page for the list. – Jonesey95 (talk) 21:34, 16 May 2016 (UTC)
- Sadly, I've turned off error #2, because there's no consensus on the <br clear="all" /> -> template replacement in ruwiki. Error #2 is becoming more and more sophisticated; maybe it's time to split it into several errors? Or could you please disable finding br tags with the "clear" attribute in ruwiki, if it's not very difficult? Facenapalm (talk) 07:57, 18 May 2016 (UTC)
It appears that the check for error #2 is not catching some cases of errors that cause pages to be placed in the new Category:Pages using invalid self-closed HTML tags. Examples:
- 2013–14 Kitchee SC season: <div id="PLAYERS"/>
- Georges Cuvier: </blockquote/>
- Calgary City Council: <span id="Ward 1"/>
- Harold Arlen: <p/>
Is error #2 supposed to find these? Can it be modified to do so? – Jonesey95 (talk) 01:45, 17 July 2016 (UTC)
- Jonesey95 @NicoV: That's a lot of articles in that category. One of the articles I looked at should be caught, but isn't.
- I'm currently only catching cases that don't have other attributes, such as id=.
- I'm not looking for any cases of some others, such as <p>.
- I'll work on adding them. I'm behind on coding things up due to trying to fix articles on the daily CheckWiki scans. Bgwhite (talk) 05:03, 18 July 2016 (UTC)
- The category is new, and it is filling slowly as the job queue runs through the whole population of pages. Some gnomes have been busy cleaning out the category, including fixing templates that have zillions of transclusions, but the category population has stayed relatively constant at a few thousand as new pages are null-edited by the job queue. At this writing, it seems likely that there are 5,000 to 10,000 individual pages left with these errors, not including pages transcluding pages that have errors in them.
- In addition to the above, I have seen <small/>, <center/>, <p "with text" />, and maybe one or two others, as well as all of those tags with both leading and closing slashes in the same tag. – Jonesey95 (talk) 05:47, 18 July 2016 (UTC)
@Bgwhite and Jonesey95: I've started updating WPCleaner to handle some of the tags that trigger the categorization. It's not finished, but you can help me by listing cases I'm currently missing (not a lot of free time to analyze what's missing). --NicoV (Talk on frwiki) 17:10, 18 July 2016 (UTC)
- Is there a list somewhere? In addition to the above tags, I have seen <big/>, <s/>, <del/>, <tr/>, <td/>. – Jonesey95 (talk) 17:15, 18 July 2016 (UTC)
- @Jonesey95: List available in the code. --NicoV (Talk on frwiki) 22:14, 18 July 2016 (UTC)
- I see a list of tags, but interpreting the code is beyond me. It looks like del, td, and tr are missing. Will it find tags formatted like </blockquote/>, with a leading and trailing slash? There are a surprising number of those. – Jonesey95 (talk) 22:51, 18 July 2016 (UTC)
- The link was just for the list of tags, not to analyze the code ;-) The code will find both regular self-closing tags and also incorrect tags with a leading and trailing slash. I've added del, td and tr. If you see other cases, tell me. --NicoV (Talk on frwiki) 06:14, 19 July 2016 (UTC)
- I just found and fixed <code/> on one page. There may be more pages with this tag. – Jonesey95 (talk) 17:13, 19 July 2016 (UTC)
@Bgwhite and Jonesey95: If you're interested, I ran a dump analysis yesterday, the result for #2 is at Wikipedia:CHECKWIKI/WPC 002 dump. --NicoV (Talk on frwiki) 08:29, 21 July 2016 (UTC)
- Excellent. It looks like there might be a couple of false positives on that list, but they are not worth worrying about until the hundreds of real errors are fixed. Good work. – Jonesey95 (talk) 12:58, 21 July 2016 (UTC)
- This one doesn't look like a tag syntax error to me. As far as I know, any amount of white space is valid within a tag:
- 1980 NBL Finals: <br ↵↵/>
- Does WP have its own rules about tags like this? – Jonesey95 (talk) 14:33, 21 July 2016 (UTC)
- I don't know if I should keep detecting this or not: for the moment, carriage returns are considered invalid characters in a tag in WPC. --NicoV (Talk on frwiki) 16:29, 21 July 2016 (UTC)
Here are a few more tags to add to the check: <sup/>, <em/>, <i/>, <th/>, and <rb/> (typo for "br"). – Jonesey95 (talk) 21:28, 25 July 2016 (UTC)
@Magioladitis, Jonesey95, and NicoV: In theory, tomorrow CheckWiki will start to catch the br tags in NicoV's report and all the self-closing tags. It's also catching br tags with carriage returns. Bgwhite (talk) 00:32, 28 July 2016 (UTC)
- Do I look at Wikipedia:CHECKWIKI/WPC 002 dump or somewhere else for the updated list? I fixed a few hundred errors on that page and am looking forward to a refresh of it. I was unable to persuade my computer to run the Java command at the top of the page, so I was unable to refresh it myself. – Jonesey95 (talk) 03:38, 28 July 2016 (UTC)
- Jonesey95 In theory, August's dump will come out in a week or so. Might want to wait till then to see all the new and wonderful errors. I reran Nico's list via Checkwiki. The only errors listed were ones with the <br> tag... assuming I coded it right. Not sure if you or Nico have access to WMFLabs. Java and the dump files are available there. Bgwhite (talk) 04:55, 28 July 2016 (UTC)
- Jonesey95 I have been trying to rerun the dump analysis for the last 2 days, but I'm only spending an hour or so at home once a day (it failed the first time due to an out of memory error, and I don't know what's the status of the second run...). If you want to try it by yourself, the command on fr:Projet:Correction syntaxique/Analyse 002 is probably more explicit than the one displayed on enwiki... I won't be able to handle the August dump analysis, at least not until the 15th.
- Bgwhite I think WMLabs severely limits the amount of memory a process can have, so it's probably a no go for WPC for the dump analysis. --NicoV (Talk on frwiki) 08:47, 28 July 2016 (UTC)
- NicoV, No, they don't severely limit the amount of memory. One does have to specify the max amount of memory one needs. The default is 256MB. I've gone up to 3GB. Bgwhite (talk) 20:53, 28 July 2016 (UTC)
- Jonesey95 I updated the description of the command line to run the dump analysis for enwiki. --NicoV (Talk on frwiki) 12:30, 28 July 2016 (UTC)
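For anyone who wants to spot-check a page by hand, the kinds of tags reported in this thread can be matched with a single regular expression. The following is a minimal Python sketch, assuming only the tags listed above; it is not the code used by CheckWiki (Perl) or WPCleaner (Java), and `find_self_closed` is a hypothetical helper name.

```python
import re

# Non-void tags mentioned in this thread; the real tool lists may differ.
TAGS = ("div", "span", "p", "blockquote", "small", "center", "big", "s",
        "del", "tr", "td", "th", "code", "sup", "em", "i")

# Matches <tag/>, <tag attr="x" />, and the doubly slashed form </tag/>.
SELF_CLOSED = re.compile(
    r"</?(?:%s)\b[^<>]*/>" % "|".join(TAGS),
    re.IGNORECASE,
)


def find_self_closed(text):
    """Return every invalid self-closed tag found in the wikitext."""
    return SELF_CLOSED.findall(text)
```

The pattern also covers tags that carry attributes such as id=, which were the cases being missed. Void elements like <br /> are deliberately left out of the list, so a <br> tag containing line breaks would need a separate check.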
More errors / more bots
If we manage to have more bots running daily we can drastically reduce the time required to fix errors. This means we have more free time to detect more errors and add them to our list. What could these errors be? In an ideal world, we could check all of WP:GENFIXES and see what is worth doing even as a sole task. -- Magioladitis (talk) 09:25, 30 July 2016 (UTC)
Help with translation page
Resolved
Hello. I hope somebody who reads this can find 5 minutes to help me. I'll be very glad if it's possible, even though I know it's not your "duty". I made a lot of changes in our translation page, because most of it was there from the time when checkwiki was a beta on dewiki. But it doesn't work any more! I tried to find some variable without END or some other syntax error, but could not. What could be the problem? Thank you very very much in advance, IKhitron (talk) 11:55, 31 July 2016 (UTC)
- Isn't "description_text_hewiki" the one, that screws up everything? --Edgars2007 (talk/contribs) 13:54, 31 July 2016 (UTC)
- Everything is possible. Why do you think it's there, there is some problem in the description? Thank you very much, IKhitron (talk) 15:09, 31 July 2016 (UTC)
- As I don't know how those translation files are parsed by the Checkwiki system, I'm just guessing. </syntaxhighlight> looked suspicious (and other non-HTML stuff), but I may be wrong. --Edgars2007 (talk/contribs) 16:05, 31 July 2016 (UTC)
- I see. I created this part as in frwiki, and it works there. IKhitron (talk) 21:04, 31 July 2016 (UTC)
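A quick way to look for a variable without END is to scan the translation page line by line. The sketch below is in Python and rests on assumptions drawn only from examples on this talk page: that a variable starts with a name ending in _script or _<lang>wiki followed by "=", and that its value ends with the word END. The real parser may behave differently, and `variables_missing_end` is a hypothetical helper.

```python
import re

# Assumed shape of a variable name, based on names seen on this page
# (top_priority_script, error_003_templates_ruwiki, whitelistpage_enwiki).
ASSIGNMENT = re.compile(r"\s*(\w+_(?:script|[a-z]+wiki))\s*=")
END_MARK = re.compile(r"\bEND\s*$")


def variables_missing_end(text):
    """Return names of variables whose value is never closed by END."""
    missing = []
    current = None
    for line in text.splitlines():
        start = ASSIGNMENT.match(line)
        if start:
            if current is not None:
                # A new variable began before the previous one was closed.
                missing.append(current)
            current = start.group(1)
        if current is not None and END_MARK.search(line):
            current = None
    if current is not None:
        missing.append(current)
    return missing
```

Running it over the raw text of a translation page returns the names that appear to be unterminated, which narrows down where to look.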
Article that doesn't exist appears in the database and in maintenance categories
The page USA:S inrikessäkerhetsdepartement has appeared on sv.wp's list of #2-errors for ~1 year now (or longer), at least when processing with WPCleaner. That page does not exist (the page USA:s inrikessäkerhetsdepartement however does exist). Yet this page appears on the CHECKWIKI list, and in the automated maintenance category Pages using invalid self-closed HTML tags on sv.wp. Why is this? (t) Josve05a (c) 10:02, 1 August 2016 (UTC)
- It looks like parsers think USA is a namespace and automatically uppercase the first letter of the rest. IKhitron (talk) 15:07, 1 August 2016 (UTC)
#28 possible false positives
Hi. I started to fix #28, and found he:(Miss)understood and he:Anastacia at start of the list. It doesn't look like there are problems there. Maybe there are some more, didn't check yet. Thank you, IKhitron (talk) 18:06, 11 August 2016 (UTC)
- IKhitron It was fixed a few days ago. The problem happens when a table is the very last thing in an article... no categories, defaultsort or other templates. I made a change to catch more cases of #28. It was thinking |}} was a table ending when it's most likely a template ending. As a result of the change, #28 will pick up cases of {{|, such as {{|url=http... , where "cite web" is missing. This is an error, but not related to tables. Bgwhite (talk) 21:47, 11 August 2016 (UTC)
- Thank you, Bgwhite. It means, these articles will not be in the list in the next run? IKhitron (talk) 22:10, 11 August 2016 (UTC)
- IKhitron Correct. These should not be in next month's run. Bgwhite (talk) 22:16, 11 August 2016 (UTC)
- Thank you very much for your help. IKhitron (talk) 22:58, 11 August 2016 (UTC)
#88 has false positive
Here, most of the reported items are false positives. The {{DEFAULTSORT:}} on fa.wikipedia is {{ترتیبپیشفرض:}}. CheckWiki reports any text that starts with ترتیب: and doesn't check that it is preceded by {{. For example, fa:آرایههای ادبی doesn't have a blank at the first position. Yamaha5 (talk) 11:48, 9 August 2016 (UTC)
- In other words: the report should only check cases which have {{ followed by the first word of the MediaWiki magic word. For example, for English, if we have this text it will be reported incorrectly:
* some text DEFAULTSORT: foo some text...
for Persian:
* some text ترتیب: foo some text...
This is wrong; it should check that DEFAULTSORT: is preceded by {{ and only then report it, as in the text below:
* some text {{DEFAULTSORT: foo some text...
for Persian:
* persian text {{ترتیب: foo some text...
Yamaha5 (talk) 07:44, 12 August 2016 (UTC)
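- At #88 the code should change from
my $sortkey = $test_text;
$sortkey =~ s/^([ ]+)?$current_magicword//;
$sortkey =~ s/^([ ]+)?://;
to
my $sortkey = $test_text;
# require the opening braces before the magic word (braces escaped to keep them literal)
$sortkey =~ s/^\{\{([ ]+)?$current_magicword//;
# the braces were consumed by the line above, so only the colon is stripped here
$sortkey =~ s/^([ ]+)?://;
Yamaha5 (talk) 07:48, 12 August 2016 (UTC)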
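The same idea, reporting a leading blank only when the magic word really sits inside {{ ... }}, can be sketched outside of CheckWiki. The Python below is an illustration under assumptions: `sortkeys_with_leading_blank` is a hypothetical helper, and the magic words passed in are just the examples from this thread.

```python
import re


def sortkeys_with_leading_blank(text, magic_words=("DEFAULTSORT",)):
    """Return sort keys that start with a blank, looking only at {{MAGICWORD:...}} calls."""
    words = "|".join(re.escape(word) for word in magic_words)
    # The magic word must directly follow the opening braces to count.
    pattern = re.compile(r"\{\{\s*(?:%s)\s*:([^{}]*)\}\}" % words)
    return [key for key in pattern.findall(text) if key.startswith(" ")]
```

Plain prose that happens to contain "DEFAULTSORT: foo" or "ترتیب: foo" is ignored, because the pattern requires the opening braces.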
New false positives for #22
Hi Bgwhite, new false positives are appearing on frwiki when the category name itself contains a colon with whitespace characters around it, like [[Catégorie:Acteur de Lost : Les Disparus]] in fr:Terry O'Quinn. --NicoV (Talk on frwiki) 19:21, 28 July 2016 (UTC)
- NicoV Should be fixed for the run that starts in an hour. enwiki doesn't have two colons in a cat. No good #*$(@ nothing &(*! French. Problem was caused by the update that catches the #22s WPC found. Bgwhite (talk) 23:11, 28 July 2016 (UTC)
- Bgwhite Most of them are fixed, except fr:Lost : Les Disparus where [[Catégorie:Lost : Les Disparus|Lost : Les Disparus]] is still detected by CW. --NicoV (Talk on frwiki) 19:57, 17 August 2016 (UTC)
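One way to avoid this kind of false positive is to look only at the first colon, the one that separates the namespace from the title. The Python sketch below assumes that #22 is meant to flag whitespace around that separator; it is not CheckWiki's implementation, and `category_has_stray_space` and its namespace list are hypothetical.

```python
import re


def category_has_stray_space(link, namespaces=("Category", "Catégorie")):
    """True if whitespace surrounds the namespace separator of a category link."""
    names = "|".join(re.escape(name) for name in namespaces)
    match = re.match(
        r"\[\[(\s*)(?:%s)(\s*):(\s*)" % names, link, re.IGNORECASE
    )
    if match is None:
        return False
    # Only whitespace next to the first colon counts; colons in the title are ignored.
    return any(match.groups())
```

Because the match stops right after the namespace colon, a title such as "Lost : Les Disparus" never reaches the whitespace test.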
Reference localization
Hello. Is there a possibility to recognize a template as footnote? Thank you. IKhitron (talk) 15:35, 29 July 2016 (UTC)
- You're talking about this?
error_003_templates_ruwiki=
Примечания
Список примечаний
Reflist
Reflist+ END
# ...
error_078_templates_ruwiki=
(Примечания|Список примечаний|Reflist\+?)(?![^}]*group) END
- Not at all, Facenapalm, thank you, I'm talking about a footnote (ref), not references. IKhitron (talk) 16:28, 29 July 2016 (UTC)
- Facenapalm I'm also unclear what you are asking. Remember, I'm slow. Could you put what you're asking in different words?
- Is there any possibility that you wanted to ask me this question, Bgwhite? IKhitron (talk) 23:59, 29 July 2016 (UTC)
- IKhitron Yes. Like I said, I'm slow. Bgwhite (talk) 00:39, 30 July 2016 (UTC)
- Well, Bgwhite, when you want to add a footnote you use <ref name=somename...>some text</ref>. I can't do this in rtl, so I use {{reftemplate|name=somename|...|some text}}, which is transcluded to the previous form. I asked if there is a possibility to add a local name of a footnote template, that will be recognized as a ref tag. IKhitron (talk) 00:47, 30 July 2016 (UTC)
- @Bgwhite: IKhitron (talk) 10:33, 9 August 2016 (UTC)
- IKhitron Ok, I've got some time this week. I'm still not understanding. Which error is this for? What would be an error case? Bgwhite (talk) 07:19, 31 August 2016 (UTC)
- Thank you, Bgwhite. There are some, especially 78 and 81, but also 61 and 67. For 78, if the article has no ref and no references, but has a ref template, it's not recognized. For 81 it would be splendid if the case where the ref text and the template text are the same, for example, were recognizable. IKhitron (talk) 11:34, 31 August 2016 (UTC)
- IKhitron Ok, I'm understanding. For #61 and #78, you can add reftemplate to your translation file. For #61, add at the end of its config:
error_061_templates_enwiki= reftemplate END
- Then do the same for 78. For #67... either #61 is on, or #67 is on, but not both. #81 is a different story and it's a bugger. Not sure how to do that one. Do you have some examples so I can do some testing? Bgwhite (talk) 18:22, 31 August 2016 (UTC)
- Thank you very much, Bgwhite. It's already a lot for me. About example: You have he:Template:הערה and all transcluded pages. The base is: template named "הערה", which has some parameters, when the reference text is the first unnamed parameter. As in {{הערה|שם=refname1|reftext|קבוצה=refgroup5}}. If you'll decide it's possible I'll thank you even more. IKhitron (talk) 19:36, 31 August 2016 (UTC)
- By the way, Bgwhite, is there a possibility to do this for #3? I mean more ref templates, as in #61, not more references templates as in #78? Thank you, IKhitron (talk) 15:35, 4 September 2016 (UTC)
- And one more, by the way: the #78 references templates setting does not recognize different groups. Is there some parameter for this? Thank you. IKhitron (talk) 15:48, 4 September 2016 (UTC)
False positive for #94
Hi, fr:Nicotinamide adénine dinucléotide is detected as having an isolated ref tag, with the notice </ref>| cl50 = | logp = | dja = | od but I don't understand what's wrong, because the reported closing ref tag has an opening tag:
<ref name="ChemIDplus">{{ChemID|53-84-9|Nadide}}, consulté le 16 août 2009</ref>
| CL50 =
| LogP =
| DJA =
--NicoV (Talk on frwiki) 22:49, 18 September 2016 (UTC)
- NicoV It's not giving me an error. I haven't changed that part of the code this month. Article hasn't been changed this month. I don't know. Bgwhite (talk) 04:57, 19 September 2016 (UTC)
- Bgwhite checkarticle.cgi gives the following answer:
- 94 3695 </ref>| cl50 = | logp = | dja = | od
- so it's still reported as an error on wmflabs... --NicoV (Talk on frwiki) 05:21, 19 September 2016 (UTC)
There is a dead link to the Toolserver ([[tools:~sk/checkwiki/enwiki/enwiki_translation.txt|toolserver]]) in Template:Editnotices/Page/Wikipedia:WikiProject Check Wikipedia/Translation. Please update the link or remove it entirely if it is no longer needed. --Meno25 (talk) 14:24, 2 September 2015 (UTC)
- At the same time, it should be updated to take into account the new elements that are managed by CW: whitelistpage, ... --NicoV (Talk on frwiki) 14:39, 2 September 2015 (UTC)
- Meno25 I know nothing about this. What is this for and how is it used? Bgwhite (talk) 17:32, 3 September 2015 (UTC)
- Removed. -- Magioladitis (talk) 17:38, 3 September 2015 (UTC)
- Update this too: Template:Editnotices/Page/Wikipedia:WikiProject Check Wikipedia. --79.52.67.85 (talk) 18:27, 3 September 2015 (UTC)
- I think Template:Editnotices/Page/Wikipedia:WikiProject Check Wikipedia can be deleted as it doesn't seem to match the current situation and is probably useless now. @Bgwhite and Magioladitis: What do you think? --NicoV (Talk on frwiki) 10:53, 25 February 2016 (UTC)
NicoV link was updated. Feel free to perform any further action. If Bgwhite agrees we can delete it. -- Magioladitis (talk) 23:23, 19 September 2016 (UTC)
- Magioladitis, I think that:
- Template:Editnotices/Page/Wikipedia:WikiProject Check Wikipedia/Translation can be updated with other parameters (whitelistpage_enwiki) and that we could add |expiry=indefinite to remove the warning
- Template:Editnotices/Page/Wikipedia:WikiProject Check Wikipedia should be deleted as it doesn't reflect the current status
- -- NicoV (Talk on frwiki) 05:48, 20 September 2016 (UTC)
NicoV I deleted the latter. -- Magioladitis (talk) 07:29, 20 September 2016 (UTC)
Request: Report for wrong dictation
Some Wikipedias have pages, like the ones below, that list common misspellings. Please add this to the reports, to show which pages contain these words.
- de:Wikipedia:Helferlein/Rechtschreibprüfung/Wortliste
- es:Wikipedia:Corrector_ortográfico/Listado
- gl:Wikipedia:Revisor_ortográfico/Listaxe
- he:ויקיפדיה:סקריפטים/בודק_איות/מילון
- ur:ویکیپیڈیا:املا_پڑتالگر/فہرست_الفاظ
- fa:ویکیپدیا:اشتباهیاب/فهرست
- The first word before || is the wrong one. Yamaha5 (talk) 09:30, 11 August 2016 (UTC)
- Yamaha5 On the enwiki side, you are talking about Wikipedia:Lists of common misspellings and Wikipedia:Lists of common_misspellings/For machines? If so, then this would be outside of CheckWiki's scope. In theory, CheckWiki finds syntax errors and other errors in the source code. Spelling and other kinds of word errors wouldn't be in CheckWiki's scope. One can do a Google or a Wikipedia search to find these. Bgwhite (talk) 22:14, 11 August 2016 (UTC)
- Bgwhite I know we can search at google. I wanted monthly lists which can be solved by bots or AWB by users Yamaha5 (talk) 07:53, 12 August 2016 (UTC)
AWB provides typo fixing, but this is outside the scope of this project. - Magioladitis (talk) 07:56, 12 August 2016 (UTC)
Yamaha5, you can generate such lists using WPC, with error #501 (spelling) and the dump analysis feature, but it may require a few configuration modifications and tweaks. --NicoV (Talk on frwiki) 07:46, 25 September 2016 (UTC)
Line break tags
As part of mw:Parsing/Replacing Tidy, all the wikis need to be checked for invalid </br> codes. They should ideally be replaced with just plain <br>. In the regular search box, you can find these by typing insource:/\<\/br\>/. There are (currently) only a few of these in articles at the English Wikipedia, but there are 900+ pages in the Template: namespace that contain this error, and there are potentially thousands of affected pages at other wikis. Whatamidoing (WMF) (talk) 17:27, 4 October 2016 (UTC)
- These are easy to fix (I just fixed 90 of them), but editors will continue to add them. We probably need a maintenance category to track this tag and similar tags that can cause HTML errors. Since all wikis will need the category, it should probably be created at the MediaWiki level. – Jonesey95 (talk) 20:59, 4 October 2016 (UTC)
- Whatamidoing (WMF) I think you meant </br> and not </br>/. CheckWiki does find cases of </br> in article space along with </hr>. It also finds invalid self-closed tags such as <span /> and <small />. For some languages (i.e. enwiki), CheckWiki does a daily scan. CheckWiki also scans the monthly dump to catch any that were missed.
- @NicoV, Meno25, Edgars2007, Facenapalm, Josve05a, Matěj Suchánek, and Magioladitis: Alerting the normal crew. The first link Whatamidoing gave talks about the issues replacing Tidy will cause. Many are fixed by CheckWiki, others are not. One of the maintenance categories already set up is Category:Pages using invalid self-closed HTML tags. This should be available on all Wikis. Bgwhite (talk) 21:12, 4 October 2016 (UTC)
- (I fixed Whatamidoing's apparent typo above, so Bgwhite's first sentence may be confusing to readers here.) – Jonesey95 (talk) 21:26, 4 October 2016 (UTC)
I fixed all articles and templates. -- Magioladitis (talk) 23:26, 4 October 2016 (UTC)
- Wow, that was fast! I find 75,835 pages when I search all namespaces for the insource string above. I think we need a maintenance category and a bot. – Jonesey95 (talk) 03:50, 5 October 2016 (UTC)
- Whatamidoing (WMF), is there a Phab task to add a tracking category to MediaWiki for these errant tags? – Jonesey95 (talk) 14:02, 7 October 2016 (UTC)
- I don't know. Whatamidoing (WMF) (talk) 20:52, 7 October 2016 (UTC)
- I added a note to T145530 with a link to this discussion. – Jonesey95 (talk) 22:54, 7 October 2016 (UTC)
@Bgwhite: I searched for and fixed all instances of </br> on Arabic (ar) Wikipedia (373 pages) and Egyptian Arabic (arz) Wikipedia (31 pages). However, since users are likely to add the invalid tag again, I will have to run the bot regularly to fix it. --Meno25 (talk) 12:25, 8 October 2016 (UTC)
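The replacement discussed in this thread is a one-line substitution. Here is a minimal sketch in Python, assuming the page text is already available as a string; `fix_invalid_br` is a hypothetical helper and not the code of any bot mentioned here.

```python
import re

# Matches the invalid closing form </br>, allowing stray whitespace and any letter case.
INVALID_BR = re.compile(r"</br\s*>", re.IGNORECASE)


def fix_invalid_br(text):
    """Replace every invalid </br> with a plain <br>."""
    return INVALID_BR.sub("<br>", text)
```

A real bot would probably also need to skip text inside <nowiki>, <pre> and similar tags, which this sketch does not attempt.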
False positives for #24
Hi Bgwhite, I've found 2 false positives for #24 (pre tags) on frwiki (fr:XML Schema and fr:XQuery), that shouldn't be detected for several reasons:
- They're not <pre> tags, but <prenom> tags (and they're properly closed)
- They're inside a <source> tag
--NicoV (Talk on frwiki) 08:27, 8 October 2016 (UTC)
- Interestingly, a similar issue at cs:JavaServer Pages. Matěj Suchánek (talk) 13:37, 8 October 2016 (UTC)
- And more interestingly, already discussed above. Matěj Suchánek (talk) 13:39, 8 October 2016 (UTC)
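Both reasons given above can be handled with a word boundary and a pre-pass that drops <source> blocks. The Python below is a sketch under the assumption that #24 compares opening and closing pre tags; it is not CheckWiki's code, and `has_unclosed_pre` is a hypothetical helper.

```python
import re

SOURCE_BLOCK = re.compile(
    r"<source\b[^>]*>.*?</source\s*>", re.DOTALL | re.IGNORECASE
)
# \b stops <pre from matching longer tag names such as <prenom>.
PRE_OPEN = re.compile(r"<pre\b[^>]*>", re.IGNORECASE)
PRE_CLOSE = re.compile(r"</pre\s*>", re.IGNORECASE)


def has_unclosed_pre(text):
    """True if the text has more opening than closing pre tags outside <source> blocks."""
    text = SOURCE_BLOCK.sub("", text)
    return len(PRE_OPEN.findall(text)) > len(PRE_CLOSE.findall(text))
```

Either safeguard alone would have avoided the two frwiki reports; using both also covers a <prenom> tag that appears outside a <source> block.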
Invalid link to article
In dewiki the link to 1% of one shows only 400 Bad Request. The problem is the missing URL-encoding for the "%".
This is the correct link with encoding: https://de.wikipedia.org/wiki/1%25_of_one
This encoding should be done automatically.--GünniX (talk) 03:39, 25 September 2016 (UTC)
- Hi. We have the same problem. The name shown as "Ss", and I still do not know what is the right one. IKhitron (talk) 09:13, 25 September 2016 (UTC)
- GünniX In theory, this should now be fixed. IKhitron, could you give me an example when one shows up? Bgwhite (talk) 00:26, 30 September 2016 (UTC)
- Thanx, I'll test it at the next occurrence. --GünniX (talk) 06:19, 30 September 2016 (UTC)
- Sure, Bgwhite, here you are: [11]. IKhitron (talk) 09:20, 30 September 2016 (UTC)
- The Ss, Bgwhite, was he:ß. IKhitron (talk) 18:50, 13 October 2016 (UTC)
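The encoding asked for at the top of this section is what a standard percent-encoding routine does. As an illustration only (the report pages are not generated by Python, and `article_url` is a hypothetical helper), the link could be built like this:

```python
from urllib.parse import quote


def article_url(lang, title):
    """Build an article URL with the title percent-encoded."""
    # Spaces become underscores; "/" and ":" are kept readable, everything else is encoded.
    path = quote(title.replace(" ", "_"), safe="/:")
    return "https://%s.wikipedia.org/wiki/%s" % (lang, path)
```

For the title "1% of one" on dewiki this yields the correct link quoted above, with "%" encoded as "%25". Non-ASCII titles such as "ß" are encoded as UTF-8 bytes.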
#67 configuration does not work
Hi. I opened #61 a month ago in hewiki and it gave me 100,000 answers. Then I closed it and opened #67, and it gave only 100 answers. Both have the same template configurations, but in the last one it does not work and returns pure ref only. Thank you. IKhitron (talk) 12:44, 5 October 2016 (UTC)
- IKhitron Can't turn on both #61 and #67 at the same time. It depends on Hebrew Wiki's rules on which one to choose. On English Wiki, references come after punctuation marks, so #61 is turned on. On French Wiki, references come before punctuation marks, so #67 is turned on. With 100,000 errors on #61, I'd guess that Hebrew Wiki uses #67, before punctuation mark error. #67 doesn't return just the pure refs, the first thing is a punctuation mark. Bgwhite (talk) 19:29, 5 October 2016 (UTC)
- You did not understand me, Bgwhite. I did not turn on both. I wanted the list for 61 on one run and for 67 on another. And I did not mean pure refs without punctuation, I meant pure refs errors, and nothing with templates as references. And the hewiki rule is: it does not matter if it's before the punctuation mark or after, but it should be consistent within every article. So I need both lists to get the common articles in the AWB list comparer. IKhitron (talk) 19:38, 5 October 2016 (UTC)
- IKhitron I'm still not understanding. Could you give me an example of what's happening and what the desired result is? Bgwhite (talk) 15:26, 19 October 2016 (UTC)
- Sure, Bgwhite. #61: <ref>text</ref>. works, {{הערה|text}}. works. #67: .<ref>text</ref> works, .{{הערה|text}} does not work. Thank you, IKhitron (talk) 15:34, 19 October 2016 (UTC)
- IKhitron Ok, I understand... I'm slow. #67 doesn't check for templates. I'll need to add it. Could you give me a couple of articles to test on? Bgwhite (talk) 15:56, 19 October 2016 (UTC)
- w:he:ויקישיתוף, w:he:הולנד, w:he:מים (commons:, Netherlands and water), Bgwhite. Thank you. IKhitron (talk) 16:03, 19 October 2016 (UTC)
New id needed for new category sorting algorithm
Hi. There is a new algorithm for category sorting in wikipedias. It's much better than previous, but there is a new problem: 232,456,743 is sorted between 230 and 235. To fix this, such an article needs a defaultsort:232456743. Could you please create an id for article with name that includes comma separated number and hasn't default sort, or has but not comma removed? Thank you. IKhitron (talk) 13:03, 20 October 2016 (UTC)
- Agree Maybe together with DEFAULTSORT itself having comma separated digits. Matěj Suchánek (talk) 07:43, 22 October 2016 (UTC)
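A minimal sketch of the requested check, assuming the rule is simply "a title containing a comma-grouped number needs a DEFAULTSORT with the commas removed". The function names and the grouping pattern are hypothetical; the real check would have to read the title and sort key from the dump.

```python
import re

# A number written with comma thousands separators, e.g. 232,456,743.
COMMA_NUMBER = re.compile(r"(?<!\d)\d{1,3}(?:,\d{3})+(?!\d)")


def sortkey_without_commas(title):
    """Return the title with commas removed from comma-grouped numbers."""
    return COMMA_NUMBER.sub(lambda m: m.group(0).replace(",", ""), title)


def needs_defaultsort(title, defaultsort=None):
    """True if the title has a comma-grouped number and the sort key does not fix it."""
    if sortkey_without_commas(title) == title:
        return False  # no comma-grouped number in the title
    if defaultsort is None:
        return True   # no DEFAULTSORT at all
    # DEFAULTSORT exists but still contains comma-separated digits
    return COMMA_NUMBER.search(defaultsort) is not None


print(sortkey_without_commas("232,456,743"))
print(needs_defaultsort("232,456,743"))
```

The second branch covers Matěj Suchánek's addition: a DEFAULTSORT that itself still contains comma-separated digits is reported as well.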
#6 and #37 mostly obsolete.
@NicoV, Magioladitis, Yamaha5, Josve05a, Edgars2007, and Facenapalm: MediaWiki is moving to a new collation scheme called Unicode collation algorithm (UCA). Letters with diacritics will be sorted the same as with the non-diacritic version. I still don't know the timetable, but I did find the phab ticket (T136150) on moving enwiki to UCA. They have already moved several other wikis to UCA, including Russian, French, Latvian, Farsi and Swedish wikis. The listing of wikis can be found here; I'm thinking, #6 and #37 will only check for punctuation at some point for all wikis. I'll work on getting the wikis already on UCA to only check punctuation. Bgwhite (talk) 02:14, 29 July 2016 (UTC)
- @Bgwhite: keep in mind that we have the T139110 bug. Does it cause a problem for #6 and #37? Yamaha5 (talk) 03:49, 29 July 2016 (UTC)
- lvwiki has disabled those ones, so I'm fine. --Edgars2007 (talk/contribs) 06:44, 29 July 2016 (UTC)
- Same on ruwiki. In ruwiki, the only allowed letter with diacritic in titles is ё, but it's sorted correctly. Facenapalm (talk) 10:29, 29 July 2016 (UTC)
- Czech Wikipedia doesn't use these errors, so you can remove the hardcoded stuff for cswiki from the code. Matěj Suchánek (talk) 08:13, 22 October 2016 (UTC)
Are people really supposed to be doing WCW edits on user talk pages?
Resolved
It's unimportant and annoying. Can I opt out at least? --Floquenbeam (talk) 21:28, 9 November 2016 (UTC)
- Floquenbeam CheckWiki does not scan any talk pages for errors, only articles. CheckWiki only finds errors; it does not correct them. There are tools and scripts out there that also detect and/or fix CheckWiki errors. Most likely, an editor saw an "error" on your talk page that happened to be a CheckWiki error. They then used a tool or script to "fix" it. Bgwhite (talk) 22:54, 9 November 2016 (UTC)
Wishes for exclusions on #34
Would it be possible to exclude some cases with {{{ or }}} from detection?
- all these {{{Zeige...-expressions like in de:Liste der EU-Vogelschutzgebiete in Berlin
- { in front of de:template:overline/ de:template:Oberstrich e. g. de:Ernstit
--Hadibe (talk) 19:19, 16 November 2016 (UTC)
- Hadibe I don't know what those Zeige expressions are or do, but I hate them. Some pages have hundreds of them. I can exclude all {{{ from being checked. {{{ is already excluded from ruwiki and ukwiki. Those wikis use {{{|} a lot. This would also solve the overline template issue. Bgwhite (talk) 21:06, 16 November 2016 (UTC)
- Please don't skip them all. That would avoid detection of typos. Then better leave the status quo and hope that there won't be too many new uses and also hope that Zeige... can be wiped out some day. Anyway, thanks for your fast response. --Hadibe (talk) 21:42, 16 November 2016 (UTC)
For #85, ignore <div style="height: ...; width: "> ?
@Bgwhite: Should we also ignore div tags with a given width or height, like in Phtalocyanine ? For example, <div style="height:150px; width:150px; background-color:#000f89; border-bottom:solid 1px #000000;"></div>
gives a 150×150 px dark blue square with a thin black bottom border.
--NicoV (Talk on frwiki) 19:11, 17 November 2016 (UTC)
- NicoV I ran into one of these the other day. In that case, an infobox should have been used instead of <div>. On enwiki, there is {{Color swatch}} that does the same thing. There are 16 interwiki links listed, but not one to the French equivalent. Bgwhite (talk) 21:02, 17 November 2016 (UTC)
New id suggestion
Hi. What do you think about such an id:
- Read the article.
- Find all strings [^']''[^'] and count them as I.
- Find all strings [^']'''[^'] and count them as B.
- Find all strings [^']'''''[^'] and count them as IB.
- At the end, mark the article as new id if I count or B count (or both) are odd.
Thank you. IKhitron (talk) 14:26, 23 August 2016 (UTC)
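The counting rule proposed above can be sketched as follows. This is an illustration only: it uses lookarounds instead of the literal [^'] patterns so that adjacent runs are not skipped, and the function names are made up for the example.

```python
import re

# Runs of exactly 2, 3 and 5 apostrophes (not part of a longer run).
ITALIC = re.compile(r"(?<!')''(?!')")
BOLD = re.compile(r"(?<!')'''(?!')")
BOLD_ITALIC = re.compile(r"(?<!')'''''(?!')")


def count_quote_runs(text):
    """Return (I, B, IB): counts of italic, bold and bold-italic markers."""
    return (
        len(ITALIC.findall(text)),
        len(BOLD.findall(text)),
        len(BOLD_ITALIC.findall(text)),
    )


def looks_unbalanced(text):
    """Apply the proposed rule: report the page if I or B (or both) is odd."""
    italic, bold, _ = count_quote_runs(text)
    return italic % 2 == 1 or bold % 2 == 1


print(count_quote_runs("''a'' and '''b'''"))
print(looks_unbalanced("''unclosed italic"))
```

The same sketch also reproduces the two objections raised in the replies: an italic title followed by a possessive 's is reported although it renders correctly, and an unbalanced bold split across a ref is not reported because the total count is even.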
- There will be false positives from constructions such as "Billboard's", which usually renders correctly wherever I have seen it. I suppose someone might think it valuable to replace that construction with Billboard's, but it's not really an error, since it renders correctly. – Jonesey95 (talk) 14:34, 23 August 2016 (UTC)
- Yes, you are right. But it's better from ignoring this problem. One can whitelist this article. IKhitron (talk) 14:37, 23 August 2016 (UTC)
- I find 1,213 articles with this search for ]]'''s. It might be a fun little AWB project for someone to clean them all. – Jonesey95 (talk) 15:36, 23 August 2016 (UTC)
- I'm afraid this whitelist will be really big. The other problem is that the article can be wrong even if the counts are correct, for example, here: a<ref>'''b</ref> '''c d. Facenapalm (talk) 15:39, 23 August 2016 (UTC)
A nice template is {{'}}. -- Magioladitis (talk) 15:44, 23 August 2016 (UTC)
- Yes, 1,213 is a lot indeed. It's in enwiki (and other en* wikis) only, but you can't write one id for enwiki and other for rest wikis. So, what about the smaller project - mark if I+B is odd? IKhitron (talk) 16:29, 23 August 2016 (UTC)
- Well, Bgwhite, what's the decision? IKhitron (talk) 20:35, 30 August 2016 (UTC)
- IKhitron I think there are too many false positives. Looks like more false positives than actual errors. So, I don't think it would be a good idea. Bgwhite (talk) 21:16, 30 August 2016 (UTC)
- Thank you. IKhitron (talk) 21:19, 30 August 2016 (UTC)
I would find this very helpful as this is what usually blocks pywikibot's library mwparserfromhell from successfully parsing a template. Matěj Suchánek (talk) 18:01, 25 November 2016 (UTC)
feature list request
On fa.wikipedia we have a page and a cleaning bot which list and perform some cleaning tasks. I will list some useful items for your tool:
- Category pages which have {{Category redirect}} and interwiki (local or wikidata)
- Categories which look like articles (huge size), for example page_len>1000. Some newbies add article text to a category page.
- Redirect pages which have interwiki
- Pages which have old_interwiki (not wikidata)
- Pages which have duplicated coordinates
- Redirect pages whose talk page redirects to another page (query)
- Redirect pages whose talk page is not a redirect (query)
- Redirect talk pages whose main page is not a redirect (query)
- Redirect pages with (disambiguation) which link to non-disambiguation pages (query)
- Similar pages with different hidden characters (query)
- Cleaning content
- Pages which have : after == (for example "== foo ==\n:the text")
- Pages which have several <br/> after each other (for example "foo<br/><br/><br/><br/><br/>bar")
- Pages which have [•●⚫⬤] instead of * (for example "• foo \n• bar")
- Pages whose lines start with numbers instead of #
- Pages which have a non-standard title for the sources or external links subsection (for example == our sources == or == the sources ==, ...)
- Pages linked to (wiki(pedia|media|data|source|news|oyage|quote)|wiktionary)\.org without using their template
- Pages/articles which have several ['math', 'code', 'nowiki', 'pre', 'source', 's', 'su[bp]', 'noinclude', 'includeonly', 'big', 'small','gallery'] after each other
- Pages which have [\u0085\u00A0\u1680\u180E\u2000-\u200A\u2028\u2029\u202F\u205F\u3000] characters instead of normal space
- Pages which have LRM, RLM characters like (\u202A|\u202B|\u202C|\u202D|\u202E|\u200F)
- Pages which have ... instead of …
- Pages which have ---- for horizontal line
- Pages which have space between == (for example = =)
- Pages which have more than 5 = in their subsection (for example "========= foo ===========")
- Pages which have several empty lines in their content (for example "\n\n\n\n\n\n\n" or "\n\n \n\n")
- Pages which have tab \t at their first lines (for example "\n\t")
- Yamaha5 (talk) 01:25, 29 June 2016 (UTC)
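A few of the "cleaning content" items above translate directly into regular expressions. The sketch below is not the fawiki bot's code and not CheckWiki's; the check names, the thresholds and the `find_issues` helper are assumptions chosen to match the examples in the list.

```python
import re

# One pattern per check; names are hypothetical labels for the list items above.
CHECKS = {
    # two or more <br/> tags in a row
    "repeated_br": re.compile(r"(?:<br\s*/?>\s*){2,}", re.IGNORECASE),
    # a bullet character at the start of a line instead of *
    "bullet_char": re.compile(r"^[•●⚫⬤]", re.MULTILINE),
    # unusual space characters instead of a normal space
    "odd_space": re.compile(
        "[\u0085\u00A0\u1680\u180E\u2000-\u200A\u2028\u2029\u202F\u205F\u3000]"
    ),
    # directional formatting characters from the list
    "direction_mark": re.compile("[\u202A-\u202E\u200F]"),
    # heading markers split by a space, for example "= ="
    "split_heading_marker": re.compile(r"^=+ +=", re.MULTILINE),
    # more than 5 "=" at the start of a line, as described in the list
    "too_many_equals": re.compile(r"^={6,}", re.MULTILINE),
    # three or more blank (or whitespace-only) lines in a row
    "blank_line_run": re.compile(r"\n(?:[ \t]*\n){3,}"),
}


def find_issues(text):
    """Return the sorted names of all checks that match the text."""
    return sorted(name for name, pattern in CHECKS.items() if pattern.search(text))


print(find_issues("foo<br/><br/><br/>bar"))
print(find_issues("• foo \n• bar"))
```

Keeping each check as a separate named pattern makes it easy to switch individual checks off per wiki, which matters here because several replies point out that some of these characters are legitimate in right-to-left languages.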
- I believe that many of these features can be handled by queries or some PetScan lists. IMO CW should be aimed on things which are not accessible from database, such as wikitext or HTML markup errors. Matěj Suchánek (talk) 18:41, 29 June 2016 (UTC)
- The Cleaning content part shouldn't be possible by query. The database's text table is closed, so it is not possible to get them by query. Yamaha5 (talk) 20:25, 29 June 2016 (UTC)
- Yamaha5 Egads. I'd hate to have been your mom. Yamaha, what do you want for dinner? Mom, I'll have chicken, steak, carrots, peas, mashed potatoes, cauliflower, spaghetti ...
- A quick look... some can't be implemented, for example interwikilinks and ---- are valid.
- For #8 and #9 on the cleanup list, on enwiki the following are being checked: \x{007F}, \x{200B}, \x{2028}, \x{202A}, \x{202C}, \x{202D}, \x{202E}, \x{00A0}, \x{00AD}, \x{202B}, \x{200F}, \x{2004}, \x{2005}, \x{2006}, \x{2007}, \x{2008}
- To implement this is easy. Are there any on the enwiki list you don't want? Can you and Magioladitis (he is the expert, not me) look at the rest and see if they are ok to be added. I can't remember exactly but I think \x{202B} and \x{202F} caused problems if they were removed on enwiki.
- Bgwhite (talk) 22:18, 29 June 2016 (UTC)
4 will be a disaster and I could prove why. -- Magioladitis (talk) 22:23, 29 June 2016 (UTC)
- User:Bgwhite :))) For characters we can omit LRM, RLM and ZWNJ; they are used in foreign languages.
- User:Magioladitis: 4 you mean #4 ? Yamaha5 (talk) 22:39, 29 June 2016 (UTC)
- Yamaha5 yes, I mean #4. -- Magioladitis (talk) 22:42, 29 June 2016 (UTC)
- Yamaha5 I've gotten these mixed up in the past. Do you want me to add enwiki's list for fawiki? Bgwhite (talk) 00:20, 30 June 2016 (UTC)
- Are the lists different per project? I thought the lists were the same for all projects. In fawiki we have an active bot for the query part, but not for the content part, which is related to checkwiki. Is it possible to add them to the whole project for all languages? If you want, I can help you add them.
- If we have these lists at checkwiki we can clean them regularly. Yamaha5 (talk) 05:44, 30 June 2016 (UTC)
- Yamaha5 We are concerned that some Unicode characters are needed in other languages, especially in right-to-left ones. I'd rather take this one slow and push any new Unicode characters to those projects that want them. For example, two of the LRM, RLM characters on your wanted list do cause problems on enwiki if they are removed. I get confused on what acronyms belong to which Unicode character... I've got dyslexia. I can read ok, it's processing in the head and also writing that causes me problems, LRM and RLM get jumbled for example. So, what Unicode characters I listed above do you want or not want? These can be easily added for the next run, then we can test the others you mentioned in fawiki and enwiki for August's run. Bgwhite (talk) 05:14, 1 July 2016 (UTC)
- Bgwhite "what Unicode characters I listed above do you want or not want?" If you mean for fa.wiki: we now have a cleaning tool which converts #8 to a space and #9 to \u200c and does the conversion for #10; we tested it and it was fine. If you mean which characters may cause problems for other languages like English, in my opinion we should get the list and have local users check it one by one, and they can tell us which ones should be removed for them. So for fawiki we need #8, #9, #10 as I mentioned above; for other languages we can remove as they want. Yamaha5 (talk) 07:48, 1 July 2016 (UTC)
- #8: I removed the duplicated characters between my list and yours, so these are the characters that should be added to checkwiki for all languages: U+0020, U+2000, U+2001, U+2002, U+2003, U+2009, U+200A, U+007F, U+200B, U+2028, U+202A, U+202C, U+202D, U+202E, U+00A0, U+00AD, U+202B, U+200F, --convert to--> space
- #9: for fa.wikipedia we need to list all mentioned in #9 for other languages I don't know.
- #10:for fa.wiki we need it.
- At end please take a look on this. we can add them to checkwiki (new request :) ).Yamaha5 (talk) 08:34, 1 July 2016 (UTC)
- Yamaha5 I've added fawiki to the same ones enwiki currently find. AWB can convert or remove these via the find and replace. For example, add "\u200E|\u200F|\uFEFF|\u200B|\u2028|\u202A|\u202C|\u202D|\u202E|\u00AD" to the find column and a space in the replace column. Bgwhite (talk) 01:04, 3 July 2016 (UTC)
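The AWB find-and-replace suggested above has a direct equivalent in most regex engines. As a hedged illustration in Python (the helper name `replace_invisible` is made up, and whether a space or nothing is the right replacement depends on the character and the wiki):

```python
import re

# The same characters as in the AWB "find" expression quoted above.
INVISIBLE = re.compile(
    "[\u200E\u200F\uFEFF\u200B\u2028\u202A\u202C\u202D\u202E\u00AD]"
)


def replace_invisible(text, replacement=" "):
    """Replace each listed invisible character with the given replacement."""
    return INVISIBLE.sub(replacement, text)


print(repr(replace_invisible("a\u200bb")))
print(repr(replace_invisible("x\u202ay\u202c", "")))
```

As the thread notes, some of these characters are needed in right-to-left text, so a run like this is better reviewed edit by edit than applied blindly.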
Coming back to this section, I support coding up #3 and #4 (and maybe #13 and #15) from the second list. Matěj Suchánek (talk) 09:25, 26 November 2016 (UTC)
ISBN error check - potential enhancement
Take a look at this version of Ahmad Shah Durrani, specifically the last entry in the Bibliography. The ISBN contains both hyphens and spaces, preventing it from becoming a magic link. Is it possible and/or advisable for WCW's ISBN error check to look for articles containing this error? I don't know whether it would result in a lot of false positives. – Jonesey95 (talk) 18:00, 25 November 2016 (UTC)
- From my experience (we have been running bots to replace magic links with templates for about two weeks), there are a lot of problems with this. Once it even converted a cite template parameter name to a template. IKhitron (talk) 18:04, 25 November 2016 (UTC)
- I do not understand this answer. I'm not talking about doing anything with templates. I am talking about detecting and reporting a problem with ISBNs that breaks magic links. – Jonesey95 (talk) 18:18, 25 November 2016 (UTC)
- Me too. Sorry, my English is not good. I meant that there are a lot of cases when you are sure you have right regexp, and then recognize another problem. IKhitron (talk) 18:20, 25 November 2016 (UTC)
- Magic links are going away soonish anyway. I'd just suggest fixing errors if you find them. Jerod Lycett (talk) 00:55, 26 November 2016 (UTC)
- We can't fix them if we can't find them, which is why I suggest enhancing this particular error check. We also can't make these particular magic links "go away", because they are not detected as such by the Mediawiki software. – Jonesey95 (talk) 00:58, 26 November 2016 (UTC)
@Jonesey95: Interesting... Do you know what exactly breaks the magic link ? It's not only mixing hyphens and spaces as none of the following work: ISBN 978- 1-4907 - 1441-7 ; ISBN 978- 1-4907-1441-7 ; ISBN 978 14907 14417 ; ISBN 97814907 14417. I wonder if the problem is not when you have two consecutive filling characters (2 consecutive spaces seem to break the magic links). It rather seems to be a bug in the magic links that we should report to the developers. I put a comment on phabricator T145604. I've just also modified WPCleaner to report ISBN inside nowiki tags as errors #69 (list of results for frwiki), as it seems to be mostly crap produced by CX or VE. --NicoV (Talk on frwiki) 11:24, 26 November 2016 (UTC)
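NicoV's hypothesis (two consecutive filler characters break the magic link) can be turned into a small detector. The sketch below only tests that hypothesis; it is not the MediaWiki parser rule and not the WPCleaner or CheckWiki implementation, and the names `CANDIDATE`, `DOUBLE_SEPARATOR` and `suspicious_isbns` are invented for the example.

```python
import re

# "ISBN", spaces, then a run of digits/X mixed with spaces and hyphens.
CANDIDATE = re.compile(r"ISBN +([0-9][0-9Xx \-]*[0-9Xx])")
# Two or more separator characters in a row, e.g. "- " or " - ".
DOUBLE_SEPARATOR = re.compile(r"[ \-]{2,}")


def suspicious_isbns(text):
    """Return ISBN-like strings with 10 or 13 digits and consecutive separators."""
    hits = []
    for match in CANDIDATE.finditer(text):
        body = match.group(1)
        digits = re.sub(r"[ \-]", "", body)
        if len(digits) in (10, 13) and DOUBLE_SEPARATOR.search(body):
            hits.append(match.group(0))
    return hits


print(suspicious_isbns("See ISBN 978- 1-4907 - 1441-7 ; and ISBN 978-1-4907-1441-7."))
```

Requiring 10 or 13 digits keeps the check from reporting arbitrary text after the word ISBN, at the cost of missing cases where a year or page number directly follows the ISBN.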
Deprecation of magic links (T145604)
@Bgwhite and Magioladitis: and others: I was thinking of adding features in WPC to help replacing magic links like RFC, PMID and ISBN by templates as the magic links will stop working when T145604 is activated (maybe in a year). The easiest way for me would be to create new error # (one for each magic link is probably better). What do you think ? Should we also add them for CW (yes: we should use error numbers like #112 to #114 ; no: I will use error numbers like #528 to #530): I don't think it's useful since dedicated categories are already filled up automatically by MW (Pages using PMID magic links, Pages using ISBN magic links and Pages using RFC magic links) ? --NicoV (Talk on frwiki) 14:20, 16 November 2016 (UTC)
- I think we need a centralized discussion about what en.WP wants to do with these before we take any action to flag them. – Jonesey95 (talk) 16:21, 16 November 2016 (UTC)
- Both CheckWiki and WPCleaner are used across multiple wikis, not just en. The issue is at every language Wikipedia. This could get messy. mw:Requests for comment/Future of magic links contains what already has been done and the status of other tasks. I'd rather not add new CW errors when system-wide categories have already been set up. I wouldn't add anything to CW yet. It looks like there will be a parser function and templates. Parser function isn't ready. Bgwhite (talk) 20:48, 16 November 2016 (UTC)
- Ok. I think I will add new errors to WPC as #528 to #530 if I find some free time, and they will be activated only on wikis that decide to activate them. I know that on frwiki I can at least replace all the PMID by a template call and probably ISBN also, less clear for RFC. --NicoV (Talk on frwiki) 13:52, 17 November 2016 (UTC)
I've added error #528 to WPC to detect PMID magic links and suggest to replace them with a template call. It requires modifications both in CW configuration page and WPC configuration page to be fully functional. It's already operational for frwiki as the PMID template was already existing and working like the magic link. --NicoV (Talk on frwiki) 14:04, 22 November 2016 (UTC)
- I've added error #529 to WPC to detect ISBN magic links and suggest to replace them with a template call. Configuration is also required as for #528. --NicoV (Talk on frwiki) 11:39, 26 November 2016 (UTC)
#111 question
Hello. Thank you for #111, but I have a question how can I define a single ref template? Thank you. IKhitron (talk) 13:59, 27 November 2016 (UTC)
#4 expansion
Resolved
CHECKWIKI now catches unbalanced closing a tags. -- Magioladitis (talk) 22:30, 12 November 2016 (UTC)
- Same for WPC. --NicoV (Talk on frwiki) 09:20, 29 November 2016 (UTC)
#3 expansion
Resolved
CHECKWIKI now is case insensitive. -- Magioladitis (talk) 22:30, 12 November 2016 (UTC)
- Same for WPC. --NicoV (Talk on frwiki) 09:20, 29 November 2016 (UTC)
Exclude signatures from #63
Would it please be possible to check if #63 (small in sup) appears inside of a user's signature? These findings are listed anyway on #95, so the articles don't have to be mentioned twice. On dewiki you don't see anything else. Eventually it's the user's choice how tiny they want to show parts of their signature. --Hadibe (talk) 13:58, 9 December 2016 (UTC)
Edit: I exchanged the list number from 85 to 63. Sorry, bad mistake. --Hadibe (talk) 19:30, 9 December 2016 (UTC)
- Hadibe It's the user's choice up to a point. For example, no images in signatures on dewiki. It can't cause a big gap between lines on enwiki. Signatures still have to be accessible. This means signatures have to be colour-blind accessible, such as no red text on black background. It also means fonts can't get too small. The <small> tag reduces font size to 85%. I'm not sure how much the <sup> tag reduces text size. Around 80%-85% of normal size is the cutoff point where text becomes too small. So, having both the <small> and <sup> together brings text well below 80%. Bgwhite (talk) 21:37, 9 December 2016 (UTC)
CX attributes
Another type of crap produced by CX: tags like <center>...</center> with CX internal attributes (data-cx-weight="356" data-source="184" class="" id="cx184" contenteditable="true"), like this example. Should we detect them in another error? --NicoV (Talk on frwiki) 17:49, 17 November 2016 (UTC)
- NicoV A search reveals 26 articles on enwiki with "<center " and 134 for frwiki. My favourite is <center align = center>. I don't see one on enwiki that should be kept. On frwiki, they should be deleted or moved to <div> tags. Probably detect them, but don't automatically fix them? Bgwhite (talk) 20:37, 17 November 2016 (UTC)
- Thanks Bgwhite. It's not only center tags, but other tags, also tables... (CX is creating crap in almost every part of wikitext syntax...). For example, 116 pages when searching for data-cx-weight. --NicoV (Talk on frwiki) 11:58, 18 November 2016 (UTC)
- Other example 386 results when searching for contenteditable on frwiki... --NicoV (Talk on frwiki) 12:03, 18 November 2016 (UTC)
- NicoV I'd never seen or heard of contenteditable before. After looking it up, that is a worthless element on a site where anybody can edit. There's also non-transcrapulator elements such as moz-border-radius. That's a firefox specific element and the general border-radius should be used. Add "<center " to #2? As much as this makes me cry, add "invalid css attributes" as a new error? Bgwhite (talk) 22:03, 18 November 2016 (UTC)
- Bgwhite I think I prefer something like the "invalid css attribute" which is more general (contenteditable is crap left by CX and maybe VE, but in many tags). But adding also the center tags with attributes to #2 maybe interesting : are there any valid cases to have attributes to center tags ? --NicoV (Talk on frwiki) 09:59, 19 November 2016 (UTC)
- NicoV Not one of the "weird" <center> is valid on enwiki. Most common one is <center class="">. Bgwhite (talk) 23:38, 19 November 2016 (UTC)
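The two searches discussed in this thread (tags carrying CX-internal attributes, and center tags with any attribute) could be expressed roughly as below. This is a sketch under the assumption that matching on data-cx-* and contenteditable is enough to spot the leftovers; the pattern and function names are hypothetical and not taken from CheckWiki or WPCleaner.

```python
import re

# Any tag that carries a data-cx-* or contenteditable attribute.
CX_ATTRIBUTE = re.compile(
    r"<\w+\b[^<>]*\s(?:data-cx-[\w-]+|contenteditable)\s*=[^<>]*>",
    re.IGNORECASE,
)
# A <center> tag with at least one attribute, e.g. <center class="">.
CENTER_WITH_ATTRS = re.compile(r"<center\s+[^\s<>][^<>]*>", re.IGNORECASE)


def find_cx_residue(text):
    """Return opening tags that contain CX-internal attributes."""
    return CX_ATTRIBUTE.findall(text)


def find_center_with_attributes(text):
    """Return <center ...> opening tags that have attributes."""
    return CENTER_WITH_ATTRS.findall(text)


sample = '<center data-cx-weight="356" contenteditable="true">x</center>'
print(find_cx_residue(sample))
print(find_center_with_attributes('<center class="">y</center>'))
```

Both helpers only report matches, in line with the suggestion above to detect these tags without fixing them automatically.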
@Bgwhite: Thanks for the new error #112. I see there are a few false positives that could be avoided:
- IUPAC nomenclature of organic chemistry
-O-CH<sub>3</sub> at carbon atom 15 is
- fr:Syriaque
-o- en syriaque occidental (ex. « saint
--NicoV (Talk on frwiki) 12:34, 15 December 2016 (UTC)
- NicoV I originally had it just looking for "-moz-" and "-webkit-", but there were false positives with urls. Now there has to be either a space or ; first ... ";-moz-".
- I just added "-o-" and yesterday was the first run. I think I'll turn it off. I'll add "-ms-" today and see how it goes from there. Bgwhite (talk) 22:00, 15 December 2016 (UTC)
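The rule described in this reply (a vendor prefix only counts when a space or ; comes directly before it) can be sketched as a single pattern. This is an illustration, not the Perl code used for #112; the prefix list follows the reply (-moz-, -webkit-, and -ms- as planned, with -o- left out), and the function name is made up.

```python
import re

# A vendor prefix preceded by a space or a semicolon, e.g. ";-moz-".
VENDOR_PREFIX = re.compile(r"[ ;]-(?:moz|webkit|ms)-")


def has_vendor_prefix(text):
    """True if the text contains a vendor-prefixed CSS property per the rule above."""
    return VENDOR_PREFIX.search(text) is not None


print(has_vendor_prefix('style="color:red;-moz-border-radius:3px"'))
print(has_vendor_prefix("http://example.org/a-moz-b"))
```

Requiring the leading space or semicolon is what filters out URLs, and leaving -o- out avoids the chemistry and Syriac false positives listed above.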
Linebreak inside internal links
CW doesn't catch stuff like this. Matěj Suchánek (talk) 19:15, 1 December 2016 (UTC)
- Matěj Suchánek This will be added to the new error #113. #113 will also include some <br> in wikilinks, for example [[Foo<br>]]. Bgwhite (talk) 09:42, 7 December 2016 (UTC)
- @Bgwhite: #113 seems to be available for a few days now, but there are no errors reported in enwiki or frwiki. Does it work ? --NicoV (Talk on frwiki) 12:36, 15 December 2016 (UTC)
- NicoV It's been added to the database, which is why it is showing up. It's not turned on yet. I'm waiting for #104 and #112 to calm down first. Bgwhite (talk) 22:12, 15 December 2016 (UTC)
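As a rough idea of what such a check involves, the sketch below looks for a literal newline or a <br> tag between [[ and ]]. It is an assumption about the shape of the check, not the #113 code, and it would also report file links with multi-line captions, which a real check would probably need to exclude.

```python
import re

# A wikilink whose content contains a newline or a <br> tag.
LINK_WITH_BREAK = re.compile(
    r"\[\[[^\[\]]*(?:\n|<br\s*/?>)[^\[\]]*\]\]",
    re.IGNORECASE,
)


def links_with_linebreak(wikitext):
    """Return wikilinks that contain a line break or a <br> tag."""
    return LINK_WITH_BREAK.findall(wikitext)


print(links_with_linebreak("See [[Foo<br>]] and [[Bar]]."))
```

Excluding square brackets inside the link keeps the pattern from running across two separate links on adjacent lines.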
34th error now detects {! in ruwiki. Again
66597 matches, in previous dump there were less than 1000. I think the problem is here:
if ( $project ne 'ukwiki' or $project ne 'ruwiki' or $project ne 'bewiki' ) {
Correct code should contain "and", not "or". Facenapalm (talk) 10:30, 28 November 2016 (UTC)
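The logic behind this remark is easy to verify: a value cannot be equal to two different names at once, so a chain of "not equal" tests joined with "or" is true for every project. The following is a Python analogue of the Perl condition, written only to illustrate the point; it is not the CheckWiki source.

```python
EXCLUDED = ("ukwiki", "ruwiki", "bewiki")


def buggy_condition(project):
    """Analogue of the quoted line: 'ne' tests joined with 'or'."""
    return project != "ukwiki" or project != "ruwiki" or project != "bewiki"


def intended_condition(project):
    """The same tests joined with 'and', as suggested."""
    return project != "ukwiki" and project != "ruwiki" and project != "bewiki"


def intended_condition_set(project):
    """Equivalent form using a membership test."""
    return project not in EXCLUDED


for name in ("ruwiki", "enwiki"):
    print(name, buggy_condition(name), intended_condition(name))
```

With "or", the check body runs on ruwiki as well, which matches the jump in reported matches described above.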
- Facenapalm This should be fixed now. Main problem was $project wasn't defined yet. Bgwhite (talk) 23:11, 28 November 2016 (UTC)
- Thanks! Are there some static analysis tools for Perl that you can use? I'm not sure if they can catch "or" instead of "and" (but some static analyzers for other languages can), but they definitely can catch using undefined variables. It's impossible to avoid all stupid errors, but static analyzers can help you to detect them immediately. Facenapalm (talk) 11:29, 29 November 2016 (UTC)
- Facenapalm The problem is I bring all types of stupid. The variable was declared, otherwise, an error would have been thrown out. It's very useful to have undefined variables, so Perl doesn't check for it. Perl has strict and warnings pragma that catches a lot of things. I also use Perl::Critic and NYTProf. Bgwhite (talk) 00:04, 30 November 2016 (UTC)
- @Bgwhite: the problem is still there. There are also some false positives in 43rd error: for example, I can't see any errors here. I'm not sure what algorithm you use for hiding "{{{!}}", so I suggest this: if project is ruwiki (ukwiki/bewiki?), replace all {{{!}} with {{(!}} (there is such a template with the same meaning in Russian Wikipedia, so the notice in the checkwiki table will be understandable), and all {{!}}}((?:\}\})*[^\}]) with {{!)}}\1 (see next message). This will allow scanning dumps with the typical algorithm without checking whether the current project is ruwiki in different errors. Facenapalm (talk) 11:50, 6 December 2016 (UTC)
- @Bgwhite: sorry for constantly troubling you, I just want to be sure that this topic isn't forgotten. I tested some parser features: it seems there is no sense in making replacements like {{!}}}}} -> {{!)}}}}, they're wrong (the last bracket will be processed as text, not the one that is after {{!}}), so the second replacement becomes even easier: {{!}}}([^}]) -> {{!)}}\1. Facenapalm (talk) 01:23, 19 December 2016 (UTC)
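The two replacements suggested in this thread can be tried out with a few lines of Python. This is only a sketch of the proposal as written in the messages above, not what CheckWiki does; the function name `mask_table_braces` is invented, and a closing {{!}}} at the very end of the text is not handled because the pattern requires a following character.

```python
import re

# "{{!}}}" followed by a character that is not another closing brace.
CLOSE_TABLE = re.compile(r"\{\{!\}\}\}([^}])")


def mask_table_braces(wikitext):
    """Apply the two suggested replacements before running the usual checks."""
    # First replacement: {{{!}} -> {{(!}}
    masked = wikitext.replace("{{{!}}", "{{(!}}")
    # Second replacement: {{!}}}([^}]) -> {{!)}}\1
    return CLOSE_TABLE.sub(r"{{!)}}\1", masked)


print(mask_table_braces('{{{!}} class="wikitable"\n{{!}} cell\n{{!}}}\n'))
```

After masking, the text no longer contains the single stray brace next to {{!}}, so the generic bracket checks can run without a per-project exception.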