Wikipedia talk:Correct typos in one click/Archive 1
This is an archive of past discussions on Wikipedia:Correct typos in one click. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Bug in 'Type' removes surrounding whitespace
See this edit. A workaround is to add surrounding whitespace in the manual input box. ~ Tom.Reding (talk ⋅dgaf) 21:53, 7 October 2019 (UTC)
- Tom.Reding, I just updated the script to stop requiring space on type option. Let me know if there is any problem. Uziel302 (talk) 22:38, 7 October 2019 (UTC)
Wow
I just cleared the whole page up! :) 𝕎𝕚𝕜𝕚𝕎𝕒𝕣𝕣𝕚𝕠𝕣𝟡𝟡𝟙𝟡 (talk) 11:31, 4 October 2019 (UTC)
- 𝕎𝕚𝕜𝕚𝕎𝕒𝕣𝕣𝕚𝕠𝕣𝟡𝟡𝟙𝟡, thank you very much for your significant contribution! Uziel302 (talk) 23:02, 7 October 2019 (UTC)
Access to additional lists
@Uziel302: this is great. 4 additional lists are mentioned (11, 12, 13, 14), but the js doesn't appear to run on them. Could you update the js to make it run on all subpages of Wikipedia:Correct typos in one click, or, more safely, all subpages matching \d+? ~ Tom.Reding (talk ⋅dgaf) 21:44, 7 October 2019 (UTC)
- Tom.Reding, I managed to load the script on additional pages but since I use the page to remove passages from it, it doesn't work when run on other pages. Maybe I can solve it by passing some variable of the page name to all applicable places, but this is too much work for now, I prefer just copying lists to the main page when it is emptied. Uziel302 (talk) 22:55, 7 October 2019 (UTC)
- @Uziel302: I got it working on both the base page and on numbered subpages - see User:Tom.Reding/test.js and the p_n / pn var usages. ~ Tom.Reding (talk ⋅dgaf) 00:11, 8 October 2019 (UTC)
- @Tom.Reding: thanks a lot, I implemented the change in the project script. Uziel302 (talk) 04:43, 8 October 2019 (UTC)
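The page check requested above (base page plus subpages matching \d+) could look roughly like the following. This is a minimal sketch, not the code in User:Tom.Reding/test.js: the function name `isProjectPage` and the constant `BASE` are hypothetical, and the pattern is only one way to express "numbered subpages".

```javascript
// Hypothetical helper: decide whether the script should run on a given page.
// BASE is the project page named in this discussion; the function name and
// the exact pattern are assumptions made for illustration.
const BASE = 'Wikipedia:Correct typos in one click';

function isProjectPage(pageName) {
  // MediaWiki page names may use underscores in place of spaces.
  const normalized = pageName.replace(/_/g, ' ');
  if (normalized === BASE) return true;
  // Accept only subpages whose name is all digits, e.g. ".../12".
  return normalized.startsWith(BASE + '/') &&
    /^\d+$/.test(normalized.slice(BASE.length + 1));
}
```

In a user script the page name would typically come from `mw.config.get('wgPageName')`. Restricting the match to digits keeps the script off pages such as this archive.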
"Labela"
"Labela", a parameter meaning "Label A" on certain templates, is being misinterpreted as a typo for "labels". @Uziel302: please fix this. 𝕎𝕚𝕜𝕚𝕎𝕒𝕣𝕣𝕚𝕠𝕣𝟡𝟡𝟙𝟡 (talk) 14:39, 8 October 2019 (UTC)
- 𝕎𝕚𝕜𝕚𝕎𝕒𝕣𝕣𝕚𝕠𝕣𝟡𝟡𝟙𝟡, there are many false positives here, when you click on the remove button it saves the removed word on edit history and for the next scan I will filter all the words that have been removed. Uziel302 (talk) 15:06, 8 October 2019 (UTC)
Adding template to user page
To increase visibility of this page, would it be possible to add a template to user pages, such as the "This user lets SuggestBot recommend articles to edit {{{twice a month}}}.", say something like "This user uses the Correct typos in one click tool"? Bellowhead678 (talk) 11:57, 12 October 2019 (UTC)
- You are welcome to use whatever template you want, I found it more effective to increase visibility through edits with link in summary. Uziel302 (talk) 21:34, 12 October 2019 (UTC)
Bug when passage can't be found
When the passage can't be accessed, then instead of coming up with a message (like "Success...." or "This word couldn't be found"), I get a very brief message and then the page refreshes, moving the page down a certain amount so I lose my place. Has this happened to anyone else? This happens sometimes even when I click on the bottom article on the list. I've tried this on both Chrome and Microsoft Edge and the issue still occurs. It also occurs on a MacBook (using Safari and Chrome). However, it doesn't happen on my phone. Bellowhead678 (talk) 11:21, 12 October 2019 (UTC)
- Bellowhead678, probably something adds stuff to the paragraph code after the number. I updated the script to take only the number. As for the refresh, this is intentional: if the passage is really missing, it usually means you are working on an old version and need to refresh. Uziel302 (talk) 21:27, 12 October 2019 (UTC)
- Thanks, it still happens but I don't move down the page so it's not a big issue anymore - I can just try again and normally it then works. Bellowhead678 (talk) 21:50, 12 October 2019 (UTC)
- I guess the move down depends on the amount of text on the page, it affects the ability of the browser to relocate after refresh. Uziel302 (talk) 04:32, 13 October 2019 (UTC)
- I've found that it works fine on my desktop as long as I use mobile mode. Bellowhead678 (talk) 09:56, 13 October 2019 (UTC)
Removing typos when the article no longer exists
There is a typo listed for Métis Nation of Ontario, but this article no longer exists. When I click remove, then it says "paragraph is missing" (understandably as the article doesn't exist!). Is it possible to write something to delete a listing if its article no longer exists? Bellowhead678 (talk) 10:02, 13 October 2019 (UTC)
- Bellowhead678, this is a rare case and it is treated by simply editing the passage and deleting its content. Uziel302 (talk) 22:29, 13 October 2019 (UTC)
How often does the page refresh?
I was wondering how often the page refreshes, and whether you have to add new typos manually or whether you use a spell checker to find them automatically? Bellowhead678 (talk) 11:38, 12 October 2019 (UTC)
- I find the typos by running some C script over the 70 GB of Wikipedia dumps, it takes a couple of hours on my digital ocean instance. I have some changes to make, and I also want to make sure I don't get typos already corrected, so I'll probably wait for 20th's dumps to be ready. Uziel302 (talk) 21:32, 12 October 2019 (UTC)
- What do you mean by 20th's dumps? Bellowhead678 (talk) 21:49, 12 October 2019 (UTC)
- Bellowhead678, dumps are published twice a month. Uziel302 (talk) 04:29, 13 October 2019 (UTC)
- Ah I see, thanks! Bellowhead678 (talk) 07:46, 13 October 2019 (UTC)
- Bellowhead678, I didn't want the project to wait so I created a wider list for the scan and removed the list we already used in last scan. Next dump I will bring back the old list to see what's left. Uziel302 (talk) 22:32, 13 October 2019 (UTC)
Detecting words with a full stop or comma in the middle
Looking at the list of typos at Wikipedia:Typo Team/moss/A, a huge number of them are words with a full stop or a comma in the middle of them. Is it possible to include these words in the list of possible typos? Bellowhead678 (talk) 12:04, 13 October 2019 (UTC)
- Bellowhead678, please give me an example. I have an issue with words like ab.c because many website names are written like this. Uziel302 (talk) 22:25, 13 October 2019 (UTC)
- See for example this particular section of the list. The first one is the article Arcadian Adventures with the Idle Rich which contains "publication,Arcadian" (I'll correct it once you've seen this message). Is it possible to look for something like word1.word2 where word1 and word2 are accepted words? Bellowhead678 (talk) 22:28, 13 October 2019 (UTC)
- It is possible but my current model isn't built for such things. You can simply run the search insource:/[a-z],[a-z]/ to see the millions of occurrences and insource:/[a-z],and/ to fix the frequent ,and issue. AWB is great for such tasks. Uziel302 (talk) 09:47, 14 October 2019 (UTC)
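The word1.word2 idea raised in this thread, flagging a joined pair only when both halves are accepted words, can be sketched as below. It is illustrative only and not part of the project's scanner: `findJoinedWords` and the `dictionary` set are hypothetical names, and the dictionary check is what keeps website-style strings such as ab.c out of the results.

```javascript
// Hypothetical sketch: flag "word1,word2" or "word1.word2" only when both
// halves are known words. `dictionary` is an assumed Set of lowercase words.
function findJoinedWords(text, dictionary) {
  const hits = [];
  // Letters, then a comma or full stop, then letters, with no space between.
  const pattern = /\b([A-Za-z]+)([.,])([A-Za-z]+)\b/g;
  for (const m of text.matchAll(pattern)) {
    const [whole, left, , right] = m;
    if (dictionary.has(left.toLowerCase()) && dictionary.has(right.toLowerCase())) {
      hits.push(whole);
    }
  }
  return hits;
}
```

A pair like "publication,Arcadian" is reported when both words are in the dictionary, while "example.com" is skipped as long as "com" is not listed as a word.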
Duplicate entries
@Uziel302: An issue that I've been noticing frequently within the lists of typos to correct involves duplicate entries. Sometimes, I have seen the same articles listed twice in a list. Sometimes, three times. And once seven times! I don't know if these are cases of the same typo being made multiple times within one article and the tool erroneously reporting them with identical contexts, or if there is only one instance of the typo per article and the tool is experiencing another type of error. 𝕎𝕚𝕜𝕚𝕎𝕒𝕣𝕣𝕚𝕠𝕣𝟡𝟡𝟙𝟡 (talk) 16:47, 14 October 2019 (UTC)
- 𝕎𝕚𝕜𝕚𝕎𝕒𝕣𝕣𝕚𝕠𝕣𝟡𝟡𝟙𝟡, this is some bug I'm trying to fix for next list. For now, just treat one and remove the others. I don't think same context can be twice in an article. If there are duplicates not in a row, let me know. Uziel302 (talk) 17:05, 14 October 2019 (UTC)
- One thing would be to discard typos where the "typo" is also in the name of the article, e.g. in the article about the artist n0thing, n0thing was coming up as a typo. Bellowhead678 (talk) 08:55, 15 October 2019 (UTC)
Finding typos in foreign language excerpts
Of the false positives, often the typos found are actually words in the middle of quotes in languages other than English. Is it possible to filter out words which are in the middle of text in a different language? Bellowhead678 (talk) 09:58, 13 October 2019 (UTC)
- If there is something that marks that text as different language, otherwise I can't tell. I just excluded poem tag which had many foreign languages. Uziel302 (talk) 22:31, 13 October 2019 (UTC)
- Ah, not that I know of. If possible, could you exclude pages with Language in the title? Bellowhead678 (talk) 13:22, 14 October 2019 (UTC)
- Bellowhead678, that can be done easily. I guess it isn't much work to do it manually, but when I get the output in one list, removing passages with some word in the title is a simple regular expression. Uziel302 (talk) 19:05, 20 October 2019 (UTC)
Bug
This edit fixed the wrong word - it changed the first instance of rotein it found to protein rather than going to the correct one. Bellowhead678 (talk) 13:04, 15 October 2019 (UTC)
- Bellowhead678, thanks for flagging this. The issue is that I use the first line of context to make sure I get the right string, and when the word comes first in the passage there is no preceding context. Making sure the context line is longer than the word itself is my top priority for the next list, to prevent such cases. Maybe I'll set a minimum of 15 chars for the replacement context. Uziel302 (talk) 19:41, 20 October 2019 (UTC)
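A context-guarded replacement along the lines described in this thread might look like the sketch below. It is not the project script: `replaceInContext` and `MIN_CONTEXT` are hypothetical names, the value 15 simply mirrors the figure floated in the reply, and refusing ambiguous contexts is an added assumption.

```javascript
// Hypothetical sketch of a context-guarded replacement.
// MIN_CONTEXT = 15 follows the figure floated in the thread; all names
// here are assumptions for illustration.
const MIN_CONTEXT = 15;

function replaceInContext(articleText, context, typo, correction) {
  // Refuse contexts that are too short or that do not contain the typo.
  if (context.length < MIN_CONTEXT || !context.includes(typo)) return null;
  const first = articleText.indexOf(context);
  // Refuse when the context is missing or appears more than once.
  if (first === -1 || articleText.indexOf(context, first + 1) !== -1) return null;
  // A replacer function avoids special "$" handling in the correction.
  const fixedContext = context.replace(typo, () => correction);
  return articleText.slice(0, first) + fixedContext +
    articleText.slice(first + context.length);
}
```

Returning `null` instead of guessing means a bare "rotein" context is rejected rather than applied to the first "protein" in the article.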
Moving to talk
The "move to talk" is something that I think can definitely be improved. Currently the message given is (for example):
from Wikipedia:Correct typos in one click
reevocation->revocation? context:
~~~ professional [[ice hockey]] team in [[Serie A (ice hockey)|Serie A1]], Italy's top division. Cortina is also the start and end point of the annual [[Dolomites Gold Cup Race]], a historic reevocation
reevocation event for production cars on public roads.{{sfn|Fodor|1975|p=350}} The town hosted the Red Bull Road Rage in 2009.<ref>{{cite web|url=http://www.redbull.it/cs/Satellite/it_IT/Article/Red-Bull- ~~~
For this example, I added on Reevocation seems to be a typo but it's not obvious what it should be - can anyone figure out what it should be? and then signed it.
However in general I think something like:
A typo was detected in this article using WP:Correct typos in one click. Reevocation seems to be a typo but users on that page couldn't see an obvious correction. Can you help figure out what it should be? The context was:
would be more understandable. Thoughts? Bellowhead678 (talk) 10:17, 13 October 2019 (UTC)
- Bellowhead678, it is possible, and you can offer your own version for the script, I don't have time right now to deal with regex needed to transform the passage to your format so I just move it as is, in hope most people will understand what it's about. Uziel302 (talk) 22:28, 13 October 2019 (UTC)
- Bellowhead678 see the new version, e.g. Talk:Santa Cristina Gela. Uziel302 (talk) 19:35, 20 October 2019 (UTC)
- Thanks, that looks better. Bellowhead678 (talk) 20:06, 20 October 2019 (UTC)
Template:typo help inline
Is it reasonable to replace "move to talk" with add template? Bellowhead678, would you prefer it? Uziel302 (talk) 19:44, 20 October 2019 (UTC)
- Yes, I think this would be better. Bellowhead678 (talk) 19:46, 20 October 2019 (UTC)
- Bellowhead678, done, let me know if there is room for improvement. Uziel302 (talk) 15:25, 21 October 2019 (UTC)
Thanks to all the contributors!
Thank you very much for helping me in this mission.
User:Titodutta, User:Lee Vilenski, User:DisillusionedBitterAndKnackered, please update your script to User:Uziel302/typo.js, this is the script I'll be maintaining from now on. I have no access to edit User:Matankic/typos.js. Uziel302 (talk) 15:11, 1 October 2019 (UTC)
And me? 𝕎𝕚𝕜𝕚𝕎𝕒𝕣𝕣𝕚𝕠𝕣𝟡𝟡𝟙𝟡 (talk) 11:26, 2 October 2019 (UTC)
- 𝕎𝕚𝕜𝕚𝕎𝕒𝕣𝕣𝕚𝕠𝕣𝟡𝟡𝟙𝟡, thank you very much. I didn't tag you cause you didn't have the old script. Your work is highly appreciated! Uziel302 (talk) 12:49, 2 October 2019 (UTC)
- Indeed, this is a great tool. I would be interested to hear more about how this page came about and how it works - would it be possible to write down some of the history of the tool somewhere? I think the Signpost would probably be interested in doing an article about it. Bellowhead678 (talk) 11:34, 12 October 2019 (UTC)
- Bellowhead678, I started the tool on Hebrew Wikipedia, where it has been running in similar form since April and has helped correct over 17000 typos. I don't know what Signpost is, whom I should talk to, or for what reason. Uziel302 (talk) 10:47, 14 October 2019 (UTC)
- Thanks, sorry I should have linked! The Signpost is the Wikipedia newsletter - I thought they might be interested in writing an article about the tool. Bellowhead678 (talk) 10:58, 14 October 2019 (UTC)
- Bellowhead678, so it's kind of a newspaper about Wikipedia, we have one on Hebrew Wikipedia too. I didn't understand whom I should talk to or where I should write in order to get the story included there. Uziel302 (talk) 19:01, 20 October 2019 (UTC)
@Uziel302: I have the old script. Please can you tell me how to install the new script. Thanks, Willbb234Talk (please {{ping}} me in replies) 10:40, 23 October 2019 (UTC)
- Sorry, I think I do have the new script, but it doesn't seem to be working for me. The replace, etc. buttons aren't appearing. Willbb234Talk (please {{ping}} me in replies) 10:41, 23 October 2019 (UTC)
- Ignore, I installed your friend's script. Should that work now? Willbb234Talk (please {{ping}} me in replies) 10:47, 23 October 2019 (UTC)
- Willbb234, User:Uziel302/typo.js is working. Uziel302 (talk) 11:16, 23 October 2019 (UTC)
The end?
@Uziel302: Are we done with this project? Every typo seems to have been fixed, and all the pages are blank. Are new typos going to start appearing, or is this the end? 𝕎𝕚𝕜𝕚𝕎𝕒𝕣𝕣𝕚𝕠𝕣𝟡𝟡𝟙𝟡 (talk) 12:55, 21 October 2019 (UTC)
- 𝕎𝕚𝕜𝕚𝕎𝕒𝕣𝕣𝕚𝕠𝕣𝟡𝟡𝟙𝟡, this is not the end, I have prepared a larger list to search for on Wikipedia and am waiting for the new dump file to be ready. Uziel302 (talk) 14:33, 21 October 2019 (UTC)
- Oh, good. :) 𝕎𝕚𝕜𝕚𝕎𝕒𝕣𝕣𝕚𝕠𝕣𝟡𝟡𝟙𝟡 (talk) 15:55, 21 October 2019 (UTC)
- Hasn't the new dump been released already? Bellowhead678 (talk) 18:05, 21 October 2019 (UTC)
- @Bellowhead678: By any chance, do you know where it is? 𝕎𝕚𝕜𝕚𝕎𝕒𝕣𝕣𝕚𝕠𝕣𝟡𝟡𝟙𝟡 (talk) 12:48, 22 October 2019 (UTC)
- This is it, not that I have any idea how to use it!
- "If you are reading this on Wikimedia servers, please note that we have rate limited downloaders and we are capping the number of per-ip connections to 2. This will help to ensure that everyone can access the files with reasonable download times. Clients that try to evade these limits may be blocked. Our mirror sites do not have this cap." 𝕎𝕚𝕜𝕚𝕎𝕒𝕣𝕣𝕚𝕠𝕣𝟡𝟡𝟙𝟡 (talk) 13:04, 22 October 2019 (UTC)
- @Uziel302: ??? 𝕎𝕚𝕜𝕚𝕎𝕒𝕣𝕣𝕚𝕠𝕣𝟡𝟡𝟙𝟡 (talk) 13:04, 22 October 2019 (UTC)
- 𝕎𝕚𝕜𝕚𝕎𝕒𝕣𝕣𝕚𝕠𝕣𝟡𝟡𝟙𝟡, already downloaded the new dumps, I will upload new lists soon. Uziel302 (talk) 14:13, 22 October 2019 (UTC)
- @Uziel302: Thank you. 𝕎𝕚𝕜𝕚𝕎𝕒𝕣𝕣𝕚𝕠𝕣𝟡𝟡𝟙𝟡 (talk) 15:41, 22 October 2019 (UTC)
- 𝕎𝕚𝕜𝕚𝕎𝕒𝕣𝕣𝕚𝕠𝕣𝟡𝟡𝟙𝟡, Bellowhead678, User:Ira Leviton, the new list is live. Thanks for your help. Uziel302 (talk) 03:53, 23 October 2019 (UTC)
- Thanks, is there a reason the new list is shorter than the last ones seem to have been? i.e. it only went up to 12 pages rather than the full 20. Bellowhead678 (talk) 13:23, 25 October 2019 (UTC)
- Bellowhead678, the reason was that I searched for the same words for the second time, so many of them were already removed or fixed. On latest list I searched for extended list of suspect words, which is why I uploaded about half of the output so far. Uziel302 (talk) 17:44, 27 October 2019 (UTC)
Same "typos" cropping up again
A lot of the "typos" which were removed last time have appeared again. Is it possible to remove things permanently from the list rather than have them appear again next time you run it? Bellowhead678 (talk) 20:12, 23 October 2019 (UTC)
- Also, I've just removed about eight examples of "apsed" from the list, all used correctly - is it possible to add this to the list of correct words? Bellowhead678 (talk) 21:24, 23 October 2019 (UTC)
- Could you also add "landsite" to the list of correct words please? Bellowhead678 (talk) 14:04, 26 October 2019 (UTC)
- @Uziel302: Also any of the "List of drugs" series of articles, or the "Autobiography of a Scalawag" article, both of which seem to produce huge numbers of false positives. Thanks! Bellowhead678 (talk) 16:19, 27 October 2019 (UTC)
- Bellowhead678, I did remove from the list of suspect words anything that was removed in this project and documented on edit summary (documentation started by Oct 1st, my bad). In Autobiography of a Scalawag I added poem tag, which will exclude it next time. I can put here lists ordered by articles, I order alphabetically because I thought it is easier this way. List of drugs are many articles, I can remove them by searching this string on output before posting. You can also search it on each page and remove all the passages. But if you remove without the script, the words won't be recorded on the edit history for future automatic removal. Uziel302 (talk) 17:41, 27 October 2019 (UTC)
- Ok great, many thanks! Bellowhead678 (talk) 17:47, 27 October 2019 (UTC)
Also "Early life and career of Recep Tayyip Erdoğan" seems to be cropping up quite a few times for some reason Bellowhead678 (talk) 23:22, 27 October 2019 (UTC)
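The filtering described in this thread, where words dismissed through the script are read back from edit summaries and dropped from the next scan, can be sketched as follows. The summary format `removed: <word>` and both function names are assumptions for illustration; the real summaries written by the project script may look different.

```javascript
// Hypothetical sketch: drop suspect words that reviewers already dismissed.
// The summary format "removed: <word>" is an assumption made for illustration.
function dismissedWords(editSummaries) {
  const words = new Set();
  for (const summary of editSummaries) {
    const m = /^removed: (\S+)$/.exec(summary.trim());
    if (m) words.add(m[1].toLowerCase());
  }
  return words;
}

function filterSuspects(suspects, editSummaries) {
  const dismissed = dismissedWords(editSummaries);
  return suspects.filter((w) => !dismissed.has(w.toLowerCase()));
}
```

This also shows why removals made by hand are lost: an entry deleted without the script leaves no matching summary, so the word stays in the suspect list.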
Frequent words
I am putting here 100 words I got on the last scan, with their number of occurrences, to check manually. I excluded them from the list so they won't spam it.
parentid - 500 segmentid - 100 draconium - 49 subcat - 29 buyrate - 22 tailnum - 21 amlaka - 19 demorph - 18 electionist - 17 liensman - 17 altcat - 16 storyed - 16 bhukti - 15 broadline - 15 pilosaurus - 15 quinola - 15 replatted - 15 enfoeffed - 14 advisorship - 13 gladed - 13 mawards - 13 subwin - 13 luarin - 12 jrank - 11 liensmen - 11 lirated - 11 practing - 11 rebanding - 11 taami - 11 turretted - 11 unservicable - 11 arsoned - 10 hikey - 10 ethnicaly - 9 pitbox - 9 plugable - 9 redelineation - 9 remphasised - 9 taantrik - 9 tectate - 9 archdiaconate - 8 beddown - 8 buckoff - 8 debutted - 8 demorphs - 8 getdi - 8 glilim - 8 itometer - 8 larai - 8 oidea - 8 reroled - 8 shillingi - 8 terminii - 8 todate - 8 violinu - 8 bogrim - 7 collarred - 7 coveror - 7 decombent - 7 dichromated - 7 emmigrated - 7 frontlit - 7 inedit - 7 manoevre - 7 nitramene - 7 overram - 7 parakiya - 7 poruke - 7 privire - 7 rapprochment - 7 reshirted - 7 scientarum - 7 subfaction - 7 topynomy - 7 ubyte - 7 unganging - 7 buyrates - 6 debutting - 6 digher - 6 dihidroxy - 6 endyr - 6 exceled - 6 gamette - 6 gamettes - 6 heriditary - 6 hokei - 6 kabasa - 6 layreader - 6 medresse - 6 mubber - 6 pallisaded - 6 pedicelate - 6 phaerie - 6 polytocy - 6 ratmen - 6 reinaugural - 6 runable - 6 spellt - 6 stonecut - 6 toboy - 6 urundai - 6 werer - 6 xilom - 6 Uziel302 (talk) 20:43, 25 October 2019 (UTC)
- I think unservicable -> unserviceable, heriditary -> hereditary, spellt -> spelled and liensman -> linesman should generally be true? Bellowhead678 (talk) 11:53, 26 October 2019 (UTC)
- Bellowhead678, can you add to AWB list? Maybe User:WereSpielChequers can? Uziel302 (talk) 17:21, 27 October 2019 (UTC)
- I've requested it. Bellowhead678 (talk) 17:37, 27 October 2019 (UTC)
- Always a good idea to do a search first, "liensman" would appear to be archaic and rare, but where it is used on Wikipedia it seems to be correct. heriditary->hereditary I have just fixed all 14 - that should be fine for AWB. ϢereSpielChequers 17:50, 27 October 2019 (UTC)
- done and raised with AWB
- remphasised - reemphasised
- exceled - excelled
- debutting/debutted - debuting/debuted
- Ones that look like words or jargon
- "gamette/gamettes", quinola (a pirate ship) and mubber
- manoevre was mostly typos, but I think may be a French word and therefore not appropriate for AWB
- rebanding included some typos of rebranding that I have corrected, but most look like a new word, jargon about bands in the electromagnetic spectrum.
- reroled looks like wikt:reroled
- Not suitable for AWB
- practing could be either practicing or practising
- ϢereSpielChequers 23:42, 27 October 2019 (UTC)
Typos staying in place in the list page after being fixed
I fixed this typo at 2011 East Africa drought yesterday, but it wasn't removed from the list of typos. I've just removed this now. The same thing happened with Mackay of Aberach and Drobo, Ghana. I think this has happened a few times now, I quite often see typos that I think I've seen before. Any idea why that's happening? Bellowhead678 (talk) 18:32, 3 November 2019 (UTC)
- Bellowhead678, thanks for reporting. When you click a button you send commands from the browser to the server. If you close the browser or move to another page within it, it stops sending commands. If the command to correct the article went through but the command to remove the passage was lost because of a change in your browser, the paragraph will stay on the list even though the article is fixed. Does it happen that you click on a link instead of staying on the project page until it has finished? I recommend opening new tabs while working on the project pages, and let me know if it still happens.
- You did a great job treating most of the lists and as a thank you gift I will work on a new list ;) Uziel302 (talk) 14:06, 8 November 2019 (UTC)
- Bellowhead678, the last list has many fixed typos; I had trouble excluding old stuff and didn't have time to go over the history of all the pages. It's a shorter list so I hope we will manage. Once the Nov 20 dumps are ready I will go over the edit history and exclude only removed words. Fixed words won't be in the new dump. Uziel302 (talk) 01:18, 9 November 2019 (UTC)
- Bellowhead678, it seems like the updates to the project page are getting lost when sent in high rate, probably it can't edit the same project page twice in the same second or something. The solution is probably to move the project to toolforge, and not use passage deletion as means to mark treatment. Uziel302 (talk) 06:56, 23 November 2019 (UTC)
Weird indents
In several places, across a few different lists, the entry has an indent before the header, making it display weirdly. Any idea what is causing this? Bellowhead678 (talk) 21:42, 3 December 2019 (UTC)
- It was a mistake I made when replacing passages in the file: I didn't notice I had replaced them with a space. Uziel302 (talk) 00:09, 15 December 2019 (UTC)
List of incorrect typos
I thought I'd start a list of incorrect typos, rather than creating a new section for each one. Feel free to add to the list Bellowhead678 (talk) 21:26, 2 November 2019 (UTC)
- repaches (something to do with cycling races)
- recodable (as in something can be coded again)
- rebelay and unbelayed (to do with a belay device for climbers)
- nowait (something in coding)
- yo-yoer (yo-yo player)
- underfin (fin on the underside of a boat)
- uncoxed (in rowing, meaning without a cox)
- surpanch (head of a village in India)
- Rec.Sport.Soccer (there are a load of these in List 17)
- stobbies in Rug making - Leschnei (talk) 13:01, 26 December 2019 (UTC)
Correct protocol for removing corrections from the list
The word stobbies is listed as a possible correction in Rug making. I found a book that lists this name for prodded rug; I added the reference to the article. I am not a regular member of this project, so I was wondering about the proper procedure for removing this typo from the list - should I just delete the section? I have added stobbies to the List of incorrect typos above. Thanks for the help, Leschnei (talk) 13:08, 26 December 2019 (UTC)
- Thanks for doing that, to remove the entry just click "Remove" under the name of the article. Bellowhead678 (talk) 21:18, 26 December 2019 (UTC)
- Leschnei, if you delete a passage I won't notice. When you use the project script and click "Remove" button, it is removed with an edit summary on the project page, which I later convert to list of dismissed words. Uziel302 (talk) 04:54, 27 December 2019 (UTC)
- Ah. I thought that the script was being used to generate the list - if I had paid better attention to the explanation at the top I would have realized that it is for using the list. Thank you both for the explanation. Leschnei (talk) 12:29, 27 December 2019 (UTC)
Typos with punctuation in the middle
Thanks for creating the new lists Uziel302 - most of these seem to have commas in the middle. Following on from previous lists which mostly had typos with full stops in the middle, is it possible to have some without punctuation in the middle? While the suspected typos are quite likely to actually be typos, they are mostly just missing spaces which isn't that interesting (for me at least) to correct. It would be nice to have some "normal" typos again which take some thinking about, rather than just adding extra spaces which is relatively mundane. Bellowhead678 (talk) 18:00, 8 January 2020 (UTC)
- Bellowhead678, thanks for the feedback. Indeed all the new lists are adding spaces. I will update you when a new typos list is ready. In the meantime you can help the project on wikibooks and wikitravel, which have many typos lists. Thanks for your great work. Uziel302 (talk) 18:05, 8 January 2020 (UTC)
- Bellowhead678, I added new list. Uziel302 (talk) 08:20, 15 January 2020 (UTC)
Tool not working in mobile view
@Uziel302: the tool isn't working in the mobile view - the buttons for replace, type etc. do not appear. Anyone else having this issue? Bellowhead678 (talk) 16:00, 20 February 2020 (UTC)
- Bellowhead678, I get on console "importScript is not defined" so I guess Wikipedia removed support for this function on mobile view. What you need to do is edit the common.js page to import scripts using full path, like this:
mw.loader.load('//en.wikipedia.org/w/index.php?title=User:Uziel302/typo.js&action=raw&ctype=text/javascript');
- For me it fixed the issue. Let me know if you have any problem. Uziel302 (talk) 06:11, 21 February 2020 (UTC)
Script doesn't remove entries where the paragraph is missing
I noticed that the script doesn't remove entries when the paragraph is missing. Clicking the "remove" button does not remove the entry either. Thanks, Darylgolden(talk) Ping when replying 00:44, 10 February 2020 (UTC)
- Darylgolden, can you please elaborate on the scenario for the missing paragraph? The one I know of is when you go top-down through the list: after paragraph number 1 is removed, paragraph number 2 becomes number 1 in the updated page, but since there was no refresh, the script still looks for paragraph 2 in the 2nd place, and it is no longer there. This is why the instruction is to go over the list bottom-up. Uziel302 (talk) 14:15, 21 February 2020 (UTC)
- I don't quite remember. I will reply if I remember when there's a new batch of typos. Darylgolden(talk) Ping when replying 02:58, 29 February 2020 (UTC)
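The index shift behind the bottom-up instruction can be illustrated with a small sketch. It is a simplification with hypothetical names: it removes whatever sits at the stale position, whereas the script discussed here reportedly reports the paragraph as missing instead.

```javascript
// Hypothetical sketch: remove list entries by the positions seen at load time.
// Positions are 0-based; `bottomUp` selects the order of removal.
function removeByOriginalIndex(entries, indices, bottomUp) {
  const list = entries.slice();
  const order = indices.slice().sort((a, b) => (bottomUp ? b - a : a - b));
  for (const i of order) {
    // Every removal shifts all later entries up by one position.
    list.splice(i, 1);
  }
  return list;
}
```

With entries p1 to p4 and the first two marked for removal, going bottom-up leaves p3 and p4 as intended, while going top-down leaves p2 and p4, because p2 has already moved into the first position when the second removal runs.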
Script missing the first letter of words
Sometimes the script seems to think there is a typo because it has missed the first letter of a word, for example here. Any idea what's causing this? Bellowhead678 (talk) 10:44, 9 February 2020 (UTC)
- Bellowhead678, there was a typo there and you fixed it. The list creation wasn't wrong, but the correction script indeed has a bug there. When no preceding context is available, the script replaces the string, e.g. "esonator" with "resonator", without noticing it is part of a good word. To prevent those cases I added jjjjj to the output whenever there isn't preceding context, and I usually delete all the replacements that have jjjjj, but the last time I tried to manually choose which of them to remove, and I probably made a bad choice on the example you brought here. Uziel302 (talk) 09:48, 2 March 2020 (UTC)
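One possible guard against the "esonator" inside "resonator" case is to replace only whole words. This is a sketch, not the project script's actual fix: replaceWholeWord is a hypothetical name, and the check assumes plain ASCII letters.

```javascript
// Hypothetical guard: replace the typo only when it is not glued to a
// letter on either side (ASCII letters only in this sketch).
function replaceWholeWord(text, typo, fix) {
  // Escape regex metacharacters so the typo is matched literally.
  const escaped = typo.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
  const pattern = new RegExp('(?<![A-Za-z])' + escaped + '(?![A-Za-z])', 'g');
  return text.replace(pattern, fix);
}

// Plain substring replacement damages the good word.
const naive = 'a resonator hums'.replace('esonator', 'resonator');

// The guarded version leaves the good word alone and still fixes a real typo.
const guardedGood = replaceWholeWord('a resonator hums', 'esonator', 'resonator');
const guardedTypo = replaceWholeWord('an esonator hums', 'esonator', 'resonator');

console.log(naive, '|', guardedGood, '|', guardedTypo);
```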
; represented as jjjjj
In a few places, where there is a semicolon in the article, it is coming up as jjjjj in the text around the typo in this page. For example here. Is it easy to fix that? Bellowhead678 (talk) 15:44, 23 February 2020 (UTC)
- As mentioned above, jjjjj is just a sign for me where no preceding context is available. I usually remove all of them on the post-processing of the list. Uziel302 (talk) 09:53, 2 March 2020 (UTC)
Template button not working
For some reason, the template button is finding that the word is missing when it isn't, for example here. Bellowhead678 (talk) 13:44, 26 February 2020 (UTC)
- The example you gave has the same jjjjj discussed above. Uziel302 (talk) 09:54, 2 March 2020 (UTC)
Stugged
Hi - I'm going through these lists a lot lately and just wanted to flag something that was incorrectly tagged as an incorrect spelling - the word "stugged" has a UK English definition here and should maybe be added to whatever dictionary is being used for this project. If there's a better way to report something like this next time please let me know. Cheers! Paradoxsociety 03:00, 10 May 2020 (UTC)
- User:Paradoxsociety, thank you for your work on the project. I used a US dictionary from aspell. I usually take all the dismissed words and remove them on the next scan so no need to report specific word. Uziel302 (talk) 14:50, 15 May 2020 (UTC)
- Happy to help, and sounds good! I should note though, that there have been a handful of occasions where I'll dismiss the entry because I either checked the page and noticed multiple occurrences, fixing it there first, or noticed the typo no longer existed, etc. Paradoxsociety 21:28, 15 May 2020 (UTC)
- Another note - see my recent Trigonoceratidae edits. Looks like the suggested auto-fix was actually not a valid spelling. Not sure how that happened. Paradoxsociety 21:39, 15 May 2020 (UTC)
- Paradox, I am well aware that clicking remove doesn't mean the word is necessarily legitimate, I save the name of the article in which the removal was done so maybe in the future I'll search for all the dismissed words on other articles. As of the typo Distributution that was suggested, I can't say specifically where it came from but I remember taking some chemistry books and manipulating their words so I suspect the typo appeared there. I should create a new list and include only words that I can trace to their source. Thanks for your feedback. Uziel302 (talk) 21:09, 18 May 2020 (UTC)
- Paradox, Bellowhead678, now all lists are updated with lower case words that are variations of aspell small list, excluding all words that appear on aspell largest list. We have many names that should be capitalized and many foreign words that need to be removed here, in order to exclude in future scans. Some foreign words appear in English context and should be fixed. Thanks a lot for your great work. Uziel302 (talk) 09:19, 25 May 2020 (UTC)
- One more thing, I excluded anything that appeared 7+ times, you can find the list here, some are real typos. Uziel302 (talk) 09:22, 25 May 2020 (UTC)
Scripts need kittens too!
- — Preceding unsigned comment added by Coolabahapple (talk • contribs) 04:22, 25 October 2019 (UTC)
Recent typos
Uziel302, there seem to have been fewer typos appearing recently - have we fixed most of them, or have you been busy? Anything we can do to help? Bellowhead678 (talk) 11:18, 13 March 2020 (UTC)
- Bellowhead678, indeed you fixed all the word variations found on latest dump. I think the opportunity lies on capital letters, and I ran a scan for the same variations, just capitalized. Here are the frequent words, I thought it is better to treat them separately instead of dismissing the same word multiple times. If you find real typos there, I can upload the replacements here. Words appearing up to 4 times I started to upload now to the project. There are many false positives because of names, but I see there are many real typos too. Every variant has its type, omit, double etc. so if you figure out which types have less real typos, we can focus on the more probable types of typos. Uziel302 (talk) 21:49, 14 March 2020 (UTC)
- Bellowhead678, I tried to do some filtering since you got very low rate of corrections, let me know if it helped. Uziel302 (talk) 19:22, 29 March 2020 (UTC)
- Thanks, that's much better now! Bellowhead678 (talk) 20:35, 29 March 2020 (UTC)
- Bellowhead678, I just added over 1000 low case corrections to pages 2,3,4. Uziel302 (talk) 07:55, 11 April 2020 (UTC)
- Thanks! By the way, "netwurk" seems to crop up a lot of the time but is actually a word. Bellowhead678 (talk) 21:04, 12 April 2020 (UTC)
- Bellowhead678, pages 0-5 are low case corrections I got from this month's dump. Uziel302 (talk) 21:08, 8 May 2020 (UTC)
- Bellowhead678, pages 2-6 are double words. This method was highly effective on Hebrew Wikipedia but in English it seems to be more frequent to double words. Let me know if you have any idea which types of words are doubled (I already removed capital letters and words that appeared over 5 times on the scanned part). Uziel302 (talk) 21:16, 18 May 2020 (UTC)
- Bellowhead678, I moved all the remains of the previous batch to page 1, all other pages are new batch. Uziel302 (talk) 08:33, 20 June 2020 (UTC)
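For illustration, a doubled-word scan of the kind mentioned above could look like the sketch below. It is an assumption about the general technique, not Uziel302's actual scanner: findDoubledWords is a hypothetical name, and it only considers lower-case ASCII words, in the spirit of the "removed capital letters" filter.

```javascript
// Hypothetical scan: collect lower-case words that are immediately repeated.
function findDoubledWords(text) {
  // \1 must match the same word again; the trailing \b stops "the theme" from matching.
  const pattern = /\b([a-z]+)\s+\1\b/g;
  const found = [];
  let match;
  while ((match = pattern.exec(text)) !== null) {
    found.push(match[1]);
  }
  return found;
}

const doubled = findDoubledWords('the the cat had had it');
console.log(doubled);
```

The sample also shows why English produces false positives: "had had" is often legitimate, while "the the" rarely is.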
Queries
Hi, thanks for developing the tool. I have just tried it out and noticed that all the errors seemed to be in the range a to e. Does that mean that the next batch will start where this one finished? I dismissed lots of foreign words as well as a few fictional, archaic or dialect words. Let me know if you notice anything I should have done differently. Some oddities were:
- amed part of F amed (part of an acrostic)
- amee acronym for Avoiding Mass Extinctions Engine
- amba's part of Coop amba
- edid acronym for Extended Display Identification Data
- elis part of Margaret Michaelis-Sachs
- ects acronym for European Credit Transfer and Accumulation System
- alost French for Aalst, Belgium
- continents's (corrected manually) TSventon (talk) 03:34, 13 June 2020 (UTC)
- TSventon, thank you very much for your work on the project, it is highly appreciated. Indeed I uploaded here only a-e. Can you please elaborate about the oddities? I don't see how they are different from regular dismissed words. If you are not sure about the correction, there is an option to add typo template. If the correction requires another replacement, there is the type button. Thanks again, Uziel302 (talk) 09:19, 13 June 2020 (UTC)
- I wasn't sure how concerned you are about only adding real words to your dictionary.
- I should have used the type button for "continents's" but dismissed it and corrected it manually, so that doesn't need to be added.
- alost, the French name for Aalst, Belgium, appeared uncapitalized in 13 Commandments. I dismissed it and corrected it manually, so that doesn't need to be added.
- alignement appeared in Church of the Gesù along with many other errors. I dismissed it and corrected it manually, so that doesn't need to be added.
- edifices appeared in Ravat in a paragraph of Catalan text. I dismissed it and corrected it manually, so that doesn't need to be added.
- F amed, P eerless (and similar) appear in an acrostic poem in Percy Furnivall so it is probably better to tag the poem as a poem rather than add amed, eerless, etc. to the dictionary.
- amee appears lower-case as an acronym in the Avoiding Mass Extinctions Engine article, so it's unlikely to appear correctly elsewhere, but it's not a likely typo either so it could be added to the dictionary. The other examples can probably be added to the dictionary for the same reason. TSventon (talk) 12:17, 13 June 2020 (UTC)
- Uziel302 I didn't know that dismissed words could be added to the dictionary until I read the talk page. Would it make sense to add "The highlighted word will be added (or considered for addition) to the list of recognised words." to the description of the fourth button?
- I have added poem tags to the acrostic in Percy Furnivall, it might also be worth mentioning the use of poem tags on the project page.
- By the way I have fixed and removed "mothers's" from your sandbox, which actually occurred 11 times rather than 7. TSventon— (talk) 23:19, 14 June 2020 (UTC)
- TSventon, thank you very much, the count there is based on my filtered parts. I tried to filter tags like code and poem, which one I missed? The words removed are being added to the list of recognised words by me based on edit summary and I planned to also use the "in page" there to narrow the addition to the specific page it was dismissed in. Currently we have enough typos without it. I wanted to make the process as fast as possible so I didn't want to make people feel guilty if they dismiss real typos, the focus here is to fix as many typos in the minimum time. Feel free to expand the guidelines as you understand them. You can ping me when you're done to verify. Uziel302 (talk) 02:52, 15 June 2020 (UTC)
- Uziel302 I would recommend explaining the process in more detail once you have decided on it, but "quick and dirty" works too given the number of errors. I don't want to add my imperfect understanding of the process without asking first. If you listed the tags that you have filtered out, then people could add them where appropriate as I did with the acrostic in Percy Furnivall.
- Are you planning to add words from the sandbox list to the dictionary or use "in page" or decide case by case? For example alla is mostly Italian with a bit of Swedish, but I don't want to check 1,000 instances.
- An easy suggestion: could you add lower case Roman numerals 1-2030 (easily generated via Excel) to the dictionary? TSventon (talk) 09:02, 15 June 2020 (UTC)
- TSventon, here is the code. I added something to the remove line. I also added a list of the tags, but I think it should be moved to less prominent location. I didn't get the suggestion about sandbox, I put it there because I don't think this format of replace/remove is good for frequent words. AWB is better for these cases. As of roman numerals, they are already included in the dictionary I downloaded from SCOWL (And Friends), can you give specific example it missed? Uziel302 (talk) 14:46, 18 June 2020 (UTC)
Uziel302 Thanks for the code, but it doesn't mean much to me as I haven't done any coding. Thanks also for the tags: I will try and add some links and I agree about a less prominent location. I was asking about the sandbox because I didn't understand what you wanted to do with it. I have manually fixed a few words that looked like typos, which probably wasn't very efficient. I will try to investigate AWB at some point. It is hard to find the Roman numerals I saw as they may have been fixed by now. If I do see one I will let you know. What I meant was would your script recognise xxxv as a Roman numeral as well as XXXV? TSventon (talk) 23:28, 19 June 2020 (UTC)
- TSventon, I see the recent lists included things like cxvi->xcvi so maybe the aspell's list included only up to 100. I didn't see much of those so I'm not sure how needed it is, but I may add those in the future. The sandbox list was intended for use with AWB (I prefer the web version, WP:JWB, with search: User:Colin M/scripts/JWB annotated.js). Uziel302 (talk) 08:07, 20 June 2020 (UTC)
Uziel302 thanks for explaining cxvi->xcvi, that shouldn't happen too often. I look forward to trying AWB at some point, but it is good that your tool works without much technical knowledge. TSventon (talk) 14:59, 21 June 2020 (UTC)
Uziel302 PS I have just dismissed 33 Roman numerals from page 18, e.g. lviii->lvii from List of Acts of the Parliament of the United Kingdom, 1960–1979 which didn't take too long. TSventon (talk) 00:47, 22 June 2020 (UTC)
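The lower-case Roman numerals 1-2030 suggested above can also be generated with a few lines of script instead of Excel. This is a sketch: toLowerRoman and the numerals set are hypothetical names, and how such a list would be merged into the project's dictionary is not covered here.

```javascript
// Standard subtractive Roman notation, emitted in lower case.
function toLowerRoman(n) {
  const table = [
    [1000, 'm'], [900, 'cm'], [500, 'd'], [400, 'cd'],
    [100, 'c'], [90, 'xc'], [50, 'l'], [40, 'xl'],
    [10, 'x'], [9, 'ix'], [5, 'v'], [4, 'iv'], [1, 'i']
  ];
  let out = '';
  for (const [value, symbol] of table) {
    // Take the largest value that still fits, as many times as it fits.
    while (n >= value) {
      out += symbol;
      n -= value;
    }
  }
  return out;
}

// Candidate whitelist covering the range suggested in the discussion.
const numerals = new Set();
for (let i = 1; i <= 2030; i++) {
  numerals.add(toLowerRoman(i));
}

console.log(numerals.has('cxvi'), numerals.has('lviii'), numerals.size);
```

With such a set, both sides of a suggestion like cxvi->xcvi or lviii->lvii would be recognised as valid numerals and could be skipped.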
Comments
Hi Uziel, I have been using your excellent tool to find a bunch of typos, and I have noticed some you seem to filter out:
- this one perhaps because it was in an unnumbered list "*"
- this was in an infobox
- the caption of a photograph
- not sure why this one was omitted
Hope that's useful. ϢereSpielChequers 15:27, 26 June 2020 (UTC)
- User:WereSpielChequers, indeed the script excludes many areas and cases, in order to get fewer false positives. I can try to remove some exclusions after we finish the current list. Uziel302 (talk) 06:39, 3 July 2020 (UTC)
Reporting false positives
Is this where we report false positives?
- Fictional world of The Hunger Games and associated articles
- word: "muttation"
- Not a typo because: in-universe term for a particular type of organism. Elizium23 (talk) 01:18, 18 July 2020 (UTC)
- Elizium23, if there were no false positives, I would run a script to fix all issues. The idea is to manually decide for each word: replace/remove. If you don't see those buttons, you haven't properly installed the script. By clicking remove button the passage is removed from the page with relevant edit summary, which I later collect to make a list of dismissed words and not to show them again. Uziel302 (talk) 12:59, 18 July 2020 (UTC)
- Uziel302, I don't have the script, I came here because someone "fixed" it on another page. So please remove it manually. Thanks. Elizium23 (talk) 01:33, 19 July 2020 (UTC)
- Elizium23, Uziel302, I have added that typo back onto the list and dismissed it to indicate it was not an error. TSventon (talk) 02:17, 19 July 2020 (UTC)
- Ira Leviton for information it appears that muttations, which you corrected on 17 July, is a word. TSventon (talk) 23:39, 19 July 2020 (UTC)
- TSventon Thank you - I also received a message from Elizium23 about it. (However, I have to say that for a word like this, editors should insert something, such as a hidden sic template, to indicate that it isn't a typo, so that people who are unfamiliar with the subject don't think it is...) Ira Leviton (talk) 00:01, 20 July 2020 (UTC)
- Ira Leviton I suppose better two messages than none. Adding a sic template would probably be easier for typo-checkers to do than everyday editors who don't need a lot of technical knowledge. Hopefully muttation is unusual in being a fictional word that looks just like a typo. This project seems to be focused on fixing "low hanging fruit" quickly rather than adding tags to false positives. TSventon (talk) 00:24, 20 July 2020 (UTC)
False positives
Hi, I've noticed that most of the false positives are from sentences that are in foreign languages - is there any way to filter out these sentences? Also, causing less of an issue but possibly easier to filter out: often the tool is picking out words as typos even when they are also the article title, e.g. "Floxing". Bellowhead678 (talk) 20:57, 13 July 2020 (UTC)
- Bellowhead678, if you can tell me of a template or a tag that marks the sentence as foreign it would be great. As of excluding titles, I did it in the past but since titles list include redirections from misspellings I thought to give it a try without titles. Maybe I could get the list of titles without redirects using Quarry, but it will still have some typos in it since there are names of things with intended typo. I can filter those who include the titles of the article as suspect word, but only where the words spelled the same in title and suspect word, are those frequent enough?
- Refreshed the lists and we're getting close to finish first cycle with some extended list of suspect typos, with all the words dismissed in this cycle we will get a better list after that. Uziel302 (talk) 05:13, 17 July 2020 (UTC)
- @Uziel302: Template:Lang marks text as foreign, but I haven't seen any selected typos where Template:Lang was used. Filtering typos where the suspect word is spelled the same as the title would be worthwhile if possible. Please could you publish how many suspect typos were identified in the cycle and how many corrections were made at the end of the cycle? TSventon (talk) 13:37, 18 July 2020 (UTC)
- TSventon, in addition to the current lists, we have about 11 more lists, each with 250 suspect words to check. When done, I am running a script that collects all edit summaries so I can tell how many typos were fixed, dismissed, not found etc. Then I will exclude the dismissed words and see what is found on new dumps. I guess removing words in titles won't be that much of an issue since all were removed in current cycle. The technique of excluding words appearing in title is REGEX, so you actually can run it yourself on the lists here, something like \[\[(.*)\]\].*\n\1. Uziel302 (talk) 17:09, 18 July 2020 (UTC)
- @Uziel302: I look forward to some statistics at the end of the cycle. Is it possible to exclude words in titles which are lower case, e.g. shme in SHME or which are part of the title, like skar in Tibetan skar? I am happy to help with the project, but don't want to learn REGEX at present. TSventon (talk) 23:35, 19 July 2020 (UTC)
- Tried to do some regex but it wasn't that easy, if I did find, I would have to locate those and dismiss, not that much different than we currently do. Uziel302 (talk) 18:59, 20 July 2020 (UTC)
- @Uziel302:, thanks for checking. I have looked at my last 1000 edits and the success rate (number of errors replaced or typed divided by number of potential errors) seems to be around 25%, it will be interesting to see if that is typical. TSventon (talk) 19:32, 20 July 2020 (UTC)
- TSventon, I copied the edit histories since May 2020 to one file, got the following: 11.8K dismissed, 6.7K fixed, 958 not found, 307 not fixed, 120 added template. Uziel302 (talk) 18:05, 6 September 2020 (UTC)
- @Uziel302:, thanks, that is almost 34% fixed so I must have been unlucky with my sample. TSventon (talk) 22:07, 6 September 2020 (UTC)
- TSventon, I think some here focus on typos and skip dismissing words. Many times I get to a half list with very few corrections to do. Uziel302 (talk) 03:35, 7 September 2020 (UTC)
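The title-matching regex quoted earlier in this thread can be tried out as follows. This is a sketch under assumptions: the sample entries are invented for illustration and the real list format may differ. The second pattern adds the i flag as one possible answer to the lower-case question (shme in SHME). Neither pattern handles a word that is only part of the title (skar in Tibetan skar).

```javascript
// The regex from the discussion: a [[Title]] on one line, and the next
// line starting with the same text.
const titleWord = /\[\[(.*)\]\].*\n\1/;

// Same pattern with the i flag, so the backreference ignores case.
const titleWordAnyCase = /\[\[(.*)\]\].*\n\1/i;

// Hypothetical list entries; the suggested corrections are made up.
const sameCase = '== [[Floxing]] ==\nFloxing->Flexing? context:';
const lowerCase = '== [[SHME]] ==\nshme->shame? context:';
const partOfTitle = '== [[Tibetan skar]] ==\nskar->scar? context:';

console.log(titleWord.test(sameCase));          // word equals title
console.log(titleWord.test(lowerCase));         // case differs
console.log(titleWordAnyCase.test(lowerCase));  // case ignored
console.log(titleWordAnyCase.test(partOfTitle)); // word is only part of the title
```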
Expanding to other languages
The project is currently running on Hebrew Wikipedia, English Wikipedia, Wikibooks and Wikivoyage. Here we are close to finish checking all the words I got. I am thinking about trying other languages, I don't need to know the language in order to create the lists but I do need someone to start running the project and making native contributors join. For me the interesting languages are Arabic, French and Spanish, but every language with enough text online can benefit from the project. Thanks in advance, Uziel302 (talk) 08:39, 20 September 2020 (UTC)
Cleared a page
I managed to clear an entire page last night. Some of the typos might not be fixed accurately but I did my best. — Preceding unsigned comment added by Blaze The Wolf (talk • contribs) 13:14, 31 October 2020 (UTC)
- @Blaze The Wolf, thanks for your contribution, but if you are unsure whether your correction improves the article, it is fine to leave that line for somebody else to deal with. Having said that, I looked at your contributions list for 30 October and only disagreed with one (berrios->berries in List of common Spanish surnames) and that has been deleted by a later edit. TSventon (talk) 11:09, 16 November 2020 (UTC)
How to handle "correct in context, but probably a typo in other articles" situations?
As an example, the page suggests replacing "arrivall" with "arrival" in Historie_of_the_arrivall_of_Edward_IV. In the context of that page, "arrivall" is correct (title of a book from the 15th century), but, in general, if a page includes "arrivall," it's probably a typo. Is there a way to dismiss this typo, but not add the word to the dictionary? BubbaJoe123456 (talk) 16:45, 24 March 2021 (UTC)
- I think just clicking the remove button does that. But I'm not sure. Blaze The Wolf | Proud Furry and Wikipedia Editor (talk) 19:50, 29 March 2021 (UTC)
- BubbaJoe123456, every dismissed word is added to the edit history with the link to the specific article, the idea was that in the future I will be able to search for the dismissed words in other articles. For now, there are many other typos around and I haven't invested time to implement this process, but the data is there. Uziel302 (talk) 04:39, 22 April 2021 (UTC)
Documentation expansion
Hello Uziel302 and all others. I have taken a stab at fleshing out this page in the hopes of improving the documentation for new users. (Actually, I came here to remind myself what "check template" was supposed to do. Then I got sidetracked with little "fixes"...) I hope the changes are more useful than disruptive. All feedback (or more fixes to my "fixes") welcome. Here are a couple of items I'd like to mention:
- I added a few headings to try to give an overview, but the highest level I can use is 2 (==Head==), and the list already uses these for each entry/page heading. So it's not exactly logical (after ==List entries== comes ==Target page 1== at the same heading level), but I don't know what to do about it. It's (IMO) too much text not to have some structure.
- Adding headings means a Table of Contents gets put before the first one. I don't mind too much but it moves most of the explanation until after the (usually quite long) TOC. That includes the "scroll to bottom" links. How much does that bother you? Should we move the TOC down, to, say, just before the List entries?
Looking forward to any comments or suggestions for improvement you may have. — JohnFromPinckney (talk) 02:11, 29 April 2021 (UTC)
- JohnFromPinckney, I changed the headings to simply bold lines using ; in order to prevent the addition of buttons and the location of TOC before them. People don't edit this text frequently so they don't need it in actual headings, just visual. Uziel302 (talk) 06:16, 29 April 2021 (UTC)
Redirecting subpages
Should I redirect all the subpages' talk pages to this one? ― Qwerfjkl|✉ 20:47, 29 April 2021 (UTC)
- That would actually make sense to me. There's nothing permanent on those pages so any discussion on subpage Talk pages would be a bit silly. It might be tricky if somebody writes here about something they saw "on the page", when they mean subpage 14, and nobody would know. I think that problem is soluble, though, if we just ask, "What? Where?". — JohnFromPinckney (talk) 01:48, 30 April 2021 (UTC)
- Okay, I'll begin redirecting them. ― Qwerfjkl|✉ 15:16, 30 April 2021 (UTC)
- Finished redirecting 1 to 20. ― Qwerfjkl|✉ 15:35, 30 April 2021 (UTC)
Erroneous paragraph
I saw this as one of the paragraphs: == Cnapan ==
antiquitie->antiquities? (omit) context:
~~~ "extremely popular in Pembrokeshire since greate antiquitie
antiquitie <nowiki>[sic]".[1] Cnapan was Uziel302 (talk)</nowiki> and it looks like an error, especially considering the end. It's located on the third subpage. This also led to a cite error. — Preceding unsigned comment added by Qwerfjkl (talk • contribs) 20:37, 29 April 2021 (UTC)
- It seems that [sic] breaks the code, presumably it happens relatively rarely. TSventon (talk) 21:03, 29 April 2021 (UTC)
- I presume it was the <nowiki> tag surrounding the [sic], given that [sic] on its own does nothing. ― Qwerfjkl|✉ 10:09, 1 May 2021 (UTC)
- I agree the nowiki tag is breaking the code, not sure why it is needed there. Uziel302 (talk) 15:10, 1 May 2021 (UTC)
' as correction
Sometimes when it proposes a correction such as everyones -> everyone's, it shows up as: everyone''s (omit), leading to a confusing text. — Preceding unsigned comment added by Qwerfjkl (talk • contribs) 20:57, 29 April 2021 (UTC)
- An example: == At attention ==
- armys->army''s? (omit) context:
- ~~~ In the three armys
- armys of [[Spain]] this order must ~~~ ― Qwerfjkl|✉ 14:12, 1 May 2021 (UTC)
- Indeed, and it is a consequence of my use of bold to mark the changed char in the word. You can go over the lists and replace every 7 ' with single ', it doesn't affect the correction process itself and it has only few occurrences so I won't change the code for it now. Maybe in the future. Uziel302 (talk) 15:18, 1 May 2021 (UTC)
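The manual cleanup suggested above (replace every run of seven apostrophes with one) could be scripted along these lines. This is a sketch: collapseBoldApostrophe is a hypothetical name. The seven apostrophes are presumably the changed apostrophe wrapped in wiki bold markup, which is three apostrophes on each side.

```javascript
// Collapse exactly seven consecutive apostrophes into one: ''' + ' + '''
// is what a bolded apostrophe looks like in wikitext.
function collapseBoldApostrophe(line) {
  return line.replace(/'{7}/g, "'");
}

const cleaned = collapseBoldApostrophe("armys->army'''''''s? (omit) context:");
console.log(cleaned);
```

Ordinary bold markup such as a word wrapped in three apostrophes on each side is left untouched, because neither run reaches seven.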
template/check template
Why does the 'check template' button appear as 'template' in mobile mode? (Also the lowercase first letter has been annoying me.🙂) ― Qwerfjkl|✉ 18:20, 6 May 2021 (UTC)
- Do you have that backwards? I only ever use desktop, and I have only ever seen "check template". Does mobile maybe abbreviate the label because it's too long? And FWIW, the lowercase "check" has bugged me a lot, too. — JohnFromPinckney (talk) 19:07, 6 May 2021 (UTC)
- @JohnFromPinckney: Sorry for the (very) late reply. I do not think it's because it's too long, which usually just extends the page width (which can get ridiculous for long rcats). ― Qwerfjkl | 𝕋𝔸𝕃𝕂 (please use {{reply to|Qwerfjkl}} on reply) 14:36, 23 May 2021 (UTC)
Error in Type
The type function does not work for me, and clicking on it does nothing. ― Qwerfjkl|✉ 18:39, 13 April 2021 (UTC)
- Qwerfjkl it worked for me just now, using desktop mode and Chrome. TSventon (talk) 18:46, 13 April 2021 (UTC)
- TSventon I'm using a mobile device ― Qwerfjkl|✉ 18:48, 13 April 2021 (UTC)
- Qwerfjkl, I assume that Uziel302 will see this in due course. Obviously you can try restarting your mobile device, but if that doesn't work you will have to leave corrections that need typing or fix them manually. TSventon (talk) 18:57, 13 April 2021 (UTC)
- Recently, the page has stopped working for me. The 4 functions only appear at the bottom of the page, for the final section. ― Qwerfjkl|✉ 16:20, 18 April 2021 (UTC)
- Uziel302, any ideas? Qwerfjkl, Otherwise perhaps try WP:VPT. TSventon (talk) 18:18, 18 April 2021 (UTC)
- In the past, this only happened when I ran the page in the desktop version. ― Qwerfjkl|✉ 19:29, 18 April 2021 (UTC)
- I am fairly sure it is because of one of the scripts I am using at User:Qwerfjkl/common.js, as the page works with no other scripts running. ― Qwerfjkl|✉ 21:04, 19 April 2021 (UTC)
I have restarted my device since I first noticed this error, and it doesn't seem to have had any effect, so I have been manually fixing the corrections. Thanks ― Qwerfjkl|✉ 19:01, 13 April 2021 (UTC)
- Qwerfjkl, it seems like you've managed to overcome the issue by disabling other scripts, am I right? In order to investigate the issue, can you please share which devices+browsers you used to test it? Uziel302 (talk) 04:51, 22 April 2021 (UTC)
I am unsure whether or not I have fixed it, because there are no paragraphs to test it on. ― Qwerfjkl|✉ 06:10, 22 April 2021 (UTC)
- It has been fixed, I am using a tablet and am unsure which browser I am using. ― Qwerfjkl|✉ 08:26, 25 April 2021 (UTC)
- @TSventon and Uziel302: Reopening this problem, I believe it may not work if the type function relies on a dialog box to enter the correction. I have had similar problems with Twinkle. ― Qwerfjkl | 𝕋𝔸𝕃𝕂 (please use {{reply to|Qwerfjkl}} on reply) 21:55, 22 May 2021 (UTC)
- Qwerfjkl, thanks. But since dialog boxes are supported by all browsers, I can't reproduce the issue and find an alternative. Uziel302 (talk) 22:50, 26 May 2021 (UTC)
- @Uziel302: I think it's only alert dialogue boxes that don't work for me. ― Qwerfjkl | 𝕋𝔸𝕃𝕂 (please use {{reply to|Qwerfjkl}} on reply) 06:25, 27 May 2021 (UTC)
Helpful scripts
― Qwerfjkl | 𝕋𝔸𝕃𝕂 (please use {{reply to|Qwerfjkl}} on reply) 21:20, 27 May 2021 (UTC)
Skip to bottom
Should the 'to bottom' links be added to all the subpages? ― Qwerfjkl | 𝕋𝔸𝕃𝕂 (please use {{reply to|Qwerfjkl}} on reply) 14:38, 23 May 2021 (UTC)
- That seems like a good idea. We're supposed to start at the bottom, so why not make it easier? (I always use CTRL-End on my Windows laptop, but if the links are useful, we should add them.) I assume mobile users would be especially happy.
- A small technical problem: I would add these links to all 20 possible sub-pages, but the empty ones aren't linked from the main page's page list (1 to 20), which is a very useful feature. I believe that if I (or somebody) were to add such jump-to-bottom links, then (1) the links on the main page would no longer be inactive/gray, but would point to useless, empty pages, and (2), when Uziel adds new datasets, they'll probably overwrite the links. Maybe @Uziel302 and Tom.Reding have some ideas about how to implement this (if we find consensus to actually do so). — JohnFromPinckney (talk) 13:13, 24 May 2021 (UTC)
- Thanks for the ping. This is a good idea. I can alter the navigation code to continue to work no matter the implementation.
- May I suggest the following:
{{Right|[[#Bottom|to bottom]] [[File:WWC menu dn.png|link=#Bottom]]}}
...
...
...
{{Right|[[#Top|back to top]] [[File:WWC arrow up.png|link=#Top]]}} {{Anchor|Bottom}}
- Example @ Wikipedia:Correct typos in one click/20 @ Special:Permalink/1024890219. ~ Tom.Reding (talk ⋅dgaf) 16:16, 24 May 2021 (UTC)
- ✓ Works for me ― Qwerfjkl | 𝕋𝔸𝕃𝕂 (please use {{reply to|Qwerfjkl}} on reply) 19:17, 24 May 2021 (UTC)
- Thanks, Tom. That's real purty. Will that work for both mobile and desktop users? On the "main" page, we have two separate textual links, presumably because mobile and desktop have different mechanisms/targets (about which I know nothing). Have you tried / can you try in mobile? — JohnFromPinckney (talk) 21:08, 24 May 2021 (UTC)
- The links work for me, a mobile user, in both desktop and mobile mode. ― Qwerfjkl | 𝕋𝔸𝕃𝕂 (please use {{reply to|Qwerfjkl}} on reply) 06:08, 25 May 2021 (UTC)
- Good to know — JohnFromPinckney (talk / edits) 22:27, 25 May 2021 (UTC)
@Uziel302: I can add this to all pages, if you like, but I just need the ok from you that it won't cause any problems/issues on your end. ~ Tom.Reding (talk ⋅dgaf) 23:26, 26 May 2021 (UTC)
- Tom.Reding, no problem at all. Thanks for maintaining the template, I have no experience with lua. Uziel302 (talk) 01:39, 29 May 2021 (UTC)
- @Qwerfjkl, JohnFromPinckney, and Uziel302: done. ~ Tom.Reding (talk ⋅dgaf) 16:39, 1 June 2021 (UTC)
- @Uziel302: this is what I was worried about - are you able to make the js ignore {{Right|...}} and following wikitext, or perhaps stopping at ~~~? ~ Tom.Reding (talk ⋅dgaf) 19:12, 1 June 2021 (UTC)
- Tom.Reding, technically I can but I think a simpler solution is to put this footer under a separate paragraph. Uziel302 (talk) 16:53, 2 June 2021 (UTC)
- There doesn't have to be anything at the bottom. I don't see any need for a link to the top, and you definitely don't need a #bottom anchor – you can use the default #footer. MANdARAX • XAЯAbИAM 17:55, 2 June 2021 (UTC)
- @Mandarax: thank you! ~ Tom.Reding (talk ⋅dgaf) 20:52, 2 June 2021 (UTC)
- Done ~ Tom.Reding (talk ⋅dgaf) 21:02, 2 June 2021 (UTC)
- That's workin' reeal good. Thanks, Tom. — JohnFromPinckney (talk / edits) 22:44, 9 June 2021 (UTC)
Refining typo selection
Would it be possible to remove:
- Redirects
- 'Typos' that are in the title
- Articles that have a title of 'X Language' or 'Bible translations into X'
- 'Typos' in a link
These are some of the most common errors I have noticed. — Preceding unsigned comment added by Qwerfjkl (talk • contribs) 11:39, 1 May 2021 (UTC)
- @Qwerfjkl: please remember to sign your posts. Thanks— JohnFromPinckney (talk) 13:41, 1 May 2021 (UTC)
- It's because, being on mobile, I constantly switch between desktop and mobile mode. My posts are signed on mobile mode automatically, but not on desktop mode. I will try to ensure they are always signed. ― Qwerfjkl|✉ 13:55, 1 May 2021 (UTC)
- Qwerfjkl, I saw many redirects that required treatment (usually removing some spam with typos). If the typo is in the title in capital letters and the word appears lowercase in the article, sometimes it requires capitalization. Articles with specific terms in the title require a good list of the terms, and it also means we will miss all the typos on articles with those terms. Since the removal of each word is done once, I prefer to include all the articles. What do you mean by typos in a link? Inner links shouldn't have typos. External links are excluded already. Thanks for all the feedback. Uziel302 (talk) 15:27, 1 May 2021 (UTC)
- Here's an example of what I meant by a 'typo' in a link:
- == Danmarks gamle Folkeviser ==
- beder->bender? (omit) context:
- ~~~ ||437 ||[[Søster beder Broder]] (A-B) ||D 91 || ~~~ ― Qwerfjkl|✉ 18:58, 1 May 2021 (UTC)
- Qwerfjkl, this is an inner link, some inner links still have typos that need correction, for example capitalization. Uziel302 (talk) 20:16, 2 May 2021 (UTC)
- Could it be cross-referenced with a list of red links? ― Qwerfjkl|✉ 20:22, 2 May 2021 (UTC)
- Qwerfjkl, Redlinks can be typos. E.g. [[ethnogenisis|emerged as a distinct cultural and kinship group]], which should be ethnogenesis, in Michel Band. TSventon (talk) 20:57, 2 May 2021 (UTC)
- I have now fixed ethnogenisis. TSventon (talk) 21:06, 2 May 2021 (UTC)
- To clarify, I meant that it shouldn't be recognized as a typo unless it is in the above list. ― Qwerfjkl|✉ 21:00, 2 May 2021 (UTC)
- Qwerfjkl, capitalization issue doesn't make the link red. Uziel302 (talk) 18:33, 3 May 2021 (UTC)
- My apologies then for wasting your time. Sorry! ― Qwerfjkl|✉ 18:17, 6 May 2021 (UTC)
- Typos that are in the title are worth suppressing. Hjärtats trakt – en samling contains "trakt", and it's clearly not a typo. My other rather nebulous suggestion is to suppress typos within non-English text. If it's feasible, limiting the listings to typos where most of the surrounding text (ignoring proper nouns) is English words would cut out a lot of false positives and probably not miss many real typos. (Back-of-envelope algorithm: half or more of the nearest matches for $1 in /(?<!\S)["'(]{0,4}(\p{Ll}+)[,;:.?!"')]{0,4}(?!\S)/, up to five ahead and five behind, are in the dictionary.) Certes (talk) 01:00, 22 July 2021 (UTC)
- I'd support suppressing typos in the title - you mentioned that the case might have to be changed, but if it's lower case in both the title and the article, surely it can be removed? I'd also strongly support removing words where the surrounding text isn't English. Bellowhead678 (talk) 07:58, 22 July 2021 (UTC)
- Since this thread is still kind of dribbling along: As expansion of Qwerfjkl's point 3, anything with "language" or "dialect" or "declension" in the title could be easily ignored. When I get to some supposed typo on that page it's invariably something like thau in the table at Dhimalish languages or sih in the table at Old High German declension, and I rather quickly, without much analysis, click on "Remove" (trying not to accidentally hit "Replace"). Since these make up, as a group, a fairly large percentage of bogeys, it'd save some time to not list them at all. Now if we could just figure out how to exclude hits in pages about pre-1800 subjects, thett ye woulde know theye wer not wronge, but spelt as theye wer at the time of writinge. — JohnFromPinckney (talk / edits) 23:31, 22 July 2021 (UTC)
- Suppressing typos surrounded by non-English words as above should deal with Old English, French, Klingon or anything else unsuited to correction using a modern English dictionary. Certes (talk) 00:39, 23 July 2021 (UTC)
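For readers who want to try the back-of-envelope heuristic proposed in this thread, the following is a rough sketch only, not code from the list generator. The function name, its parameters, and the tiny word list are all hypothetical; the token pattern is the one quoted above, with the `g` and `u` flags added so that `matchAll()` and `\p{Ll}` work in JavaScript.

```javascript
// Hypothetical sketch: none of these names come from typo.js or the list generator.
// Token pattern quoted in the discussion, plus the g and u flags.
const TOKEN_RE = /(?<!\S)["'(]{0,4}(\p{Ll}+)[,;:.?!"')]{0,4}(?!\S)/gu;

// Returns true when at least half of the nearest lowercase tokens
// (up to `span` before and `span` after the suspect word) are in `dictionary`.
function contextLooksEnglish(text, typoStart, typoEnd, dictionary, span = 5) {
  const before = [];
  const after = [];
  for (const m of text.matchAll(TOKEN_RE)) {
    const tokenEnd = m.index + m[0].length;
    if (tokenEnd <= typoStart) before.push(m[1]);
    else if (m.index >= typoEnd) after.push(m[1]);
    // Tokens overlapping the suspect word itself are ignored.
  }
  const nearby = before.slice(-span).concat(after.slice(0, span));
  // Assumption: with no usable context at all, treat the text as not English.
  if (nearby.length === 0) return false;
  const known = nearby.filter((w) => dictionary.has(w)).length;
  return known * 2 >= nearby.length;
}

// Toy dictionary for illustration; a real run would need a full word list.
const words = new Set(['the', 'quick', 'fox', 'jumps', 'over', 'lazy', 'dog']);

const english = 'the quick brwn fox jumps over the lazy dog';
const eStart = english.indexOf('brwn');
const englishResult = contextLooksEnglish(english, eStart, eStart + 4, words);

const swedish = 'Hjärtats trakt – en samling';
const sStart = swedish.indexOf('trakt');
const swedishResult = contextLooksEnglish(swedish, sStart, sStart + 5, words);

console.log(englishResult, swedishResult); // true false
```

Capitalised tokens such as "Hjärtats" never match `\p{Ll}+`, which is how the pattern skips most proper nouns.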
Can't make script work
I'm sure this is a local problem as it clearly works well for other editors. I've installed the script, hard-refreshed my .js and opened a typo page that I've never visited before (hard-refreshing it for luck). All of the "Replace, Remove, Type, check template" links take me to the top of the page (Wikipedia:Correct_typos_in_one_click/11#) and do nothing else. I've turned off Ublock Origin and Umatrix in case they are blocking anything, though other user scripts work. I'm using Firefox 90.0 and Ubuntu 16.04. Please can anyone suggest what I'm doing wrong? Certes (talk) 17:13, 17 July 2021 (UTC)
- Certes, it seems like installation worked since you see the additional buttons. The question is why the JS code doesn't run like it is supposed to. Please check if there are any console errors, and if you can try to use other browsers/devices it might help pinpoint the issue. I see that you are a programmer, so maybe you can debug by putting a breakpoint on the click event and then see what happens when you click. Uziel302 (talk) 19:19, 17 July 2021 (UTC)
- Thanks for the tip. When I load the typo list with the debugger on, it all works, even if I turn the debugger off, so that's a good enough workaround. In case it's of use, loading the list with debugging off raises an exception with .attr() undefined at typo.js line 63, 208, etc.: sectionNum = $(this).parent().find('a').attr('href').match(/section=([0-9]+)/)[1],. The debugger claims that $(this).parent().find('a').attr('href') is indeed undefined, though $(this).parent().find('a') has the expected array of hyperlinks, each with an href attribute. I am (or at least was) a programmer, but I haven't used JQuery seriously enough to work out what is going wrong here. If it's not affecting anyone else then I must have something configured oddly and can just use the workaround. Certes (talk) 20:55, 17 July 2021 (UTC)
- It just occurred to me that past problems with other JavaScript which worked only with the debugger on were timing issues, such as attempting to call a function which hasn't yet loaded (but will have done by the time the debugger warms up). Is that a possibility here? Certes (talk) 00:24, 18 July 2021 (UTC)
- Solved! Another script called hideSectionDesktop.js does not play nicely with typo.js, as both do incompatible things with section numbers. Removing the other script restores normal service. Certes (talk) 01:52, 20 July 2021 (UTC)
- For the benefit of anyone else with this problem, skipping hideSectionDesktop on typo pages fixes it. Certes (talk) 15:49, 20 July 2021 (UTC)
- Certes, thank you very much for finding out the source of the problem and a fix for it! Uziel302 (talk) 06:17, 24 July 2021 (UTC)
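The exception reported in this thread arises because `.attr('href')` can return `undefined`, after which `.match()` throws. As a minimal sketch only (the helper name `extractSectionNum` is hypothetical and is not part of typo.js), a guard along these lines would turn the crash into a value the caller can check:

```javascript
// Hypothetical helper, not taken from typo.js.
// Returns the section number as a string, or null when the href is
// missing or carries no "section=" parameter.
function extractSectionNum(href) {
  if (typeof href !== 'string') return null;
  var m = href.match(/section=([0-9]+)/);
  return m ? m[1] : null;
}

// In the script it might be called roughly like this (assumed usage):
//   var sectionNum = extractSectionNum($(this).parent().find('a').attr('href'));
//   if (sectionNum === null) { /* skip this entry or report the problem */ }

console.log(extractSectionNum('/w/index.php?title=X&action=edit&section=12')); // "12"
console.log(extractSectionNum(undefined)); // null
```

This does not address the underlying conflict with hideSectionDesktop.js; it only makes the failure visible instead of throwing.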
Watchlist
Thanks for this very useful script. One suggestion: I prefer not to have pages where I make minor changes watchlisted indefinitely, and have made this change to a local copy. (I'm aware that I'll have to keep that in sync.) Do other editors feel similarly? An alternative would be something like watchlistexpiry: '1 week', with the disadvantage that pages already permanently watchlisted for other reasons would become unlisted after a week. Certes (talk) 16:12, 20 July 2021 (UTC)
- I agree, I didn't realise that editing the pages would add them to my watchlist. Bellowhead678 (talk) 16:28, 20 July 2021 (UTC)
- Certes I don't automatically watchlist pages when I make changes. Is your watchlist customised? TSventon (talk) 16:44, 20 July 2021 (UTC)
- It uses the default setting. Under Special:Preferences#mw-prefsection-watchlist, I have checked "Add pages and files I edit to my watchlist", but I usually watch pages to which I've made minor edits for only a week. (Sadly, that can't be set as a default.) I may not be a typical case. Certes (talk) 18:22, 20 July 2021 (UTC)
- @TSventon and Certes: I don't think that checkbox is a customization; I believe (although can no longer be certain, after all these years) that "checked" is the default. I am clueless as to the value/applicability/safety of the typo.js change above, but I do know the limitation of "one-click" edited pages to one week (or even one month) would be a very welcome improvement. When I'm feeling in a reduce-my-WP-time mood, especially, I hesitate to work through the typo lists in large part because of the huge addition of pages to my watchlist that would bring. Such a change would help a lot. — JohnFromPinckney (talk / edits) 20:06, 20 July 2021 (UTC)
- I am happy not to watchlist typo fixes, but I presume the change being discussed wouldn't alter that. TSventon (talk) 20:14, 20 July 2021 (UTC)
- Certes, I copied your changes to the main script. JohnFromPinckney, TSventon, Bellowhead678, after you hard refresh and get the new version, edits through the script won't add pages to your watchlist. Uziel302 (talk) 06:13, 24 July 2021 (UTC)
- Cool, it works! Thanks, Uziel302 and Certes, — JohnFromPinckney (talk / edits) 06:24, 24 July 2021 (UTC)
- Affects not only the target page but also the CTIOC page (e.g., page 4). Which is fine, of course. — JohnFromPinckney (talk / edits) 06:31, 24 July 2021 (UTC)
- That's a useful side-effect. If you fix "chakl", then I fix "cheees" in a different article on the same list, you probably don't want me cluttering your watchlist. Certes (talk) 11:48, 24 July 2021 (UTC)
- Yes! My watchlist gets well-flooded when somebody works through a list page that I have previously edited. I assume it's the same for others when I go on a spree through some list page. I assume direct edits of a list page (like just deleting an entry, without using the magical buttons) will still add the list to my watchlist (±my preference settings). — JohnFromPinckney (talk / edits) 23:11, 24 July 2021 (UTC)
- Yes, direct edits will affect your watchlist in the usual way (possibly no effect, depending on your preferences). As a one-off fix, I've also unwatched the subpages which were already on my watchlist. Certes (talk) 00:20, 25 July 2021 (UTC)
- Thanks; I've moved back to the standard version. If anyone does want to list the page for a week or month (at the risk of affecting pages already on the watchlist) then that could be done with a configuration variable, but it's perfect for me just as it is. Certes (talk) 11:40, 24 July 2021 (UTC)
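The diff itself is not reproduced in this archive, so the following is only a guess at its general shape. It assumes the script saves through the MediaWiki Action API's `action=edit` module, whose `watchlist` parameter accepts `nochange`, `preferences`, `unwatch` and `watch`, and whose `watchlistexpiry` parameter sets a temporary watch. The function name and the optional `watchExpiry` argument are hypothetical.

```javascript
// Hypothetical sketch of building edit parameters; not copied from typo.js.
// The CSRF token and the actual request are deliberately left out.
function buildEditParams(title, text, summary, watchExpiry) {
  var params = {
    action: 'edit',
    format: 'json',
    title: title,
    text: text,
    summary: summary
  };
  if (watchExpiry) {
    // Temporary watching, e.g. '1 week'. As noted above, this may also
    // shorten the watch period of pages that were watched permanently.
    params.watchlist = 'watch';
    params.watchlistexpiry = watchExpiry;
  } else {
    // Leave the user's watchlist exactly as it was.
    params.watchlist = 'nochange';
  }
  return params;
}

var quiet = buildEditParams('Example', 'new text', 'typo fix');
var temporary = buildEditParams('Example', 'new text', 'typo fix', '1 week');
console.log(quiet.watchlist, temporary.watchlistexpiry); // nochange 1 week
```

A configuration variable, as suggested in the thread, could simply supply the `watchExpiry` value.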
Paragraph is missing
What's the correct procedure when the "Paragraph is missing" and no change is required to the article? (Often, the paragraph really is missing. Other cases may be false positives, typos already corrected, articles moved or deleted, etc., but I think we can treat all similarly.) Should we edit the subpage to remove the item manually? Would it be sensible to have another link alongside Remove to actually remove the entry, even if the paragraph appears to be missing? Certes (talk) 13:56, 23 July 2021 (UTC)
- Certes, when I get "Paragraph is missing" I try again, possibly twice, and then get either processed or removed. TSventon (talk) 14:05, 23 July 2021 (UTC)
- Thanks for helping a newbie out again! I only tried a couple of times but eventually I'm managing to Remove these entries now. Certes (talk) 14:08, 23 July 2021 (UTC)
- (edit conflict) To be honest, I find the "in one click" to be quite a misnomer. Even excluding entries where I go and do one or more manual changes, I think the average number of clicks per entry I process is about 3.4 or so. I refresh the page occasionally, I press "Remove" or "Replace" multiple times, I go to the page linked to get a better (and more current) view of the context, etc. And when I've verified that the typo shown is actually still on the target page, and the CTIOC page is the current one, and I've clicked on Remove/Replace a few times, I then—Warning: utter voodoo!— press my mouse button longer on the respective button, and it seems to work. — JohnFromPinckney (talk / edits) 15:23, 23 July 2021 (UTC)
- It's not perfect but it's a lot easier on the mouse than the alternatives. I hope I'm helping it to get even better, if I'm not making too much of a nuisance of myself. Certes (talk) 15:32, 23 July 2021 (UTC)
- JohnFromPinckney, I hear a hint of frustration in your words. I corrected thousands of typos with single clicks. I know sometimes things require more research, I would count google search as the main reason to have more clicks because usually I can't tell for sure the word doesn't exist. But for cases that the correction is obvious, I simply click "Replace" once and hope the line of context still stayed in place and the typo gets corrected. No real need to visit each article. If there is something that you think can be treated better, please let me know. Uziel302 (talk) 05:52, 24 July 2021 (UTC)
- In that case, Uziel302, would it be useful to you if we had two links, so we can explicitly signal one of:
- Remove: false positive: reported text is in the article but is not a typo here; if frequent, please consider hiding future appearances of this word
- Remove: already fixed: the text no longer appears in the article but would have been a typo; please continue to show similar cases elsewhere
- e.g. replacing Remove by False and Fixed which do the same thing but can be recorded separately? This suggestion aims to help your work rather than us directly, so please feel free to discard it if it wouldn't help. Certes (talk) 14:34, 23 July 2021 (UTC)
- Certes, the Replace button removes the paragraph without marking the word as a false positive in case the word doesn't exist. The easier way to remove a non-existing word is by editing the paragraph and emptying it; this way you can rest assured that the Replace button won't make any change. As for adding new buttons, I am afraid it's already too complicated for newcomers. I prefer to have more words simply removed than people getting confused. If we get to a point that we don't get enough suspect words and want to try dismissed words again, which might not be false positives in all contexts, I can extract from the edit history in which article each dismissed word was found and search for the words in the rest of Wikipedia. Uziel302 (talk) 05:52, 24 July 2021 (UTC)
- Ah, thanks. So it's safe to hit Replace for all real typos, even if the text is no longer in the article. That's exactly what my proposed Fixed button would have done; I just misunderstood Replace. I agree that no change is needed. Certes (talk) 11:33, 24 July 2021 (UTC)
- @Uziel302: This is rather big and I want to make sure I understand. Are you saying that when (clearly) misspelled words on a list have already been corrected on the target page, that we should press "Replace" (or delete manually)? I have been pressing "Remove" this whole time because it seemed the only option. (I'm embarrassed to say it never occurred to me to directly delete the entry.)
- If the answer is "yes" to the above (we should use "Replace" for all real typos we don't use "Type" for), then I'll add it to the documentation. — JohnFromPinckney (talk / edits) 23:19, 24 July 2021 (UTC)
- JohnFromPinckney, Replace tries to edit the article, if the line of context doesn't exist the same way, it just removes the paragraph without the edit summary that marks dismissing of a suspect word. So, yeah, you can use it or manually empty the paragraph if you don't want to dismiss a word. I didn't want people to think a lot before they hit remove, if I ever want to search for dismissed words in other articles, I can. Uziel302 (talk) 05:19, 27 July 2021 (UTC)
Repeated correction
@Uziel302: WP:CTIOC/3 suggested that I might like to change "scou" to "scout" in Justin Bruening. I agreed that this sounded like a jolly good idea, and hit Replace. Unfortunately, someone had beaten me to it, and I added a second "t". At the risk of increasing the dreaded "paragraph is missing" tally, could the check include a trailing \b or similar to prevent recurrence? Thanks, Certes (talk) 23:21, 27 July 2021 (UTC)
- Oops, I did it again. I expect old hands are more careful but I wonder how many more have slipped through. Certes (talk) 00:11, 28 July 2021 (UTC)
- Certes, this is due to my manual edit. I wrote above: "This issue started since I switched from C to Python by January, fixed the code now. I will remove the additional space from the lists now." I manually removed all the double spaces, seems like some of them were needed to prevent such cases. The fix in the code won't cause this issue. I hope that there weren't many of these edits. I guess we will catch the doubled letters at the end of words in the next run. Uziel302 (talk) 09:05, 30 July 2021 (UTC)
- Certes, the aim of "Correct typos in one click", is to correct typos in one click, without having to double check everything. In this case it is fortunate that you did check. Uziel302's explanation suggests that this is a one off problem which should fix itself in due course. TSventon (talk) 13:08, 30 July 2021 (UTC)
- I'm still getting these duplicates, though I now usually remember to check first if the correction consists of appending a letter. A similar point applies to looking for check templates, as I forgot to do here. I'm still grateful for this very useful tool, but we do need to keep an eye open for such pitfalls when using it. Certes (talk) 21:31, 2 August 2021 (UTC)
- @Uziel302: I think I've fixed this issue here. Rather than search for a string "scou", this will search for a regex /scou\b/, which doesn't match "scout". Whilst looking for a solution, I stumbled across another potential problem: the string may occur a few words earlier, in text like "discount a scou". To ensure that gets changed to "discount a scout" rather than "discoutnt a scou", I also changed its search from string "scou" to regex /scou$/. (\b wouldn't always work here). My changes only cover the Replace button; if you want to include them then they should probably be repeated for the other functions too. What do you think? Certes (talk) 00:59, 3 August 2021 (UTC)
- To clarify, here is an example of the second problem, where the wrong "histor" has "y" appended. (Sometimes, as here, such an edit can accidentally fix a different typo in the same sentence.) Certes (talk) 16:39, 3 August 2021 (UTC)
I also added \b at the start of the string, to prevent rtists → artists → aartists, etc. New diff here. Modern JS would use `backticks`, but they would not work with old browsers. Escaping oldcontext may be unnecessary as it should contain only letters but, as it's read from an editable page, caution doesn't hurt. Certes (talk) 01:15, 4 August 2021 (UTC)
- I've withdrawn the initial \b. It does fix a missing initial in a one-word context like "rtists", but these are rare. However, it prevents fixes from occurring at all when the context starts with punctuation like "* Example", which is more common. The first diff seems better. Certes (talk) 01:18, 5 August 2021 (UTC)
- As my monologue is drifting off-topic, I'll add further comments at User talk:Uziel302/typo.js. Certes (talk) 08:34, 5 August 2021 (UTC)
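The two regex changes described in this thread can be illustrated with a small sketch. This is not the diff that was applied to typo.js: the function names and parameters below are hypothetical, and it assumes the context string ends with the suspect word. It uses string concatenation rather than template literals, in line with the old-browser concern mentioned above.

```javascript
// Hypothetical sketch, not the actual typo.js change.
// Escape characters that are special in a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// oldContext: text from the list entry, assumed to end with the typo.
// Returns the corrected article text, or null if nothing should change.
function applyCorrection(articleText, oldContext, typo, correction) {
  // /typo$/ changes only the final occurrence, so "discount a scou"
  // becomes "discount a scout" and not "discoutnt a scou".
  var newContext = oldContext.replace(
    new RegExp(escapeRegExp(typo) + '$'),
    function () { return correction; }
  );
  if (newContext === oldContext) return null;

  // A trailing \b stops "scou" from matching inside an already fixed "scout".
  var contextRe = new RegExp(escapeRegExp(oldContext) + '\\b');
  if (!contextRe.test(articleText)) return null;
  return articleText.replace(contextRe, function () { return newContext; });
}

var fixedOnce = applyCorrection('He worked as a scou for years.', 'as a scou', 'scou', 'scout');
var alreadyFixed = applyCorrection('He worked as a scout for years.', 'as a scou', 'scou', 'scout');
var lastOnly = applyCorrection('They discount a scou today.', 'discount a scou', 'scou', 'scout');
console.log(fixedOnce);    // He worked as a scout for years.
console.log(alreadyFixed); // null
console.log(lastOnly);     // They discount a scout today.
```

One caveat: without the `u` flag and Unicode-aware classes, `\b` in JavaScript only recognises ASCII letters, digits and underscore as word characters, so a context ending in an accented letter or punctuation would need different handling.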
New typo.js
Selecting an action (Replace, Remove, Type or check template) usually works first time but occasionally just refreshes the typo list, requiring us to click again. Uziel and I have released a new version of typo.js aimed at making these actions work first time more often. Accessing the new version may require a hard refresh. Lists should still be processed from the bottom upwards; this change does not enable top-down working. @JohnFromPinckney, TSventon, and Bellowhead678: please let us know if
- actions work first time more often, less often or similar to before;
- you notice any unexplained changes.
Thanks and hapy fixign, Certes (talk) 14:36, 5 August 2021 (UTC)
ToC
Should the ToC be enabled on all the CTIOC pages? I don't see why navigation would be necessary. Pinging @TSventon, @JohnFromPinckney, @Bellowhead678, @Certes, and @Uziel302. ― Qwerfjkltalk 17:36, 18 August 2021 (UTC)
- I don't find it useful. Do others? Certes (talk) 17:45, 18 August 2021 (UTC)
- Qwerfjkl, you can add __NOTOC__ to all the pages. Uziel302 (talk) 18:48, 18 August 2021 (UTC)
- Done @Uziel302 ― Qwerfjkltalk 19:04, 18 August 2021 (UTC)
- Qwerfjkl I do find the TOC helpful because it has a list of pages and gives a count of the lines. The list is useful because, if I think I have corrected a line but haven't, I can see it is still in the TOC. The count is useful if I want to clear 50 or 100 lines from a page. TSventon (talk) 21:34, 18 August 2021 (UTC)
- We could consider using {{TOC hidden}}, which includes the ToC but collapses it by default until the reader clicks [show]. Certes (talk) 23:16, 18 August 2021 (UTC)
- Certes That would work for me. TSventon (talk) 06:38, 19 August 2021 (UTC)
- I see I am too late, having been away for a couple of days. I miss the TOC for two reasons: (1) it shows me a count of the listings on the page (now available via the main page's subpage list, but for that I have to navigate there and back), and (2) I can see at a glance the items which may particularly interest me (German-related items, for example). Not critical and I'll live without it, but I will miss it. — JohnFromPinckney (talk / edits) 23:02, 20 August 2021 (UTC)
- JohnFromPinckney I don't see why it should be too late. What do you think about using {{TOC hidden}}? If two of four (or five?) responses find the TOC helpful, then it is probably worth putting it back. TSventon (talk) 20:47, 21 August 2021 (UTC)
- If TOC hidden is a good compromise for all, I'd like that. — JohnFromPinckney (talk / edits) 07:38, 22 August 2021 (UTC)
- @Uziel302 would you object to my adding {{TOC hidden}} to the pages as two of the five responses found a TOC useful? TSventon (talk) 07:07, 7 September 2021 (UTC)
- TSventon, go ahead. Uziel302 (talk) 08:09, 7 September 2021 (UTC)
- @Uziel302 and JohnFromPinckney, done. TSventon (talk) 10:20, 7 September 2021 (UTC)
- Thanks, colleagues. — JohnFromPinckney (talk / edits) 11:12, 7 September 2021 (UTC)
Erroneous snippet inclusion
This update of page 13 included the following:
== [[Mezangelle]] == netwurk-><!--network-->netw'''o'''rk? (sound) context: <nowiki>~~~ <ref>http://cramer.plaintext.cc:70/essays/textviren_-_mez/textviren_mez.pdf</ref> and an example of a netwurk </nowiki></br> '''netwurk''' <nowiki> is <nowiki>_The data][h!][bleeding tex][e][ts_</nowiki>.<ref name="netwurkerz">{{cite ~~~</nowiki>
The </nowiki> included in the snippet caused the entire rest of the page to be messed up. One solution would be to replace < in snippets with &lt;. MANdARAX • XAЯAbИAM 19:55, 24 September 2021 (UTC)
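The escaping suggested above could look roughly like this. It is a minimal sketch under the assumption that snippets are plain strings processed before being written to the list page; `escapeSnippet` is a hypothetical name, and the real list generator may need to handle other markup as well.

```javascript
// Replace every "<" with its HTML entity so that a stray closing tag in the
// article text cannot terminate the wrapper around the snippet on the list page.
function escapeSnippet(snippet) {
  return snippet.replace(/</g, "&lt;");
}

console.log(escapeSnippet("is <nowiki>_The data_</nowiki>.<ref name=x>"));
```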
Typo in file name
The same subpage as above (13) had:
== [[Lantern Festival]] == nignt-><!--night-->nig'''h'''t? (shape) context: ~~~ Lantern Festival in Taiwan at nignt </br> '''nignt''' 5.jpg ~~~
In this case, it's easy to see that it's part of an image name which shouldn't be "fixed", but similar instances might not include the ".jpg" part in the snippet, so unsuspecting users may fix typos which break images. The context in the article is an infobox with | image = Lantern Festival in Taiwan at nignt 5.jpg. MANdARAX • XAЯAbИAM 07:54, 25 September 2021 (UTC)
script error
I get the following error when trying to correct typos (probably a script conflict):
VM321:211 Uncaught TypeError: Cannot read properties of undefined (reading 'match')
at HTMLAnchorElement.removeReport (<anonymous>:211:56)
at HTMLAnchorElement.dispatch (load.php?lang=en&modules=jquery%2Coojs-ui-core|jquery.ui&skin=vector&version=1063s:70)
at HTMLAnchorElement.elemData.handle (load.php?lang=en&modules=jquery%2Coojs-ui-core|jquery.ui&skin=vector&version=1063s:66)
― Qwerfjkltalk 17:30, 29 September 2021 (UTC)
- I have no idea, but I know when a trouble report is missing information:
- 1) Which typos were you trying to correct?
- 2) Do you always get this error when you repeat your attempt?
- 3) Since you suspect a script conflict, what happens when you comment out/delete all of your other scripts? Can you tell (and tell us) which script is conflicting with CTIOC?
- — JohnFromPinckney (talk / edits) 00:00, 30 September 2021 (UTC)
- @JohnFromPinckney Just checked, it's a script conflict. The error appears whenever I try to correct any typo. It'll take a while to find out which script is causing it, so I'll see when I can get around to this. ― Qwerfjkltalk 06:22, 30 September 2021 (UTC)
- Oh, good grief. — JohnFromPinckney (talk / edits) 08:26, 30 September 2021 (UTC)
- User:Qwerfjkl, someone above noted a conflict with hideSectionDesktop.js. It seems you don't have that one, but you do have other scripts that manipulate the sections; I would try removing them first. Uziel302 (talk) 06:28, 4 October 2021 (UTC)
- I excluded another script from typo pages. Once you've identified the conflicting script(s), that may work for you too. Certes (talk) 15:29, 4 October 2021 (UTC)
- I do have a similar script, User:BrandonXLF/CollapseSections, but that doesn't seem to be causing the issue. ― Qwerfjkltalk 15:53, 4 October 2021 (UTC)
- @Certes @JohnFromPinckney @Uziel302, I finally got around to checking this out, and it turns out the scripts User:DannyS712/DiscussionCloser and User:Writ Keeper/Scripts/autoCloser.js both cause an error individually. ― Qwerfjkltalk 18:53, 30 October 2021 (UTC)
- Well spotted. You may want to prevent those scripts from running when you are on the typo page, like I did with hideSectionDesktop. Certes (talk) 23:13, 30 October 2021 (UTC)
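One way to keep a conflicting script off the typo pages is to test the page name before loading it. The sketch below is an assumption about how that could be done in a personal common.js, not a description of what anyone in this thread actually used; `isTypoListPage` is a hypothetical helper, and the loading call is shown only as a comment because it needs the MediaWiki environment.

```javascript
// True for the project page and its numbered subpages (/1, /2, ...).
// Accepts titles written with spaces or with underscores.
function isTypoListPage(pageName) {
  const normalized = pageName.replace(/ /g, "_");
  return /^Wikipedia:Correct_typos_in_one_click(\/\d+)?$/.test(normalized);
}

// Assumed usage inside common.js, where the mw object is available:
//
//   if (!isTypoListPage(mw.config.get("wgPageName"))) {
//     importScript("User:Example/conflictingScript.js"); // hypothetical script name
//   }

console.log(isTypoListPage("Wikipedia:Correct typos in one click/13"));
```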
Timing of new typos
The previous section suggests that new lists include some typos which had already been fixed. I've noticed this too. Can we concentrate on unfixed typos by timing the runs better, e.g. just after a monthly dump if that is what is being used? Certes (talk) 16:39, 2 November 2021 (UTC)
- Pinging Uziel302 in case you can optimise the timings. Certes (talk) 13:06, 12 November 2021 (UTC)
- Certes, I removed from the last run everything that was corrected this month, since the dump was done on the 2nd. Uziel302 (talk) 20:02, 12 November 2021 (UTC)
ccording
I found this one a bit funny, but something's definitely not right:
- Special:Diff/1039402992: According → AAccording
- Special:Diff/1042387090: AAccording → AAaccording
- Special:Diff/1053055448: AAaccording → AAaaccording.
~~~~ User:1234qwer1234qwer4 (talk) 17:03, 1 November 2021 (UTC)
- Pinging @Uziel302 ― Qwerfjkltalk 17:20, 1 November 2021 (UTC)
- @1234qwer1234qwer4: This is one of the bugs reported at #Repeated correction above (search for "artists"). The problem is that the script wants to change ccording → according, which is appropriate when the text reads "foo ccording bar" but not when it has already been fixed to "foo according bar" and doesn't need another "a". A similar bug affecting additions after the word was fixed, but the change I tried in order to solve this one caused other problems. Certes (talk) 17:23, 1 November 2021 (UTC)
- I think I've fixed this one in my local copy of the software here. Limiting the addition of \b to contexts starting with a letter should prevent the side-effects I saw before. We can roll this out to the main version if desired. (My version also has a minor optional change to reduce white space and fit more typos on a screen.) Certes (talk) 19:36, 1 November 2021 (UTC)
- Updated the main script. Uziel302 (talk) 20:14, 12 November 2021 (UTC)
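The conditional leading \b described above can be sketched like this. It is an illustration under the assumption that the context is matched with a regular expression built from the list entry; `buildContextPattern` is a hypothetical name and the actual change in typo.js may differ.

```javascript
// Escape regex metacharacters so the context is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Prefix \b only when the context starts with a letter. A context such as
// "* Example" starts with punctuation, where a leading \b would usually
// prevent any match at all.
function buildContextPattern(context) {
  const startsWithLetter = /^[A-Za-z]/.test(context);
  return new RegExp((startsWithLetter ? "\\b" : "") + escapeRegExp(context));
}

const pattern = buildContextPattern("ccording");
console.log(pattern.test("foo ccording bar"));  // typo still present
console.log(pattern.test("foo according bar")); // already fixed
```

With the leading \b, the already-fixed "according" no longer matches, so a second "a" is not prepended.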
"Word could not be found in article" when word is in article
So, often when I try and fix a typo after fixing another one, it'll say "word could not be found in article" and then "removed" but usually when you go to the actual article, the word is there. Anyone know why this bug happens and how to fix it or get around it? Blaze The Wolf | Proud Furry and Wikipedia Editor (talk) 19:54, 29 March 2021 (UTC)
- Blaze The Wolf | Proud Furry and Wikipedia Editor, I know of two causes for this: (A) the article was moved and the script searches the redirect; (B) another part of the sentence was edited, so the script doesn't find the word with its context line. Uziel302 (talk) 04:41, 22 April 2021 (UTC)
- @Uziel302: neither of these are the case. The article usually isn't moved, and the sentence is usually the same as what is provided when this happens. Blaze The Wolf | Proud Furry and Wikipedia Editor (talk) 15:22, 22 April 2021 (UTC)
- A concrete example would help. I am working on a new batch, maybe it will be easier to reproduce then. Uziel302 (talk) 20:34, 23 April 2021 (UTC)
- @Uziel302: fomula was not found in Foma Bohemia, see Special:Diff/1019792299, but article still contains ~~~ R09 - Liquid developer to [[Rodinal]] fomula . TSventon (talk) 12:55, 25 April 2021 (UTC).
- TSventon, the script looks for "R09 - Liquid developer to Rodinal fomula " which is not in the article, because the ending space isn't there. This issue started when I switched from C to Python in January; I have fixed the code now. I will remove the additional space from the lists now. Uziel302 (talk) 20:41, 25 April 2021 (UTC)
- @Uziel302:, @Blaze The Wolf:, I am glad to hear there is an explanation and a solution. TSventon (talk) 20:47, 25 April 2021 (UTC)
- Similar issue: removal, manual fix. ~~~~ User:1234qwer1234qwer4 (talk) 01:11, 13 November 2021 (UTC)
- I've come across this too. The problem occurs when the preceding text contains &nbsp;. This is converted to a space in the typo report. We then search the article for text containing a space, which is not found. Other HTML entities may have similar effects. Certes (talk) 14:12, 13 November 2021 (UTC)
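Both failures in this section (a stray trailing space in the list entry, and an HTML entity that the report shows as a plain space) come from matching the context too literally. A more tolerant match could be sketched as below. This is an assumption about one possible approach, not the fix that was deployed; `tolerantContextPattern` is a hypothetical name, `&nbsp;` is used as the assumed example entity, and the second sample sentence is invented for illustration.

```javascript
// Escape regex metacharacters so each word is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Trim the context, then let every gap between words match either real
// whitespace or the literal entity "&nbsp;" in the article source.
function tolerantContextPattern(context) {
  const words = context.trim().split(/\s+/).map(escapeRegExp);
  return new RegExp(words.join("(?:\\s|&nbsp;)+"));
}

// Trailing space in the list entry, none in the article:
console.log(tolerantContextPattern("developer to Rodinal fomula ")
  .test("Liquid developer to Rodinal fomula."));
// Entity in the article, plain space in the list entry:
console.log(tolerantContextPattern("10 km awya").test("about 10&nbsp;km awya from"));
```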
Typos off main page?
Would it be an improvement to put all typos on subpages, keeping WP:Correct typos in one click itself free from actual typos? Perhaps #List entries could instead have a simple statement of the last run date, if only to trigger a watchlist entry for editors awaiting a new set of typos. Certes (talk) 23:41, 2 December 2021 (UTC)
- Certes, makes sense, you can simply copy paste the list from the main page to a subpage and edit it accordingly. For me, the main bottleneck now is time to improve the algorithm so that we will get enough results in each run and not leave the project empty. Uziel302 (talk) 09:52, 12 December 2021 (UTC)
- Well, IMO, empty's not any worse than dealing with duplicates. This run (I just did page 4) was the best I've ever experienced, with really good detection and very useful replacement suggestions. Bravo! But at the end I still had several "typos" which had already been fixed, some of them (as early as 22 November) by myself, via CTIOC. I don't know how much work needs to be done between DB dump and list creation, but maybe that's an area that could be easily improved? — JohnFromPinckney (talk / edits) 10:18, 12 December 2021 (UTC)
Doesn't work for me
It seems that this has never come up before... The script does nothing for me, no additional buttons. I believe I managed to clear the browser cache (Safari on Mac) but I'm not 100% sure as the instructions on Wikipedia:Bypass your cache didn't do anything for me. Cmd-shift-r gives me some document view, but I clicked Develop-empty caches. Any other idea, or any test whether my cache has indeed been cleared? Thanks, Pgallert (talk) 08:01, 24 December 2021 (UTC)
- A couple of websites suggest that the appropriate keypress for Safari on Mac is ⌥ Option+⌘ Command+E or ⌘ Command+Alt+E (which I think is the same thing on a different version of the keyboard). Certes (talk) 13:47, 24 December 2021 (UTC)
- Thank you Certes. ⌘ Command+Alt+E is what 'Develop-empty caches' from the Safari menu does. I did that, but I do not see any new buttons. Could there be a conflict with settings from the preferences to make the script work? I noticed that I didn't have a common.js before. --Pgallert (talk) 07:59, 25 December 2021 (UTC)
- Not having had a common.js is helpful. It eliminates a problem I had, which was that another script conflicted with this one. (Each modifies section headings, assuming that the other hasn't.) However, you're missing out on lots of other useful scripts. One possible confusion: the "buttons" you should see below each section header are actually wikilinks. When the script works, Wikipedia:Correct typos in one click/6 begins
- Band of the Estonian Defence Forces
- Replace Remove Type check template
- marss->mars? (double) context: ~~~ | Politsei marss
- with Replace etc. wikilinked to JavaScript functions to fix the article. Certes (talk) 12:41, 25 December 2021 (UTC)
- Pgallert, did you navigate to the lists? it only adds the buttons on the lists paragraphs. Uziel302 (talk) 19:54, 25 December 2021 (UTC)
Aaaah :) I was confused by several things at once: First, that new functionality so far always added buttons or tab keys, but in any case something visible from the page to be repaired. And second, the "buttons". Thanks for the clarification, it works now.
But it only works somehow. It is not "in one click" - most of the time I have to click "replace" more than once, and it occasionally gives strange feedback even for the entries that eventually can be changed successfully ('passage unavailable', 'Paragraph is missing'). But thanks a lot for the clarification! --Pgallert (talk) 08:21, 26 December 2021 (UTC)
- Tips for a beginner: (1) Make sure you are working from the bottom of the page, as the entries on the list page are identified by (hidden) section numbering. (2) Check that you aren't working a page at the same time as one or more other editors. I always check the Revision History and if somebody has been editing in the last, oh, 15 minutes, I don't touch the page. (3) The "passage unavailable" or "not found" messages can appear when the typo has already been corrected or the context changed. (4) If you're getting a lot of these messages (which, N.B., aren't necessarily errors), try refreshing the lists page. That should clear out all of the processed entries and let the section numbering get straightened out. — JohnFromPinckney (talk / edits) 18:48, 26 December 2021 (UTC)
- Thank you JohnFromPinckney, with these hints it is indeed one click. Just one more question: If the context clearly is not English, I opted not to do anything in order not to create a false white-list entry. Is that correct, or should I rather 'remove' if it is a very improbable typo? --Pgallert (talk) 07:00, 28 December 2021 (UTC)
- Pgallert, everything that is not a typo should be removed with the Remove button. Currently I whitelist everything that was removed. In the future, I can filter the removals only for the specific article from which the word was removed (this is why the edit summary contains the article where the word was found). Uziel302 (talk) 09:27, 28 December 2021 (UTC)
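Tip (1) above, working from the bottom, can be illustrated with a small model. This is not how typo.js is implemented; it only assumes, as the tip says, that entries are addressed by section number as they stood when the page was loaded. `removeSections` is a hypothetical name.

```javascript
// Remove the sections at the given (stale) indices in the requested order.
// Removing a section shifts every later section up by one, so indices that
// were read before the removal only stay valid for sections ABOVE it.
function removeSections(sections, indices, order) {
  const remaining = sections.slice();
  const sorted = indices.slice().sort((a, b) => (order === "bottom-up" ? b - a : a - b));
  for (const index of sorted) {
    remaining.splice(index, 1);
  }
  return remaining;
}

const page = ["A", "B", "C", "D"];
// Intention: remove "B" (index 1) and "C" (index 2).
console.log(removeSections(page, [1, 2], "bottom-up")); // [ 'A', 'D' ]
console.log(removeSections(page, [1, 2], "top-down"));  // [ 'A', 'C' ] - wrong entry removed
```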
'Typos corrected' quarry is wrong
Currently, WP:CTI1C says "Over 93,744 typos have been fixed in the project as of 14 October 2021 (Quarry)", but the quarry currently, 2 months later, results in only 57,719 rows. This was the last update, and the query URL did not change, so I'm inclined to believe the 93k figure was correct at the time. Does anyone know what's wrong/how to fix? ~ Tom.Reding (talk ⋅dgaf) 14:23, 12 December 2021 (UTC)
- Don't know all the workings, but why is it LIKE "% was dismissed"; instead of something like, say, LIKE ">% fixed!";? That query seems to count things we haven't fixed. — JohnFromPinckney (talk / edits) 19:45, 12 December 2021 (UTC)
- @JohnFromPinckney: bingo! I think this was the intended query, using LIKE "% fixed!";, which gives a result of 96,771 rows, a much more reasonable figure. I'll update. ~ Tom.Reding (talk ⋅dgaf) 20:32, 12 December 2021 (UTC)
- Cool. I would have checked to see when/whether the query was last changed, but I have no idea how to do that. Or can Quarry queries even be changed? I don't see how. Does one just make a fork? — JohnFromPinckney (talk / edits) 21:31, 12 December 2021 (UTC)
- JohnFromPinckney, Tom.Reding, the creator of a query can edit it, and I mistakenly edited the original query I linked here instead of making a fork. Sorry for that and thanks for the fix. Uziel302 (talk) 09:32, 28 December 2021 (UTC)
Capitalising names
Wikipedia:Correct_typos_in_one_click/8 contains 11 sports teams that are all in lower case. Is there an easyish way to change them to proper case? The Croatian team includes diacritics. TSventon (talk) 08:51, 9 January 2022 (UTC)
- TSventon, I usually just click Type and change the first letter to a capital. I see there are names with more than one word, so it will be better to go to the article and edit all the lower-case names properly. I don't see a way to do it other than manually. Uziel302 (talk) 13:51, 9 January 2022 (UTC)
- šibenik → {{subst:ucfirst:šibenik}} works for A–Z and some diacritics, but I'm not sure how complete its coverage is. Certes (talk) 14:49, 9 January 2022 (UTC)
- Thank you both, I have capitalised the names using visual editor and found capitals with diacritics by searching in Wikipedia. TSventon (talk) 17:30, 9 January 2022 (UTC)
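For comparison, capitalising the first letter of each word is short in JavaScript, and the built-in case mapping covers letters with diacritics such as š. This is only a sketch: `capitalizeWords` is a hypothetical name, the second sample name is an invented illustration, and real team names may contain words that should stay in lower case, so the result still needs a human check.

```javascript
// Upper-case the first character of every space-separated word.
function capitalizeWords(name) {
  return name
    .split(" ")
    .map((word) => word.charAt(0).toUpperCase() + word.slice(1))
    .join(" ");
}

console.log(capitalizeWords("šibenik"));
console.log(capitalizeWords("hajduk split"));
```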
Template field
I just dismissed strunz here. strunz is a field in Template:Infobox mineral, which appears in several hundred articles, so is it possible to exclude it somehow? TSventon (talk) 15:01, 3 February 2022 (UTC)
- Anything between | and = (with any spacing) should probably be excluded, unless we're aiming to catch |auhtor= and similar. Certes (talk) 15:46, 3 February 2022 (UTC)
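The "between | and =" rule could be approximated with a regular expression. The sketch below is a rough heuristic rather than a wikitext parser, and it is not part of the project's scanner: `templateParameterNames` is a hypothetical name, and things like table cell attributes or unnamed parameters containing "=" would also be picked up.

```javascript
// Collect the text found between "|" and "=" (with any spacing), which in
// template calls is normally a parameter name such as "strunz".
function templateParameterNames(wikitext) {
  const names = new Set();
  const pattern = /\|\s*([^|=\[\]{}\n]+?)\s*=/g;
  for (const match of wikitext.matchAll(pattern)) {
    names.add(match[1]);
  }
  return names;
}

const sample = "{{Infobox mineral\n| name = Quartz\n| strunz = 4.DA.05\n}}";
const names = templateParameterNames(sample);
console.log(names.has("strunz")); // candidate "typo" that could be skipped
```

A scanner could skip any candidate word that appears in this set, at the cost of no longer catching misspelled parameter names such as |auhtor=.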
Typo list cleaner
I've just run through a page of typos. Although I got fewer "not found" messages than in previous runs, I still find that many of the typos are already fixed. Would it be worth adding a button which reads in each article on the page, working up the list, saving no edits but checking that the text is actually detected? Absent typos can then be removed from the list without troubling the humans. To avoid hammering the servers, we might want a one-second delay between page reads, meaning that a 250-typo list would take four minutes to process, but that's less time than it would take a human to click or type in 250 corrections and is far less tedious. The optional new procedure would then be to click the filter button, go for a coffee, and return as the page refreshes with a list of maybe 125 corrections which actually need human intervention. Any thoughts? Certes (talk) 01:17, 10 January 2022 (UTC)
- No comment on the technical solution above, but one about this experience myself. I was working a list (/4) yesterday, and came across a few that were already changed using this tool (the specific instance I remember was from 1 Jan.). Is there at least a way to purge duplications within the CTIOC lists? — JohnFromPinckney (talk / edits) 04:13, 10 January 2022 (UTC)
- Many from my run of /7 yesterday had already been fixed with CTIOC by myself or others, e.g. bismal was not found in 2000s in Bangladesh which I fixed on 2 Jan. It would enhance the tool if we could eliminate those, either when preparing the page or later. Certes (talk) 11:00, 10 January 2022 (UTC)
- The original idea was that fixed typos won't appear on a second scan. It doesn't always work: if the scan is based on the 1.1 dump and the fix was made on 2.1, the typo would be in the list again. I agree that there should be a way to verify that a list is up to date; it can be done using a call similar to the one that sends the "not found" error. Certes, you are welcome to create such a feature, I don't think I will have time for it soon. Uziel302 (talk) 19:53, 24 January 2022 (UTC)
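The proposed list cleaner can be sketched as a loop that reads each article, checks whether the context is still present, and pauses between reads. Everything here is an assumption for illustration: `filterStaleEntries`, `estimatedSeconds` and the entry shape `{ title, context }` are hypothetical, and `fetchText` stands for whatever function the caller uses to obtain an article's wikitext. No such feature is confirmed to exist in typo.js.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Upper bound on the pause time: 250 entries at 1000 ms is 250 s, about 4 minutes.
function estimatedSeconds(entryCount, delayMs) {
  return (entryCount * delayMs) / 1000;
}

// Keep only entries whose context is still found in the article text.
// Nothing is saved; the articles are only read.
async function filterStaleEntries(entries, fetchText, delayMs = 1000) {
  const stillPresent = [];
  // Work up the list, from the last entry to the first.
  for (let i = entries.length - 1; i >= 0; i--) {
    const text = await fetchText(entries[i].title);
    if (typeof text === "string" && text.includes(entries[i].context)) {
      stillPresent.unshift(entries[i]); // preserve the original order
    }
    if (i > 0) {
      await sleep(delayMs); // avoid hammering the servers
    }
  }
  return stillPresent;
}

console.log(estimatedSeconds(250, 1000) / 60); // minutes of delay for a 250-typo list
```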
- When there were 100,000 typos to clean, they dwarfed the false positives and most clicks made an improvement. Today, I went through a list of 224 suggestions and only fixed 16 (dismissing the rest). That shows what great work the project has done – and can continue to do, but we may need a slight change of direction to stop the false positives (still at the same low level) drowning out the real errors (getting rare thanks to this project). Certes (talk) 21:31, 24 January 2022 (UTC)
- Certes, the change in false positives ratio is due to a new version of the code, it works in a different way but the main change was that now I go over any replacement of vowel with vowel and consonant with consonant, not only letters that are similar in shape or sound. Uziel302 (talk) 12:16, 26 January 2022 (UTC)
- Hi User:Uziel302, I also noticed a vast increase in false positives, well over 90% of what I checked just now. Are we running out of typos to fix? But I also think the false positives could be reduced without huge effort, for instance by excluding:
- text in references, code, or blockquote tags
- text with double quotes nearby
- pages with a language name in the title
- This would have excluded most of what I removed today. Cheers, Pgallert (talk) 08:17, 27 January 2022 (UTC)
- At the risk of adding more work, we might also exclude:
- words which are English Wikipedia article titles (but not redirects, or at least not R from misspelling)
- words which have at least one English meaning in Wiktionary without the text "Misspelling of..."
- text with many other "typos" nearby.
- The latter would rule out passages in foreign languages and also some olde-faſhioned texte. Certes (talk) 10:42, 27 January 2022 (UTC)
- Building on that, I'd also suggest excluding wikilinks (or the front half of a piped wikilink) and text in math tags. GoingBatty (talk) 18:12, 27 January 2022 (UTC)
- Oh, and <syntaxhighlight lang="..."> blocks which contain keywords, variables, etc. Certes (talk) 19:12, 27 January 2022 (UTC)
- Yes, we may well be running out of typos to fix, or at least fixable ones. (No program will spot He red a book.) Of course, new typos still appear daily. It may soon be time to look only at typos that weren't in the previous run – either "previous to 27 Jan 2022" or "previous to the latest run". That only works if the software stops improving, otherwise we miss old errors with newly identified patterns. Certes (talk) 19:21, 27 January 2022 (UTC)
- Certes, the current list excludes words that appear in English Wiktionary, so there are fewer false positives, and I found out I was not getting all the results because something was killing the script in toolforge. Now I run the script locally and post the results; there are enough typos to work with. Uziel302 (talk) 09:40, 2 April 2022 (UTC)
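Several of the exclusions suggested in this thread (references, code, blockquote, math and syntaxhighlight tags, and wikilink targets) amount to blanking parts of the wikitext before looking for typos. A rough sketch follows. It is a regex approximation, not a wikitext parser, and not the project's scanner; `stripExcludedRegions` and `SKIPPED_TAGS` are hypothetical names, and nested or unclosed tags are not handled.

```javascript
const SKIPPED_TAGS = ["ref", "code", "blockquote", "math", "syntaxhighlight"];

function stripExcludedRegions(wikitext) {
  let text = wikitext;
  for (const tag of SKIPPED_TAGS) {
    // Self-closing form first, e.g. <ref name="x" />, so it is not mistaken
    // for an opening tag that swallows text up to a later closing tag.
    text = text.replace(new RegExp("<" + tag + "\\b[^>]*/>", "gi"), " ");
    // Then the paired form, matched lazily across lines.
    text = text.replace(
      new RegExp("<" + tag + "\\b[^>]*>[\\s\\S]*?</" + tag + ">", "gi"),
      " "
    );
  }
  // [[target|label]] keeps only the label; [[target]] is dropped entirely.
  text = text.replace(/\[\[[^\]|]*\|([^\]]*)\]\]/g, "$1");
  text = text.replace(/\[\[[^\]]*\]\]/g, " ");
  return text;
}

console.log(stripExcludedRegions(
  "Good text<ref>bda ref</ref> and [[Teh page|the page]] with <math>x_teh</math>."
));
```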
Last run date
Would it be sensible to add and maintain a statement like "Last run 29 March 2022" on the main typo page? That would allow editors to watch just the main page and get one alert for each new batch, and unwatch /1 to /20 so they don't flood the watchlist. Certes (talk) 15:22, 27 April 2022 (UTC)
This tool is introducing typos
Here here the tool changed hydromatic to hydromantic. In this case, hydromatic is correct - it was a trade name used by de Havilland for some of its propellers. Other examples are on page 6 - turcock->turncock? (omit) context: ~~~ 636 and 642. Blackburn Civil Biplane, turcock
turcock. - this is the Blackburn Turcock, while the page is full of incorrect suggestions. This tool needs to be stopped. Nigel Ish (talk) 08:33, 15 May 2022 (UTC)
- Nigel Ish, the tool is designed to be used by human editors, who take responsibility for accepting, changing or dismissing its suggestions. Obviously human editors will sometimes make mistakes, as in the first example you quoted. For the second example, I used the tool to correct turcock to Turcock. TSventon (talk) 08:55, 15 May 2022 (UTC)
- @Nigel Ish: Thank you for pointing out the mistakes. This tool has helped us correct over 100,000 typos. If it's also introduced a handful of errors, which we're happy to correct on request, that's unfortunate but a small price worth paying for a huge improvement overall. Editors make occasional mistakes with AWB and DisamAssist and Visual Editor but we haven't withdrawn those tools because, on balance, they improve the encyclopedia. This tool is used by experienced editors who generally dismiss or improve the incorrect suggestions, and I believe it has a very low error rate. Certes (talk) 11:45, 15 May 2022 (UTC)
Uziel302, this seems to be about one user who didn't understand the tool, but used it for 20 minutes on 14 May to make around 250 "corrections". This is under discussion at Wikipedia talk:WikiProject Languages#Non-typo fixes and Wikipedia:Administrators' noticeboard/Incidents#Incorrect use of typo-correcting script. TSventon (talk) 15:19, 16 May 2022 (UTC)
- It may be worth adding better instructions to stress that users of the tool must only make changes when (1) they are sure that a genuine typo is being suggested and (2) the suggested correction is correct in context. Nigel Ish (talk) 17:03, 16 May 2022 (UTC)
- I hoped that was obvious, but clearly it wasn't. I've made the need to check more explicit. Certes (talk) 17:48, 16 May 2022 (UTC)
Word not found in line
I switched back to the C script after I found out the Python script was dying in the middle of execution. I see there is a problem with cases like this, where the line ends with a special char (,) and space instead of a letter and space: — Preceding unsigned comment added by Uziel302 (talk • contribs) 08:05, 31 May 2022 (UTC)
colured->cloured? (swap) context:
cover, a thin, dark-colured,
colured, breathable cloth, is sometimes used to protect articles from the sun while heating and drying them in the sun.
Certes, I think we should revert one of the changes of the js script, the c script I used was for sure working before I switched to python. Uziel302 (talk) 08:05, 31 May 2022 (UTC)
- Certes, I reverted the last two changes that as far as I could tell were the relevant changes, verified the bug is fixed following the change. I will need to change the C script to support this change before we can bring it back to JS. Uziel302 (talk) 10:21, 31 May 2022 (UTC)
- @Uziel302: If that comment is about air circulation: colured could be a typo for coloured or for cloured. Of course, coloured is many times more likely. Ideally, the script would take into account how common a potential replacement is, but I expect that would be very difficult.
If the comment applies to the previous section #Word not found in line: \b may misbehave if the typo ends in a letter other than a-z. For example, in "fiancé", some implementations of \b may see a word break after the é (because they know é is a letter) but others see a word break before the é (because they see only [A-Za-z0-9_] as word characters). Could that be the problem here? Certes (talk) 13:09, 31 May 2022 (UTC)
- Certes, I only opened one paragraph, but since I copied to it an example, it broke into two. I indeed talk about the \b thing, I had to remove your related changes due to moving back to C script, which doesn't support cutting lines of context at \b, like my python script did. You can see the C script here: [1]. It is a bit messy. I understand there are some cases that the script will replace in wrong place because of the change to JS script, but I can't fix the C script yet. The python script was very slow and we only got partial scan, that is why we didn't have much in the lists lately, but now the lists are full. Uziel302 (talk) 19:08, 31 May 2022 (UTC)
- @Uziel302: I've not written C for years and had forgotten how verbose C can be. Would it help to #include <ctype.h> and use functions like ispunct? Also, simple helper functions like
    #include <string.h>
    int found(const char *s) { int l = strlen(s); return strncmp(s, context+BEFORE-l, l) == 0; }
  could reduce (say) lines 80–81 to
    if (found("<text")) {textflag=1;}
  which might make such things easier to write and debug. Certes (talk) 23:21, 31 May 2022 (UTC)
- I know the script is awful, I didn't have any programming knowledge when I started. I made a fix for the punctuation issue: [2]. I hope it will solve the problem. I need to download a new dump and run the script on it, and then I will bring back the changes to JS. Uziel302 (talk) 04:12, 7 June 2022 (UTC)
- It's not awful if it does the job and you can maintain it. Your current version may run marginally faster than strncmp(), which may be important here. Certes (talk) 10:38, 7 June 2022 (UTC)
- Certes, I don't know about efficiency but it was simpler for me to solve it this way. New lists from the updated script are in place, and I brought back your changes to JS script. Uziel302 (talk) 21:44, 7 June 2022 (UTC)
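The \b behaviour described above for "fiancé" can be reproduced with a minimal Python sketch. It is only an illustration and is not taken from the project's C, Python or JS scripts; the sample sentence and variable names are made up for the example. Python's `re` module treats é as a word character by default, while the `re.ASCII` flag restricts word characters to [A-Za-z0-9_].

```python
import re

WORD = "fianc"               # hypothetical suspect: the letters before the final "é"
TEXT = "his fiancé arrived"  # hypothetical sample sentence

PATTERN = r"\b" + re.escape(WORD) + r"\b"

# Unicode-aware matching (the default for str patterns): "é" is a word
# character, so there is no word break between "c" and "é" and nothing matches.
unicode_hit = re.search(PATTERN, TEXT)

# ASCII-only matching: only [A-Za-z0-9_] count as word characters, so a word
# break is seen before "é" and the truncated word "fianc" appears to match.
ascii_hit = re.search(PATTERN, TEXT, flags=re.ASCII)

print("Unicode-aware:", unicode_hit)
print("ASCII-only:   ", ascii_hit)
```

If the scanner and the replacing script disagree in this way, one of them can report a word that the other cannot find, which would be consistent with the "word not found" symptom discussed here.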
I broke it
I'm clicking "Replace" but the typo is not found, even though it's clearly there. I've tried replacing my common.js by a temporary version which loads only typo.js and no other scripts, and given it a hard refresh, but still no effect. Any ideas please? Certes (talk) 21:45, 8 June 2022 (UTC)
- Certes, it's my fault, I made sure the context line ends with word ending and haven't fixed the case that context line starts in the middle of the word. Will get to fix it soon. Uziel302 (talk) 12:03, 9 June 2022 (UTC)
- Certes, added requirement of a space before starting the context line, ran the script on the list of articles that had suspect typos on the last run (since we don't have new dump yet) and uploaded the new lists. Thanks a lot for reporting this bug. Uziel302 (talk) 03:45, 11 June 2022 (UTC)
- Thanks. That's now working again for me. Certes (talk) 09:12, 11 June 2022 (UTC)
- This batch seems a big improvement on the last one. It's 38% actual typos to fix (the previous lot were mainly false positives or already fixed) and everything is working smoothly. That could be because I'm getting in early before other editors fix anything, but I think it's genuinely better. One small point: any punctuation after the suspect word is removed in the report, e.g. Roubaix reads ...desindustrialisation. The... but /1#Roubaix reads ...desindustrialisation The... without the ".". Certes (talk) 20:53, 11 June 2022 (UTC)
- Another minor issue: /19#NMS gives "word not found in context line". Also, the last word on the first line of that display is neuropepti rather than neuropeptid. Unusually the typo occurs at the end of a line. This could be the same problem as the missing punctuation, if there's an off-by-one error in the string length when the typo is followed by punctuation or a line feed rather than a simple space. (Yes, it's an end-of-line issue, also occurs on /19#Network simulation.) Certes (talk) 21:17, 11 June 2022 (UTC)
- One more cosmetic issue: /19#Heat therapy works correctly but highlights the first "e" in the suggested fix rather than the added "e": displays neurodegenerative rather than neurodegenerative. It's sometimes a different letter: /19#1960 in country music shows neotraditionalist rather than neotraditionalist. Certes (talk) 21:45, 11 June 2022 (UTC)
- Certes, I fixed the issues you reported and uploaded new lists, they might be with more false positives than last batch since it's the beginning of the alphabet and was already here in some lists that people skimmed through. Uziel302 (talk) 20:39, 2 July 2022 (UTC)
- Thank you! I'm quite busy off-wiki at the moment but look forward to using the new pages soon. Certes (talk) 22:56, 2 July 2022 (UTC)
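The context-line problems in this thread (a context that starts or ends mid-word, punctuation lost after the suspect word, a letter dropped at end of line) all concern where the context window is cut. Below is a minimal Python sketch of one way to cut it at spaces. It is an assumption-laden illustration, not the project's C script; the function name `context_window` and the `width` parameter are hypothetical.

```python
def context_window(text: str, start: int, end: int, width: int = 30) -> str:
    """Return text[start:end] plus up to `width` characters of context on each
    side, trimmed so the window never begins or ends in the middle of a word."""
    lo = max(0, start - width)
    hi = min(len(text), end + width)

    # Window begins mid-word: move right to just after the next space.
    if lo > 0 and not text[lo - 1].isspace():
        space = text.find(" ", lo, start)
        lo = space + 1 if space != -1 else start

    # Window ends mid-word: move left to the last space after the suspect word.
    # Punctuation directly after the suspect word is kept because the cut is
    # made at a space, never at the end of the word itself unless no space exists.
    if hi < len(text) and not text[hi].isspace():
        space = text.rfind(" ", end, hi)
        hi = space if space != -1 else end

    return text[lo:hi]


sample = "Industrial decline and desindustrialisation. The region recovered"
word = "desindustrialisation"
begin = sample.index(word)
print(context_window(sample, begin, begin + len(word), width=10))
```

Because the right-hand cut is computed with `min(len(text), ...)`, a suspect word at the very end of a line is returned whole rather than losing its last letter.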
Script not working
I just tried this for the first time in a while, and I keep getting "paragraph missing" and "paragraph not found" errors. I doubt it's a script conflict because I don't load any other pages on CTIOC pages. — Qwerfjkltalk 20:30, 28 August 2022 (UTC)
- Seems to be working fine for me. You're definitely going from the bottom of the lists up? If so, maybe it's an issue on certain pages? Bellowhead678 (talk) 19:28, 29 August 2022 (UTC)
- @Bellowhead678, for instance Wikipedia:Correct typos in one click/3#CloseUp1 gives "paragraph is missing" . Are you on Vector 2022 (I am)? — Qwerfjkltalk 19:56, 29 August 2022 (UTC)
- Same issue for me, even on pages that are showing as last edited 10 days ago. Turtlecrown (talk) 00:52, 1 September 2022 (UTC)
- Turtlecrown, User:Qwerfjkl, I fixed the bug, please try hard refresh and let me know if everything works for you now. Uziel302 (talk) 08:15, 2 September 2022 (UTC)
- @Uziel302, it works now. — Qwerfjkltalk 09:17, 2 September 2022 (UTC)
I literally have no idea what i'm doing
I copied and pasted the code to my common.js as (and later with my username instead); neither worked. I've done a hard refresh and everything. Everything inside the brackets appears red and nothing on the edit source page is giving me any hints. In case you can't tell, I have zero code experience. Any help would be appreciated. EmilySarah99 (talk) 10:05, 17 October 2022 (UTC)
- EmilySarah99, thanks for your effort to help fix typos in Wikipedia. After loading the script, the only visible change is when viewing the lists, e.g. Wikipedia:Correct typos in one click/2, where a few action links should appear under each title. In addition, there is a new tool that doesn't require script installation and aims to simplify the process: https://typos.toolforge.org/
- Thanks again and let me know if you still have issues. Uziel302 (talk) 10:33, 17 October 2022 (UTC)
Toolforge tool - summary of first round
5 hours ago the list that I uploaded to toolforge, containing 5,000 suspect typos, was done (once). I just put the 733 skipped suspects back to be checked, so there will be something to do until I make another scan.
Thanks to all the participants, here is the list with the fix count:
- Bellowhead678-1556
- QueenofBithynia-504
- Qwerfjkl-364
- TSventon-171
- Uziel302-165
- DaGizza-87
- NovoTape-21
- Mandarax-14
- Roostery123-4
- Uzieltest-4
- Yahya-2
Total of 2892 fixes out of 5,000 = 58%. The reason for this high success rate is that the list included only words that have been fixed on the project in the past.
Only 306 words were dismissed, which shows how people prefer to skip over deciding to dismiss a word as non-typo for sure.
1015 words were not found in articles on server side - it shows the importance of the feature of server side verification before serving the typo to the user to check.
Thanks again to all the contributors and I hope I will manage to post new list before you finish the rerun of 733 skipped words... Uziel302 (talk) 02:00, 19 October 2022 (UTC)
- Thanks! Occasionally I get a word with no context, e.g. "ddition" in Agora Center. However, I can't see "ddition" on its own in the article - is it somehow picking up only the end of the word "addition"? Bellowhead678 (talk) 07:03, 19 October 2022 (UTC)
- Are the figures complete? 2892 fixes + 306 dismissed + 1015 not found + 733 skipped = 4946. The reason for the low dismiss rate will be that the list included only words that have been fixed on the project in the past. TSventon (talk) 08:15, 19 October 2022 (UTC)
- @TSventon, since I moved back skipped to new status and also there is "served" status, to prevent serving the same typo to multiple users at once, this is the current state:
- 3032 - fixed
- 1026 - wasn't found
- 403 - dismissed
- 343 - new
- 140 - skipped
- 56 - served
- Thanks, Uziel302 (talk) 12:04, 19 October 2022 (UTC)
- @Bellowhead678, indeed the initial check that the word exists didn't take into account word delimiters. The scan did, and the reason it got to the list is that the problem existed before this fix of one of the project contributors. I just added the requirement to have word delimiters around the suspect word before showing it to editor: github Uziel302 (talk) 11:58, 19 October 2022 (UTC)
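The "ddition" inside "addition" case above comes down to the difference between a plain substring test and a whole-word test. A minimal Python sketch of that difference follows; it is not the code in the linked GitHub change, and the function name `contains_whole_word` is hypothetical.

```python
import re


def contains_whole_word(text: str, word: str) -> bool:
    """True only if `word` occurs with no letter, digit or underscore
    directly before or after it."""
    pattern = r"(?<!\w)" + re.escape(word) + r"(?!\w)"
    return re.search(pattern, text) is not None


article = "In addition, the center hosts concerts."  # made-up sample text

print("ddition" in article)                      # plain substring test
print(contains_whole_word(article, "ddition"))   # whole-word test
```

The lookarounds are used instead of \b so that the check behaves the same whether or not the suspect word itself begins or ends with a non-word character.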
Not a typo
Would it be possible to add the {{not a typo}} template to the functions available, similar to the {{typo help inline}}? ― Qwerfjkl|✉ 20:52, 29 April 2021 (UTC)
- Yes, it is possible, but I think it will be confusing for new editors and they might think it should be used for all non-typos here. I prefer to leave the rare cases that require the template to fully manual treatment. Uziel302 (talk) 15:14, 1 May 2021 (UTC)
- Others probably realised this long ago, but the Type link allows us to replace a word by {{Not a typo|exampull}}, {{sic|exampull|hide=y}}, etc. Certes (talk) 15:29, 3 November 2021 (UTC)
- @Certes, great tip, thanks, but it makes a mess of the edit summary in the page history. Doktor Züm (talk) 17:52, 2 November 2022 (UTC)
Buttons
Thank you for producing the tool. It's much easier to use than the wiki pages, and leaves the page histories free from clutter.
Would it be possible to move the "Replace Dismiss..." buttons to a fixed place, perhaps below the "Type" button but above "Found in <article>"? That way we'll always know where to click, rather than having to hunt for the buttons as they bounce around the screen as excerpts of differing height are shown. Thanks again, Certes (talk) 16:41, 3 November 2022 (UTC)
- This only affects me when I have a screen width of less than 1420px. — Qwerfjkltalk 17:11, 3 November 2022 (UTC)
- My current screen is 1600px wide but when I use a window below about 1420px I see a different effect: the four buttons move onto two lines, with two left-justified and two right-justified, even though they would fit easily on one line. That's unfortunate but it's not what I was writing about. Certes (talk) 17:38, 3 November 2022 (UTC)
- Ah, I see what you mean. — Qwerfjkltalk 18:08, 3 November 2022 (UTC)
- I changed the breaking point for mobile from 1420 to 900. Uziel302 (talk) 08:12, 4 November 2022 (UTC)
- By messing about in Firefox's Developer Tools, I've moved the buttons up and found the new position much more usable. The one problem is that the Replace button now lies under the original spelling and the Dismiss button under the suggested replacement, and it's counterintuitive to press the button next to the spelling that's wrong, so I swapped them over too. I've also changed the colour of the suspect word in the extract to make it stand out. Those changes may not suit everyone but they're worth considering. Certes (talk) 18:19, 3 November 2022 (UTC)
- @Certes, I moved the buttons as you suggested. As for the colors, I used Wikipedia's colors for previewing changes; what colors do you think would be better? Uziel302 (talk) 08:12, 4 November 2022 (UTC)
- Sorry, you're absolutely right. No change needed. (I use an add-on called Dark Reader which reverses the colours to white text on a less glary black background. It has the unfortunate side-effect of making such subtle backgrounds virtually invisible, but that's a problem I can solve locally.) Certes (talk) 10:34, 4 November 2022 (UTC)
Also, is the "Type" button still useful, or can we now simply type the replacement into the box to the left of Type? Certes (talk) 16:51, 3 November 2022 (UTC)
- See #Toolforge tool above, where I requested this capability. BTW, Uziel302, I see that I neglected to thank you for adding that feature, so ... thanks! MANdARAX XAЯAbИAM 07:23, 4 November 2022 (UTC)
- Understood. Clearly "Type" is still useful for some editors, and it causes no problems for the rest of us. Certes (talk) 10:29, 4 November 2022 (UTC)
Access keys
Can you add shortcuts to the buttons, i.e. r for replace, d for dismiss, s for skip, and t for typo template (or maybe type?). I just use the developer tools to add the accesskey attribute to the buttons, and it makes it a lot easier. — Qwerfjkltalk 13:09, 5 November 2022 (UTC)
- Though using accesskey means it can be held down, so an event listener for a key press would be better. — Qwerfjkltalk 13:21, 5 November 2022 (UTC)
- User:Qwerfjkl, I have thought about it but understood that in a tool that is supposed to accept typing in multiple fields, making some letters on the keyboard actually save edits might be dangerous and confusing. Do you have an idea how to make it safe and user friendly? Uziel302 (talk) 05:34, 6 November 2022 (UTC)
- @Uziel302, use a combination of e.g. alt + key, alt + shift + key. — Qwerfjkltalk 07:30, 6 November 2022 (UTC)
Uh...
I get the idea from this that if you pick up this tool and just robotically accept all the suggestions (which, people will -- you know how people are), most of the changes by far will introduce errors, so that is not good. Even beyond well meaning cleanuppers, this also seems an excellent way to vandalize by introducing scads of subtle slightly-incorrect word changes into the Wikipedia at high speed. Like, if I was a vandal, a tool to rapidly change "teragrams" to "tetragrams" (two entirely different words) etc. would be a godsend I would think.
Unless I'm missing something, should not access to this tool be limited to editors who can use it correctly? Herostratus (talk) 21:05, 5 November 2022 (UTC)
- Herostratus, can you please give details about how this tool can be "limited to editors who can use it correctly"? Currently to edit with the tool you must login with your Wikipedia account, and the edits are on your user and your responsibility, and there are tools to mass revert irresponsible users, like shown in the case you refer to. What else can be done? Uziel302 (talk) 05:29, 6 November 2022 (UTC)
- @Uziel302, you could restrict it to either auto confirmed or extended confirmed users. — Qwerfjkltalk 07:31, 6 November 2022 (UTC)
- I think limiting it to autoconfirmed would help. It's perfectly reasonable to insist on ten manual edits as a very minimal experience qualification for using the tool. I'd support tightening that to extended-confirmed the moment we see any problems. That's still far more lenient than AWB. Certes (talk) 12:09, 6 November 2022 (UTC)
Second batch
We had a high percentage of skips and we reached the end, so I put the skipped typos back into the pool. I will work on a way to make skips per user, so that one user who skips thousands of suspects won't leave the rest without suspect typos to check. Uziel302 (talk) 12:14, 5 November 2022 (UTC)
- Summary of second batch:
- fixed: 1428
- dismissed: 3233
- skipped (twice, will be updated to dismissed in DB): 640
- not found: 300
- served and not handled: 229 (will get back to the pool)
- Thanks for all. Soon will upload third batch. Uziel302 (talk) 08:11, 23 November 2022 (UTC)
- Just uploaded third batch with over 4,000 words. Thanks to all the fixers of the second batch:
- User:Larry Hockett
- User:Bellowhead678
- User:Turtlecrown
- User:TSventon
- User:Certes
- User:Roostery123
- User:Ira Leviton
- User:Qwerfjkl
- User:BarleyButt
- User:Vukky
- User:Mandarax
- User:Uziel302
- User:Erick Soares3
- User:Balon Greyjoy
Please contact me if something is wrong. Uziel302 (talk) 11:26, 23 November 2022 (UTC)
Choosing a replacement
How does the tool choose a suggested correction? For example, when "kiled" appears, why does it suggest "kidel" rather than "killed"? Both are credible replacements, and it's not obvious whether a transposition or a doubling is the better suggestion, but I think any human could guess which is far more likely to be right. Would it be feasible to resolve such ties in favour of the most common word? Sorting the whole dictionary in order of usage may not be practical or necessary, but perhaps the top 100 or 1000 common English words could appear first and get priority, followed by those in a basic open-source word list of about 10,000 words, leaving the unlikely obscure words to be used only if nothing more common is plausible. Certes (talk) 18:56, 27 November 2022 (UTC)
- Certes, thanks for the feedback, it is indeed one of the features I wanted to implement long ago, it isn't a matter of technical difficulty, it is simply a matter of time to implement this. The current algorithm simply chooses the first option alphabetically, and kidel is before killed. Uziel302 (talk) 06:45, 29 November 2022 (UTC)
- Does it actually choose the first alphabetically, or does it pick the first match from a list that just happens to be in alphabetical order? If it's the latter, and the list is not too huge, could you post it so that we can sort it by frequency? I think it would be enough to sort by position on a list like wikt:Frequency lists#Project Gutenberg, leaving any words that are not in the top 100,000 at the bottom. Alternatively, we could simply replace the list by the Gutenberg list: anything not on there may not be a plausible replacement. Certes (talk) 11:04, 29 November 2022 (UTC)
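The tie-breaking idea discussed above (prefer the more common word, fall back to alphabetical order) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tool's real candidate list and algorithm are not shown in this discussion, the function name `pick_replacement` is hypothetical, and the tiny `common_words` list stands in for a real frequency list such as the Project Gutenberg one mentioned above.

```python
def pick_replacement(candidates, frequency_rank):
    """Choose the candidate with the best (lowest) frequency rank.

    Words missing from the frequency list sort after all ranked words;
    remaining ties fall back to alphabetical order.
    """
    unranked = len(frequency_rank)
    return min(candidates, key=lambda w: (frequency_rank.get(w, unranked), w))


# Hypothetical stand-in for a frequency list, most common word first.
common_words = ["the", "of", "and", "killed"]
frequency_rank = {word: position for position, word in enumerate(common_words)}

print(pick_replacement(["kidel", "killed"], frequency_rank))  # frequency-aware choice
print(pick_replacement(["kidel", "killed"], {}))              # alphabetical fallback
```

With an empty frequency list the function reproduces the alphabetical choice described above, so a frequency list could be introduced without changing behaviour for words it does not cover.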
Toolforge error
I'm trying https://typos.toolforge.org/ but I see "could not get new typos from server". This may be due to my setup rather than any problem on Toolforge, as I set my browser to aggressively block ads, trackers, etc. However, I've whitelisted the site in UBlock Origin and uMatrix and can't think what else might be blocking it. I've tried both logged in and logged out, and with all add-ons disabled. The About link seems to work about 50% of the time ("My name is...") but sometimes gives a "Cannot GET /about". I use Firefox 106 on Ubuntu. Any ideas please? — Preceding unsigned comment added by Certes (talk • contribs) 21:58, 2 November 2022 (UTC)
- @Certes, I think there aren't any more typos for fixing. — Qwerfjkltalk 22:05, 2 November 2022 (UTC)
- That would explain it. Congratulations on fixing them all! Certes (talk) 22:11, 2 November 2022 (UTC)
- @Certes, also, see § Toolforge tool - summary of first round. — Qwerfjkltalk 22:14, 2 November 2022 (UTC)
- @Certes, User:Qwerfjkl, User:Bellowhead678, I just uploaded new batch of 5,000 suspect words to the server. This time you can expect more false positives since I didn't limit the scan to words that were corrected in the past. Uziel302 (talk) 08:54, 3 November 2022 (UTC)
- That's working fine now, thanks! Maybe a message saying there are no typos would be more informative, in case others mistakenly infer that there are typos but we couldn't get them. Certes (talk) 14:59, 3 November 2022 (UTC)
- I changed the message when batch is over: "All the typos were handled! We try to generate new batch once a month". This is a server side change and I don't want to clear login sessions without real reason so I will wait with server restart and it may take a while for the change to be applied. Uziel302 (talk) 08:14, 4 November 2022 (UTC)
Uziel302 the tool tells me that Barinas State Anthem, Achar (Buddhism) and Bhagyada balegara included "fera bien évidemment partie des deux grandes rétrospectives consacrées à Georges de La Tour à Paris, la première, qui acc". That looks like a bug. TSventon (talk) 10:04, 3 November 2022 (UTC)
- The only place I could find this text is on the French Wikipedia: [3]. Do you mean this text appeared as the context of the typo in those cases? I changed the status of those three articles to new and got them in the tool, and there was nothing weird about the context I got. I suspect something went wrong on the API asking wikipedia for the page, but I can't tell what. Let me know if it happens again. Uziel302 (talk) 07:54, 4 November 2022 (UTC)
- The text appeared as the context before the typo for every new typo suggested, I noted the first three. I eventually got rid of it by refreshing the page. Hopefully it won't happen again. TSventon (talk) 12:39, 4 November 2022 (UTC)
- I also got served text from the German version of the page https://en.wikipedia.org/wiki/Deister_Gate but the link was to the English one, where said text did not appear. Turtlecrown (talk) 23:39, 6 November 2022 (UTC)
- And again, the same thing with https://en.wikipedia.org/wiki/Louis_Galloche but this time being served the text of the French version. When I view the page source, I see that the original language is the hidden text in both, probably from when the article was translated. Turtlecrown (talk) 09:19, 7 November 2022 (UTC)
- @Turtlecrown, it looks like, for both of these, the text is in the article, but in a hidden comment. — Qwerfjkltalk 16:34, 7 November 2022 (UTC)
Uziel302, I have noticed that if you drag text around in the context box (and then select replace), the text reappears in the next context box. Also if you copy and paste text into the context box (and then select dismiss), that text reappears in the next context box. Also a chunk of text from Ferry Wharf was added to Eero Palm here after I cut and pasted text into Ferry Wharf. Is this fixable or is it best to avoid dragging or pasting text? TSventon (talk) 11:49, 23 November 2022 (UTC)
I've listed the recent good-faith fix attempts which may have added or removed more content than intended. Certes (talk) 15:34, 23 November 2022 (UTC)
- @Certes, you might want to filter out edits with the mw-reverted tag. — Qwerfjkltalk 16:26, 23 November 2022 (UTC)
- Done: the list now shows only edits not marked as reverted. Just 16 in the period covered, which I think is 30 days. They all look like either good and deliberate changes (false positives) or errors already fixed in a more complicated way than simple reversion. Certes (talk) 16:57, 23 November 2022 (UTC)
- TSventon, thanks for the feedback, I will investigate your scenarios and see how it can be fixed. Uziel302 (talk) 17:32, 23 November 2022 (UTC)
- TSventon, Certes, Qwerfjkl, thanks for your efforts to improve the tool. I understood that my previous use of the "contenteditable" attribute around 4 different angular variables (context before, suspect, correction, context after) is error prone, and I could reproduce the issue of copy pasting. My solution is to make the context editing a separate feature: only if you click edit context will you get the full context as one editable text that will be sent to the server as is when clicking replace. The old context box that shows the highlighted suspect and correction won't be editable anymore. Please let me know how this solution works for you. Uziel302 (talk) 07:30, 25 November 2022 (UTC)
- Uziel302, thank you, your solution seems to have resolved the problems I identified. TSventon (talk) 11:27, 29 November 2022 (UTC)
Recurring typos
In The Scottish feudal barony of Grougar, "bailliary" occurred twice, and two of us have corrected them to "bailiary". That obscure word wasn't the suggestion, and it took a bit of digging to find it. ("Bailiery" is also correct.) When an article has multiple occurrences of the same possible typo, would it be possible to offer each of them in turn for correction, rather than just the first one – either always, or only if the user clicks Replace? Please don't take any of my ideas as criticism; this is a great tool even without any further improvements but, if it's easy, let's do it! Certes (talk) 21:37, 4 December 2022 (UTC)
- Correction: I think it already does come up multiple times, but always presents the first context. For example, Tunggal panaluan presented "panaluan" seven times, but always with the same (first) context. (That word's in the article title, so an automatic Dismiss.) Certes (talk) 22:30, 4 December 2022 (UTC)
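Presenting each occurrence with its own context, rather than always the first, could look roughly like the Python sketch below. It is an illustration only: how the tool actually locates the suspect word is not shown in this discussion, and the function name `occurrences_with_context`, the `width` parameter and the sample sentence are all hypothetical.

```python
import re


def occurrences_with_context(text, word, width=20):
    """Yield (offset, context) for every whole-word occurrence of `word`,
    so that repeated typos are each shown with their own surroundings."""
    pattern = r"(?<!\w)" + re.escape(word) + r"(?!\w)"
    for match in re.finditer(pattern, text):
        lo = max(0, match.start() - width)
        hi = min(len(text), match.end() + width)
        yield match.start(), text[lo:hi]


sample = "The bailliary of Kyle was old. Later the bailliary passed to his heirs."
hits = list(occurrences_with_context(sample, "bailliary", width=12))
for offset, context in hits:
    print(offset, repr(context))
```

Carrying the offset along with the context would also let a replace step act on the specific occurrence the editor reviewed rather than on the first match in the article.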
Random text added
See Special:Diff/1120155490. — Qwerfjkltalk 07:04, 11 November 2022 (UTC)
- Qwerfjkl, the text appears in Fossa dei Leoni, which you corrected at 13:02 on 5 November 2022. "In 1972 they moved from ramp 18" appears in another ten articles that you corrected between then and 13:11 on 5 November 2022, can you check and correct them? TSventon (talk) 10:36, 11 November 2022 (UTC)
- The text in the edit is based on the context box, sounds like in some cases some of the context gets stuck there, but for now I don't see a way to reproduce it. Uziel302 (talk) 14:01, 11 November 2022 (UTC)
- Qwerfjkl, I went over the edits of the toolforge tool that had a change bigger than 3 bytes; only you had this bug. I made one revert that you missed. I couldn't find the reason for it. I can guess that somehow your browser kept this line inside the context box; please try to reproduce it and check whether you indeed manage to get wrong content in the context box. Thanks. Uziel302 (talk) 06:40, 13 November 2022 (UTC)
- Qwerfjkl, I edited my reply. Uziel302 (talk) 08:50, 13 November 2022 (UTC)
- There was a change in the code to prevent such cases, as discussed in another paragraph here. Uziel302 (talk) 04:47, 13 December 2022 (UTC)
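The two review heuristics mentioned on this page, looking at tool edits whose size change exceeds 3 bytes and leaving out edits already tagged mw-reverted, can be combined in a short Python sketch. It assumes each edit is available as a dictionary with `sizediff` and `tags` fields; those field names, the function name `suspicious_edits` and the sample records are assumptions for illustration, not the queries actually used.

```python
def suspicious_edits(edits, max_bytes=3):
    """Keep edits whose size change exceeds `max_bytes` (in either direction)
    and that are not already tagged as reverted."""
    return [
        edit for edit in edits
        if abs(edit["sizediff"]) > max_bytes
        and "mw-reverted" not in edit.get("tags", [])
    ]


# Made-up sample records; the revision ids are placeholders.
sample_edits = [
    {"revid": 1, "sizediff": 1, "tags": []},                 # ordinary one-letter fix
    {"revid": 2, "sizediff": 47, "tags": []},                # much larger than a typo fix
    {"revid": 3, "sizediff": 52, "tags": ["mw-reverted"]},   # large, but already reverted
    {"revid": 4, "sizediff": -12, "tags": []},               # removed more than expected
]

for edit in suspicious_edits(sample_edits):
    print(edit["revid"], edit["sizediff"])
```

A typo fix normally changes only a few bytes, so the threshold is a cheap first pass; the edits it flags still need a human look, since a deliberate larger correction would also be listed.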
Manual correction
The suggested replacement is often correct but, when a different word is required, it usually resembles the original typo rather than the suggested replacement. For example, the typo is "kiled", the suggestion is "kidel", but I want to insert "killed". It is almost always quicker, easier and safer to amend the original typo (inserting the missing L) rather than the suggested replacement (replacing D by LL and L by D).
Would it be sensible for the tool to show the original word, rather than the suggestion, in an edit box? I can see a UI problem: does the Replace button insert the suggestion or the content of the edit box (or the context)? I'm not sure of the best solution for that.
This point may also apply to the new "edit context" feature (which I don't use for many edits but occasionally find very useful indeed). When editing the context, we'd want to start from the original text (Cain kiled Abel) rather than have it already amended by the suggestion (Cain kidel Abel).
Apart from the UI confusion, this change could provide a useful improvement with not much work. Any thoughts? Certes (talk) 18:27, 30 November 2022 (UTC)
- Certes, when you click type, you get the original word. I will change the edit context to also leave original context. Is there anywhere else that should keep original word? Uziel302 (talk) 14:16, 1 December 2022 (UTC)
- D'oh, I should have thought to click Type! Thanks, that does the job for me. The only other oddity I can see is in the edit summary, where I think the suggested replacement is added even if something else is typed in context, but that's a minor cosmetic issue. Certes (talk) 14:21, 1 December 2022 (UTC)
- ...and I've tested the "All the typos were handled!" message too :) Certes (talk) 14:36, 1 December 2022 (UTC)
- Certes, I just uploaded to the server the fix that leaves the original context after clicking edit context. I added to my tasks an option to change the edit summary when the context is edited. As for all typos being handled, that is not quite accurate: 1647 of them were simply skipped, and I brought those back (but I think two rounds of skipping is enough to consider them dismissed). Uziel302 (talk) 20:09, 4 December 2022 (UTC)
- Is anyone else seeing the buttons (Dismiss, Skip, Replace, Add typo template) greyed out? Certes (talk) 20:38, 4 December 2022 (UTC)
- ...if so, just log in again. I'd somehow got logged out of the tool. (Perhaps a new version expires the OAuth; it would be a sensible defence against software that might get replaced by malware.) Certes (talk) 20:50, 4 December 2022 (UTC)
- Certes, when I replace Angular files, it doesn't affect login sessions, but when I make changes to Node.js I must restart the process, and then the sessions are removed. There are better ways to handle sessions (with some external tools), but for now I don't see a reason to focus on that. Our bottleneck is new list generation: the amount of output is going down and is being dealt with faster than the dump cycles. Uziel302 (talk) 04:54, 13 December 2022 (UTC)
- I agree. Logging in again is no problem at all. I was simply wondering why the buttons had gone, and it took me a few minutes to realise that I had accidentally logged out. Certes (talk) 09:37, 13 December 2022 (UTC)
Script tells lies
Example: For article Eastern Chalukyas, it suggested "vengi->venge? (replace)". "Vengi" is correct, but should be capitalised. I used the "type" option to replace "vengi" with "Vengi", but it replied that it couldn't find the problem, and had removed the suggestion. My user contributions list says "vengi was not fixed". Checking the article, "vengi" (lowercase) was still there.
This happens to me quite often, or so it feels. It would be super useful if the script printed a log of what it (thinks it) has done, so I could post it here ("nyah, nyah, definite bug"). Sorry if this has already been discussed above: TL;DR. -- Doktor Züm (talk) 14:17, 14 December 2022 (UTC)
- Doktor Züm, the script looks for the exact "context before" line, and a space was removed between the scan and your edit, so it failed to find the exact context line. The new tool on Toolforge doesn't rely on the context at scan time; it gets the context in real time, so it doesn't have this problem. Sorry for the inconvenience and thanks a lot for all your hard work in the project. Uziel302 (talk) 09:10, 7 January 2023 (UTC)
- Cheers! -- Doktor Züm (talk) 10:28, 7 January 2023 (UTC)
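The failure described above can be illustrated with a minimal sketch. It is not the project's script: the function names, the sample sentence and the `width` parameter are hypothetical, and it only assumes that the old script searches for a context string saved at scan time while the newer tool reads the context from the current article text.

```python
import re
from typing import Optional


def locate_by_stored_context(article: str, stored_context: str) -> int:
    """Return the offset of the context saved at scan time, or -1 if it is gone."""
    return article.find(stored_context)


def context_in_real_time(article: str, typo: str, width: int = 10) -> Optional[str]:
    """Build the context from the current text instead of a saved copy."""
    match = re.search(rf"\b{re.escape(typo)}\b", article)
    if match is None:
        return None
    start = max(0, match.start() - width)
    end = min(len(article), match.end() + width)
    return article[start:end]


# Illustrative text only: the saved context has a double space that a later
# edit removed from the article, so the exact lookup fails.
stored = "capital of  vengi was"
current = "The capital of vengi was moved."
exact_offset = locate_by_stored_context(current, stored)
live_context = context_in_real_time(current, "vengi")
```

Here the exact lookup returns -1 even though the word is still in the article, while the real-time lookup still finds it.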
More misspellings
Good news: we're out of typos again! If anyone is looking for inspiration before the next run, I've created User:Certes/Misspellings out of past versions of WP:Database reports/Linked misspellings. It mainly covers proper nouns which are beyond the scope of this project. I continue to check that report daily, and it's been very fruitful, but I've fallen behind and my user page represents the backlog. The list comes with the usual caveats: the word may be correct or correctly quoting a semi-literate journalist, and some actual typos will already have been corrected. Although presented as wikilinks, the words deserve attention wherever they appear, usually without square brackets, and including lower case for words such as "alfabet". If you do find the list useful, please start at a random point, and remove or strike the cases you've checked, so others don't duplicate your work. Thanks and happy fixing, Certes (talk) 11:52, 5 December 2022 (UTC)
- Certes, I took your hint, and expanded the list of words to include proper nouns (but currently only in lower case). Also, to do so I looked at the link of words by frequency and found there a project listing the words by frequency in Wikipedia, which is much more relevant than Project Gutenberg. Here is the resulting code to generate suspect typos: variant-generator.py. I ran the script in the old fashion, first because I had it ready in Python (the script for toolforge was in C), and second because I realized some people still use the old format and it has its own advantages, especially for skimming words without having to react to each of them. Uziel302 (talk) 09:29, 7 January 2023 (UTC)
- Thanks! I was thinking of proper nouns as a separate, small, manual project for month ends. I'm not sure how they could integrate into CTIOC software but I will be interested to see what you've done. The word frequency work looks very promising and I look forward to seeing the results. It will be very helpful on finding "wiht" (for example) to suggest "with" as the more common correction rather than "whit" because it sorts earlier alphabetically. Certes (talk) 11:53, 7 January 2023 (UTC)
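The frequency-based choice discussed above could look roughly like the following sketch. It is not taken from variant-generator.py; the function name is hypothetical and the counts are invented purely for illustration.

```python
from typing import Iterable, Mapping


def best_correction(candidates: Iterable[str], frequency: Mapping[str, int]) -> str:
    """Pick the most frequent candidate; fall back to alphabetical order on ties."""
    return min(candidates, key=lambda word: (-frequency.get(word, 0), word))


# Invented counts, not real Wikipedia word frequencies.
sample_frequency = {"with": 1_000_000, "whit": 500}
suggestion = best_correction(["whit", "with"], sample_frequency)
```

With these counts the suggestion for "wiht" is "with", although "whit" sorts earlier alphabetically.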
Issue with edit summary
Yesterday I first used this tool and have become a fan of it since. However, I noticed an issue, see Special:Diff/1140155376. The tool did not think that "superin- tendent" could be a single word, and insisted that it should be "tendent"=>"tencent". So, I was able to fix the actual typo by using the "Edit context" option directly; however, the edit summary given out is wrong. It would be great if it could be fixed. Thank you! —CX Zoom[he/him] (let's talk • {C•X}) 18:55, 18 February 2023 (UTC)
- Another example: Special:Diff/1140341078 —CX Zoom[he/him] (let's talk • {C•X}) 17:24, 19 February 2023 (UTC)
- Ideally, the edit summary should say what replacement was made. If that is too difficult, then at the very least there should be a question mark after the suggested replacement. —Kusma (talk) 20:57, 20 February 2023 (UTC)
- I presume the edit summary is generated from the replacement box, not the context box, so it just used the default summary. A generic "typo fixed" summary might be useful here. — Qwerfjkltalk 16:41, 21 February 2023 (UTC)
- "Typo fixed" works, or do a very naïve diff on the old and new text: show the removed and added texts, along with any unchanged letters immediately before and after, e.g. if I change "Smith scord the winnin goal" to "Smith scored the winning goal" then summarise as "scord the winnin→scored the winning", including the changed text ("d the winnin→ed the winning"), letters immediately before ("scor") and letters immediately afterwards (none in this case). Certes (talk) 19:25, 21 February 2023 (UTC)
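The "very naïve diff" suggested by Certes can be sketched as below. This is one possible reading of that description, not the tool's code: the function name is hypothetical, and "letters" is taken to mean characters for which `str.isalpha()` is true.

```python
def summarize_change(old: str, new: str) -> str:
    """Summarise an edit as 'removed→added', widened to whole words."""
    if old == new:
        return ""
    limit = min(len(old), len(new))
    # Length of the unchanged text at the start.
    prefix = 0
    while prefix < limit and old[prefix] == new[prefix]:
        prefix += 1
    # Length of the unchanged text at the end, without overlapping the prefix.
    suffix = 0
    while suffix < limit - prefix and old[-1 - suffix] == new[-1 - suffix]:
        suffix += 1
    # Include unchanged letters immediately before the change.
    start = prefix
    while start > 0 and old[start - 1].isalpha():
        start -= 1
    # Include unchanged letters immediately after the change.
    tail = suffix
    while tail > 0 and old[len(old) - tail].isalpha():
        tail -= 1
    return f"{old[start:len(old) - tail]}→{new[start:len(new) - tail]}"


summary = summarize_change("Smith scord the winnin goal", "Smith scored the winning goal")
```

For the example in the thread this gives "scord the winnin→scored the winning", and for "Cain kiled Abel" corrected to "Cain killed Abel" it gives "kiled→killed".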