Wikipedia:Bot requests/Archive 77


Misplaced brace

In this diff, I replaced } with |. As a result, the article went from [[War Memorial Building (Baltimore, Maryland)}War Memorial Building]] to [[War Memorial Building (Baltimore, Maryland)|War Memorial Building]], and the rendered appearance went from [[War Memorial Building (Baltimore, Maryland)}War Memorial Building]] to War Memorial Building. Is a maintenance bot already doing this kind of repair, and if not, could it be added to an existing bot? Nyttend (talk) 13:39, 21 June 2018 (UTC)

Probably best we get an idea of how common this is before we look at doing any kind of mass-repair (be it a bot or adding it to AWB genfixes). Could someone look through a dump for this sort of thing? I'm happy to do it if nobody else gets there first. ƒirefly ( t · c · who? ) 20:02, 21 June 2018 (UTC)
This has some false positives but also some good results. Less than 50. -- GreenC 15:56, 24 June 2018 (UTC)
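
A minimal sketch of the kind of dump scan discussed above, assuming each page's wikitext is already in hand; the pattern and helper function are illustrative, not part of any existing bot or AWB genfix:

    import re

    # A piped wikilink whose pipe was typed as a closing brace:
    # [[Target}Label]] instead of [[Target|Label]].
    MISPLACED_BRACE = re.compile(r'\[\[([^\[\]|{}\n]+)\}([^\[\]|{}\n]+)\]\]')

    def fix_misplaced_braces(wikitext):
        """Replace [[Target}Label]] with [[Target|Label]]."""
        return MISPLACED_BRACE.sub(r'[[\1|\2]]', wikitext)

    sample = '[[War Memorial Building (Baltimore, Maryland)}War Memorial Building]]'
    print(fix_misplaced_braces(sample))
    # [[War Memorial Building (Baltimore, Maryland)|War Memorial Building]]

Any real run would still need the human review GreenC's search results got, since a stray } inside a link is not always a mistyped pipe.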

Take over GAN functions from Legobot

Legobot is an enormously useful bot that performs some critical functions for GAN (among other things). Legoktm, the operator, is no longer able to respond to feature requests, and is not very active; they've asked in the past if someone would be willing to take over the code. I gather from that link that the code is PHP; see here [1]. There would be a lot of grateful people at GAN if we could start addressing a couple of the feature requests, and if we had an operator who was able to spend more time on the bot. This is not to criticize Legoktm at all -- without their work, GAN could not function; Legobot is a core part of GAN functionality.

I left a note on Legoktm's talk page asking if they would mind a request here for a new operator, and Redrose64 responded there with a link to the note I posted above, so I think it's clear they'd be glad for someone else to pick this up. Any takers? Mike Christie (talk - contribs - library) 23:10, 6 February 2018 (UTC)

I've heard from Legoktm and they would indeed be glad to have someone else take this over. If you're capable in PHP, this is your chance to operate a bot that's critical to a very active community. Mike Christie (talk - contribs - library) 00:21, 8 February 2018 (UTC)
I would like to comment that it would be good to expand the functionalities of the bot for increased automation, like automatically adding to the GA lists. Perhaps it would be better to rewrite the bot in a different language? I think Legoktm has tried to get people to take over the PHP for a while with no success. Kees08 (Talk) 04:44, 8 February 2018 (UTC)
The problem with adding to the GA lists is knowing which one. There is no indication on the GAN as to where. All we have is the topic. Even the humans have trouble with this. Hawkeye7 (discuss) 20:20, 16 February 2018 (UTC)
To correct for the past, we could add a parameter to the GA template for the 'subtopic' or whatever we want to call that grouping. A bot could go through the current listing and then add that parameter to the GA template. Then, when nominating, that could be in the template, and the bot could carry that through all the way to automatically adding it to the GA page at the end. Kees08 (Talk) 20:23, 16 February 2018 (UTC)
Nominators would need to know those tiny divisions within the subtopics; as it's not something we have on the WT:GAN page, I doubt most are even aware of the sub-subtopics. Even regular subtopics are sometimes too much for nominators, who end up leaving that field blank when creating their nominations. BlueMoonset (talk) 22:15, 26 February 2018 (UTC)

@Hawkeye7: For what it is worth, due to your bot's interactions with FAC, I think it would be best if you took over the GA bot as well. I think at this point it is better to just write a new bot than salvage the old bot; no one seems to want to work on salvaging. Kees08 (Talk) 21:59, 26 February 2018 (UTC)

We'd need to come up with a full list of functionality for whoever takes this on, not only what we have now but what we're looking for and where the border conditions are. BlueMoonset (talk) 22:15, 26 February 2018 (UTC)

I might be interested in lending a hand. A features list and functionality details (as mentioned by BlueMoonset) would be nice to affirm that decision though. I shall actively watch this thread. --TheSandDoctor (talk) 21:30, 11 March 2018 (UTC)

Okay, I will attempt to list the features, please modify as needed:

  • Place notifications on the nominator's talk page when their nomination is onreview diff, onhold diff, passed diff, failed
  • Update GAN page when status of a review changes (new, on hold, on review, passed, failed, also number of reviews editors have performed) diff
  • Update the stats page (related to the last bullet point, this is where the stats are stored) diff
  • Transcludes GA review on article talk page diff
  • Adds GA icon to articles that pass diff
  • Adds the oldid parameter to the GA template diff

@BlueMoonset: Are you aware of other functions? Looking through the bot's edit history and going off of what I know of the bot, this is what I came up with. Kees08 (Talk) 22:10, 11 March 2018 (UTC)

Thanks Kees08. Does anyone know if it would be possible to take a look at the database structure? --TheSandDoctor (talk) 22:28, 11 March 2018 (UTC)
@Legoktm: Are you able to answer their question? Thanks! Kees08 (Talk) 23:43, 11 March 2018 (UTC)
TheSandDoctor, it's great that you're interested in this. Kees08, the second item (about updating the GAN page) is much broader. I don't know whether the bot simply updates the GAN page or generates/recreates the contents of all sections on it. It's basically dealing with all the GA nominee templates out there—which indicate what is currently a nominated article that belongs on the GAN page, and the changes to that page. If an entry wasn't on the page last time but is this time, then it's considered new; if it was on last time but a review page has appeared for it, then it's considered under review and the review page is parsed for reviewer information (but if the GA nominee template instead says "status=onhold", then it's noted as being on hold)... there's a lot involved, including cross-checking, and any field in the GA nominee template, including page number, subtopic, status, and note, can change at any time. If the GA nominee template has disappeared and a GA template is there for that same "page" number, then it has passed; if a FailedGA is there for that same "page" number, then it has failed (but the current bot's code doesn't check this properly, so any FailedGA template on the talk page results in the "failed" message being sent to the nominator even if the nomination was just passed with a new GA template showing). Sometimes review pages disappear when they were created illegally or by mistake and are speedy deleted, and the bot realizes their absence and updates the GAN page accordingly, so it's a comprehensive check each time the bot runs (currently every 20 minutes). If the bot doesn't know how to characterize the change it has found, it appears under an edit summary of "Maintenance": status changes to 2ndopinion go here, as do passes and failures where there was something wrong with the article talk page according to its lights. For example, it views with suspicion any talk page of a nomination under review that doesn't have a transcluded review on it, so it doesn't send out pass or fail messages for them (and maybe not even hold messages; I've never checked that).
There's a difference here between features and functionality. I think the features (with the exception of the 2ndopinion status and the display of anything in the "notes" field of GA nominee) have been listed here. The actual functions—how it needs to work and what it needs to check—are harder to break down. One thing that was mentioned above is the use of subtopics: we have been unable to add new subtopics for several years now, so new subtopics on the GA page are not yet available on the GAN page. I'm not sure how the bot gets its list of subtopics—I've found more than one possible page where they could be read from, but there may be a database for subtopics and the topics they come under that actually controls them, with the pages I've found being a place for some templates, like GA, FailedGA, and Article history, to figure out what subtopics translate to which topics, and which subtopics are legitimate. GA nominee templates that have invalid subtopics or missing status or note fields (or other glitches) can cause the bot to try every 20 minutes to enter or update a nomination/review and fail to do so; there are times when a transaction is listed dozens of times, one bot run after another, as the GAN edit summary because it needs to happen, but it ultimately doesn't (until someone sees the problem and fixes the problematic GA nominee template or GA review page). I'm hoping any new bot will be smarter about how to handle these (and many other) situations, and maybe there will be an accessible error log to aid us in determining what's wrong. BlueMoonset (talk) 00:55, 12 March 2018 (UTC)
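
For whoever picks this up, here is a rough sketch (in Python, since that is the proposed port target) of the pass/fail classification BlueMoonset describes, including the same-page-number check the current bot gets wrong; the function name and the shape of the stored state are assumptions, not Legobot's actual code:

    def classify_change(prev, nominee, review_exists, ga_page, failedga_page):
        """Classify one nomination between two bot runs.

        prev          -- state stored from the last run, or None if unseen
        nominee       -- parsed GA nominee template as a dict, or None if removed
        review_exists -- whether the /GA<n> review page exists
        ga_page       -- page number on a GA template on the talk page, if any
        failedga_page -- page number on a FailedGA template, if any
        """
        if nominee is not None:
            if prev is None:
                return 'new'
            if nominee.get('status') == 'onhold':
                return 'onhold'
            if review_exists and prev.get('status') != 'onreview':
                return 'onreview'
            return 'unchanged'
        # GA nominee is gone: check templates for the SAME page number,
        # not just any FailedGA on the talk page (the bug noted above).
        page = prev.get('page') if prev else None
        if ga_page is not None and ga_page == page:
            return 'passed'
        if failedga_page is not None and failedga_page == page:
            return 'failed'
        return 'maintenance'  # cannot classify; flag for a human
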
Yeah there is a lot in the second bullet point I did not include diffs for, on account of me being lazy. I will try to do that tonight maybe. I tried to limit what I said to the current functionality of the bot and not include my wishlist of new things, including revamping how subtopics are done. There was an error log at some point in time (located here), not sure when we stopped using that, and if it was on purpose or not. Kees08 (Talk) 01:18, 12 March 2018 (UTC)
@TheSandDoctor: Just giving you a ping in case this slipped off your radar. Kees08 (Talk) 07:48, 20 March 2018 (UTC)
Thanks for the ping Kees08. I had not forgotten, but was waiting for other responses. I am still interested (and might be able to port it to Python); we just need to get Legoktm involved in the discussion. --TheSandDoctor Talk 15:31, 20 March 2018 (UTC)
Kees08 BlueMoonset I have started to (somewhat) work on a Python port of the GAN task. There are some libraries that can be taken advantage of to (hopefully) reduce the number of lines and simplify it, etc. --TheSandDoctor Talk 22:49, 20 March 2018 (UTC)
That's great, TheSandDoctor. I'm very happy you're taking this one on. There are some places where the current code doesn't do what it ought. Here are a few that I've noticed:
  • As mentioned above, even if the review has just concluded with the article being listed as a GA, if the article talk page also has a FailedGA template on it from a prior nomination that was not successful, the bot will send out a "Failed" message rather than a "Passed" message.
  • If a subtopic isn't capitalized exactly right, the nomination is not added to the GAN page even though the edit summary claims it is; for example, the subtopic "songs" isn't written as "Songs", which prevents the nomination from being added to the page until it is fixed.
  • If a GA nominee template is missing the status and/or note fields, a new review is not added to the template, even though it is (ostensibly) added to the GAN page. One example: Abdul Hamid (soldier) was opened for review and appeared on the GAN page as under review, but in actuality, the review page was transcluded but the GA nominee status was not updated because the GA nominee template was missing the "note" field; only after that was manually added did the bot add the "onreview" status. It would make so much more sense for the bot to add the missing field(s) to GA nominee and proceed with adding the status to the template (and the transclusion of the review page on the talk page), instead of leaving its process step incomplete.
  • When an editor opens a GA review, the bot will increment the number of reviews they have, and it will adjust this number on all nominations and reviews that editor has open. Unfortunately, not only does it produce an edit summary that lists the new review, it also includes those other reviews in the edit summary because of that incremented number, when nothing new has happened to the other reviews. This was a problem before, and it's gotten much worse now that edit summaries can be 1024 characters rather than 128 or 256. For example, when Iazyges opened a GA review of Jim Bakker, the edit summary overflowed the 1024 characters, and it shouldn't have; the Bakker review was the only one that should have been listed for Iazyges.
I'm sure there are others; I'll try to think of them and let you know. Thanks again for taking this on. BlueMoonset (talk) 04:52, 21 March 2018 (UTC)
@BlueMoonset: Thanks! At the moment I am just trying to get the current code ported, but once I am confident that it should work, I will see about the rest. (The main issue, of course, is that I cannot actually test/run the ported script; it isn't ready for that stage yet, but once it is, the most I could do would be to output diffs to text files instead of saving, as I don't have bot access, etc.; Lego needs to be a part of these discussions at some point as they involve their bot.) --TheSandDoctor Talk 05:17, 21 March 2018 (UTC)
@BlueMoonset:@Kees08: I have emailed Legoktm requesting a glimpse at the database structure. --TheSandDoctor Talk 16:00, 27 March 2018 (UTC)
TheSandDoctor, that's excellent news. I hope you hear back soon. Incidentally, I noticed that Template:GA/Subtopic was modified by Chris G, who was the GAN bot owner (then called GAbot) prior to Legoktm, back when the Warfare subtopic "War and military" was changed to "Warfare", so I imagine this is one of the files that might need to be updated if/when the longstanding requests are granted to update/expand the subtopics at GAN to break up some of the single-subtopic topics (something that's already been done at WP:GA). In particular, the Warfare topic/subtopic and the Sports and recreation topic/subtopic have been on our wishlist for several years, but Legoktm never responded to multiple requests; the last changes we had were under Chris G before his retirement in 2013. I don't know whether Template:GA/Topic is involved; there are also the underlying Module:Good article topics and the data it loads at Module:Good article topics/data, which would need to be updated when topics and/or subtopics are revised or added to. BlueMoonset (talk) 23:26, 28 March 2018 (UTC)
Hi there BlueMoonset, I was waiting to hopefully hear from Lego, but have not. Exams have delayed my progress in this (and will continue to do so until next week), but unfortunately, even when I have the bot converted, there is no guarantee it would work (at first), as I don't have a way to test it nor do I have access to the existing database etc. I could probably figure out what the database looks like from the code, but the information contained within would be very useful (especially to get it up and running). It is still unclear if I would gain access to Legobot or have to make a "Legobot 2" (or similar). (cc Kees08) --TheSandDoctor Talk 01:05, 20 April 2018 (UTC)
TheSandDoctor, I don't know what you can do at this point, aside from pinging Legoktm's email account again. I know that Legoktm would like to give over the responsibility for this code, but doesn't seem to be around Wikipedia enough any more to give the time necessary to help achieve such a transition. With luck, one of these pings will eventually catch them when they have time and energy to make it happen. I do hope you hear something soon. BlueMoonset (talk) 03:42, 20 April 2018 (UTC)
@TheSandDoctor: What database are you looking for? This may be a dumb question... but we identified where the # of reviews/user list was, and the GA database is likely just from the GA page itself. Is there another database you are looking for? Kees08 (Talk) 00:04, 22 April 2018 (UTC)
Hi there and sorry for the delay in my response Kees08, Legobot uses its own database to keep track of the page states (to know if they have changed). Having access, or at least an outline of the structure, would speed things up somewhat, as I would not have to regenerate the database and could have it clarified what exactly is stored about the pages etc. It is not a necessity, but it would be a nice convenience, especially if I am to take over the bot's functions and maintenance, to have access to its database (or at least a "snapshot" of its structure). As for further development on the translation to Python, once finals are wrapped up (by Tuesday PST), I should hopefully have more time to dedicate to working on it. In the meantime, I have an important final in between me and programming. I shall keep everyone updated here. I still foresee an issue with verifying that the bot works as expected though, due to the lack of available testing and a bot account to run it on. Things will sort themselves out in the next while, I am sure. Minus editing, I could always check if it "compiles"/runs, and could probably work in a dry-run framework similar to my other projects (where they go through the motions without making actual edits, printing to local text file(s) instead). --TheSandDoctor Talk 05:44, 22 April 2018 (UTC)
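
A minimal sketch of the dry-run arrangement described above, assuming Pywikibot; the flag and log filename are made up for illustration:

    import pywikibot

    DRY_RUN = True  # flip to False once a bot account and approval exist

    def save_page(page, new_text, summary):
        """Save for real, or log the would-be edit to a local text file."""
        if DRY_RUN:
            with open('dry_run_edits.txt', 'a', encoding='utf-8') as log:
                log.write('== %s (%s) ==\n%s\n\n' % (page.title(), summary, new_text))
        else:
            page.text = new_text
            page.save(summary=summary)
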
Sounds good; no rush, just seeing if I can help you hit the ground running when you get to it. Perhaps DatGuys's config structure would help you figure out a way to do dry runs; mildly similar, you would just have to make up some pages and a database structure, to get the best dry run that is possible prior to hitting the real articles. Best of luck on your finals, and if it makes you feel any better, you will still wake up in cold sweats about them several years in the future (note to dreaming self: no, I have no finals. No, it does not matter you did not study.). Kees08 (Talk) 06:21, 22 April 2018 (UTC)
Not sure how this is going, but I have found User:GA bot/Stats to be inaccurate. It simply needs to list the number of pages created by each editor with "/GA" in the title. Most editors have less listed than they have done. It might be easier to look into this while the bot is being redone. AIRcorn (talk) 22:13, 11 May 2018 (UTC)
Can't we get the database structure from the code? Enterprisey (talk!) 04:25, 13 June 2018 (UTC)
I committed the current database structure. Let me know if you want dumps too. Legoktm (talk) 07:04, 13 June 2018 (UTC)
@Enterprisey and TheSandDoctor: Ping, in case you missed Lego's comment like I did. Thanks lego! Kees08 (Talk) 02:58, 25 June 2018 (UTC)
I totally missed that. Thank you so much Legoktm (and Kees08 for the ping)! A dump would be probably useful, though not entirely necessary. --TheSandDoctor Talk 03:55, 25 June 2018 (UTC)

Bot to change redirects to 'Redirect'-class on Talk pages?

As per edits like this one I just did, is it possible to have a bot go around and check all the extant Talk pages of redirect pages, and confirm/change all of their WP banner assessment classes to 'Redirect'-class?... Seems like this should be an easy and doable task(?)... FWIW. TIA. --IJBall (contribstalk) 15:41, 10 June 2018 (UTC)

Just ran a petscan on Category:All redirect categories, which I assume includes all redirects, but which also contains 2.9 million pages. Granted, 1.4mil of these are in Category:Unprintworthy redirects (which likely do not have talk pages of their own), and there are probably another million or so with similar no-talk-page status, but that's still a metric buttload of pages to check. Not saying it can't be done (haven't really even considered that yet), just thought I'd give an idea of the scale of this operation. Primefac (talk) 16:25, 10 June 2018 (UTC)
No one says it needs to be done "fast"!... Maybe a bot can check just a certain number of redirect pages per day on this, so it doesn't overwhelm any resources. --IJBall (contribstalk) 16:59, 10 June 2018 (UTC)
I believe the banners automatically set the class to redirect when the page is a redirect, so that would fall under WP:COSMETICBOT. Not sure what happens if |class=C is set on a redirect, but that should be easy to test. If |class=C overrides redirect detection, that would be suitable for a task. Headbomb {t · c · p · b} 03:39, 11 June 2018 (UTC)
I'm telling you, I just had to do this twice, earlier today – two pages that had been converted to redirects years ago still had |class=B on their Talk pages. It's possible that this only affects pages that were converted to redirects years ago, but it looks like there is a population of them that need to be updated to |class=Redirect. --IJBall (contribstalk) 03:47, 11 June 2018 (UTC)
Setting |class=something overrides the automatic redirect class. This should be handled by EnterpriseyBot. — JJMC89(T·C) 05:11, 11 June 2018 (UTC)
Yup, as JJMC89 mentioned, this is EnterpriseyBot task 10. It hasn't run for a while, because I let a couple of breaking API changes pass by without updating the code. I'm going to fix the code so it can run again. Enterprisey (talk!) 05:45, 11 June 2018 (UTC)
If such a task is done, it's best to either remove the classification and leave the empty parameter |class=, or remove the parameter entirely. As Headbomb and JJMC89 have noted, redirect-class is autodetected when no class is explicitly set. This is true with all WikiProject banners built around {{WPBannerMeta}} (but see note), so setting an explicit |class=redir just means that somebody has to amend it a second time if the page ceases to be a redirect.
Note: there are at least four that are not built around {{WPBannerMeta}}, and of the four that I am aware of, only {{WikiProject U.S. Roads}} autodetects redir class; the other three ({{WikiProject Anime and manga}}; {{Maths rating}}; and {{WikiProject Military history}}) do not autodetect, so for these it must be set explicitly; moreover, those three only recognise the full form |class=redirect, they don't recognise the shorter |class=redir that many others handle without problem. --Redrose64 🌹 (talk) 07:54, 11 June 2018 (UTC)
Yes, I skip anime articles explicitly, and the bot won't touch the other two troublesome templates due to the regular expressions it uses.
A bigger problem concerns the example diff that started this thread. It's from an article in the unprintworthy redirects category. I thought the bot should have gotten to that category already, so I just went in to inspect the logs. Unbelievably, after munching through all of the redirect categories, it has finally gotten stuck on exactly that category (unprintworthy redirects). Apparently Pywikibot really hates one of the titles in it. I'm trying to figure out which title precisely, so I can file a bug report, but for now the bot task is on hold.
However, all of the other redirect categories that alphabetically come before it should only contain articles that the bot checked already. Enterprisey (talk!) 20:11, 18 June 2018 (UTC)
It was actually a bug in Pywikibot, so the bot's held up until a patch is written for that. Enterprisey (talk!) 18:56, 26 June 2018 (UTC)
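
For reference, a minimal sketch of the clean-up Redrose64 suggests, blanking an explicit redirect class so the banner's autodetection takes over; the regex is illustrative, and the non-{{WPBannerMeta}} banners listed above would have to be skipped separately, as EnterpriseyBot does:

    import re

    # Banners that do NOT autodetect redirect class; leave them alone.
    # (Skipping pages that use these is left to the caller.)
    SKIP = ('WikiProject Anime and manga', 'Maths rating',
            'WikiProject Military history')

    # An explicit |class=redirect or |class=redir, any capitalisation.
    CLASS_PARAM = re.compile(r'(\|\s*class\s*=\s*)redir(?:ect)?\s*(?=[|}])',
                             re.IGNORECASE)

    def blank_redirect_class(talk_wikitext):
        """Blank the class so {{WPBannerMeta}} banners autodetect it."""
        return CLASS_PARAM.sub(r'\1', talk_wikitext)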

Potentially untagged misspellings

Hi! Potentially untagged misspellings (configuration) is a newish database report that lists potentially untagged misspellings. For example, Angolan War of Independance is currently not tagged with {{R from misspelling}} and it should be.

Any and all help evaluating and tagging these potential misspellings is welcome. Once these redirects are appropriately identified and categorized, other database reports such as Linked misspellings (configuration) can then highlight instances where we are currently linking to these misspellings, so that the misspellings can be fixed.

This report has some false positives and the list of misspelling pairs needs a lot of expansion. If you have additional pairs that we should be scanning for or you have other feedback about this report, that is also welcome. --MZMcBride (talk) 02:58, 15 June 2018 (UTC)

Oh boy. Working with proper names is tricky: variations are often 'correct', and usage is context dependent, so a bot shouldn't decide. My only suggestion is to skip words that are capitalized. For the rest, use something like approximate (fuzzy) matching to identify paired words that are only slightly different due to spelling (experiment with the agrep threshold without creating too many false positives), then use a dictionary to determine if one of the paired words is a real word and the other not. At that point there might be a good case for it being a misspelling and not an alternative name. This is one of those problems computers are not good at and is messy. Unless there is an AI solution. -- GreenC 14:23, 17 June 2018 (UTC)
Spelling APIs, some AI-based. -- GreenC 01:44, 28 June 2018 (UTC)
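
A minimal sketch of the pairing heuristic GreenC outlines, with difflib standing in for agrep; the cutoff and the tiny dictionary are placeholders to experiment with:

    import difflib

    def likely_misspelling(word, dictionary, cutoff=0.85):
        """Return the real word `word` is probably a misspelling of, if any."""
        if word[0].isupper() or word.lower() in dictionary:
            return None  # skip proper names and correctly spelled words
        matches = difflib.get_close_matches(word.lower(), dictionary,
                                            n=1, cutoff=cutoff)
        return matches[0] if matches else None

    dictionary = ['independence', 'government']  # stand-in for a real word list
    print(likely_misspelling('independance', dictionary))  # -> independence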

Alexa rankings / Internet Archive

This isn't really a bot request, in the sense that this doesn't directly have anything to do with the English Wikipedia and no pages will be edited (no BRFA is required), but I'm putting it here nonetheless because I don't know of a better place and it's used 500% more than Wikidata's bot requests page. However, it will benefit both Wikidata and Wikipedia.

I have been archiving (with wget and a list of URLs) a lot of alexa.com pages onto the Internet Archive and archive.is, currently about 75,000 daily (all the same pages). This was originally supposed to be for Wikidata and would have been done once a month on a lot more URLs, but that hasn't materialized. Unfortunately maintaining this automatically would be beyond my rudimentary shell script skills, and to run it as I am doing currently would require time which I do not have.

Originally d:User:Alexabot did this based on some URLs from Wikidata, but the operator seems to have vanished after being harangued on Wikidata's project chat because he added the data to items which were not primarily websites. It follows that in the absence of an established process to add values for the property to Wikidata, the archiving should be done separately, with the data to be harvested where needed. Module:Alexa was to have been used with the data, but the bot only completed three runs so it would be outdated at best, and the Wikidata RFC might end up restricting its use.

Could someone set their Unix-based computer, and/or their bit of the WMF cloud servers, to

  • once a day, archive (to the Internet Archive) and/or download several lists of domain names (e.g. those used on Wikipedia and Wikidata; from CSV files which are sitting on my computer; lists of the top 1 million websites) and combine the lists
  • format those domain names with the regular expression below
  • once a month (those below about ~100,000 in rank) or daily/weekly (those ~100,000 and above), archive (to the Internet Archive or archive.is) all of the URLs (collected on a given day) between 17:10 UTC and 16:10 UTC the day after (Alexa seems to refresh data erratically between 16:20 and 17:00 each day, independent of daylight saving time)
    • wget allows archival of lists of websites; use -i /path/to/file and -o /path/to/file flags for archival and logging respectively
  • possibly, as an unrelated process, download the archived pages using URL format https://web.archive.org/web/YYYYMMDD054000/https://www.alexa.com/siteinfo/url.website (where YYYYMMDD is some date) and then harvest the data (Unix shell script regular expressions are almost entirely sufficient)
    • alternatively, just download directly from alexa.com around the same time (see below)
https://web.archive.org/save/https://www.alexa.com/siteinfo/$1
https://web.archive.org/save/https://traffic.alexa.com/graph?o=lt\&y=t\&b=ffffff\&n=666666\&f=999999\&p=4e8cff\&r=1y\&t=2\&z=30\&c=1\&h=150\&w=340\&u=$1
https://web.archive.org/save/https://traffic.alexa.com/graph?o=lt\&y=q\&b=ffffff\&n=666666\&f=999999\&p=4e8cff\&r=1y\&t=2\&z=0\&c=1\&h=150\&w=340\&u=$1
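
For illustration only, a minimal sketch of the per-URL save step in Python rather than wget, assuming the domain list is already assembled; the /save/ endpoint is the one given above, and the pacing is a guess at staying under the per-IP limits described in the caveats below:

    import time
    import requests

    SAVE = 'https://web.archive.org/save/https://www.alexa.com/siteinfo/{}'

    def archive_domains(domains, delay=2.0):
        """Request a Wayback snapshot of each siteinfo page, politely paced."""
        for domain in domains:
            try:
                r = requests.get(SAVE.format(domain), timeout=120)
                print(domain, r.status_code)
            except requests.RequestException as err:
                print(domain, 'failed:', err)
            time.sleep(delay)  # crude throttle against per-IP limits

    archive_domains(['example.com'])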

Caveats:

  • The Wikidata property currently uses the URL access date as the point in time, instead of the date that the data was collected (one day and 17 hours before a given UTC time), and does not require an archive URL or even a date. This might be fine for Google's Wikidata item since it will be number 1 until the end of time, but for everything else it will need to be fixed at some point
  • If you don't archive the graph images at the same time (or you archive pages too quickly), alexa.com will start throttling connections from the Internet Archive and you will be unable to archive /siteinfo/* for about a week
  • web.archive.org does not allow a large number of incoming connections per IP for either upload or download (only tested with IPv4 addresses – might be better with multiple IPv6 addresses), so you may want to get around this somehow. I have been funneling connections through Tor simply because it seemed easier to configure torsocks, but this is not ideal
  • Given the connection limit, it is only possible to archive about 100,000 pages and 200,000 graphs per day per IP address (and there might be another limit on alexa.com, which I haven't tried testing)
  • You can use wget's --spider and --max-redirect flags to avoid downloading content
  • Rankings below a certain point (maybe 1 million) are probably not very useful, since the rate of change is high. The best way to check this – which I haven't tried, because it only just occurred to me – is probably to download pages straight from alexa.com while the data is being archived, and check website rankings that way.
  • Some URLs are inexplicably blocked from archival on the Wayback Machine. Those are …/facebook.com, …/blogger.com and …/camelcamelcamel.com (there may be others but I haven't found any more). archive.is (which archives page requisites server-side) seems to block repeated daily archival after a certain point but you can avoid this by using URLs which redirect to the content to be archived
    • archive.is isn't supposed to be scriptable, but I did it anyway with a Lynx script
  • Some websites inexplicably disappear from the rankings from day to day, so don't stop archiving websites simply because their ranking disappears

If you want I can send you the CSV files of archive links that I've accumulated (email me and/or my alternate account, Jc86035 (1)). I have also been archiving spotifycharts.com and if it would be of any use I've written a shell script for that website.

Notifying Cyberpower678, just because you might know something I don't due to running InternetArchiveBot (or you might be able to get the Internet Archive to do this from their servers). Jc86035 (talk) 15:02, 20 May 2018 (UTC)

Jc86035 - I read all this and understand some things, but don't really understand what the goal is. Can you explain in a sentence or two. Are you trying to archive all URLs obtained through alexa.com onto archive.org / archive.is on a daily/weekly basis? What is the connection to wikidata? What is the purpose/goal of this project? -- GreenC 15:53, 26 May 2018 (UTC)

Jc86035 - re: archive.is - there are some archive.is libraries on GitHub for making page saves, but generally when doing mass uploads you'll want to set up an arrangement with the site owner to feed them links, as it gets better results if he does the archiving in-house, as he can get around country blocks and other things. -- GreenC 15:53, 26 May 2018 (UTC)

@GreenC: Originally the primary motivation was to collect data for Wikidata and Wikipedia. Currently most Alexa.com citations do not even have an archive link (and sometimes don’t even have a date), so the data is completely unverifiable unless Alexa for some reason releases archives of their old data. Websites ranked lower than 10,000 had usually been archived about once before I started archiving. However, I don’t really know what data should be archived (I don’t know how to make a list based on Wikipedia/Wikidata outlinks and haven’t asked anyone for such a list yet), and as such have just archived pages based on other, more easily manipulable lists of websites (such as some CSV file that I found in a web search for the top 1 million websites, which is apparently monthly Alexa data), and because it’s generally difficult and tedious to maintain I’ve just gotten a script to run the same list of about 75,000 archive links at the same time every day.
archive.is seems to only archive one page every two seconds at maximum, based on its RSS feed. Since the Internet Archive is evidently capable of a much higher throughput I would rather not overwhelm archive.is with lots of data which isn’t really all that important. I might ask the website owner to archive those three pages every day, though. Jc86035's alternate account (talk) 15:21, 27 May 2018 (UTC)
Anytime a new external link is added to Wikipedia, the Wayback Machine sees it and archives it. This is done automatically daily with a system created and run by the Internet Archive. In addition, archive.is has done active archiving of all links, though I am not sure what the current ongoing status is. Between these two, most (98%) newly added links are getting archived. I don't know what an alexa.com citation is; a Special:External links search only shows about 500 alexa.com URLs on enwiki. -- GreenC 04:14, 30 May 2018 (UTC)
@GreenC: How does the external link harvesting system work? Is the link archival performed only for mainspace, or for all pages? If an added external link has already been archived, is the link archived again? (A list could be created in user space every so often, although there would be a roughly 1/36 chance of a given page's archival being done when the pages are being changed to use the next day's data, which would make the archived pages slightly less useful.)
There are lots of pages which currently do not have Alexa ranks but would benefit from having them added, mostly the lists of websites and the articles of the websites listed (as well as lists of other things which have websites, like newspapers). It would work as a proxy for popularity and importance. Jc86035's alternate account (talk) 08:11, 7 June 2018 (UTC)
@Jc86035: NoMore404. It gets links via the IRC system, which I believe is for all spaces. Could test by adding a link to a talk page (not yet on Wayback) and check in 48hrs to see if it's on Wayback. Once a link is in the Wayback it automatically recrawls, though how often is hard to say: some pages multiple times a day, others once a year, etc.; not sure how they determine frequency. -- GreenC 12:48, 7 June 2018 (UTC)
@GreenC: I've added links to Draft:Alexa Internet and User:Jc86035/sandbox, which should be enough for testing. Jc86035's alternate account (talk) 06:18, 8 June 2018 (UTC)
Both those URLs redirect to a page already existing in the Wayback; not sure how NoMo404 and the Wayback Machine will respond. Redirects are a complication on Wayback. -- GreenC 15:42, 8 June 2018 (UTC)
@GreenC: None of the URLs have been archived. I think I'll probably stick to using the long list of URLs, although I might try putting them in the WMF cloud at some point. Jc86035 (talk) 16:19, 16 June 2018 (UTC)
Jc86035 The test URLs you used won't work; they are already archived on the Wayback. As I said above, "Both those URLs redirect to a page already existing in the Wayback". Need to use URLs that are not yet archived. -- GreenC 18:47, 16 June 2018 (UTC)

@GreenC: Okay. I've replaced those links with eight already-archived links and eight unarchived links; none of them are redirects. Jc86035 (talk) 06:39, 17 June 2018 (UTC)

Ok good. If not working 24-48hr I will contact IA. -- GreenC 14:07, 17 June 2018 (UTC)
Jc86035 - those spotify links previously exist on Wayback, archived in March. Need to find links not yet in the Wayback. -- GreenC 13:50, 19 June 2018 (UTC)
@GreenC: Eight of them (2017-12-14) were archived by me, and the other eight (2018-06-14) are too recent to have been archived by me. Jc86035 (talk) 14:41, 19 June 2018 (UTC)
@Jc86035: Clearly the links did not get archived. It still might be caused by a filter of userspace, so I added one of the links onto mainspace to see what happens. -- GreenC 01:56, 28 June 2018 (UTC)
I saved one manually to check for robots.txt or something blocking saves, but it looks OK. The one testing in mainspace: https://spotifycharts.com/regional/jp/daily/2018-06-14 -- GreenC 02:03, 28 June 2018 (UTC)
@Jc86035: NoMo404 appears to be working. I added a Spotify link into mainspace here. The next/same day it showed up on Wayback. Looks like it's only tracking links added to mainspace, not Draft or User space. -- GreenC 23:25, 29 June 2018 (UTC)
@GreenC: Thanks. I guess it should work well enough for the list articles, then. Jc86035 (talk) 23:32, 29 June 2018 (UTC)

Tag cleanup of non-free images

I noticed a couple of non-free images here on the enwiki to which a {{SVG}} tag has been added, alongside a {{bad JPEG}}. Non-free images such as logos must be kept at a low resolution and size to comply with the Fair Use guidelines. Creation of a high-quality SVG equivalent, or the upload of a better-quality PNG for these files, should not be encouraged. For this reason I ask that a bot run through the All non-free media category (maybe starting from All non-free logos) and remove the following tags when they are found:

{{Bad GIF}}, {{Bad JPEG}}, {{Bad SVG}}, {{Should be PNG}}, {{Should be SVG}}, {{Cleanup image}}, {{Cleanup-SVG}}, {{Image-Poor-Quality}}, {{Too small}}, {{Overcompressed JPEG}}

Two examples: File:GuadLogo1.png and File:4th Corps of the Republic of Bosnia and Herzegovina patch.jpg. Thanks, —capmo (talk) 18:16, 28 June 2018 (UTC)

This seems like a bad idea to me. There's no reason a jpeg version of a non-photographic logo shouldn't be replaced with a PNG lacking artifacts, or a 10x10 logo be replaced with something slightly larger should there be a use case. As for SVGs of non-free logos, that has long been a contentious issue in general. Anomie 12:37, 29 June 2018 (UTC)
I agree, particularly relating to over-compressed JPEGs. We can have small (as in dimensions) images without them being occluded by compression artefacts. ƒirefly ( t · c · who? ) 13:40, 1 July 2018 (UTC)

Cyberbot I Book report updates

The Cyberbot I (talk · contribs) bot used to update all the book reports but stopped in January 2018. It seems its owner is too caught up IRL to fix this. Can anyone help by checking the bot or the code? —IB [ Poke ] 16:13, 5 May 2018 (UTC)

You need to check with User:cyberpower678 - see Wikipedia:Bots/Requests for approval/Cyberbot I 5 - User:cyberbot I says it's enabled. There is no published code. Ronhjones  (Talk) 20:53, 14 May 2018 (UTC)
@Ronhjones: I have tried contacting Cyberpower a number of times, but he/she does not look into it anymore. Although the bot is listed as active for book status, it has stopped updating it. So somewhere it is skipping the update somehow. —IB [ Poke ] 14:30, 21 May 2018 (UTC)
@IndianBio: Sadly the original request Wikipedia:Bots/Requests for approval/NoomBot 2 has "Source code available: On request", so there is no working link to any source code. If User:cyberpower678 cannot fix the current system, then maybe the only option is to write a new bot from scratch. I see user:Headbomb was involved in the original BRFA, maybe he might have some ideas? I can think about a re-write if there's no alternative - I will need a bit more info on what the bot is expected to do. Ronhjones  (Talk) 14:57, 21 May 2018 (UTC)
The modification likely isn't very big, and User:Cyberpower678 likely has the source code. The issue most likely consists of finding what makes the bot crash/not perform, and probably updating a few API calls or something hardcoded into the bot (like a category). Headbomb {t · c · p · b} 15:01, 21 May 2018 (UTC)
Yes, I have the source, but it was modified as needed to keep it operational over time. @Headbomb: If you email me, I can email you a current copy of the source to look at.—CYBERPOWER (Chat) 15:58, 21 May 2018 (UTC)
I suppose I could, but I'm a really shit coder. Is there a reason to not make the source public? Headbomb {t · c · p · b} 16:35, 21 May 2018 (UTC)
It actually is.—CYBERPOWER (Chat) 17:03, 21 May 2018 (UTC)
@Headbomb: will you take a look at the code? I'm sorry I really don't understand the link which Cyberpower has given. I only code in Mainframe lol, but let me know what seems to be the issue. —IB [ Poke ] 08:04, 22 May 2018 (UTC)
Like I said, I'm a shit coder. This looks to be in PHP so presumably anyone that knows PHP could take over the book reports. Headbomb {t · c · p · b} 13:09, 22 May 2018 (UTC)
Someone only need to file a pull request, and I will deploy it.—CYBERPOWER (Chat) 13:42, 22 May 2018 (UTC)
I can have a look - I'm not a PHP expert by any means (I prefer Python! ;) ) but I've used it extensively in a past life. Richard0612 19:31, 22 May 2018 (UTC)
Richard0612, that will be a real help if you can do it. A lot many books are lagging in their updates. —IB [ Poke ] 12:32, 23 May 2018 (UTC)

(→) Hey @Richard0612: was wondering did you get a chance to look into the code base? —IB [ Poke ] 09:17, 4 June 2018 (UTC)

Not getting any response, so pinging @Cyberpower678: what can be done? —IB [ Poke ] 06:32, 10 June 2018 (UTC)
@Richard0612: sorry can we have any update on this? —IB [ Poke ] 15:10, 26 June 2018 (UTC)

I just happened to stumble across this discussion while visiting for something else, but I can say that the book report updates have resumed as of 28 June. --RL0919 (talk) 18:50, 2 July 2018 (UTC)

Bot to deliver Template:Ds/alert

Headbomb {t · c · p · b} 21:13, 2 July 2018 (UTC)

Ancient Greece

Not exactly bot work, but you bot operators tend to be good with database dumps. Can someone give me a full list of categories that have the string Ancient Greece in their titles and don't begin with that string, and then separate the list by whether it's "ancient" or "Ancient"? I'm preparing to nominate one batch or another for renaming (I have a CFD going, trying to establish consensus on which is better), and if you could give me a full list I'd know which ones I need to nominate (and which false positives to remove) if we get consensus.

If I get to the point of nominating them, does someone have a bot that will tag a list of pages upon request? I'll nominate most of the categories on one half of the list you give me, and there are a good number, so manually tagging would take a good deal of work. Nyttend (talk) 00:01, 6 July 2018 (UTC)

quarry:query/28035
Category Caps Notes
Category:Ambassadors of ancient Greece Lower Redirect to Category:Ambassadors in Greek Antiquity
Category:Articles about multiple people in ancient Greece Lower
Category:Artists' models of ancient Greece Lower
Category:Arts in ancient Greece Lower
Category:Athletics in ancient Greece Lower CfR: Category:Sport in ancient Greece
Category:Battles involving ancient Greece Lower
Category:Cities in ancient Greece Lower
Category:Coins of ancient Greece Lower
Category:Economy of ancient Greece Lower
Category:Education in ancient Greece Lower
Category:Eros in ancient Greece Lower Redirect to Category:Sexuality in ancient Greece
Category:Films set in ancient Greece Lower
Category:Glassmaking in ancient Greece and Rome Lower Redirect to Category:Glassmaking in classical antiquity
Category:Gymnasiums (ancient Greece) Lower
Category:Historians of ancient Greece Lower
Category:History books about ancient Greece Lower CfR: Category:History books about Ancient Greece
Category:Military ranks of ancient Greece Lower
Category:Military units and formations of ancient Greece Lower
Category:Naval battles of ancient Greece Lower
Category:Naval history of ancient Greece Lower
Category:Novels set in ancient Greece Lower
Category:Operas set in ancient Greece Lower
Category:Pederasty in ancient Greece Lower
Category:Plays set in ancient Greece Lower
Category:Political philosophy in ancient Greece Lower
Category:Portraits of ancient Greece and Rome Lower
Category:Prostitution in ancient Greece Lower
Category:Set indices on ancient Greece Lower
Category:Sexuality in ancient Greece Lower
Category:Ships of ancient Greece Lower
Category:Slavery in ancient Greece Lower
Category:Social classes in ancient Greece Lower
Category:Television series set in ancient Greece Lower
Category:Wars involving ancient Greece Lower
Category:Wikipedians interested in ancient Greece Lower
Category:Comics set in Ancient Greece Upper
Category:Festivals in Ancient Greece Upper
Category:Military history of Ancient Greece Upper
Category:Military units and formations of Ancient Greece Upper Redirect to Category:Military units and formations of ancient Greece
Category:Museums of Ancient Greece Upper
Category:Populated places in Ancient Greece Upper
Category:Transport in Ancient Greece Upper
Category:Works about Ancient Greece Upper CfR: Category:Works about ancient Greece
@Nyttend: See list above. If you tag one, I can tag the rest. — JJMC89(T·C) 04:53, 6 July 2018 (UTC)
You might also want to search for "[Aa]ncient Greek", as is Category:Scholars of ancient Greek history and Category:Ancient Greek historians. – Jonesey95 (talk) 05:06, 6 July 2018 (UTC)

Fix station layout tables

In the short term, could someone code AWB/Pywikibot/something to replace variations of <span style="color:white">→</span> (including variations with a font tag and &rarr;) with {{0|→}}? This is for station layout tables like the one at Dyckman Street (IND Eighth Avenue Line). Colouring the arrow white is not really ideal when the intention is to add padding the size of an arrow. I'm not sure why there are still so many of them around.

In the long term, it would be nice to clean up these station layout tables further, perhaps even by way of automatically converting them to a Lua-based {{Routemap}}-style template (which does not exist yet, unfortunately). Most Paris Metro stations' articles have malformed tables, for instance, and probably a majority of stations have deprecated or incorrect formatting somewhere. Jc86035 (talk) 04:03, 2 July 2018 (UTC)
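
A minimal sketch of the short-term replacement, covering the span/font and →/&rarr; variations mentioned above; the exact pattern would need testing against real search results before any AWB or Pywikibot run:

    import re

    # <span style="color:white">→</span>, <font color="white">&rarr;</font>, etc.
    WHITE_ARROW = re.compile(
        r'<(span|font)\s+(?:style="color:\s*white;?"|color="white")\s*>'
        r'\s*(?:→|&rarr;)\s*</\1>',
        re.IGNORECASE)

    def fix_padding_arrows(wikitext):
        """Swap white-coloured padding arrows for {{0|→}}."""
        return WHITE_ARROW.sub('{{0|→}}', wikitext)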

@Jc86035: How do we find the articles - are they categorised, or is it a "brute force" search and hope we find them all?
insource:/span style.*color:white.*→/ Gives 1818 Articles
insource: &rarr; Gives 1039 Articles.
Ronhjones  (Talk) 00:30, 4 July 2018 (UTC)
In the long run, I'd be happy to start sandboxing/discussing a way to make a station layout template/suite, and a bot run should be possible if it's shown that {{0}} is a better way to code it. Happy to do the legwork on the AWB side of things.
I do have a question, though: just perusing through pages, I found Ramakrishna Ashram Marg metro station, which has the "hidden" → but I'm not really sure why it's there - is it supposed to be padded like that? Also, for later searching, a search of the code Jc86035 posted (with and without ") gives 1396 pages. I know it misses out on text that isn't exactly the same, but it's probably a good estimate of the number of pages using → as described. Primefac (talk) 13:54, 4 July 2018 (UTC)
It may well be a simple run with AWB. I will play devil's advocate and ask: is there any MoS for a station page? It seems to me that the pages involved will be all US based - there seems to be a big discrepancy in the layouts of equivalent-sized stations on my side of the pond - where most of the stations seem to have plumped for using succession boxes. I'm not saying either is correct, just should there be a full discussion at, say, Wikipedia:WikiProject_Trains with regards to having a consistent theme for all? Maybe then it might be worth constructing a template system, like the route maps. Ronhjones  (Talk) 15:43, 4 July 2018 (UTC)
The station layout tables (such as that top left of Dyckman Street (IND Eighth Avenue Line)#Station layout) should not be confused with routeboxes (which are always a form of succession box). --Redrose64 🌹 (talk) 20:26, 4 July 2018 (UTC)
I was asking if there should not be a general station MoS, and maybe a template system for the station layout. The US subway stations all seem to have a layout table and no routeboxes, whereas, say, the London ones all have a routebox and no layout table. Maybe they need both. Also, just using a plain wikitable tends to result in inconsistent layouts. Ronhjones  (Talk) 16:53, 5 July 2018 (UTC)
@Ronhjones: In general I think it's a mess. Some station articles use the style of layout table used in most articles for stations in the US, some use the Japanese style with {{ja-rail-line}}, some use standard BSicon diagrams, some use Unicode diagrams, some probably use other styles, and some (like Taipei station) use a combination. I found an old discussion about some layout templates for the American style in Epicgenius's user space, but no one's written a Scribunto module with them (or used them in articles) yet. Jc86035 (talk) 17:48, 5 July 2018 (UTC)
Regarding a general MoS for stations: there certainly is inconsistency between countries, but within a country there is often consistency.
For articles on UK stations, we agreed a few years ago that layout diagrams were not to be encouraged. Reasons for this include: (i) in most cases, there are two platforms and the whole plan may be replaced with text like "The main entrance is on platform 1, which is for services to X; there is a footbridge to platform 2 which is for services to Y."; (ii) for all 2,500+ National Rail stations, their website provides a layout diagram which is more detailed than anything that we can do with templates (examples of a simple one-platform station; a major terminus); (iii) trying to draw a layout plan for London Underground stations is fraught with danger since a significant number have their platforms at different levels or angles, even crossing over one another in some cases. --Redrose64 🌹 (talk) 18:23, 5 July 2018 (UTC)
I know, I was born in Barking... District East on Pt.2 (train doors open both sides), Hammersmith & City line ends Pt.3 (down the end Pt.2), District West on Pt.6... :-) Ronhjones  (Talk) 22:24, 5 July 2018 (UTC)
Since I was pinged here in Jc86035's comment, I suppose I'll put my two cents. I experimented with a modular set of station layout templates a couple of years ago. (See all the pages in Special:PrefixIndex/User:Epicgenius/sandbox that start with "User:Epicgenius/sandbox/Station [...]".) This itself was based off of {{TransLink (BC) station layout}}, which is used for SkyTrain (Vancouver) stations. Template:TransLink (BC) station layout is itself a modular template, and several instances of this template can be used to construct a SkyTrain station layout. epicgenius (talk) 23:33, 5 July 2018 (UTC)
@Primefac: Sorry I didn't reply about this earlier. The use of the "hidden" arrow on Ramakrishna Ashram Marg metro station is actually completely wrong, since it is in the wrong table cell, and the visible right arrow also being in the wrong place makes it pointless. A more correct example is in Kwai Hing station. Jc86035 (talk) 09:14, 6 July 2018 (UTC)
Ah, I see. I guess that would be my main concern, though I suppose the GIGO argument could be made... but then again I like to try and minimize the number of false positives that later have to be re-edited by someone else. Primefac (talk) 17:41, 6 July 2018 (UTC)

Remove manual categorization from DYK archives

I've implemented automatic categorization in Template:DYK archive header. Now, in all pages in Category:Wikipedia Did you know archives, the manual categorization has to be removed, so that the pages are correctly sorted in the category. —⁠andrybak (talk) 10:17, 1 September 2018 (UTC)

 Done — JJMC89(T·C) 20:13, 1 September 2018 (UTC)

Category WikiProject Television tagging

Hey, could someone help me out with this and tag the pages in Category:Television articles with incorrect naming style and its sub-categories with the WP:WikiProject Television banner - {{WikiProject Television}}? Thank you in advance. --Gonnym (talk) 16:40, 4 September 2018 (UTC)

Doing... Ronhjones  (Talk) 18:51, 4 September 2018 (UTC)
Y Done @Gonnym: 156 pages edited. If you want a list - then ping me. Ronhjones  (Talk) 19:14, 4 September 2018 (UTC)
Thank you! --Gonnym (talk) 20:17, 4 September 2018 (UTC)

Request for someone to run SineBot for me.

Hello Admins, I'd like to request that someone please run SineBot for me to auto-sign signatures, as I know nothing about software/website programming and can't use the bot myself. Thanks! --Mkikoen (talk) 16:30, 15 August 2018 (UTC)

Mkikoen, you sign your posts with ~~~~, and you shouldn't necessarily rely on SineBot to do it for you. However, SineBot does work automatically, so if you forget it will likely sign your posts for you. Primefac (talk) 16:33, 15 August 2018 (UTC)
@Primefac: Sorry, but I don't use Wikipedia very often as I edit very infrequently. I know that you sign your messages by adding 4 tilde characters, or if you're using the WikiEditor then you click on the Signature button, but I sometimes forget to sign manually as I usually assume the system does it for me automatically (at least from what I can somewhat recall from using other fandom wiki sites, if I remember correctly). I just find it a little bit tedious as I'm still new to the editing process. I honestly hope one day there may be a preference setting to have logged-in users like me signed automatically, only because, like I said, I don't know much about all of this programming technical stuff. It's just for me to save time and an extra step. — Preceding unsigned comment added by Mkikoen (talkcontribs) 16:40, 15 August 2018 (UTC)
Well, as I said, SineBot already (usually) signs posts when the signature is left off, so there's nothing for you to do/request. Primefac (talk) 16:41, 15 August 2018 (UTC)
@Primefac: Oh, it does automatically sign it if I forget? Okay, my mistake, I didn't read that part of your previous message. Ritchie333 was the person who recommended this bot to me, so I apologize if I wasted your time. I hope that maybe the SineBot feature will be implemented as a preference feature that logged-in users can enable if they so wish. Thanks. — Preceding unsigned comment added by Mkikoen (talkcontribs) 16:50, 15 August 2018 (UTC)
You should make every effort to sign your posts, as SineBot is not infallible. There's no "preference" to engage because it does it automatically for everyone (except when it doesn't...). Primefac (talk) 16:53, 15 August 2018 (UTC)
Just to clarify a few things, SineBot runs in the background and will (or is supposed to!) sign anyone's post within about 2-3 minutes if they've forgotten to do it themselves, provided that user has less than 800 edits. After that, the bot assumes they are used to signing posts. As you can see from this discussion, SineBot has signed one of Mkikoen's posts. Ritchie333 (talk) (cont) 16:56, 15 August 2018 (UTC)
Is the cutoff 800 edits? I was pretty sure I've seen it sign my posts before. And for what it's worth, I've been the one "signing" Mkikoen's posts, but only because I'm avoiding doing real work and getting the notifications right after they post. Primefac (talk) 16:58, 15 August 2018 (UTC)
@Primefac: Again, that's something I'm still not used to, as I assume (and would prefer) that the Wikipedia system auto-signs my messages, since I have no experience with software/website programming. It's just to make it slightly easier for newcomer editors who edit very infrequently. I can't give any specific details on how the feature could be implemented; I'm just trying to offer a trivial suggestion. (Finally remembered to click on the Signature and Timestamp button... When it comes to editing articles or writing messages on talk pages, signatures are just not something I prioritize, as I just focus on accurately writing/editing the article or message carefully and wait for the system to auto-sign it without it saying "preceding unsigned comment added by [Username here]".) --Mkikoen (talk) 17:05, 15 August 2018 (UTC)
@Primefac: It's just a trivial suggestion for an update in the future, as long as it's possible to add an auto-sign signature option to the logged-in user's preferences page, whether it functions as a bot or not. I guess I just need to remember to sign my messages manually for now, until that feature possibly gets implemented, if it causes slight additional work for someone else. — Preceding unsigned comment added by Mkikoen (talkcontribs)
@Mkikoen: sign your edits (see how). The software has no idea whether you're simply adding or modifying an existing comment, modifying a template, adding a piece of text that should not be signed, or whatever. SineBot is a stopgap measure, not a solution. Headbomb {t · c · p · b} 17:14, 15 August 2018 (UTC)
@Headbomb: Very good point Headbomb. I'll keep in mind to sign my messages manually at least until a feature like that can be implemented in the future so everyone's messages can be auto-signed. --Mkikoen (talk) 17:27, 15 August 2018 (UTC)

SineBot was offline for the last two weeks and has only just returned, sometime in the past 24hrs or so. -- GreenC 17:42, 15 August 2018 (UTC)

Peer review - periodically contacting mailing list with unanswered reviews

Hi all, could a bot editor help out at peer review by creating a bot that periodically contacts editors on our volunteer list with a list of unanswered reviews? Some details:

  • Discussion is here: WT:PR - the problem we are trying to address is the large number of outstanding reviews that haven't been answered
  • List of unanswered reviews is here: WP:PRWAITING
  • List of volunteers is here: WP:PRV
  • We will remove inactive volunteers, and I will reformat the list in a bot readable format similar to this: {{User:Tom (LT)/sandbox/PRV|Tom (LT)|anatomy and medicine|contact=never}}
    • Editors will opt in to the system - all will be set to default to never contact
    • Options for contact will be never, monthly, quarterly, halfyearly, and yearly (unless you can think of a more clever way to do this; a rough sketch of the contact logic follows below)
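A rough sketch (Python) of how a bot might parse those volunteer entries and decide whom to notify. The template name and contact values come from the bullets above; the last_contacted store is a hypothetical addition the bot would need to persist itself (a subpage, a local file):

import re
from datetime import date, timedelta

# Contact intervals in days, matching the option values proposed above.
INTERVALS = {"monthly": 30, "quarterly": 91, "halfyearly": 182, "yearly": 365}

# Matches entries like {{User:Tom (LT)/sandbox/PRV|Tom (LT)|anatomy and medicine|contact=never}}
ENTRY = re.compile(r"\{\{User:Tom \(LT\)/sandbox/PRV\|([^|}]+)\|([^|}]+)\|contact=(\w+)\}\}")

def volunteers_due(listing_wikitext, last_contacted):
    """Yield (username, interests) for volunteers whose contact interval has elapsed."""
    today = date.today()
    for user, interests, contact in ENTRY.findall(listing_wikitext):
        if contact not in INTERVALS:  # "never" (the default) and malformed values are skipped
            continue
        last = last_contacted.get(user, date.min)
        if today - last >= timedelta(days=INTERVALS[contact]):
            yield user, interests

Each due volunteer would then get one talk-page message listing the WP:PRWAITING entries matching their interests.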

Looking forward to hearing from you soon, --Tom (LT) (talk) 23:12, 4 June 2018 (UTC)

Addit: ping also to Anomie who very kindly helped create the AnomieBot that now runs PR.--Tom (LT) (talk) 23:12, 4 June 2018 (UTC)
In the meantime, I'll mention that WP:AALERTS will report peer review requests to WikiProjects, if the articles are tagged by banners. Headbomb {t · c · p · b} 11:29, 5 June 2018 (UTC)
Bump. --Tom (LT) (talk) 00:51, 25 June 2018 (UTC)
Bump. --Tom (LT) (talk) 03:24, 10 July 2018 (UTC)
Bump. --Tom (LT) (talk) 10:47, 30 July 2018 (UTC)
@Tom (LT): BRFA filed Kadane (talk) 00:27, 10 August 2018 (UTC)
Y Done Bot Approved Kadane (talk) 14:36, 11 September 2018 (UTC)

Simple alphabetization botreq

User:JL-Bot/Questionable.cfg and User:JL-Bot/Citations.cfg have a lot of entries. The botreq is simply to take every section and alphabetize it (case-insensitively), strip blank lines, and remove exact duplicates (case-sensitively); a minimal sketch follows the examples below. Basically take something like

===T2===
{{columns-list|colwidth=30em|
{{JCW-selected|Tehnički vjesnik|Technical Gazette|TV-TG|source=BJL}}

{{JCW-selected|Tactful Management Research Journal|TMRJ|source=BJL}}
{{JCW-selected|Technical Journal of Engineering and Applied Sciences|TJEAS|source=BJL}}
{{JCW-selected|Technics Technologies Education Management|source=BJL}}

}}

And turn it into

===T2===
{{columns-list|colwidth=30em|
{{JCW-selected|Tactful Management Research Journal|TMRJ|source=BJL}}
{{JCW-selected|Technical Journal of Engineering and Applied Sciences|TJEAS|source=BJL}}
{{JCW-selected|Technics Technologies Education Management|source=BJL}}
{{JCW-selected|Tehnički vjesnik|Technical Gazette|TV-TG|source=BJL}}
}}

likewise for something like

===T===
{{columns-list|colwidth=35em|
{{JCW-exclude|Trends (journals)|Trends (journal)}}
{{MCW-exclude|Trains (magazine)|Tracés}}
{{JCW-exclude|Trains (magazine)|Trips magazine}}

{{JCW-exclude|Trains (magazine)|Trips Magazine}}

}}

to

===T===
{{columns-list|colwidth=35em|
{{JCW-exclude|Trains (magazine)|Trips Magazine}}
{{JCW-exclude|Trends (journals)|Trends (journal)}}
{{MCW-exclude|Trains (magazine)|Tracés}}
}}

Bot could be run 1 hour after the last edit. Or daily. Headbomb {t · c · p · b} 01:15, 17 September 2018 (UTC)
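A minimal sketch (Python) of the transformation as stated: drop blank lines, remove exact case-sensitive duplicates, sort case-insensitively. (Note the worked example above additionally collapses the "Trips magazine"/"Trips Magazine" pair, which a strictly case-sensitive dedupe would keep.)

def sort_section(body):
    """Sort one {{columns-list}} body per the rules above."""
    lines = [ln.strip() for ln in body.splitlines() if ln.strip()]  # drop blank lines
    seen, entries = set(), []
    for ln in lines:
        if ln not in seen:  # exact, case-sensitive duplicate removal
            seen.add(ln)
            entries.append(ln)
    return "\n".join(sorted(entries, key=str.lower))  # case-insensitive sort

This would be applied to the text between each {{columns-list|colwidth=...| opener and its closing }}, section by section.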

Coding... I like the title "simple" - that will surely doom it... :-) Ronhjones  (Talk) 22:32, 17 September 2018 (UTC)
@Headbomb: Please have a good look at special:diff/860057846. I think that's looking OK. Ronhjones  (Talk) 01:57, 18 September 2018 (UTC)
@Ronhjones: Looks pretty good. The only thing that would make it better is ignoring leading articles like A An Der L' La Le The. Headbomb {t · c · p · b} 02:05, 18 September 2018 (UTC)
@Headbomb: I refer the honourable gentleman to his simple heading. Now it's the "moon on a stick", as we say. You know bots are rubbish at determining context - so what about A M Publishers - It's going to be at the bottom of the "A" list with such reasoning - there will always be odd cases where it will case errors. I have been here before on other projects, it's really difficult to get 100% accurate. Note that the blocks are sorted individually, so they won't change their heading - so "A Wonderful Journal" if put in "W" stays in "W". Anyway, I'll tidy up the bot code later, and do the BRFA. Ronhjones  (Talk) 15:10, 18 September 2018 (UTC)
@Ronhjones: Looks pretty good. The only thing that would make it better is ignoring leading articles like A An Der L' La Le The. Headbomb {t · c · p · b} 02:05, 18 September 2018 (UTC)
@Headbomb: I refer the honourable gentleman to his simple heading. Now it's the "moon on a stick", as we say. You know bots are rubbish at determining context - so what about A M Publishers? It's going to be at the bottom of the "A" list with such reasoning - there will always be odd cases where it will cause errors. I have been here before on other projects; it's really difficult to get 100% accurate. Note that the blocks are sorted individually, so they won't change their heading - so "A Wonderful Journal", if put in "W", stays in "W". Anyway, I'll tidy up the bot code later, and do the BRFA. Ronhjones  (Talk) 15:10, 18 September 2018 (UTC)
BRFA filed - Special:Diff/860134406 is a better diff (I forgot the duplicate entries!) Ronhjones  (Talk) 16:28, 18 September 2018 (UTC)
Meh, skip the fancy sorting then. Not worth the coding effort. Headbomb {t · c · p · b} 17:07, 18 September 2018 (UTC)

New York Times archives moved

Diff

The new URL can be obtained by following Location: redirects in the headers (in this case two-deep). I bring it up because of the importance of the NYT to Wikipedia, the uncertainty over how long the redirects will last, and because the new URL is more informative, including the date. -- GreenC 21:46, 13 June 2018 (UTC)
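For reference, a sketch (Python, requests) of following the Location: headers by hand until something other than a redirect comes back; requests can do this in one call with allow_redirects=True, but the manual loop makes the two-deep chain visible:

import requests

def resolve(url, max_hops=10):
    """Follow Location: redirects manually and return the final URL."""
    for _ in range(max_hops):
        resp = requests.head(url, allow_redirects=False, timeout=30)
        if resp.status_code in (301, 302, 303, 307, 308) and "Location" in resp.headers:
            url = requests.compat.urljoin(url, resp.headers["Location"])  # handles relative targets
        else:
            return url
    raise RuntimeError("too many redirects: " + url)

(Some servers answer HEAD poorly; falling back to GET would be the cautious choice.)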

Comment Looks like there are 29,707 links with "query.nytimes.com" Ronhjones  (Talk) 15:25, 9 July 2018 (UTC)

BRFA filed -- GreenC 15:44, 20 July 2018 (UTC)

Per WP:INTDABLINK, disambiguation pages should not be linked to directly but via a redirect ending in (disambiguation). However, not all such redirects exist. This request is to create those redirects that do not exist:

If one or more of the conditions in the first bullet are not met, the bot should do nothing (e.g. M class is a disambiguation page in the category, but M class (disambiguation) already exists, so the bot doesn't need to do anything). A list of redirects created as part of this run would be useful. Thryduulf (talk) 14:17, 30 August 2018 (UTC)

For clarity, this request is initially for a one-time run. Following that there might be desire for either subsequent runs on a schedule or live monitoring of creations of dab pages; however, if so, this will form a separate request and is not requested currently. Thryduulf (talk) 15:21, 30 August 2018 (UTC)
Yes, some have been bot-created in the past; I don't know the history. The current status is that many are missing. Widefox; talk 15:25, 30 August 2018 (UTC)
@R'n'B: Pinging RussBot's owner Ronhjones  (Talk) 20:20, 30 August 2018 (UTC)
Yes, this task continues to run monthly. As described, it only adds a redirect to disambiguations that have incoming links (which used to be the majority, but is now a relatively small fraction of all disambig pages). --R'n'B (call me Russ) 20:27, 30 August 2018 (UTC)
@Thryduulf: Suggest you liaise with R'n'B and see if he will start a new BRFA for a wider task. No point in anyone else re-inventing the wheel. Ronhjones  (Talk) 22:13, 30 August 2018 (UTC)

Repair dead URLs from NYTimes

A while ago the NYTimes adjusted their website structure, effectively killing their 'select' and 'query' sub-domains. A few weeks back I updated the URLs for the 'query' sub-domain to point to their current URLs, and there were a few links that were broken but I simply repaired them manually. The 'select' sub-domain is proving to be more difficult. So far, around 12% of the links redirect to a "server error" or "not found" page - this becomes a significant amount when there are 19,000 links to the https://select.nytimes.com sub-domain from the mainspace, which would be approximately 2,000 dead links.

A quick solution I've found is to pop the URL into https://web.archive.org, and it'll have the correct redirect chain archived to the current, live URL. Example: https://select.nytimes.com/gst/abstract.html?res=F00611FC3A5A0C778EDDAD0894DB494D81 is a dead link, but https://web.archive.org/web/20160305140459/https://select.nytimes.com/gst/abstract.html?res=F00611FC3A5A0C778EDDAD0894DB494D81 eventually redirects to the live link. The desirable outcome would be to have all the old URLs updated to the current URL by using the redirects cached by WebArchive, but I do not think it is possible with IABot as it is currently configured. From my understanding, IABot would add the archive URL instead of updating the original URL to the live link. @Cyberpower678: would it be possible to tweak IABot for this batch to run through the list of URLs I provide and update the page with both the new URL under https://www.nytimes.com as well as an archive URL while we're at it? Jon Kolbert (talk) 04:20, 22 July 2018 (UTC)
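A sketch (Python) of the lookup described above: ask the Wayback Machine's availability API for a snapshot, follow the archived redirect chain, and strip the web.archive.org prefix to recover a candidate live URL. The prefix-stripping regex is my own guess at the archive's rewriting scheme, and, as noted in the reply below, redirect loops and soft-404s mean the result still needs checking:

import re
import requests

def live_url_via_wayback(dead_url):
    """Try to recover a live URL from the redirect chain archived at Wayback."""
    avail = requests.get("https://archive.org/wayback/available",
                         params={"url": dead_url}, timeout=30).json()
    closest = avail.get("archived_snapshots", {}).get("closest")
    if not closest:
        return None  # no snapshot at all
    # Wayback rewrites redirect targets into /web/<timestamp>/<target> form,
    # so following redirects stays on web.archive.org.
    final = requests.get(closest["url"], allow_redirects=True, timeout=30).url
    m = re.match(r"https?://web\.archive\.org/web/\d+[a-z_]*/(.*)", final)
    return m.group(1) if m else None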

I've been looking into it the past few days, and am quite confused by the URL taxonomy. For example https://query.nytimes.com/gst/fullpage.html?res=9E0CE2DD113BF93BA35751C1A964958260 still works. Notice the "fullpage" versus "abstract". There are also search queries that still work https://query.nytimes.com/search/query?ppds=des&v1=STRIKE%20THE%20GOLD%20(RACE%20HORSE) and "mem" (or deep archive) queries that work https://query.nytimes.com/mem/archive/pdf?res=9D0DE0DF163FEE3ABC4952DFB767838E659EDE. Though in the latter case these now redirect to a new "timesmachine" sub-domain. Are there other URL types? When you say "I updated the URLs for the 'query' sub-domain" I don't understand, as there are about 22,000 still in enwiki. That's an interesting idea to pull redirect URLs from Wayback, though it's not simple or guaranteed to always work (sometimes redirects point back on themselves in loops, or lead to soft404s). There is also a NYT API that might return the URL given an article title. -- GreenC 05:01, 22 July 2018 (UTC)
@GreenC: Ah, you're right. I only did it for the HTTP PDF links it seems (see Special:Diff/846028034 for example). There still remains work to be done on the query sub-domain as well. The majority of query and select URLs work just fine, it's only a select (heh) few that are problematic, the rest can be updated quite easily while we have the chance. Here's a random sample of 10 links from what I've tested so far : https://tools.wmflabs.org/paste/view/raw/fd073f61. Of them, only one doesn't work. Most are just fine and can be updated quite easily, and I plan on doing so shortly. It's the ones that end up in status=nf that are no easy fix. Jon Kolbert (talk) 05:25, 22 July 2018 (UTC)
@Jon Kolbert: That is not a tweak, that is far outside of the bot's original programming, and requires a separate bot. I would just advise a simple search and replace bot to replace the originals and let IABot get to them eventually.—CYBERPOWER (Chat) 14:11, 22 July 2018 (UTC)
@Cyberpower678: Okay, fair. The problem we face is finding a reliable method of finding the replace URLs for the 12% of dead URLs. Jon Kolbert (talk) 14:58, 22 July 2018 (UTC)

@Jon Kolbert: - There are about 12,000 articles containing one or more "select.nytimes.com". Are you testing all those, finding which ones are "nf" and, in that set, need help with discovering a working URL? How are you updating on-wiki: with programming tools or manually? -- GreenC 15:35, 22 July 2018 (UTC)

@GreenC: I have tested each of the 20K URLs and separated the results into two separate lists: one list contains the dead URLs and the other contains the old URLs with live replacement URLs. I have been updating them semi-automatically using pywiki for efficiency reasons; it has much more customization than AWB but has the same effect (previewing the diff before submitting). As I've said, I'm still not entirely sure what to do with the dead links, but I have them all saved in a list. Jon Kolbert (talk) 15:48, 22 July 2018 (UTC)
Jon Kolbert - Ok great. Send me 5 or 10 of those dead URLs and I'll try sending them through my bot WP:WAYBACKMEDIC offline, which follows redirects at Wayback. I think it will log the final destination URL. -- GreenC 16:11, 22 July 2018 (UTC)
@GreenC: Okay! Here is a sample. Let me know how it goes. Jon Kolbert (talk) 16:36, 22 July 2018 (UTC)
Jon Kolbert - It's not working because Wayback is not returning redirects, but rather working snapshots or no snapshot at all. However, archive.is is working (sometimes). I put together this awk script which you can run. Download the file library.awk and create a list of URLs called "file.txt", like the sample 10 above. It will print the original URL followed by the new one.
awk -i ./library.awk 'BEGIN{for(i=1;i<=splitn("file.txt",a,i);i++) {sub(/^https/,"http",a[i]); match(sys2var("wget -q -O- \"http://archive.is/timemap/" strip(a[i]) "\""),/https?[:]\/\/archive.is\/[0-9]{14}/,dest);match(sys2var("wget -q -O- \"" strip(dest[0]) "/" strip(a[i]) "\""),/\|[ ]*url[ ]*[=][^|]*[^|]/,dest);sub(/\|[ ]*url[ ]*[=][ ]*/,"",dest[0]);print strip(a[i]) " | " dest[0]}}'
The script works by finding the archive.is URL via its API (http://archive.is/timemap/) then web-scraping that page for the URL. -- GreenC 18:30, 22 July 2018 (UTC)
@GreenC: I have finished a good portion of the select.nytimes.com, I'm now analyzing all the query links so I can compile one large dead link list for both sub-domains after having fixed all the ones with redirects that work (for now). Jon Kolbert (talk) 21:10, 23 July 2018 (UTC)
Excellent. Seeing it in my watchlist. Another script for checking Wayback header Location: redirects, it might catch some more:
awk -i ./library.awk 'BEGIN{for(i=1;i<=splitn("file.txt",a,i);i++) {sub(/^https/,"http",a[i]); c = patsplit(sys2var("wget -q -SO- \"https://web.archive.org/web/19700101010101/" strip(a[i]) "\" 2>&1 >/dev/null"), field, /Location[:][ ]*[^\n]*\n/); sub(/Location[:][ ]*/, "", field[c]); print strip(a[i]) " | " strip(field[c])}}'
-- GreenC 23:35, 23 July 2018 (UTC)

Bot to fix old newsletters changing font-size

Many old versions of a WikiProject Military history newsletter said <span style="font-size: 85%;"><center>...</span></center>, with the tags closed in the wrong order. This doesn't currently close the font-size. Each newsletter makes it smaller, so it becomes unreadable. See e.g. User talk:Lahiru k#The Bugle: Issue LVI, October 2010 (permanent link) and scroll down. A search currently finds 1953 user talk pages, including 737 archives. Newer editions of the newsletter say <div style="font-size: 85%; margin:0 auto; text-align:center;">...</div>. A bot doing this would be nice. The tag order could also be fixed, but <center>...</center> is obsolete per Help:Wikitext#Center text. PrimeHunter (talk) 10:35, 15 September 2018 (UTC)
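A minimal sketch (Python) of one possible repair: drop the obsolete <center>...</center> pair so the font-size span closes properly. The "The Bugle" guard is an assumption to keep edits scoped to this newsletter:

import re

# Matches the broken footer: a font-size span whose <center> closes after the span.
BROKEN = re.compile(r'(<span style="font-size: 85%;">)<center>(.*?)</span></center>', re.DOTALL)

def fix_bugle(text):
    if "The Bugle" not in text:  # assumed guard; only touch pages with the newsletter
        return text
    return BROKEN.sub(r"\1\2</span>", text)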

@PrimeHunter: Am I correct in thinking this is due to the closing statement only - To stop receiving this newsletter...? If so, I would suggest removing the <center> and following </center> tags, and leaving the statement left-justified. Easier to code, and less likely to go pear-shaped. Ronhjones  (Talk) 12:41, 15 September 2018 (UTC)
That is it. Removing <center> and </center> would be fine. PrimeHunter (talk) 13:18, 15 September 2018 (UTC)
I support this task. Expanding it to all namespaces would get an additional 35 or so pages. – Jonesey95 (talk) 13:38, 15 September 2018 (UTC)
Coding... Ronhjones  (Talk) 16:01, 15 September 2018 (UTC)
@PrimeHunter: Almost done, I see what user:Jonesey95 means - there are some added to user pages, rather than user talk pages. Setting up for namespaces 2 and 3 to do both. Your search didn't work in python (no idea why, I probably missed an apostrophe...) - so I changed it to insource: "<center>" insource: "The Bugle" insource: "</span></center>" - that picks up 2023 pages - they might not all need changing - the RegEx substitution is more specific. Code at User:RonBot/9/Source1. Test page at Special:Diff/859718967 - looks OK? Ronhjones  (Talk) 21:36, 15 September 2018 (UTC)
Thanks. I don't know Python. The diff looks good. PrimeHunter (talk) 21:49, 15 September 2018 (UTC)
BRFA filed Ronhjones  (Talk) 22:26, 15 September 2018 (UTC)
Y Done @PrimeHunter: Bot Approved and running - 2014 pages to do, doing about 5 per minute. Ronhjones  (Talk) 23:45, 21 September 2018 (UTC)

HTML errors on discussion pages

Is anyone going to be writing a bot to fix the errors on talk pages related to RemexHtml?

I can think of these things to do, although there are probably many more things and there might be issues with these ones.

  • replace non-nesting tags like <s><s> with <s></s> where there are no templates or HTML tags between those tags (a sketch for this case follows below)
  • replace <code> with <pre> where the content contains newlines and the opening tag does not have any text between it and the previous <br> or newline
  • replace <font color=#abcdef></font> with <span style="color:#abcdef"></span> where the content does not consist solely of a single link

Jc86035 (talk) 13:50, 15 July 2018 (UTC)
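To illustrate the first bullet, a deliberately conservative sketch (Python) that only fires when nothing but plain text sits between the two opening tags, leaving templates and other HTML for human review:

import re

def fix_unclosed(text, tag="s"):
    """Turn <s>plain text<s> into <s>plain text</s>."""
    pattern = re.compile(r"<%s>([^<>{}]*)<%s>" % (tag, tag))
    return pattern.sub(r"<%s>\1</%s>" % (tag, tag), text)

print(fix_unclosed("some <s>struck<s> words"))  # -> some <s>struck</s> words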

Changing font tags to span tags is a bit of a losing battle when people can still have font tags in their signatures. -- WOSlinker (talk) 13:48, 16 July 2018 (UTC)
I was under the impression all of the problematic users had been dealt with, and all that was left was cleaning up extant uses. Primefac (talk) 17:24, 16 July 2018 (UTC)
Unfortunately not. There are a couple of unworked tasks on the point in phab, but I've seen a few lately who use obsolete HTML. (Exact queries somewhere on WT:Linter I think.) --Izno (talk) 05:47, 17 July 2018 (UTC)
What is the use of fixing the old messages? I suggest you leave them as they are; it is not useful to change them. Best regards -- Neozoon 16:16, 21 July 2018 (UTC)
@Neozoon: The thing is, when there are HTML tags which are unclosed or improperly closed, their effects remain in place, not to the end of the message but to the end of the page. See for example this thread which contains two <tt> tags instead of one <tt> and one </tt>, so every subsequent post is in a monospace font. One tiny change - the addition of the missing slash - fixes the whole page. --Redrose64 🌹 (talk) 11:01, 22 July 2018 (UTC)
If we're going to fix font tags, one issue I'd really like to see fixed is when a font tag is opened inside a link label and never closed, since I see that one frequently while browsing archives. Enterprisey (talk!) 20:16, 29 July 2018 (UTC)
BRFA filed Basically does what Jc86035 suggested in the first bullet point Galobtter (pingó mió) 07:44, 3 August 2018 (UTC)

Moving/removing ticker symbols in article lead

Hello all. Per WP:TICKER, ticker symbols should not appear in article leads if the infobox contains this information. My idea is to remove ticker symbols from the article lead (more specifically, the first sentence) if the article has {{Infobox company}}, adding the traded_as parameter along with the ticker symbol if the infobox does not already contain that parameter:

{{Infobox company
|name = CK Hutchison Holdings Limited
|type = Public
|traded_as = {{SEHK|1}}
...}}

If the article does not contain any infobox, the ticker will be moved to the "External links" section of the page, the alternative position mentioned in the information page.

==External links==
* {{Official website}}
* {{SEHK|1}}

I have been doing this repeatedly over the past few days and realized that it would be a great idea if a bot could do this, especially since there is no accurate list of these problematic articles for human editors to work from. I would like to help create such a bot, but I am not too familiar with this field, so I am asking for any experienced editor to help. Cheers. –Wefk423 (talk) 12:46, 2 August 2018 (UTC)
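A sketch (Python, using the mwparserfromhell library mentioned elsewhere on this page) of the infobox half of the task; locating and removing the ticker from the lead sentence is the context-dependent part, as the reply below notes:

import mwparserfromhell

def add_traded_as(wikitext, ticker):
    """Add |traded_as= to {{Infobox company}} if it is not already set."""
    code = mwparserfromhell.parse(wikitext)
    for tpl in code.filter_templates():
        if tpl.name.matches("Infobox company"):
            if not tpl.has("traded_as"):
                tpl.add("traded_as", ticker)
            return str(code)
    return None  # no company infobox; fall back to the External links treatment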

TICKER isn't a policy or guideline, which makes consensus difficult. It is also tricky to automate: since everything is free-floating in the lead section, removal might break layout or context. Perhaps AWB semi-automated would be better? -- GreenC 10:31, 3 August 2018 (UTC)

Removing BLP templates from non-BLP articles

Hello. As an example: according to PetScan, there are 1050 articles belonging to the two categories "All BLP articles lacking sources" and "21st-century deaths". We have this kind of template on people who died 50 or even 100 years ago: William Colt MacDonald, Joseph Bloomfield Leake. Thanks and regards, Biwom (talk) 04:34, 6 August 2018 (UTC)

Are they verifiably dead? If not we presume alive, unless they were born more than 115 years ago, see WP:BDP. --Redrose64 🌹 (talk) 09:40, 6 August 2018 (UTC)
Hello. You are correct, but two of the first things I can read at WP:BLP are "we must get the article right" and "requires a high degree of sensitivity". Having a big banner saying that the person is alive before a lede that states a date of death is both wrong and insensitive. Let's keep in mind that these so-called maintenance templates are visible to the casual Wikipedia reader.
On a side note: PetScan is giving me 162 results for "All BLP articles lacking sources" and "19th-century births". Thanks and regards, Biwom (talk) 11:54, 6 August 2018 (UTC)

Archive disambiguation bot

Something that's been bugging me for a while, and which I've been reminded of recently by ClueBot (talk · contribs), is that section links to archived sections break all the time. So I'm proposing that we have a fully dedicated bot for this.

When you have a section link like

  • [[Talk:Foobar#Barfoo]]

where the section link is broken, search through all 'Talk:Foobar/Archives', 'Talk:Foobar/<time>' or 'Talk:Foobar/Old' subpages for a #Barfoo anchor. If you find a match, update the section link to, e.g.

  • [[Talk:Foobar/Archives 2009#Barfoo]]<!--Updated by Bot-->

If you find multiple matches, instead tag the link with

  • [[Talk:Foobar#Barfoo]]{{Old section?|Talk:Foobar/Old#Barfoo|Talk:Foobar/Old 3#Barfoo}}

to render a note pointing readers at both candidate archive sections.

Lastly, if you're looking in a dated archive (e.g. Archives/2008) or sequential (e.g. /Archives 19), then only search in the archives older than that. Headbomb {t · c · p · b} 15:09, 9 August 2018 (UTC)
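A sketch (Python) of the core lookup: list the talk page's subpages via the API, fetch their wikitext, and search for a heading matching the broken anchor. Anchor-encoding subtleties (underscores for spaces, HTML entities) and the only-look-in-older-archives rule are glossed over here:

import re
import requests

API = "https://en.wikipedia.org/w/api.php"

def find_anchor(talk_page, anchor):
    """Return subpages of talk_page containing a == anchor == heading."""
    subpages = requests.get(API, params={
        "action": "query", "format": "json", "list": "allpages",
        "apnamespace": 1,  # Talk:
        "apprefix": talk_page.split(":", 1)[1] + "/",
        "aplimit": "max"}, timeout=30).json()["query"]["allpages"]
    heading = re.compile(r"^==+\s*%s\s*==+\s*$" % re.escape(anchor), re.M)
    hits = []
    for page in subpages:
        data = requests.get(API, params={
            "action": "query", "format": "json", "prop": "revisions",
            "rvprop": "content", "rvslots": "main",
            "titles": page["title"]}, timeout=30).json()
        rev = next(iter(data["query"]["pages"].values()))["revisions"][0]
        if heading.search(rev["slots"]["main"]["*"]):
            hits.append(page["title"])
    return hits

One hit means the link can be updated in place; several mean tagging with {{Old section?}} as described above.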

Bot to remove religion parameter in bio infoboxes

2016 Policy RFC re religion in bio infoboxes Don't know if this has been discussed here previously. I've noticed several editors manually removing the religion from biographical infoboxes. Surely, there are thousands of these infoboxes that contain the religion entry. Wouldn't it make sense to just run a bot? — Maile (talk) 19:49, 29 July 2018 (UTC)

How does a bot know whether the parameter is appropriate for a certain topic? --Izno (talk) 21:14, 29 July 2018 (UTC)
FYI, it looks like there are roughly 4,000 instances of {{Infobox person}} that use the |religion= parameter. I am sure that there are other person-related infoboxes using this parameter as well. They will probably require a human editor, per the RFC outcome, to ensure that those cases in which the religion is significant to the article subject are adequately covered either in the body text or in a custom parameter. – Jonesey95 (talk) 21:31, 29 July 2018 (UTC)
Misunderstanding here. It's not by topic, or by individual person. It involves all instances of Template:Infobox person, which seems to be used on 290,000 (plus) pages. The notation says, Please note that in 2016, the religion and ethnicity parameters were removed from Infobox person as a result of the RfC: Religion in biographical infoboxes and the RfC: Ethnicity in infoboxes as clarified by this discussion. Prior to 2016, the religion parameter was allowed, and there's no way of knowing how many hundreds of thousands of Infobox person templates have the religion there. The immediate result is that the existing religion stated in the infobox remains in place, but just doesn't show up on the page. What random editors are doing is going to articles, one by one, and removing the stated religion from the infobox. My question is whether or not a bot could be run to go through all Infobox persons in place, and remove that entry. — Maile (talk) 22:58, 29 July 2018 (UTC)
There is a way of knowing, which is how I came up with the estimate above. Go to Category:Pages using infobox person with unknown parameters and click on "R" in the table of contents. Every page listed under "R" has at least one unsupported parameter in Infobox person that starts with the letter "R". (Edited to add: see also Category:Infobox person using religion).
As to whether this is a task feasible for a bot, see my response above, which explains why this request probably runs afoul of WP:CONTEXTBOT. – Jonesey95 (talk) 01:00, 30 July 2018 (UTC)
Jonesey95, I'm not really sure this is a CONTEXT issue. The template doesn't show anything when |religion= is given (which, based on the TemplateData, is actually used 8000+ times). Now, I wouldn't want to run this bot purely to remove one bad parameter, but if I could get a list of the top 20 "bad params" (or anything with 50+ uses) and/or include |ethnicity= and |denomination= (which also appear to be deprecated) then we might be getting somewhere. Primefac (talk) 01:38, 30 July 2018 (UTC)
Following on from the above, there are (according to TemplateData) about 156 "invalid" params with 10+ uses, 75 with 25+ uses, 38 with 50+ uses, 20 with 100+ uses and 3 (religion, ethnicity, and imdb_id) with 2000+. There are about 2k invalid params all told; the majority look like typos (i.e. they are context-dependent), but removing the most common ones will cut down the workload. Primefac (talk) 01:43, 30 July 2018 (UTC)
How does a bot determine that in cases in which the religion is significant to the article subject[, the person's religion] is adequately covered either in the body text or in a custom parameter? (words in brackets added to RFC outcome excerpt to avoid quoting the whole thing). As for other parameters, my experience with removing/fixing unsupported infobox parameters is that a large number of them are typos that need to be fixed, rather than removed. Maybe an AWB user with a bot flag can make these changes, but I don't see how an unsupervised bot would work reliably. – Jonesey95 (talk) 04:18, 30 July 2018 (UTC)
Because it's not a valid parameter. If |religion= is used in {{infobox person}} it does nothing, and thus there is zero reason to have it. As for the second half of your question - yes, the majority of the invalid parameters are typos, but the 2108 uses of |imdb_id= are not typos and could be removed without any issue (see my bot's tasks 7, 8, 10, 18, 20, 23, and 26 for similar instances of parameters being changed/removed). Primefac (talk) 16:01, 30 July 2018 (UTC)
Maybe a specific example will help. If there is a person, Bob Smith, who is Roman Catholic, and his religion is significant, and the religion was placed into the infobox by an editor in good faith, and that religion is not adequately covered either in the body text or in a custom parameter, the religion parameter in the infobox should not simply be removed. That's what the RFC says. – Jonesey95 (talk) 16:32, 30 July 2018 (UTC)
I suppose that's a fair point. The issue is that the template doesn't currently accept the parameter. Was the removal so that the "invalid" uses could be removed and the "valid" ones kept, so that it could be (at some point in the future) reinstated? Or will it never be reinstated and the religion parameter be reincarnated as something else entirely? Primefac (talk) 20:04, 30 July 2018 (UTC)
Based on the close it sounds more like the religion parameter shouldn't be used in {{infobox person}} and thus the template call should be changed to something more appropriate. Pinging Iridescent as the closer, mostly to see if I've interpreted that correctly. Primefac (talk) 20:07, 30 July 2018 (UTC)
The consensus of the RFC was fairly clear that the religion parameter should be deprecated from the generic {{infobox person}},and that in those rare cases where the subject's religion is significant a more specific infobox such as {{infobox clergy}} should be used. That the field was left in place and disabled rather than bot-blanked was, as far as I'm aware, an artefact of the expectation that those people claiming "the religion field is necessary" would subsequently go through those boxes containing it and migrate the infoboxes in question to a more appropriate infobox template. ‑ Iridescent 06:51, 31 July 2018 (UTC)

Lengthy post-RfC discussion at Template_talk:Infobox_person/Archive_31#Ethnicity?_Religion? .. -- GreenC 01:19, 30 July 2018 (UTC)

  • Just an FYI from my personal experience. The bigger issue is the infinite perpetuation of the religion parameter through copy and paste off an existing article. If the religion is not supported, it merely doesn't show in a given article. Where the issue perpetuates is when a new article is created, and the editor uses the infobox from another article as the template, changing what needs to be changed. That's what I do, and I've been here more than a decade. Why bother figuring out usage from a new blank when I know an article that already has the basics I need? The only reason I know the religion parameter is not supported, is because of editors who are manually correcting templates, one by one. Might there be a lot of other editors, newbies and long-timers, who do the copy-from-existing-article-and-paste method? — Maile (talk) 11:54, 31 July 2018 (UTC)
  • BRFA filed. Primefac (talk) 01:09, 11 August 2018 (UTC)
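For the mechanical half, a sketch (Python, mwparserfromhell) that strips parameters {{Infobox person}} silently ignores; whether a given removal is appropriate in context is exactly the CONTEXTBOT judgment debated above, which this makes no attempt at:

import mwparserfromhell

DEPRECATED = ("religion", "ethnicity")  # per the RfCs discussed above

def strip_deprecated(wikitext):
    """Remove parameters that {{Infobox person}} no longer accepts."""
    code = mwparserfromhell.parse(wikitext)
    for tpl in code.filter_templates():
        if tpl.name.matches("Infobox person"):
            for param in DEPRECATED:
                if tpl.has(param):
                    tpl.remove(param)
    return str(code)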

Lint Error elimination of poorly formatted italics....

Would it be possible for there to be a bot to look for and repair specific lint errors based on regexp?

https://en.wikipedia.org/w/index.php?title=Special:LintErrors/missing-end-tag&offset=70252016&namespace=0 is how far I've got manually, but it's taking a while...

In this instance, the request would be to search for mismatched italics, bold, and <span> tags in a page. ShakespeareFan00 (talk) 13:13, 11 August 2018 (UTC)

For the italics at least, I think this would be too context-dependent. For example, Special:Diff/854452962 contains two mismatched sets and it took me reading through it twice for the first one before I figured out where the closing '' was supposed to go. Primefac (talk) 13:27, 11 August 2018 (UTC)

Can someone take over User:Bot0612?

Firefly has been inactive for two months and Bot0612 task 9 no longer updates many WP:VITAL article lists. I haven't looked at other tasks, but can anyone take over the functions currently done by Bot0612? feminist (talk) 09:01, 7 October 2018 (UTC)

@Feminist: Have you tried to contact him? He was around 7 days ago, and he has an e-mail link. User:Bot0612/shutoff/9 suggests the bot is still active. It may be some wiki change that has broken the bot, and he would be better placed to fix it. If he wants to give it up, then the bot code is published. Ronhjones  (Talk) 22:06, 10 October 2018 (UTC)
Never mind, the bot seems to work correctly now, at least for this task. Thanks. feminist (talk) 03:58, 11 October 2018 (UTC)

portal

Hi, please create a bot to add portals to articles; many articles do not have any portal. Example: https://en.wikipedia.org/wiki/Billy_James_(footballer) — Preceding unsigned comment added by Amirh123 (talkcontribs) 11:50, 26 September 2018 (UTC)

Declined Not a good task for a bot. per WP:CONTEXTBOT. Headbomb {t · c · p · b} 14:45, 26 September 2018 (UTC)
@Amirh123: Please review the various requests that you have made here in the past, and the replies that we have left. --Redrose64 🌹 (talk) 15:23, 26 September 2018 (UTC)

ViewBot

For editing subjects that people would find extremely tedious. Name of bot: ViewBot. Huff slush7264 Chat With Me 22:49, 8 October 2018 (UTC)

@Huff slush7264: So.... What would the bot do? SQLQuery me! 22:52, 8 October 2018 (UTC)
Be specific. You shouldn't expect others to know what you're talking about just by saying "editing subjects that people would be extremely tedious about" or "what User:タチコマ robot does". Nardog (talk) 16:08, 9 October 2018 (UTC)

Please can someone remove all calls to Template:Incomplete in mainspace? Per the TfD, all those with specified reasons have been converted to {{missing information}}. The remainder are not suitable for this template and can simply be removed. There are 5129 transclusions, but only those in article space or draft space are important. Thank you — Martin (MSGJ · talk) 19:01, 10 October 2018 (UTC)

 Done — JJMC89(T·C) 08:34, 12 October 2018 (UTC)

Mass category change for location userboxes lists

I created a new category, Category:Lists of location userboxes, to better organize userboxes. I would like to change (and add where it's missing) all entries of

[[Category:Lists of userboxes|.*]] 

to

[[Category:Lists of location userboxes|{{subst:SUBPAGENAME}}]]

in all subpages of following pages:

  1. WP:Userboxes/Life/Citizenship
  2. WP:Userboxes/Life/Origin
  3. WP:Userboxes/Life/Residence
  4. WP:Userboxes/Location
  5. WP:Userboxes/Travel

There are some exceptions: for example, WP:Userboxes/Location/United States/Cities should be [[Category:Lists of location userboxes|United States]], not [[Category:Lists of location userboxes|Cities]]. It would be nice if the bot could distinguish such subpages (with titles equal to Cities, Regions, States, Nations), but it would be OK if it didn't; there are only a handful of such subpages, and they can be updated later manually.

—⁠andrybak (talk) 12:40, 20 August 2018 (UTC)
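A sketch (Python) of the per-page rewrite, with the sort key taken from the subpage name as requested; the Cities/Regions/States/Nations exceptions noted above are not handled and would be fixed manually:

import re

OLD_CAT = re.compile(r"\[\[Category:Lists of userboxes(?:\|[^\]]*)?\]\]")

def retag(wikitext, full_title):
    """Replace (or add, where missing) the location-userboxes category."""
    subpage = full_title.rsplit("/", 1)[-1]
    new_cat = "[[Category:Lists of location userboxes|%s]]" % subpage
    if OLD_CAT.search(wikitext):
        return OLD_CAT.sub(new_cat, wikitext, count=1)
    return wikitext.rstrip() + "\n" + new_cat + "\n"  # add where missing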

It appears that only /Location has subpages, although /Travel has /Travel-2 and /Travel-3, which are a sort of parallel page that might be suitable for this categorization. I might have missed other pages that are not technically subpages. – Jonesey95 (talk) 16:37, 20 August 2018 (UTC)
Thanks. I guess going by page prefix would be better:
  1. Special:PrefixIndex/Wikipedia:Userboxes/Life/Citizenship
  2. Special:PrefixIndex/Wikipedia:Userboxes/Life/Origin
  3. Special:PrefixIndex/Wikipedia:Userboxes/Life/Residence
  4. Special:PrefixIndex/Wikipedia:Userboxes/Location
  5. Special:PrefixIndex/Wikipedia:Userboxes/Travel
—⁠andrybak (talk) 17:03, 20 August 2018 (UTC)

[r] → [ɾ] in IPA for Spanish

(reviving) A consensus was reached at Help talk:IPA/Spanish#About R to change all instances of r that either occur at the end of a word or precede a consonant (i.e. any symbol except a, e, i, o, or u, or j or w) to ɾ inside the first parameter of {{IPA-es}}. There currently appear to be about 1,140 articles in need of this change. Could someone help with this task with a bot? Nardog (talk) 12:55, 20 August 2018 (UTC) – Fixed Nardog (talk) 15:28, 20 August 2018 (UTC)

Hi Nardog - I have a script, but could you confirm how these testcases should look, perhaps add other unusual cases that might come up:
*{{IPA-es|aˈðrβr|lang}}
*{{IPA-es|aˈðr βr|lang}}
*{{IPA-es|aˈðr-βr|lang}}
*{{IPA-es|aˈðr|lang}}
*{{IPA-es|aˈðrer|lang}}
*{{IPA-es|aˈerβ|lang}}
*{{IPA-es|ri|lang}}
*{{IPA-es|r|lang}}
*{{IPA-es|r r r|lang}}
*{{IPA-es|ir er or|lang}}
Thanks, -- GreenC 14:42, 20 August 2018 (UTC)
@GreenC: Thanks for taking a stab at this. In principle, any instance of r that is followed by anything (including a space, | or }) except a, e, i, j, o, u, or w in the first parameter of {{IPA-es}} must be replaced with ɾ, so the first seven would be
*{{IPA-es|aˈðɾβɾ|lang}}
*{{IPA-es|aˈðɾ βɾ|lang}}
*{{IPA-es|aˈðɾ-βɾ|lang}}
*{{IPA-es|aˈðɾ|lang}}
*{{IPA-es|aˈðreɾ|lang}} <!-- the first [r] should also be [ɾ] anyway but that is not relevant here -->
*{{IPA-es|aˈeɾβ|lang}}
*{{IPA-es|ri|lang}}
and the final one *{{IPA-es|iɾ eɾ oɾ|lang}}. *{{IPA-es|r|lang}} and *{{IPA-es|r r r|lang}} should probably be unmodified, but I found no such occurrence by searching hastemplate:IPA-es insource:/IPA-es[^\|]*?\|[^\|\}]*?[ \|]r[ \|\}]/. As a side note, a few transclusions include a comment inside the first parameter, but those have been fixed so you can likely exclude them. I've also fixed some other idiosyncratic cases like this and this. Nardog (talk) 15:28, 20 August 2018 (UTC)
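The rule itself compresses to one regex with a negative lookahead, which also matches at end of string and so covers word-final r. Restricting it to the first parameter of {{IPA-es}} is handled by the full script further down; a bare standalone r (which, as noted, should probably stay) would be changed by this regex, but no such transclusions were found:

import re

def tap(param):
    """[r] -> [ɾ] unless followed by a, e, i, j, o, u or w."""
    return re.sub(r"r(?![aeijouw])", "ɾ", param)

assert tap("aˈðrer") == "aˈðreɾ"   # word-final r changed; r before e kept
assert tap("ir er or") == "iɾ eɾ oɾ"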
Alright that works. Since it's only 1,140 we can probably do this right away fully supervised. Would you be willing to help verify the edits? I can do 50 or so on the first run to make sure it's working correctly, and then 500 after that. If there are no problems in the 500 the rest can be finished with spot checks (it will be obvious there is a problem by looking at the edit byte count which will be uniform size 0 byte). -- GreenC 16:35, 20 August 2018 (UTC)
Sounds great! (ɾ is 2 bytes long so it would be +1 per change, by the way.) Nardog (talk) 16:39, 20 August 2018 (UTC)
Nardog - Alright, just did 73, hard to control the exact number - see diffs for User:GreenC bot (ignore the first 5; a typo in the script caused a catastrophic fail). -- GreenC 17:06, 20 August 2018 (UTC)
@GreenC: Sorry, please add j and w to the list of symbols that can follow [r] (see the correction to my OP of this thread, I should have made it clearer). Other than that, the corrections are spot on. (2 bytes it was...) Nardog (talk) 17:31, 20 August 2018 (UTC)

500 done, I'll wait till tomorrow to finish the rest. -- GreenC 19:03, 20 August 2018 (UTC)

@Nardog: No response from anyone about yesterday's run, which is a good sign, so I went ahead and finished the rest; it's a clean sweep (nice search formulation, btw). This script is basically a cut-and-paste "bot complete" for anyone with a bot flag and OAuth credentials; it identifies the article names, downloads the wikisource, makes the changes and uploads the new article with an edit summary. I can do it, or anyone else can in the future as needed. -- GreenC 15:58, 21 August 2018 (UTC)
Awesome ;) Nardog (talk) 16:48, 21 August 2018 (UTC)
Script

Dependencies: GNU awk, wikiget.awk, library.awk. MIT License User:GreenC 2018
./wikiget -a ": hastemplate:IPA-es insource:/IPA-es[^\|\}]*?\|[^\|\}\<]*?r[^aeijouw]/" > ipa; awk -ilibrary '{fp=sys2var("./wikiget -w " shquote($0)); c=patsplit(fp,field,/{{IPA-es[^}]*}}/,sep);for(i=1;i<=c;i++){patsplit(field[i],aa,/[|]/,a); sub(/r$/,"ɾ",a[1]); gsub(/r[ ]/,"ɾ ",a[1]); d=patsplit(a[1],b,/r[^aeioujw]/,bb);for(j=1;j<=d;j++) sub(/^r/,"ɾ",b[j]);a[1]=unpatsplit(b,bb); field[i] = unpatsplit(aa,a) } if(unpatsplit(field,sep) != fp){ print shquote($0); sys2varPipe(unpatsplit(field,sep), "./wikiget -E " shquote($0) " -S " shquote("[r] → [ɾ] in IPA for Spanish [[Help_talk:IPA/Spanish#About_R|per discussion]] and [[Wikipedia:Bot_requests#%5Br%5D_%E2%86%92_%5B%C9%BE%5D_in_IPA_for_Spanish|botreq]]") " -P STDIN")}}' ipa

Per discussion at Wikipedia_talk:WikiProject_Women_in_Red#How_about_weekly_additions_to_drafts?, could somebody with an established bot take this task on? The code is in User:Ritchie333/afcbios.py - at the moment it prints the page to the standard output, which I redirect to a file and then copy and paste manually into the editing window. It would be much less hassle to bolt this task into an existing bot, rather than me creating one from the ground up. Ritchie333 (talk) (cont) 11:35, 9 October 2018 (UTC)

Coding... Ronhjones  (Talk) 23:50, 10 October 2018 (UTC)
@Ritchie333: First userspace test at User:Ronhjones/Sandbox2. Ignoring the fact that the numbering went a bit off (it hit the max of 5000 pagenames, and the counter got zeroed when I did the second lot), it doesn't look too bad. I had to re-write it, as I've yet to get pywikibot to install on my PC - last time I tried, it corrupted the whole python system, and luckily I had a backup... Source is at User:RonBot/11/Source1 Ronhjones  (Talk) 13:41, 11 October 2018 (UTC)
I've installed Pywikibot on Linux, OS X and Windows using Mingw64, and it seems to work okay for all of them. Still, if you want to rewrite with raw APIs (which you have done - crikey, whoever wrote the module mwparserfromhell must have named it from experience), that's not an issue. Also, most of my scripts tend to just write to the standard output rather than chain things together, as other developers can test them quickly and easily. You can then replace print with a lambda argument and encapsulate bits of functionality. Anyway, that's just the way I work.
Another useful feature would be to have a list of false positives that the script generates. For example, it thinks Draft:Harry Edward Vickers (Flannel Foot) is in scope for the project because of the sentence starting "In May 1934 newspapers report that Scotland Yard have circulated a girl’s picture...." - the draft contains information about a woman, but the main subject isn't one. I was going to code that as a straight list, possibly reworking as an internal file "blokes.txt", and any draft whose title is in the list is skipped over. Ritchie333 (talk) (cont) 14:18, 11 October 2018 (UTC)
@Rosiestep: Does the output at user:Ronhjones/Sandbox2 look about right? Ritchie333 (talk) (cont) 14:21, 11 October 2018 (UTC)
@Ritchie333: We can have a subpage of the bot for false positives; then anyone can update them. Never mind the mwparserfromhell - the wikitools page.page was confusing the first time I encountered it! As an aside - as these are Drafts, would adding the date of the last edit be useful? I know how to get the history dates. Ronhjones  (Talk) 17:06, 11 October 2018 (UTC)
The date might be useful - it would let us know which are in danger of being deleted via WP:G13. Also the regex (defined as reText in my script) to work out what might be in scope would be useful as a configurable sub-page (though I'd probably want that protected in some form to stop vandalism). Ritchie333 (talk) (cont) 17:08, 11 October 2018 (UTC)
I'll have a play and see what works. Ronhjones  (Talk) 17:48, 11 October 2018 (UTC)
@Ritchie333 and Rosiestep: Shortened test at User:Ronhjones/Sandbox4. I need to tidy the date field (don't really need the time part). Note the lack of "Draft:Adam Zango", as he has been put in User:RonBot/11/FalsePositives. If that looks OK, I'll file the BRFA. Ronhjones  (Talk) 18:43, 11 October 2018 (UTC)
Yes, Ronhjones, this is an improvement. Question: will the page that Ritchie333 originally created be automatically updated or will there be a constant stream of new subpages with new entries? The latter would be impractical, I think. --Rosiestep (talk) 14:49, 12 October 2018 (UTC)
@Rosiestep: I was planning a weekly update of the one page. Ronhjones  (Talk) 17:04, 12 October 2018 (UTC)
But it can be any interval you like, I use Windows Schedule Tasks to run all my bots. Ronhjones  (Talk) 17:07, 12 October 2018 (UTC)
Ronhjones Perfect; thanks. When you have a moment, would you also please explain all this on the Women in Red talkpage so that those pagestalkers are in the loop? --Rosiestep (talk) 17:31, 12 October 2018 (UTC)
@Rosiestep: OK, I am running a full length trial (now) to my user sandbox. When complete (and if OK), I'll add a bit to the page and start the WP:BRFA process. Ronhjones  (Talk) 17:54, 12 October 2018 (UTC)
Ronhjones, for transparency... Yesterday, I gave an elevator pitch to TNegrin (WMF) about what I really want (a spreadsheet which looks like other Women in Red Wikidata-SPARQL-QUERY lists, with columns including the name, year of birth, year of death, place of birth, place of death, photo, occupation, nationality, Q number), so he may be following up, too. In the meantime, I think what you and Ritchie have developed will be a good start. --Rosiestep (talk) 18:16, 12 October 2018 (UTC)

BRFA filedRonhjones  (Talk) 15:32, 15 October 2018 (UTC)

Y DoneRonhjones  (Talk) 21:46, 19 October 2018 (UTC)

Populate Selected Anniversaries with Jewish (and possibly Muslim) Holidays

Right now there is a manual process: go to last year's date, remove the Jewish holiday, then find the correct Gregorian date for this year and put it in. This is because Jewish and Muslim holidays are not based on the Gregorian calendar. There is a website, http://www.hebcal.org, that lists all the Gregorian dates for the respective Jewish holidays. I suggest a bot take that list and update the appropriate dates for the current/next year. Sir Joseph (talk) 19:29, 20 July 2018 (UTC)

Which articles are we talking about? Enterprisey (talk!) 20:15, 29 July 2018 (UTC)
Not articles, the Selected Anniversaries pages, which then land on the main page. For example, Wikipedia:Selected_anniversaries/July_22 has a Jewish holy day for 2018, but in 2017 and 2019 that day falls on a different date. So the idea is to get a list of days that are worthy of being listed, then go to the page for the prior year, delete the entry, and add the entry for the current year. Sir Joseph (talk) 23:38, 29 July 2018 (UTC)

Hello @Sir Joseph and Howcheng: - I'm thinking about and investigating this as a possible project, if you are still interested. It would help to know which Jewish/Israeli holidays are tracked on Selected Anniversary pages. -- GreenC 15:25, 22 August 2018 (UTC)

@GreenC:, yes and thank you for your interest. I know that hebcal is open source and has a data feed, but not sure how useful it is. My thinking is to create a subpage listing "Holidays to be posted on the Front Page" and then the bot goes to this year or last year and updates it. I'm sure you might have a better way to do it. Sir Joseph (talk) 15:44, 22 August 2018 (UTC)
http://www.hebcal.org appears to have expired and is now a click-farm. What I would need is a list of holidays; then I can see what resources are available. -- GreenC 16:22, 22 August 2018 (UTC)
my huge mistake, I meant hebcal.com Sir Joseph (talk) 16:25, 22 August 2018 (UTC)
Great, I'll take a look. There is also Enrico Service 2.0 which is pretty good for an API but it's missing some holidays thus I would need to know which ones are needed before deciding on a data source. -- GreenC 16:30, 22 August 2018 (UTC)
I'd go with the hebcal integration. Many of the holidays posted to the front page are not national holidays so wouldn't be in enrico.Sir Joseph (talk) 16:36, 22 August 2018 (UTC)
You are right, it is better. I would need to know which holidays. What happens with multi-day holidays like Chanukah: is an entry made in every Selected Anniversaries page (for 8 days, I think?) or just the first day? -- GreenC 16:57, 22 August 2018 (UTC)
@Howcheng: would be able to confirm, but I think only one day is listed, except perhaps for Rosh Hashana. I will see if I can pull together a list of holidays unless Howcheng has one that he uses; this would be my own list and not what has happened in the past. Sir Joseph (talk) 17:11, 22 August 2018 (UTC)
Actually, I was thinking NOT of SA/OTD, but of updating the holiday articles themselves. If we can make sure all the articles have holiday infoboxes in them, then the bot can add the date20XX= parameters. Including them in OTD is a manual process because the articles have to be checked to make sure the quality is still good enough for inclusion. howcheng {chat} 17:19, 22 August 2018 (UTC)
I'm OK either way, but wouldn't it be easier for you not to have to search for a holiday, but to know a few days in advance, as you do when checking the SA, and then see if that article is good enough? Sir Joseph (talk) 17:58, 22 August 2018 (UTC)
It's a bot, so there's no reason it can't do both, I suppose. howcheng {chat} 07:20, 23 August 2018 (UTC)

After some reflection, I would suggest the way to display Jewish and other moveable holiday dates is with a Lua template. The data could be in an enwiki-hosted JSON file during development, later imported into Wikidata once established, to share with other projects. The data can be downloaded from hebcal.com, generated with a script using mathematical algorithms, or entered manually. It only needs to be done one time, up to 2050 or 2100, whatever makes sense. It can be for any holiday that uses a non-Gregorian calendar, for translating to Gregorian. For example {{holigreg |holiday=Hanukkah |date=2018 |df=mdy}} would produce December 2, 2018 - December 10, 2018. There can be other options to control display output. It might also have |date=CURRENTYEAR so it's always up to date for use in infoboxes, and support |date=CURRENTYEAR+1 to display the dates before and after. Another advantage is that other tools and bots can use the database; so, for example, if a bot was written to update the SA, it could key off the JSON or Wikidata database in a consistent manner for Jewish, Muslim and, I assume, other non-Gregorian holidays. -- GreenC 13:13, 23 August 2018 (UTC)
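A sketch (Python) of the lookup layer over such a data file. The JSON shape is invented purely for illustration; the real data would come from hebcal.com or be generated algorithmically, as described above:

import json
from datetime import date

# Hypothetical shape: holiday -> year -> [start, end] as ISO Gregorian dates.
SAMPLE = json.loads('{"Hanukkah": {"2018": ["2018-12-02", "2018-12-10"]}}')

def gregorian(holiday, year, data=SAMPLE):
    """Return (start, end) Gregorian dates for a moveable holiday, or None."""
    span = data.get(holiday, {}).get(str(year))
    if span is None:
        return None
    return tuple(date.fromisoformat(d) for d in span)

print(gregorian("Hanukkah", 2018))  # December 2, 2018 through December 10, 2018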

Thanks, that sounds like a plan, or I think I understand that it sounds like a plan. I know hebcal has a JSON here: [2] Let me know if you need anything from my end. I am sure we're going to need spelling equalization, etc. Sir Joseph (talk) 13:28, 23 August 2018 (UTC)

This is a really good idea. Just make sure that someone checks which "day" we tend to mark. Jewish festivals begin in the evening. I think we usually note the occurrence on the 'main day', not the day the festival begins, but there may be exceptions. --Dweller (talk) Become old fashioned! 13:51, 23 August 2018 (UTC)

This would also be a good approach to take for other moveable holidays, like Easter and its related holidays (Fat Tuesday, Ash Wednesday, etc), and ones based on the Chinese calendar as well. howcheng {chat} 16:02, 23 August 2018 (UTC)

We should create a page with a table similar to this: User:Sir_Joseph/sandbox and that can be populated by the bot. Sir Joseph (talk) 16:11, 23 August 2018 (UTC)

New template is created. Follow-up at Template_talk:Calendar_date#Template. -- GreenC 19:02, 24 August 2018 (UTC)

Cleanup needed in Category:Featured articles needing translation from <language>

Hello! The other day I was looking through Category:Featured articles needing translation from Chinese Wikipedia. This is a hidden category containing articles that have featured status on the Chinese Wikipedia. This is nifty because now we know we have a high quality article in Chinese that can be translated into English.

However, I noticed that a lot of these articles have had their featured status removed long ago (e.g. glove puppetry). Then I looked into other subcategories of Category:Featured articles needing translation from foreign-language Wikipedias and found similar problems. For example, Nationalist Party of Castile and León is in the category Category:Featured articles needing translation from Spanish Wikipedia, even though the corresponding article in Spanish appears to have never been featured.

This looks like a problem a bot could fix without much trouble. Just go through each subcategory of Category:Featured articles needing translation from foreign-language Wikipedias, then for each article in that subcategory verify that it is indeed featured on the corresponding language's edition of Wikipedia; otherwise, remove it from the category.

While we're at it, we could populate these categories by going through the list of featured articles in each language and adding each one to the appropriate category. This might be desirable because these categories are a bit sparse and many don't seem to be actively maintained. However this might be too big of a change to automate without consensus from the Wikipedia:WikiProject Intertranswiki or elsewhere.

I am relatively new to Wikipedia so sorry if this request is inappropriate! Flurmbo (talk) 21:45, 23 August 2018 (UTC)
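A sketch (Python) of the verification step: ask the target wiki's API for an article's categories and look for that wiki's featured-articles category. Category names differ per language, so the bot would need a small hand-maintained table; the name is passed in here rather than guessed:

import requests

def is_featured(lang, title, featured_category):
    """Check whether title on <lang>.wikipedia sits in its featured-articles category."""
    api = "https://%s.wikipedia.org/w/api.php" % lang
    resp = requests.get(api, params={
        "action": "query", "format": "json", "prop": "categories",
        "cllimit": "max", "titles": title}, timeout=30).json()
    page = next(iter(resp["query"]["pages"].values()))
    cats = {c["title"] for c in page.get("categories", [])}
    return featured_category in cats

Articles failing the check would be dropped from the corresponding "Featured articles needing translation from ..." category.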

Doing... - @Flurmbo: - I think this is a great idea. I have started a discussion at the following projects: Cleanup, Languages, and Categories. If this gains consensus I will be willing to complete the request. Kadane (talk) 18:32, 8 September 2018 (UTC)
Also asked for comments at Wikiproject Intertranswiki. Kadane (talk) 18:40, 8 September 2018 (UTC)

Bot to assist in uploading/linking spoken word audio files

I have an idea for a bot (well a piece of software actually) to assist in uploading/linking spoken word audio files.

In particular it would help with article title uploads, e.g. short recordings, like "Scott Morrison", to be used in the Wikipedia article.

Workflow:

  1. Display to the user an article title. Eg a random page or member of a category.
  2. Record the user saying the title.
  3. Name the recording (with the article title, language, accent)
  4. Upload the recording to Commons (in ogg format).
  5. If no other recording exists for the article, link the recording from a article page (perhaps a subpage like Scott Morrison/spoken - we can work out the details later).
  6. Repeat step one.

If you need clarification let me know.

Does this software already exist, perhaps in other Wikimedia projects?

Potentially I could make the tool myself, but it would take a loooong time. If no one can assist could you point me in the right direction - I was thinking of using Python as the pywikibot would help with some of the tasks.

Thanks for your time, Commander Keane (talk) 08:22, 31 August 2018 (UTC)

  • @Commander Keane: For clarification: you intend that such recordings only contain the title of the article, not a readout of the whole article, right?
That sounds feasible, but I am not sure it is easy (technically) to get the recording via browser. It is probably easier to have one user/client-side tool that (1) picks a list of article titles, (2) records the user saying them, (3) stores the whole thing with relevant metadata on the user's disk; and then to have another tool, either web/Wikipedia-based or client-side again, that (a) reads a directory on the user's disk, (b) uploads ogg files to Commons according to metadata or file naming conventions, (c) put relevant tags and links on article talk pages on en-wp. The first tool could be configured once to save user metadata (accent, preferred saving format, etc.) in every file. The second tool can also be made compatible with existing recordings, e.g. if someone names their files [article title]-title.ogg simple string matching can catch it, and made to upload more than title readouts (think [article title]-title.ogg, [article title]-lead.ogg and [article title]-fullarticle.ogg).
I have zero experience with audio file programming, but I can give a go at the first tool in Python, if nothing similar pops up. For the second tool, you will probably need some consensus first; since it uploads files and potentially writes to talk pages etc., it must go through approval processes (WP:BRFA on en-wp, I dunno what on Commons). There are also security issues, I think (we do not want a vandal to use the tool to upload a.ogg, b.ogg, ... z.ogg, aa.ogg, ... zzzzzz.ogg to Commons). TigraanClick here to contact me 09:15, 31 August 2018 (UTC)
No surprise - it's c:COM:BRFA on commons Ronhjones  (Talk) 19:57, 31 August 2018 (UTC)

Update GA section of Template:WPVG announcements

See my post at WT:Lua#Can Lua be used to parse a section of one page, change the contents and be transcluded on another page? for more details. Basically, the bot would take information from WP:GAN#VG, reformat it, and add it to a new page that can be transcluded into {{WPVG announcements}}, which is currently updated manually. @GamerPro64, TarkusAB, TheJoebro64, ProtoDrake, and Lee Vilenski: You are currently the main users doing those updates; do you think there is a problem with having a bot handle this in future? Regards SoWhy 10:02, 11 September 2018 (UTC)

@SoWhy: Are you requesting a bot to update the entire template, or just the WP:GAN section? If just one section, why not have a bot update the entire thing? Kadane (talk) 14:45, 11 September 2018 (UTC)
The GAN part is the one that sees the most changes. It would probably make sense to update the other parts of the template via bot as well but I don't really know how they are updated at the moment. I'll cross-post this to WT:VG for more input. Regards SoWhy 15:12, 11 September 2018 (UTC)
I think I suggested this a while back but usually the procedure is to get local consensus first before requesting that someone write a bot. Could pull from WP:VG/AA too. Also, a bunch of the WPVG regulars like updating lists manually, fwiw. (not watching, please {{ping}}) czar 01:51, 15 September 2018 (UTC)

Bot to fix capitalization of "Senator" in specific contexts

I'd like to request bot help to downcase "Senator" to "senator" (and the plural) in specific titles and links to those titles, as follows:

This came up at Talk:Dan_Sullivan_(American_senator)#Requested_move_8_September_2018 and I've started an RFC to see if there's any objection, at Wikipedia:Village_pump_(policy)#RFC:_Capitalization_of_Senator. So far, nobody is claiming that senator or senators in these contexts is part of a proper name. Dicklyon (talk) 03:45, 15 September 2018 (UTC)

The problem is that the capitalization of “senator” depends on context, and bots are notoriously bad at determining context. This is something that should be done manually. Blueboar (talk) 11:40, 15 September 2018 (UTC)
If you look at the bot proposal above, you can see I'm only proposing very narrow specific contexts where the bot can easily get it right; 250 specific moves, and the links to them (not messing with any piped text of course). The rest would need to be done by hand, as you note. Dicklyon (talk) 16:13, 15 September 2018 (UTC)
There should be no need to "fix all links to" any redirects that may be created by a page move; this is WP:NOTBROKEN. --Redrose64 🌹 (talk) 21:01, 15 September 2018 (UTC)
Presumably it would make sense to convert United States Senator from Alaska -> United States senator from Alaska because of how those words display on the page. If it's a piped link it wouldn't make sense to convert as you say it's not broken. -- GreenC 21:14, 15 September 2018 (UTC)
Right, if it's a piped link, someone made an explicit choice about how it should appear, and we shouldn't have a bot mess with that. It's optional, unnecessary but harmless, to fix the part before the pipe, however. Dicklyon (talk) 03:37, 23 September 2018 (UTC)
  • Oppose for multiple different reasons. First, I'm not convinced that "United States Senator" is incorrect; though I agree that "American Senator" is incorrect. Pages such as List of United States Representatives from Nebraska would be equally wrong. For categories, any change should encompass all the subcategories of Category:United States Senators. Perhaps a "Members of the United States Senate" formulation would be better. There's also no point to use a bot to update links to redirects; several of these formulations are already redirects and all of them should remain as {{R from other capitalisation}} forever. Finally, a proposal to move articles should be a WP:RM; you can probably bundle all 50 states into a single proposal if the page titles are otherwise the same. power~enwiki (π, ν) 03:45, 16 September 2018 (UTC)
Can you explain your objections better? How can "United States Senator" be correct in the contexts mentioned? And are you objecting because I didn't go further and correct other over-capitalization at the same time, like Representative? I'd be happy to add that on, but there's no need to do everything at once. As for links to redirects, there is absolutely a point. The redirect links usually appear in article text, over-capitalized; downcasing them corrects this very common style error in articles. Perhaps you didn't understand what corrections I meant; sorry if I was unclear. And yes, I can easily generate the multiple-RM requests, but that seems like the wrong approach for such obviously uncontroversial corrections that have been discussed elsewhere; and even if the moves got done it would leave a ton of cleanup work for someone, where a bot would be a huge help. Dicklyon (talk) 03:23, 17 September 2018 (UTC)
    My concern is that in "Alabama senator", the word "Alabama" describes the person. In "United States Senator", "United States" describes the legislative body, not the person. Also, (U.S. senator) simply looks wrong to me. I doubt that consensus will agree with me on this point, so I'm not going to argue it in detail here. power~enwiki (π, ν) 04:05, 17 September 2018 (UTC)
    I don't see the difference. There's an Alabama Senate (I presume) and a United States Senate. But senator is a title whether it's an Alabama senator or U.S. senator, and doesn't need a cap except when attached to an individual's name as MOS:JOBTITLES explains. Dicklyon (talk) 02:50, 18 September 2018 (UTC)
    @Power~enwiki: Let us know if you're still concerned. Let's not worry about your "simply looks wrong to me", but rather focus on Wikipedia guidelines such as MOS:CAPS and MOS:JOBTITLES. Dicklyon (talk) 03:37, 23 September 2018 (UTC)
    Oh, you're only referring to updating un-piped links in articles? I was thrown off by the use of the word "move". I guess that's fine, though I don't see the point of changing template names (and categories should go to CfD regardless; they already have bots for that). It's probably possible to do that with AWB fairly easily; I've never used that so you'll have to ask someone else. power~enwiki (π, ν) 04:05, 17 September 2018 (UTC)
    OK, we can separately do the Categories at CfD when this is all settled. Dicklyon (talk) 02:50, 18 September 2018 (UTC)

Revised plan: downcase Senator, Senators, Representative, Representatives, in these contexts, when they exist (most of the added ones don't, but we can try):

And templates:

The objection above by power~enwiki seems to have gone away, as he has not responded to pings about whether his concerns have been answered. Dicklyon (talk) 14:26, 26 September 2018 (UTC)
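If it helps the eventual operator, here is a minimal sketch of the narrow replacement proposed above, with an illustrative two-entry move list standing in for the full set; only whole, unpiped wikilinks whose target matches a listed title exactly are touched, so piped display text is never altered:

  import re

  # Illustrative entries; the real run would load the full list of titles.
  MOVES = {
      "United States Senator from Alaska":
          "United States senator from Alaska",
      "List of United States Senators from Alaska":
          "List of United States senators from Alaska",
  }

  def downcase_links(wikitext):
      def repl(m):
          target = m.group(1).strip()
          if target in MOVES:
              return "[[%s]]" % MOVES[target]
          return m.group(0)
      # [^|\[\]]+ cannot contain "|", so piped links never match.
      return re.sub(r"\[\[([^|\[\]]+)\]\]", repl, wikitext)

  print(downcase_links("He was a [[United States Senator from Alaska]]."))
  # -> He was a [[United States senator from Alaska]].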

I'm filing a request for a bot to empty the category Category:Episode lists with unformatted air dates. This category tracks transclusions of {{Episode list}} that list plain-text dates under |OriginalAirDate= instead of using {{Start date}}, which is required because it emits microformats along with the dates.

If this request is approved, the following find-replaces (find first line, replace with second line) are required. After the first two regular-expression replacements (one for MDY and one for DMY, replacing |OriginalAirDate=November 7, 2018 with |OriginalAirDate={{Start date|2018|November|7}}), the month names need to be replaced with their respective month numbers (i.e. |November| with |11|), which can just be done through standard find-replaces as shown in the collapsed section. (There is another way that's more detailed/complicated but with fewer steps (only three instead of twelve), but I thought it'd be best to go with the more straightforward/simpler way to prevent any confusion. Or if the bot owner has another way, even better; one sketch of an alternative is included below.)

(\|\s*OriginalAirDate\s*=\s*)([A-Za-z]+) (\d{1,2}), (\d{4})
$1{{Start date|$4|$2|$3}}
(\|\s*OriginalAirDate\s*=\s*)(\d{1,2}) ([A-Za-z]+) (\d{4})
$1{{Start date|$4|$3|$2|df=y}}
Month-number replacements
|January|
|1|
|February|
|2|
|March|
|3|
|April|
|4|
|May|
|5|
|June|
|6|
|July|
|7|
|August|
|8|
|September|
|9|
|October|
|10|
|November|
|11|
|December|
|12|

Thanks. -- AlexTW 14:47, 6 November 2018 (UTC)
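Since the request above invites alternatives, here is a minimal single-pass sketch that folds the twelve month-number replacements into the two regular-expression substitutions via a lookup table; it mirrors the patterns given above rather than any approved bot code:

  import re

  MONTHS = {m: str(i) for i, m in enumerate(
      ["January", "February", "March", "April", "May", "June", "July",
       "August", "September", "October", "November", "December"], start=1)}
  MONTH_ALT = "|".join(MONTHS)

  def fix_air_dates(wikitext):
      def mdy(m):  # |OriginalAirDate=November 7, 2018
          return "%s{{Start date|%s|%s|%s}}" % (
              m.group(1), m.group(4), MONTHS[m.group(2)], m.group(3))
      def dmy(m):  # |OriginalAirDate=7 November 2018
          return "%s{{Start date|%s|%s|%s|df=y}}" % (
              m.group(1), m.group(4), MONTHS[m.group(3)], m.group(2))
      wikitext = re.sub(
          r"(\|\s*OriginalAirDate\s*=\s*)(%s) (\d{1,2}), (\d{4})" % MONTH_ALT,
          mdy, wikitext)
      return re.sub(
          r"(\|\s*OriginalAirDate\s*=\s*)(\d{1,2}) (%s) (\d{4})" % MONTH_ALT,
          dmy, wikitext)

  print(fix_air_dates("|OriginalAirDate=November 7, 2018"))
  # -> |OriginalAirDate={{Start date|2018|11|7}}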

Could someone confirm if I'd be able to just do this through AWB? -- AlexTW 23:22, 12 November 2018 (UTC)
Done through AWB due to lack of response. -- AlexTW 08:18, 19 November 2018 (UTC)

Joe's Null Bot task 5 clone

Further to Template talk:Db-meta#Template:Db-c1 and Wikipedia:Bots/Noticeboard#Checking for bot activity, is anybody able to assume the duties of Joe's Null Bot (talk · contribs)? I don't know if all tasks are stopped, or just Task 5. --Redrose64 🌹 (talk) 23:20, 9 November 2018 (UTC)

On the Joe's Null Bot page it says "Migrated 27 December 2016 to run on Tool Labs rather than my own server". Logging in there and searching for joe or null I found a project called "nullbot" and it is indeed Joe's Null Bot. It contains:
Extended content
tools.farotbot@tools-bastion-03:/data/project/nullbot$ ls -ltr
total 91404
drwxrwsr-x 2 root          tools.nullbot     4096 Dec 27  2016 logs
-r-------- 1 tools.nullbot tools.nullbot       52 Dec 27  2016 replica.my.cnf
drwxr-sr-x 2 tools.nullbot tools.nullbot     4096 Dec 29  2016 old
-rw-r--r-- 1 tools.nullbot tools.nullbot     2244 May  4  2017 savedcopy
drwxr-sr-x 2 tools.nullbot tools.nullbot     4096 Oct  8  2017 oldlogs
-rw-rw---- 1 tools.nullbot tools.nullbot      110 Oct  9  2017 cron-tools.nullbot-2.err
drwxr-sr-x 3 tools.nullbot tools.nullbot     4096 Oct 18  2017 pynull
-rw-rw---- 1 tools.nullbot tools.nullbot     1508 Dec 20  2017 cron-tools.nullbot.20.out
drwxr-sr-x 2 tools.nullbot tools.nullbot     4096 Dec 21  2017 nullbot
-rw-rw---- 1 tools.nullbot tools.nullbot      385 Mar 10  2018 cron-tools.nullbot-12.err
-rw-rw---- 1 tools.nullbot tools.nullbot      420 Jun 30 00:22 cron-tools.nullbot-11.err
-rw-rw---- 1 tools.nullbot tools.nullbot      912 Jul 25 18:13 cron-tools.nullbot-13.err
-rw-rw---- 1 tools.nullbot tools.nullbot     1018 Oct  4 17:10 cron-tools.nullbot-10.err
-rw-rw---- 1 tools.nullbot tools.nullbot     1852 Oct 12 05:42 cron-tools.nullbot-5.err
-rw-rw---- 1 tools.nullbot tools.nullbot      885 Oct 21 09:13 cron-tools.nullbot-9.err
-rw-rw---- 1 tools.nullbot tools.nullbot     3852 Oct 25 04:08 cron-tools.nullbot-4.err
-rw-rw---- 1 tools.nullbot tools.nullbot     1470 Oct 30 06:15 cron-tools.nullbot-30.err
-rw-rw---- 1 tools.nullbot tools.nullbot     5120 Nov  3 09:01 cron-tools.nullbot-8.err
-rw-rw---- 1 tools.nullbot tools.nullbot 24586391 Nov  9 05:03 cron-tools.nullbot-4.out
-rw-rw---- 1 tools.nullbot tools.nullbot  2950972 Nov  9 05:38 cron-tools.nullbot-5.out
-rw-rw---- 1 tools.nullbot tools.nullbot    36892 Nov  9 07:21 cron-tools.nullbot-6.out
-rw-rw---- 1 tools.nullbot tools.nullbot      166 Nov  9 07:21 cron-tools.nullbot-6.err
-rw-rw---- 1 tools.nullbot tools.nullbot 26290814 Nov  9 08:39 cron-tools.nullbot-8.out
-rw-rw---- 1 tools.nullbot tools.nullbot  1944205 Nov  9 09:24 cron-tools.nullbot-9.out
-rw-rw---- 1 tools.nullbot tools.nullbot  1191809 Nov  9 12:14 cron-tools.nullbot-12.out
-rw-rw---- 1 tools.nullbot tools.nullbot 36224638 Nov  9 21:18 cron-tools.nullbot.20.err
-rw-rw---- 1 tools.nullbot tools.nullbot     6044 Nov 10 00:15 cron-tools.nullbot-11.out
-rw-rw---- 1 tools.nullbot tools.nullbot     8109 Nov 10 02:02 cron-tools.nullbot-2.out
-rw-rw---- 1 tools.nullbot tools.nullbot    66422 Nov 10 03:02 cron-tools.nullbot-30.out
-rw-rw---- 1 tools.nullbot tools.nullbot    70120 Nov 10 03:10 cron-tools.nullbot-10.out
-rw-rw---- 1 tools.nullbot tools.nullbot    74470 Nov 10 03:13 cron-tools.nullbot-13.out
The cron log files are protected so can't be opened. It shows the bot is running (Nov 10) with a large err file (cron-tools.nullbot.20.err) but also crons that are running error-free. A 'last nullbot' shows Joe has not logged in since Nov 1, when the logs begin. The Perl files were last updated in 2017. Joe's user page says "My job occasionally leaves me out of communication for as much as a couple weeks as a time." -- GreenC 04:08, 10 November 2018 (UTC)
What do you get for a ps -lu tools.nullbot shell command? How about tail -n 10 cron-tools.nullbot-5.out --Redrose64 🌹 (talk) 08:53, 10 November 2018 (UTC)
As GreenC says, the files are read-protected, so one gets "tail: cannot open ‘cron-tools.nullbot-5.out’ for reading: Permission denied". Processes are run in the job grid, so ps -lu can't be used; one can access information about jobs as described here, but since the jobs are hourly, no job was being run when I checked (assuming the job name is "nullbot"). Galobtter (pingó mió) 09:26, 10 November 2018 (UTC)
The job is running according to toolforge:grid-jobs/tool/nullbot. The timestamp on this task's error file is newer, so there could be a problem with it. — JJMC89(T·C) 07:12, 11 November 2018 (UTC)
Sorry I've been off-line, a combination of health (temporary, I assure you), life and moving-related changes. Due to an unrelated issue that JJMC89 noticed, the bot will be offline for a day or two, but the problem with task 5 is simple, there's a sanity check for the size of the category involved, and it has been exceeded. I have already made a fix for it, when the bot comes back up, that task should be back up. --joe deckertalk 07:53, 11 November 2018 (UTC)

These 35 wikilinks should link directly to Regina (Bosnia and Herzegovina band), so that the redirect Regina (band) can be redirected to the disambiguation page Regina, because there's also another band with the same name: Regina (Finnish band). 91.158.232.8 (talk) 01:03, 13 November 2018 (UTC)

Y Done @91.158.232.8: Please spot-check the edits (if you wish) via this history section; you can look for pages whose edit summary is "Changed link for Regina band". Also, one link remains that points to the disambiguation page; please remove that. Adithyak1997 (talk) 14:01, 14 November 2018 (UTC)
Thank you! 91.158.232.8 (talk) 21:17, 15 November 2018 (UTC)

Bot to update 'Needs infobox'

So I was browsing the WP Backlog and came across a number of categories regarding articles needing Infoboxes. For example, Category:Baseball articles needing infoboxes and Category:BBC articles without infoboxes. Did a little clicking around and found that a number of these actually already have had Infoboxes added but the parameter on the talk page was never updated to reflect this. So, my thought was to create a bot that would do the following:

  1. Take a list of categories that are of the basic format <article type> articles needing infobox
  2. Routinely (weekly?) checks those pages for an infobox.
  3. If the article contains an infobox, removes the |needs-infobox=yes parameter from the talk page.

I'm happy to tackle creating this bot myself but wanted to discuss it first and see what others thought? --Zackmann08 (Talk to me/What I been doing) 23:59, 22 September 2018 (UTC)
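A minimal pywikibot sketch of the three steps above, assuming a naive substring test is an acceptable way to detect an infobox (templates such as {{Taxobox}} that don't contain the word "infobox" would need extra handling), using one of the category names as an example:

  import re
  import pywikibot

  site = pywikibot.Site("en", "wikipedia")
  cat = pywikibot.Category(site, "Category:Baseball articles needing infoboxes")

  for talk in cat.articles():          # the category holds talk pages
      article = talk.toggleTalkPage()
      if not article.exists() or article.isRedirectPage():
          continue
      if "{{infobox" in article.text.lower():
          new_text = re.sub(r"\|\s*needs-infobox\s*=\s*y(?:es)?\b", "", talk.text)
          if new_text != talk.text:
              talk.text = new_text
              talk.save(summary="Article already has an infobox; removing request")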

Sounds like a great idea. Category:Wikipedia articles with an infobox request definitely looks in need of bot help. -- GreenC 00:39, 23 September 2018 (UTC)
Sounds OK to me. Suggest starting at Category:Wikipedia backlog, getting the list and picking out the "needing infoboxes", "without infoboxes" and "needing an infobox" categories (nothing like consistency...). Come back if you get stuck writing it. Ronhjones  (Talk) 19:47, 23 September 2018 (UTC)
@Ronhjones and GreenC: Wikipedia:Bots/Requests for approval/ZackBot 10 --Zackmann (Talk to me/What I been doing) 20:45, 28 September 2018 (UTC)

[r] → [ɾ] in IPA for Italian

In Italian the letter ⟨r⟩ – usually trilled as [r] – is systematically realised as a flap [ɾ] in certain positions, which I don't think is currently in our IPA entries. Specifically it occurs in any unstressed syllable with a vowel on either side,[1] and in the onset of a mid-word syllable with secondary stress.[2]

Formally: [r] → [ɾ] / [ˈVːɾV, (V/C)(ˌ) ɾV-, Vɾ-, -ɾ(ˈ)C-]

References

  1. ^ Which for this case should just be any unstressed syllable, I think.
  2. ^ Romano, Antonio. "A preliminary contribution to the study of phonetic variation of /r/ in Italian and Italo-Romance." Rhotics. New data and perspectives (Proc. of 'r-atics-3), Libera Università di Bolzano (2011): 213–214.

ReconditeRodent « talk · contribs » 12:50, 7 October 2018 (UTC)

This looks similar to the script made for Wikipedia:Bot_requests#[r]_→_[ɾ]_in_IPA_for_Spanish, which might be applicable with some tweaks. But it would need a discussion somewhere, and an explanation of what to do, as I am not familiar with lexicographic/linguistic terminology (trilled, flap, unstressed syllable, onset of a mid-word syllable with secondary stress). Would also need to develop a search formula like this. @Nardog: -- GreenC 13:10, 7 October 2018 (UTC)

Whoops, I summarised Canepari's analysis, which is slightly at odds with the actual conclusion of the review. What I said earlier may be how an Italian speaker hears it, but for our purposes I suppose that means the only confirmed standard realisation of [ɾ] is in the unstressed intervocalic position. So whenever there's an 'r' between any two vowels ('iueoɛɔa'), it should be replaced with an 'ɾ'. There may also be a syllable break '.' and/or a vowel-lengthening mark (':' or 'ː') between the 'r' and the previous vowel. This should narrow the search down to unstressed syllables automatically, as the stress mark (' or ˈ) would get between the 'r' and the previous vowel.
I tried to make a search expression which seems to work but I wouldn't mind someone else checking my logic. ─ ReconditeRodent « talk · contribs » 14:39, 7 October 2018 (UTC)
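To make the logic concrete, here is a quick sketch of the per-string substitution as described, allowing an optional length mark and/or syllable break between the preceding vowel and the 'r'; stress marks are deliberately absent from the allowed separators, so they block the match as intended (the search expression above would still be what restricts this to {{IPA-it}} transclusions):

  import re

  VOWELS = "iueoɛɔa"
  pattern = re.compile(r"(?<=[%s])([ː:]?\.?)r(?=[%s])" % (VOWELS, VOWELS))

  def flap(ipa):
      return pattern.sub(r"\1ɾ", ipa)

  print(flap("ˈmaːre"))    # -> ˈmaːɾe
  print(flap("aˈrɔːma"))   # unchanged: the stress mark blocks the match
  print(flap("arˈriːvo"))  # unchanged: geminate rr is not simple intervocalic r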
44 results. Normally that is too small for a bot, but since the code is already done apart from a few tweaks, and there are so few cases, I can run it if you want. -- GreenC 14:53, 7 October 2018 (UTC)
Oh a lot more than 44. Something wrong with my offline script gave a wrong number. -- GreenC 15:32, 7 October 2018 (UTC)
Yeah. I just realised this isn't Wiktionary. Anyway, I messed up the last search expression by putting in the stress marks so use this instead. Thanks. ─ ReconditeRodent « talk · contribs » 15:37, 7 October 2018 (UTC)
502. That's botable, but not so many as to require a bot request. Can we apply the same consensus from the Spanish to the Italian, or do you foresee anyone objecting? -- GreenC 16:11, 7 October 2018 (UTC)
Well there was a bit of debate about it back in 2009 at Talk:Italian_phonology#flap_vs_trill, but the source I gave is solid, and I doubt anyone would object. For the record, this independent blog post was what put me on to this in the first place. ─ ReconditeRodent « talk · contribs » 16:37, 7 October 2018 (UTC)

You couldn't give me a hint on how to write a search expression for Wiktionary, could you? The only difference is that instead of an IPA-it template you have an IPA template with "lang=it" somewhere inside it. ─ ReconditeRodent « talk · contribs » 17:03, 7 October 2018 (UTC)

This maybe, it's a technique I've never used before. Will wait a day or two for comment on the bot. -- GreenC 21:02, 7 October 2018 (UTC)

There is no consensus for this. Please discuss it first at Help talk:IPA/Italian before changing existing transcriptions. Nardog (talk) 02:12, 8 October 2018 (UTC)

We are currently discussing this at Talk:Italian phonology#flap vs trill. イヴァンスクルージ九十八(会話) 06:38, 8 October 2018 (UTC)

Make List Request

Would someone be able to make a list of articles for me? The list would be derived from the following search:

The purpose of this task is that I am looking into the development of a possible workaround for the currently malfunctioning User:WP 1.0 Bot; see here for an overview of the issue: Wikipedia talk:Version 1.0 Editorial Team/Index. Just an idea I want to explore at this point, and I need the list to get an idea of the affected pages. Cheers, « Gonzo fan2007 (talk) @ 16:45, 4 December 2018 (UTC)

@Gonzo fan2007:  Done with AWB - took user contribs for User:WP 1.0 Bot, sorted and removed duplicates. Then saved list as text file, and a quick manual edit of the non Wikipedia:Version 1.0 Editorial Team/ stuff. Ronhjones  (Talk) 04:24, 6 December 2018 (UTC)
Scrub that - the plan was good - must have hit a download limit... Watch this space... Ronhjones  (Talk) 04:30, 6 December 2018 (UTC)
@Gonzo fan2007: Now really done :-) Alternative plan: use intitle:"Version 1.0 Editorial Team/" intitle:"by quality log" in wikisearch. 2437 pages found. Ronhjones  (Talk) 04:52, 6 December 2018 (UTC)
Thank you Ronhjones! Appreciate the assistance. « Gonzo fan2007 (talk) @ 13:20, 6 December 2018 (UTC)

Request for a list of article-space pages with the most Linter errors

[I was sent here from Wikipedia talk:Linter.]

While working on Linter table tag errors, I stumbled across Greek football clubs in European competitions, which had over 1,000 Linter errors, mostly missing end tags (crazy diff here!). I was able to fix them with a series of find-and-replace operations, so it wasn't too bad. Is there a way to find pages with many Linter errors? We could reduce our total count more quickly if we could knock out some of the worst offenders.

This is a request for a list of 1,000 pages with the most errors in article space, or a list/table of all article-space pages with 20 or more errors (and how many of each type of error exist on each page). Does anyone know how to create such a list? The list will need to be recreated periodically as gnomes work through it, fixing errors. Thanks. – Jonesey95 (talk) 11:57, 4 October 2018 (UTC)

I don't know what's involved, but wonder if User:Firefly would be interested in adding this to Firefly Tools? -- GreenC 13:06, 4 October 2018 (UTC)
Jonesey95 Done! I realised this could be done through Quarry, and after some fiddling around got Quarry:query/30386 working. I made it into a table at User:Galobtter/Articles_by_Lint_Errors. Galobtter (pingó mió) 19:09, 14 October 2018 (UTC)
Excellent! Thanks Galobtter. Is this something that could be updated periodically, like once a week? I put it on my watchlist and will be working on the articles. – Jonesey95 (talk) 04:42, 15 October 2018 (UTC)
Jonesey95, Yes; I can manually run the query every week and that'll only take a few minutes' work, but I'll also see if I can get User:Galobot on it :) Galobtter (pingó mió) 04:49, 15 October 2018 (UTC)
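For anyone wanting to regenerate such a list without Quarry access, here is a rough sketch using what I believe is the Linter extension's list=linterrors API module; the parameter names are from memory and should be double-checked against the API documentation:

  from collections import Counter
  import requests

  API = "https://en.wikipedia.org/w/api.php"

  def count_lint_errors(category, namespace=0):
      """Tally lint errors per page title for one lint category."""
      counts = Counter()
      params = {
          "action": "query",
          "list": "linterrors",     # module/parameter names from memory
          "lntcategories": category,
          "lntnamespace": namespace,
          "lntlimit": "500",
          "format": "json",
      }
      while True:
          data = requests.get(API, params=params).json()
          for err in data["query"]["linterrors"]:
              counts[err["title"]] += 1
          if "continue" not in data:
              break
          params.update(data["continue"])
      return counts

  for title, n in count_lint_errors("missing-end-tag").most_common(20):
      print(n, title)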

Mass move of election articles

The naming guideline for election articles was recently amended as a result of an RfC to move the year from the end to the start of the article title. As part of the proposal, I stated that if successful, I would request a bot run to move the thousands of articles affected.

As the RfC was closed in favour of the change, we now need a bot run to move the articles. I have prepared this offline in an Excel file and can also provide it in a txt file to the bot owner by email. If it needs to be on-wiki, I can create a few pages in my sandbox with a full list of the proposed moves. Cheers, Number 57 21:52, 15 October 2018 (UTC)

@Number 57: It's been a while (couple months since I've done bot work), but I'm interested. Just need a bit for some API research and some time to wrap my head around this. Also need to figure out how many articles would be affected and how to find them. --TheSandDoctor Talk 23:03, 15 October 2018 (UTC)
Found them, just need to figure out the "rules" for the bot to rename them by. --TheSandDoctor Talk 23:05, 15 October 2018 (UTC)
Prototyping looks good so far. Email sent as well Number 57. --TheSandDoctor Talk 03:41, 16 October 2018 (UTC)
@TheSandDoctor: No need to find the articles – I've prepared a list of articles that should be changed and the new names (sorry, perhaps should have been clearer above about what I'd prepared offline). Sent by email. Cheers, Number 57 07:25, 16 October 2018 (UTC)
BRFA filed --TheSandDoctor Talk 18:13, 17 October 2018 (UTC)
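For reference, a minimal pywikibot sketch of the mechanics, assuming the prepared list is exported to a two-column tab-separated file of old and new titles; the code actually submitted at the BRFA may of course differ:

  import csv
  import pywikibot

  site = pywikibot.Site("en", "wikipedia")

  with open("election_moves.tsv", encoding="utf-8") as f:
      for row in csv.reader(f, delimiter="\t"):
          old_title, new_title = row[0], row[1]
          page = pywikibot.Page(site, old_title)
          # Skip anything that needs human attention.
          if not page.exists() or pywikibot.Page(site, new_title).exists():
              continue
          page.move(new_title,
                    reason="Moving per RfC on election article titles",
                    noredirect=False)  # keep the old title as a redirect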

I created the new category Category:Lists of location userboxes to better organize userboxes. I would like to change (and add where it's missing) all entries of

[[Category:Lists of userboxes|.*]] 

to

[[Category:Lists of location userboxes|{{subst:SUBPAGENAME}}]]

in all pages of following PrefixIndex searches:

  1. Special:PrefixIndex/Wikipedia:Userboxes/Location
  2. Special:PrefixIndex/Wikipedia:Userboxes/Travel
  3. Special:PrefixIndex/Wikipedia:Userboxes/Life/Citizenship
  4. Special:PrefixIndex/Wikipedia:Userboxes/Life/Origin
  5. Special:PrefixIndex/Wikipedia:Userboxes/Life/Residence

—⁠andrybak (talk) 08:53, 21 October 2018 (UTC)
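A minimal sketch of one way to script this, resolving {{subst:SUBPAGENAME}} directly from each page's title rather than relying on the parser to substitute it; the regex assumes the old category link appears at most once per page:

  import re
  import pywikibot

  site = pywikibot.Site("en", "wikipedia")
  PREFIXES = ["Userboxes/Location", "Userboxes/Travel",
              "Userboxes/Life/Citizenship", "Userboxes/Life/Origin",
              "Userboxes/Life/Residence"]
  OLD_CAT = re.compile(r"\[\[Category:Lists of userboxes(\|[^\]]*)?\]\]")

  for prefix in PREFIXES:
      for page in site.allpages(prefix=prefix, namespace=4):  # Wikipedia: ns
          subpage = page.title().rsplit("/", 1)[-1]
          new_cat = "[[Category:Lists of location userboxes|%s]]" % subpage
          if OLD_CAT.search(page.text):
              page.text = OLD_CAT.sub(new_cat, page.text)
          else:
              page.text = page.text.rstrip() + "\n" + new_cat + "\n"
          page.save(summary="Recategorising into Category:Lists of location userboxes")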

Andrybak - Done. Since there were not many pages in your request (~400), I went ahead and ran the pages through AWB using my main account. You can review all the changes using [this link]. Regards. — fr+ 17:48, 22 October 2018 (UTC)
Thanks, f. Marking it as Y Done. —⁠andrybak (talk) 17:40, 24 October 2018 (UTC)

Number Bot

I have an idea to create a bot for writing numbers in the correct English form. For example, in the statement “There were 8 people on the beach”, the bot would change the sentence to the correct form, “There were eight people on the beach.” The bot could also change something such as “The cost was twenty cents” to “The cost was 20c.” The bot could follow the common rule where numbers less than 13 are written out. The bot would also have a feature to detect dates so that it wouldn’t change numbers in them. I feel this qualifies as a “tedious task.” — Preceding unsigned comment added by DonutDerped (talkcontribs) 03:24, 22 November 2018 (UTC)

This really wouldn't be a good task for a bot, given how context-heavy the edits would have to be. Additionally, if the MOS doesn't explicitly state that "numbers less than 13" must be written out in words, then it's very unlikely that a bot would be approved. Primefac (talk) 03:26, 22 November 2018 (UTC)
@DonutDerped: Absolutely not. See WP:CONTEXTBOT. --Redrose64 🌹 (talk) 20:23, 22 November 2018 (UTC)
@DonutDerped: while it seems like a good idea and would be helpful, as Redrose points out, WP:CONTEXTBOT prevents this. --Zackmann (Talk to me/What I been doing) 22:08, 25 November 2018 (UTC)

linkfixes arrs.net -> arrs.run

hi!
(fyi user:GreenC)
arrs.net is now a porn site. Formerly (according to archive.org) it was the website of the "Association of Road Racing Statisticians", which has moved to https://arrs.run/.
So I guess all links should be converted. The subpage structure seems to be unchanged. -- seth (talk) 11:46, 2 December 2018 (UTC)

Doing... -- GreenC 16:00, 2 December 2018 (UTC)

Y Done Example. -- GreenC 18:40, 2 December 2018 (UTC)

Hi!
amnesty.org changed their url layout several months ago. All external links that look like

http://....amnesty.org/library/...

should get fixed. And there are many of them: special:linksearch/*.amnesty.org, special:linksearch/https://*.amnesty.org.

I used the following regexps (perl syntax) in dewiki for replacements:

  # first search pattern
  qr~https?://(?:[a-z.]+\.)?amnesty\.org/library/(?:index|print)/[a-z]{3}(?<path1>[a-z]{3}[0-9]{2})(?<num1>[0-9]{3})(?<year>[12][0-9]{3})(?:\?open&of=[a-z]{3}-[a-z]{3}|)~i;
  # first replacement
  "https://www.amnesty.org/documents/\L$+{path1}/$+{num1}/$+{year}/en";

  # second search pattern
  qr~https?://(?:[a-z.]+\.)?amnesty\.org/(?<lang>[a-z]{2})/library/(?:asset|info)/(?<path1>[A-Za-z]+[0-9]+/[0-9]{3}/[12][0-9]{3})(?:(?:/[a-z]{2}/)?(?:[0-9a-f-]+/|dom-[A-Z]+)[a-z0-9]+\.(?:html|pdf))?~;
  # second replacement
  "https://www.amnesty.org/documents/\L$+{path1}/$+{lang}";

I could use my bot (de:user:CamelBot), but it's trained for dewiki and their templates and so would need some adaptions to enwiki. I guess, there are bots here already that can do this out of the box, right? -- seth (talk) 14:33, 2 November 2018 (UTC)
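In case it helps whoever picks this up, here is a rough Python port of the first Perl pattern above; the named groups mirror seth's originals, and the \L lowercasing from the Perl replacement is reproduced with .lower():

  import re

  FIRST = re.compile(
      r"https?://(?:[a-z.]+\.)?amnesty\.org/library/(?:index|print)/"
      r"[a-z]{3}(?P<path1>[a-z]{3}[0-9]{2})(?P<num1>[0-9]{3})"
      r"(?P<year>[12][0-9]{3})(?:\?open&of=[a-z]{3}-[a-z]{3})?",
      re.IGNORECASE)

  def fix_library_url(url):
      m = FIRST.match(url)
      if not m:
          return url
      return "https://www.amnesty.org/documents/%s/%s/%s/en" % (
          m.group("path1").lower(), m.group("num1"), m.group("year"))

  print(fix_library_url(
      "http://asiapacific.amnesty.org/library/Index/ENGAFR440131994?open&of=ENG-360"))
  # -> https://www.amnesty.org/documents/afr44/013/1994/en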

Be aware of archive URLs eg. https://web.archive.org/web/19991231010101/http://amnesty.org/library/.. shouldn't get modified or it will break the archive URL. -- GreenC 15:04, 2 November 2018 (UTC)
Hi!
Just in case I wasn't clear enough: Is there any bot that is specialized in (regexp-based) link replacements, here at enwiki or even globally?
Of course, that bot should cope with archived urls and templates such as template:webarchive. -- seth (talk) 10:44, 10 November 2018 (UTC)

Not aware of a bot specialized for link replacement; there should be one. I do it frequently with awk regex, but awk doesn't use PCRE, so I'm having trouble understanding from the statements what needs to be done. For example, given this URL:

https://amnesty.org/en/library/asset/AMR25/002/1999/en/aa762f2f-e34a-11dd-a06d-790733721318/amr250021999en.html

It uses the second search pattern and would produce https://www.amnesty.org/documents/AMR25/002/1999/en - is this correct? @Lustiger seth: -- GreenC 05:58, 14 November 2018 (UTC)

I have the regex figured out, now working to update WP:WAYBACKMEDIC to undo the archives as most of them have been archived. Seems like a logical extension of its function and it has its own custom template and ref parsing libraries which can be adapted. These will work in the search box, looks like around 1500 total:

  • First search : insource:/https?:\/\/([a-z.]+\.)?amnesty\.org\/library\/(index|print)\/[a-z]{3}([a-z]{3}[0-9]{2})([0-9]{3})([12][0-9]{3})/
  • Second search: insource:/https?:\/\/([a-z.]+\.)?amnesty\.org\/([a-z]{2})\/library\/(asset|info)\/([A-Za-z]+[0-9]+\/[0-9]{3}\/[12][0-9]{3})((\/[a-z]{2}\/)?([0-9a-f-]+\/|dom-[A-Z]+)[a-z0-9]+\.(html|pdf))/

-- GreenC 18:20, 14 November 2018 (UTC)

Hi!
@GreenC: yes, this is correct. There are two different types of old urls to be transformed into one new url scheme. You mentioned an example for the second type. An example for the first type would be
http://asiapacific.amnesty.org/library/Index/ENGAFR440131994?open&of=ENG-360
-> https://www.amnesty.org/documents/AFR44/013/1994/en/ (or https://www.amnesty.org/en/documents/AFR44/013/1994/en/)
The only thing is: I'm not sure whether
https://www.amnesty.org/documents/AMR25/002/1999/en/ or
https://www.amnesty.org/en/documents/AMR25/002/1999/en/
is the better new url. The first url is just a forwarder to the second, so maybe the second is better. But I don't know for sure.
-- seth (talk) 22:05, 14 November 2018 (UTC)
@Lustiger seth:. Thanks. I've been finishing up another project and now looking at this. The URLs go to a landing page where users select a language which then goes to a PDF under a different URL. The problem is that the PDF URL is invisible to Wikipedia archive bots (IABot) so it will never be archived at Wayback Machine and could be lost to link rot. There are two solutions: set the URL on Wikipedia to the underlying PDF URL. Or have the update bot trigger a "save page now" at Wayback so the PDF URL is archived. The second solution works but only for Wayback, other archive providers scanning Wikipedia for links to save will still never see it. Nevertheless I'm leaning towards the "save page now" at Wayback option and run it on all amnesty.org links even those that don't need to update URLs, to ensure the underlying PDF links are archived. -- GreenC 16:05, 18 November 2018 (UTC)
@GreenC: I agree. The second solution is better for the users, because the landing page is more comfortable for them.
Another possibility (a variation of the first solution) would be linking to all pages. Something like "AI report [landing_page], in languages [pdf en], [pdf es], [pdf fr]". But this might be too much info for the users. -- seth (talk) 17:41, 18 November 2018 (UTC)

@Lustiger seth: Bot has successfully cleared search 1, about 64 articles, difs in User:GreenC bot. I'll wait a day or so before proceeding to search 2, which is around 1000. -- GreenC 21:27, 19 November 2018 (UTC)

@GreenC: looks good. Just a small thing: It would be great if your bot could write something more meaningful in the summary, so that its jobs can be distinguished more easily. :-) -- seth (talk) 21:45, 19 November 2018 (UTC)
Well, it's a massive bot that does many things; it's scanning for any changes, not just Amnesty (though that is mainly what it finds in this case). The edit summary links to User:GreenC/WaybackMedic 2.1, which details most of them, and I will add URL moves as #31. I also left a mini "FAQ" on the bot talk page. -- GreenC 21:53, 19 November 2018 (UTC)

@Lustiger seth: this is  Done. I also did a Save Page Now (SPN) on each of the new links and their underlying PDF links (up to 5 for each). This later part for en, de, fr, ru and zh wikis. BTW if you come across other URL moves, please ping me as my bot is ready. I'll do the same if you are interested. -- GreenC 17:31, 22 November 2018 (UTC)

Remove infobox image requests from WP templates when the article has an infobox image

This should not be too hard. Many WikiProjects like WP:VG use a cover=yes (or cover=y) switch in their WikiProject banner on an article's talk page to populate a request category (e.g. Category:Video game articles requesting identifying art). However, sometimes editors add a cover to the article and forget / don't know about the WP banner request, leaving the article in the category despite not needing an image anymore. I'd like to request a bot to check all articles in Category:Video game articles requesting identifying art and see if the infobox has an (existing) image defined. If so, the bot should remove the cover=yes (and of course ideally log the removal somewhere so we can check whether it made mistakes). That way, when trying to eliminate the backlog, editors won't have to load articles that were already fixed. This would be a manual bot, run every once in a while. Anyone feeling like coding something like that? Regards SoWhy 07:17, 9 October 2018 (UTC)

There are articles that have screenshots in the infobox - a valid existing image that will usually need to be retained and moved to another location in the prose - but where the request for a cover is a genuine one. So this request is not as straightforward as "if image present, remove cover-required flag". A better solution would be to have a bot run that creates a list of articles that have an image and a request for a cover; users can then manually clear that list first. This means that the category will then be free of articles with images and requests, eliminating the need for a regular bot run. - X201 (talk) 09:27, 9 October 2018 (UTC)
Articles with screenshots in the infobox are indeed a problem but how often will this actually be the case? Of course, the bot could instead compile a list of articles which contain an image but have cover=yes and someone can check them manually and then feed the list sans those false positives back into a bot/tool/AWB. I'd be happy to help check such a list if generated. Regards SoWhy 10:26, 9 October 2018 (UTC)
I'm generating a list now. - X201 (talk) 13:26, 9 October 2018 (UTC)
@SoWhy: Here you go. User:X201/Cover required but image present - X201 (talk) 14:28, 9 October 2018 (UTC)
@X201: Thanks! I'll go through it later and notify you when I'm done. Regards SoWhy 14:38, 9 October 2018 (UTC)

BBLd - Linkfix

https://en.wikipedia.org/w/index.php?title=Special:LinkSearch&limit=500&offset=0&target=http%3A%2F%2F%2A.bbl-digital.de

instead of

http://bbl-digital.de/eintrag/$1/
http://www.bbl-digital.de/eintrag/$1/

it should be

https://bbld.de/$1

i.e. new domain, httpS, no "eintrag/" and no final "/". 78.55.121.98 (talk) 02:33, 23 September 2018 (UTC)
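A minimal sketch of the rewrite as specified (new domain, https, no "eintrag/" segment, no trailing slash):

  import re

  def fix_bbld(url):
      return re.sub(
          r"https?://(?:www\.)?bbl-digital\.de/eintrag/([^/\s]+)/?",
          r"https://bbld.de/\1", url)

  print(fix_bbld("http://www.bbl-digital.de/eintrag/"
                 "Adlerberg-Woldemar-Eduard-Ferdinand-v.-1791-1884/"))
  # -> https://bbld.de/Adlerberg-Woldemar-Eduard-Ferdinand-v.-1791-1884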

It doesn't seem to work with this test:
http://www.bbl-digital.de/eintrag/Adlerberg-Woldemar-Eduard-Ferdinand-v.-1791-1884/
https://bbld.de/Adlerberg-Woldemar-Eduard-Ferdinand-v.-1791-1884
-- GreenC 12:49, 23 September 2018 (UTC)

I was curious whether this is a problem on huwiki. So I checked both your link and the huwiki list, tried several links, and they all worked. There is an automatic redirect on that site to the correct address without eintrag and the trailing /, so IMO it is not worth changing them unless somebody is bored. Not a big deal, though. If some of them do not work, another problem may be causing the trouble; that's probably why GreenC's test didn't work. Bináris (talk) 14:47, 27 September 2018 (UTC)

I don't understand how an external link works on huwiki but doesn't work on enwiki. -- GreenC 16:11, 27 September 2018 (UTC)
I clicked on the uppermost link in this section and tried several links randomly; all worked. You may have found one problematic. Bináris (talk) 16:18, 27 September 2018 (UTC)
Do these work? [3][4][5][6][7][8][9][10] .. for me only two of them work (near the end). They were not chosen randomly from the list, they are the first 8 (starting at line #14). -- GreenC 16:25, 27 September 2018 (UTC)
Same for me. But the problem is not with the form of the link or the redirect. These are simply bad or broken links. Bináris (talk) 16:54, 27 September 2018 (UTC)
Got it. Someone would need to manually find the correct links and update the pages. Only about 40 to check. -- GreenC 17:03, 27 September 2018 (UTC)
Green and Bináris, I have recently resolved the issue mentioned above. Please do verify through random checking. Also, I am unsure whether the archive links that InternetArchiveBot added to some of these citations need to be removed or not. For example: [August Volz]. I actually don't know whether it needs to be removed or not. Adithyak1997 (talk) 11:39, 8 November 2018 (UTC)
Adithyak1997, thanks for taking this on. I would recommend deleting |archive-url=, |archive-date= and |dead-url= if the source |url= is different and working (verified), and update |access-date=. -- GreenC 14:05, 8 November 2018 (UTC)
Green, what value do I need to update |access-date= with? Is it today's date? Adithyak1997 (talk) 15:07, 8 November 2018 (UTC)
Yep! -- GreenC 15:37, 8 November 2018 (UTC)
@Green, The archive link has been removed and |access-date has been added. Please verify whether any changes need to be made in the bot request table at the top of this page. Adithyak1997 (talk) 18:17, 8 November 2018 (UTC)
@GreenC: please see the above. Is everything in order? --TheSandDoctor Talk 17:35, 17 December 2018 (UTC)
@TheSandDoctor: No idea; I've lost track of what the problem is. Assuming good faith, it was fixed. Only about 24 links. -- GreenC 20:12, 17 December 2018 (UTC)
@GreenC: Thanks. In that case Y Done. --TheSandDoctor Talk 20:14, 17 December 2018 (UTC)

Hello! There are many articles about albums that have links to album reviews on the AllMusic website. It seems that some time ago, I'm not sure when, AllMusic changed their URLs. So now, a lot of the links to the AllMusic reviews are to obsolete URLs. It would be great if a bot could find these and change them to the current URLs. But, I'm not sure how hard or easy it would be for a bot to figure out the current URLs. (I've updated some of these manually, and I can generally find the current AllMusic page for an album review by doing a search for the album on the AllMusic site itself.) As a further complication, the references in some of the articles contain the URL for the AllMusic review itself, while others (for example this one) use the {{AllMusic}} template to generate the URL. I don't know how many articles include the outdated links, but I would think there have to be thousands of them. So, how does this sound so far? Mudwater (Talk) 18:29, 11 October 2018 (UTC)

Given this old and new link, how would a bot determine r1701846 -> mw0000649874? Link to discussion: Wikipedia_talk:WikiProject_Albums#AllMusic_links. -- GreenC 18:50, 11 October 2018 (UTC)
I haven't detected a pattern that can be used to convert programmatically from the old links to the new ones. If there isn't one, then the bot would have to use some kind of search to find the new links. And as I said, I'm not sure how hard or easy that would be to implement. Mudwater (Talk) 19:40, 11 October 2018 (UTC)
This is difficult, but one possible solution: checking the old URL at Wayback returns the redirected URL, thus resolving r1701846 -> mw0000649874 [11]. Whether they all have this I don't know. One could extract the new URL from the redirected Wayback URL. We would probably need to go through 10 or 20 samples to see how many work and if the rule holds. If so, it might be automated. -- GreenC 20:53, 11 October 2018 (UTC)
Sounds like a promising line of inquiry! Mudwater (Talk) 22:10, 11 October 2018 (UTC)
I'll take this on. I will be busy the next 2-3 weeks; I might make some progress, and if not, after that. There are a number of complications, including archive URLs (can't modify a source URL if there is a corresponding archive URL already in place), the AllMusic template, and probably more than the "/album/" links. A bot probably won't get them all but will narrow the field. -- GreenC 15:00, 12 October 2018 (UTC)
Sounds good to me. Thanks! Mudwater (Talk) 00:15, 13 October 2018 (UTC)
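A rough sketch of the Wayback lookup GreenC describes, using the availability API; whether the closest snapshot reliably resolves the redirect for every link is exactly what the 10 or 20 samples would need to establish, and the old-style URL in the example is illustrative only:

  import re
  import requests

  def resolve_via_wayback(old_url):
      """Try to read a new-style album ID (mw##########) out of the
      closest Wayback snapshot recorded for the old URL."""
      r = requests.get("https://archive.org/wayback/available",
                       params={"url": old_url})
      snap = r.json().get("archived_snapshots", {}).get("closest")
      if not snap:
          return None
      m = re.search(r"allmusic\.com/album/\S*?(mw\d{10})", snap["url"])
      return m.group(1) if m else None

  print(resolve_via_wayback("http://www.allmusic.com/album/r1701846"))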

@GreenC: and everybody: Hey, guess what. It looks like the old links are now redirecting to the new links, within AllMusic itself. I think that was happening before and then stopped working. But it looks like it's working again now. I guess they fixed it at their end. (Here's a random example, but there are thousands of these puppies on Wikipedia.) So, I guess we're good, for now anyway. Mudwater (Talk) 00:40, 26 October 2018 (UTC)

Renaming Template:Category elections by year to Template:Category U.S. State elections by year

According to the discussion at Wikipedia:Redirects for discussion/Log/2018 November 18#Template:Category elections by year:

Highly misleading redirect to Template:Category U.S. State elections by year. The template was created at this name on 5 March 2013, but was moved the following day to its present stable title. There are about 6,000 uses of the old title, which will need to be changed by a bot. But this trivial bot job will stop the ambiguous title being mistakenly used on categories for elections other than those in US states. If there is consensus to do this, a request at WP:BOTREQ will have it done easily. BrownHairedGirl (talk) • (contribs) 22:50, 18 November 2018 (UTC)

I was the editor who redirected it in 2013. It looks like all uses of the template's old name are used for US states. I suspect there would be little or no objection to the request. Thank you. —GoldRingChip 02:41, 19 November 2018 (UTC)

Since Template:Category elections by year is deleted and nothing links to it, I assume this is done. Ronhjones  (Talk) 04:04, 6 December 2018 (UTC)

Bot to archive a website

Hello, I would like to request a bot to add website archives to these webpages, for Wikipedia pages in article-space and template-space only (maybe draft space as well). The person who maintained the website died this week, unfortunately. I was thinking InternetArchiveBot or something similar could do this. epicgenius (talk) 18:15, 23 November 2018 (UTC)

@Epicgenius: Although I suspect it's already been done by the automated processes, I issued a Save Page Now (SPN) for these links on Wayback. They are also likely saved at archive.is; if and when the links die, IABot will replace them with the archive version. Thanks for the notice. -- GreenC 16:15, 26 November 2018 (UTC)
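For anyone wanting to do the same by hand, a minimal sketch of issuing a Save Page Now request per link; the /save/ endpoint is rate-limited, hence the pause, and the URLs here are placeholders:

  import time
  import requests

  LINKS = [
      "http://example.org/page1",  # placeholder; the real list would come
      "http://example.org/page2",  # from the site being preserved
  ]

  for url in LINKS:
      r = requests.get("https://web.archive.org/save/" + url,
                       headers={"User-Agent": "archive-helper/0.1"})
      print(url, r.status_code)
      time.sleep(5)  # be gentle with the service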

WP:Today's featured article/recent TFAs has monthly subpages such as WP:Today's featured article/recent TFAs/February 2018. Currently those pages are created manually; we seem to have fallen behind (February or possibly by now March 2018 are the newest). A how-to guide to creating these pages was given here. Is it possible to have a bot create those subpages and fill them, at least to some degree? Apparently the page views are a particularly cumbersome task for humans, while I expect a bot would have trouble assigning FA category and country. I envision one run to get rid of the current backlog, and then monthly runs to prevent a new backlog from creeping up. If people frequently working on the behind-the-scenes work surrounding TFAs agree, it might be useful for the bot to leave a note at some talk page, e.g. WT:Today's featured article, when a new page has been created to prompt humans to complete it. Huon (talk) 11:25, 4 December 2018 (UTC)

Why do we have two sets of monthly pages - e.g. WP:Today's featured article/February 2018 and WP:Today's featured article/recent TFAs/February 2018? --Redrose64 🌹 (talk) 11:28, 4 December 2018 (UTC)
My understanding is that the former archives the blurbs, pictures and so on as they appeared on the main page, while the latter is meant to be more of a set of behind-the-scenes statistics. If you think a general discussion about the usefulness of the pages is helpful before a bot is created or modified, WP:Today's featured article seems a good place. Huon (talk) 14:07, 4 December 2018 (UTC)
I'm willing to enhance the FACBot to handle this. I don't foresee too many problems. You used to be able to easily retrieve the page view stats programmatically; I think it is far more difficult nowadays, but I'm willing to give it a go. Hawkeye7 (discuss) 22:58, 7 December 2018 (UTC)
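For what it's worth, the Wikimedia REST API now makes per-article page views fairly painless to fetch; a minimal sketch for one article on one day (the article and date are illustrative):

  import requests

  def tfa_views(article, day):  # day as "YYYYMMDD"
      url = ("https://wikimedia.org/api/rest_v1/metrics/pageviews/"
             "per-article/en.wikipedia/all-access/user/%s/daily/%s/%s"
             % (article.replace(" ", "_"), day, day))
      r = requests.get(url, headers={"User-Agent": "TFA-report-bot/0.1"})
      r.raise_for_status()
      return r.json()["items"][0]["views"]

  print(tfa_views("William Shakespeare", "20180801"))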
Coding... Well that was simple. Looks like I already coded most of it in September 2017, and the page views routine is already written and tested. Hawkeye7 (discuss) 04:36, 10 December 2018 (UTC)
BRFA filed Sample page created at Wikipedia:Today's featured article/recent TFAs/August 2018. Manual runs can generate any desired month. The Bot run will generate a report for the previous month. Hawkeye7 (discuss) 00:56, 12 December 2018 (UTC)

Generating a list of a sub-set of articles using a specific template

Not sure if this is the right place, so sorry in advance if it isn't.

Would it be possible to get a list of all main-space articles (so no talk-, user-, template-pages, etc.) using Template:Infobox television episode or Template:Infobox Television episode that also have parentheses in their title (basically I want all disambiguated articles using these templates)? --Gonnym (talk) 19:43, 10 October 2018 (UTC)

6k or so. May include some redirects--handling for redirects in search is optimized for readers rather than editors, so sometimes results aren't great. --Izno (talk) 20:41, 10 October 2018 (UTC)
Thanks a lot Izno! --Gonnym (talk) 21:16, 10 October 2018 (UTC)
Here's a Quarry query to get the information. This includes all the redirects to Template:Infobox television episode, not just Template:Infobox Television episode, and specifically selects only titles that end with something looking like a disambiguator. Anomie 12:38, 14 October 2018 (UTC)
Did not see this answer, thanks! --Gonnym (talk) 11:31, 24 October 2018 (UTC)

Hi! In regards to CAT:MISSFILE, we have a significant issue which could be helped greatly by the creation of a few bots. For the past few weeks now, Sam Sailor and I have been patrolling the backlogged category (you can check our documentation of the backlog in Sam Sailor’s documentation log and Katniss’s documentation log). In helping to lessen the backlog, we noticed a few patterns which could greatly reduce the backlog of the category if a bot were to perform the tedious manual tasks.

  1. Often, users will attempt to change the name of a file in good faith to correct a perceived typo. There have also been several cases in which even experienced users will change certain incorrect punctuation in a file name. Of course, this causes the image names to link to files that are not actually uploaded on the project servers.
  2. Another large issue is users adding file names to articles before the files are uploaded, and then oftentimes forgetting to upload them completely.
  3. There are also a few issues with the current bots created for this category, User:CommonsDelinker and User:Filedelinkerbot. Even after files are deleted from Commons, the bots sometimes do not perform their delinking duties. There were several instances of files (and audio files) deleted from Commons back in July which were still present in the articles when we manually removed them in September.

As an experiment, neither of us patrolled CAT:MISSFILE for a period of 10 days, and the backlog already grew again to 681 articles in the short time that the category was not patrolled (for more information on this, you can read our talk page conversation). Though neither of us is particularly tech savvy (and thus wouldn't know the technical way to describe the commands that would be most efficient for the bots to perform), we believe that the creation of bots to perform these tasks would help to greatly reduce the backlog in that category. If anyone has any suggestions, thoughts, or ideas in regards to creating bots to efficiently complete these tasks, that would be great! Thanks! (Courtesy ping for KylieTastic, who may potentially be interested in this conversation even though she is currently on WikiBreak) Katniss May the odds be ever in your favor 14:05, 27 September 2018 (UTC)
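A minimal pywikibot sketch of detecting point 2 above on a single page: file links that exist neither locally nor on Commons. Point 1, catching renames as they happen, would instead need a continuously running recent-changes watcher, as the discussion below concludes:

  import pywikibot

  enwiki = pywikibot.Site("en", "wikipedia")
  commons = pywikibot.Site("commons", "commons")

  def broken_file_links(page):
      """File links on the page that exist neither locally nor on Commons."""
      missing = []
      for filepage in page.imagelinks():
          title = filepage.title()
          if (not pywikibot.FilePage(enwiki, title).exists()
                  and not pywikibot.FilePage(commons, title).exists()):
              missing.append(title)
      return missing

  print(broken_file_links(pywikibot.Page(enwiki, "List of Russian artists")))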

In terms of existing bots, @Magnus Manske and Krd: Filedelinkerbot is running and CommonsDelinker is running. If they are not removing certain links, report the errors as you find them. Also, maybe ask them to investigate new features like the ones above; since they wrote bots designed for this sort of thing, that's a good place to start. -- GreenC 15:57, 27 September 2018 (UTC)
Thanks for your advice, GreenC. I have invited them both to contribute to this discussion. To clarify, the issue goes beyond a few errors. In patrolling CAT:MISSFILE the past five months, I can safely say there have been hundreds of files over that period that are deleted but the bots are not catching them. The reason for creating this discussion (in addition to exploring the other two suggested features) was to hopefully come up with a solution less tedious than reporting every single file that the bots aren't catching, as it's clearly a lot. Thanks, Katniss May the odds be ever in your favor 18:32, 27 September 2018 (UTC)
I appreciate the initiative, but I sadly don't have enough resources to check hundreds of files. Please advise if there is some pattern visible, or please report a few examples for further investigation. Thank you. --Krd 18:59, 27 September 2018 (UTC)
Hi Krd, the only pattern I've noticed is that probably 99.9% of the files that aren't being deleted are ones that were deleted from Commons by Explicit, though I have no idea if that means anything or not. Regarding the other feature requests as well, would either of the issues above warrant creating new features on the existing bots (or new bots altogether)? For example, maybe a bot could detect users changing a filename in the mainspace without uploading it/changing the name of the file itself first, and revert those edits? Could a bot (either existing or new) also detect if users are adding images to the article mainspace that aren't uploaded (e.g. a file has never been uploaded to Wikipedia under that name before)? Hopefully I'm explaining myself okay; as I said, this would be my first time making a bot request, and thus I'm not sure if I'm explaining what's in my head right. Let me know if any clarification is needed, and I would be interested to get people's opinions on this! Best, Katniss May the odds be ever in your favor 13:01, 28 September 2018 (UTC)
Regarding your feature request, I'd say this is definitely more than a simple task - more than a few additional lines of code - but a small new project which I definitely have no resources for, neither for the actual coding nor the later maintenance.
Regarding the actual problem, please choose one example (and if possible please keep it uncorrected until I am able to look). --Krd 14:03, 28 September 2018 (UTC)
User:KatnissEverdeen. Both ideas (#1 and #2) are technically feasible (IMO), and justifiable, but not simple, maybe even kind of hard. Assuming the two features were implemented, estimating in your experience, how many links do you think such a bot would be fixing on a weekly basis? Dozens, hundreds, thousands? -- GreenC 14:54, 28 September 2018 (UTC)
Hi Krd, this one wasn't deleted by Explicit, but here is one example. File:Fyodor_Petrovich_Tolstoy_2.jpg was deleted from Commons on September 24, 2018, but is still showing up as a missing (red-linked) file on List of Russian artists. This was found just by checking two articles at random, the second one being the example I outlined above, so I'm sure there are plenty more given I was able to find one so quickly. From patrolling that category, this is a pretty regular occurrence. Katniss May the odds be ever in your favor 19:08, 28 September 2018 (UTC)
@KatnissEverdeen: As far as I see, Filedelinkerbot currently works for files only that are deleted at Commons, but File:Fyodor Petrovich Tolstoy 2.jpg was deleted locally at enwiki. I don't know why this isn't active, but think I can easily add that, although I'm not sure if additional discussion or approval is required. What do you think? --Krd 07:33, 1 October 2018 (UTC)
Krd, I would say there has been enough discussion. I can't imagine why someone wouldn't agree with you doing it. It's probably worth saying though that there are some files that were actually deleted from Commons (don't have any examples at the moment, but I could probably find one if you need it) that are slipping under the radar of the bots. Most of the issue comes from enwiki files however. Katniss May the odds be ever in your favor 13:23, 1 October 2018 (UTC)
I'm going to activate it now; please block the bot if something goes wrong. And please report further examples, if any appear in the future. --Krd 17:07, 1 October 2018 (UTC)
@Krd: You should probably file a BRFA for the "deleted at en.wp" part. I guess it will be approved quickly. --Izno (talk) 18:01, 1 October 2018 (UTC)
Done. --Krd 06:08, 2 October 2018 (UTC)

Thank you, Krd! The bot seems to be working beautifully and has already reduced the backlog by several hundred files! I would say #3 of my list above is done! Katniss May the odds be ever in your favor 13:34, 2 October 2018 (UTC)

Hi GreenC, let's see, so in the two weeks (give or take a day) that the category hasn't been patrolled, the category is up to 706 not counting the 10 templates that are also in that category. So let's say roughly 700 in the category at the time of my reply. I would say from experience that about 75% of the pages in that category are there because of the three things I outlined above. So 75% of 700 would be 525 files in two weeks, divided by two for a single week would make ~263 a week. Katniss May the odds be ever in your favor 19:12, 28 September 2018 (UTC)
Bots are terrible at evaluating context, and #1 and #2 are quite different. I think you will struggle with #1 unless you have a bot that is constantly running and uses "recent changes" to check if the file name has been altered and if the resulting file name is bad - otherwise you can end up with a page that could have several edits between the file change and the current version, and having a bot revert a much older edit won't be easy. I wonder if such a system could be added to ClueBot - ping @Cobi:. #2 is not a problem; however, if you don't fix #1 then the bot will end up removing links when someone has changed the file name - one could wait a fixed time before doing that, so if it's not fixed in X days, then the link gets removed. Ronhjones  (Talk) 20:14, 30 September 2018 (UTC)

Hi Ronhjones, I really like your suggestion about adding a feature to ClueBot. Often, some of the file name changes are sneaky vandalism anyways, so it would make sense for ClueBot to have a feature which catches that. What would you (or anyone who wants to jump in) say is an appropriate amount of time to wait? A week, a few days, a few hours (as ClueBot seems to always catch vandalism right away)? Another suggestion: would it be possible to add a warning that would pop up as someone altering a file name hits save? For example, "You are about to alter a file name. Changing a file name, even due to typos/grammatical errors, may cause the file name to break. Are you sure you want to do this?" It certainly wouldn't prevent all cases, but it would definitely cut down on the accidental file name changing issue. Katniss May the odds be ever in your favor 23:36, 30 September 2018 (UTC)

@KatnissEverdeen: Hopefully the cluebot operator will comment. Another option, which just occurred to me, is an edit filter - thus you could trap all the #1 with an edit filter (maybe warn and only allow respected editors to edit), then that makes it easy to attack the rest as just needing the image link removed. I'm no edit filter expert, why not suggest at Wikipedia:Edit filter/Requested and see what happens? If you can get a quick stop/revert system to fix #1, then I would think a 1 day wait would do for #2, with say, the bot running once a day. Ronhjones  (Talk) 00:01, 1 October 2018 (UTC)

Good suggestion, Ronhjones! I'll suggest it, but since I'm definitely not an edit filter (or bot request!) expert, what would you say qualifies as a "respected user" in this case? Autoconfirmed? Someone with 100+ edits? Just asking to make sure I'm understanding correctly, as well as the fact that it's not just non-autoconfirmed users that are making this mistake. Katniss May the odds be ever in your favor 01:01, 1 October 2018 (UTC)

@KatnissEverdeen: I "think" the edit filter can accept/deny based on the user edit count. I would think autoconfirmed too low, would let in the determined vandals, 50-100 would be nice. Also they might be able to stop AWB edits - I saw a page you fixed where there was an AWB run fixing "fancy" quotes to normal quotes. Ronhjones  (Talk) 01:06, 1 October 2018 (UTC)

Ronhjones That "fancy" quote issue actually comes up more often than you'd think, in fact I've even seen many experienced users make the mistake. I've gone ahead and made the edit filter request here. Katniss May the odds be ever in your favor 01:32, 1 October 2018 (UTC)

Kat, WP:Edit filters can't revert, but they can log, warn or block edits. I see they are also only meant to be used for abusive edits, and this may or may not be considered vandalism. Will see what they say. Ronhjones is right that #2 is more feasible than #1, though I think #1 is (theoretically) possible by monitoring EventStreams and comparing diffs; it just would be a lot of work and resources to set up and run. --GreenC 12:46, 1 October 2018 (UTC)

GreenC I would say that the file name changes fall into three categories. 1) deliberate/blatant vandalism 2) sneaky vandalism, often changing one letter of a filename (e.g. if the file name was "Katniss," a user might vandalize it and change the K to a C, "Catniss", etc.) 3) good-faith attempts, such as correcting punctuation or spelling, to 'correct' a file name (which breaks the file). I would argue that the first two categories would warrant an edit filter, and there's not really any harm in having the edit filter warn or stop edits in the third category seeing as it's usually a mistake anyways. Katniss May the odds be ever in your favor 14:37, 1 October 2018 (UTC)

I agree with GreenC, in that a bot for #1 can only be done by a continuously running one, monitoring the changes - not easy (and not for me - I only run RonBot from the PC). I commented on the edit filter page - deliberate/blatant vandalism does occur and I have often seen many a nice portrait replaced with a large image of some sexual organ - admittedly they do get reverted fairly quickly, but not before half a dozen annoyed readers have posted an e-mail to OTRS! Ronhjones  (Talk) 15:29, 1 October 2018 (UTC)

Regarding "There have also been several cases in which even experienced users will change certain incorrect punctuation in a file name.", what about a bot that does something like what User:DPL bot does, which is to notify users who introduce red linked file names.. Galobtter (pingó mió) 17:19, 1 October 2018 (UTC)

Not sure how DPL bot 2 works and if the strategy would work here - ping the owner for comment @JaGa: Ronhjones  (Talk) 19:37, 1 October 2018 (UTC)

Just to give people a heads up that may not be following this page, the edit filter request I made was denied due to technical limitations. Seeing as an edit filter isn't feasible, is there maybe another way we could have a warning pop up when someone hits save (before the page is offically updated)? Such as, "You are about to alter a file name, which if not properly linked will break the image. Are you sure you want to do this?" or something like that. Katniss May the odds be ever in your favor 18:58, 11 October 2018 (UTC)

@KatnissEverdeen: The only thing to stop a live change in it's tracks is an edit filter. MusikAnimal said We can to some extent detect changes to image link syntax, but unfortunately we can't detect if the image exists - which does not really help you too much - especially the "to some extent" part - if they could have detected all image link changes, at least that might have been useful to pop up a "you are changing an image link" message. I don't see any easy answer now. If we were to check if the link points to a deleted image, and if so remove it, then we are at risk of there being a deleted image that someone has coincidentally changed the file name to - the bot would not be able to work out that it was a bad name change, and not a unlinked deletion. One option is to just remove all image links that do not point to a current image, that I suspect would be rather controversial (but much easy to code!) and would certainly need some discussion and support before any approval. We could have a bot that just sends a message to the last editor to say that there is now a broken image link, and ask if they broke it - but if it's an IP or it's a vandal then it's probably a waste of time. Ronhjones  (Talk) 01:17, 15 October 2018 (UTC)
Thanks for replying, Ronhjones. From my experience, the majority of editors doing this are IPs who haven't necessarily familiarized themselves with the rules, so the bot sending people messages probably wouldn't help unless it is an experienced editor who has accidentally altered an image name. "One option is to just remove all image links that do not point to a current image" - I'm not sure I agree with this idea. I think, like you said, we would be at the risk of removing valid image links which vandals have altered. Is there a way for a bot to sense if the file name has been changed recently? Katniss May the odds be ever in your favor 13:04, 15 October 2018 (UTC)
I think you mentioned that some experienced editors also break file links, so there would be some use to notifying experienced editors that they have broken the image right? Galobtter (pingó mió) 13:40, 15 October 2018 (UTC)
@KatnissEverdeen and Galobtter: I think Galobtter has a valid point, if we are only posting a simple standard template (to be written) to a user talk page, then there is no harm done. Even if said editor was not the culprit, since they have edited the page, they conceivably might have an interest in the subject and might fix it anyway. One could also duplicate the message on the article talk page - editors might have the page in their watch lists - OR, better still I think, a small template to the top of the article page like the cleanup tags - I could write a task for my bot to do all or some of that (and as it's an adminbot, page protection does not get in the way).
As for working out the change of link, that does really need a continuously running system like ClueBot or DPL bot 2 (owners were pinged but never answered), which can monitor the recent changes. To try to do that, say once a day, I think would be a nightmare of trying to compare revisions - easy for a human, difficult for a bot (and not for me, as my bot talks are PC based).Ronhjones  (Talk) 15:31, 15 October 2018 (UTC)
DPL Bot 2 doesn't check continuously but twice a day, seeing what new dab links are there and notifying users (See BRFA). I can see a similar system working for this category too. Galobtter (pingó mió) 15:36, 15 October 2018 (UTC)

Given what we have tried/suggested so far, I would say creating a standard template to post to the user talk page and article talk page would definitely help with the issue, at least a little bit. Katniss May the odds be ever in your favor 15:40, 15 October 2018 (UTC)

Coding... @KatnissEverdeen:I'll make something up. Of course, it does not stop their being another bot added later, if someone can work out a useful method. Ronhjones  (Talk) 16:36, 15 October 2018 (UTC)
@Ronhjones: Thanks in advance for creating a talk page template for this. I also patrol this category and a template would be very helpful! (Although an update to ClueBot also seems like a great idea for some of the image-related vandalism reversions.) - tucoxn\talk 16:49, 15 October 2018 (UTC)

BRFA filed Ronhjones  (Talk) 23:37, 15 October 2018 (UTC)

Ronhjones, I know you've withdrawn the BRFA, but I wonder if it would still be useful to tag the articles as having a broken image even with notifications of the editor who broke the file link, since in most cases a notification isn't sent (as the editor is an IP/non-autoconfirmed user)? Galobtter (pingó mió) 18:54, 17 October 2018 (UTC)
Galobtter Ok, I'll trim down the code, undelete the template (handy being an admin :-) ), and un-withdraw. Ronhjones  (Talk) 19:09, 17 October 2018 (UTC)

BRFA filed This is in regard to messaging users a la DPL Bot 2. Ping KatnissEverdeen Galobtter (pingó mió) 10:09, 16 October 2018 (UTC)

fix ping @Tucoxn and Sam Sailor: Galobtter (pingó mió) 10:10, 16 October 2018 (UTC)
Wonderful, thank you both, Ron and Galobtter. Sam Sailor 10:16, 16 October 2018 (UTC)
Thanks everyone! Katniss May the odds be ever in your favor 14:14, 16 October 2018 (UTC)

Related to this conversation, please see this diff in the discussion for Wikipedia:Bots/Requests for approval/Filedelinkerbot 3. It's very interesting to note that ImageRemovalBot "went AWOL last month." It seems like that bot's operator has not been editing much recently. I bet all of this is linked to the recent increase in red-linked files we're seeing at CAT:MISSFILE. - tucoxn\talk 15:21, 18 October 2018 (UTC)

Hmm, that's very interesting and that's for letting us know Tucoxn. Is there anything we can do to get the bot back online, if the operator has left permanently? Katniss May the odds be ever in your favor 16:08, 18 October 2018 (UTC)
On the subject of ImageRemovalBot, I'm not sure why it's not running. I'll give it a kick and see if that fixes it. --Carnildo (talk) 02:38, 25 October 2018 (UTC)
In terms of Krd's FileDelinkerBot and ImageRemoval Bot, I'm still seeing several examples of files deleted from Commons that are not being removed from pages or templates. Jungle (Tash Sultana song), Template:Portal/doc/all and Template:POTD protected/2016-04-15 are just a few examples. Katniss May the odds be ever in your favor 16:02, 27 October 2018 (UTC)
Interesting. While the first example may have been before the bot was fully active, and I cannot find the relevant file in the second example, for the third case the file is definitely in the Filedelinkerbot log, saying it has been tried to unlink that from the mentioned page. Sadly there is no reason why it didn't work. --Krd 16:23, 27 October 2018 (UTC)
Hmm, that's super strange, Krd. I can't find the relevant file in the second example either now, but there was definitely a broken file showing up yesterday. Oddly, it's also still in CAT:MISSFILE, so there must be something still broken there. That's too bad about the third example. Katniss May the odds be ever in your favor 19:02, 28 October 2018 (UTC)

File name vandalism

So I've been spending sometime manually cleaning out Category:Articles with missing files. One of the things that I've found is that 99% of the time, the reason the file is broken is vandalism. Someone has come in and vandalized the file name on a page. Would be pretty cool to have a bot that would look at recent edits (particularly by IP address users) that have resulted in pages being placed in that category and reverting them. --Zackmann (Talk to me/What I been doing) 20:03, 1 November 2018 (UTC)

That would not be a good task for a bot as per WP:CONTEXTBOT -- the bot has no way of knowing if the edit is correct. What checks would it perform to not revert good edits? —  HELLKNOWZ   ▎TALK 20:24, 1 November 2018 (UTC)
@Hellknowz: I think there would be a way to figure it out. Having written a few bots myself... If a page was not in the category, then I make an edit to the page, and now it is in the category... My edit broke the file. --Zackmann (Talk to me/What I been doing) 20:28, 1 November 2018 (UTC)
What if I simply made a typo and didn't get a chance to fix? Or added a different file with wrong extension? Or reverted someone's edits to version with broken file? Or the file was deleted and I edited the page? —  HELLKNOWZ   ▎TALK 20:49, 1 November 2018 (UTC)
@Hellknowz: I didn't say it would be easy... I just thought it was worth discussing. Geeze. --Zackmann (Talk to me/What I been doing) 21:35, 1 November 2018 (UTC)

There are two BRFA ongoing related to broken file links. See Wikipedia:Bot_requests#CAT:MISSFILE_bot above. -- GreenC 20:54, 1 November 2018 (UTC)

Add talkref to talk page sections with ref tags

Talk pages can get pretty messy when there are a bunch of sections with <ref>...</ref> tags, but without {{talkref}} placed. I thought perhaps a bot could patrol talk pages and add these automatically where appropriate. There are probably some issues I haven't considered, but I just wanted to toss out the idea. –Deacon Vorbis (carbon • videos) 00:30, 1 November 2018 (UTC)

By the way, I wouldn't mind doing the bulk of this myself. I've never looked much into how bots work, but wouldn't mind using this as an excuse to learn more in this area. –Deacon Vorbis (carbon • videos) 15:48, 2 November 2018 (UTC)

Bot to search and calculate coordinates

Please look at this table: Lands_administrative_divisions_of_New_South_Wales#Table_of_counties

My goal is to add a column to this table that shows the approximate geographical coordinates of each county. Those county coordinates can be derived form the parish coordinates that are found in each county article, by taking the middle of each northernmost and southernmost / easternmost and westernmost parish coordinates. Is it possible to write a script or a bot to achieve this?

For illustration, I did the work for the first county in the list, Argyle County, manually. The table of parishes in this article shows that they range from 34°27'54" and 35°10'54" latitude south and 149°25'04" and 150°03'04" longitude east. The respective middle is 34°49'24" and 149°44'04", which I put in the first table entry of Lands administrative divisions of New South Wales and the info-box of Argyle County. --Ratzer (talk) 17:47, 20 December 2018 (UTC)

How do you determine middle, averaging? ie. adding up the 34s and 35s and dividing the total by the number of rows? -- GreenC 23:59, 20 December 2018 (UTC)
@Ratzer:The result in Lands_administrative_divisions_of_New_South_Wales#Table_of_counties is not correct. The range is not 34°27'54" and 35°10'54", it's 34°21′54″ and 35°10'54" (you seem to have missed "Bourke"). My calculations (good old Excel!) make it (Min(latitude)+Max(latitude))/2 and (Min(longitude)+Max(longitude))/2 with a result of 34°46'24" and 149°44'04". An average of the 50 coordinates would give 34°46'37.92" and 149°45'22.28". I would also ask - is there some strange rounding errors going on - 50 readings of which 48 have 54 seconds at the end is very dubious, the other figure has the same amount of 4 seconds! One of the trickier parts is going to be writing the result into the table page (for just a single run) - there are only 141 lines, I would be inclined to suggest a WP:BOTUSERSPACE bot, just writing the formatted math results to a bot user page for someone to copy and paste to Lands_administrative_divisions_of_New_South_Wales#Table_of_counties. I would be happy to do that. Ronhjones  (Talk) 22:36, 22 December 2018 (UTC)
Coding... I'll code something up based on the above Ronhjones  (Talk) 17:25, 25 December 2018 (UTC)
@Ratzer:See User:RonBot/NSWcoords, you have mid points (first table) or average points (second table). Ready formatted for copy and paste. Run time was 130 seconds. Ronhjones  (Talk) 21:12, 25 December 2018 (UTC)
@Ronhjones: Thanks a million, I'm impressed. I used the average coordinates in the table, although I think average or middle coordinates makes no difference, especially when the arcsecond precision appears to exhibit gross rounding areas. The nearest arcminute (about 2 kilometers) is more than enough precision in this context, where counties are some 75 kilometers across, on the average. Greetings from Bavaria,--Ratzer (talk) 14:21, 26 December 2018 (UTC)
Y Done Glad to help Ronhjones  (Talk) 15:17, 26 December 2018 (UTC)

Move highbeam.com -> questia.com

Per discussion with User:Samwalton9 (WMF), bot for a URL move. This spreadsheet contains the moves. I believe the new URL can be algorithmic determined. Unclear if every highbeam.com URL can be moved, or only these ones. Spreadsheet sortable by country code. Pinging User:Lustiger_seth for Dewiki. I will work on Enwiki. The rest, might need to post on the Botreq forums but the numbers are not large. Sam mentioned consider in addition to change URL, changing the citation text string "HighBeam" to "Questeria". -- GreenC 16:27, 4 January 2019 (UTC)

@GreenC: Questia themselves said that these are the only URLs that should be able to be moved directly from the full list of URLs on the top ~10 wikis, so I think the bot should be limited to those. As for the citation text, you'll need to consider bare URLs that include the HighBeam text, e.g. [https://highbeam.com/url Citation information, 2018, via Highbeam] in addition to within cite templates, like the Highbeam cite at James Millar (loyalist) which includes |publisher=Belfast Telegraph via HighBeam Research. Might be as simple as just changing HighBeam or HighBeam Research to Questia, but could not be. Samwalton9 (WMF) (talk) 16:45, 4 January 2019 (UTC)
Hi!
Thanks for pinging, GreenC.
Concerening the algorithmic determination: the replacements
s/https?:\/\/www\.highbeam\.com\/doc\/1E1-([a-z-]+)\.htm.*/https:\/\/www\.questia\.com\/read\/10-$1/i;
s/https?:\/\/www\.highbeam\.com\/doc\/([A-Z0-9-]+)\.htm.*/https:\/\/www\.questia\.com\/read\/$1/i;
s/https?:\/\/www\.highbeam\.com\/doc\/([A-Z0-9-]{3}):([0-9-]++).*/https:\/\/www\.questia\.com\/read\/$1-$2/i;
should transform all urls of the given list (containing >6k urls) correctly (according to the list), except from one url
https://www.highbeam.com/doc/1G1-111320408 -> https://www.questia.com/read/1G1-111320408
(which could be handled by a forth replacement). -- seth (talk) 18:53, 4 January 2019 (UTC)
I tried to use those (algorithmic) replacements on some of the links in dewiki (that were not part of the spreadsheet), but none of them work. So maybe(!) the spreadsheet contains the only urls that can be fixed. -- seth (talk) 11:58, 5 January 2019 (UTC)
Hi Seth yes that is the case, Samwalton9 said above "the bot should be limited to those" (in the spreadsheet) sorry that was not more clear. -- GreenC 17:47, 5 January 2019 (UTC)
Oops, thanks. :-) -- seth (talk) 18:13, 6 January 2019 (UTC)

@Samwalton9 (WMF): Y Done enwiki. -- GreenC 01:56, 8 January 2019 (UTC)

I moved some pages, and there are many links to fix.

JSH-alive/talk/cont/mail 19:10, 9 December 2018 (UTC)

@JSH-alive: I am not sure what needs fixing exactly? Those appear to redirect as you have indicated. --TheSandDoctor Talk 07:49, 11 December 2018 (UTC)
@TheSandDoctor: I need to fix the links at once (especially those in the navigational templates). JSH-alive/talk/cont/mail 15:40, 11 December 2018 (UTC)
See WP:NOTBROKEN -- GreenC 16:19, 11 December 2018 (UTC)
Guess I'll have to manually fix them. (I was going to redirect ABC News (TV channel) and ABC News (radio) to ABC News (disambiguation), though.) JSH-alive/talk/cont/mail 16:54, 11 December 2018 (UTC)

Convert usages of Template:Infobox Hollywood cartoon to Template:Infobox film

Template:Infobox Hollywood cartoon has been merged into Template:Infobox film per the result of Wikipedia:Templates for discussion/Log/2018 November 8#Template:Infobox Hollywood cartoon and no additional comments at Template talk:Infobox film#Merge process with Infobox Hollywood cartoon. Could someone help me out with a bot operation that will convert usages of {{Infobox Hollywood cartoon}} into {{Infobox film}}?

Mapping should be as followed:

  • Not changed:
    • |italic title=, |name=, |image=, |image_size=, |alt=, |caption=, |director=, |producer=, |narrator=, |animator=, |layout_artist=, |background_artist=, |color_process=, |studio=, |distributor=, |runtime=, |country=, |language=. Also do not modify any other parameters supported by {{Infobox film}}.
  • Changed:
    • |cartoon_name= converted to |name=
    • |image size= converted to |image_size=
    • |story_artist= and |story artist= converted to |story=
    • |voice_actor= and |voice actor= converted to |starring=
    • |musician= converted to |music=
    • |layout artist= converted to |layout_artist=
    • |background artist= converted to |background_artist=
    • |release_date= and |release date= converted to |released=
    • |color process= converted to |color_process=
    • |movie_language= converted to |language=
  • Not merged Delete:
    • |series=
    • |preceded_by=
    • |followed_by=

Thanks. --Gonnym (talk) 09:28, 9 December 2018 (UTC)

@Primefac: this might be a good job for your bot, if it is willing. – Jonesey95 (talk) 12:19, 9 December 2018 (UTC)
Yeah, I can do it. Might not be tonight or anything, but it saves the hassle of putting through a new BRFA for someone. Primefac (talk) 14:56, 11 December 2018 (UTC)
I have the time now where I could do/file this. Are you still good Primefac? (No rush, just double checking) --TheSandDoctor Talk 05:55, 15 December 2018 (UTC)
Forgot, thanks. Will get to this shortly. Primefac (talk) 17:10, 16 December 2018 (UTC)

I didn't say I only enabled it if there were changes, I said I never enable it for if there are no changes. There is no harm in using a template redirect. Primefac (talk) 17:07, 23 December 2018 (UTC)

Gonnym, as an update, I found a typo in my module which caused a few of the parameters to not be dealt with correctly. I'm doing a second run now and that should take care of those params. Primefac (talk) 01:09, 27 December 2018 (UTC)
Thank you for all your work, much appreciated! --Gonnym (talk) 11:43, 27 December 2018 (UTC)

school project for desktop Encyclopedia

Get pages for school project and save them to local database, in order to create an offline desktop encyclopedia. — Preceding unsigned comment added by Smaragda2 (talkcontribs) 15:51, 10 November 2018 (UTC)

I take it this is something to do with Wikipedia:Bots/Requests for approval/school project for desktop encyclopedia? The specification is far too vague. --Redrose64 🌹 (talk) 23:35, 11 November 2018 (UTC)
Offline versions of Wikipedia already exist. See WP:DUMP. Headbomb {t · c · p · b} 13:17, 14 November 2018 (UTC)

WikiProject tagging

If it's possible, I'd like to request a bot to tag with the WikiProject Television banner, categories and pages relevant to the project. My question is, what would be the best option to supply such a list without also including incorrect articles in the mix. Is this allowed and possible? If so, any opinions on how best to tackle this? --Gonnym (talk) 23:29, 11 November 2018 (UTC)

There are several bots approved for a WikiProject tagging run. In most cases you should supply a list of article categories, and should not specify something like "and subcategories of those", since that has led to problems in the past with mistagging. Instead, each subcategory that is to be processed should be explicitly listed. --Redrose64 🌹 (talk) 23:40, 11 November 2018 (UTC)
Follow up question. I'm currently making the list and while checking the Category:Years in television by country category tree, all the sub and sub categories of this are valid entries. Should I still list all the specific categories? --Gonnym (talk) 08:49, 12 November 2018 (UTC)
It will depend upon who picks this up... some botops are more lenient than others when it comes to "and subcategories of those". --Redrose64 🌹 (talk) 23:40, 12 November 2018 (UTC)

Add a wikiproject template to New York City parks articles

Could someone make a bot script to add {{WikiProject Protected areas}} to all the talk pages of articles in the following categories:

given that the template, or any of the templates that redirect to it, isn't already on the page.

If the script can automatically add a |class= parameter based on existing wikiproject banners, it would be appreciated. Thanks. epicgenius (talk) 14:10, 30 October 2018 (UTC)

Epicgenius, no one else seems to have chimed in so I'll give this one a go ProgrammingGeek talktome 18:26, 13 November 2018 (UTC)
Coding... ProgrammingGeek talktome 23:56, 13 November 2018 (UTC)
BRFA filed ProgrammingGeek talktome 01:53, 14 November 2018 (UTC)

Move naldc.nal.usda.gov -> naldc-legacy.nal.usda.gov

Per discussion User_talk:Citation_bot#Update_naldc.nal.usda.gov_URLs

I will fix on enwiki. @Lustiger seth: at dewiki -- GreenC 16:41, 29 December 2018 (UTC)

Only in 1 dewiki article de:Lothar Weinmiller it is fixed. -- GreenC 16:54, 29 December 2018 (UTC)
Hi GreenC!
Thanks for pinging. And yes, nothing left anymore. :-)
However, I guess, the links can be fixed in another way:
  • old: https://naldc.nal.usda.gov/naldc/download.xhtml?id=42375&content=PDF
  • suggested: https://naldc-legacy.nal.usda.gov/naldc/download.xhtml?id=42375&content=PDF
  • alternative: https://naldc.nal.usda.gov/download/42375/PDF
Im my opinion the last one looks best, but I don't know what url will be more stable. -- seth (talk) 17:10, 29 December 2018 (UTC)
Oh didn't know of the third form. Just finished updating 250 articles to the '-legacy' form. -- GreenC 18:07, 29 December 2018 (UTC)

Removing the venue parameter from Template:Infobox album when it doesn't apply

I originally raised this at Template talk:Infobox album#Including the venue parameter for studio albums when substituting last month. There have been several users (one of whom, most notably, has been Zackmann08) transcluding thousands of uses of Template:Infobox album on albums, and for studio albums, inserting the unnecessary parameter |venue=. This parameter is not needed for the vast majority of studio albums as they were recorded in studios, not live venues. The template explicitly states in bold to use this parameter for live albums—so then it has no use being included for other types of albums. I, and I have noticed other users doing so as well, often remove this parameter upon discovering it has been added to articles because it has does not apply to them. So I'm requesting if a bot can remove the venue parameter from uses of Template:Infobox album on articles where the infobox already has its |type= defined as "studio" (or "album", as this is often used by users who don't know to write "studio"). It's not a big deal if it is removed anyway—if it's needed for a type of album, it can be restored as necessary. But these cases are few and far between, not for the vast majority where |venue= has been added by users just because they're automatically transcluding a template without much consideration for what those albums actually are. Thanks. Ss112 02:39, 6 November 2018 (UTC)

So your solution to "thousands of unneeded edits" is a bot that will perform thousands more edits that ABSOLUTELY are not needed?? The parameter is blank and therefore not being used so there isn't any problem with it... You are looking for a solution where there is no problem. Most infoboxes have parameters that are only to be used in certain situations. As long as those parameters are left blank, there isn't a problem. As I said when you first brought this up with me (and I note that you mentioned my username in this post but didn't link to me so I wouldn't be notified), this isn't a problem at all. If you are so worried about venues being added for other types of albums, then add a tracking category. If the type param is not live and a venue is provided, place the page in the tracking category. But having a bot remove an unused parameter is just a waste of everyone's time.
 Denied per WP:COSMETICBOT --Zackmann (Talk to me/What I been doing) 05:26, 6 November 2018 (UTC)
BAG note: @Zackmann08: You are not a BAG member, and have no authority to approve or deny bots. Do not claim otherwise. Headbomb {t · c · p · b} 13:20, 14 November 2018 (UTC)
@Headbomb: I didn't realize I needed to be a BAG member to approve or deny. Had this been a formal BRFA I wouldn't have commented that. I felt that given that it was a bot request and clearly violated WP:COSMETICBOT it was safe for me to comment that. I have learned something and appreciate your comment. Note that I have struck my comment above. --Zackmann (Talk to me/What I been doing) 17:13, 14 November 2018 (UTC)
@Zackmann08: Thanks. Approved/denied is very specific to the BRFA process, much like you wouldn't comment say "Accepted" when it comes to an ARBCOM case, when the only people who can do that are ARBCOM members (and only by a majority vote). Your general objection is noted though. Headbomb {t · c · p · b} 17:18, 14 November 2018 (UTC)
@Headbomb: Learn something new every day! Out of curiosity, how does one become a member of WP:BAG? Perhaps we can discuss on my talk page? --Zackmann (Talk to me/What I been doing) 17:23, 14 November 2018 (UTC)
The process is outlined in the bot policy at at Wikipedia:Bot policy#Bot Approvals Group. Headbomb {t · c · p · b} 17:26, 14 November 2018 (UTC)
@Zackmann08: Did you think I thought you wouldn't see this? I already knew you were a regular here when Jonesey95 suggested I put in a request here. I don't feel the need to tag users upon every mention of their username, so I didn't care whether you saw it or not. Otherwise it seems like you're implying I had bad intentions by "noting" I didn't notify you. So then I must say it seems a little telling that you would deny this because you think I've attempted to rag on you without notifying you. Maybe others have a different view. Why don't you let them comment and deny the request or offer their opinions, since I'm so obviously complaining about you just racking up your edit count without consideration for the unnecessary parameters you're adding all over the place? Not that it really needs to be said, but you are one user. Your view that it isn't a problem doesn't mean nobody else thinks it isn't a problem. The tracking category is an absolutely pointless venture, because evidently I want the pointless parameters to be removed, not to track instances of it for...what reason exactly? Maybe I can get somebody to knock up a script to do it, since I don't think this request page is the be-all and end-all and that all semi-automated tasks must go through here. Ss112 13:16, 6 November 2018 (UTC)
Also I don't know if you're attempting to direct quote me or paraphrase what you thought I was saying, but I never said they were "thousands of unneeded edits". I never said substituting the template to update its parameters was "unneeded". It is needed (although I thought we got bots to do this and get it done quicker, instead of users). But along with that has come thousands of insertions of |venue= in instances where it doesn't apply, and even where it has previously already been removed. Ss112 13:26, 6 November 2018 (UTC)

@Ss112: I'm not sure a bot is needed for this exactly. Or at least for what you requested exactly. The template could easily be updated to throw an error / put problem articles in a category if |venue= is set when |type=Studio/whatever. A bot that pre-emptively removes an empty |venue= likely wouldn't be approved without consensus to show this task was desired, although removal of an empty parameter under certain condition (e.g. substantive edits are made) likely would be. Headbomb {t · c · p · b} 13:27, 14 November 2018 (UTC)

If the rendered page output is not affected then it is a cosmetic change. There are thousands of infoboxes with blank parameters, I see no reason for this task. Ronhjones  (Talk) 21:18, 26 November 2018 (UTC)

Short descriptions: find & replace

From WP:WikiProject Short descriptions#Which articles have a short description on Wikipedia?:

... about 400 are using the SHORTDESC magic word. These should be converted to the standard {{Short description}} template for ease of maintenance.

The task in question consists of finding each article containing

{{SHORTDESC:<xyz>}}

and replacing this code with

{{Short description|<xyz>}}

There are in fact 327 of these at the moment. Would any bot operator like to undertake this? With thanks: Bhunacat10 (talk), 00:03, 12 January 2019 (UTC)

This is an easy search-replace task, but it should be automated IMO, once a month or something. I don't mind adding a cron job on Toolforge unless there is a better idea, or more logical place to do so with an existing tool. -- GreenC 00:20, 12 January 2019 (UTC)
@GreenC and Bhunacat10: I'd like to take a crack at it with awb once my current bot request is processed. Would that be okay? --DannyS712 (talk) 02:15, 12 January 2019 (UTC)
Why not use your Python skills and setup a cronjob on Toolforge so it runs forever. AWB will fix them today but in a year there will be more again. -- GreenC 03:04, 12 January 2019 (UTC)
@GreenC: I don't know how to use toolforge or what a cronjob is. For now I would use AWB on a ~weekly basis (if approved), and would then devote the time to learning toolforge? --DannyS712 (talk) 04:47, 12 January 2019 (UTC)
@DannyS712: A cron job is a computer task executed automatically at a set time, usually at regular intervals (which may range from once per minute up to once per year). This is useful for running periodic maintenance tasks or generating reports, particularly if each edition of the report needs to cover exactly the same time period as the previous ones (a business might use a cron job to start off a daily sales report each night at 00:01, or a weekly report every Sunday at 18:00, etc.). In contrast to tasks initiated by a logged-in user (who would need to log in, set the task off, wait for it to complete, and log off again), they're instead run by something behind the scenes, known as "cron", so that the user who wants the job done can go home, and arrive the next morning knowing that it will have been done for them.
WP:Toolforge is the name given to some of the Wikimedia servers that are dedicated to running maintenance tasks and the like. --Redrose64 🌹 (talk) 16:18, 12 January 2019 (UTC)
@Redrose64: Could I do it as a user initiated task while separately figuring out how to do it with toolforge? --DannyS712 (talk) 17:06, 12 January 2019 (UTC)
You might want to use the 327 as test data at BRFA. The nature of wikipedia is the data holds unexpected surprises and the more test data you have to work with the better, when developing a bot. For example, a bot would ignore cases involving nowiki, <!-- comments -->, <pre>pre </pre>. (like in this post). That's just off the top of my head. -- GreenC 20:26, 12 January 2019 (UTC)
@GreenC: At some point ill learn github and try to figure out how to use toolforge, but for now I don't have the time. I'd like to do a bot run with awb for this task, but if someone else wants to make a tool that does this automatically then fine. --DannyS712 (talk) 23:22, 12 January 2019 (UTC)
OK I will do it then. It would help me in creating the bot to have the dataset available to learn and test from, not previously fixed by an AWB regex search-replace. If AWB is the kind of work you seek, try Wikipedia:AutoWikiBrowser/Tasks - it is the AWB equiv of BOTREQ. There are unresolved AWB requests in the archives of that board. Definitely try Toolforge, a unix shell account. Github not required though they recommend it eventually. -- GreenC 00:32, 13 January 2019 (UTC)

Y Done -- GreenC 16:32, 20 January 2019 (UTC)

The USA isn't in Asia

WP:NRHP maintains lists of historic sites throughout the USA, using a template that (among other things) displays each site's geocoordinates. Problem is, occasionally someone omits the minus sign, leaving a site in the wrong part of the world; in this old revision of National Register of Historic Places listings in Maury County, Tennessee, the coords for Zion Presbyterian Church (|lon=87.145) placed it in western China.

Could someone run through all pages whose title begins with "National Register of Historic Places listings in" and log all of the entries with coordinates placing them in the Eastern or Southern Hemispheres? Please do not fix them at this point, since there are a few sites that really are in the Eastern Hemisphere (you'll find a couple at National Register of Historic Places listings in Aleutians West Census Area, Alaska, for example), and at least National Register of Historic Places listings in American Samoa has some Southern Hemisphere locations. Presumably the bot could create a page in its userspace noting each list with potential problems and mentioning the names of the sites on each list with the offending coords; a human could easily run through this list and remove false positives, like the Aleutians and American Samoa.

Thank you. Nyttend (talk) 21:07, 13 January 2019 (UTC)

You could also try some regex searches like hastemplate:"NRHP row" insource:/lon *= *[0-9]/ and hastemplate:"NRHP row" insource:/lat *= *-/. PrimeHunter (talk) 23:06, 14 January 2019 (UTC)

Recently, consensus was reached to move all the articles on elections and referendums to have the year at the front (e.g.: "United States Senate elections, 2018" was moved to "2018 United States Senate elections"; see Wikipedia talk:Naming conventions (government and legislation)/Archive 2#Proposed change to election/referendum naming format, issue resolved on 20 November 2018). This left us with a huge number of redirects, sometimes double redirects. I was wondering if there is a chance that a bot fixes all those links. --Checco (talk) 09:38, 28 December 2018 (UTC)

Resolved

Category:Pages using infobox bridge with unknown parameters has at 'L' probably 2500 articles which have obsolete parameters. Could they be removed? They are |lat= |long= |map_cue= and |map_text=. I can then deal with the proper errors. Twiceuponatime (talk) 11:11, 24 December 2018 (UTC)

Picking an article at random from the "L" section, I found Folly Bridge, which has four unsupported parameters, all of which are empty. Removing them would be a cosmetic edit except for the removal of the hidden category, which could more easily be accomplished by setting "ignoreblank = y" in the unknown parameter check. Blank unsupported parameters do no harm and are usually ignored in the error check. I don't see a discussion on the template's talk page that resulted in the non-standard removal of "ignoreblank = y"; I recommend that it be reinstated so that the tracking category shows only actual errors. – Jonesey95 (talk) 21:34, 24 December 2018 (UTC)
For the record it's |ignoreblank=1, but I've made that change since it's likely uncontroversial. For whatever reason the TemplateData tracking actually shows no invalid params in use, which I don't think I've ever seen in an infobox. My bot does have clearance to remove invalid params from template usage, but only after a discussion determines there are simply too many to remove manually. Primefac (talk) 22:24, 24 December 2018 (UTC)
The category is empty now. It looks like the TemplateData report was correct. – Jonesey95 (talk) 00:40, 26 December 2018 (UTC)

Getting a list of data from "lblN" parameters of Template:Infobox character

I'm wondering if someone can help me out with a bot that would go over the articles listed in Category:Articles using Infobox character with multiple unlabeled fields, get the text of the "lblN" parameters (|lbl1=, |lbl2=, etc.) and output it to a list/table so that I can see what text is being used and how many times? If this can be combined with the unknown fields used at articles listed at Category:Pages using infobox character with unknown parameters that would be even better. Is this possible? Thanks. --Gonnym (talk) 10:56, 20 December 2018 (UTC)

If you can persuade someone to add TemplateData to the template's documentation page, the next monthly report will list all of the parameters in use and their values. – Jonesey95 (talk) 11:58, 20 December 2018 (UTC)
@Jonesey95: I've added it now, but looking at the report for Infobox television episode it seems that when there are more than 50 unique values, it doesn't list them. Am I not looking in the right place? --Gonnym (talk) 14:46, 20 December 2018 (UTC)
I don't know of an easy way to get those values, but if you return here in early January after the report is generated, you'll be able to supply a list of articles for someone to analyze for this parameter. – Jonesey95 (talk) 23:46, 20 December 2018 (UTC)
But I already supplied a list of articles - those in Category:Articles using Infobox character with multiple unlabeled fields. --Gonnym (talk) 13:43, 21 December 2018 (UTC)
My mistake. Sorry about that. – Jonesey95 (talk) 14:06, 21 December 2018 (UTC)
Y Done Hi @Gonnym: I wrote a custom module to pull the data for the first part of your request using AWB. The data and the custom module script are here. I picked up 35 labels supported by the template. Please click Edit source and copy the data to a text file. Ganeshk (talk) 03:07, 8 January 2019 (UTC)
Thank you very much Ganeshk! --Gonnym (talk) 19:39, 8 January 2019 (UTC)

| pushpin_map = Czechia

Please make this COMMONNAME change:

before

after

| pushpin_map = Czechia Prague Central

| pushpin_map = Czech Republic Prague Central

| pushpin_map = Czechia Prague Charles Bridge

| pushpin_map = Czech Republic Prague Charles Bridge

| pushpin_map = Czechia

| pushpin_map = Czech Republic

Number of spaces (or Tabs) may vary

Thanks Chrzwzcz (talk) 12:54, 8 December 2018 (UTC)

The location map modules in question, e.g. Module:Location map/data/Czechia Prague Central, appear to be the module equivalent of redirects, so as far as I can tell, these edits would be cosmetic edits (no effect on the rendered page). Chrzwzcz, is there a consensus to delete these redirects? Also, is there a reason that you are requesting only these three and not the other eight or so maps that start with "Module:Location map/data/Czechia"? – Jonesey95 (talk) 13:29, 8 December 2018 (UTC)
Czechia is not a commonname (CZech Republic talk page), even such cosmetic "invisible" occurrences are not welcome. Other 8 are not used (no more, I made some single changes and some renaming myself). All would be unified as "Module:Location map/data/Czech Republic"*. I am not asking for deleting that 11 Czechia redirect pages, but for change of the links to them. Chrzwzcz (talk) 14:07, 8 December 2018 (UTC)
Your request appears to be in conflict with this RFC. – Jonesey95 (talk) 14:33, 8 December 2018 (UTC)
Yeah, but another discussion clearly stated what COMMONNAME still is (not Czechia), and Czechia mentions are still being deleted (rewritten to Czech Republic). In other words Czechia is not allowed in totally random articles, basically it is OK only when citing the source word by word (like ISO standards, UN list or EU document). Chrzwzcz (talk) 15:04, 8 December 2018 (UTC)
It's clear to me that WP:COSMETICBOT and WP:NOTBROKEN both apply here. --Redrose64 🌹 (talk) 17:22, 8 December 2018 (UTC)

Would like to see a bot with a task of changing Template:convert to Template:cvt if the parameter abbr=on is present. It can also do similar changes to an abbreviated template from the actual template. ⊂Emoteplump (Contributions) (Talk) 14:34, 1 February 2019 (UTC)

Reason for this request is that Template:Convert is a highly used template with almost a million links while Template:cvt only has about 12 thousand links. ⊂Emoteplump (Contributions) (Talk) 14:36, 1 February 2019 (UTC)
Declined Not a good task for a bot.. This falls afoul of COSMETICBOT. Primefac (talk) 15:20, 1 February 2019 (UTC)
Requester has been blocked for socking. --Emir of Wikipedia (talk) 21:06, 3 February 2019 (UTC)

Bot to convert Template:Fb cl2 team transclusions to use Module:Sports table

I have tried my hand at creating a bot for this but it is super complicated. I've got a script that I'm running but have yet to get the results reliable enough to be able to use it as a bot. Right now I basically use it to just expedite the process. Basically I copy the table into my code, run it and then copy the results back into the browser. The issue is that I have to manually adjust each result before saving. The need for this/decision to make this change is all covered in this TFD. If anyone is willing to take this on, please let me know? I'd be very eager to work with you and help in any way I can. --Zackmann (Talk to me/What I been doing) 19:42, 20 December 2018 (UTC)

I am more than happy to help, but am going to need some specifics on what exactly needs doing? How are you converting them? --TheSandDoctor Talk 10:25, 24 December 2018 (UTC)
@Zackmann08: Oops, forgot ping. --TheSandDoctor Talk 10:27, 24 December 2018 (UTC)
TheSandDoctor, Bot may use User:Frietjes/fb.js to convert Template:Fb cl2 team transclusions to use Module:Sports table Hhkohh (talk) 13:37, 3 January 2019 (UTC)
TheSandDoctor, your bot can convert them if you are willing to do the following converting because Frietjes did not develop script which can convert them into Module:Sports results
Plastikspork said they were doing something with this. Galobtter (pingó mió) 10:56, 24 December 2018 (UTC)
Frietjes has a script that does it, but it requires human input in the process. I would rather have the tables converted consistently, with meaningful team abbreviations, than some generic AAA, BBB, CCC, etc. The script requires human input to help determine the abbreviations. Thanks! Plastikspork ―Œ(talk) 13:04, 25 December 2018 (UTC)
Plastikspork, maybe team abbreviations input use T1, T2, T3 and so on and competition abbreviations input use C1, C2, C3 and so on in order to support bot task Hhkohh (talk) 13:22, 3 January 2019 (UTC)
@Plastikspork, Hhkohh, and Galobtter: Is there a table of abbreviations? I am not very familiar with the sport. --TheSandDoctor Talk 18:17, 3 January 2019 (UTC)
TheSandDoctor, I do not find it. So I have asked it in WT:FOOTY and fb script can provide default abbreviations but need adjust if necessary Hhkohh (talk) 18:49, 3 January 2019 (UTC)
@Hhkohh: I have tried using the userscript, but it produces no difference? --TheSandDoctor Talk 18:51, 3 January 2019 (UTC)
Why not write code to check the first three letters of the team name, capitalise, check to see if that abbreviation already exists in the list, if it does look at the first 3+n letters until you get a distinct hit? Do the abbreviations have to be only three letters? There's no master abbreviation table, and I've never used one when using the new template. SportingFlyer talk 18:51, 3 January 2019 (UTC)
TheSandDoctor, which article? Hhkohh (talk) 18:53, 3 January 2019 (UTC)
@Hhkohh: 2003–04 Rangers F.C. season, 1901–02 East Stirlingshire F.C. season, 1903–04 East Stirlingshire F.C. season....literally every article I have tried it on (there are a couple more I forget) --TheSandDoctor Talk 18:56, 3 January 2019 (UTC)
TheSandDoctor, but why I can? [12] You need click convert fb button (under page information button) in edit page. Then the browser will show input box for you Hhkohh (talk) 19:05, 3 January 2019 (UTC)
@Hhkohh: I'm not sure. For me I click through all of the boxes and then it just refreshes to "No difference" and the edit was not logged. --TheSandDoctor Talk 19:14, 3 January 2019 (UTC)
Pinging Frietjes Hhkohh (talk) 19:24, 3 January 2019 (UTC)
TheSandDoctor, try this one. I have converted it successfully but I did not save in order to let you practice Hhkohh (talk) 02:31, 4 January 2019 (UTC)
@Hhkohh: Tried in Chrome: nothing. Firefox? Nothing. Not sure why it doesnt work for me. --TheSandDoctor Talk 05:36, 4 January 2019 (UTC)
TheSandDoctor, I run fb script on my mobile phone on Safari browser Hhkohh (talk) 14:37, 4 January 2019 (UTC)
for team season articles, where possible, we should transclude the tables from main season article. I have been working on the {{fb cl team 2pts}} tables, and once that is done, I will go back to the 3pts tables. Frietjes (talk) 19:26, 7 January 2019 (UTC)
@Zackmann08:, please stop converting as Frietjes is taking care of it, thanks Hhkohh (talk) 08:47, 9 January 2019 (UTC)