Wikipedia:Link rot/URL change requests/Archives/2024/January
This is an archive of past discussions about Wikipedia:Link rot. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current main page.
Gospel Music Hall of Fame
Hello. The old URL for the Gospel Music Hall of Fame looks to be usurped. The new URL was working at least until September 2023. Not sure which solution is better: 1) convert the old links to new links and use archive URLs 2) use archived URLs for both old and new links. Luckily, with the two URLs there are fewer than 100 links to work through. Thanks! MrLinkinPark333 (talk) 19:57, 19 December 2023 (UTC)
- Update: the new url is working today. Taking a look at the URLs, some of them are easier to change over than others:
- /site/ with name: this to that
- /speaker-lineup/ with name can be converted like /site/:
- Any with ID numbers would need manual converting. I.e. this to that
- Any with years could be manually converted to individual bios. E.g. this to that. However, 2000 is used in 2 articles.
- There are other exceptions where either the new URL is blank or the URL is in a slightly different order. I think an archived copy of Bartlett's old URL would be more useful as his article at Eugene Monroe Bartlett is referencing more than his year of induction.
Would this work or is there a simpler solution? Thanks! --MrLinkinPark333 (talk) 02:50, 29 December 2023 (UTC)
- User:MrLinkinPark333 - gmahalloffame.org is in 30 mainspace articles. I can convert where possible using the two rules you found, and for the manual ones, I'll change to archive URLs. If you want to manually repair them, I'll provide the list of the articles/URLs which were converted to archive URLs. It will also check for the string "Biography coming soon" and treat those pages as dead. And I'll check what else might come up in the logs like soft404 redirects to the home page. -- GreenC 17:12, 31 December 2023 (UTC)
- Since it's a small list, I could fix whatever didn't get converted over. Thanks! MrLinkinPark333 (talk) 18:38, 31 December 2023 (UTC)
- The bot only edited 15 pages. You can check two places: Special:Contributions/GreenC_bot (ending at Dolly Parton). And Search of gmahalloffame.org. Of the edits most were adding archive URLs. The pages it didn't edit, most already had archive URLs, and with no available replacement page there was nothing it could do. -- GreenC 19:58, 31 December 2023 (UTC)
- Thank you for the quick reply! MrLinkinPark333 (talk) 21:11, 31 December 2023 (UTC)
Ilta-Sanomat
Around 346 articles (full list including the ones that already use archived URLs) have URLs to the Finnish newspaper Ilta-Sanomat's website http://www.iltasanomat.fi/ that now redirects to the main page https://www.is.fi/
It seems that URLs that have an ID starting with the numbers 200000 can be fixed by simply changing "iltasanomat" to "is", e.g.:
- http://www.iltasanomat.fi/kotimaa/art-2000001140341.html -> https://www.is.fi/kotimaa/art-2000001140341.html
- http://www.iltasanomat.fi/viihde/art-2000000039757.html -> https://www.is.fi/viihde/art-2000000039757.html
- http://www.iltasanomat.fi/kotimaa/art-2000001253854.html -> https://www.is.fi/kotimaa/art-2000001253854.html
(I also changed to HTTPS in those examples)
But the URLs with IDs starting with the number 1 or URLs with completely different patterns can't be fixed by changing "iltasanomat" to "is", e.g.: ("Sivua ei löydy" is Finnish for "Page not found")
- http://www.iltasanomat.fi/musiikki/art-1288486980607.html -> https://www.is.fi/musiikki/art-1288486980607.html
- but here's a working URL starting with "200000": https://www.is.fi/musiikki/art-2000000525543.html
- http://www.iltasanomat.fi/kotimaa/art-1444805886271.html -> https://www.is.fi/kotimaa/art-1444805886271.html
- working: https://www.is.fi/kotimaa/art-2000001019059.html
- http://www.iltasanomat.fi/kotimaa/art-1288651998157.html -> https://www.is.fi/kotimaa/art-1288651998157.html
- working: https://www.is.fi/kotimaa/art-2000000714120.html
- http://www.iltasanomat.fi/uutiset/kotimaa/uutinen.asp?id=1740927 -> https://www.is.fi/uutiset/kotimaa/uutinen.asp?id=1740927
- working: https://www.is.fi/kotimaa/art-2000000326929.html
So, what would be the optimal way of fixing these?
- A) Setting an archive link to all of them?
- B) Changing the ones starting with "200000" from "iltasanomat" to "is" and setting an archive link to others?
- C) ...?
Also, there are thousands of articles with the same issue on fi.wikipedia, so helping that project too would be much appreciated. 85.76.13.79 (talk) 15:12, 20 December 2023 (UTC)
- I checked around for redirect information such as in the Wayback Machine or in headers and can't find anything, so there is no map how to move the non-20000 links. The 20000 links can be moved. Thus, solution "B" for enwiki. For fiwiki, unfortunately my bot is not configured to work with Finnish citation templates. However I can change the entire domain to "permadead" in the IABot settings, this will inform IABot to convert every iltasanomat.fi link on 300+ wikis to an archive URL. -- GreenC 20:54, 1 January 2024 (UTC)
- Ok, plan B for en-wiki and changing iltasanomat.fi to permadead for other wikis sounds good. Thank you in advance. (Original poster). 2001:14BA:9C98:7100:C993:D281:D619:D802 (talk) 15:48, 2 January 2024 (UTC)
- Results: 487 pages contain the domain. Checked each and made changes in 378 pages (some already had archive URLs). Converted 163 URLs of the -20000 type, added 320 new archive URLs, added 12 {{dead link}}, changed 12 |url-status=live to dead. Uploaded results (archive URLs) to IABot, and changed the domain to "permadead" so it will propagate on other wikis. IABot has recorded over 6,000 unique URLs. -- GreenC 20:20, 2 January 2024 (UTC)
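A minimal sketch of the "plan B" rule discussed above, assuming the only reliable mapping is the one reported in this thread: article URLs whose ID starts with "200000" move to www.is.fi, and everything else has no known mapping. The function name `migrate_iltasanomat` and the regular expression are illustrative, not taken from GreenC's bot.

```python
import re
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Article paths whose ID starts with "200000", e.g. /kotimaa/art-2000001140341.html
NEW_STYLE_ID = re.compile(r"/art-200000\d+\.html$")


def migrate_iltasanomat(url: str) -> Optional[str]:
    """Return the is.fi equivalent of an iltasanomat.fi URL, or None if no mapping is known."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host not in ("iltasanomat.fi", "www.iltasanomat.fi"):
        return None  # not an Ilta-Sanomat link at all
    if not NEW_STYLE_ID.search(parts.path):
        return None  # old-style ID or other pattern: the caller should use an archive URL
    # Keep path, query and fragment; switch the scheme to HTTPS and the host to www.is.fi.
    return urlunsplit(("https", "www.is.fi", parts.path, parts.query, parts.fragment))
```

A `None` result corresponds to the links that, per the thread, can only be rescued with an archive URL or marked dead.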
bird-stamps.org
Domain bird-stamps.org has been usurped and redirects to the home page. Link search shows about 275 articles with such links; a relative handful of these have been updated with archive links. Fabrickator (talk) 08:51, 31 December 2023 (UTC)
- A WP:JUDI gambling site. Added to queue: Special:Diff/1193111754/1193243552 -- GreenC 20:26, 2 January 2024 (UTC)
Memória Globo
Most Memória Globo links are dead (like https://memoriaglobo.globo.com/programas/entretenimento/novelas/zaza.htm), there are more on Portuguese Wikipedia. Notrealname1234 (talk) 18:06, 31 December 2023 (UTC)
- User:Notrealname1234: There are some working URLs, eg. [1]. I'll check each, can't set all dead. Portuguese Wikipedia has its own archive bots and archive provider, it's one of a few sites where IABot is unable to run, and my bot can't run anywhere but Enwiki. -- GreenC 20:34, 2 January 2024 (UTC)
- It's done. Edited 144 pages, added 243 archive URLs, 7 {{dead link}}, moved 114 URLs to a new URL (redirects), updated IABot. -- GreenC 01:33, 3 January 2024 (UTC)
www.amjbot.org
We have hundreds of links to URLs like http://www.amjbot.org/content/96/3/668.full, which just serve an HTTP 404 in response. They can simply be removed, if they're in the URL parameter of a citation template with a DOI (which leads to the real current location of the current publisher's version). Nemo 16:36, 1 January 2024 (UTC)
- User:Nemo_bis:
- In 497 pages, I removed the URL for {{cite journal}} and {{citation}} where there is a |doi= eg. Special:Diff/1189605841/1193313192
- In 175 pages (with other templates or no doi), I added 169 archive URLs, 22 {{dead link}} eg. Special:Diff/1100760787/1193405038
- -- GreenC 17:48, 3 January 2024 (UTC)
- Nice! Thanks, Nemo 16:27, 4 January 2024 (UTC)
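The first step described above (dropping the dead URL from a citation that already carries a DOI) could be sketched roughly as follows. This is an illustration only: it uses plain regular expressions, handles only templates with no nested templates inside them, and the name `drop_dead_url` is hypothetical. A real bot would need a proper wikitext parser.

```python
import re

# {{cite journal ...}} or {{citation ...}} templates that contain no nested templates.
TEMPLATE = re.compile(r"\{\{\s*(?:cite journal|citation)\s*\|[^{}]*\}\}", re.IGNORECASE)
# A |doi= parameter with a non-empty value.
HAS_DOI = re.compile(r"\|\s*doi\s*=\s*[^|}\s]")
# A |url= parameter pointing at the dead host, up to the next pipe or closing brace.
DEAD_URL = re.compile(r"\|\s*url\s*=\s*https?://(?:www\.)?amjbot\.org/[^|}]*")


def drop_dead_url(wikitext: str) -> str:
    """Remove amjbot.org |url= parameters from citations that already have a DOI."""

    def fix(match: "re.Match[str]") -> str:
        template = match.group(0)
        if HAS_DOI.search(template):
            return DEAD_URL.sub("", template)
        return template  # no DOI: leave it for archive-URL handling instead

    return TEMPLATE.sub(fix, wikitext)
```

Citations without a DOI are returned unchanged, which matches the second step in the thread where those were given archive URLs or {{dead link}} instead.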
ebooks.adelaide.edu.au (404)
460 pages. "eBooks@Adelaide has now officially closed", January 7, 2020. There is no copy or replacement site. Prior to 2014 it was http://etext.library.adelaide.edu.au (same paths).
- If path contains ".html" then convert to an archive URL
- If path contains 4 elements and ends in "/" eg. http://ebooks.adelaide.edu.au/k/kant/immanuel/k16p/ then add "complete.html" and convert to archive URL ie. http://ebooks.adelaide.edu.au/k/kant/immanuel/k16p/complete.html -> https://web.archive.org/web/20110309070433/http://ebooks.adelaide.edu.au/k/kant/immanuel/k16p/complete.html
- If path contains 3 elements and ends in "/" eg. http://ebooks.adelaide.edu.au/m/mill/john_stuart/ convert to archive URL
- Exceptions to rule 2 & 3 are Plutarch, Voltaire, etc.. eg. https://ebooks.adelaide.edu.au/p/plutarch/symposiacs/ .. check logs for other exceptions
- Optionally where no archive exists, either remove URL from citation or nuke citation if an external link section.
-- GreenC 18:10, 6 January 2024 (UTC)
- Done, saved all but a handful. The existing links were often not to the full text, the archive version didn't follow the chapter tree so the texts were incomplete. I moved many to the "complete.html" version, which is the entire text on a single page, then converted to the archive.org version of that page. Special:Diff/1061289409/1195386287 .. Also, most are 19th century texts, they could be replaced by Gutenberg etc -- GreenC 04:57, 14 January 2024 (UTC)
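The first three path rules listed above can be sketched as a small function. It assumes URLs carry no query string or fragment, does not handle the Plutarch/Voltaire-style exceptions (those still need the manual log check), and takes the Wayback timestamp as an input rather than looking it up. The names `archive_target` and `wayback_url` are illustrative.

```python
from typing import Optional
from urllib.parse import urlsplit


def archive_target(url: str) -> Optional[str]:
    """Return the URL whose snapshot should be used, or None if no rule applies."""
    path = urlsplit(url).path
    if ".html" in path:
        return url  # rule 1: already points at a page
    if not path.endswith("/"):
        return None
    elements = [part for part in path.split("/") if part]
    if len(elements) == 4:
        return url + "complete.html"  # rule 2: a work directory, use the single-page text
    if len(elements) == 3:
        return url  # rule 3: an author directory
    return None


def wayback_url(timestamp: str, url: str) -> str:
    """Build a Wayback Machine URL from a 14-digit snapshot timestamp and the original URL."""
    return f"https://web.archive.org/web/{timestamp}/{url}"
```

For example, `wayback_url("20110309070433", archive_target("http://ebooks.adelaide.edu.au/k/kant/immanuel/k16p/"))` reproduces the archive URL quoted in rule 2.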
oxfordislamicstudies.com
The domain "oxfordislamicstudies.com", referenced in about 400 articles, is returning the "NET::ERR_CERT_COMMON_NAME_INVALID" error.
It seems that in at least some cases, the current content is available at oxfordreference.com. Other possible places to look would be oxcis.ac.eu or perhaps ox.ac.uk. I really have no idea to what extent archive copies of oxfordislamicstudies.com provide any useful content. Fabrickator (talk) 19:05, 7 January 2024 (UTC)
- In the case of http://www.oxfordislamicstudies.com/article/opr/t125/e2280?_hi=2&_pos=2 (non-working link), the archive copy returns useful content, while the oxfordreference.com link provides too little content to likely be of any use. Fabrickator (talk) 19:27, 7 January 2024 (UTC)
- No archived version available at https://fatcat.wiki/release/lookup?doi=10.1093/acref/9780195165203.001.0001 yet either. Were they all HTML pages only or was there a PDF somewhere? Nemo 20:42, 7 January 2024 (UTC)
- The book itself is archived (example). Nemo 20:43, 7 January 2024 (UTC)
- According to [2]: "Oxford Islamic Studies Online product site has been retired. Content you previously purchased on Oxford Islamic Studies Online has now moved to Oxford Reference, Oxford Handbooks Online, or What Everyone Needs to Know." They are paywall sites and there is no redirect map. The Wayback links will probably be better, worth a try. -- GreenC 05:23, 14 January 2024 (UTC)
Fabrickator: In 317 articles, I added 413 new archive URLs, 19 {{dead link}}, and changed 106 |url-status=live to dead. -- GreenC 22:30, 15 January 2024 (UTC)
now Malware: myetymology.com
There are at least fifty uses of "www.myetymology (dot) com" on en.wiki [3], both bare URLs and in Cite templates. This domain seems to have some tricky malware scheme on it: visited via a Chrome browser it shows a page with the Chrome logo and text with something about having to verify that you're human and you should click "Allow". Via a Firefox browser, it puts up a grayed-out dummy page with a white dialog-box-like splash area saying "Before you continue to myetymology.oom" and blather about security and "download Firefox add-on", with a single button labeled "continue". It does tricky stuff too: when I switched away from the Chrome window to invoke the snip-it utility to capture it, it changed the display so that it showed a duckduckgo search for "!ducky" (a search engine I don't use). The domain has definitely been usurped, is very likely dangerous, and needs to be eradicated from Wikipedia. -- R. S. Shaw (talk) 04:13, 10 January 2024 (UTC)
- User:R._S._Shaw: Added to the WP:JUDI queue for usurpation Special:Diff/1193243552/1195955910 -- GreenC 22:35, 15 January 2024 (UTC)
Change of URL for Lawfare
The website has undergone a total revamp, including a change of URL from lawfareblog.com to lawfaremedia.org.
Valjean (talk) (PING me) 16:31, 10 January 2024 (UTC)
- Valjean: Done. I changed the domain, and also checked for redirects, and the live status of each URL. It was more difficult due to CloudFlare DDoS mitigation blocking the bot, but resolved. About 408 URLs changed Special:Diff/1187555382/1196040749, another 18 moved the archive URLs and modified |url-status= Special:Diff/1177313295/1196042453. regards -- GreenC 04:34, 16 January 2024 (UTC)
- Thanks! -- Valjean (talk) (PING me) 05:23, 16 January 2024 (UTC)
2002 Winter Olympics torch relay broken archive links
Hello. Both 2002 Winter Olympics and 2002 Winter Olympics torch relay use this archive URL but it does not work. Instead it redirects to the Wayback Machine and has a question mark in the URL. Looking at old archived copies of this link, none of the 2001 and 2002 versions work despite being highlighted in blue. Some of the 2002 archived copies redirect to a blank page. I was wondering why this was the case. Thanks! MrLinkinPark333 (talk) 20:50, 16 January 2024 (UTC)
- I reported it, but cannot guarantee it will get resolved. I looked in various places and ways and cannot find a working replacement for this archive. It's an old site (by Internet standards) and went dead within a few years of creation. Thanks for the report. -- GreenC 22:01, 16 January 2024 (UTC)
- No worries! It does make me wonder if any other archived URLs used on Wikipedia instead redirect to the Wayback Machine and put a question mark into the URL. This is the first time that has happened to me. MrLinkinPark333 (talk) 00:49, 17 January 2024 (UTC)
- There is link rot within the Wayback Machine itself. My bot WaybackMedic was made (and named) for that purpose, but it takes so long now to check every archive URL, due to the volume, it's not feasible to run it that way anymore. When we started in 2015 there were around 600k archive URLs on enwiki, now there are nearly 12 million and adding about 200k a month. -- GreenC 01:35, 17 January 2024 (UTC)
- Ah. I wasn't aware of the issues with the Wayback Machine. Hopefully this is a limited issue. MrLinkinPark333 (talk) 02:25, 17 January 2024 (UTC)
- Yes I believe it's a very small fraction. Of course we don't know what we don't know, cases like this are only knowable by manual discovery. If it was a lot we'd be hearing more complaints. The cases I can detect, it's like 0.0005% error rate. -- GreenC 03:47, 17 January 2024 (UTC)
ir.uiowa.edu
This repository was retired and its contents went in various directions, including pubs.lib.uiowa.edu and scholarworks.wmich.edu. The domain currently serves TLS errors, while at some point it seemed to redirect all requests to an unrelated frontpage. URLs can be replaced where an OA copy is available, but as a first step it's ok to just remove all links in cite journal templates where a DOI is present. Nemo 13:34, 6 January 2024 (UTC)
- User:Nemo_bis is "OA" -> "IA"? Otherwise I don't know what OA means. If it is IA, the example diff [4] shows the migration of ir.uiowa.edu -> pubs.lib.uiowa.edu .. are you suggesting using IA snapshots to find the redirect? Unfortunately it doesn't look like IA saved the correct redirect information. [5] Is there some place else to obtain the new URL? -- GreenC 17:32, 6 January 2024 (UTC)
- No, OA as in open access. Citation bot will add the OA links later if the broken links are removed. I was only asking about the removal, sorry. Nemo 15:51, 7 January 2024 (UTC)
User:Nemo_bis, there are 418 pages with the domain. For all cite journal with a doi: A) In 132 citations removed the URL Special:Diff/1137009702/1194866745. B) In another 84 there was a working redirect migrated Special:Diff/1184196199/1194866750. For everything else not a cite journal with a doi: C) Added 198 archive URLs Special:Diff/1186059609/1194876255. Migrated 54 redirects same as B). And D) added 8 {{dead link}} Special:Diff/1173723334/1194876457. -- GreenC 05:39, 11 January 2024 (UTC)
- Outstanding! I thought figuring out the redirects would be too much work (some go to a Primo frontpage). Nemo 21:21, 12 January 2024 (UTC)
- I can usually catch those that redirect to the same place, by the nature of the same destination URL showing up multiple times in the logs, during a trial-run. I add a trap for them in the code to treat those redirects as dead links, and rerun it again. Almost every domain has this problem, to some degree. It's hard to fully automate but I have as much as possible. -- GreenC 22:27, 12 January 2024 (UTC)
- Cool! Makes sense. Nemo 07:25, 14 January 2024 (UTC)
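The trap described above (treating a destination that many different links redirect to as a soft 404) can be sketched like this. It assumes a trial-run log is available as a mapping from source URL to final destination. The threshold `min_hits`, the function names, and the "dead"/"moved" labels are all illustrative choices, not taken from the bot.

```python
from collections import Counter
from typing import Dict, Set


def shared_redirect_targets(redirects: Dict[str, str], min_hits: int = 3) -> Set[str]:
    """Return destinations that at least min_hits distinct source URLs redirect to."""
    counts = Counter(redirects.values())
    return {dest for dest, hits in counts.items() if hits >= min_hits}


def classify(redirects: Dict[str, str], min_hits: int = 3) -> Dict[str, str]:
    """Label each source URL 'dead' if it lands on a shared destination, else 'moved'."""
    traps = shared_redirect_targets(redirects, min_hits)
    return {src: ("dead" if dest in traps else "moved") for src, dest in redirects.items()}
```

A frequency threshold alone can misfire when several citations legitimately point at the same page, which is one reason the thread describes the process as hard to fully automate.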
- Is there any way to get a list of where these changes were made? I have been correcting all the links as I have time. None of them should be dead and all have live content somewhere, most should be using a DOI (which I have been adding) 1920wr (talk) 16:38, 17 January 2024 (UTC)
- 1920wr, Yes. I could provide a list of the article names for set C), but it would miss pre-existing archive URLs. It's probably better to find them with this search: 196 articles. For set D) that's hard to search for, rather, here are the 8 the bot added a {{dead link}}: Victor L. Littig,Jonathan Blum (writer, born 1967),John Herriott,R. Douglas Hurt,Second plague pandemic,Mayors of Sioux City, Iowa,List of school districts in Iowa,Christopher B. Krebs .. good luck with this project it would be great to see them converted to cite journal with DOI, a major improvement for this domain. If you think there is something I can help with bot let me know. -- GreenC 21:15, 17 January 2024 (UTC)
Big Cartoon DataBase
Per Wikipedia:Templates for discussion/Log/2024 January 16#Big Cartoon DataBase Template:Bcdb and Template:BCDB title are being deleted, however there are many other non-templated links to that website that aren't working (see for example the second reference at Tod Carter or the external link at Knight-mare Hare). Reporting here as I don't think anything is currently done with these (archived, marked as dead, or removed) Gonnym (talk) 14:04, 23 January 2024 (UTC)
- Gonnym, I see about 1,000 instances of the templates, and another 1,400 links. The site has been "excluded from the Wayback Machine". But, the first one I checked is available at archive.today. There are a number of options:
- Convert the 1,000 templates to normal square links, then convert those plus the 1,400 to archive.today, where available, or add a {{dead link}} if not. That way if the site is ever un-excluded from the Wayback in future those archives could get added.
- Nuclear option: completely eliminate all citations and links to this site.
- Some other combo, like nuking the 1,000 but trying to save the 1,400 and if any of those don't archive then nuke those etc..
- Both options are a bit of work, nuking is not clean it's semi-automated each one has to be visually verified it didn't mangle things, but I have done it before and the quantity isn't too high. The conversion and archiving is more automated. My suggestion, if you think the site is completely unreliable and should be eliminated even when it has archives, the nuclear option, otherwise the first option. -- GreenC 14:40, 23 January 2024 (UTC)
- I have no real opinion here as I hadn't participated in that discussion but I'll ping here others that did. @Snowmanonahoe @TechnoSquirrel69 @WikiPediaAid. Gonnym (talk) 14:47, 23 January 2024 (UTC)
- The site is a wiki... I'm impressed it managed to amass 1400 citations. I say nuke it, because again, it's a wiki. Snowmanonahoe (talk · contribs · typos) 15:57, 23 January 2024 (UTC)
- Thanks for the ping, Gonnym! The links being generated by the template are already being removed by a bot since the TfD closed as delete, so we don't need to worry about those. I would rather not indiscriminately delete the other links in citations, just add the archive URL along with a |url-status=dead if applicable. -- TechnoSquirrel69 (sigh) 15:00, 23 January 2024 (UTC)
- Sounds like that bot is not only eliminating the template, but also the entire citation to BCD. Sounds like a limitation of the bot, it can only delete templates without the option to convert to square links. That's unfortunate because TfD should concern removing templates, not removing citations, which is more the domain of WP:RSN. This is a common scenario with a mix of templates and links and we end up with this inconsistency. Some cites are completely deleted because of the template, others are kept because they are square links, it's random. Anyway this is not directly related to BCD just observing. I can try to archive what is left no problem. -- GreenC 15:25, 23 January 2024 (UTC)
- I don't think the bot is removing citations, just the links generated by the {{bcdb}} template. All of the cite links should still be around. —TechnoSquirrel69 (sigh) 15:45, 23 January 2024 (UTC)
- For now, I'll retain the citations and treat the links as dead. There is no clear consensus to nuke cites entirely. -- GreenC 01:47, 24 January 2024 (UTC)
- Thanks, GreenC! —TechnoSquirrel69 (sigh) 23:06, 24 January 2024 (UTC)
- I made the following edits
- Remove pre-existing Wayback links since they don't work
- Add archive.today links when available (1,025)
- Add {{dead link}} for the rest (697)
- Update iabot.org so changes can propagate to 300+ other language wikis
- If in the future the restriction on Wayback is lifted the bots should be able to convert the dead links. -- GreenC 02:59, 25 January 2024 (UTC)
Gemini, Apollo, Shuttle Mission "Chronology of Wake-up Calls"
This weblink PDF (https://history.nasa.gov/wakeup%20calls.pdf) is used as a secondary source across a large number of articles for the Gemini, Apollo, and especially Space Shuttle missions. It recently got 404'd, but a very recent archived link is available here (https://web.archive.org/web/20231220093919/https://history.nasa.gov/wakeup%20calls.pdf). It would be great if y'all can add this archive link to the queue. SpacePod9 (talk) 00:54, 24 January 2024 (UTC)
- I submitted an IABot job to process the 56 pages where it's located. -- GreenC 01:51, 24 January 2024 (UTC)
- Thanks for the help! SpacePod9 (talk) 03:43, 24 January 2024 (UTC)