User talk:GreenC/2016
Template editing
Your account has been granted the "template editor" user permission, allowing you to edit templates and modules that have been protected with template protection. It also allows you to bypass the title blacklist, giving you the ability to create and edit edit notices.
You can use this user right to perform maintenance, answer edit requests, and make any other simple and generally uncontroversial edits to templates, modules, and edit notices. You can also use it to enact more complex or controversial edits, after those edits are first made to a test sandbox, and their technical reliability as well as their consensus among other informed editors has been established.
Before you use this user right, please read Wikipedia:Template editor and make sure you understand its contents. In particular, you should read the section on wise template editing and the criteria for revocation. This user right gives you access to some of Wikipedia's most important templates and modules; it is critical that you edit them wisely and that you only make edits that are backed up by consensus. It is also very important that no one else be allowed to access your account, so you should consider taking a few moments to secure your password.
If you do not want this user right, you may ask any administrator to remove it for you at any time.
Useful links:
- All template-protected pages
- Request fully-protected templates or modules be downgraded to template protection
Happy template editing! — Martin (MSGJ · talk) 14:40, 14 December 2015 (UTC)
- Just wondering ...
- Could you archive your talk page?
- Have you considered adminship? — Martin (MSGJ · talk) 12:32, 15 December 2015 (UTC)
- @MSGJ:. Done
- I'll look into it. I know nothing about requirements. Thanks. -- GreenC 14:21, 16 December 2015 (UTC)
- Wikipedia:Guide to requests for adminship is perhaps a good place to start. Regards — Martin (MSGJ · talk) 22:43, 16 December 2015 (UTC)
Possible AWB Bug
Hey just a notice that you somehow duplicated the contents of Champ Clark twice. In the process of reverting to the last good version, I removed <ref name="Allan, Chantal page 17">, but I don't think that caused any harm and it seemed to also fix a ref error. Opencooper (talk) 05:20, 17 December 2015 (UTC)
- Hi, thanks for the notice. Never seen that before. I don't know if the problem is with my script or AWB itself. Since I don't know what caused it, what I'll do is add a sanity check: if the edit size is > 50% of the original article, it halts with a warning. I might also retry AWB on Champ Clark and see if it can replicate the problem. -- GreenC 14:19, 17 December 2015 (UTC)
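(For illustration only, a rough Python sketch of the kind of size sanity check described above; the function name and threshold are hypothetical, not the actual AWB module code.)

def edit_size_sane(original_text, new_text, threshold=0.5):
    # Halt with a warning if the edit changes the article size by more than
    # 50% of the original length (hypothetical sanity check, not the real code).
    orig_len = len(original_text)
    if orig_len == 0:
        return False
    if abs(len(new_text) - orig_len) > threshold * orig_len:
        print("WARNING: edit changes article size by more than 50% - halting")
        return False
    return True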
Season's Greetings!
Hello Green Cardamom: Enjoy the holiday season and upcoming winter solstice, and thanks for your work to maintain, improve and expand Wikipedia. Cheers, North America1000 18:34, 20 December 2015 (UTC)
- Use {{subst:Season's Greetings}} to send this message
Merry Christmas and Happy New Year!
Warmest Wishes for Health, Wealth and Wisdom through the Holidays and the Coming Year! Lingzhi ♦ (talk) 22:43, 25 December 2015 (UTC)
Administrators' noticeboard: red herring
There is a response discussion at Wikipedia:Administrators' noticeboard/Incidents regarding an issue you created. Clepsydrae (talk) 19:27, 27 December 2015 (UTC)
As for your comment, "See WP:3RR before you revert again at red herring. You may be blocked from editing Wikipedia," I only reverted the article twice. I edited it multiple times this morning, possibly immediately after a revert, which undoubtedly appeared to you as a "revert." I have no intention, desire, or time to sustain a revert war. I had hoped that you would follow the references, have an "ah-ha..." moment and leave a good, scholarly edit be. Clepsydrae (talk) 19:27, 27 December 2015 (UTC)
2015 Thalys train attack
Hi, through no fault of your own, a recent revert I made at 2015 Thalys train attack has undone a change of yours made on 14 November. This is due to an improper implementation of an agreed-upon Rfc by Tough sailor ouch on 12 November which I subsequently reverted, leaving your edit nowhere to go for the time being. Once the Rfc change is properly implemented (by him or another editor) your edit can be reapplied. I will be happy to apply that edit for you, upon request. Please wait for the re-implementation of the Rfc first.
For details, please see Talk:2015_Thalys_train_attack#Implementation_of_Rfc. If there's anything you can think of to explain it to him better than I was able to, I'd sure appreciate it, cuz I'm not sure I'm getting the point across clearly enough. Cordially, Mathglot (talk) 10:16, 29 December 2015 (UTC)
2016
Thank you for your contributions to this encyclopedia using 21st century technology. I hope you don't get any unnecessary blisters.
- @Cullen328:, hey thanks! Happy New Year. Cool banner. -- GreenC 01:39, 31 December 2015 (UTC)
Happy New Year, Green Cardamom!
Green Cardamom,
Have a prosperous, productive and enjoyable New Year, and thanks for your contributions to Wikipedia. North America1000 00:55, 2 January 2016 (UTC)
- Send New Year cheer by adding {{subst:Happy New Year fireworks}} to user talk pages.
- @Northamerica1000:, thank you, North America, Happy New Year! -- GreenC 16:08, 2 January 2016 (UTC)
Public domain movies
Hello,
Thank you for removing The Klansman from the list of films in the public domain in the United States. I have added another film to the list that I believe to be in the public domain by the name of Born to Win, a 1971 movie released through United Artists that features Robert De Niro in an early appearance. If the source I used isn't reliable, then by all means please remove it from the list. Thank you. Hitcher vs. Candyman (talk) 22:11, 17 January 2016 (UTC)
Lemelson
Sorry about the edit conflict on Emmanuel Lemelson. Coincidentally I was in the middle of a more drastic rewrite of that para., which I've added. Hope you think it's better overall - less promotional. If you want to make other changes to the article, let me know, and I'll work on Lemelson Capital Management for a while instead. —SMALLJIM 14:33, 23 January 2016 (UTC)
- No problem, your edit is a big improvement. I probably won't edit more for the moment. -- GreenC 14:49, 23 January 2016 (UTC)
- OK. Thanks for your other edit there. —SMALLJIM 15:12, 23 January 2016 (UTC)
hold
Dickens worked on David Copperfield for two years between 1848 and 1850. Seven novels precede it, and seven novels would come after it, Copperfield being the mid-point novel.
Tolstoy regarded Dickens as the best of all English novelists, and considered Copperfield to be his finest work, ranking the "Tempest" chapter (chapter 55 (LV) - the story of Ham and the storm and the shipwreck) as the standard by which the world's great fiction should be judged. Henry James remembered hiding under a small table as a boy to hear installments read by his mother. Dostoevsky read it enthralled in a Siberian prison camp. Franz Kafka called his last book Amerika a "sheer imitation". James Joyce paid it reverence through parody in Ulysses. Virginia Woolf, who normally had little regard for Dickens, confessed the durability of this one novel, belonging to "the memories and myths of life".
The story is told almost entirely in the first person, through the voice of David Copperfield himself, and was the first Dickens novel to do so. It is considered a Bildungsroman and would influence later novels in the genre, such as Dickens's own Great Expectations (1861), Thomas Hardy's Jude the Obscure, Samuel Butler's The Way of All Flesh, H. G. Wells's Tono-Bungay, D. H. Lawrence's Sons and Lovers, and James Joyce's A Portrait of the Artist as a Young Man.
As a Bildungsroman, it has one major theme throughout: the disciplining of the hero's emotional and moral life. We learn to go against "the first mistaken impulse of [the] undisciplined heart", a theme which is repeated throughout all the relationships and characters in the book.
Talkback
Message added 04:58, 17 February 2016 (UTC). You can remove this notice at any time by removing the {{Talkback}} or {{Tb}} template.
—cyberpowerChat:Offline 04:58, 17 February 2016 (UTC)
Works not by
Just so you know, it seems that the "Works By" project is linking to irrelevant works. For example, in the link to the Internet Archive you added to Brayton C. Ives labeled "Works by or about Brayton C. Ives", we find one relevant work (repeated 6 times), which already appeared among the external links in the article, and then 19 links to various volumes of the novels and tales of Robert Louis Stevenson, followed by more relevant links. I don't know why the RLS works are showing up, other than having a table of contents that includes "St. Ives". So maybe you need to evaluate your search strings or something similar. I suspect that searching with initials instead of the name actually used by the author (in this case, Brayton Ives or General Brayton Ives) is causing this. - Nunh-huh 05:56, 22 February 2016 (UTC)
- Actually, I've moved the page because there's no source referring to him with a middle initial.... - Nunh-huh 06:08, 22 February 2016 (UTC)
- It's because "Stevenson, Fanny Van de Grift, 1840-1914" (same birth-death) and "St. Ives" are being matched by the IA search engine to the search string
("1840-1914" AND Ives)
. This is intentional as other authors require this to avoid false negatives. Most of the time it works OK but some cases like this produce false positives. It's all a trade off between avoiding false negatives while producing as few false positives as possible. I check each one before adding and felt in this case it wasn't excessive though it did appear to be about 50% I usually aim for 40% or less before adding a custom search (which has its own trade offs thus not done often) but if most cases it's much less and often 0%. -- GreenC 15:02, 22 February 2016 (UTC)- Well, my feeling is that a link to works that may or may not be by or about the subject of the article doesn't really enhance the article. If the current method produces such links, it ought to be changed. It will probably require personal curation—you know, an actual personal assessment of the works linked to—if it's going to be done right. Thanks for fixing this one. - Nunh-huh 17:55, 22 February 2016 (UTC)
- Thank you for bringing it to my attention. I do manually check each search (~93% are rejected on average) and sometimes make a mistake, as it can be hard to interpret why the IA search engine included a certain book. The template documentation {{Internet Archive author}} has more info on how anyone can fine-tune the results, such as done here with the custom search. Unfortunately the goal of 0% false positives is unworkable for a number of reasons. The solution just implemented for Brayton Ives is called a custom search (see the docs); however, it has trade-offs. It's brittle, prone to future problems and lack of maintenance as changes occur at Internet Archive. That's why the template exists: to address these problems. Sometimes the trade-off makes sense, other times not. I make custom searches when I feel it is justifiable, and sometimes miss some. -- GreenC 19:55, 22 February 2016 (UTC)
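(For illustration, a minimal Python sketch of running a search string like the one above against the Internet Archive's advancedsearch endpoint; the field list and result handling are assumptions for the example, not how the {{Internet Archive author}} template actually builds its links.)

import requests

# Run the search string discussed above and list candidate items for manual
# review; false positives such as the Stevenson volumes are expected.
params = {
    "q": '("1840-1914" AND Ives)',
    "fl[]": ["identifier", "title"],
    "rows": 50,
    "output": "json",
}
resp = requests.get("https://archive.org/advancedsearch.php", params=params, timeout=30)
resp.raise_for_status()
for doc in resp.json()["response"]["docs"]:
    print(doc.get("identifier"), "-", doc.get("title"))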
Internet Archive author template
I noticed that you added {{Internet Archive author |sname=James Eustace Bagnall}} to James Eustace Bagnall. Following the link does find some of his publications, but also one by a very different James Bagnall. Not being familiar with {{Internet Archive author}} I don't know if this can be prevented. Peter coxhead (talk) 15:48, 24 February 2016 (UTC)
- sopt=t worked for this case. It means that if any future books are uploaded under only "James Bagnall" they won't be seen, but if you put more weight on avoiding false positives than false negatives, sopt=t will work. My preference is to allow in a few false positives so as not to cause false negatives in the future, but it's a personal call on an article-by-article basis and depends on what the results are. -- GreenC 15:57, 24 February 2016 (UTC)
- Ok, thanks; I understand. Peter coxhead (talk) 17:16, 24 February 2016 (UTC)
re Croses Criquet
Thanks for removing the spammy link. I tried, but failed, and got tied up in knots when making enquiries about how to resolve the matter! Thanks for putting me out of my misery! Regards --Observer6 (talk) 16:36, 28 February 2016 (UTC)
- @Observer6: no problem. When using the template {{cbignore}} just type the letters {{cbignore}} into the article (without the tlx part, which is for displaying the name of the template without actually using the template). -- GreenC 17:01, 28 February 2016 (UTC)
Thx, GreenC. First, I couldn't see how 'cbignore' would solve the problem. Second, I didn't know that 'tlx' needed to be omitted if I was going to use it. Thirdly, I still couldn't understand how changing the bot flag from 'false' to 'true' (or was it the other way around?) was going to solve the problem of a spammy link! But 'failed' was brilliant! Glad to get a third party response and succinct solution to the fundamental problem! --Observer6 (talk) 17:23, 28 February 2016 (UTC)
- Yeah if you don't know how the moving parts work it can be mysterious. There probably needs to be better documentation with clear examples. Maybe I can work on that. -- GreenC 17:30, 28 February 2016 (UTC)
Disambiguation link notification for March 4
Hi. Thank you for your recent edits. Wikipedia appreciates your help. We noticed though that when you edited Aaron Swartz, you added a link pointing to the disambiguation page Scribner. Such links are almost always unintended, since a disambiguation page is merely a list of "Did you mean..." article titles. Read the FAQ • Join us at the DPL WikiProject.
It's OK to remove this message. Also, to stop receiving these messages, follow these opt-out instructions. Thanks, DPL bot (talk) 11:31, 4 March 2016 (UTC)
IA API bug
Are you getting any more bad archives, as many as before? The IA tech team reported to me in an email last night, "An earlier version from last week also fixed a problem in which I was returning the HTTP status of the CDX call rather than the underlying archived HTTP status code." which seems to describe the bug of bad archives with a non-200 OK being mistaken for a good archive with a 200 OK response.—cyberpowerChat:Limited Access 16:25, 8 March 2016 (UTC)
- @C678: Good to hear, hope that is it. They are having challenges. Short outages daily. Connection timeouts. Also saw a case where it reported a match in wayback, then tried again and it reported none found, then tried again and matched - intermittent result. I started today running WaybackMedic which looks for and corrects 4 problems. In the first 250 articles checked (from the 90k+ CB has edited since Dec) it reported problems in 30 articles (about 12%). Will get a bigger sample tomorrow after I manually verify the 30 did the right thing -- GreenC 01:47, 9 March 2016 (UTC)
- I also just deployed a massive update to Cyberbot that should fix a zillion bugs that were reported on the talk page. The scope of the change may have broken something else that I haven't seen, so please let me know if Cyberbot starts acting weird as of the last hour of posting this. :P—cyberpowerChat:Limited Access 16:28, 9 March 2016 (UTC)
- Ok great. Will mark this date as a terminus for the 4 problems listed at WaybackMedic (although the 404 problem might continue depending on IA). -- GreenC 17:06, 9 March 2016 (UTC)
Talk
Hello, I just joined. How do I upload pictures? EnchantingZucchini (talk) 16:09, 24 March 2016 (UTC)
- That's an easy-sounding question that can be complicated. It depends on whether the picture is Fair Use or Public Domain (or Creative Commons). If Fair Use, then it is uploaded directly to this website (left side of page, in the box that says "Tools", "Upload file"). However, you will need to read up on what constitutes Fair Use and provide appropriate rationales etc. If it is public domain, then it is uploaded to https://commons.wikimedia.org and they have their own set of procedures and rules there. -- GreenC 16:17, 24 March 2016 (UTC)
Edit summaries
Hi. Will you please use edit summaries? Cheers fredgandt 14:57, 29 March 2016 (UTC)
URGENT..._ _ _... morse ...---... ...___… Hi, listen, about your summary at https://en.m.wikipedia.org/wiki/Positron_emission_tomography: you can't just end it like that. You only gave one example right there; it is not correct. You need to at least do some research, you know, a little deeper than you already do; might as well, since you dug that far on (if you are as smart as I think you are, which I'll assume for now that you are) what is a very unintellectual subject. So, just because I have OCD (sorry for the caps), I humbly ask you to just put another possibility that it may be used for. Thank you SubtleAlpha246 (talk) 09:42, 21 June 2016 (UTC)
Oh yeah sorry about that urgent OCD. Sir I am so sorry. SubtleAlpha246 (talk) 09:43, 21 June 2016 (UTC)
AfD
Hi GreenC: A recent edit you performed at AfD has been reverted. You may want to check it out. North America1000 03:34, 1 April 2016 (UTC)
- @Northamerica1000: - Clever. Thanks for the laugh - and not having to deal with a real dispute! -- GreenC 13:24, 1 April 2016 (UTC)
- You're welcome? Cheers! North America1000 14:00, 1 April 2016 (UTC)
Disambiguation link notification for May 4
Hi. Thank you for your recent edits. Wikipedia appreciates your help. We noticed though that when you edited Best Translated Book Award, you added a link pointing to the disambiguation page Liu Xia. Such links are almost always unintended, since a disambiguation page is merely a list of "Did you mean..." article titles. Read the FAQ • Join us at the DPL WikiProject.
It's OK to remove this message. Also, to stop receiving these messages, follow these opt-out instructions. Thanks, DPL bot (talk) 10:45, 4 May 2016 (UTC)
Edit to Reading Half Marathon by WaybackMedic
Hi. Could you take a look at the recent edit by your bot (WaybackMedic) to Reading Half Marathon. I don't know which archiveurl= parameter it was trying to fix, but it appears to have replaced all the newlines in the article with ***!!***, thus rendering the whole article pretty nigh unreadable. I've reverted the change, so you will need to look back in the history to oldid:722269132. -- chris_j_wood (talk) 12:12, 27 May 2016 (UTC)
- Yes, bug fixed. Thanks for the notice. -- GreenC 17:04, 27 May 2016 (UTC)
WaybackMedic and talk pages
Thank you for this edit at Tomb of Absalom; I agree that there is no valid archived link for that citation.
Would it be possible for the bot to also update the {{sourcecheck}} template on the talk page? In this case I have updated the parameter at Talk:Tomb of Absalom. – Fayenatic London 13:08, 27 May 2016 (UTC)
- There are problems with the sourcecheck system. If there is more than 1 link, there is no clear way to designate a "fail" or "pass" for each link, much less to do it with a bot, since the formatting is somewhat free-form. It would take a lot of work to prevent messing up the talk page, and I'm not sure it would be a big gain because it doesn't impact anything other than a notification; also, WM is operating on archive.org links besides those added by Cyberbot, and my guess is it's only impacting about 5-10% of those. I'm also uncomfortable doing by bot what was meant to be a manual check, since there can be soft-404s that bots believe are valid but only a human can verify; it would still work for the deletions, but the problems mentioned above remain. -- GreenC 17:27, 27 May 2016 (UTC)
- OK, thanks. I'll watch out for others that need checking manually. – Fayenatic London 11:20, 28 May 2016 (UTC)
Cucuteni-Trypillian culture edits
Could you please explain in plain English what is going on with this article? The last two edits are making me very concerned. Please reply to my own talk page. Thanks. --Saukkomies talk 20:59, 27 May 2016 (UTC)
- @Saukkomies: It's the same software bug noted above, it impacted 10 out of 1500 articles edited, or about half of one percent. The bug is fixed, the impacted articles were corrected, and the bot will not be editing those articles again. (Keeping on this page since multiple threads on the same topic). -- GreenC 21:30, 27 May 2016 (UTC)
- Thanks for the explanation. --Saukkomies talk 16:41, 30 May 2016 (UTC)
Re: Webcite
It doesn't redirect but is actually a Wayback archive page that I webcited lol. I noticed a while back that web archive links rot/expire eventually but webcite seemed more lasting. Which is why I objected to the bot's changes Dan56 (talk) 23:48, 27 May 2016 (UTC)
- @Dan56: That is true, good idea as robots.txt can make pages disappear from Wayback, or even disappear for unknown reasons. Rather than battling with the bot, there is a {{cbignore}} template ("Cyberbot Ignore"). -- GreenC 00:31, 28 May 2016 (UTC)
- Cyberbot is not adding a bad link. Cyberbot is moving an archive URL to the proper field. I have no idea why there is an archive of an archive, but the URL parameter needs to have the original URL of the page being displayed, especially since a broken WebCite link makes it impossible to figure out what the original URL was supposed to be.—cyberpowerChat:Online 01:42, 28 May 2016 (UTC)
- This is true, the url= field should be the original URL, not an archive URL (archive.org or WebCite). It's confusing when there are two archive URLs in a template, plus the IA API provides a third option. I would probably go with what is known to be working, the API result 200, which I believe is what happened. -- GreenC 02:03, 28 May 2016 (UTC)
- @Cyberpower678:, none of this requires removing the webcite link, which always shows at the top the link that's been archived. Dan56 (talk) 07:53, 28 May 2016 (UTC)
- My point isn't about the bot removing the webcite archives; my point is that the URL field needs the original URL so people can easily see what the archive link was supposed to be archiving. Sure, it's printed at the top of the snapshot, but what if WebCite goes down, crashes, their data centers have a fire, flood, or something that permanently destroys the snapshot? Then no one knows what that WebCite URL was holding in place of the original.
- Well, you would if you'd gone back to the revision dated whenever the BOT made the changes I've been reverting lol Dan56 (talk) 23:57, 28 May 2016 (UTC)
2016 Wikimedia Foundation Executive Director Search Community Survey
The Board of Trustees of the Wikimedia Foundation has appointed a committee to lead the search for the foundation’s next Executive Director. One of our first tasks is to write the job description of the executive director position, and we are asking for input from the Wikimedia community. Please take a few minutes and complete this survey to help us better understand community and staff expectations for the Wikimedia Foundation Executive Director.
- Survey, (hosted by Qualtrics)
Thank you, The Wikimedia Foundation Executive Director Search Steering Committee via MediaWiki message delivery (talk) 21:48, 1 June 2016 (UTC)
WayBack Medic
Hi. I'd like to learn how to use WayBack Medic. I've read User:Green Cardamom/WaybackMedic, but I'm still confused. I think I'd like to test it. Would you please walk me through it? Zigzig20s (talk) 13:44, 4 June 2016 (UTC)
- It's a bot not an end-user tool. It will hopefully eventually process all articles that contain wayback links. -- GreenC 13:50, 4 June 2016 (UTC)
- OK, makes sense. Zigzig20s (talk) 13:53, 4 June 2016 (UTC)
List of things creating Internet Archive search queries
Is there a list somewhere of people/bots who are creating Internet Archive search queries? We're looking into what raw Lucene syntax we need to support in our search interface, and it appears that there are several people/bots creating such searches. Thanks! Greg (talk) 00:20, 5 June 2016 (UTC)
- @Greg Lindahl: .. sending reply via email. -- GreenC 01:29, 5 June 2016 (UTC)
Your dead link bot - idea for improvement
Hi Green Cardamom, I came across the useful work your GreenCbot is doing with dead links. Is it possible to automatically assign a tag on the top of the page and/or (hidden but visible for maintenance workers) category to it? Because now it appears very small in the list of references, e.g. Cesar Department, where I saw it, but the chance it will be picked up by someone and resolved is minute. Also because in the edit summary only appears "WaybackMedic" and no reference to "This page has dead links, please have a look at them" or something along those lines. In general I hate those tags at the top and they are useless in many cases as they stay on the pages for many years, but in this case I think it's useful, especially when linked to the Wikiprojects on the talk page. What do you think? Cheers, Tisquesusa (talk) 08:46, 5 June 2016 (UTC)
- The WP:Link rot problem is pretty immense, and growing daily; this bot is not the only one trying to solve it. I'm not sure there would be consensus to add a top hat for dead links, partly because they are often unsolvable (no archive exists) and because of how common they are. All dead links are invisibly tagged by inclusion in a category, which other bots/people can use to find them. The edit summary, I agree, is not good; the bot uses AWB, which currently has no feature to dynamically modify the edit summary - I've asked for this feature twice. I should be using the direct API instead, but AWB is how the bot was designed when it started out. It's on my list of things to change in future versions. -- GreenC 14:01, 5 June 2016 (UTC)
Edits claiming "not dead"...
This edit [1] removed templates showing a couple of links as dead: it removed two strings like this:
{ { dead link|date=June 2016|bot=medic } } { { cbignore|bot=medic } } (sorry, can't remember proper way to untemplate)
These links just "helpfully" redirect to a generic front page, but they are dead in terms of the relevant content. I'm not sure what to do: I would have thought that cbignore instructed the bot to leave the dead link template, but it doesn't seem to. Imaginatorium (talk) 03:59, 7 June 2016 (UTC)
- Imaginatorium: Yeah. The WaybackMedic bot made some errors in about 300 articles, so I wrote a patch to return and fix them, but the patch created new problems by undoing dead links that were legit in some cases. There was no way around it, so I chose the path of least damage. I've been going through manually, but if you see anything, please help restore the dead link and cbignore templates for any URLs that seem to deserve it. If there was a cbignore before it should be restored (including the |bot=medic). -- GreenC 04:06, 7 June 2016 (UTC)
This message is being sent to inform you that there is currently a discussion at Wikipedia:Bot owners' noticeboard regarding an issue with which you may have been involved. Thank you. — xaosflux Talk 10:57, 7 June 2016 (UTC)
Question about Gutenberg list
Hey there, I see you put 99% for Project Gutenberg; are you suggesting all the red links on Gutenberg need a redirect, as there are already articles for them? Because I noticed on the talk page you said there are ~9,000 Gutenberg tags on Wikipedia and the list has about 8,000? I ask because I notice a lot of redlinks on the Gutenberg list still. Kind regards, Calaka (talk) 09:46, 8 June 2016 (UTC)
- I may have confused things, feel free to revert. My intention was 99% complete in terms of adding the template to available articles, not in creating new articles. -- GreenC 12:13, 8 June 2016 (UTC)
Help:Using the Wayback Machine
Hi! I see that you're doing some cleanup of archive.org links in articles. Over at Help:Using the Wayback Machine, those sections are specifically about asking archive.org to create an archive. Is your edit there correct, or are those URLs with /save/ a special case that should have been left alone? -- John of Reading (talk) 19:16, 14 June 2016 (UTC)
- /save/ creates an archive of the page. It's a command that triggers the Wayback Machine to begin archiving the page. It only needs to be done once. Once the archive is created, linking to an archive page for citation purposes should use a 14-digit date. Since I don't know what the correct date is, replacing it with "*" leads to the archive index page, in the hope that someone will fill in the correct archival date from the index. -- GreenC 19:46, 14 June 2016 (UTC)
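(To make the three URL forms mentioned above concrete, a small Python sketch; the example target URL is hypothetical.)

target = "http://example.com/page.html"  # hypothetical cited URL

# Ask the Wayback Machine to archive the page (only needs to be done once)
save_url = "https://web.archive.org/save/" + target

# Cite a specific snapshot using a 14-digit YYYYMMDDhhmmss timestamp
snapshot_url = "https://web.archive.org/web/20160614000000/" + target

# Link the snapshot index when the exact archival date is unknown
index_url = "https://web.archive.org/web/*/" + target

print(save_url, snapshot_url, index_url, sep="\n")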
- Thank you. That sounds as if the /save/ URLs in the help page are correct, so I will undo your edit there. -- John of Reading (talk) 19:50, 14 June 2016 (UTC)
- John of Reading: I see what you are saying, sorry. Yes, that was a bad edit. I have a list of articles to process and I thought they were all mainspace. I see some WP mixed in. I'll remove those before proceeding. Thanks for the notice. -- GreenC 20:11, 14 June 2016 (UTC)
A barnstar for you!
The Barnstar of Diplomacy
Many thanks to you Good Sir, may the WaybackMedic find continuous blessings and wellness ongoing! SoS was answered and appreciated. "Dutch" Publican Farmer (talk) 23:35, 15 June 2016 (UTC)
Your task has been approved for trial. — xaosflux Talk 03:26, 25 June 2016 (UTC)
- I saw your message on C678's page. If I read your use case #2 correctly—that it would add archiveurls to citations that lack it—what would you make of my use case here: User talk:Cyberpower678/Archive 34#Citation bot for links that are not yet dead cont.27d? Is that something your bot can address? If so, very exciting czar 21:35, 28 June 2016 (UTC)
- Do you mean case 2 mentioned here or case 2 mentioned here or case 2 mentioned here :) I think what you're asking is whether there is a way for links to be added to the Wayback Machine in an automated fashion. If so, that is already done for all external links added to Wikipedia, as I understand it. It should be easy to test to make sure it's working. Find an external link that is not yet in the Wayback or Wikipedia, add it to an article, and check a few days later to see if it's now in the Wayback. -- GreenC 21:46, 28 June 2016 (UTC)
- Sorry, I meant case two on C678's page. (Usually when I test links in the Wayback it automatically archives the page, so I'd have to experiment to try that test.) BUT regardless of whether IA is doing that in the background, my concern is the security of having |archiveurl= filled out automatically in a citation, as your second example did on C678's page. Is that something your bot is prepared to handle? If so, when will it add the archiveurls, and is there a way to make sure it gets to a page within 24 hours, etc.? (I imagine there might be some pushback if it added archiveurl params to all URLs sitewide, but I'm talking about cases in which editors specifically want the bot to do the work of archiving their links for them.) czar 21:51, 28 June 2016 (UTC)
- WM doesn't create new archiveurl links. It deletes or modifies existing links that have problems. Thus "Medic". New additions are done by CB. -- GreenC 22:55, 28 June 2016 (UTC)
- "cases in which editors specifically want the bot to do the work of archiving their links for them" .. you'll need to be more explicit what you mean. If you want a dead link to be archived in the template, add a {{dead link}} template. If you want a link archived on Wayback, then it will be done automatically by background processes. If you want a link that is otherwise working and not dead to be added as an archiveurl to a template, there is no auto mechanism for that. -- GreenC 23:06, 28 June 2016 (UTC)
- Your last example is what I'm after. I thought that's what was happening in the second replacement in [2] but apparently not. Oh well, thanks anyway czar 23:56, 28 June 2016 (UTC)
- ...returning to this—don't you think it would be valuable to have such an "auto mechanism" for that? I imagine it'd be easier to repurpose one of the bots already doing the work than for me to write my own script/code czar 19:03, 8 July 2016 (UTC)
- Medic is a 1-pass bot; to do what you're suggesting would require a continually running bot, and that is CB. If you are serious about making a bot.. take a look at medicapi.nim (Python-like) in Wayback Medic's GitHub account. I figure development > 1000 man-hrs at this point (development + running), a lot of hard-won lessons on how the Wayback API (doesn't) work. Or the CB source (PHP), which is even more complex. From a policy view I'm not sure it's the best idea, for a couple of reasons. Ideally the snapshot date would be chosen by a human to ensure it contains the text cited and there is nothing wrong with the page (soft-404 etc). We're using bots to deal with link rot for already-down links because there are too many to fix manually. This process is not error-free though, thus CB's talk page messages to manually check the links. If an editor is taking the time to add a cite template flag like you suggested, they really should be adding the archive link directly; it doesn't take much longer. Otherwise why not have an automated process that adds archive links for every cite template added? That may eventually happen, but I'm not sure there is consensus for that level of bot intervention due to the problems of accuracy. It's one thing to fix existing dead links, another to fully automate the process for still-live links. DASHBot did that for a while and honestly it had problems that WaybackMedic is currently fixing (not that WM is perfect either). If you decide to explore further I would suggest a trial balloon in Village Pump and notify the talk page of the cite web template, as certain users like Trappist the Monk and Cyberbot would be key. -- GreenC 19:35, 8 July 2016 (UTC)
Very helpful to hear the backstory—thank you. I was thinking more of a single-pass run, and less of a bot than a userscript that would save myself a few minutes on each article I write. I used to spend the time archiving links, but now when I write an article with 30–60 links, it's a significant chunk of time doing a relatively mundane task, so I don't. I'd rather handle only the errors/cases that do not archive properly with IA (usually the same sites are repeat offenders, so WebCitation would be better in those cases) but the idea is that I'd like to call such a script on specific pages in specific requests. The other policy stuff could be interesting/useful down the line and it might be useful to prepare for that, but I think of my own writing and the hours a month such a script would save me. (That is, if I still was bothering to archive my links, which I should.) By the way, you recommended that I check whether Wayback was automatically archiving new links added to articles. I added several bare URL refs to Nadia Kaabi-Linke a week ago. If you're curious, they either haven't been archived or haven't been archived in 2016. Same for another link that was added with {{cite web}} format: [3]. Basically, I don't feel that I can trust the silent archive mechanism but I see value in being able to run it myself and see the results. czar 20:28, 8 July 2016 (UTC)
- I understand better what you're saying, and actually that would be a useful tool since it would be run manually and you could check the results. Let's see.. if you're comfortable with AWB and Unix, WaybackMedic could do this with some modification I would be happy to provide. What environment are you working in with AWB, Windows? Do you run a VirtualBox Linux? I think Linux will be required; it's not a script but a compiled binary.
- Interesting about the archive test failing.. I'll bring this up with IA, that was understood to be working. -- GreenC 02:01, 9 July 2016 (UTC)
- I'm comfortable with AWB and Unix. I prefer to edit in OS X, so I have run AWB in Windows when necessary (though I'm open to other ways of running AWB). But the OS X should be fine with the Unix bit. This sounds really great if you could help. Also thanks for coordinating with IA czar 06:08, 9 July 2016 (UTC)
- Ok. You will need to install Nim in order to compile the program; I don't have access to an OS X system. Nim installs in a single directory, not invasive of the system. Recommend these instructions (the first section "Install"). Also you will need GNU Awk version 4+ if not already. Ping me when these are working. -- GreenC 13:22, 9 July 2016 (UTC)
- @Green Cardamom: all good, I have Nim and gawk czar 18:30, 9 July 2016 (UTC)
- Ok great. Nim recently upgraded to 0.14 and I am on 0.13 .. it should be OK, but as a test could you download *.nim here into a local directory and run "nim c medic.nim" and make sure it compiles without error (might have some warnings). -- GreenC 19:18, 9 July 2016 (UTC)
Looking at this more closely, the program is designed to fix existing archive.org links in an article. It's not designed to add links where none exist since that is what CB does. So it would be a significant rework to change the scope. However, if the article contained "best guess" archive.org links, it could automatically verify and fix those. There would be almost no coding changes needed. For example say there is a link http://www.moma.org/calendar/exhibitions/1317 with an access-date of March 1 2015. You would set a "best guess" of archiveurl=https://web.archive.org/web/20150301000000/http://www.moma.org/calendar/exhibitions/1317 and archivedate=March 1, 2015. Then run the program and if this link doesn't exist at that snapshot, it will be deleted -- or if there is a better snapshot date such as March 15 2015 it will change to that (it works by picking the closest available). -- GreenC 19:55, 9 July 2016 (UTC)
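(A minimal Python sketch of building such a "best guess" snapshot URL from an access date; the helper name is hypothetical, and the example values are the ones given above.)

from datetime import date

def best_guess_archive_url(url, access_date):
    # Placeholder Wayback snapshot at midnight of the access date; a later
    # pass (e.g. WaybackMedic) can shift it to the nearest real snapshot or
    # delete it if no snapshot exists.
    return "https://web.archive.org/web/%s000000/%s" % (access_date.strftime("%Y%m%d"), url)

print(best_guess_archive_url("http://www.moma.org/calendar/exhibitions/1317", date(2015, 3, 1)))
# https://web.archive.org/web/20150301000000/http://www.moma.org/calendar/exhibitions/1317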
- Ah, but it won't save a copy of the live site, if it's still up? CB does that, right? What would you recommend as the best way forward for this? I see a few options: (1) A variation on tagging live links as {{dead link}}s just to get a pass from the bot, (2) Adding a dummy archive.org link like you suggested, and hope that an archived version already exists, and I suppose run this process manually, (3) Ask Citoid team to build it IA auto-archive into their Citoid API, (4) Make a request for a bot and pray for help, or (5) write a custom script that uses the IA API to save a site and add it in my wikitext (likely most time consuming but all I'm missing in my script is the IA part right now). You're familiar with my use case, so what would you suggest? I'd prefer to have the tool available to everyone, but I'm really looking for the quickest way of getting IA links for live sites into my citations. (My citations are formatted through Zotero export and the Citoid API, usually.) Also the medic.nim compile is missing "docopt"—if you think I should continue with that route, should that be part of another package or is it a separate dependency? Appreciate your help, czar 20:50, 9 July 2016 (UTC)
- No, it doesn't trigger saves on working sites, but it could easily; it's just a simple GET command. The easiest way right now would be (2) since the code is working. For (2) the program will take care of things automatically - it will change the dummy snapshot date to the working snapshot date, or remove the dummy archive link entirely (if no working snapshot exists). It's not far off from the bot-flagging idea of (1), except the flag is an archive.org URL. Docopt can be installed using nimble, the package manager; if it's not already installed, the Nim install instructions above have the steps for nimble. I think we are close to testing it out. After it compiles, download the rest of the files and follow steps 2-6 in the 0INSTALL text, shouldn't take very long. Then run ./project -c -p main to initiate a project. Then ./driver -p main -n "article name" using any article name. The new article text will be in ~/data/wm-XXXX if any changes were made. -- GreenC 21:40, 9 July 2016 (UTC)
- All right, I'll give it a go—thanks for your help! Getting this error on the nim c medic.nim compile: medic.nim(2, 18) Error: undeclared identifier: 'randomize'. I have math installed and nimble install extmath didn't solve it. czar 21:52, 9 July 2016 (UTC)
- Great! Random is related to the change from .13 to .14 .. in line 2 of medic.nim change from math import randomize, random to import random -- GreenC 22:57, 9 July 2016 (UTC)
- Had to modify a few things in project.cfg (should that be added to 0INSTALL?) but I'm stuck on Unable to find .../WaybackMedic/meta/main.auth - Unable to create .../WaybackMedic/meta/main/auth. It doesn't appear to be a folder permissions issue. czar 23:32, 9 July 2016 (UTC)
Ahh, skip the ./project -c -p main step and manually create two directories ~/WaybackMedic/meta/main and ~/WaybackMedic/data/main, then modify project.cfg:
default.id = main
default.data = /path/WaybackMedic/data/
default.meta = /path/WaybackMedic/meta/
main.data = /path/WaybackMedic/data/main/
main.meta = /path/WaybackMedic/meta/main/
Shouldn't need to mess with projects again once this is working, it will always be project "main" .. it's a feature when dealing with 100s of thousands of articles to break it up into batches.
-- GreenC 23:53, 9 July 2016 (UTC)
- Okay—got it running without errors, but the output in the data folder doesn't appear to reflect any changes. I ran it on User:Czar/drafts/WaybackMedic test with and without |deadurl= set. The two target refs are the first two in the list-defined {{reflist}}. czar 00:38, 10 July 2016 (UTC)
- I ran it here and got the correct result. Need to see some debug output. First change driver.awk, as it's currently set up for GNU Parallel, which you are not using. Edit driver.awk and find the commented line that says "Create index.temp entry (re-assemble when GNU Parallel" .. below that are two lines, comment them out; and below that are 6 commented-out lines, uncomment them. Now re-run ./driver -p main -n "article name", then run ./bug -p main -n "article name" -v (view info), which will show useful information. The third line starts with ./medic .. run that command and it will re-run medic with debugging output using the data cached in the ~/data/main/wm-XXX directory which was just created by driver. The second line of the "./bug -v" output should show Data: cd .. then cd to that directory and look for a file called "article.waybackmedic.txt" .. this is the updated article (article.txt is the original). -- GreenC 01:46, 10 July 2016 (UTC)
- Hm... still not getting it. Here is a link to my installation (removed all but the last attempt). I recompiled after changing those eight lines in driver.awk and I'm not getting "article.waybackmedic.txt" in the wm-XXX directory. Any ideas? I didn't get any error messages or special output. (You can see in the link that it downloads the article but doesn't change the archiveurl from the dummy to the updated URL.) Appreciate your help with all this. czar 02:31, 10 July 2016 (UTC)
- The dropbox is helpful. It doesn't look like medic ran, only the driver front-end which downloads the article.txt and creates the data directories. Driver then calls medic. Try this to run medic manually with debug output ("-d y"): ./medic -p "main" -n "User:Czar/drafts/WaybackMedic test" -s "/path/WaybackMecic/data/main/wm-0709212754N/article.txt" -d y (set "/path" to the actual path). -- GreenC 02:48, 10 July 2016 (UTC)
- When I run medic I get "Usage: medic" and then the bunch of flags, but still no debug output in the data folder. I was playing around with it a bit more and changed the executable links in init.awk to be direct, in case that was the issue, and now I get this when running driver:
$ ./driver -p main -n "User:Czar/drafts/WaybackMedic test"
File "<string>", line 1
import urllib, sys; print urllib.quote(sys.argv[1])
- (Caret pointing to the b in "urllib") Also I'm not even getting the right output anymore. I can revert if need be, but I wonder whether a badly linked executable was responsible for the silent lack of output in the first place czar 03:17, 10 July 2016 (UTC)
- driver looked OK and seemed to be working; the problem is with medic. I would revert the changes to init.awk. If medic gave a "Usage: medic" result, it's saying it doesn't understand the arguments or is missing args. If not already, use cut and paste for the ./medic command above, because the order of args is significant. Beyond that I really don't know why (and I need to sign off for tonight). -- GreenC 04:15, 10 July 2016 (UTC)
User:Czar -- On auto archival, I reported your test links and the response from IA: "We have discovered, and are fixing, a breakdown on our end that was effecting the timely archiving of new links from Wikipedia." Thanks for the discovery of the problem. -- GreenC 18:02, 10 July 2016 (UTC)
- Good, I'm glad—happy to help. I've been getting "Usage: medic" ever since I first ran medic, if that helps. I thought it was part of the output. I was directly copy/pasting the command from the bug output and from your suggestion with the aforementioned results. Not sure if we've hit a dead end. I don't have time to write it right now, but I poked around with the basic IA API and found a basic command for saving to IA last night so I might just try that down the line czar
- I suspect it's an OS X issue with the command line parsing library (Docopt). You gave it a good go and almost there I think. If there was an OS X shell account I could ssh into I would be glad to try, but understand if you want to try other options. To save a page to Wayback:
https://web.archive.org/save/_embed/$URL
Might take a few minutes to show up. The archive said they plan on doing it automatically for all links on Wikipedia. -- GreenC 19:23, 10 July 2016 (UTC)
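(For example, a hedged Python sketch of calling that save endpoint from a script; the helper is hypothetical and assumes the endpoint behaves as described above.)

import requests

def save_to_wayback(url):
    # Ask the Wayback Machine to archive a live page; the snapshot may take
    # a few minutes to appear, as noted above.
    resp = requests.get("https://web.archive.org/save/_embed/" + url, timeout=120)
    resp.raise_for_status()
    # When present, Content-Location points at the new /web/<timestamp>/<url> path.
    return resp.headers.get("Content-Location")

print(save_to_wayback("http://example.com/"))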
Greetings
Per wp:lead I can't say I agree with this edit summary, but I am fine with your changes, for what it's worth. wp:lead asks for it to include "any prominent controversies". Best. Biosthmors (talk) pls notify me (i.e. {{U}}) while signing a reply, thx 19:42, 21 July 2016 (UTC)
- Well, the truth is there is no lead section because there are no sections. The lead section repeats/summarizes the key points of the body of the article, a mini-article one can copy elsewhere (such as to a sub-section on McPherson in the article on near term extinction, if it existed). To do that we would need to create sub-sections. I don't really care and am fine with how it is given the limited content. It just didn't seem right to put so much weight on critical comments in the first paragraph while other substantial parts of his life, such as his long academic career, get a passing mention of one or two sentences, i.e. wp:weight. / @Biosthmors: -- GreenC 00:39, 22 July 2016 (UTC)
Precious
[edit]green books award
Thank you for quality articles about books and related articles, such as Daniyal Mueenuddin, Best Translated Book Award and A Visit from the Goon Squad, for the rescue of content, for your dead-link-bot, for defending the idea of community consensus and living with Der Prozess, - you are an awesome Wikipedian!
--Gerda Arendt (talk) 06:34, 4 August 2016 (UTC)
Courtesy Notification: RfC Opened from a Discussion you participated in
Greetings,
I am sending this courtesy notification to let you know that a Request for Comment has been opened regarding whether or not to add an Infobox to Noël_Coward. The prior discussion has now closed so that a consensus can be reached on the matter.
Thank you, -- Dane2007 talk 19:25, 15 August 2016 (UTC)
Extended (hopefully final) trial approved, please see Wikipedia:Bots/Requests for approval/GreenC bot 2 for details. — xaosflux Talk 15:07, 18 August 2016 (UTC)
Task approved
Your task #2 request has been approved. Thank you for your patience and cooperation in the trials. — xaosflux Talk 02:43, 27 August 2016 (UTC)
BOT problem
Hello, there appears to be a problem with this BOT edit in that it adds a partial accessdate to the cite templates. Partial accessdates now trigger an error since the last update to the cite templates. Can you change the BOT to add a full accessdate to the citations? Thanks. Keith D (talk) 23:35, 19 August 2016 (UTC)
- It's fixed. -- GreenC 01:48, 20 August 2016 (UTC)
- Thanks. Keith D (talk) 11:11, 20 August 2016 (UTC)
Google cache and WaybackMedic
I wondered if this task is something your bot could do, too? --bender235 (talk) 14:19, 22 August 2016 (UTC)
- It looks like IABot has it covered. [4] -- GreenC 14:53, 22 August 2016 (UTC)
- Sounds good. Although it seems InternetArchiveBot is inactive. --bender235 (talk) 17:15, 23 August 2016 (UTC)
- It's currently being tested (or the testing system is being developed, not sure where they are at the moment). -- GreenC 17:33, 23 August 2016 (UTC)
- Sounds good. Although it seems InternetArchiveBot is inactive. --bender235 (talk) 17:15, 23 August 2016 (UTC)
Thanks.
I actually wrote the piece but do appreciate all your efforts/help. Camimack (talk) 04:25, 27 August 2016 (UTC)
- You're right. Don't know what I was thinking. The "needs to be entirely rewritten" tag was ridiculous; I lost my presence of mind. -- GreenC 04:38, 27 August 2016 (UTC)
I just left that editor the following note: "I am the original author. Your edits /comment were not helpful." Then I saw a similar note from another editor, and another, more serious one. Camimack (talk) 16:59, 28 August 2016 (UTC)
Category:International League Hall of Fame inductees has been nominated for discussion
Category:International League Hall of Fame inductees, which you created, has been nominated for Merging. A discussion is taking place to see if it abides with the categorization guidelines. If you would like to participate in the discussion, you are invited to add your comments at the category's entry on the categories for discussion page. Thank you. RevelationDirect (talk) 00:37, 29 August 2016 (UTC)
WaybackMedic 2 removing citation line breaks
In this edit I've re-inserted line breaks removed by WaybackMedic 2 from citations kept in vertical format. Is it possible for the bot to respect such formatting for the future? Dhtwiki (talk) 19:58, 31 August 2016 (UTC)
- Yes, that shouldn't happen will investigate. -- GreenC bot (talk) 23:24, 31 August 2016 (UTC)
- OMG, the bot is alive. The end of the world is upon us.—cyberpowerChat:Offline 01:20, 1 September 2016 (UTC)
- Heh.. GreenC bot, please don't destroy the world. Thank you. -- GreenC 18:45, 1 September 2016 (UTC)
- OMG, the bot is alive. The end of the world is upon us.—cyberpowerChat:Offline 01:20, 1 September 2016 (UTC)
Dhtwiki, bug fixed. -- GreenC 18:45, 1 September 2016 (UTC)
Sorry ...
[edit]Sorry, I think I inadvertently stopped your bot with this message. I only saw your message at the top of the bot's talk page after I hit save. —Bruce1eetalk 05:29, 1 September 2016 (UTC)
- No problem, I added that message recently. -- GreenC 18:38, 1 September 2016 (UTC)
BOT date problem
[edit]Hi, there appears to be a problem with this edit and this edit by BOT. It has changed from web.archive.org to www.webcitation.org but left it with an invalid |archivedate=
Keith D (talk) 21:39, 4 September 2016 (UTC)
- Yeah, an old bug. I had fixed such cases[5] but apparently missed some; if you see any more please go ahead and correct them. Thanks for the report. -- GreenC bot (talk) 22:49, 4 September 2016 (UTC)
Problem with your bot
[edit]GreenC bot just created two articles Green Line A Branch and Green Line A Branch D. Both were (I think) identical copies of already created articles Green Line "A" Branch and Green Line "D" Branch. Note the difference is the ". I assume this isn't meant to happen. I have redirected both to the existing articles. - Yellow Dingo (talk) 07:07, 5 September 2016 (UTC)
- Interesting. During the last batch run, the bot was using a new framework for editing (Pywikibot) and there was also a bug in the bot code involving the " character, so for some reason Pywikibot decided to create a page if it was missing. It affected 7 articles (now fixed). The " bug is also fixed. -- GreenC bot (talk) 15:14, 5 September 2016 (UTC)
Walter o Brien
[edit]Hey, so I see you were right about the changes that I made. I was being grumpy byo (talk) 05:56, 6 September 2016 (UTC)
Another BOT problem
[edit]Hi again, here is another BOT problem that may have been fixed by now, this edit added a leading zero to the date causing a cite error. Keith D (talk) 18:53, 6 September 2016 (UTC)
- Didn't know about that, thanks. This is an uncommon edit and fortunately limited to probably fewer than 50 articles. I'll fix them with AWB in the next day or two. -- GreenC 19:49, 6 September 2016 (UTC)
- Thanks, I have probably located and fixed about 20 so far as they pop up on my watchlist. Keith D (talk) 10:02, 7 September 2016 (UTC)
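A sketch of the kind of cleanup involved (assumed parameter handling and regex; the actual AWB pass may differ):
import re

# e.g. "|archivedate=06 September 2016" -> "|archivedate=6 September 2016"
LEADING_ZERO = re.compile(r'(\|\s*(?:access|archive)[- ]?date\s*=\s*)0(\d\s)')

def strip_leading_zero_days(wikitext):
    return LEADING_ZERO.sub(r'\g<1>\g<2>', wikitext)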
For Wayback Medic
[edit]The Wikignome Award
I just wanted to let you know that I highly appreciate the work of your bot, fixing all these archive links on Wikipedia. Thank you very much. --bender235 (talk) 11:52, 7 September 2016 (UTC)
I have a correction to Helena Bonham Carter's profile.
[edit]Helena Bonham Carter did not play the role of Queen Elizabeth II in "The King's Speech." She played Elizabeth's mother. — Preceding unsigned comment added by Marinbu (talk • contribs) 13:03, 7 September 2016 (UTC)
- (talk page watcher) @Marinbu: Correct: the article doesn't say she played Queen Elizabeth II; it says she played Queen Elizabeth, but it links to Queen Elizabeth The Queen Mother, i.e., QE2's mum :) Hope this helps. Muffled Pocketed 13:16, 7 September 2016 (UTC)
Wayback Medic 2 adding "dead link" when primary is still active
[edit]I had included an archiveurl merely as a defensive practice in Desert Rose (Sting song), but when Wayback Medic 2 noticed the archiveurl had gone dead, it marked the ref with "dead link" even though the primary is still active. Michaelmalak (talk) 20:37, 7 September 2016 (UTC)
- Thanks for the report. The bot will sometimes check if the URL is working before adding the dead tag (example). This is determined by |deadurl=no. However, that's wrong; it should also check when |deadurl= is missing. If there is a |deadurl=yes then it assumes a human determined it was dead and won't try to second-guess. At some point we need a bot that checks every |dead-url= and {{dead link}} for accuracy. -- GreenC 00:21, 8 September 2016 (UTC)
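A minimal sketch of the decision logic described above (hypothetical function and field names, not the bot's actual code):
import requests

def should_add_dead_tag(citation):
    deadurl = citation.get('deadurl')          # value of |deadurl=, or None if missing
    if deadurl == 'yes':
        return True                            # a human marked it dead; don't second-guess
    # |deadurl=no or missing: verify the original URL before tagging
    try:
        r = requests.head(citation['url'], allow_redirects=True, timeout=30)
        return r.status_code >= 400
    except requests.RequestException:
        return True                            # unreachable, treat as dead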
Hasty deadlink tag from WaybackMedic 2
[edit]I just reverted this edit, where WaybackMedic 2 hastily added a deadlink tag while archive.org was briefly offline for maintenance. Perhaps the bot should be programmed to add the tag only if a deadlink is still dead 24 hours later? —Patrug (talk) 22:36, 8 September 2016 (UTC)
- If you mean the Wayback outage in the past hour or so, that's different. The bot processed this page about 36 hours ago; the uploading is separate from the processing. Wayback outages happen daily and the bot handles and logs those. This looks like a change in the reported robots.txt status between the time of processing and now. That's dependent on the remote site operator and just looks like bad timing. (If it were due to a Wayback outage, one would see every article processed during the outage with a similar removal of links.) -- GreenC 23:01, 8 September 2016 (UTC)
- Yeah, you can actually see the robots.txt from August 29, which would cause a Wayback policy block due to "Disallow: /". The site owners updated it on or about September 8 to disallow only URLs containing "/wp-admin/", which removed the Wayback block. The bot processed the page probably within hours before they changed robots.txt, then uploaded the diff after the change. This is a very rare occurrence. Dealing with robots.txt on Wayback is complicated and still being worked out. For now the best we can do is report what the page is at the time of processing. -- GreenC 23:25, 8 September 2016 (UTC)
- OK, thanks for the quick diagnosis. Glad it wasn't a problem that affects a large number of articles. —Patrug (talk) 23:43, 8 September 2016 (UTC)
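For reference, whether a given snapshot is currently being served (including robots.txt blocks) can be re-checked at upload time through the Wayback availability API; a rough sketch, assuming the public endpoint:
import requests

def wayback_available(url, timestamp=None):
    # Return the closest snapshot the Wayback Machine will currently serve, or None.
    params = {'url': url}
    if timestamp:
        params['timestamp'] = timestamp        # 14-digit YYYYMMDDhhmmss
    r = requests.get('https://archive.org/wayback/available', params=params, timeout=30)
    snap = r.json().get('archived_snapshots', {}).get('closest')
    return snap['url'] if snap and snap.get('available') else None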
hexadecimal?
[edit]—Trappist the monk (talk) 01:04, 9 September 2016 (UTC)
- I copied what was there without paying attention. -- GreenC 01:09, 9 September 2016 (UTC)
Disambiguation link notification for September 10
[edit]Hi. Thank you for your recent edits. Wikipedia appreciates your help. We noticed though that when you edited Financial Times and McKinsey Business Book of the Year Award, you added a link pointing to the disambiguation page Andrew Scott. Such links are almost always unintended, since a disambiguation page is merely a list of "Did you mean..." article titles. Read the FAQ • Join us at the DPL WikiProject.
It's OK to remove this message. Also, to stop receiving these messages, follow these opt-out instructions. Thanks, DPL bot (talk) 10:51, 10 September 2016 (UTC)
Weird Internet Archive link
[edit]On James L. McPherson, I found a weird looking archive link that may or may not be Internet Archive.
The URL scheme is obviously related to Wayback Machine, although there is no snapshot with that particular time stamp on the "actual" WBM:
There are currently 300 links to wayback.archive-it.org on Wikipedia. I wonder if we should convert those links to actual WBM snapshots, too. From the homepage archive-it.org it seems this project is related to Internet Archive, although it has a completely different look. --bender235 (talk) 14:35, 11 September 2016 (UTC)
- Archive-It is a branded service for orgs that want their own Wayback Machine; I don't think a URL conversion would be a good idea as they may not be the same databases. If you mean creating a Wayback snapshot and then converting, there is the problem that the snapshot date has to be near the archive-it snapshot date due to citation verification. -- GreenC 15:42, 11 September 2016 (UTC)
- Oh, I didn't know that. Okay, then let's leave them untouched. --bender235 (talk) 19:28, 11 September 2016 (UTC)
WBM link in "url" rather than "archive-url"
[edit]Another issue that just occurred to me, which apparently WaybackMedic does not address yet. Quite often I see cases in which the link to the WBM snapshot inside a citation template is in the |url= parameter rather than the |archiveurl= parameter. How do you feel about having WaybackMedic fix something like
{{cite web |url=https://web.archive.org/web/20101208180142/http://www.abheritage.ca/abpolitics/administration/maps_choice.php?Year=1935&Constit=Vegreville }}
to
{{cite web |url=http://www.abheritage.ca/abpolitics/administration/maps_choice.php?Year=1935&Constit=Vegreville |archivedate=December 8, 2010 |archiveurl=https://web.archive.org/web/20101208180142/http://www.abheritage.ca/abpolitics/administration/maps_choice.php?Year=1935&Constit=Vegreville }}
Seems like a good idea to me. --bender235 (talk) 19:53, 11 September 2016 (UTC)
- This is a feature in IABot; it has already fixed a lot. Once it is running again I assume this feature will still be active, but I don't know for sure. Another is converting
{{webarchive |url=https://web.archive.org/web/December 8, 2012/https://web.archive.org/web/20101208180142/http://www.abheritage.ca }}
to
{{webarchive |url=https://web.archive.org/web/20101208180142/http://www.abheritage.ca |date=December 8, 2010 }}
-- GreenC 20:47, 11 September 2016 (UTC)
- Alright. --bender235 (talk) 23:49, 11 September 2016 (UTC)
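For reference, the kind of |url= to |archiveurl= split being discussed could look roughly like this (a sketch with hypothetical helper names, not IABot's implementation):
import re
from datetime import datetime

WAYBACK = re.compile(r'^https?://web\.archive\.org/web/(\d{14})/(https?://.+)$')

def split_wayback_url(url):
    # If |url= holds a Wayback snapshot, return (original url, archiveurl, archivedate).
    m = WAYBACK.match(url)
    if not m:
        return None
    stamp, original = m.group(1), m.group(2)
    d = datetime.strptime(stamp, '%Y%m%d%H%M%S')
    return original, url, f'{d:%B} {d.day}, {d.year}'

# split_wayback_url('https://web.archive.org/web/20101208180142/http://www.abheritage.ca/...')
# -> ('http://www.abheritage.ca/...', 'https://web.archive.org/web/2010...', 'December 8, 2010')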
Just to let you know, GreenC bot made a serious error editing P several hours ago. I'm guessing it was trying to edit P$C, and maybe the dollar sign messed it up? —Granger (talk · contribs) 11:19, 12 September 2016 (UTC)
- That would make sense, but it had no trouble with Quiz $ Millionaire; I'll need to investigate what happened. -- GreenC 12:36, 12 September 2016 (UTC)
- Maybe the $ sign isn't getting escaped?—cyberpowerChat:Online 13:09, 12 September 2016 (UTC)
- Yeah, that's the problem. Quiz $ Millionaire worked because of the space after the $, but the shell interprets $C as a variable. -- GreenC 13:20, 12 September 2016 (UTC)
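A small illustration of the quoting issue (hypothetical command name; the bot's actual wrapper differs):
import shlex, subprocess

title = 'P$C'

# Unsafe: inside a shell command line, $C is expanded as an (empty) variable,
# so the program would receive just "P".
# subprocess.run('fetch-article ' + title, shell=True)

# Safe: pass arguments as a list (no shell involved), or quote the title explicitly.
subprocess.run(['fetch-article', title])
subprocess.run('fetch-article ' + shlex.quote(title), shell=True)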
BOT error
[edit]Hi, there appears to be some problem with the BOT in this edit which added invalid dates including archive dates in 2018. Keith D (talk) 10:05, 13 September 2016 (UTC)
- You can't really blame the bot for that. Those archive snapshots, based on the date stamp in the URL, are from 2018. If those snapshots actually work then there's a bug in the Wayback Machine somewhere.—cyberpowerChat:Online 11:04, 13 September 2016 (UTC)
The bot is designed to fix bad snapshot dates, and it did for some of them (notice the first couple of edits where it changed the invalid 20180821120407 to a correct snapshot date), but the ones it didn't fix fell through and incorrectly added the 2018 year into the archivedate field. I'll check it out but this article is a monster (appropriately). -- GreenC 13:31, 13 September 2016 (UTC)
This is fixed though I'm confident it will come back to haunt me. It is a problem involving redirects and other non-200 pages, of which there are a variety, each needing special handling. I got the major ones and will try to deal with others if/when they appear. Wayback likes to put the new URL/snapshot in different places and formats with each type of page. -- GreenC 16:49, 13 September 2016 (UTC)
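A rough sketch of the timestamp sanity check involved (assumed bounds; the bot's real handling of redirects and non-200 pages is more involved):
from datetime import datetime, timedelta

def valid_snapshot_timestamp(stamp):
    # A Wayback timestamp should be 14 digits and should not lie in the future.
    if len(stamp) != 14 or not stamp.isdigit():
        return False
    try:
        ts = datetime.strptime(stamp, '%Y%m%d%H%M%S')
    except ValueError:
        return False
    return datetime(1996, 1, 1) <= ts <= datetime.utcnow() + timedelta(days=1)

# valid_snapshot_timestamp('20180821120407')  # False as of 2016: the date is in the future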
Bot breaking references
[edit]Hello! Your bot is breaking lots of pages by causing duplicate reference definitions. Sometimes (but not always) duplicate ref markers that have the same content get folded. When your bot edits one, it always makes them different, and that results in a "duplicate reference definition" on the page. This hides references, and references are fundamental to Wikipedia's reliability and at the core of its definition of notability. Here's an example edit: [6]. -- Mikeblas (talk) 14:29, 14 September 2016 (UTC)
- I don't know what you mean. The given example edit looks OK. -- GreenC 14:35, 14 September 2016 (UTC)
- Here's another example. [7] -- Mikeblas (talk) 14:37, 14 September 2016 (UTC)
- I still don't see it. You'll need to be more explicit. There's only one ref named "soccerway" and only one ref that uses that URL. -- GreenC 14:43, 14 September 2016 (UTC)
- Oh I see the other soccerway ref is imported from a template. -- GreenC 14:47, 14 September 2016 (UTC)
- Here's another example. [7] -- Mikeblas (talk) 14:37, 14 September 2016 (UTC)
At the given "La Liga" link, in the references section at reference #40, I see "Cite error: Invalid <ref> tag; name "La_Liga_fair_play_rules" defined multiple times with different content (see the help page)." in large red letters because of the edit your bot made. The previous edit doesn't have that error in the references. Help:Cite errors says that citation errors are dependent on using English in your user preferences; do you not have that setting? Maybe there's another setting tha reveals them.
In the version of the 2011 League of Ireland Premier Division that your bot edited, there are really two "soccerway" references. One was edited by your bot. The other comes from the {{2011 League of Ireland Premier Division table}} template that the page includes.
On the template page, the ref is defined this way:
<ref name="soccerway">{{cite web|url=http://www.soccerway.com/national/ireland-republic/premier-league/2011/regular-season/|title=2011 League of Ireland|date=|work=www.soccerway.com|accessdate=14 February 2011| archiveurl= http://web.archive.org/web/20101231074534/http://www.soccerway.com/national/ireland-republic/premier-league/2011/regular-season/| archivedate= 31 December 2010 <!--DASHBot-->| deadurl= no}}</ref>
Your edit defined it this way:
<ref name="soccerway">{{cite web|url=http://www.soccerway.com/national/ireland-republic/premier-league/2011/regular-season/|title=2011 League of Ireland|date=|work=www.soccerway.com|accessdate=14 February 2011| archiveurl= https://web.archive.org/web/20101231074534/http://www.soccerway.com/national/ireland-republic/premier-league/2011/regular-season/| archivedate= 31 December 2010 <!--DASHBot-->| deadurl= no}}</ref>
Because of the bot edit, the content of the references with the same name doesn't match, and the reference error is generated. This makes the reference difficult to view. (Sorry for not indenting; the formatting isn't easy.) -- Mikeblas (talk) 14:52, 14 September 2016 (UTC)
- I don't think there's much a bot could do about this. It won't be just this bot, but any bot, tool or AWB script - indeed any human editor - that modifies the citation for whatever reason. The problem is with the template. The refs in templates, if named, should contain a unique identifier (such as "templatename-soccerway") so it doesn't conflict with ref names in articles. If editors see duplicate refs with different names (or a special bot is designed for this purpose to seek them out), they can merge them like you did in this case. Otherwise there's really no practical way for a bot (or anyone) to know; it would have to scrape every ref from every template used in an article and compare, which would be super complicated and resource intensive. Every bot and tool that works on citations would require that code. -- GreenC 15:01, 14 September 2016 (UTC)
- I agree with GreenC, this can't be pinned on the bot. The blame lies on the editor(s) that created the duplicate reference to begin with instead of simply using <ref name="soccerway"/>.—cyberpowerChat:Offline 01:12, 15 September 2016 (UTC)
- I'm confused, since you say "I don't think there's much a bot can do about this", then actually outline exactly what a bot should do about it. Indeed, searching all the included templates and finding the reference is tedious. That's exactly why a bot should do it -- computers are far better at tedious and error-prone tasks than humans are. While it's sometimes regrettable that Wikipedia allows references in templates, it does; so anything editing a reference should be compatible with that fact. -- Mikeblas (talk) 11:40, 15 September 2016 (UTC)
- Exactly: a specialized bot that seeks out and fixes duplicate refs in templates, but it is impractical for every bot and tool written to deal with it on their own. There are already tools that aggregate duplicate refs; surely someone could expand those tools to include refs contained in templates. This is not trivial BTW and it far exceeds anything my bot and other bots are designed for. Also I'm curious why an error message isn't already being generated when the duplicate ref names exist. -- GreenC 13:14, 15 September 2016 (UTC)
- I can most certainly say that having IABot detect this is infeasible and would require a huge rewrite simply to detect duplicates, especially if they're embedded in templates. This is something I neither have the time for, nor am willing to do. The simple solution is to convert the reference to the form I mentioned, in the event any bot changes the ref.—cyberpowerChat:Online 13:31, 15 September 2016 (UTC)
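For anyone curious, a bare-bones version of such a specialized check might look like this (wikitext scanning only; expanding refs that live inside transcluded templates, the hard part discussed above, is not shown):
import re
from collections import defaultdict

REF = re.compile(r'<ref\s+name\s*=\s*"?([^">/]+)"?\s*>(.*?)</ref>', re.S | re.I)

def conflicting_ref_names(wikitext):
    # Map each ref name to the set of distinct definitions found for it.
    defs = defaultdict(set)
    for name, body in REF.findall(wikitext):
        defs[name.strip()].add(body.strip())
    # Names defined more than once with different content trigger the cite error.
    return {name: bodies for name, bodies in defs.items() if len(bodies) > 1}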
Wayback
[edit]Thanks for the tip! I'll remember that template. WhisperToMe (talk) 14:13, 15 September 2016 (UTC)
Removal of contents with insufficient edit summary, etc.
[edit]Hi Green Cardamom,
while I appreciate most of the work done by your bot, I'd like to criticize bot edits like this one: [8], where GreenC bot removed 4 KB of contents from the article leaving only a mostly meaningless edit summary "(WaybackMedic 2)".
- Due to the large number of changes, the diffs no longer show the individual changes but just some huge blocks, with the effect that it is very difficult to manually check the bot edits.
- IMO, the edit summary is not sufficient. Instead, it should specify the exact types of changes and, for each of them, the exact number carried out. If there are many changes, perhaps different types of changes should be split into several subsequent edits, making it possible for users to check and possibly revert one problematic type of change while leaving the others.
- In general, I find it problematic to remove valid archive-url links just because later changes to robots.txt cause archive.org to no longer display the contents. Ideally, archive.org should adhere to the contents of the robots.txt file at the time the snapshot was taken, but this is beyond our control, of course. However, archive.org might change their policy in the future, or the site owner might change the robots.txt file again, so that such archive-url links become useful again - but if GreenC bot removes such links they are lost forever causing an article to suffer from link rot eventually. What I propose instead is to either just comment out those links as HTML comments or to introduce a new dead-url argument muting the "archived on ..." stuff from display, but leaving it technically intact in the article's source. This would ensure that someone trying to improve a citation later on (perhaps decades in the future) will stumble upon that archive-url link in order to recheck it.
- By removing the links your bot added spaces in front of the closing }} template bracket. While only a cosmetic issue, this looks ugly. The bot should adjust to whatever style was used before.
I hope these comments will help you to improve your bot. Thanks. --Matthiaspaul (talk) 10:31, 17 September 2016 (UTC)
- very difficult to manually check the bot edits - There's a solution for that. At the top-center of the diff screen is a pyramid-shaped button. With that the changes become clear. I agree the edit summary should do as you say; the problem has been that I have used an AWB external script as the edit framework and it doesn't support custom edit summaries. I have no control over that, but if you click on the "WaybackMedic 2" link it tells you what the bot does. In the future I hope to use a different framework. If archive.org changes its policy or the links become available again, IABot will re-add them; the information is readily available and easily reinserted in the future. The bot tries to maintain the existing style of spacing; it is quite difficult and sometimes doesn't get it exactly right. -- GreenC 11:35, 17 September 2016 (UTC)
- Hi, thanks for the quick reply.
- Regarding that "pyramid shaped button" you mentioned, I've carefully looked for any buttons or links in the diff view, but I can't find that one. Is this, perhaps, some optional script in your configuration rather than something generic? How is that button labelled (so I could search for that label?) (However, if it would be a client-side script, it would not be a general solution, as I (and many others) normally have scripts disabled for security reasons.)
- Regarding IABot re-adding archive-url links if they become available again, are you sure about that? Where is that information stored? With the contents of archive-date and archive-url gone, how could a bot (without human assistance) decide whether a snapshot has the same contents as the link given in the url parameter, unless the snapshot was taken at the same time as indicated by access-date? Even a human may not be able to sort it out any more later on if, for example, the original url has gone dead in the meantime, so there is nothing to compare with anymore.
- Regarding inserting spaces before }}, I can see how this particular case slipped through easily (there were spaces in front of |parameter= earlier), but it would be easy to fix it by adding a rule to remove any white-space in front of the closing }} if there was no white-space in this place before the editing.
- --Matthiaspaul (talk) 14:55, 17 September 2016 (UTC)
- (talk page stalker) Chiming in here as the author of IABot. IABot uses a plethora of means to fetch an appropriate snapshot for a dead URL, and then maintains that information in a large DB, approx. 15 GB in size. Also, as a pseudo-liaison to the staff and devs of the Wayback Machine, they are working on allowing old snapshots to remain functional even when newer snapshots get blocked by robots.txt. The good news: all the current snapshots that have been disabled retroactively still actually exist on their servers, so the archives will eventually come back if they were disabled retroactively.—cyberpowerChat:Offline 22:04, 17 September 2016 (UTC)
- Hi Cyberpower678! Thanks for the clarification, much appreciated. However, if it is only a question of time before those links come back eventually (if they were disabled retroactively, that is), we could leave them in place as well and just wait a couple of months (or years). Obviously, the GreenC bot cannot distinguish between links that will come back and those which won't, but just deferring this particular task for a while would save us a lot of unnecessary edits, which always carry some risk that something goes wrong and which bind precious editor time to check the bot edits.
- --Matthiaspaul (talk) 11:34, 18 September 2016 (UTC)
- I'm party to some of those conversations and although IA says they would like to change, I don't think IA will change its takedown request policy for legal reasons (though they might change the request mechanism). Content owners can make takedown requests of copyrighted material. But I'm not a lawyer. Also, many links removed by WaybackMedic have nothing to do with robots.txt. -- GreenC 13:16, 18 September 2016 (UTC)
- To me, a robots.txt file by a site owner and a takedown request by a copyright holder of some contents are two different things.
- In either case, I see a serious long-term problem here for Wikipedia: Assume a site with lots of valuable contents and a very permissive robots.txt, so that most of the contents gets archived at IA. Now, some years (or decades) in the future, the site owner changes (long-term, this will happen for almost all sites), the new site owner is unrelated to the former site owner, hosts completely different stuff and sets up a very restrictive robots.txt file. This would not only keep IA from archiving the new contents (no problem), but also have the side-effect of removing the previous contents of the former owner. Wikipedia follows suit, removes the links, and over time most of our previously well sourced articles lack their sources for verification. Some trolls come around and challenge the previously perfectly sourced stuff (just because they can) and if no other source can be found (in some cases, there is only a single reliable source even for important stuff), the contents will have to be removed, articles deteriorate and eventually may have to be deleted as well. Of course, this wouldn't affect mainstream knowledge where we'll always find another source, but it would affect a lot of more detailed or more sophisticated articles and reduce the knowledge presented in Wikipedia down to what can be sourced in printed sources...
- While we are talking about "site owners", legally, domains cannot be owned, only rented. If you stop paying the fees, the domain will fall back to the registry and can be rented to someone else instead. Therefore, the whole idea of robots.txt files must be based on the assumption that it only applies to contents available at the same time, not for contents in the future or in the past.
- --Matthiaspaul (talk) 14:25, 18 September 2016 (UTC)
- I agree that robots.txt as a takedown request mechanism is not ideal. At the very least they can change the mechanism easily because it's not a legal question but internal policy and technology. The problem you mention is real, where a domain expires and the new owner sets up a robots.txt .. this has caused problems not only for Wikipedia but for former site owners who wanted their old content available on Wayback. Internet Archive is aware of this; there are a number of long threads in the forums there, and it has impacted a number of sites. Nevertheless, keeping it in perspective, the numbers in total are not large. There are currently about 1.1 million Wayback links in Wikipedia as of August 20. Of those, approximately 30,000 have been removed as non-functioning. Of those, approximately 10,000 are due to robots.txt (I'm still working out how many exactly). So roughly 10,000 out of 1 million is about 1%. That's over the entire lifetime of Wikipedia, 14+ years. -- GreenC 14:48, 18 September 2016 (UTC)
- I would say that being able to communicate with the executive director and some of the devs will more likely get us heard, and changes pushed, than the standard means at the disposal of the public as published on their site. The mechanism Green and I are trying to push is that archives should only get taken down through a legal request from the site owner, and that only new snapshots should follow the currently active robots.txt while older ones remain untouched. As an alternative, we could try to come up with a way where site owners can apply another object like robots.txt, one that is specific to the Wayback crawler, but the thing to look out for is making sure the site wasn't usurped by someone else.—cyberpowerChat:Limited Access 21:13, 18 September 2016 (UTC)
- Thanks to you both. It's good to know the problem is actively being worked on.
- I agree that there should be some formal (and more complicated) way for copyright owners to take down stuff regardless of time. After all, IA did not ask for permission to archive stuff in the first place, and in some jurisdictions this could be seen as redistribution ("copying") without prior consent of the copyright holder.
- For the bulk of the contents, robots.txt (despite all its weaknesses in definition) does a reasonably good job - at least for as long as it isn't applied retroactively.
- I think it would be easier to establish some extensions to the robots.txt format instead of introducing a completely new object. (Lots of web software already has provisions to define robots.txt files, but not some other object - it could take many years before support for such a new IA object would be implemented in mainstream web software...)
- As an ad-hoc idea (which would certainly need more refinement), it could be established as a new rule that robots.txt can be applied retroactively by the site owner *only* if the robots.txt contains a line uniquely identifying the current site owner (for example by the "contract ID" with the domain registry, or the full name and address). This ID could be used to detect site owner changes, and in this scheme the robots.txt file would never be applied across site owner changes. This would give site owners the flexibility to reconsider whether they want their contents archived or not, but also protect contents of former site owners from being suppressed. Unfortunately, unless there were some easy means for bots to check the validity of such IDs with domain registries at crawl time, this scheme could be abused by deliberately providing IDs of former site owners. Also, publicly readable IDs could be used by some to sabotage the contracts with the registries. However, with some more thought and research along this line the basic idea could perhaps be refined by some public-/private-key or challenge-response mechanism making it difficult/impossible to fake IDs.
- --Matthiaspaul (talk) 10:27, 19 September 2016 (UTC)
- Matthiaspaul, I'm copying the above and sending it to Wayback, credited to you. I don't know if they are still looking for ideas but it's a fine one. If anything comes of it I will let you know. Thanks. -- GreenC 12:46, 19 September 2016 (UTC)
- Sounds great! Thank you as well.
- --Matthiaspaul (talk) 15:10, 19 September 2016 (UTC)
Article title changes
[edit]Hi Green Cardamom!
There are a lot of article titles that have been changed, and most users don't notice until somebody edits a page that has a link to an article whose title has been changed. Should there be a bot that can automatically update the links to an article whose title has been recently changed?
Sorry if my bad grammar confused you. TheAmazingPeanuts (talk) 01:52, 18 September 2016 (UTC)
I understand what you are saying but to put it a different way, I think what you mean is "If an article is renamed, should the wikilinks to that article, in other articles, be changed automatically by a bot to reflect the new article name?" I don't know the answer and it has probably come up before, but you might ask at Wikipedia:Bot requests. -- GreenC 12:58, 18 September 2016 (UTC)
- Yes, that was what I was trying to say. Thanks for the help, I'll ask there. TheAmazingPeanuts (talk) 09:12, 18 September 2016 (UTC)
A barnstar for you!
[edit]The Original Barnstar
Sorry for the hassle, but I have been following your work on Wikipedia for a while now. I would like to hear your input regarding the Zeek Wikipedia page.
There is an ongoing debate regarding the Zeek (company) page. I wanted to ask you for your opinion on the matter. I do have a COI with Zeek, but I believe the article can meet WP:GNG or WP:CORP if improvements are made. The company has won many awards over the years and has gotten sufficient coverage in reliable sources such as CNBC, TechCrunch, Money Saving Expert and more. Eddard 'Ned' Stark (talk) 21:24, 23 September 2016 (UTC)
Shall we start an SPI on this? Meters (talk) 23:17, 26 September 2016 (UTC)
- That's it. And another IP has shown up and reverted. I'll start the SPI. Meters (talk) 23:25, 26 September 2016 (UTC)
A 15 Road
[edit]Hi. This is the diff [9] from A15 road. You've changed the date from 11 July 2007 to 20070711. The article header states to use DMY dates, as most of the other dates are. I assume there's a reason for this (i.e. the Wayback Machine states dates in YMD format) but there is no edit summary to go with the diff, so I do not know why you've changed it. Regards. The joy of all things (talk) 13:42, 29 September 2016 (UTC)
- See the documentation for {{wayback}}. 11 July 2007 produces a red error. -- GreenC 13:45, 29 September 2016 (UTC)
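For the curious, the change is just a date reformat into the timestamp-style format the template accepted at the time; a minimal sketch:
from datetime import datetime

def to_wayback_date(dmy_date):
    # e.g. "11 July 2007" -> "20070711"
    return datetime.strptime(dmy_date, '%d %B %Y').strftime('%Y%m%d')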
A barnstar for you!
[edit]The Original Barnstar
Good work! –– ljhenshall (talk page) 21:57, 2 October 2016 (UTC)
HTTPS now 50% of internet traffic
[edit]I like to believe that we contributed a tiny little share to this. ;) --bender235 (talk) 01:30, 17 October 2016 (UTC)
- Yes undoubtedly true. -- GreenC 15:42, 17 October 2016 (UTC)
Halloween cheer!
[edit]Hello Green Cardamom:
Thanks for all of your contributions to improve Wikipedia, and have a happy and enjoyable Halloween!
– North America1000 23:56, 31 October 2016 (UTC)
Your BRFA task #3 as been approved for trial. — xaosflux Talk 15:21, 3 November 2016 (UTC)
Interface Project
[edit]That interface I have been working on: I've finished most of the needed backend, and am now starting on its frontend. That being said, I need your opinion, and feel free to survey other users. This will take you to the interface's main page. If you take a look, no tool is ready to be used on it except for OAuth. I was wondering if you could log in to the interface and pick out one of the tools. First-time users are forced to read and accept the ToS before using the interface, and declining will log them back out again. What do you think of the ToS?—cyberpowerChat:Offline 02:56, 7 November 2016 (UTC)
- That looks amazing. Can't wait to see it in action. By the ToS, do you mean that when logging in via OAuth it asks for permission to do high-volume editing? That was the only thing that gave me pause, as most editors normally need to get AWB or BRFA permissions. It could allow SPAs, for example, to cause trouble. Maybe the feature to run on multiple pages should require that the editor already has other permissions such as AWB. Same with managing entire domains. -- GreenC 14:56, 7 November 2016 (UTC)
- Just found the ToS. Looks really good. Is it saying that when an edit request is made, it is done as the user and not InternetArchiveBot? That would be cool. -- GreenC 15:07, 7 November 2016 (UTC)
- I was speaking with Niharika, and we agreed that it would be a bad idea to have the queued mass edit runs be done on the requesting user's account. Consequently, single page requests can still be done from the user account. With this in mind, the OAuth permissions have been radically reduced, and the ToS has been updated.—cyberpowerChat:Limited Access 18:14, 7 November 2016 (UTC)
Discussing streamlining US cannabis articles
[edit]Your comments appreciated here: Wikipedia_talk:WikiProject_Cannabis#Do_we_need_to_do_some_consolidation_of_multiple_overlapping_US_cannabis_articles.3F. Goonsquad LCpl Mulvaney (talk) 22:25, 8 November 2016 (UTC)
Wikipedia:Bots/Requests for approval/GreenC bot 3 has been approved. — xaosflux Talk 23:22, 8 November 2016 (UTC)
Bot war
[edit]GreenC bot appears to be in an edit war with Fluxbot, as indicated by these edits: https://en.wikipedia.org/w/index.php?title=Gustave_Whitehead&diff=749389266&oldid=746210088 https://en.wikipedia.org/w/index.php?title=Gustave_Whitehead&diff=749414086&oldid=749389266 DonFB (talk) 05:48, 14 November 2016 (UTC)
Rollback
[edit]Sorry about that - clumsy fingers. Kanguole 22:06, 16 November 2016 (UTC)
ArbCom Elections 2016: Voting now open!
[edit]Hello, Green Cardamom. Voting in the 2016 Arbitration Committee elections is open from Monday, 00:00, 21 November through Sunday, 23:59, 4 December to all unblocked users who have registered an account before Wednesday, 00:00, 28 October 2016 and have made at least 150 mainspace edits before Sunday, 00:00, 1 November 2016.
The Arbitration Committee is the panel of editors responsible for conducting the Wikipedia arbitration process. It has the authority to impose binding solutions to disputes between editors, primarily for serious conduct disputes the community has been unable to resolve. This includes the authority to impose site bans, topic bans, editing restrictions, and other measures needed to maintain our editing environment. The arbitration policy describes the Committee's roles and responsibilities in greater detail.
If you wish to participate in the 2016 election, please review the candidates' statements and submit your choices on the voting page. MediaWiki message delivery (talk) 22:08, 21 November 2016 (UTC)
Is this account really affiliated with you? --Versageek 09:34, 24 November 2016 (UTC)
- User:Versageek, no sir! -- GreenC 15:33, 24 November 2016 (UTC)
- Perhaps related.[10][11] .. the first is an IP edit (they forgot to login) -- GreenC 15:50, 24 November 2016 (UTC)
IABot Interface
[edit][12]. Check the interface info page. I gave credit to you. Let me know if you want me to change anything?—cyberpowerHappy Thanksgiving:Online 15:15, 24 November 2016 (UTC)
- Great, thanks! Look forward to trying it out. -- GreenC 15:38, 24 November 2016 (UTC)
re-adding unsourced information
[edit]I've removed unsourced articles from Elissa Sursara's Wikipedia page as the articles have been removed from each of the sources cited. I attempted this edit under a different login earlier in the week but have admittedly forgotten my user details. Nevertheless, citing Wikipedia's own content removal policy, under unsourced (and inaccurate) information, the section edited should not be reincluded. After my previous edit, you added the content back. I presume this was an oversight as in the edit summary, "removing unsourced information" was added. I'm open to your feedback on this but feel strongly the section and the sources should not be included on the page. SVass2016 (talk) 09:03, 7 December 2016 (UTC)
- You're trying to purge the shark bite incident from the Internet, but that is going to be very difficult. Even if the news stories were removed entirely from the Internet, the story can still be cited; a live link is not required to make a citation. If the newspaper announced the story was withdrawn, then we can't use it. But the mere fact that the link is dead doesn't mean much; most links go dead (the average lifespan is 7 years), that is normal and doesn't mean they removed it due to inaccuracy or defamation. If you have evidence, but the evidence is private knowledge (such as a legal letter from your lawyer to the newspaper and a confirming reply), then you might need to contact the Wikimedia Foundation. However, this link is still live. -- GreenC 16:15, 7 December 2016 (UTC)
- At what point will the Streisand effect kick in? Good idea on the archives. I wouldn't think the archive servers are on Australian soil or anywhere near. That would make it a bit difficult. Cheers Jim1138 (talk) 19:21, 10 December 2016 (UTC)
Vine.com listed at Redirects for discussion
[edit]An editor has asked for a discussion to address the redirect Vine.com. Since you had some involvement with the Vine.com redirect, you might want to participate in the redirect discussion if you have not already done so. - CHAMPION (talk) (contributions) (logs) 04:01, 17 December 2016 (UTC)
Deletion inquiry
[edit]Was wondering if you'd kindly be interested in assessing Wikipedia:Articles_for_deletion/Heap_(company) (since you participated in the Mixpanel deletion discussion). It seems users DGG and SwisterTwister routinely vote for deletion together without offering analysis, so I wanted to get a third-party involved. I'm not fishing for a keep, but I am looking for a legitimate discussion if possible. Thanks for any consideration. GDWin (talk) GDWin —Preceding undated comment added 22:32, 17 December 2016 (UTC)
Season's Greetings
[edit]Hello Green Cardamom: Enjoy the holiday season, and thanks for your work to maintain, improve and expand Wikipedia. Cheers, North America1000 15:30, 18 December 2016 (UTC)
- Spread the WikiLove; use {{subst:Season's Greetings1}} to send this message