User talk:Coren/Archives/2009/July

This is an archive of past discussions about User:Coren. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.

Peter Bradley-Fulgoni

Hi Coren,

I'm sorry you've come to the decision of deleting the article I've made. I actually know Peter Bradley-Fulgoni himself very well and had his personal permission and copyright to write this article. He is the one who holds the copyright to the information on soundtechniques.tv, and as you can see from that website, there is no "copyright" written anywhere. It is his CV which he has written himself, and soundtechniques.tv bought it off him as a deal. I'm doing him a favour, and this article would become more accurate and have more reliable sources throughout time. It will soon be created in Russian and Italian as well, just for your information. Could you possibly not generate the automatic message on the article in the future? I would be extremely grateful. Thank you and sorry for causing the fuss!

Wtjulianchan (talk) 16:06, 1 July 2009 (UTC)

WP:SCV investigation: now listed at WP:CP since permission is asserted above, left appropriate notice on article talk page where the above has been reproduced. MLauba (talk) 16:42, 1 July 2009 (UTC)

Jive Aces

I hope this entry The Jive Aces is all per policy this time - could you please check it for me and let me know if there's anything I need to correct? Johnalexwood (talk) 13:07, 2 July 2009 (UTC)

Shira Kammen

I believe this page is now substantially different than the referenced page. How do we go about independently confirming that it is now satisfactory? Thanks!

Theodulf 13:33, 2 July 2009 (UTC) —Preceding unsigned comment added by Theodulf (talk • contribs)

Political Common Sense for America

I believe there is an error in deleting this page. The comparable data that is referenced is the standard book description from a retail bookseller's website. This was given to the retailer by the author and is not copyrighted material. Please let me know if there is anything else I need to correct to reinstate this page.

--Bboeheim (talk) 18:27, 2 July 2009 (UTC)

a boob-ish gamble

G'day C - hope you're good 'n all that :-) - I thought I'd swing by and take a bit of a gamble! As July rushes upon us, and the first half of good old 2009 is down the wiki plughole, I wonder if you might be able to fix up the GFDL compliance at Giovanni di Stefano - an article I've promised not to go near for the whole year (perhaps the wiki gods may forgive this one transgression? probably not, but never mind!).

The article was a collaborative work at the point it was summarily deleted and re-created by Fred back in December (was it 07? Good lord!) - I've mentioned this on wiki quite a few times, and either I'm a gigantic boob for bringing it up again when it was fixed ages ago (hand on heart, I haven't checked - that either makes me wise or lazy, or both?), or those who have been aware of it are gigantic boobs for taking an age to enact what should be a very easy thing to resolve. The final option, of course, is that you become aware of the pretty obvious GFDL violation, and do nothing about it - thus preserving the boobage for yourself. Any friendly passing admin could fix this up in pretty short order, though all should be aware that this is a pretty sensitive area, and you may be sued for 50 million euros. If that's a good enough reason to violate the GFDL (maybe!) then all we need to do is say so.... :-) Privatemusings (talk) 07:16, 30 June 2009 (UTC)

I'll be glad to give a hand, but it may have to wait until Thursday. July 1st is traditionally "moving day" here in Quebec, and I've got a number of tenants moving in and out of flats that will occupy much of my attention today and tomorrow. I will take a quick peek now, though, and see if something can be done sooner rather than later. — Coren ^(talk) 13:06, 30 June 2009 (UTC)

... and what is it with you and boobs anyways? :-) — Coren ^(talk) 13:07, 30 June 2009 (UTC)

heh... well I just can't get enough of 'em! - actually there's just something wonderful about the way 'gigantic boob' scans - always raises a smile for me! (yeah, this probably says heaps about me that's probably better left unexplored!) - there's really no rush on this one, I won't be involved again personally for another six months, though I will be witheringly sarcastic if the problem remains (or at least try to be!) - hope 'moving day' went well for you, and hope you find a fruitful solution :-) Privatemusings (talk) 12:46, 1 July 2009 (UTC)

Wait, you mean there are times where you are not witheringly sarcastic, or do you mean you'd be witheringly sarcastic about some other thing instead? :-) — Coren ^(talk) 17:42, 1 July 2009 (UTC)

Note that Privatemusings has previously been sanctioned for his involvement with that article.* The GFDL problem here is far from unusual as most use of Wikipedia:Oversight and Poor mans oversight causes the same GFDL problem, and we typically don't lose any sleep over it as the revisions are still in the system. If you would like to fix it, I noticed today that Rachel Marsden has a note at the bottom that the revision history is incomplete, pointing the reader to Talk:Rachel_Marsden/GFDL_History. As far as I can see, this is the only page which uses a dedicated subpage for this; as you may know, the history is usually dumped on the talk page, especially for the old Wikipedia:Transwiki solution. To be honest, I think the notice on the Rachel Marsden page is undesirable. I would prefer a template and a hidden category, so that republishers can easily identify pages where the list of contributors is not able to be obtained from the history functionality alone; readers don't usually care to know that sort of thing. We could also add a note in MediaWiki:Histlegend. But all this probably needs to be taken to WP:VPT for discussion, as these are not isolated problems. John Vandenberg ^(chat) 11:36, 2 July 2009 (UTC)

I was considering selectively restoring one revision per contributor not already in the history, with the actual revisions suppressed so that the deleted text and summaries do not return, but that the list of contributors is correct. This would satisfy the attribution requirements, and not return possibly problematic text. — Coren ^(talk) 11:56, 2 July 2009 (UTC)

That would be unusual and hit issues under the "Preserve the section Entitled 'History'" part of the GFDL. However, it would be allowed under CC-BY-SA, which is enough for Wikipedia. ©Geni 16:06, 2 July 2009 (UTC)

Done. As far as I can tell, I've restored at least one revision from every named editor that contributed contents (i.e. excluding reverts) without the edit summary or text. This should properly attribute for CC-BY-SA. — Coren ^(talk) 03:20, 3 July 2009 (UTC)
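The selection rule described in this thread (at least one revision per contributor who is missing from the visible history, ignoring reverts) can be sketched roughly as below. This is only an illustration and not the procedure actually used: the `Revision` record, the `is_revert` flag, the function name, and the choice of the earliest qualifying revision are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Revision:
    # Hypothetical record; a real deleted-revision listing carries more fields.
    rev_id: int
    user: str
    is_revert: bool


def revisions_to_restore(deleted, visible_users):
    """Pick one deleted, non-revert revision per contributor who is not
    already credited in the visible history (assumption: earliest one)."""
    chosen = {}
    for rev in sorted(deleted, key=lambda r: r.rev_id):
        if rev.is_revert or rev.user in visible_users:
            continue  # reverts add no content; visible users are already credited
        chosen.setdefault(rev.user, rev)  # keep only the first hit per user
    return list(chosen.values())
```

The restored revisions would still need their text and summaries suppressed separately; the sketch only covers which revisions to pick.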

I have to express my strong view that absolutely none of this was worth doing, and that sleeping dogs should have been allowed to lie. Newyorkbrad (talk) 16:25, 3 July 2009 (UTC)

I agree with Newyorkbrad. Doing this for a single article is not helpful, and this is a bad article to choose to be the first. John Vandenberg ^(chat) 02:22, 4 July 2009 (UTC)

um... well I came here to say thanks to Coren, though obviously this has stirred something up. I'm happy to discuss why I feel the principle of attribution is actually pretty important, or why my feeling is that some behaviour around this article has caused more problems than it's solved, or anything really (like asking brad and john 'why?') Privatemusings (talk) 12:05, 4 July 2009 (UTC)

Message To Coren, moderator of CorenSearchBot

This message was placed automatically, and it is possible that the bot is confused and found similarity where none actually exists. If that is the case, you can remove the tag from the article and it would be appreciated if you could drop a note on the maintainer's talk page. CorenSearchBot (talk) 03:26, 4 July 2009 (UTC)

And as suggested, I am leaving this message. Shut off your bot...that website copied the information from Wikipedia. I remember contributing the information...just check the talk page...

If not I am going to just delete all my information. In-Correct (talk) 03:47, 4 July 2009 (UTC)

Delhi 2 Dublin Remixed

This article is substantially similar to [1] (as well as [2] and [3] and countless other sites that also list the tracks on the album)... it's not plagiarism. SoftwareSimian (talk) 19:46, 4 July 2009 (UTC)

Take care

and good luck, regardless if it is a retirement or a break. Ottava Rima (talk) 16:33, 6 July 2009 (UTC)

I second that. Good luck. Orderinchaos 23:38, 6 July 2009 (UTC)

Hercules Glades Wilderness

I removed the message that was placed automatically on the page Hercules Glades Wilderness, because I was just moving the contents over from the incorrect title of Hercules-Glades Wilderness. I realize now that I should have used the tag db-move, and I will see about moving the attribution history to the appropriate place. Fortdj33 (talk) 18:56, 6 July 2009 (UTC)

Autodeletions?

Does CorenSearchBot automatically mark for deletion? —Preceding unsigned comment added by Gosox5555 (talk • contribs) 21:09, 6 July 2009 (UTC)

No, it doesn't. What it does, in a nutshell, is tag for human attention. Sometimes, when there is in fact a copyright violation, the human that reviewed the article will tag it for deletion when there is no salvageable contents, however. — Coren ^(talk) 21:53, 6 July 2009 (UTC)

Thank you

Thank you for all the hard work you have done for this project, as an editor, an administrator and an arbitrator. Your efforts have been appreciated. I hope you have a good break and come back feeling refreshed. Regards, Sarah 01:21, 7 July 2009 (UTC)

Corenbot false positive at Brzonkala_v._Polytechnic_Institute_and_State_University

Hello,

FYI, any hyperlink that begins with the following text, "http://wikileaks.org/wiki/CRS:" will be pointing to a report by the Congressional Research Service, which is a branch of Congress that produces works in the public domain. Agradman ^talk/_contribs 03:04, 7 July 2009 (UTC)
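The prefix rule suggested above amounts to a simple string test. A minimal sketch, assuming nothing about how CorenSearchBot actually handles exclusions (the function name `is_crs_report` is hypothetical):

```python
# Prefix quoted in the comment above; links under it are said to point to
# Congressional Research Service reports.
CRS_PREFIX = "http://wikileaks.org/wiki/CRS:"


def is_crs_report(url: str) -> bool:
    """Return True when the URL starts with the CRS report prefix."""
    return url.startswith(CRS_PREFIX)
```

Whether a matching link should be skipped outright or still flagged for a human to confirm is a policy question the sketch does not settle.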

Doxa 12:27, 7 July 2009 (UTC) Agreed that the link contains all the details. But the report is a UN document, which needs to be archived. A UN document cannot be copyrighted by other agencies. Doxa 12:27, 7 July 2009 (UTC) —Preceding unsigned comment added by Ckraju (talk • contribs)

RCE Southern North Sea

This article was re-created from a previously deleted one which featured material copied from the site www.rce-sns.org. Although the website owners and creators were happy with this (I am one of them) we had not added the site to the wikipedia commons. Although we may do this in future we believed that it would be quicker to re-create the wikipedia page in our own words instead. This is what we have now done and linked paragraphs to different sites with the only similar text being of place names. Supermatt84 (talk) 08:57, 8 July 2009 (UTC)

Wikipedia:Requests for comment/User page indexing

Please note Wikipedia:Requests for comment/User page indexing has been repurposed from the standard RFC format it was using into a straw poll format. Please re-visit the RFC to ensure that your previous endorsement(s) are represented in the various proposals and endorse accordingly.

Notice delivery by xenobot 14:00, 8 July 2009 (UTC)

Music From Free Creek

While I am obtaining the initial track and personnel details from the Moogy Klingman page, these will be added to. In addition, this would appear to be similar to simply listing album credits, otherwise considered to be in the public domain.

Hope this approach seems reasonable.

Dreadarthur (talk) 17:00, 8 July 2009 (UTC)

milw0rm

The bot said that I copied text from a site which was actually a copy of the text I had previously written for the milw0rm Wikipedia article, which had been speedily deleted with no discussion. The text is mine and originally from Wikipedia in case you want to update your bot's settings.--Gloriamarie (talk) 17:09, 9 July 2009 (UTC)

CorenSearchBot

Re your note, I can't ever imagine a time that CorenSearchBot would not be tremendously useful. (Unfortunately. I dream of the day that CP becomes obsolete. :)) I'm dropping you a note, though, to question if it's stopped giving out notices to the contributors for a reason? Looking at today, it tagged [4] and properly advised the contributor, here. But it did not advise these contributors: [5], [6] and [7]. (I came by after noting a CP listing where the contributor was not notified from back on July 1.) Is there a glitch of some kind? If so, I hope it's a simple fix. :) --Moonriddengirl ^(talk) 15:37, 10 July 2009 (UTC)

Thought I'd do some talk page stalking since Coren isn't around (not that I have an answer), but I also raised this issue here noting that the bot seems to have an aversion to creating talk pages (it appears to notify them if the page already exists), and C (had already) left a note at WT:SCV saying there was a problem. Your template is getting more of an outing when I get chance to do some work :) – Toon 21:18, 10 July 2009 (UTC)

Ah! Thank you much for the update. :) --Moonriddengirl ^(talk) 00:21, 11 July 2009 (UTC)

Akaflieg Stuttgart FS-26

Yes, the article is largely based on the web material but has been edited. The problem is the text is so good it would be hard to improve on. I stand by what I have posted and admit that it is based on the website (none of the text is directly copied). I couldn't see any way round the problem other than edit the parts that didn't read right or could be modified from information from another source (a translated German text). There is not a lot I can do. Petebutt (talk) 19:40, 10 July 2009 (UTC)

Permission request

The following is from WP:Civility/Poll:

One of the most neglected aspects of the debate is the atmosphere that incivility breeds, even on talk pages or project pages. It scares away contributors from participating in what is (rightly) seen as a gutter brawl. — Coren (talk) 19:04, 4 July 2009 (UTC)

I wonder if I could use it at User:Buster7/Incivility...Thanks in advance...:-|--Buster7 (talk) 05:53, 11 July 2009 (UTC)

The license of CorenSearchBot

Hello.

I was asked to adapt your bot for Polish Wikipedia. I rewrote large parts of its code (it now uses my own MediaWiki::API Perl module) and made a lot of cleanups (I have turned on the strict and warnings pragmas). I didn't think about CSB terms of use at all until now; can you tell me what the license of the CorenSearchBot sources is? Cheers, Invisible Idiot (talk) 14:55, 11 July 2009 (UTC)

Good question, and one that I hadn't considered previously. Consider it released under the Perl Artistic License 2.0. I wish you had told me earlier you were doing code cleanup, though, because I've been doing the same and I have a much cleaner version in use at this time that I haven't yet sanitized for cleanup. — Coren ^(talk) 18:45, 11 July 2009 (UTC)

(link to the license) — Coren ^(talk) 18:48, 11 July 2009 (UTC)

Thanks for response. To be quite honest I didn't finish it yet, the bot is on test run and I am still tweaking it. If you are interested you can look at the source code on the page: http://tools.wikimedia.pl/~beau/tmp/csb.pl. I need to put somewhere my MediaWiki::API module, but it is not documented well and it doesn't really make it simpler to use API ;). I have noticed you are using LWP::UserAgent to fill forms (make edits), take a look at WWW::Mechanize (a subclass of LWP::UserAgent) it really simplifies things. Invisible Idiot (talk) 19:51, 11 July 2009 (UTC)

Yes and no; the more recent version still uses it, but in a cleaner and more modular way. I'll try to update the public version this weekend. — Coren ^(talk) 21:26, 11 July 2009 (UTC)

Incorrect behaviour (Fight Night (ITV programme))

Regarding the notice placed on Fight Night (ITV programme). I was in the process of transferring the material. Your bot should have taken circumstances such as this into account. Please fix it. Ubcule (talk) 17:40, 11 July 2009 (UTC)

North American Game Warden Museum

I've fixed the problem your bot highlighted. Please leave the article alone, or better yet, help improve it. Thank you. Happy editing. 7&6=thirteen (talk) 21:38, 11 July 2009 (UTC) Stan

Wikipedia:Arbitration/Requests/Case#Statement_by_Verbal

This is a simple notification that I've asked a question of you here. Maybe I'm missing the point, but I don't understand your rationale. It seems it would apply to almost any case! Apologies if I've broken protocol in any way with this. Thanks, Verbal chat 22:58, 13 July 2009 (UTC)

Wong Doc-Fai

Thank you for your notification regarding concern of copyright infringement regarding this topic. I assure you that nothing stated is infringing upon anyone's copyrighted material. I researched the reference for supposed proof of copyrighted material located on the http://www.whitelionsofshaolin.com/ and discovered that it is a rewrite of the original information provided on the subject's web site. The information I am providing is a distillation of events and activities that are in the public arena and cannot be considered proprietary nor copyrighted. Please do not create an obstacle for the creation of this page as it relates historically to other subjects within the world of martial arts. Thank you. Clftruthseeking (talk) 02:56, 14 July 2009 (UTC)

Dr5 Chrome / dr5

hello; I discussed this with another member; http://en.wikipedia.org/wiki/User_talk:Dicklyon we concurred that it would be best to change the name of the dr5 page to its rightful name dr5 chrome. the dr5 page should be deleted and the Dr5 Chrome page take its place. Pillhall (talk) 04:59, 14 July 2009 (UTC)

Relief

Hi, Coren. You asked me what relief I seek.[8] I'm trying to avoid the "Wikipedia time effect" here... but I have responded. Are you in a good position to tell me how to go about eliciting action from the committee yet? Bishonen | talk 11:21, 14 July 2009 (UTC).

I'm at work, but I'll take a look at it this evening after I get home. — Coren ^(talk) 14:06, 14 July 2009 (UTC)

Bot is tagging GFDL content

Your bot is tagging content on sites/pages expressly marked with the GFDL 1.2 license. Make it stop, plz. - CobaltBlueTony™ _talk 14:46, 14 July 2009 (UTC)

Hi. :) The bot needs to tag such sites. We can't accept them since our licensing transition in June. Imported text must minimally be compatible with CC-By-SA, and GFDL is not. See Wikipedia:Terms of use. --Moonriddengirl ^(talk) 14:53, 14 July 2009 (UTC)

That's theoretically true, but FWIW, I can't imagine a lower enforcement priority. The chances of anyone from such a site objecting to reuse on Wikipedia are essentially nonexistent. Newyorkbrad (talk) 14:56, 14 July 2009 (UTC)

Importing such text is a violation of our copyright policy. It's in meta's terms of use as well as our own. I can't imagine that Wikipedia would adopt even an unofficial "ignore it; it does no harm" policy over copyright violations. (eta: not that I disagree with you in theory, but when it comes to legal issues surely we need to toe our own line. After all, if someone from such a site did object and pointed out that we have tacitly encouraged the importing of such material by disabling bot detection and allowing it to remain, we would surely be putting ourselves in a bad position for contributory infringement--at least as individuals, if not as a collective.) --Moonriddengirl ^(talk) 14:58, 14 July 2009 (UTC)

(edit conflict)That may be well true. Nonetheless, we have a duty to bring and maintain Wikipedia above questioning on copyright matters. This is akin to stating that unsourced BLP violations are no biggie if they happen to be favourable to the article's subject. Plus, we're discussing automatic flagging by a bot here. That's not exactly taking admin time away from other matters. MLauba (talk) 15:04, 14 July 2009 (UTC)

I agree with MLauba and Moonriddengirl here; at the very least, CSBot bringing attention to the matter so that a human can verify the compatibility is important. Incidentally, this means we should go over User:CorenSearchBot/exclude to check that sites which were added there because they were compatible with the GFDL are still compatible with CC-BY-SA — which may imply having to remove many of the GFDL-only sources from there. — Coren ^(talk) 15:15, 14 July 2009 (UTC)

When was the cut-off date again? November 12th last year? Looks like a task to post at WP:COPYCLEAN. MLauba (talk) 15:57, 14 July 2009 (UTC)

Per your message on my talk page, Coren, why cannot GFDL-licensed sites also license themselves under CC-BY-SA? That makes no sense... - CobaltBlueTony™ _talk 17:31, 14 July 2009 (UTC)

Sorry for continuing the page stalking, but long story short, CC-BY-SA and GFDL 1.2 are not compatible licenses, which is the heart of the issue. Dual licensing is possible under GFDL 1.3 but only if dual licensing is adopted before August 1st (on the source site). Presumably GFDL 2.0 may change that - some day. MLauba (talk) 17:49, 14 July 2009 (UTC)

(edit conflict) I think what he means is that GFDL-licensed material can't be reused under CC-By-SA, not that the external site can't choose to co-license or re-license at some point. However, Coren seems to be saying there that "CC-By-SA only" material can be used under GFDL. If so, Coren, I think you're mistaken on that...at least, so Mike Godwin told me via e-mail back around when I first got started on copyright issues, when the whole licensing thing was new to me. It seems to indicate as much at GFDL#Compatibility_with_CC-BY-SA. Also, our terms of use support that when they say, "For compatibility reasons, any page which does not incorporate text that is exclusively available under CC-BY-SA or a CC-BY-SA-compatible license is also available under the terms of the GNU Free Documentation License." If CC-By-SA could be transferred to GFDL, that would be no concern; even if it was only licensed CC-By-SA, it would still be usable under both licenses. But maybe I'm mistaking your note, and that's not what you're saying at all. :)

Any website that is currently licensed under GFDL can co-license going forward, but if they wish to retroactively relicense their text, it's going to be a real bear for them to do so since they'll either have to do it under the terms of GNU FDL 1.3 (I think they have to complete the transition by August 1, 2009 as well as meeting requisite conditions) or work out a whole new deal with GNU. Their headache; not mine, thank goodness. :/ --Moonriddengirl ^(talk) 17:52, 14 July 2009 (UTC)

But the fact remains that in the real world, this issue has virtually no importance; if we just kept doing as we were, the odds are substantial that no one would notice or care. It's not that I'm not aware of the importance of satisfying legal requirements, but there are issues of prioritization, and this falls way off the bottom of the list. I find it ironic that we are willing to take on disputes in gray legal areas where there are people with profound concerns and convictions on the other side (e.g. the ongoing National Gallery issue, which has a very real chance of ending up in litigation), but go out of our way to police this sort of virtually immaterial, hypertechnical non-compliance as to which continued non-compliance would likely have no meaningful consequences at all. Newyorkbrad (talk) 21:44, 14 July 2009 (UTC)

The only alternative would seem to be that we promote deliberately ignoring satisfying legal requirements. We have a bot that searches for duplicate text. Instructing it to exclude this material which is hosted illegally would surely be some kind of malfeasance on somebody's part. As the word spreads that GFDL is no longer the license-du-jour, it's less likely to be an issue. Meanwhile, letting our contributors know that Foundation wide policy is that GFDL-only text cannot be imported is likely not only doing a service to the encyclopedia (since we ensure we don't build articles that we know from foundation may be illegal to display), but to them. Unlikely as it seems, if the courts go after anybody, currently they're in the firing line, no?

On a related topic, your talk page edit header is somewhat out of date, Coren. :) It says (formatting omitted) "Unless there is an explicit permission on the page (or site) allowing reuse without conditions (or under the GFDL) you can not use that text in a Wikipedia article!" --Moonriddengirl ^(talk) 21:57, 14 July 2009 (UTC)

So it was! Fix't. I suppose there are still a gazillion places all over the wiki that need such fixes. — Coren ^(talk) 10:19, 16 July 2009 (UTC)

(undent) As a follow up to the original query, I cleared the WP:SCV list of these, posted all five concerned articles at WP:CP and left a note on the contributor's page. Newyorkbrad, yes, this may be trivial, but if we ignore it and eventually get dragged to court, we will have a hard time demonstrating good faith. And even if it never actually happens, it's still our reputation at stake. Wikipedia has had a lot of bad press related to copyvio issues, but offsetting that, it also garners good press when the efforts of the copyright cleanup crew are recognized. Going after every issue we know of relentlessly is a matter of credibility for the project, no matter how trivial. On copyright and BLP, what matters is doing the Right Thing (tm), even if it isn't always a priority in the grand scheme of things. MLauba (talk) 23:54, 14 July 2009 (UTC)

Earl L. Vandermeulen High School

I created this article by combining and redirecting the existing articles for "Vandermeulen High School" and "Earl L. Vandermeulen", neither of which were properly titled. I then got the Coren bot notification. It does appear that the former text is quite similar to the URL posted on my talk page - I will amend article and remove tag. Please let me know if I need to do anything further. Thanks! --Neighborhoodpalmreader (talk) 17:12, 14 July 2009 (UTC)

Please put up the Dorothy McEwen article as submitted as I am the author and give you full permission to publish as posted. Thank you. —Preceding unsigned comment added by Maxframe (talk • contribs) 08:44, 16 July 2009 (UTC)

The Place And The Time

I copied the album track listing from another site, plus some text now deleted. It seems that the bot interprets a lift of a track listing as being inappropriate, yet it can't be--at least, I don't think so. I have removed the caution accordingly, which I hope is OK.

Dreadarthur (talk) 16:59, 16 July 2009 (UTC)

Yannick Koffi

recreating non-infringing elements of article.00pj (talk) 07:28, 17 July 2009 (UTC)

Changed

Hi, I have made some changes on the article Diabetes Australia Victoria so that the corenbot mark could be removed, can you check if it's okay now? thanks! Diabetes.victoria (talk) 05:27, 21 July 2009 (UTC)

I have changed the copyproblems under "Pukeuria" by replacing it with a link.

Turbonilla (talk) 15:41, 17 July 2009 (UTC)Turbonilla 17 Juli 2009

CorenSearchBot: HSD3B7 article

Hi. CorenSearchBot has flagged HSD3B7 as a possible copyright violation. Some of the material in this article was taken from "Entrez Gene: HSD3B7". This material was attributed to this source and in addition the {{NLM content}} template was added to the article. Therefore I believe the warning was a false positive. Cheers. Boghog2 (talk) 21:16, 17 July 2009 (UTC)

Henry Mildmay

"If CorenSearchBot is in error: Simply note so on this article's discussion page" No I'd rather do it here.

I am in the process of removing a lot of incompatible copyleft material. See User:Philip Baird Shearer/BCWs copyright issues

A simple read of the page Henry Mildmay shows that it can not possibly be copyright material. Further it has on the page both citations and in the References section:

Attribution

{{DNB}} "This article incorporates text from the Dictionary of National Biography (1885–1900), a publication now in the public domain."

Further the page it compares is http://www.pepysdiary.com/p/3799.php has written on it "This text was last fetched from this Wikipedia page (where you can edit it) on 19 Jul 2009, 9:07pm under the terms of the GFDL."

So why is the dumb bot putting a template on the top of the page? --PBS (talk) 21:50, 19 July 2009 (UTC)

Please remove the template from the top of Henry Mildmay. --PBS (talk) 21:52, 19 July 2009 (UTC)

You can actually remove the tag yourself; it was mostly there for your benefit to begin with. :-) At any rate, the bot didn't know about {{DNB}} — now it does and won't bug you about it again. — Coren ^(talk) 14:48, 20 July 2009 (UTC)

If it is your bot, then please clean up after it, yourself. Are you aware of all the other Category:Attribution templates that we have? As the bot is basically putting out a maintenance template, why not write the information onto the talk page rather than blighting article space? If it is not a maintenance template then it needs to be much more sophisticated so that it does not include obvious false positives like this one. --PBS (talk) 16:45, 20 July 2009 (UTC)

Gary Evans Foster. You need to fix this

You need to fix this. There is no copyright violation for this article. The info was taken from the findagrave website (birth/death dates and burial location) and the official citation on the Army Medal of Honor website per the inline citation. Additionally, stating that the information contained on the Home of Heroes website is in copyright violation is incorrect. They display the actual citation of the Medal of Honor and that is not in copyright from them or anyone else. It is freely distributable as being a work of the US government and I recommend you add it as an exception so I will stop getting these error messages. I don't mean to seem rude but I am getting a little tired of the bot leaving these messages for the Home of Heroes website when the Home of Heroes website, regardless of what they may claim, does not hold the patent or copyright for this info. Please add the home of heroes website as an exception so that it will not continue to generate. There is no info on the home of heroes site that is not already on the AMOH site. All the bot is doing in this case is wasting time for me, it and you. Thanks.--Kumioko (talk) 23:04, 19 July 2009 (UTC)

I'd like to recommend an alternate approach. Can you try this on for size and see what you think?

Hi, Coren. Your amazing bot, which tirelessly and very accurately tags countless pages for possible copyright violations day in and day out, making Wikipedia a better (i.e. more law-abiding) place, recently tagged a page I added. I am thinking that since the citation it referenced originally came from the US Army, it is probably not covered by copyright, despite the web site claiming that. Can you take a look and give me your opinion? If I'm wrong, I'd like to know why, and if I'm right, it would be great if you could add an exception so both of us waste less time in the future on this stuff. Thanks for volunteering your time on this great project, and thanks for any help you can offer.

Just a thought. Frank | talk 23:37, 19 July 2009 (UTC)

The easiest way to handle such cases is to use a standardized attribution template which CSBot can then recognize. For instance, {{DANFS}} is already in use for a naval database. If you make (or use) such a template, simply tell me what it is and I'll instruct the bot to cope. — Coren ^(talk) 14:51, 20 July 2009 (UTC)
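The idea of recognizing a standardized attribution template can be sketched as a scan of the page's wikitext for known template names. This is a rough illustration under stated assumptions, not CorenSearchBot's actual code: the set name, the regular expression, and the function name are hypothetical, only the two templates named in this archive ({{DNB}} and {{DANFS}}) are listed, nested templates are not handled, and names are compared case-insensitively as a simplification.

```python
import re

# Templates this archive says the bot recognizes; the set name is hypothetical.
KNOWN_ATTRIBUTION_TEMPLATES = {"DNB", "DANFS"}

# Matches simple, non-nested templates such as {{DNB}} or {{Name|param=value}}.
TEMPLATE_RE = re.compile(r"\{\{\s*([^|{}]+?)\s*(?:\|[^{}]*)?\}\}")


def has_known_attribution(wikitext, known=KNOWN_ATTRIBUTION_TEMPLATES):
    """Return True if the wikitext transcludes one of the known templates."""
    found = {m.group(1).strip().lower() for m in TEMPLATE_RE.finditer(wikitext)}
    return any(name.lower() in found for name in known)
```

A page for which this returns True would be left untagged; adding a newly created template is then a one-line change to the set.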

Created {{ACMH}} for that purpose. Will create the category soon and leave a message on the appropriate wikiprojects. It would have worked with Frank's wording too, methinks. MLauba (talk) 16:02, 20 July 2009 (UTC)

Ok so should I drop this on all the MOH articles? or Just the new ones as they get created? --Kumioko (talk) 16:06, 20 July 2009 (UTC)

Adding it to the new ones will prevent the bot from tagging these in the future once Coren can modify the rulings (he's pretty busy so you might still get a couple of reports until then). For the existing ones, I suggest you add it when you get a chance to revise these, for completeness's sake. Also note that this can be used for other CMH content, not merely MOH citations. Hope that helps. MLauba (talk) 16:11, 20 July 2009 (UTC)

(undent) I was around anyways, and it only takes a few seconds to add a known template. Done. As for the other question, only new articles need to be templated to avoid CSBot tagging, but it's probably desirable to also place that template on other articles derived from that source if only for consistency (though there is no rush to do so). — Coren ^(talk) 16:30, 20 July 2009 (UTC)

Ok I will make sure I put the template on there although I still think the easiest solution would be for the bot to not use the home of heroes site. I will also add the template to all articles using the AMOH site. If I add it to the references section will that suffice?--Kumioko (talk) 17:02, 20 July 2009 (UTC)

Actually, that often turns out to be more of a problem than it first seems; public domain information is often leeched by more than one site, so that if we exclude one the others will just pop up. — Coren ^(talk) 20:49, 20 July 2009 (UTC)

Done Category created, left messages on WT:MILHIST and WP:ODM. Hope that helps the concerned projects. Best, MLauba (talk) 16:32, 20 July 2009 (UTC)

thanks for the template. I modified it to allow for some variables and I updated the documentation. Here are the variables it now accepts article, url, author, accessdate. --Kumioko (talk) 17:17, 20 July 2009 (UTC)
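For illustration only, the template-based exception discussed in this thread could be checked along the following lines. This is a minimal sketch and not CSBot's actual code: the function name `has_attribution_template` and the `KNOWN_TEMPLATES` list are hypothetical, and the only details taken from the discussion are the template names {{DANFS}} and {{ACMH}} and the fact that {{ACMH}} may carry parameters.

```python
import re

# Hypothetical list of attribution templates a bot has been told about.
# The names come from the thread; the list itself is an assumption.
KNOWN_TEMPLATES = ("DANFS", "ACMH")


def has_attribution_template(wikitext, templates=KNOWN_TEMPLATES):
    """Return True if the wikitext transcludes one of the known templates.

    Matches both the bare form, e.g. {{DANFS}}, and the form with
    parameters, e.g. {{ACMH|article=...|url=...}}.
    """
    names = "|".join(re.escape(name) for name in templates)
    # After the template name we require either a parameter separator "|"
    # or the closing "}}", so that {{ACMHX}} is not mistaken for {{ACMH}}.
    pattern = re.compile(r"\{\{\s*(?:" + names + r")\s*(?:\||\}\})")
    return pattern.search(wikitext) is not None
```

Under these assumptions, a bot would skip the copyright tag for any new article where this check returns True, which matches the advice that only new articles need the template to avoid tagging.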

Some thoughts for SCV

Hi, Coren. We are bandying about some thoughts for streamlining the listing process for Corensearchbot and were thinking that it would be helpful if we could create a system where the bot lists CSB notices on individual days, much like CP does. Currently, the CSB listings are being added to both SCV and CP, which is causing some overlap in work, since these are annotated differently. If it listed them on individual days, these could be transcluded to both CP and SCV, so those working on them will be working in the same place. I know your bot is having difficulty opening new pages, but if you think this is a good idea, I'd be happy to see if I can track down somebody who can help out with this. Maybe User:Tizio could help out, since he already opens new pages for CP and it is his bot that lists CSB taggings at CP? I wonder if he could also help with placing notifications?

The main conversation is at WT:SCV, though I've publicized it at WT:COPYCLEAN and WT:CP. Does this sound like a good and/or workable solution to you? --Moonriddengirl ^(talk) 11:36, 20 July 2009 (UTC)

I've got no objection to any change which makes your jobs easier. :-) I'll probably not have time to do this for the next two weeks, however, given that I pretty much only have time for coding on weekends and that next weekend is booked solid for me.

In the meantime, I'll keep an eye on that discussion until the best scheme has been figured out and I'll code it up during the following weekend. Incidentally, the "create new page" bug is one I do want to tackle then anyways, so it shouldn't be an obstacle. — Coren ^(talk) 14:46, 20 July 2009 (UTC)

Mary May Simon Bio

Permission to use the corporate bio of Mary May Simon, President of the Inuit Tapiriit Kanatami, has been granted by the Inuit Tapiriit Kanatami. Email info@itk.ca for verification.

Inuitofcanada (talk) 17:10, 21 July 2009 (UTC)

Denis Bédard

Your search thing noticed that an article I just produced is a complete copy of a website. There was no copyright marked on the website. What's the policy on that? DTOx (talk) 08:42, 22 July 2009 (UTC)

By international copyright law (the Berne Convention), you will have to assume any work is protected unless explicitly stated otherwise. A copyright notice is not necessary at all. MLauba (talk) 09:18, 22 July 2009 (UTC)

Thomas Palley

Your message was

I have performed a web search with the contents of Thomas Palley, and it appears to include a substantial copy of http://www.thomaspalley.com/?page_id=11. For legal reasons, we cannot accept copyrighted text or images borrowed from other web sites or printed material; such additions will be deleted. You may use external websites as a source of information, but not as a source of sentences. See our copyright policy for further details. This message was placed automatically, and it is possible that the bot is confused and found similarity where none actually exists. If that is the case, you can remove the tag from the article and it would be appreciated if you could drop a note on the maintainer's talk page. CorenSearchBot (talk) 12:55, 22 July 2009 (UTC)

I have removed the tag as the source http://www.thomaspalley.com/ was noted, another source was used, and the text was reworded sufficiently to avoid problems. Hope this is OK and best wishes (Msrasnw (talk) 13:02, 22 July 2009 (UTC))

Technical boring stuff, essentially politics free

I've discovered an unpleasant form of vandalism, which I documented at WP:BOTREQ#Forged Billboard Links, but I'm getting no takers on building a bot to help me search it out. It struck me that CorenSearchBot might have the skeleton of similar code (i.e. performing a Google search based on text extracted from a Wikipage), and, if that's true, your mentioning it there might help inspire some eager young bot writer to borrow your code and build it for me. Basically, Billboard has a bug which allows you to move record chart positions around: if a song reached a position on any chart, vandals can massage the link to make it appear that it reached that position on any chart they want on any date they want. I have no idea how prevalent it is: I only noticed because they made a truly bizarre claim (a Latin pop star charting on the "Vietnam Hot 100"), and it was so bizarre that I didn't trust the chart position being served up, despite having "Billboard" stamped all over it.—Kww(talk) 03:11, 17 July 2009 (UTC)

Hmm. I'm not sure what a bot can do to help, from what I understand of the problem. Do you have a specific method in mind to figure out such constructed links from the genuine thing? — Coren ^(talk) 01:35, 19 July 2009 (UTC)

Perhaps I'm not being clear enough, and that's the reason I'm not getting responses

Use http://en.wikipedia.org/w/api.php?action=query&list=exturlusage&eunamespace=0&eulimit=max&euquery=*.www.billboard.com&format=yamlfm to get a list of links to Billboard.com
For each link retrieved

    Is it of form http://www.billboard.com/bbcom/esearch/chart_display.jsp?cfi=n&cfgn=Singles or Albums&cfn=namestring&ci=n&cdi=n&cid=datestring

        extract ci=n field from link.
        search billboard.com for links including that string

            If no hits, then log as unverifiable link
            If hit, then compare cid and cfn fields

                if mismatch, log as forgery
                if all match, then link is good

As an optimization, it could keep a database of verified links, so that on subsequent runs it wouldn't have to query Google every time.—Kww(talk) 13:57, 19 July 2009 (UTC)

Some context that might help: it's the "ci=" field that identifies a unique chart. Every week that a new "Canadian Hot 100" is created, a new chart id is created, and all links to that chart will include that field, encoded as "ci=". The chart's name should be the same in each link, thus "cfn=" should be the same for every link. The date should be the same in each link, thus "cid=" should be the same in each link. For correct referencing, people do a search on a song title, artist, or chart, and Billboard's server manufactures the link for them. Unfortunately, Billboard's server doesn't have an interface to say what the correct values of the various fields are for a given chart: they assume that once they give you the link, you will use it without tampering with it. What I'm proposing is that we search through Billboard's news stories and find their internal links to various charts, on the assumption that they don't vandalize their own links internal to their site. Random testing shows me that for about 75% of links that I check, I can find an internal Billboard link to verify it against.—Kww(talk) 14:56, 19 July 2009 (UTC)
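The check described above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not an existing bot: the function names, the `trusted` mapping (standing in for the results of searching Billboard's own pages, keyed by the "ci" value) and the `verified` cache are all hypothetical, and the sketch leaves out the search step itself.

```python
from urllib.parse import urlparse, parse_qs

# Path of the chart display page, taken from the link form quoted above.
CHART_PATH = "/bbcom/esearch/chart_display.jsp"


def parse_chart_link(url):
    """Return the ci, cfn and cid fields of a chart link, or None if the
    URL is not of the chart_display.jsp form or lacks one of the fields."""
    parts = urlparse(url)
    if parts.netloc != "www.billboard.com" or parts.path != CHART_PATH:
        return None
    query = parse_qs(parts.query)
    try:
        return {key: query[key][0] for key in ("ci", "cfn", "cid")}
    except KeyError:
        return None


def classify(url, trusted, verified):
    """Classify a link as 'good', 'forgery', 'unverifiable' or
    'not a chart link'.

    trusted  -- hypothetical dict mapping a ci value to the (cfn, cid) pair
                found in a link on Billboard's own site
    verified -- set of links already confirmed, so later runs can skip them
    """
    fields = parse_chart_link(url)
    if fields is None:
        return "not a chart link"
    if url in verified:
        return "good"
    reference = trusted.get(fields["ci"])
    if reference is None:
        return "unverifiable"      # no internal Billboard link to compare against
    if (fields["cfn"], fields["cid"]) != reference:
        return "forgery"           # chart id reused with a different name or date
    verified.add(url)              # remember the link for subsequent runs
    return "good"
```

Keeping the trusted data keyed by "ci" reflects the point that the chart id is the one field that uniquely identifies a chart; the name and date are then checked against whatever Billboard's own links say for that id.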

Note that this whole point has become moot: Billboard just revamped its website, leaving us with over 15000 dead links. Certainly need a bot to fix that, but I can't even conceive of how yet.—Kww(talk) 04:37, 23 July 2009 (UTC)

Sorry about the delay, as I've stated below under another thread I wouldn't have been able to work on it until the weekend after next. Does the new site have safeguards in place to prevent abuse? Maybe they noticed the mishandling of their data and did something about it. :-) It's worth investigating because if that's the case, that means they might be willing to cooperate to give us stable URLs for that. — Coren ^(talk) 10:21, 23 July 2009 (UTC)

Actually, they've provided a whole API for the site, which looked great until I found out that it won't return any data for a chart less than one month old. That strikes me as being an effort to make sure that we can't be truly up-to-date automatically: I just can't see the average Lady Gaga fanboy being content to wait a month to show the chart positions. I'm still pondering the right way out of this mess.—Kww(talk) 13:13, 23 July 2009 (UTC)

It's still reasonable to code a template to access it for references as soon as a chart comes out of their blackout period, and it will provide reasonably good and accurate long term cites. It might also be reasonable to contact them semi-officially to negotiate an exception for Wikipedia; I think a reasonable argument can be made that this benefits them as well. — Coren ^(talk) 13:19, 23 July 2009 (UTC)

New PD-US template

{{CRS}} - Mlauba said I should mention it to you or your bot. Regards, Novickas (talk) 16:21, 22 July 2009 (UTC)

CSBot now knows and understands that template. — Coren ^(talk) 17:09, 22 July 2009 (UTC)

Thanks - Novickas (talk) 22:36, 22 July 2009 (UTC)

Clarification requested

regarding this, if the block log is indeed wrong, you should probably add some clarification message in the log. LoverOfTheRussianQueen (talk) 20:33, 22 July 2009 (UTC)

The block entry, as placed, is accurate. The matter is currently being examined by the committee so further comments at this time would not be appropriate. — Coren ^(talk) 04:17, 23 July 2009 (UTC)

Wikipedia:Suspected copyright violations

Given the constant backlog on Wikipedia:Suspected copyright violations, which I understand to be a place for CorenSearchBot's suspicions and not human editors', would it be possible for CorenSearchBot to occasionally re-review its suspicions and remove entries from the list on which the CSB template has been removed and which no longer appear to be a copyright violation? When there were enough people regularly reviewing Wikipedia:Suspected copyright violations, such as before I semi-retired, it was a good idea to leave those suspicions so every one could be manually reviewed. But now it may be worth keeping the scope of the suspicions somewhat current, so people aren't discouraged by the constant backlog and don't stop reviewing the latest entries. Tell me what you think; if you don't want to write the code yourself, I can write a new task for MadmanBot based off of your source code. — madman bum and angel 05:08, 24 July 2009 (UTC)

As one of the few remaining regulars, I think that's an extremely bad idea: the tags often get removed after a very minimal attempt at paraphrasing has been done, which the bot may miss but a human would immediately recognize. We're in effect considering other steps to streamline copyvio investigation efforts; I suggest you weigh in at WT:SCV#Home of the copyright problem. Cheers, MLauba (talk) 07:20, 24 July 2009 (UTC)

dropping a note related to Corenbot's copyright notice about Rara national Park

Note: The initial text was from the Department of National Parks and Wildlife Conservation (DNPWC) website, a Nepalese government body and hence authoritative. I have rewritten the text incorporating text from the Nepal tourism directory and a published book. The article is here -- Rara National Park Utsav₈₀|Blabber 06:57, 24 July 2009 (UTC)

Just fyi

This is just an fyi, as I consider this bot a legitimate and good idea. Sometimes it is useful to start a daughter article by copying and pasting an entire main article to preserve the named footnotes, e.g. <ref name="example"/>. Maybe the bot should wait a few seconds to see if the article remains in large part a copy. Maybe that is not technically feasible. That's all I've got. Savidan 20:20, 24 July 2009 (UTC)

Laughing samoans

The sentences from the Laughing Samoans website are mainly about venue, location and tour date information, so is that why it looks repetitive? It's all factual info. I don't know if I need to cite the same info from another source. If it's a problem I can delete the article. Ta (Teine Savaii (talk) 09:45, 26 July 2009 (UTC))

Royal Prerogative (United Kingdom)

Quote:

"This is an automated message from CorenSearchBot. I have performed a web search with the contents of Royal Prerogative (United Kingdom), and it appears to include a substantial copy of http://www.pepysdiary.com/p/494.php. For legal reasons, we cannot accept copyrighted text or images borrowed from other web sites or printed material; such additions will be deleted. You may use external websites as a source of information, but not as a source of sentences. See our copyright policy for further details.

This message was placed automatically, and it is possible that the bot is confused and found similarity where none actually exists. If that is the case, you can remove the tag from the article and it would be appreciated if you could drop a note on the maintainer's talk page. CorenSearchBot (talk) 10:32, 25 July 2009 (UTC)"

I am trying to fork an article on Royal Prerogative into a "world" article and a UK article. pepysdiary.com appears to be a clone of Wikipedia so it is not a copyright violation by me.

John Cross (talk) 10:37, 25 July 2009 (UTC)

See User talk:Coren/Archives/2009/July#Henry Mildmay; this seems to be a recurring problem with this website. --PBS (talk) 08:47, 27 July 2009 (UTC)

Why not place the bot generated comment on the talk page instead of the article space? --PBS (talk) 08:51, 27 July 2009 (UTC)

good idea John Cross (talk) 21:45, 27 July 2009 (UTC)

Please note that you can't copy material from one article to another in Wikipedia without giving the original contributors credit. The word "fork" is insufficient attribution, as our current terms of use make quite clear that our contributors must be "credited, at minimum, through a hyperlink or URL when your contributions are reused in any form." In addition to making a note in the edit summary with a direct link to the source article, there is a template that can be used at {{copied}}. (We used to have multiple templates for each article's talk; this one is a recent introduction to keep it to one.) I'll go ahead and note the duplication for attribution purposes at this one, but if you've copied content between other articles, please be sure that you've given credit to the copyright holders of the articles you've used as source. Thanks! --Moonriddengirl ^(talk) 21:52, 27 July 2009 (UTC)

False positive

Hi, I recently created a page Sam Harper about a film director and copied his filmography from the IMDB website. Correct me if I am not allowed to do that. Just thought you should know that CorenSearchBot marked my edit. If this is a newbie comment, then sorry, but I am still new here. Thanks, Elppin (talk) 22:34, 26 July 2009 (UTC)

Comillas Foundation

It is true that part of the text in the article "Comillas Foundation" coincides with the Foundation webpage, as does the Spanish version, but that is a descriptive text and it is not affected by copyright. In any case, as Director General of the Foundation I am authorised to give permission to use that text. --Ignacio (talk) 18:31, 27 July 2009 (UTC)

George Motion 4

Hi, would you please consider, provided you haven't already, my proposed motion 4? You seem anxious to get this case over with, that is understandable, but let's not allow the cost of haste to be injustice to one of Wikipedia's most venerable and best writers. Thank you,--R.D.H. (Ghost In The Machine) (talk) 22:04, 29 July 2009 (UTC)

Actually, it's a bit late for this — the desysop motion already passed yesterday (part of the reason it has not yet been enacted is simply the fact that Geogre commented at the last minute, and that no arb believes there is an insane rush). But I should probably point out that a "temporary" desysop tends to be a harsher remedy than a straight up desysop; as things are, Geogre can both appeal the desysop if he wants to comment on the substance of the issues, or run for RfA at any time he wishes without a constraint on a timeline or having to wait a set amount of time, and leave the evaluation of the circumstances in the hands of the community. — Coren ^(talk) 22:49, 29 July 2009 (UTC)

As I pointed out and you are well aware, indefinite desysoppings are de facto permanent ones. RfA is out of the question. All some need to see is "de-admined by the AC" and all other virtues are shoved aside. RfA is gamed and broken, especially for ex-admins seeking a fresh start. And I doubt that this committee, given how quickly it has chosen to give in to the demands of Geogre's old enemies, would seriously consider granting his appeal. So I fail to see the logic in your argument that temporary is harsher. The AC's fairness and functions are being seriously questioned by the community. I doubt a Drumhead court-martial of one of WP's best and brightest would help improve its fast falling reputation.--R.D.H. (Ghost In The Machine) (talk) 23:36, 29 July 2009 (UTC)

Susan J. Wolfson

Fixed the text. Sorry! —Preceding unsigned comment added by Critic11 (talk • contribs) 22:25, 30 July 2009 (UTC)

The IGSM article is not violating the copyright rule, as igsm.in is the official website of IGSM. IGSM is an educational institution that imparts management knowledge to students. It is an institution run by a charitable trust. It is a nonprofit organization. —Preceding unsigned comment added by Ritesh saxena (talk • contribs) 15:37, 31 July 2009 (UTC)

Military Interdepartmental Purchase Request

False alarm on this article: I was in the middle of converting a bad dab page (MIPR) into a good one, and duplicate pages existed for the 10 minutes or so that I was away from my PC. Thanks for the Searchbot, it's very helpful for weeding out copyvios. Per Ardua (talk) 16:11, 31 July 2009 (UTC)