Wikipedia:Bots/Requests for approval/VWBot 10
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.
Operator: VernoWhitney (talk · contribs)
Time filed: 00:46, Tuesday October 26, 2010 (UTC)
Automatic or Manually assisted: Automatic
Programming language(s): Python
Source code available: No
Function overview: List pages newly tagged with {{Copypaste}} at WP:CP.
Links to relevant discussions (where appropriate): Wikipedia talk:Copyright problems#More copyvio work/automation?
Edit period(s): Daily
Estimated number of pages affected: 1 page/day
Exclusion compliant (Y/N): Y
Already has a bot flag (Y/N): Y
Function details: This edit would be combined with the already approved single-edit listing of pages which are newly tagged with {{subst:copyvio}} and {{Close paraphrasing}} (part of Task 3 and Task 5) and the relisting of the same (Task 7).
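The bot's source is not published, so the following is only a minimal Python sketch of the idea described above: compare the set of pages carrying a tag today with the set seen on the previous run, then fold the new titles for every tag into one block of text so that a single edit can list them all. The function names, the listing wording, and the HTML comment header are assumptions made for illustration, not the format actually used at WP:CP.

```python
from datetime import date


def newly_tagged(previous: set[str], current: set[str]) -> list[str]:
    """Return titles tagged now that were not tagged on the previous run, sorted."""
    return sorted(current - previous)


def format_listing(titles_by_tag: dict[str, list[str]], day: date) -> str:
    """Build one block of wikitext covering every tag, so a single edit suffices.

    The line format below is hypothetical; it is not the real WP:CP layout.
    """
    lines = [f"<!-- hypothetical listing for {day.isoformat()} -->"]
    for tag, titles in titles_by_tag.items():
        for title in titles:
            lines.append(f"* [[:{title}]] newly tagged with {{{{tl|{tag}}}}}")
    return "\n".join(lines)
```

Keeping the set difference separate from the formatting means the same comparison could serve each tag, with only the final text being combined.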
Discussion
Oppose. I am concerned that this bot is facilitating enforcement of Essay opinions and other non-policy. There seems to be a lot of it: [1], [2] and especially [3] concern me. --Elvey (talk) 19:23, 26 October 2010 (UTC)
- If you would rather I link to derivative work I could, but it explains the situation less clearly and regardless, material which closely paraphrases or abridges a copyrighted work is still a WP:Copyright violation. As far as the last diff/article goes, the conversations here and here seem to indicate that there is no clear statute or case law establishing whether the content was or was not copyrightable, which is why I blanked it as a possible copyright infringement as demanded by policy in the first place. Have I been violating policy or guidelines somehow, or do you just feel that I am being overly aggressive in removing copyvio, or is it something else that I am missing? VernoWhitney (talk) 20:18, 26 October 2010 (UTC)
- I feel that last blanking indicates poor judgement. If you can't be bothered to make a determination regarding whether a copyright violation has occurred, you shouldn't be blanking. Paraphrasing is perfectly acceptable under policy and copyright law; plagiarism is not. We can't have WP:GANG behavior I see (not accusing you of being in one) that results in deletion of material by a gang, and yet no one is 'responsible' for the deletion, and no one (but an admin) can review it. Often, material is quickly deleted and is not reviewable, other than by admins, even when requested. --Elvey (talk) 21:31, 26 October 2010 (UTC)
- You're free to feel that way. My determination for that last one was based on the facts that a) the site explicitly claimed copyright and b) that the EU was not one of the explicitly listed entities whose laws could not be copyrighted in the United States, and so I felt it needed further review. Regardless, blanked material is not the clear-cut case of G12 (which could be reviewed by simply looking at the source(s)) and is not subject to speedy deletion--it sits there for a week and during that time anyone can view the text in history, rewrite it, explain how it's acceptable, or otherwise be involved in the process. I'm afraid I must completely disagree with you regarding paraphrasing, but that's not at issue with this BRFA.
- Since it's my behavior you find fault with, would it help if I pointed out that auto-listing items at WP:CP generally means they are not handled by me, but by User:Moonriddengirl who is the regular admin on duty there? My routine is the daily bot-generated (not originally my bot, should that matter) WP:SCV listings and other neglected copyvio backlogs such as the currently 520 possible copyright violations identified via {{Copypaste}}, some of which have been tagged for over 3 years without any action being taken. Without this task (and in the absence of an influx of regular WP:COPYCLEAN volunteers) more of the articles tagged like this will be handled by me. VernoWhitney (talk) 21:57, 26 October 2010 (UTC)
- Blanking content where there is legitimate doubt of the legality of it is long-standing practice and a good one. It is better to pull it from publication for a few days while we make sure it's clear than that we endanger the project and its reusers and cause material damage to copyright holders with unlawful use of their material. The vast majority of copyright problems listed for review are problems. A small but significant number of them are cleared up by permission from the copyright holder once the procedure is made clear, but with most of them permission is never provided. --Moonriddengirl (talk) 00:44, 27 October 2010 (UTC)
- Wait, I'm confused. I looked at the first of the examples you listed. Are you objecting to the removal of content like "This is an excellent tree to use in a landscape in areas where you need to establish shade right through the year. It also serves very well as a background planting in a small garden. This is a nice shade tree to be planted in areas next to the swimming pool. The root system will not lift up paving and foundations. The fruits are not fleshy and therefore cause no mess on paved areas" where the source says, "Apodytes dimidiata is an excellent tree to use in a landscape in areas where you need to establish shade right through the year. It also serves very well as a background planting in a small garden. This is a nice shade tree to be planted in areas next to the swimming pool. The root system will not lift up paving and foundations. The fruits are not fleshy and therefore cause no mess on paved areas." (I've italicized the words that are duplicated so they'll be more easily seen.) The only words that haven't been copied are "Apodytes dimidiata." That's a clear violation of WP:C and WP:NFC. --Moonriddengirl (talk) 00:53, 27 October 2010 (UTC)
- Perhaps copyright paranoia causes copyright paranoia paranoia. I was concerned by the edit summary, which I said I objected to, not the removal of the content you quote, MRG. Maybe PD-laws (which I created) or its /Doc can be tweaked to better reflect the understanding reached with respect to EU law. --Elvey (talk) 06:14, 27 October 2010 (UTC)
- Sure, that sounds like a good idea, although I think we should make clear that this is not based on an official position of the WMF but consensus of contributors. I wish we had an official position of the WMF, but, with Mike Godwin heading out of office, now does seem like a good time to ask for one. So far as I know, there's not yet legal precedent. In any event, I didn't find one when I glanced, but some search term combos are harder to narrow than others. --Moonriddengirl (talk) 11:47, 27 October 2010 (UTC)
- Support. Plagiarism is perfectly acceptable under copyright law, since the law does not concern itself with plagiarism. (It is, however, a problem under Wikipedia's guidelines.) Close paraphrasing of copyrighted content, however, is a legal problem, which is why our copyright policy says "Note that copyright law governs the creative expression of ideas, not the ideas or information themselves. Therefore, it is legal to read an encyclopedia article or other work, reformulate the concepts in your own words, and submit it to Wikipedia, so long as you do not follow the source too closely." (emphasis added) Our copyright FAQ also says, "Facts cannot be copyrighted. It is legal to read an encyclopedia article or other work, reformulate the concepts in your own words, and submit it to Wikipedia, although the structure, presentation, and phrasing of the information should be your own original creation...You can use the facts, but unless they are presented without creativity (such as an alphabetical phone directory), you may need to reorganize as well as restate them to avoid substantial similarity infringement. It can be helpful in this respect to utilize multiple sources, which can provide a greater selection of facts from which to draw." There is no proposal here for this bot to tag any article, but instead to list articles that have been tagged by others to ensure that they receive prompt human review. Since it's not uncommon for articles that have the close paraphrasing tag to actually be unusable, I fail to see how that can be a problem. --Moonriddengirl (talk) 00:44, 27 October 2010 (UTC)
- I agree, MRG. And I'm still not clear how this wouldn't facilitate the WP:GANG behavior I described and often see, and thereby be a problem. --Elvey (talk) 06:14, 27 October 2010 (UTC)
- We're obviously looking at this one differently. Can you explain how listing articles that have been tagged in the course of ordinary participation by random editors for human review to see if action is necessary could facilitate WP:GANG behavior? How can these random editors and admins who work CP be "tag-teaming"? If you believe that there are admins who are improperly handling copyright concerns at CP (whether that's me or somebody else), that would certainly be something to address, but I fail to see how "meatpuppetry" would be involved. In that case, you'd be talking about admin abuse or incompetence, and the bot has nothing to do with it. WP:CP is a community approved process board. It is transparent; the articles listed there are listed for a full week during which the content may be reviewed and comments may be left by any editor. I'm not the only admin who works CP, but in terms of your note ("that results in deletion of material by a gang, and yet no one is 'responsible' for the deletion"), I review every article listed at CP that I handle, no matter who tags it, and I am fully responsible for any deletions. Any admin who uses admin tools is responsible for the decision to do so, no matter who requested them or in what manner. This is not only good practice, but policy. It sounds to me as though your problem is not with the bot, but with the tag itself. At least if it's listed for admin review, it has a chance to be reviewed by somebody experienced in Wikipedia's approach to non-free content. --Moonriddengirl (talk) 11:47, 27 October 2010 (UTC)
Approved for trial (7 days). Please provide a link to the relevant contributions and/or diffs when the trial is complete. I'd like to see more discussion on the suitability of the task, but we can begin testing. MBisanz talk 05:29, 27 October 2010 (UTC)
The discussion above is completely unrelated to the proposed bot task. Anomie⚔ 23:41, 10 November 2010 (UTC)
Arbitrary break
Thanks, I should be able to start testing tonight. That said, can I back up the rest of this discussion and ask that comments on my actions be moved to my talk page or somewhere else more appropriate? The three examples pointed out in the first comment were the result of articles brought to my attention by User:CorenSearchBot with some help from the Contribution surveyor, not VWBot at all.
This bot task is proposed in order to facilitate prompt human review, so that issues don't remain unaddressed for years, the backlog stops increasing, and whichever editors tagged articles as copy/pastes may still be around to discuss why they tagged them in the first place if there's uncertainty. Are there problems with this task being automated, and if so what? VernoWhitney (talk) 13:26, 27 October 2010 (UTC)
- Trial complete. The relevant edits, for your convenience: 27th, 28th, 29th, 30th, 31st, and 1st. There were no new copy/pastes tagged on the 2nd. VernoWhitney (talk) 01:27, 3 November 2010 (UTC)
- I notice a few cases where the bot copied extra template parameters, for example the listing for Armidale, New South Wales in this edit. Other than that, it seems to work well. Anomie⚔ 23:41, 10 November 2010 (UTC)
- Well, that's certainly embarrassing. I had tried streamlining some of the older code while adding the new parts for this task and apparently left out that part of the logic. Fixed now. VernoWhitney (talk) 00:12, 11 November 2010 (UTC)
- Since it's such a minor issue, I don't see a need for another trial. I'm sure you'll keep an eye on it to make sure it really is fixed. Approved. Anomie⚔ 00:53, 11 November 2010 (UTC)
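For readers curious about the kind of logic involved in the template-parameter issue noted in this thread, here is a small Python sketch of one way to read only a template's name out of wikitext and drop whatever parameters follow it. It is an illustration under stated assumptions, not the bot's actual code (which is unpublished); the `TEMPLATE_NAME` pattern and the `template_names` helper are hypothetical, and the regex approach will also report templates nested inside parameters.

```python
import re

# Capture only the template name: stop at the first "|" (start of the
# parameters) or at "}}" (end of a call that has no parameters).
TEMPLATE_NAME = re.compile(r"\{\{\s*([^|{}]+?)\s*(?:\||\}\})")


def template_names(wikitext: str) -> list[str]:
    """Return the names of templates used in the text, without parameters."""
    return [match.group(1) for match in TEMPLATE_NAME.finditer(wikitext)]
```

Called on text such as `{{Copypaste|date=October 2010}}`, the helper yields just `Copypaste`, so a listing built from its output would not carry the `date` parameter along.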