Wikipedia:Bots/Requests for approval/Ganeshbot 4
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.
Operator: Ganeshk (talk · contribs)
Automatic or Manually assisted: Automatic
Programming language(s): AutoWikiBrowser and CSVLoader
Source code available: Yes
Function overview: The Gastropod project is looking to use automation to create about 600 new species stub articles under the genus Conus. The stubs will use the World Register of Marine Species (WoRMS) as a reference.
Links to relevant discussions (where appropriate):
Edit period(s): One time run
Estimated number of pages affected: 580 new pages (approx.)
Exclusion compliant (Y/N): N/A
Already has a bot flag (Y/N): Y
Function details:
The Gastropod project would like to use automation to create about 600 new species articles under the genus Conus. This task has been reviewed and approved by the project.
This task will use the CSVLoader plugin with AutoWikiBrowser. CSVLoader allows creating articles using data stored in delimited text files.
A stub template was created at User:Anna Frodesiak/Gold sandbox to show what the finished article will look like. It was checked and verified by all the Gastropod project members. The page also served as a central place for discussions.
The data file for the task is available at User:Anna Frodesiak/Green sandbox.
The CSVLoader plugin will use the approved article template and the CSV data to create the finished articles. A reference to the World Register of Marine Species will be placed in each new article.
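To make the mechanism concrete, here is a minimal sketch of the substitution step in Python. It is not CSVLoader's actual code; the %%Column%% placeholder syntax, the column names (Species, Authority, AphiaID), and the template text are assumptions for the example.

    # Minimal sketch of the substitution CSVLoader performs; not its actual code.
    # The %%Column%% placeholder syntax and the column names are assumptions.
    import csv

    TEMPLATE = ("'''''%%Species%%''''' is a species of sea snail, a marine "
                "[[gastropod]] [[mollusk]] in the family [[Conidae]], the cone "
                "snails.<ref>{{cite web |title=%%Species%% %%Authority%% "
                "|work=World Register of Marine Species "
                "|url=http://www.marinespecies.org/aphia.php?p=taxdetails&id=%%AphiaID%%}}</ref>")

    def build_articles(csv_path):
        """Yield one (title, wikitext) pair per data row."""
        with open(csv_path, newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):           # header row gives column names
                text = TEMPLATE
                for column, value in row.items():   # substitute every column
                    text = text.replace("%%" + column + "%%", value)
                yield row["Species"], text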
Two sample articles are in my sandbox:
Discussion
Isn't there an entire wiki devoted to this? --MZMcBride (talk) 03:07, 17 March 2010 (UTC)[reply]
- Doesn't mean we can't use the information. Looks pretty cool! Tim1357 (talk) 03:11, 17 March 2010 (UTC)[reply]
- MZMcBride, Meta has a whole page devoted to answering your question, Wikispecies FAQ. Regards, Ganeshk (talk) 03:13, 17 March 2010 (UTC)[reply]
- (edit conflict) Yes, yes it does. We could bulk import a lot of things into Wikipedia (case law, dictionary definitions, images, etc.) but we don't because we have other projects for such things (Wikisource, Wiktionary, Commons). We've had specific issues in the past with bulk creation of articles and with bulk creation of species articles in particular. This is going to need some broader discussion and consideration. --MZMcBride (talk) 03:14, 17 March 2010 (UTC)[reply]
- I am quoting from the Meta page. The final sentence is important: we still need an encyclopedia for the general audience.
The primary reason that Wikispecies is not part of Wikipedia is that the two reference works serve different purposes and audiences. The needs of a general-purpose, general-audience encyclopedia differ from those of a professional reference work. Furthermore, much of Wikispecies is language-independent, so placing it in the English Wikipedia (for example) would be inappropriate. Wikispecies also has different software requirements from those of Wikipedia. The project is not intended to reduce the scope of Wikipedia, or take the place of the in-depth biology articles therein.
- Regards, Ganeshk (talk) 03:17, 17 March 2010 (UTC)[reply]
- I'm just passing by here, and I want to make it clear that I don't feel strongly about this either way. I'd just like to say that MZMcBride makes a point. I understand the argument that there are two audiences that the wikis cater to and all, but really, by this point I would question the mass addition of most sets of articles into Wikipedia. I could understand finding and adding individual species articles, if there's something to actually write, but... do we really need thousands of new stubs here that (apparently) consist of nothing more than some genetic information, when there is a specialized wiki that specifically does want those sorts of articles?
— V = IR (Talk • Contribs) 06:01, 17 March 2010 (UTC)[reply]
- Yes, we need them for further expansion. We, the WikiProject Gastropods members, are starting stubs exactly like this manually. This approval is only a formal check that this bot task will not go against Wikipedia guidelines. The task has already been approved by WikiProject Gastropods. This bot task is fine and can be approved without delay. --Snek01 (talk) 10:30, 17 March 2010 (UTC)[reply]
- Ohms, I understand your concern. There was a lot of pre-work done before this bot approval request. The stubs contain more than just the genetic information. The data for the stubs was retrieved using web services. [1] I wrote an AWB plugin to check the WoRMS site, pull information like the Authority and Aphia ID, and create the data file. These are unique to each article and are used in the actual stub article to create a reference. Additional text has been added to the lede by the project members to ensure the stub is not a one-liner. It has proper stub templates, a taxonomy infobox, and categories. Doing this manually is a lot of work, hence the bot request. Regards, Ganeshk (talk) 12:47, 17 March 2010 (UTC)[reply]
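For the curious, a rough Python equivalent of that per-species lookup. The plugin itself called the WoRMS SOAP web service available at the time; this sketch instead assumes the later WoRMS REST API (AphiaRecordsByName) and its documented AphiaRecord field names.

    # Rough equivalent of the plugin's lookup. The plugin used the WoRMS SOAP
    # service; this sketch assumes the later REST API and its AphiaRecord
    # field names (AphiaID, authority, status).
    import json
    import urllib.parse
    import urllib.request

    def worms_lookup(species_name):
        """Return (aphia_id, authority, status) for a name, or None if absent."""
        url = ("https://www.marinespecies.org/rest/AphiaRecordsByName/"
               + urllib.parse.quote(species_name)
               + "?like=false&marine_only=true")
        with urllib.request.urlopen(url) as resp:
            body = resp.read()
        records = json.loads(body) if body else []   # empty body = no match
        if not records:
            return None
        rec = records[0]
        return rec.get("AphiaID"), rec.get("authority"), rec.get("status")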
- First of all, articles like User:Ganeshk/sandbox/Conus_abbreviatus should not be published. Most of the content is unreferenced, and there's really no point in adding multiple empty section headers with ugly cleanup tags. If and when they get expanded, then the sections can be added. Moreover, I'm always a bit wary of bot-created content, and I'd like to see some evidence that it will be verified and accurate. –Juliancolton | Talk 19:31, 17 March 2010 (UTC)[reply]
- Let me talk to the project about removing those empty sections. On your second note, the article content has been verified by the project members. The data was pulled from a credible site, WoRMS. There was no new bot written for this task: CSVLoader and AWB are just software tools that facilitate automation. CSVLoader reads each line from the CSV file, replaces the column names in the stub template with the data from that line, and creates the article. Regards, Ganeshk (talk) 22:06, 17 March 2010 (UTC)[reply]
- Well, this kind of thing has been done before. Tim1357 (talk) 00:58, 18 March 2010 (UTC)[reply]
- Yes, but it has also been done before with disastrous results. I still have nightmares about Wikipedia:Articles for deletion/Anybot's algae articles. Not that this will be anything like that, of course, but when you say "this has been done before", you don't settle my concerns about this task. I oppose all automated stub creation by bots. If over five hundred pages are simple enough and repetitive enough to be created by bots, I'm not entirely sure they should be on Wikipedia in the first place. What's the point of having 500+ articles that are essentially the same? — The Earwig (talk) 01:33, 18 March 2010 (UTC)[reply]
- This task is nowhere close to what AnyBot did. I clearly mentioned that about 580 stubs will be created (nothing more). The articles that will be created are already listed on the data page (first column). The point of creating these articles is so that the Gastro team can come in and expand them. Regards, Ganeshk (talk) 01:45, 18 March 2010 (UTC)[reply]
- A couple of quick points: in direct reply to you, Ganeshk, The Earwig was making a general comparison in response to the "it's been done before" statement made by Tim1357, above. He did say that he didn't expect this to run into the same issues that Anybot's edits did. The point, though, is that while there have been runs of adding information to Wikipedia like this in the past, that shouldn't be understood to imply carte blanche approval for continuing to do so. Wikipedia has grown a lot over the years, to the point where I'm not sure that these automated article additions are that helpful.
- Just brainstorming here, one possible way forward is for yourself and the rest of the project to manually create about a dozen of these articles. If they all survive for, say, 6 months, then that will go far in showing "hey, no one seems to have an issue with these articles on Wikipedia", if you see what I'm getting at.
— V = IR (Talk • Contribs) 03:21, 18 March 2010 (UTC)[reply]
- Thanks Ohms for looking for a way forward. Similar articles have been created manually and no one had any issue with them. Please browse through articles in the sub-categories of this category, Category:Gastropod families. Category:Acmaeidae for example. Regards, Ganeshk (talk) 03:34, 18 March 2010 (UTC)[reply]
This robot can start an article that is better in content, and more compatible with standards, than the typical manually started gastropod article. 1) This robot does things that humans usually do not: it uses an inline reference, which allows anybody to easily verify that the species really exists and is valid (important, because anybody can edit Wikipedia). 2) This robot properly adds the very complicated taxonomy of gastropods (there is a high risk of error when a non-expert or inexperienced Wikipedia editor starts such an article), and it also properly adds the authority of the species (this, too, can be a problem for a non-expert). 3) It works systematically and efficiently. Also note that gastropods are generally underestimated in knowledge, not only on Wikipedia but also in the scientific world. The project has fewer members for larger tasks than other wikiprojects; we usually have no time even to use images from Commons in some cases. There are, for example, fewer Wikipedia articles for Conidae than there are related categories on Commons(!). Help from a robot would be very helpful. There were some robots in the past that were considered unhelpful and some that were considered helpful. Are you able to learn from these lessons of the past, or will you refuse all of them (as The Earwig recommends; he starts the same articles [2] in eight unnecessary edits where Ganeshbot can easily do it in one), even if the robot will not break any Wikipedia policies and guidelines? This start of 650 stubs is really a very small part of the work that needs to be done. Maybe there are some fields on Wikipedia that no longer need help like this, but marine organisms are among those that, without bots, cannot keep up with professional encyclopedias and stay updated. --Snek01 (talk) 12:17, 18 March 2010 (UTC)[reply]
- I dunno... I wouldn't trust a "professional" encyclopedia if it were written by unmonitored automated accounts... –Juliancolton | Talk 14:17, 18 March 2010 (UTC)[reply]
- But that is not the case here. It is monitored. --Snek01 (talk) 14:49, 18 March 2010 (UTC)[reply]
- Is every article going to be manually verified for accuracy? –Juliancolton | Talk 15:39, 18 March 2010 (UTC)[reply]
- What "accuracy" do you mean?
- Whether or not the information presented is as factually correct as a human-generated piece of content. –Juliancolton | Talk 00:48, 19 March 2010 (UTC)[reply]
- The bot will only use data that was already downloaded from WoRMS. I have created, User:Ganeshbot/Conus data table that shows the data in wiki syntax. You can check the accuracy of the data using the species links. The bot generated content will be as good as the data that is hosted on WoRMS site (as of the download date). Regards, Ganeshk (talk) 03:11, 20 March 2010 (UTC)[reply]
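(A sketch of how such a review page can be generated from the same downloaded CSV. This is an illustration only, not the actual conversion used for that page.)

    # Illustrative only: render the downloaded CSV as wiki table syntax,
    # similar in spirit to the review page mentioned above.
    import csv

    def csv_to_wikitable(csv_path):
        """Return the CSV contents formatted as a MediaWiki table."""
        with open(csv_path, newline="", encoding="utf-8") as f:
            rows = list(csv.reader(f))
        lines = ['{| class="wikitable"', "! " + " !! ".join(rows[0])]
        for row in rows[1:]:
            lines.append("|-")
            lines.append("| " + " || ".join(row))
        lines.append("|}")
        return "\n".join(lines)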
- If you mean that we should verify that there is no mistake in the World Register of Marine Species, then no, we will not verify that immediately after article creation. Wikipedia:Verifiability: The threshold for inclusion in Wikipedia is verifiability, not truth. Even so, there has been an effort, with results, to identify errors in Conus in the World Register of Marine Species and its incompatibilities with other databases.
- If you mean that we should verify that Ganeshbot will not make an error, then it is the same as with any other bot. (Who verifies whether some bot is adding the proper DOI? A bot either works correctly or fails outright.) Any error of function will stop this bot.
- But yes, the accuracy of all generated species articles will be verified against the related, manually created genus article. And every article that is later edited by a user who adds an additional reference will be verified that way as well. This cannot be done immediately, but the intention is not to leave an article in the state of "this species exists", but to expand it. I expect (based on my experience) that the usual expansion will be: adding the distribution, adding at least the dimensions of the shells, and adding a wikilink to the species' author. --Snek01 (talk) 17:12, 18 March 2010 (UTC)[reply]
- On the other hand, to be sure that we check all bot-generated articles (especially if Ganeshbot is approved for other tasks), you could build something that alerts on any bot-generated article that has not been edited by any Wikipedian (or project member(?)) for a long time after its creation; see the sketch below. --Snek01 (talk) 17:12, 18 March 2010 (UTC)[reply]
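Such an alert is straightforward to build against the standard MediaWiki query API; a sketch follows. The creator name and the 50-revision cap are assumptions for the example.

    # Sketch of the suggested alert: flag pages whose entire history is still
    # the creating bot's edits. Uses the standard MediaWiki query API.
    import json
    import urllib.parse
    import urllib.request

    API = "https://en.wikipedia.org/w/api.php"

    def untouched_since_creation(title, creator="Ganeshbot"):
        """True if no one but the creating bot has ever edited the page."""
        params = urllib.parse.urlencode({
            "action": "query", "format": "json", "prop": "revisions",
            "titles": title, "rvprop": "user", "rvlimit": "50",
        })
        with urllib.request.urlopen(API + "?" + params) as resp:
            data = json.load(resp)
        page = next(iter(data["query"]["pages"].values()))
        revisions = page.get("revisions", [])
        return bool(revisions) and all(r["user"] == creator for r in revisions)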
As a way forward, can the BAG let me create the first 50 articles? The Gastro project can check and confirm that they are good. Once they are confirmed good, I can get the bot approved to complete the rest. I am hoping that will take care of the concerns here. Please advise. Thanks. Ganeshk (talk) 01:55, 19 March 2010 (UTC)[reply]
- Sure, but how about 20. Additionally, I want to see some input from members of the Gastropod wiki-project. Tim1357 (talk) 03:27, 20 March 2010 (UTC)[reply]
- Approved for trial (20 new pages). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Tim1357 (talk) 03:27, 20 March 2010 (UTC)[reply]
- Thanks Tim. I have completed the trial. Here are the 20 new articles. The bot skipped two existing articles, Conus africanus and Conus amadis. The only worrisome thing I found was that Conus anabathrum shows up fine when queried using the web services (User:Ganeshk/sandbox/Conus anabathrum), but when searched online it shows up as "in quarantaine, unverified" (screenshot). I hope the Gastro team has some explanation for this. I will ask the Gastro team to verify the results. Regards, Ganeshk (talk) 04:26, 20 March 2010 (UTC)[reply]
- Two things:
- Please, please, please, please... remove the {{expand section}} tags and empty sections. They look horrible.
- The category and stub template are in the wrong place. Should be like this; you can see what I mean here. See Wikipedia:Stub#How to mark an article as a stub.
- Otherwise, the article structure seems fine in practice, although I have yet to verify the actual data. Best. — The Earwig (talk) 04:59, 20 March 2010 (UTC)[reply]
- Thanks for your comments. The empty sections will be helpful for the person who is expanding the article next. How about a compromise? Two empty sections instead of three. See Conus alconnelli for example.
- I have fixed the category and stub structure on the template. Newer articles will have these changes. Regards, Ganeshk (talk) 18:19, 20 March 2010 (UTC)[reply]
- If I may add my comment as a member of WikiProject Gastropods: all the information present in each of these 20 bot-created stubs seems precise, according to the cited reference. The taxonomy follows Bouchet & Rocroi (2005), as it should. One thing comes to mind, though... Several species in these articles have synonyms (Conus amphiurgus, for example). This information can be verified in the article's reference; shouldn't it be present in the article's taxobox as well? If not, then WikiProject Gastropods members would have to add this information manually later, I presume.--Daniel Cavallari (talk) 06:09, 20 March 2010 (UTC)[reply]
- Daniel, I was able to pull the synonyms using web services. I ran the bot again to update the articles with the synonyms (See Conus ammiralis). Can you please give the articles one more check? Thanks. Ganeshk (talk) 20:41, 20 March 2010 (UTC)[reply]
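For reference, a sketch of a synonym lookup. Again, the bot itself used the WoRMS SOAP web service of the day; this assumes the later WoRMS REST endpoint AphiaSynonymsByAphiaID and its field names.

    # Sketch of the synonym lookup; assumes the later WoRMS REST endpoint
    # AphiaSynonymsByAphiaID rather than the SOAP service the bot used.
    import json
    import urllib.request

    def worms_synonyms(aphia_id):
        """Return the synonym names WoRMS lists for an AphiaID."""
        url = ("https://www.marinespecies.org/rest/AphiaSynonymsByAphiaID/"
               + str(aphia_id))
        with urllib.request.urlopen(url) as resp:
            body = resp.read()
        records = json.loads(body) if body else []   # empty body = no synonyms
        return [rec.get("scientificname") for rec in records]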
- Way to go, Ganesh! Now everything seems to be in order, as far as I can tell.--Daniel Cavallari (talk) 00:06, 21 March 2010 (UTC)[reply]
- Excellent. I think there should be <br/> instead of <br>. Like this [3]. --Snek01 (talk) 12:50, 21 March 2010 (UTC)[reply]
- Thanks. Okay. I will add the <br/> on the next runs. Regards, Ganeshk (talk) 13:39, 21 March 2010 (UTC)[reply]
ANI incident
Here I am trying to get this task approved, and another user picked up where I left off and created 85 new articles. Please check ANI. I have requested that these new articles be deleted quickly. Regards, Ganeshk (talk) 03:48, 21 March 2010 (UTC)[reply]
- The articles have been deleted now. Ganeshk (talk) 04:06, 21 March 2010 (UTC)[reply]
New ideas
[edit]"Predatory sentence" can be at Ecology section like this [4], but both ways are OK. --Snek01 (talk) 12:50, 21 March 2010 (UTC)[reply]
- I think it sounds right in the intro. Let us leave it at that. Thanks. Ganeshk (talk) 13:39, 21 March 2010 (UTC)[reply]
Next steps
Two project members have checked the 20 trial articles and verified them for accuracy; they found all of them correct. Can the BAG approve this task, please? Thanks. Ganeshk (talk) 15:38, 21 March 2010 (UTC)[reply]
- Trial complete.
- Ok Ganesh. A few of us in the BAG were concerned that the bot would finish making all these articles and the WikiProject would be swamped with new stubs. We want humans to be at least glancing at the new articles. With that in mind, I'll give you Approved., but with a limitation: you can only create 100 articles per month. This gives your WikiProject some time to look over the stubs. Hopefully this is acceptable to you; if not, please leave a message on WT:BAG. Thanks! —Preceding unsigned comment added by Tim1357 (talk • contribs) 04:12, 25 March 2010 (UTC)[reply]
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.