Wikipedia:Bots/Requests for approval/Sumibot
- The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.
Operator: Sumanth
Automatic or Manually Assisted: Semi-automatic, supervised
Programming Language(s):
Function Summary: Creation of articles for villages/towns in India using AWB
Edit period(s) (e.g. Continuous, daily, one time run): Intermittent. One or two hours a day.
Edit rate requested: 10 edits per minute (peak rate)
Already has a bot flag (Y/N):
Function Details: The bot uses AWB. The operator loads a list of articles to be created from a file; only titles for which no page exists yet are included in the list. The text for the articles, consisting of the appropriate references, categories, etc., is loaded into the "append" field of AWB. This example shows how the bot would work.
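The workflow described above can be sketched in Python as a rough illustration. This is not the operator's actual method (the bot uses AWB, not a script); the function names, the stub wording, and the category pattern are all hypothetical, and the list of existing titles is assumed to be supplied by the caller.

```python
from string import Template

# Hypothetical stub text. The real text loaded into AWB's "append" field
# is not reproduced in this request, so this wording is an assumption.
STUB = Template(
    "'''$name''' is a mandal in $district district, $state, India.\n\n"
    "[[Category:Mandals in $district district]]\n"
)


def titles_to_create(candidates, existing_titles):
    """Keep only candidate titles that have no page yet.

    Order is preserved; blank entries and repeated titles are dropped.
    """
    existing = set(existing_titles)
    seen = set()
    result = []
    for title in candidates:
        title = title.strip()
        if not title or title in existing or title in seen:
            continue
        seen.add(title)
        result.append(title)
    return result


def build_pages(candidates, existing_titles, district, state):
    """Map each new title to the stub text that would be appended to it."""
    return {
        title: STUB.substitute(name=title, district=district, state=state)
        for title in titles_to_create(candidates, existing_titles)
    }
```

Calling `build_pages(["Pendurthi"], [], "Visakhapatnam", "Andhra Pradesh")` would return one entry keyed by the title. Filtering against existing titles before any text is generated is the step that keeps text from being appended to a page that already exists.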
Discussion
Where is the data for the pages coming from? Betacommand (talk • contribs • Bot) 21:52, 6 March 2007 (UTC)
- I gather data from Indian government websites. The sources are the Ministry of Panchayati Raj and the National Informatics Center. --(Sumanth|Talk) 03:38, 7 March 2007 (UTC)
IIRC, many of the Italian municipality pages were started this way. My understanding is that you have one program that generates Wikipedia pages, but you want another (the bot) that can automatically post them to Wikipedia. Is that correct? If so, can you manually post one in your userspace as a demo? --Selket Talk 19:04, 7 March 2007 (UTC)
- I generate the list of villages/towns that can be grouped together (e.g. in the same Division) and the content for the pages semi-automatically (with the help of macros in my text editor). I then store the list in a text file and use that as input to the "make list" option in AWB. I add the appropriate text in the "append" field and then create the page. See this example in my user page: User:Sumanthk/sandbox/Pendurthi. I can use the text shown in this example for all villages which can be grouped under Mandals in Visakhapatnam district. For other groups I just need to replace details such as state, district and division. --(Sumanth|Talk) 04:15, 8 March 2007 (UTC)
- Just FYI, a discussion about the templates to be used can be found here. --(Sumanth|Talk) 12:02, 12 March 2007 (UTC)
- Can you show us a full example on 5 or 10 articles? Thanks. --METS501 (talk) 16:30, 17 March 2007 (UTC)
- I've created a few pages under the following categories: Mandals in Prakasam district and Mandals in Chittoor district. --(Sumanth|Talk) 04:37, 21 March 2007 (UTC)
- Looks real nice. The only concern I have is about the size of these places. Will enough reliable sources be available to turn these into decent articles? If not, could you consider restricting page creation to the more sizable communities? Pedda Raveedu, for example, returns only 2 Google hits! --kingboyk 00:10, 23 March 2007 (UTC)
- I will echo kingboyk; it would be nice to restrict this to larger communities only. --Eagle101 Need help? 03:04, 23 March 2007 (UTC)
- These places are at the sub-district level and one can definitely find reliable sources, barring a few exceptions. These exceptions would mostly be due to confusion over the English names of the places. Pedda Raveedu is the name listed in the Census of India, but the name on the Panchayat website is Peda Araveedu and the name on the state government website is Pedaaraveedu, which returns more Google hits. I can compare the data from the various sources before creating the article, use the name which is more likely (more Google hits) and redirect pages with alternate names. --(Sumanth|Talk) 04:11, 23 March 2007 (UTC)
- That would be handy. --Eagle101 Need help? 04:46, 23 March 2007 (UTC)
- Yes, that would be a lot better. If you could create a few examples using that scheme and report back again it would be great; thank you. --kingboyk 13:07, 23 March 2007 (UTC)
- Please find a few examples here. I manually resolved the confusion in name for the following pages: Naidupet, Podalakur, Ojili. I also had to disambiguate a couple of pages: Udayagiri, Nellore district and Sangam, Nellore district. --(Sumanth|Talk) 09:04, 26 March 2007 (UTC)
- Sounds good. Snowolf (talk) CON COI - 23:12, 1 April 2007 (UTC)
Approved for trial. Please provide a link to the relevant contributions and/or diffs when the trial is complete. Try an extended trial (100 articles or so). We can be more lax about this because (I think) most of the data that will be submitted is checked, right? Also, please remove the mound of white space at the top of the article. AWB's general fixes will do this, or you can do it another way. --METS501 (talk) 17:32, 5 April 2007 (UTC)
- Any news about the trial? Actually the bot has made some edits on April 12, but no more than 45. Snowolf (talk) CON COI - 16:48, 19 April 2007 (UTC)
- I will complete the trial within a week. Busy with my work right now. --(Sumanth|Talk) 03:31, 20 April 2007 (UTC)
- Trial run completed. Please see this. There were a couple of mistakes which I corrected manually (see Nagari and Puttur). The bot incorrectly appended text to already existing disambiguation pages. I've requested a new AWB feature which will help in avoiding such errors. I will check the category count after each run to catch such mistakes. --(Sumanth|Talk) 12:54, 27 April 2007 (UTC)
- Sorry for the delay. Approved. ST47Talk 13:41, 12 May 2007 (UTC)
- The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.