Wikipedia:Bots/Requests for approval/RscprinterBot 3
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Withdrawn by operator.
Operator: Rcsprinter123 (talk · contribs)
Time filed: 20:45, Sunday February 26, 2012 (UTC)
Automatic, Supervised, or Manual: Manual
Programming language(s): C#
Function overview: Updating Persondata templates
Links to relevant discussions (where appropriate): Wikipedia:Village Pump (proposals)#Persondata backlog done by bot, Wikipedia:Bot requests/Archive 45#Village Pump
Edit period(s): Daily
Estimated number of pages affected: 611000 +
Exclusion compliant (Y/N): Y
Already has a bot flag (Y/N): Y
Function details: This bot will add parameters to the Persondata template on every article in Category:Persondata templates without short description parameter and Category:Persondata templates without name parameter. I need approval only because, although I am adding the description myself, the edit rate will be so fast that I can't really use my main account for it.
It will be done through Persondata-o-matic. On the name parameter, the program will not need any human intervention and it will be able to run very quickly. The matter is quite an urgent thing, as the persondata categories above are the biggest backlogs in Wikipedia at present.
Discussion
Doesn't AWB add the name parameter? This plus this? This looks like a minor edit that can be done automatically with other edits, and the backlog is a mere 9k. So this doesn't fall under "urgent thing" and the discussion is about the description, not the name.
Regarding descriptions, what are the methods by which you choose what description to add? Although BAG doesn't need too many details since this is a manual task. Nevertheless, I should stress that even as a manual task there cannot be more than a handful of false positives for this to be approved as a bot task. — HELLKNOWZ ▎TALK 21:03, 26 February 2012 (UTC)[reply]
- Descriptions are added easily through Persondata-o-matic: there is a very easy interface that shows you all the information; I speed-type the description in and press enter, the bot makes the edit, and I do that over and over again, resulting in far too many edits for there not to be a bot account doing them. Names are just a little extra included in the system that may as well be done at the same time, and even if I strike "urgent thing", there was still a village pump discussion about it, and I wouldn't call it a cosmetic bot if it's helping with a large backlog. Rcsprinter (rap) (Contribs) (Not Rcs) 21:11, 26 February 2012 (UTC)
- I'm just pointing out that you are bundling the two tasks in a way that implies that the urgency of the name parameter is the same as that of the description parameter. The VP discussion is about descriptions, and the backlog for names is 9k; it is hardly that large compared to others. All I'm saying is that "urgent" and "large backlog" and "discussion" don't really apply to the names as your function/consensus description would suggest. Names can be done at the same time, no problem; in fact it's preferred. I'm wondering about doing them alone. That's a "cosmetic" change that can be done by genfixes, and there is no real need to do them alone. — HELLKNOWZ ▎TALK 21:21, 26 February 2012 (UTC)
Anyway, Approved for trial (100 edits adding description parameter). Please provide a link to the relevant contributions and/or diffs when the trial is complete. (Not for names alone pending some more BAG input on this.) Let's call it a technical trial pending further comments. There seems to be rough consensus for this, and the task (though invisible to most end users) seems somewhat urgent (if "backlog" of a lack of a specific metadata parameter is any indicator of that). — HELLKNOWZ ▎TALK 23:30, 26 February 2012 (UTC)[reply]
- Just to clarify a couple of comments above: AWB does not add the short description, but it usually does add the name, although I have seen exceptions from time to time. A quick question:
- If the operator is typing in the description, that seems to be an avenue for error due to misspellings and such?--Kumioko (talk) 01:17, 27 February 2012 (UTC)
- Thanks for the clarification, that's what I thought. Re #1: The operator will have to demonstrate that they do not make spelling errors and such. If they do, then they ought to edit slower and more carefully. That's why we have a trial. — HELLKNOWZ ▎TALK 09:39, 27 February 2012 (UTC)
- I have offered to do a bot to add the description parameter only for sports people. Sports people are usually just sports people, and their sport can be easily defined based on category, whereas a politician could also be a military person, or an artist could be both a painter and a sculptor. Removing sports people would greatly lower the number of outstanding articles to do and lower the spelling errors. I had been waiting until my current bot request goes through, but will submit it now so Rcsprinter can start. Bgwhite (talk) 10:17, 27 February 2012 (UTC)
The request to clean up Category:Persondata templates without name parameter should not be granted. The majority of these articles also don't have DEFAULTSORT set. There are a lot of Arabic, Asian and other "weird" names that should only be added by a person who understands name sorting rules. This is something that can't be done quickly. Also, the listas value on the talk page is almost always wrong if it is a "weird" name. Bgwhite (talk) 10:36, 27 February 2012 (UTC)
- I concur; I've encountered these problems in the past and do not think a bot can adequately deal with them. — madman 14:43, 28 February 2012 (UTC)
Trial
Trial complete. 100 edits there and firm. Only took 25 minutes too. Rcsprinter (message) (Contribs) (Not Rcs) 17:28, 27 February 2012 (UTC)
Why has the bot removed "<!-- Metadata: see [[Wikipedia:Persondata]]. -->" from the template? This wasn't in the function details. As far as I know there is no consensus to remove that, and it's helpful to those unaware of what this is. — HELLKNOWZ ▎TALK 19:40, 27 February 2012 (UTC)
- Per this discussion, AWB was changed to not add the "<!-- Metadata: see [[Wikipedia:Persondata]]. -->" comment. There is also mention of Persondata-o-matic adding code to remove the comment when it encounters the code. Bgwhite (talk) 20:12, 27 February 2012 (UTC)[reply]
- I see, this wasn't mentioned anywhere so I asked. Thanks for clarifying. I guess it's OK to remove then. — HELLKNOWZ ▎TALK 20:22, 27 February 2012 (UTC)[reply]
- No problem. It was a recent event and with so many places to change things, it is hard to keep up. Bgwhite (talk) 20:44, 27 February 2012 (UTC)[reply]
What are your criteria for including nationality in the short description? Wikipedia:Persondata#Short_description did not mention that and I assumed that's the guideline you would use.
- I'm not sure of any criteria. But, I've generally seen when a person works for a government, the nationality is added. For example, politicians, ambassadors, military personnel. For U.S. states, I've seen both "American" and the state being used. After that, I personally don't add it, but you got me on what exactly to do. Bgwhite (talk) 20:49, 27 February 2012 (UTC)[reply]
- [1] He was a painter, not a generic artist.
- [2] Do we use colloquialisms like "Children's author"? I would have said "Writer of children's literature", like the lead says.
- [3][4] Not capitalized.
- [5] He was also a humorist.
- [6][7] and a few more. You used "&" instead of "and". I'm sure we would prefer spelling out words per WP:&.
- [8] "Clan leader" is surely more important than just "samurai".
- [9] typo in "English"
- [10] "Leader" is extremely vague, it does say "ruler of Tunisia" in the article.
- [11] Is being a "consort" really an important point?
- [12] You can specify as "film" actress
- [13] "Illustrator" seems just as significant if not more than just "author".
- [14] Is University degree more significant than being Editor-in-Chief of a journal?
- [15] You can be more specific as "president of LUKOIL" than just "businessman".
- You use both "footballer" and "football player". Although largely synonymous, what is your reasoning?
- [16] Can be more specific as "painter".
- [17] Judging from article, him being a writer is at least as important as being a doctor.
- [18] I think "singer and mandolinist" can be used instead of generic "musician"
- [19] Surely Countess of Holland is more important than being a noble.
- [20] He was also a statesman which is also prominent.
- [21] You could have specified "painter and sculptor" instead of generic "artist"
- [22] "bin Laden" is not a full name. In fact I would say "lead an Egyptian militant group" is more important.
- [23] I think it could have been clarified as "comic book" cartoonist.
- [24][25] should be gender neutral. Actor instead of Actress. Businessperson instead of Businessman or Businesswoman. Bgwhite (talk) 20:16, 27 February 2012 (UTC)
- I didn't realize those have to be gender neutral. Any reason why? — HELLKNOWZ ▎TALK 20:33, 27 February 2012 (UTC)[reply]
- Not a strict guideline per se, but a "should" suggestion per this MOS section. Bgwhite (talk) 20:44, 27 February 2012 (UTC)[reply]
- I see. I would say since we know the gender in these cases, it's not an issue? This isn't a generic description which MOS is about, rather pointed at a specific person. It does say gender-neutral language doesn't apply to "..wording about one-gender contexts..". — HELLKNOWZ ▎TALK 20:52, 27 February 2012 (UTC)[reply]
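Two of the more mechanical points raised in the review above (use of "&" instead of "and", and descriptions that are not capitalized) could be checked before saving an edit. The following is only a minimal sketch under those assumptions: the function name `lint_description` is hypothetical, it is not part of Persondata-o-matic, and it does not attempt the judgment calls (such as "painter" versus "artist") that make up most of the list.

```python
def lint_description(description: str) -> list[str]:
    """Return a list of mechanical style problems found in a short description.

    Hypothetical helper: it covers only two points from the trial review,
    the "&" shorthand and a missing initial capital.
    """
    problems = []
    if "&" in description:
        problems.append('uses "&" instead of "and"')
    if description and description[0].islower():
        problems.append("does not start with a capital letter")
    return problems


# Example: both problems are reported for this input.
print(lint_description("painter & sculptor"))
```

A check like this would only catch the formatting slips; whether a description is specific or significant enough still needs a human reviewer.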
- I shall incorporate all these suggestions and criticism into further runs. What is the next step from here? Rcsprinter (yak) (Contribs) (Not Rcs) 21:15, 27 February 2012 (UTC)
- I'm concerned that this may not be best done by a single manual bot. 200 edits/hour over 611K articles is 3000 hours; at 10 hours a day that's nearly a year. And even at that rate, in the first trial nearly a quarter of the selected summaries were queried. Perhaps some kind of consensus/voting robot running over on Toolserver would produce higher edit rates and better accuracy? Josh Parris 21:47, 27 February 2012 (UTC)[reply]
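The estimate in the comment above can be reproduced with a short calculation. This is only a sketch: the figures (611,000 articles, 200 edits per hour, 10 hours per day) are the ones quoted in the comment, and the function name `days_to_clear` is hypothetical.

```python
import math


def days_to_clear(backlog: int, edits_per_hour: float, hours_per_day: float) -> tuple[float, int]:
    """Return (total_hours, whole_days) needed to clear a backlog at a fixed rate."""
    total_hours = backlog / edits_per_hour
    whole_days = math.ceil(total_hours / hours_per_day)
    return total_hours, whole_days


hours, days = days_to_clear(611_000, 200, 10)
print(f"{hours:.0f} hours, about {days} days")  # prints "3055 hours, about 306 days"
```

At the trial's observed pace of 100 edits in 25 minutes (240 per hour) the same calculation gives roughly 255 days, so the conclusion of "nearly a year" for a single operator does not change much.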
- I've requested a bot, BG19bot 3, that will go through the sports people. They are probably the easiest to do... They usually play one sport and usually don't become notable for something else. There are ~45,000 footballers without the description parameter. So, I'm guessing 150,000 articles will be done by the bot.
- There has to be some manual intervention. For example, a person who is a sculptor and painter. A person who is a poet and writer or a Film director and screenwriter. The artists are going to be the hardest. Perhaps, allow Rcsprinter to do just the artists for now. Hopefully a better way of doing the rest will be found.
- I'm not too worried about the mistakes above. A couple hundred more edits and Rcsprinter should get the hang of it. Bgwhite (talk) 09:59, 28 February 2012 (UTC)[reply]
- I understand the good intention, but that is not how bot approvals work. We cannot approve a bot account even with 1% error rate. What "manual" task means is that there would be no errors at all, because a human would resolve the cases. Bot account means no one needs to review the edits. A BRFA and a trial (among other things) are to prove that no one needs to review the edits. If we need to wait until the operator "get[s] the hang of it", then we will have to do that many trials. — HELLKNOWZ ▎TALK 10:08, 28 February 2012 (UTC)[reply]
- Indeed. I'm unsure as to why this request is open at all, except to add a task to RcsprinterBot. A manual task does not need approval unless it's being run at high speed; this task could be run under the operator's account. Additionally, I think that no manual task is going to address the sheer size of the backlog that prompted this request. The community discussion regarded either crowd-sourcing the backlog or having a bot run on very narrow subsets of articles (e.g., if an article has (footballer) in the title, it's likely that footballer is an appropriate short description). Personally, I'd advise withdrawing this request and considering better options. — madman 14:43, 28 February 2012 (UTC)[reply]
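The narrow-subset idea mentioned above (an article with "(footballer)" in its title probably wants "footballer" as its short description) can be sketched as follows. Everything here is an illustrative assumption: the function name, the regular expression, and the small allow-list of occupations are hypothetical and are not taken from any existing bot.

```python
import re
from typing import Optional

# Matches a trailing parenthetical disambiguator, e.g. "John Smith (footballer)".
_DISAMBIGUATOR = re.compile(r"\(([^()]+)\)\s*$")

# Hypothetical allow-list: only propose a description when the disambiguator
# is a known, unambiguous occupation.
KNOWN_OCCUPATIONS = {"footballer", "cricketer", "painter"}


def suggest_description(title: str) -> Optional[str]:
    """Suggest a short description from the title's disambiguator, or None."""
    match = _DISAMBIGUATOR.search(title)
    if match is None:
        return None
    candidate = match.group(1).strip().lower()
    # Anything outside the allow-list is left for a human to decide.
    return candidate if candidate in KNOWN_OCCUPATIONS else None


print(suggest_description("John Smith (footballer)"))  # prints "footballer"
print(suggest_description("John Smith"))               # prints "None"
```

Returning `None` for anything outside the allow-list keeps the automated part limited to cases where the title alone is decisive; capitalization and nationality conventions are deliberately left out of this sketch.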
- re Rcsprinter123: Per BOTPOL: "The bot operator is responsible for reviewing the edits and repairing any mistakes caused by the bot.". Some of the issues above are subjective, but many should definitely be fixed. If you don't think they need correction, then you may need to consult another BAG member to review them. — HELLKNOWZ ▎TALK 10:08, 28 February 2012 (UTC)[reply]
- I have fixed the most severe of the diffs listed above, but don't want to withdraw yet as I think it could still work. Madman, I think I have already explained at the top that the purpose of having a BRFA is because of how many edits it will make. Thousands a day.
Now, I've thought of a way of doing things if there was a further trial. I could do just one type of person at a time (though not sportspeople, as Bgwhite is doing those), say artists, using this, and see if I get better results that way and a bit less than a 25% error rate. What do we think? Rcsprinter (speak) (Contribs) (Not Rcs) 17:15, 28 February 2012 (UTC)
- For manual tasks the allowed error rate is 0%. As above, you don't need a bot account for manual tasks. You can create a dedicated account if you don't want the edits on your main one. — HELLKNOWZ ▎TALK 18:01, 28 February 2012 (UTC)
← Rcsprinter, from things above.
1. It sounds like doing this under your own account or a dedicated account is best.
2. Do one category at a time. Most of the descriptions would be the same, and it becomes easier to ascertain what the description should be.
3. I don't believe the "Manual = 0% error rate", because nobody is perfect. (I always say the day I didn't make an error on Wikipedia is a day I didn't edit on Wikipedia. Strangely, my wife says I'm always wrong.) The error rate should be as close to 0% as possible.
4. From your own account or a dedicated account, you do 50 edits. I'll look over them and see what is right or wrong. Rinse and repeat until I think you have the hang of it, then go full bore at it. If you aren't comfortable with me, I know User:Waacstats deals a lot with the description parameter and he may help you. Bgwhite (talk) 21:42, 28 February 2012 (UTC)
- Re 3. You can make as many errors as you want when editing outside BRFA approval. When it's for a bot account, we expect no errors because bots are expected to have no errors. Sure, it's never 0% and things no one anticipated slip in. But if the task is manual, then the operator is finding all these. So it would essentially be 0. Otherwise, why would you need a bot account if the task requires so much human judgment that it's basically all-human? Hence the suggestion for a regular alternative account. — HELLKNOWZ ▎TALK 21:52, 28 February 2012 (UTC)[reply]
- Withdrawn by operator. I give in. Bgwhite, I'll accept your offer. Contact me on my talk to sort something out. Rcsprinter (gas) (Contribs) (Not Rcs) 16:27, 29 February 2012 (UTC)
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.