Wikipedia:Bots/Requests for approval/LaraBot
- The following discussion is an archived debate. Please do not modify it. Subsequent comments should be made in a new section. The result of the discussion was Approved.
Operator: MZMcBride
Automatic or Manually Assisted: Automatic
Programming Language(s): Python (wikitools)
Function Overview: Warning editors who create unreferenced biographies of living people.
Edit period(s): Daily
Already has a bot flag (Y/N): No
Function Details: The script queries the Toolserver's copy of enwiki_p.recentchanges and finds all new pages in Category:Living people created on a specific day (usually one day prior to the current date). It checks the text of each page for ==References==, ==Further reading==, ==Bibliography==, <ref, or http://. If it finds none of these, it substitutes {{Unreferenced BLP warning}} on the creator's talk page. The list is output to Wikipedia:Database reports/Recently-created unreferenced biographies of living people for tracking and review.
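The marker check described above could be sketched roughly as follows. This is a minimal illustration, not the bot's actual code: the function name `looks_referenced`, the regular expressions, and the case-insensitive matching are all assumptions, and the sketch is written for Python 3 rather than whatever version the bot ran on.

```python
import re

# The five markers come from the function details; the exact patterns
# (optional whitespace, case-insensitivity) are assumptions for this sketch.
REFERENCE_PATTERNS = [
    re.compile(r"==\s*References\s*==", re.IGNORECASE),
    re.compile(r"==\s*Further reading\s*==", re.IGNORECASE),
    re.compile(r"==\s*Bibliography\s*==", re.IGNORECASE),
    re.compile(r"<ref", re.IGNORECASE),
    re.compile(r"http://", re.IGNORECASE),
]


def looks_referenced(wikitext):
    """Return True if the page text contains any of the reference markers."""
    return any(pattern.search(wikitext) for pattern in REFERENCE_PATTERNS)
```

A page for which `looks_referenced` returns `False` would be a candidate for the talk-page notice and the report.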
Example biographies from June 3.
- Italian_Mafia_DJ
- Ron_Singleton
- Violeta_Isfel
- Liam_Bowman
- Frederick_Ponsonby,_4th_Baron_Ponsonby_of_Shulbrede
- Mostafa_Azizi
- Jason_Taylor_(guitarist)
- Sally_Greengross,_Baroness_Greengross
- Yogi_khari
- Danny_Sanderson
- Javad_Razzaghi
- Jordan_Schroeder
- Bill_Belk
I may need to include a check for ==External links==. Thoughts on this would be appreciated.
Discussion
- There is a good chance that people who create unreferenced BLP articles are too inexperienced to add categories like Category:Living people in the first place. Those categories are usually added by other Wikipedians later. Just a thought! -- Tinu Cherian - 08:45, 4 June 2009 (UTC)
- Yes, that's one of the advantages of getting the pages a day later. (Hopefully others will have tagged the pages by then.) I realize I'll still miss some biographies, but without manually reviewing every new page, there's no reliable way to detect whether it's a biography of a living person or not without the category. --MZMcBride (talk) 08:57, 4 June 2009 (UTC)
- OK, I agree. I think the presence of an "External links" section should also be checked, since references are often added as an external links section. -- Tinu Cherian - 09:02, 4 June 2009 (UTC)
- A few of these have infoboxes with links to NFL.com, which appears to be a valid reference. —Snigbrook 12:17, 4 June 2009 (UTC)
- And IMDB, through a template. If you can weed those out, I think this would be a great task. – Quadell (talk) 14:34, 4 June 2009 (UTC)
- Yes, I saw those. I've been debating in my mind whether or not those count as references. I suppose there's agreement that they do? It means that I'll just pull all external links when I get the list and I'll exclude pages that contain non-"en.wikipedia.org" links (to avoid "expand this" in stub links, etc. which ruin any queries for pages without any external links). Does that sound reasonable? --MZMcBride (talk) 16:10, 4 June 2009 (UTC)
- I think they could count as references, and I think your solution is reasonable. – Quadell (talk) 17:36, 4 June 2009 (UTC)
June 4 results:
No. | Biography |
---|---|
1 | Alejo_Sauras |
2 | Zdravko_Mitev |
3 | Dimitar_Bosnov |
The script now checks for ==External links== and does a check for "true" external links. Only 3 results out of 112 new BLPs for 2009-06-04. --MZMcBride (talk) 02:13, 5 June 2009 (UTC)
Approved for trial (5 days). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Okay, take it away MZ! – Quadell (talk) 02:20, 5 June 2009 (UTC)
- Wheeeeeeee. :-) --MZMcBride (talk) 02:36, 5 June 2009 (UTC)
- Should you be using {{uw-unsourced1}}? – Quadell (talk) 03:01, 5 June 2009 (UTC)
- Maybe. I sort of hate the user warning templates. They're not very friendly, they include images and far too much text, etc. So unless someone cares, I'd prefer to use my own variant. Reading through {{uw-unsourced1}} a few times, it sounds very terse and belittling.... --MZMcBride (talk) 03:22, 5 June 2009 (UTC)
- It's good to be friendly, and it's fine to use your own message. But I'd recommend an amalgamation of the two, something like "Hi! It seems you recently created an unreferenced [[Wikipedia:Biographies of living persons|biography of a living person]]: '''[[article name]]'''. Our [[Wikipedia:Verifiability|verifiability policy]] requires that all content be [[Wikipedia:Citing sources|cited]] to a [[Wikipedia:Reliable sources|reliable source]]. Please add references as soon as possible. Thanks!" That at least tells them why, and gives them links to the policies if they want to read them. – Quadell (talk) 12:50, 5 June 2009 (UTC)
- So fix it. :P --MZMcBride (talk) 19:43, 5 June 2009 (UTC)
Hit a Unicode error this evening that prevented the bot from posting the list. I added a nag_users parameter and made a few minor adjustments. Hopefully that fixed the issue. --MZMcBride (talk) 04:44, 7 June 2009 (UTC)
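One way to read the "true" external links check is sketched below: a page counts as having an external link only if at least one of its link targets points somewhere other than en.wikipedia.org. This is an assumption-laden illustration, not the bot's code. The function name `has_true_external_link` and the `own_host` parameter are hypothetical, and the list of URLs is assumed to have been fetched already (for example, from the page's external links).

```python
from urllib.parse import urlparse


def has_true_external_link(urls, own_host="en.wikipedia.org"):
    """Return True if any URL points to a host other than own_host.

    Links back to the wiki itself (such as the "expand this" edit links
    that stub templates add) are ignored.
    """
    for url in urls:
        host = urlparse(url).hostname
        if host and host.lower() != own_host:
            return True
    return False
```

Under this reading, a page whose only links are stub-template edit links would still be reported, while a page with an infobox link to NFL.com or IMDB would be skipped.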
- Is there a particular code issue causing the bot to post the message more than once for the same article, like this? -- Tinu Cherian - 06:38, 8 June 2009 (UTC)
- (edit conflict) More Unicode errors (similar ones to the previous ones). It's a shell issue with Unicode, not a problem with my script. It runs fine from the command line, but cron + shell + Unicode → death. Anyway, I prevented the death from occurring and implemented some better checks to ensure I don't hit a user page multiple times if I run the script manually when it died previously midway through. (And I prevented updating the report page more than once for the same set of data.) Everything seems to be working well now. We'll see what happens when it runs again in about 20 hours. --MZMcBride (talk) 06:41, 8 June 2009 (UTC)
- Sounds good! -- Tinu Cherian - 06:53, 8 June 2009 (UTC)
This seems to be working as expected now. I had one (theoretical) complaint about templating regulars, but everything else seems to be in perfect working order. --MZMcBride (talk) 21:32, 11 June 2009 (UTC)
Approved. Thanks for doing this! – Quadell (talk) 00:37, 12 June 2009 (UTC)
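The discussion does not say how the duplicate-notice check was implemented. One common approach, sketched here purely as an assumption, is to record each (user, article) pair once it has been notified and to consult that record before posting, so that a manual rerun after a crash skips work already done. All names below (`load_notified`, `save_notified`, `notify_once`, the JSON state file) are hypothetical.

```python
import json
import os


def load_notified(path):
    """Load the set of (user, title) pairs that were already notified."""
    if not os.path.exists(path):
        return set()
    with open(path, encoding="utf-8") as f:
        return {tuple(pair) for pair in json.load(f)}


def save_notified(path, notified):
    """Persist the notified pairs so that a later run can skip them."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(sorted(notified), f, ensure_ascii=False)


def notify_once(user, title, notified, send):
    """Call send(user, title) unless this pair was already handled.

    Returns True if a notice was sent, False if it was skipped.
    """
    key = (user, title)
    if key in notified:
        return False
    send(user, title)
    notified.add(key)
    return True
```

Saving the state after each successful notice, rather than once at the end of the run, would keep the record accurate even if the script dies midway through.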
- The above discussion is preserved as an archive of the debate. Please do not modify it. Subsequent comments should be made in a new section.