Wikipedia:Bots/Requests for approval/MusikBot 17
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at Wikipedia:Bots/Noticeboard. The result of the discussion was Approved.
Operator: MusikAnimal
Time filed: 17:43, Tuesday, March 16, 2021 (UTC)
Function overview: Configurable task that adds the size of categories to JSON pages so that they can be used to create charts like {{articles lacking sources chart}}
Automatic, Supervised, or Manual: Automatic
Source code available: GitHub
Links to relevant discussions (where appropriate): Special:Permalink/1011787521#Proposal for bot to populate the size of categories over time, Special:Permalink/1013656861#Graph
Edit period(s): Daily
Estimated number of pages affected: 2 or 3 to start with, over time up to many hundreds or thousands
Namespace(s): Template (as a subpage of a chart), but potentially also Category and Wikipedia namespaces
Exclusion compliant (Yes/No): No
Function details: This is an expansion of MusikBot 16. The idea is for a configurable bot that keeps track of the size of categories over time, such that you can create charts like {{articles lacking sources chart}}. You can then display these at the top of the category page or in a WikiProject, for example to invite editors to help reduce backlogs and to show off your progress.
The bot will go off of a sysop-protected JSON configuration page. I've created User:MusikBot/CategoryCounter/config as an example. Each entry in the config should have the category name, granularity ("daily", "weekly" or "monthly"), and the title of wherever the data should be stored (probably a subpage of the chart template). There is also a 'cutoff' option, which specifies the number of days after which the dataset should be truncated (so older data is removed as new data is added). This may be necessary especially for the "daily" granularity because after several years the dataset can become too large.
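As a rough illustration of the configuration described above, the following Python sketch parses a list of entries and applies the 'cutoff' truncation. The key names (`category`, `granularity`, `dataset`, `cutoff`) and the date-to-count layout of the dataset are assumptions made for this example; they are not taken from the actual config page or from the bot's source code.

```python
import json
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Dict, List, Optional

GRANULARITIES = {"daily", "weekly", "monthly"}


@dataclass
class Entry:
    category: str                 # category name, without the "Category:" prefix
    granularity: str              # "daily", "weekly" or "monthly"
    dataset: str                  # title of the page that stores the data
    cutoff: Optional[int] = None  # days of history to keep; None keeps everything


def parse_config(text: str) -> List[Entry]:
    """Parse a JSON list of entries (hypothetical key names) and validate them."""
    entries = []
    for raw in json.loads(text):
        if raw.get("granularity") not in GRANULARITIES:
            raise ValueError(f"bad granularity for {raw.get('category')!r}")
        entries.append(
            Entry(raw["category"], raw["granularity"], raw["dataset"], raw.get("cutoff"))
        )
    return entries


def truncate(points: Dict[str, int], cutoff: Optional[int], today: date) -> Dict[str, int]:
    """Drop data points older than `cutoff` days; keys are assumed to be ISO dates."""
    if cutoff is None:
        return dict(points)
    oldest = today - timedelta(days=cutoff)
    return {d: n for d, n in points.items() if date.fromisoformat(d) >= oldest}
```

With a cutoff of 30 and a run on 9 April 2021, a point dated 1 March 2021 would be dropped and one dated 1 April 2021 would be kept.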
The bot would go off of each category to populate another JSON page with the data, for example Template:Articles lacking sources chart/data. You can then make charts that display this data, such as with {{articles lacking sources chart}}. The bot won't create the chart for you, only populate the data; however, there will be thorough documentation on how to set it all up.
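One way to picture the "populate the data" step is a function that adds a single data point per period, depending on the granularity. This is only a sketch under assumptions: the dataset is modelled as a mapping from ISO date to count, and `period_key` and `record` are hypothetical helper names, not functions from the bot.

```python
from datetime import date
from typing import Dict, Tuple


def period_key(granularity: str, day: date) -> Tuple[int, ...]:
    """Identify the day, ISO week or month that a date belongs to."""
    if granularity == "daily":
        return (day.year, day.month, day.day)
    if granularity == "weekly":
        iso = day.isocalendar()
        return (iso[0], iso[1])  # ISO year and ISO week number
    if granularity == "monthly":
        return (day.year, day.month)
    raise ValueError(f"unknown granularity: {granularity!r}")


def record(points: Dict[str, int], granularity: str, day: date, size: int) -> Dict[str, int]:
    """Return a copy of the dataset with `size` added for `day`,
    unless the same period already has a data point."""
    seen = {period_key(granularity, date.fromisoformat(d)) for d in points}
    updated = dict(points)
    if period_key(granularity, day) not in seen:
        updated[day.isoformat()] = size
    return updated
```

Skipping a period that already has a point keeps the dataset at one entry per day, week or month even if the job is run more than once.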
Performance-wise, the category count query is pretty cheap, so I expect this task to scale to support many hundreds if not thousands of categories before we run into any problems. If the bot ever does break or stops running, the dataset pages can still be manually edited (by anyone) and the chart will update accordingly. I propose restricting editing of the bot's config page to sysops, as this can help prevent additions of very small insignificant categories that unnecessarily strain the system.
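For a sense of why the count is cheap, here is a sketch that reads a category's size through the MediaWiki Action API's `prop=categoryinfo` module, which returns stored counts rather than enumerating the members. Whether the bot uses this API or another route (for example a database query) is not stated in this request, and the choice of the `pages` count over `size` is likewise an assumption.

```python
from urllib.parse import urlencode

API = "https://en.wikipedia.org/w/api.php"


def categoryinfo_url(category: str) -> str:
    """Build a categoryinfo query URL for one category."""
    params = {
        "action": "query",
        "prop": "categoryinfo",
        "titles": f"Category:{category}",
        "format": "json",
        "formatversion": "2",
    }
    return f"{API}?{urlencode(params)}"


def category_size(response: dict) -> int:
    """Extract the page count from a decoded formatversion=2 response.
    A missing or empty category has no categoryinfo block, so fall back to 0."""
    page = response["query"]["pages"][0]
    return page.get("categoryinfo", {}).get("pages", 0)
```

The response for a single title is one small JSON object, so parsing stays trivial however large the category is.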
Discussion
- Note: MusikBot 16 is still in trial at the time of writing. I'm getting a head start on this BRFA as I suspect there will be a fair amount of discussion, but a trial probably shouldn't start until MusikBot 16 is approved (because this task will replace it). — MusikAnimal talk 17:43, 16 March 2021 (UTC)
- If this supersedes that/makes that redundant, isn't it best just to withdraw that BRFA in place of this one? Also, reading the above I'm not sure this one is too controversial. As I understand, the config is local and you're just adding data to a data file, likely a newly created template as the template wouldn't function without this task anyway. And the linked discussions show a desire for the task. So I don't personally see anything stopping this going to trial. ProcrastinatingReader (talk) 18:00, 16 March 2021 (UTC)
- Well, I also haven't written any code yet, so that is a barrier to starting a trial :) I didn't think this was a controversial task, but I wasn't certain enough with the "function details" to start writing code. With your vote of confidence, I'll get started! It shouldn't take me long. Code-wise it is basically a combination of MusikBot 16 and MusikBot 14 (though to be clear this task will not replace MusikBot 14 as that doesn't go by a category). I'll reply back here once I've got a working product. — MusikAnimal talk 21:42, 16 March 2021 (UTC)
- This is now ready for trial if you and/or another BAG member feels it is also :) Code has been published. I will write end-user documentation today or tomorrow, but the "function details" above should describe everything you need to know. I can withdraw MusikBot 16 if you feel that is best, but the trial will end this Wednesday, anyway. Category:Articles with topics of unclear notability and Category:All orphaned articles are the two other categories we can use for the trial.
Also, I only just discovered Template:CatTrack which works very similarly (example). It appears to store the data externally, so this bot task I think is still valid. I'd still like to ping the maintainer Enterprisey in case I'm missing anything. I knew this clever idea of a category tracking bot must have already existed! :) I've already written my code, so I'd like to move forward with this BRFA if no one sees any issues/conflicts. What is nice is that CatTrack has lots of historical data we can copy over. Also, it goes by a template, which I was initially worried about being misused, but this sort of proves that isn't a problem. However the template doesn't support the granularity and storage location options, so all things considered I'm thinking CatTrack and MusikBot's CategoryCounter tasks should coexist as two different systems. Thoughts? Is there too much crossover? Apparently no one involved thus far was aware of CatTrack. — MusikAnimal talk 20:26, 22 March 2021 (UTC)
- I don't see anything stopping a trial if you want to proceed. On the other task, I imagine it's still better to withdraw that task (vs letting it expire). If you anticipate the task changing due to CatTrack (or want to collaborate on improving that tool or somesuch) I'll hold off on the trial, but functionality wise I suspect there's a material difference in retaining data onwiki and having onwiki graphs vs an external tool. ProcrastinatingReader (talk) 20:58, 22 March 2021 (UTC)
- CatTrack has been broken for over a year now. See discussion at User_talk:RoySmith/Archive_36#Followup_from_VPT. – SD0001 (talk) 11:20, 24 March 2021 (UTC)
- Approved for trial (14 days). Please provide a link to the relevant contributions and/or diffs when the trial is complete. Discussed a bit on IRC, should be good to go for trial. Trial is for two weeks of updates, one per cat per week, with 3 categories being updated in the trial period. ProcrastinatingReader (talk) 17:27, 23 March 2021 (UTC)
- Trial complete. As with all bot tasks, I first ran this fully supervised from my local environment. The configuration at that time had a "cutoff" of 30 days set for "All articles lacking sources". I did this just to test that this feature worked, which it did. I then changed the config to its current state and ran the bot again, and it worked as expected ([1][2][3]). The next step was to turn on the cron job, which I completely forgot to do. This is why the next round of edits didn't happen until April 2, when I finally remembered, hehe. After an initial glitch that was quickly fixed, it ran without issue ([4][5][6]). With the editing schedule now on Fridays, the task successfully did its job today, fully automated ([7][8][9]). — MusikAnimal talk 02:32, 10 April 2021 (UTC)
- Approved. ProcrastinatingReader (talk) 19:15, 10 April 2021 (UTC)
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at Wikipedia:Bots/Noticeboard.