User:TedderBot/NewPageSearch
The New Pages Patrol is the perfect place for assessing articles and finding new contributors. However, it's a busy place to patrol, even with the patrolled flag. User:AlexNewArtBot provides content divided by subject area.
When the bot and user vanished in 2011, I coded a replacement for it. The code is written in Java with User:MER-C's Wikimedia API, and the source code is posted to GitHub with a reuse-friendly license. That way, if I'm hit by a bus, another user can get it running quickly.
Do you want to see new features?
Please post them on the talk page. I want to control how features are added to the following lists. If you can help clean up this page, feel free to do so.
Things I need help with
- "search query clerk" - if you understand the search queries (or even some of them) and want to help maintain and fix queries, let me know.
- graphical person - I need a mascot/logo. This shed deserves to be painted.
- documentation - I'm terrible with wording. I need help explaining what this bot does and help explaining what sparklines are.
Specification notes
- The definition of a lede, for doubling points, is narrowly construed: it is from the beginning of the page until two newlines or the beginning of a section ("=="). Effectively, it's limited to infoboxes, cleanup tags, and the first paragraph.
- Can't process the \p{charset} thing. TODO: explain.
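The lede rule above can be sketched in Java roughly as follows. This is a minimal illustration rather than the bot's actual code: the class name `LedeSketch` and the method `extractLede` are hypothetical, and the sketch assumes "two newlines" means the literal sequence "\n\n" and that a section start is detected by a plain search for "==".

```java
// Hypothetical sketch of the lede rule; not taken from the bot's source.
class LedeSketch {

    /**
     * Returns the text from the start of the page up to (but not including)
     * the first blank line ("\n\n") or the first "==", whichever comes first.
     * If neither occurs, the whole page text is returned.
     */
    static String extractLede(String wikitext) {
        int end = wikitext.length();

        // Assumption: a paragraph break is exactly two consecutive newlines.
        int blankLine = wikitext.indexOf("\n\n");
        if (blankLine >= 0) {
            end = Math.min(end, blankLine);
        }

        // Assumption: any literal "==" counts as the beginning of a section.
        int section = wikitext.indexOf("==");
        if (section >= 0) {
            end = Math.min(end, section);
        }

        return wikitext.substring(0, end);
    }
}
```

Under these assumptions, `LedeSketch.extractLede("Intro.\n\nMore text.")` yields `"Intro."`, so a rule match that falls inside that span would be the one eligible for doubled points.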
Tasks
To implement
- Leave annotations alone on search result page, both before and after the search text (before for User:Dudemanfellabra to mark 'unrelated', after for User:Nthep, [1])
- Self-document search pages. Have a person or project as owner? (for User:SunCreator)
- RevisionID of article seen (since it is cached)
- Lazy load rules
- article title in cached text: example
- Configuration to turn archives off (for User:SunCreator)
- Configuration to turn infobox parsing off. (for User:Acroterion and WP:WPARCH)
Completed
- (20110518) Invert output so the newest result is at the top
- (20110518) Reorder processing so that a given ruleset is processed and posted before the next ruleset is processed
- (20110518) Maintain state of each ruleset independently, start with ruleset+1
- (20110519) Respect bot flag
- (20110520) Only run on a given rule if necessary (more than 24 hours since last run)
- (20110520) Logging, not stdout
- (20110521) Detect pages removed and put them in the archive
- (20110523) Log page: User:AlexNewArtBot/ShipsLog (on errors page: User:TedderBot/NewPageSearch/Ships/errors)
- (20110602) Turned off caching for fetching rules pages
- (20110602) Added count of inhibitors to search logs
- (20110602) Bug: inhibit/excludes not working correctly? (for User:Lionelt, example is Neil McAuley on User:AlexNewArtBot/Conservatism with inhibitor "right wing back")
- (20110623) RevisionID of the ruleset when loaded (for User:SunCreator)
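Two of the completed items, the 24-hour check and starting at ruleset+1, could look something like the sketch below. It is only an illustration: the class `RulesetSchedule` and its methods are hypothetical names, and it assumes that "ruleset+1" means resuming after the last ruleset processed and wrapping around to the first one at the end of the list.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch of per-ruleset scheduling; not taken from the bot's source.
class RulesetSchedule {

    // From the list above: a rule runs only if more than 24 hours have passed.
    static final Duration MIN_INTERVAL = Duration.ofHours(24);

    /** A ruleset is due if it has never run, or last ran more than 24 hours ago. */
    static boolean isDue(Instant lastRun, Instant now) {
        if (lastRun == null) {
            return true; // never run before
        }
        return Duration.between(lastRun, now).compareTo(MIN_INTERVAL) > 0;
    }

    /**
     * Index of the ruleset to start with on the next pass.
     * Assumption: wrap around to 0 after the last ruleset.
     */
    static int nextIndex(int lastProcessed, int rulesetCount) {
        return (lastProcessed + 1) % rulesetCount;
    }
}
```

Keeping the last-run time per ruleset, rather than one global timestamp, is what lets a failed or slow ruleset be retried without holding up the others.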
Searches to implement
- Motorcycling
- Redirects
Long-term
- Also watch for redirects turned into articles (there is no longer an editfilter for this, so this task is difficult unless we can steal another list)