Wikipedia talk:AutoWikiBrowser/Archive 11

From Wikipedia, the free encyclopedia

Problem with Special:Log/Newusers

I'm trying to make a list from Special:Log/Newusers, but I'm not getting any users whose talk pages don't yet exist, even if I uncheck "Ignore existing pages" in the "Skip articles" section. I deduce that this is because the code added or tweaked per the request at Wikipedia talk:AutoWikiBrowser/Archive 4#suggested functionality addition assumes that one would only want users with live talk pages. But I want to find users without talk pages, and I suspect that the unchecked "Ignore" option never comes into play because the generated list must first include the desired pages. If so, could this be fixed so that all users in the desired portion of the log are represented? If this is done, the default behavior of skipping non-existing pages should automatically provide the current functionality, and folks in my situation will be accommodated as well. Thanks. ~ Jeff Q (talk) 00:29, 12 September 2006 (UTC)

Any thoughts on this problem yet, folks? Am I being dense, perhaps? ~ Jeff Q (talk) 22:33, 16 September 2006 (UTC)

Creating page list by filtering on content

I would like to use AWB's excellent mechanisms for fetching pages and examining their content to generate a list of pages with challenging editing problems. The idea is that AWB can find problem pages matching a specific pattern, but the fix to each page may take some research, so it would be nice to simply generate a list for offline work. However, I haven't come up with a decent way to do this. The "Make list" filter only works on page names, as I understand it. The skip articles can identify target articles (or filter out non-targets), but only to perform an operation on them — they toss the page off the page list whether or not they perform the operation. (Tagging the articles for attention is an option, but I'd prefer to create an offline list rather than edit each article twice, once to tag and once to fix.) Nor can I see how to use the "Find and Replace" options, even the "Advanced" rules, to manipulate either the page list or a separate file (like a log). Do the experienced AWB users here have any advice for this AWB newbie? Thanks. ~ Jeff Q (talk) 00:51, 12 September 2006 (UTC)

If you can program in C# or VB.NET your best bet would be to make a plugin. It would be very simple to implement. You'd build your list, AWB would send the text of each article to the plugin, the plugin would analyse the content and write it out to a log and just tell AWB to skip the page (so AWB wouldn't actually do any edits). You wouldn't need a fancy user interface or anything so you could do that with a few lines of code and some regular expressions. --kingboyk 10:10, 12 September 2006 (UTC)
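The filter-and-log idea described above can be illustrated with a minimal sketch. This is Python rather than a real C# or VB.NET AWB plugin, and it does not use AWB's plugin interface; the function names, the sample pages, and the pattern are all hypothetical, chosen only to show the logic of "examine the text, record the title, edit nothing".

```python
import re

def pages_matching(pages, pattern):
    """Return the titles of pages whose text matches `pattern`.

    `pages` is an iterable of (title, text) pairs. No page is edited;
    the function only reports which titles need attention.
    """
    rx = re.compile(pattern)
    return [title for title, text in pages if rx.search(text)]

def format_page_list(titles):
    """Render titles as a numbered wikitext list for offline work."""
    return "\n".join(f"# [[{title}]]" for title in titles)

# Hypothetical sample data: (title, article text) pairs.
sample_pages = [
    ("Alpha", "This article has a {{cleanup}} tag."),
    ("Beta", "Nothing unusual here."),
    ("Gamma", "Another {{cleanup}} candidate."),
]

problem_titles = pages_matching(sample_pages, r"\{\{cleanup\}\}")
report = format_page_list(problem_titles)
```

In a real plugin the equivalent of `pages_matching` would run once per article as AWB hands over the text, and the plugin would then tell AWB to skip the page.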
Sounds like fun. You don't happen to know of any cheap (and legal!) C# or VB.NET programming tools, do you? I can't even afford to upgrade my Windows OS with Microsoft's monopoly-enabled fees. ~ Jeff Q (talk) 21:12, 12 September 2006 (UTC)
Microsoft Visual Studio Express. The bee's knees. AWB is developed in the C# version. My plugin uses VB.NET (which, of course, all the best programmers use - isn't that right Martin? ;)) --kingboyk 21:17, 12 September 2006 (UTC) PS There are Java and C++ versions too, but I can't vouch for either of them as I haven't used them. --kingboyk 21:19, 12 September 2006 (UTC)
Cool! I've been wanting to try out C# after having read an article about it a few years back that made it look better designed for OOP than C++'s grafting of OO onto C. (Ugh, what geeky alphabet soup.) Thanks for the info. ~ Jeff Q (talk) 00:38, 13 September 2006 (UTC)
I'm as happy to bash MS as the next guy (my first PC had Linux on it over 10 years ago), but dotnet is OOP heaven. When I first read a massive tome on it every page was "wow, it does that?" and "that's clever". It's first rate. Definitely as a C++ programmer you want to use C#. I'm using VB.NET as I have a lot of experience with VBA in Access, and VB6. They all compile to the same Intermediate Language so, with a very small number of exceptions, they all do pretty much the same thing. Good luck and let us know how you get on! --kingboyk 09:31, 13 September 2006 (UTC)
Oh noez! A programming language thread ;) ! C++ does have its merits. But not for those using old fashioned C programming paradigms (read a decent book that explains things like RAII). I admit, average joe programmer is quick at achieving progress in C#, as such it isn't a bad language. It's also cool for rapid prototyping. --A C++ freak ;-) 09:51, 13 September 2006 (UTC)
Have you tried C++.NET? Is it any good? Or are they incompatible bedfellows?
Horses for courses. I'm into rapid application development. I have no desire to write device drivers, no ability with art so no interest in creating fancy graphics etc etc. I also think there's a certain amount of snobbery about low vs high level languages. Indeed take C# vs VB.NET - VB can do almost everything that C# can do, but it's a higher level language. Surely that makes it better? (unless coming from a C background). --kingboyk 10:16, 13 September 2006 (UTC)

1) Could there be an option "ignore wikilinks" (in the wiki database scanner)?

2) Feature request: automated checking of the article list for a MISTAKE. Article lists are created from the database, but many of the articles are already fixed (the database gets out of date quickly). To eliminate those "fixed" articles I load new settings with only one regex/string matching MISTAKE, then set "skip when no change/replacement made" and push "start the process". If it finds "no change/replacement", those unneeded articles are removed from the list (I don't want to apply general or other fixes to them if the MISTAKE is no longer there), but the process stops when a MISTAKE is found. The idea is to check all articles automatically in this case (similar to "auto save"): automatically ignore (remove from the list) an article if there's no MISTAKE, leave it on the list if a MISTAKE is found, and go on to check the following articles. Could this be implemented in a future version? gregul

I'm not sure what you mean by ignore wikilinks? As for the second idea, if I understand correctly, this has been suggested before, but I refused on the grounds that it would be a large drain on servers to have people crawling through thousands of pages. Martin 13:55, 12 September 2006 (UTC)
2) I will be switching between articles by hand anyway; this is to avoid making redundant edits to articles that are included in the database.
1) Not to search inside [[pl: ]] [[se: ]] etc. (it's called "ignore interwiki links", as in "find and replace") --gregul
Currently, if I search through the database, sometimes there's nothing to change, because I'm ignoring interwiki links in "find and replace". gregul

404 on startup with nonstandard Default.xml

Once again, this is using AWB with en.wikinews, I've overwritten the default config .xml file with that detail, plus setting the EnableRegexTypoFix option. Now, whenever I start up AWB I get a 404 error, my guess it is perhaps looking for a page of regexes on Wikinews. If this is the case, can you let me know what I'd need to create on wikinews, and where I'd need to copy from? I'd love to be able to include fixes to change quotes from MS Office into plain quotes - they break our PDF/print edition.

Steps to reproduce are, File->User and project preferences, set project to Wikinews, select make from Category, enter a recent date (eg September 1, 2006), click on the More options tag and select Enable RegexTypo Fix, uncheck Skip article, click Make list, select File->Save settings, overwrite Default.xml, quit AWB, restart and observe the error, should be: The remote server returned an error: (404) Not Found. --Brianmc 17:35, 12 September 2006 (UTC)

The page is http://en.wikinews.org/wiki/Wikinews:AutoWikiBrowser/Typos now I have created the page it works ok. Martin 19:32, 12 September 2006 (UTC)
Thank you for this, I've copied the typo list from wikipedia and added it to my watchlist so I spot updates. I really appreciate this tool and have made some significant changes on Wikinews with its help. --Brianmc 20:14, 12 September 2006 (UTC)

Newest version crashes

The current version of AWB always crashes on the first or second edit. Does anyone else have this problem?--Kungfu Adam (talk) 21:46, 12 September 2006 (UTC)

No. It's quite usual, alas, for it to crash after a thousand or more edits, but I've never had it crash after one or two. --kingboyk 22:33, 12 September 2006 (UTC)
No problem for me either, doesn't crash after 100+ edits. Lincher 03:37, 13 September 2006 (UTC)
Maybe I had a bad download. I could reinstall...--Kungfu Adam (talk) 11:10, 14 September 2006 (UTC)
No such luck. I guess I can wait for the next release and see what happens.--Kungfu Adam (talk) 17:42, 14 September 2006 (UTC)
Hmm.. I also have the same problem, I tried both 3.0.2.9 and 3.0.3.0 and they crashed on my first and second edit. Dunno why though. --WinHunter (talk) 01:25, 15 September 2006 (UTC)

XML settings bug?

Loading these settings (make list from category) I get an error at

if (reader.MoveToAttribute("index"))
   listMaker1.SelectedSource = (WikiFunctions.Lists.SourceType)int.Parse(reader.Value);

in UserSettings.cs.

Settings (tested with plugin deleted, it's not a plugin issue) -

<?xml version="1.0" encoding="utf-8"?>
<Settings program="AWB" schema="2">
  <Project>
    <projectlang proj="wikipedia" lang="en" />
  </Project>
  <Options>
    <selectsource index="Category" text="Mexican politician stubs" />
    <general general="True" tagger="True" unicodifyer="True" />
    <categorisation index="0" text="" />
    <skip does="False" doesnot="False" regex="False" casesensitive="False" doestext="" doesnottext="" moreindex="0" />
    <message enabled="False" text="" append="True" />
    <automode delay="15" quicksave="False" suppresstag="True" />
    <imager index="0" replace="" with="" />
  </Options>
  <regextypofix>
    <regextypofixproperties enabled="False" skipnofixed="False" />
  </regextypofix>
  <FindAndReplaceSettings>
    <findandreplacesettings enabled="False" ignorenofar="True" ignoretext="False" appendsummary="True" afterotherfixes="False" />
  </FindAndReplaceSettings>
  <FindAndReplace>
    <replacerules enabled="False">
      <rule name="Rule" type="0" enabled="True" />
    </replacerules>
  </FindAndReplace>
  <startoptions>
    <summary text="clean up" />
    <summaryindex index="clean up" />
    <find text="" regex="False" casesensitive="False" />
    <menu>
      <wordwrap enabled="True" />
      <toolbar enabled="False" />
      <bypass enabled="True" />
      <ingnorenonexistent enabled="True" />
      <noautochanges enabled="False" />
      <skipnochanges enabled="False" />
      <preview enabled="False" />
      <minor enabled="False" />
      <watch enabled="False" />
      <timer enabled="False" />
      <sortinterwiki enabled="True" />
      <addignoredtolog enabled="False" />
    </menu>
    <plugins />
  </startoptions>
  <pastemore>
    <pastemore1 text="" />
    <pastemore2 text="" />
    <pastemore3 text="" />
    <pastemore4 text="" />
    <pastemore5 text="" />
    <pastemore6 text="" />
    <pastemore7 text="" />
    <pastemore8 text="" />
    <pastemore9 text="" />
    <pastemore10 text="" />
  </pastemore>
  <preferences>
    <preferencevalues enhancediff="True" scrolldown="True" difffontsize="150" textboxfontsize="10" textboxfont="Courier New" lowthreadpriority="False" flashandbeep="True" />
  </preferences>
</Settings>

--kingboyk 14:31, 13 September 2006 (UTC)

On further inspection I think the settings are getting saved incorrectly, and "selectsource index" should be "0", not "category"? --kingboyk 14:38, 13 September 2006 (UTC)
This is only in SVN, I've changed it now. Martin 15:28, 13 September 2006 (UTC)
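The failure mode discussed above (a numeric parse applied to the string "Category") can be reproduced and worked around in a small sketch. This is Python, not AWB's actual C# code, and it is not the fix Martin committed; `SOURCE_TYPES`, `parse_source_index`, and `read_selected_source` are hypothetical names, and only the position of "Category" at index 0 is taken from the discussion.

```python
import xml.etree.ElementTree as ET

# Hypothetical list of source types. Only "Category" at index 0 comes from
# the thread; the second entry is a placeholder for illustration.
SOURCE_TYPES = ["Category", "PlaceholderSource"]

def parse_source_index(value):
    """Accept either a numeric index ("0") or a source name ("Category")."""
    try:
        index = int(value)
    except ValueError:
        # Settings saved with the name instead of the number end up here.
        if value not in SOURCE_TYPES:
            raise ValueError(f"unknown source type: {value!r}")
        return SOURCE_TYPES.index(value)
    if not 0 <= index < len(SOURCE_TYPES):
        raise ValueError(f"source index out of range: {index}")
    return index

def read_selected_source(settings_xml):
    """Read the selectsource index attribute from a settings document."""
    root = ET.fromstring(settings_xml)
    node = root.find("./Options/selectsource")
    return parse_source_index(node.get("index"))

saved_by_name = '<Settings><Options><selectsource index="Category" text="Mexican politician stubs" /></Options></Settings>'
saved_by_number = '<Settings><Options><selectsource index="0" text="Mexican politician stubs" /></Options></Settings>'
```

Whether a loader should tolerate both forms or simply reject the name is a design choice; the sketch only shows that the two files differ in that one attribute.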

Making lists

Martin, any chance we could get these?

  • Make list from category - first 200 articles. Sometimes I want to sample the category and not get the entire thing (especially if it contains 100,000 articles!). "Links on page" for a category page doesn't currently work; an alternative to my request might be to make "Links on page" for a category page work, i.e. it returns the listing on the first page.
  • What redirects here.

--kingboyk 14:55, 14 September 2006 (UTC)

I suppose I can put an optional limit in the category, but the other things would need a change in User:Yurik/Query API. Martin 15:37, 14 September 2006 (UTC)
Blimey. I didn't know about that. Never heard of it. (rolls eyes). --kingboyk 17:45, 14 September 2006 (UTC)

Login problem

[Screenshot: My AWB behaving weirdly]

My Auto Wiki Browser refuses to believe that I'm logged in, even though I very obviously am. As you can see at the screenshot to the left, I had logged in successfully, yet it was still prompting me to log in again. What on earth is the matter? Ingoolemo talk 04:48, 16 September 2006 (UTC)

Most likely that you are not using the monobook skin. Martin 10:12, 16 September 2006 (UTC)
I'll look into making it work though. Martin 15:21, 16 September 2006 (UTC)
Drop me a line when you figure it out. Thanks, Ingoolemo talk 00:50, 19 September 2006 (UTC)

find and replace - ignore external/interwiki links, images, nowiki ...

when this option is set, regex: ('''[^a-z].*?'''( \(.*?\))?) ?[-–]? ?(jest )?to(^:| )

won't catch: '''Bielefeld''' to

[it's in the main article]

in pl:Bielefeld. Everything is OK when I unset that option, and the regex checker reports a match anyway, so it might be a bug. gregul

It was a problem with an internal regex being too greedy, I've fixed it. Martin 15:21, 16 September 2006 (UTC)
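The claim that the pattern itself matches can be checked outside AWB. The sketch below uses Python's `re` module rather than the .NET regex engine that AWB uses, and the sentence after "to" is made up for illustration; it says nothing about AWB's internal "ignore" regex, which is where the bug was.

```python
import re

# The pattern exactly as posted in the report above.
PATTERN = re.compile(r"('''[^a-z].*?'''( \(.*?\))?) ?[-–]? ?(jest )?to(^:| )")

# Illustrative input: the bold title from the report plus invented trailing text.
sample = "'''Bielefeld''' to miasto w Niemczech"

match = PATTERN.search(sample)
bold_title = match.group(1) if match else None
```

Here `match` is found and `bold_title` holds the bolded title, which is consistent with the regex checker's result and with the problem lying in the option handling rather than in the user's pattern.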

Lists of large categories

I'm finding that when I create a list of articles from multiple large categories, AWB omits a substantial number of the articles. Specifically, the subcats of Category:Orphaned articles, there are about 17,000 articles listed, and when I create a list from them, many articles are left out of the list (several hundred at least), even if I try it twice. So um... any ideas? If this is a known bug, is there any reliable tool to generate a list of all articles in a large category? --W.marsh 14:29, 16 September 2006 (UTC)

Ah, I'm glad you posted this. When I build a listing of Category:Living people I get 120,000 or so articles. If I build a list of Category:Biography articles of living people (talk pages tagged with living=yes) I only get 101,000. I do a bot run and discover that thousands of my remaining 20,000 or so articles already have living=yes. Either MediaWiki hasn't updated the category properly (unlikely, because the job queue runs often enough on WPBiography), the list comparer is broken (possible, but I don't think it's this), or there's something wrong with the list grabbing from large cats. --kingboyk 14:37, 16 September 2006 (UTC) PS My plugin keeps a log so I can furnish a skipped list if need be.--kingboyk 14:37, 16 September 2006 (UTC)
I've noticed this. As it only occurs on very large categories, I half suspect it is the query API rather than AWB, but I'll find out for sure soon. Martin 15:21, 16 September 2006 (UTC)

Adding wikiproject banner to talk pages

Hi, is it possible to add a wikiproject banner to the talk pages of articles using AWB? I clicked on more options, clicked on append message and wrote down the banner of the project {{WP India}}. And then I saved. But nothing happened. Please suggest -- Lost 19:50, 16 September 2006 (UTC)

You have to set all the settings, then start the process. Martin 21:17, 16 September 2006 (UTC)
Thanks, I did set all settings as far as I could gather. But not able to do it. Help would be greatly appreciated. -- Lost 04:52, 17 September 2006 (UTC)

Purpose of AWB?

I'm sorry if this is a stupid question, but what is the actual purpose of AWB and/or what is the main function of it that makes it superior to simply going around in IE and editing pages? The article doesn't exactly make it clear (to me). I tend to see mainly spelling and grammar errors corrected with AWB tags in the change-log. What exactly does AWB allow you to do? TheHYPO 00:42, 17 September 2006 (UTC)

AWB can be used for repeating the same task over and over and over and over again. Like adding a template to every page in a category, (even hundreds of them). It can also be used to do tasks like update a link or image on pages. It is not designed to replace your normal browser, or to be your primary editor. Some bots run solely using the find and replace utility of AWB. — xaosflux Talk 02:02, 17 September 2006 (UTC)

User login

Hi, I'm using AWB on Swedish Wikipedia, and it works well. To my knowledge, I have not entered my username in AWB or its config files, still AWB is logging in with my standard login. How can it work? Magic? I'm clogging down the RC with my edits though, how do I make AWB login as my bot account? //Knuckles 06:22, 17 September 2006 (UTC)

Yes, it's magic :) Actually, no, it's because AWB uses the Internet Explorer engine. You'll have to log out of Wikipedia in IE and log back in as your bot. If you want to run AWB and do manual edits at the same time using 2 different accounts, use IE for your bot and do your manual edits in another browser like Firefox or Opera. --kingboyk 09:13, 17 September 2006 (UTC)
Aha! I added this in the User manual on the front AWB page. Thanks! //Knuckles 11:24, 17 September 2006 (UTC)

"Correcting the misspelling" of a direct quote

Hi there. Occasionally, people will come across the Black Mesa (game mod) page using AWB and "correct" the spelling of a person being quoted, even when the spelling is in their exact words. Is there a way to prevent this? Thanks. Viewer 06:28, 17 September 2006 (UTC)

Put "(sic)" or "[sic]" next to the intentional spelling mistake, and any AWB user with any wits about them will know it's as quoted and leave it? --kingboyk 09:10, 17 September 2006 (UTC)

What to do, what to do?

I have just downloaded and been registered for AWB and looked at the Terms and Conditions. Does spell-checking - the task I plan to complete with it - class as unnecessarily minor edits? Thanks. Ck lostsword|queta!|Suggestions? 17:25, 17 September 2006 (UTC)

No, spell checking is not minor. thanks Martin 17:37, 17 September 2006 (UTC)
Are you sure? I'd been under the impression from Wikipedia:Minor edits that simple spelling corrections were minor edits. Where I'm still fuzzy though, is whether the addition of a {{stub}} template counts as minor or not. --Elonka 18:15, 18 September 2006 (UTC)
Err... no, this is a separate issue. You're talking about "do I tick the Wikipedia 'minor edit' box or not?". The original question was about the terms and conditions of AWB and not making "unnecessary minor edits". --kingboyk 18:19, 18 September 2006 (UTC)
In addition to Kingboyk, the way I see it, an unnecessary minor edit is one that doesn't change the presentation, how the page looks to end-users, like the examples listed under Rules of Use and many of the ones under General fixes. Spelling is not an unnecessary edit, because it improves the quality of the article. Harryboyles 12:30, 20 September 2006 (UTC)

Memory leak?

Again, this is something I've observed before I started using a plugin, so the problem is within AWB itself. I've found that AWB memory usage can increase steadily throughout a session until it's at 400MB or more of physical RAM. Also in the past it's been normal for me to wake up in the morning and find that AWB stalled throughout the night. To counter the second problem, I've added a feature to my plugin to stop and restart AWB if the list isn't empty and it doesn't send any articles to the plugin in 10 minutes. Unfortunately that has the side effect of trying to keep AWB running if it's struggling for memory.

My machine has 1GB of memory, but this morning when I got up both of my AWB processes had crashed as out of memory and, rather annoyingly, they'd taken my Firefox with umpteen open tabs down with them. I can only imagine that certain resources aren't being disposed of correctly or objects are somehow kept alive when no longer needed. Any ideas Martin and has anyone else doing thousands of automated edits noticed this? --kingboyk 10:04, 18 September 2006 (UTC)
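The stall detector described above (restart if the list isn't empty and no article has arrived for 10 minutes) can be sketched as follows. This is a Python illustration, not the plugin's actual VB.NET code; the class name, method names, and the injectable clock are assumptions made so the logic can be exercised without waiting.

```python
import time

class StallWatchdog:
    """Decide whether a run has stalled, based on time since the last article."""

    def __init__(self, timeout_seconds=600, clock=time.monotonic):
        # 600 seconds corresponds to the 10 minutes mentioned in the thread.
        self._timeout = timeout_seconds
        self._clock = clock
        self._last_activity = clock()

    def article_received(self):
        """Call whenever the host hands an article to the plugin."""
        self._last_activity = self._clock()

    def should_restart(self, list_is_empty):
        """True only if work remains and nothing has arrived within the timeout."""
        if list_is_empty:
            return False
        return self._clock() - self._last_activity >= self._timeout

# Exercising the logic with a fake clock instead of real time.
fake_now = [0.0]
watchdog = StallWatchdog(timeout_seconds=600, clock=lambda: fake_now[0])
fake_now[0] = 601.0
stalled = watchdog.should_restart(list_is_empty=False)
```

As the comment notes, a watchdog like this cannot tell a network stall from a process that is starved of memory, so it will keep restarting in both cases.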

It's the IE control, it seems to want to cache pages. I have never had any problems with it, even on large runs, maybe your IE has a different option set to cache pages in a different way or something. Martin 10:08, 18 September 2006 (UTC)
Ah. Well, remember the issue I had with gigs of pages being cached? Also, since I zapped that cache my MSDN help viewer has been f*cked too. My version of IE must have problems. Any registry settings or owt you know of to help fix it? --kingboyk 10:20, 18 September 2006 (UTC)
My PC's got 2GB of RAM in, and 1GB of page file. I've had AWB running for long enough to have to close it due to using all the page file. I haven't really done AWB runs recently, or any large ones, but it seems to be a bit better in the newer version.
I've noticed it with any .NET app that I've created: whenever you open or close forms and press buttons, the memory usage just keeps increasing! I know people run AWB bots and stuff, with I think mboverlord running one quite a lot... And Martin, you have bluebot don't you? Reedy Boy 10:41, 18 September 2006 (UTC)
I regularly do runs of multiple thousands of edits without any problem, the memory usage does get quite high after a couple of thousand, but it seems to reach a ceiling eventually. Historically there was a problem with stalling occasionally, but that particular problem has been solved. Martin 13:10, 18 September 2006 (UTC)
I made ~3000 edits today, I noticed memory usage went up to ~300mb, then IE seemed to purge itself and it went right back down. Martin 18:27, 18 September 2006 (UTC)

Pre-loading pages

As a feature request, would it be possible to have AWB pre-load a page? Sort of like running a tabbed browser? I notice that when I'm running through a long list (such as Special:Uncategorizedpages), that I usually only need a few seconds to actually decide what to do with a particular page, but that it takes just as long to wait for the next page to load after I hit "save." If AWB could be pre-loading the next page in the list, while I'm making the decision on the current one, that would speed things up considerably, as I wouldn't have the "wait for page load" delays. --Elonka 18:20, 18 September 2006 (UTC)

It would be pretty difficult to implement. Normally the delay is fairly insignificant, but the servers have been slow for the last couple of days. Martin 18:27, 18 September 2006 (UTC)
Also, sometimes I use AWB from a dialup location (yes, I know I have masochistic tendencies <grin>), so in those situations, the pre-load would still be very useful.  :) But I understand if it would be too difficult -- I figured it couldn't hurt to ask! --Elonka 18:54, 18 September 2006 (UTC)
Would it be that difficult? You could grab the next page in advance using an invisible webcontrol object. Of course, some nasty plugin developer might come along and have a plugin programmatically insert new items at the top of the file list though ;) --kingboyk 19:04, 18 September 2006 (UTC)
One way to speed up the loading process in general is by setting Internet Explorer to not load images. Back in User:JoeBot's heyday (i.e. avoiding studying for finals), it really helped for long lists of articles with misspellings. JoeSmack Talk 18:26, 19 September 2006 (UTC)
Thanks for the tip, it really helps. //Halibutt 20:13, 19 September 2006 (UTC)
As a follow-up, I've found that I can also speed things up by keeping open two instances of AWB, each working on a different section of a list. That way as soon as I click "Save" on one, I can flip to the other version and work on its page, while waiting for a new page to load in the first version. It'd still be a bit easier if it worked like Firefox with the animated "swirl" on a tab showing me that a page was still loading, but this way works too.  :) --Elonka 23:37, 21 September 2006 (UTC)
The icon changes background when a page is loaded. Rich Farmbrough, 21:29 29 September 2006 (GMT).

Revision IDs

Martin, is there currently any property or method which returns revision IDs? What I'm thinking of is logging the diff when my bot makes a change.

If not, how difficult would it be to add functionality for this? --kingboyk 13:00, 19 September 2006 (UTC)

I'm not sure what the easiest way to find the revision id is, so I'm not sure how I could implement it. Martin 14:04, 19 September 2006 (UTC)
You could dig out the link for the "permalink": The html of Foreigner contains:
<li id="t-permalink"><a href="/w/index.php?title=Foreigner&oldid=76304567">Permanent link</a></li>
--Ligulem 14:20, 19 September 2006 (UTC)
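Pulling the revision ID out of that permalink can be sketched in a few lines. This is a Python illustration operating on the HTML snippet quoted above, not code from AWB or its web control; the function name is hypothetical, and the approach assumes the sidebar markup looks exactly like the quoted `<li id="t-permalink">` element.

```python
import re
from html import unescape
from urllib.parse import urlsplit, parse_qs

# Matches the href of the permalink item in the sidebar toolbox.
PERMALINK_RE = re.compile(r'<li id="t-permalink"><a href="([^"]*)"')

def revision_id_from_html(page_html):
    """Return the oldid from the permalink as an int, or None if absent."""
    found = PERMALINK_RE.search(page_html)
    if not found:
        return None
    # The href may be HTML-escaped (&amp;), so unescape before parsing.
    href = unescape(found.group(1))
    query = parse_qs(urlsplit(href).query)
    values = query.get("oldid")
    return int(values[0]) if values else None

snippet = '<li id="t-permalink"><a href="/w/index.php?title=Foreigner&oldid=76304567">Permanent link</a></li>'
revision_id = revision_id_from_html(snippet)
```

Scraping like this depends on the skin's markup staying the same, which is one reason to prefer having the ID exposed somewhere more stable.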
I'd really rather not deal with HTML from Wikipedia pages, as I think that's a job which should be encapsulated in the webcontrol :) My plugin doesn't get its hands dirty with such things, it just processes article text :) Nice suggestion though; any chance of implementing something like that into the webcontrol Martin? --kingboyk 14:30, 19 September 2006 (UTC)
The html snippet above is missing in the page returned from the "&action=edit" URLs, so that would mean an additional load of the whole page before going into edit of the page. Very inefficient, but it could work. Maybe we could "simply" ask the MediaWiki devs (duh!) to include the revision ID somewhere in a html comment of the "&action=edit" page. We could try to make a patch for MediaWiki. But I don't know what chances that patch would have to get live ;-)... --Ligulem 14:47, 19 September 2006 (UTC)
"The html snippet above is missing in the page returned from the "&action=edit" URLs" Doh! How bizarre - I imagine they'd be quite happy to rectify that? --kingboyk 14:48, 19 September 2006 (UTC)
This is probably by design. The permalink is missing in the toolbox of the sidebar when in edit mode. --Ligulem 14:51, 19 September 2006 (UTC)
It would make sense to put the revision id in with the other javascript variables; after all, there are similar variables already there:
		<script type= "text/javascript">
			var skin = "monobook";
			var stylepath = "/skins-1.5";

			var wgArticlePath = "/wiki/$1";
			var wgScriptPath = "/w";
			var wgServer = "http://en.wikipedia.org";
                       
			var wgCanonicalNamespace = "Project_talk";
			var wgNamespaceNumber = 5;
			var wgPageName = "Wikipedia_talk:AutoWikiBrowser";
			var wgTitle = "AutoWikiBrowser";
			var wgArticleId = 3625052;
			var wgIsArticle = false;
                       
			var wgUserName = "Bluemoose";
			var wgUserLanguage = "en";
			var wgContentLanguage = "en";
		</script>
Martin 15:16, 19 September 2006 (UTC)
Bugzilla? Or do either of you have a hotline to a dev? --kingboyk 15:19, 19 September 2006 (UTC)
Hmm. No hotline from my side. Don't expect a fast response from a bugzilla entry. An entry on bugzilla with a patch has much better chances. We could ask on wikitech-l what would be the best way to implement that. Think I should definitely start doing a test MediaWiki install finally. I've synced to the sources with SVN lately to dig up something for the village pump [1]. Oh well. --Ligulem 16:25, 19 September 2006 (UTC)

Problems w/ running AWB under VirtualPC

I've been trying to persuade AWB to run on Windows XP running on my PowerMac by Virtual PC. I can open the program fine, can log in and get the green light, but when I try to set AWB running I get an edit window appearing at the top, the text appearing in the box on the bottom-right, and the message "Loading changes" appears in the bottom left. However, that message doesn't go away, and the process then restarts (I get the "restarting in 6..5..4.." countdown and a "Page cannot be displayed" message in the top window). Does anyone have any suggestions for getting this running properly? Thanks. Mike Peel 17:01, 19 September 2006 (UTC)

Since you have my plugin installed, I'd suggest starting by exiting AWB and deleting the file "Kingbotk AWB Plugin.dll" from the AWB folder. Then try again, and report back here. Let's keep things simple by finding out if vanilla AWB works. --kingboyk 17:15, 19 September 2006 (UTC)
Tried again with a fresh copy of AWB, version 3030. All preferences should be as standard. Exact same problem. In case it matters, it's located in C:\Documents and Settings\Administrator\Desktop\AutoWikiBrowser. I'm running Windows XP SP2, with .NET framework 2.0, under Virtual PC for Mac 7.01 on Mac OS X 10.4.7. The only other application installed on XP is Virtual Machine Additions. I've not installed any XP updates past SP2. Mike Peel 18:38, 19 September 2006 (UTC)
Are you able to open a Wikipedia page in Internet Explorer within the virtual machine? --kingboyk 19:11, 19 September 2006 (UTC)
Yes. I have no problem accessing the internet from the virtual machine in general. Nor have I experienced any unusual delays in loading pages. AWB gets the contents of categories, and the initial page contents, without a problem. It then seems to stall when applying changes to the content. Mike Peel 19:26, 19 September 2006 (UTC)
That's a question for Martin, then, sorry. --kingboyk 19:29, 19 September 2006 (UTC)
Though it would be interesting if it did work on that platform, as I have no way of testing/debugging, it is almost impossible for me to work out what is going wrong. If I had to hazard a guess, I would say that the IE control doesn't work 100% on that platform. Martin 10:11, 21 September 2006 (UTC)

I've just downloaded the latest version of AWB, and it seems to be working fine now. I guess that the bug was one of those fixed in the last few versions. My thanks to both of you for your help. Mike Peel 16:33, 6 October 2006 (UTC)

Great, though I have to admit, it must have been more luck than judgement on my part. thanks! Martin 19:29, 6 October 2006 (UTC)

Recategorising articles per CFD

I received the following request on my talk page:

Probably you don't want to be asked about plugins, but I'll try anyway. Here's the situation. On WP:CFD we deal with lots of moves daily. Currently some of us have pywikipedia wrappers to do the task. For instance at Wikipedia:Categories for discussion/Working we have listings with some fixed structure. For instance today I would copy and paste the webpage into notepad to get

# Category:Fictional aerokineticists to Category:Fictional wind manipulators
# Category:Fictional atmokineticists to Category:Fictional weather manipulators
# Category:Fictional chronokineticists to Category:Fictional time manipulators
# Category:Fictional cryokineticists to Category:Fictional ice manipulators
# Category:Fictional biokineticists to Category:Fictional characters with healing powers

and then a python script uses a regex to extract the category and process them in batch with pywikipedia.

Caveat: the Pywikipedia category script isn't very good. It doesn't handle categories included as [[category:category name|some parameter]] well, and it dies on categories added to redirect pages. Thus, things can't really be handled automatically anyway. But AWB seems to be able to, so I'm humbly asking you to help CFD and consider writing a plugin that does basically the same: take that structured listing, extract the category names, and then either move or remove the category (could be 2 plugins) with AWB. -- Drini 18:49, 19 September 2006 (UTC)
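The two steps in the request (extract the category pairs from the listing, then rewrite category links including ones with a sort key) can be sketched as below. This is an illustrative Python sketch, not the CFD script or pywikipedia code; the function names are hypothetical, and matching the category name case-insensitively is a simplification of how MediaWiki treats titles.

```python
import re

# One listing line: "# Category:Old name to Category:New name"
MOVE_RE = re.compile(r"^#\s*Category:(?P<old>.+?)\s+to\s+Category:(?P<new>.+?)\s*$")

def parse_moves(listing):
    """Return (old, new) category name pairs from the pasted listing."""
    moves = []
    for line in listing.splitlines():
        found = MOVE_RE.match(line.strip())
        if found:
            moves.append((found.group("old"), found.group("new")))
    return moves

def recategorise(wikitext, old, new):
    """Replace [[Category:old]] with [[Category:new]], keeping any |sort key."""
    link_re = re.compile(
        r"\[\[\s*category\s*:\s*" + re.escape(old) + r"\s*(\|[^\]]*)?\]\]",
        re.IGNORECASE,
    )
    return link_re.sub(
        lambda m: "[[Category:" + new + (m.group(1) or "") + "]]", wikitext
    )

listing = """# Category:Fictional aerokineticists to Category:Fictional wind manipulators
# Category:Fictional biokineticists to Category:Fictional characters with healing powers"""

moves = parse_moves(listing)
updated = recategorise(
    "Some text. [[category:Fictional aerokineticists|Storm]]",
    *moves[0],
)
```

Removal instead of a move would be the same substitution with an empty replacement; redirect pages are not handled here at all.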

I don't have the time to help at the moment, alas.

I was wondering though - AWB can recategorise out of the box can't it?

Could anyone else here help them out with this? --kingboyk 10:17, 20 September 2006 (UTC)

Drag and drop

I noticed that if I drag and drop an XML file onto the browser control in AWB, AWB tries to start processing. Presumably this is an effect of IE raising an event or is it actually intentional?

On a related note, the behaviour I was looking for is being able to drop an XML settings file (perhaps onto the options tabs) and have AWB load those settings. Might be cool. Yay or nay? --kingboyk 15:37, 20 September 2006 (UTC)

Adding articles to categories

I added a bunch of Wikipedia pages to a new category using AWB, and somebody has complained that they weren't sorted. This has led me to thinking that if AWB is adding non-mainspace pages to categories maybe it ought to add the PAGENAME variable as a sort key automatically? --kingboyk 09:49, 21 September 2006 (UTC)

You can specify a key already, just enter the category with the key, e.g. "new category|{{subst:PAGENAME}}" Martin 10:11, 21 September 2006 (UTC)
Oh I see. Cheers. --kingboyk 10:12, 21 September 2006 (UTC)

Other wikis?

This is probably a futile hope, but I was wondering if there's any (reasonably simple) way to adapt the AWB program to use on another (outside) wiki (using mediawiki software). It's rather time consuming to propagate a new wiki, and I thought some sort of at least partially automated editing system, would make the task a bit easier, and programming bots seems like far too complex an endeavour... TheHYPO 03:05, 22 September 2006 (UTC)

You can use m:MWiki-Browser (MWB). Just enter the url of the wiki into the wiki field. Please note that MWB is a stripped down version of AWB. --Ligulem 12:30, 22 September 2006 (UTC)
Thanks very much for the point-out. TheHYPO 22:23, 22 September 2006 (UTC)

Auto-load of settings

I think it would be great if there were an option to set a settings file as the default settings on startup. I know you can replace the default.xml with one of your own but it would be handy if you could do it within the program. Harryboyles 09:32, 22 September 2006 (UTC)

I just load in the settings you want and then save over the default Reedy Boy 14:14, 22 September 2006 (UTC)

Feature request: - "Further attention" log.

I'd like a "further attention" button for making a list of articles I need to come back to later - during the couple of runs I've made I found articles with extensive problems that I can't really fix "on the fly". - Stephanie Daugherty (Triona) - Talk - Comment - 18:01, 22 September 2006 (UTC)

There's a "false positive" button, but you can't do a lot with it so far: it just dumps the name into a flat file for later retrieval. It would be nice to develop this feature further. Your input would be invaluable. HTH HAND —Phil | Talk 20:35, 23 September 2006 (UTC)

Another important small request

Please remove &frasl from the unicodified symbols. It is indistinguishable from the regular slash in the source code. —Mets501 (talk) 18:24, 22 September 2006 (UTC)

Too fast?

I'm aware that one of the points say "Don't edit too fast; consider opening a bot account if you are regularly making more than a few edits a minute.". I think I may be infringing on that: I am replacing album infobox ratings, i.e. (4/5) or with (the Rating-5 template). I do a few of these a minute, is this acceptable? -- Reaper X 01:37, 23 September 2006 (UTC)

I'm guessing you're not checking your edits; if so, consider opening a bot account. Read WP:BOT then visit Wikipedia:Bots/Requests for approval.--Andeh 01:39, 23 September 2006 (UTC)

Actually I am. Mass mistakes are something I don't want on my track record. That and I am not computer-oriented enough to know a thing about how to get/make/do whatever with a bot. -- Reaper X 02:07, 23 September 2006 (UTC) Gee, I'm going to seek help in opening a bot account. Thanks anyway. -- Reaper X 02:46, 23 September 2006 (UTC)

Bot accounts aren't as difficult as they may sound. It just means that you can make multiple edits per minute, and they won't show on recent changes, I believe, but will on people's watchlists. You can run AWB in the same way, just log into the bot account Reedy Boy 09:43, 23 September 2006 (UTC)

Protected pages

I'm not being allowed to save edits to protected pages on en.wikinews; yes, I'm an admin, and I run cleanup on old articles as they go into the archive. Someone has protected these but missed things like multiple wiki links. I'm sure AWB has let me save edits to protected articles before; the only difference I'm aware of is that I've added en.wikinews to my MSIE trusted sites so the dynamic content on the front page works. (I'd take a fix for the dynamic content or a fix in whatever makes collapsible sections work on Wikipedia but not Wikinews (the style sheet perhaps?)) --Brianmc 17:09, 23 September 2006 (UTC)

Bot timer

Hi Martin: I have an idea for the bot timer, which I don't think is too difficult. Can you change the bot timer so that it starts counting when the previous edit finishes? Right now it starts when AWB is ready to make the edit. Perhaps you could do something like set a variable bottimerready = 1 when the previous edit is finished and then start the countup timer, and when the value of the timer equals the selected delay, set bottimerready = 0 and stop the timer and the program can only make an edit when bottimerready = 0. I'm sure you'll have a much better way to do this, but it's just an idea. —Mets501 (talk) 22:27, 23 September 2006 (UTC)

Did you miss this suggestion? —Mets501 (talk) 13:58, 8 October 2006 (UTC)
I think the current method makes more sense, apart from being more simple, it naturally slows down the edit rate when the servers are slow. Martin 16:23, 9 October 2006 (UTC)
The problem with it is that if AWB goes through 100 pages and skips 90 of them at the beginning, it will still wait to make the 1st edit after it hasn't edited in several minutes. —Mets501 (talk) 18:12, 9 October 2006 (UTC)
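To make the suggestion concrete, here is a minimal sketch of a delay measured from the moment the previous edit finished, so that time spent skipping pages counts towards the wait. This is not how AWB implements its bot timer; the class EditThrottle and its method names are invented for the example, and the clock is injectable only so the behaviour can be checked without real waiting.

```python
import time


class EditThrottle:
    """Enforce a minimum gap measured from when the previous edit finished."""

    def __init__(self, delay_seconds, clock=time.monotonic):
        self.delay = delay_seconds
        self.clock = clock
        self.last_finished = None  # no edit has been made yet

    def edit_finished(self):
        """Call this right after a save completes."""
        self.last_finished = self.clock()

    def seconds_to_wait(self):
        """Return how long to sleep before the next save is allowed."""
        if self.last_finished is None:
            return 0.0
        elapsed = self.clock() - self.last_finished
        return max(0.0, self.delay - elapsed)


throttle = EditThrottle(delay_seconds=10)
# Before each save: time.sleep(throttle.seconds_to_wait())
# After each save:  throttle.edit_finished()
```

With this shape, a run that skips pages for several minutes would save its next edit immediately, while back-to-back edits are still spaced by the chosen delay.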

Better with Every Version

Cheers Martin! - More useful features at our fingertips

Keep up the good work!!

Reedy Boy 10:03, 24 September 2006 (UTC)

This version doesn't seem to hog memory as much either!! And then it increases up, and then drops down to below 120Mb again =D Reedy Boy 10:29, 24 September 2006 (UTC)

First time with regular expressions

Where am I going wrong with this regex substitution? (for year wikilinks in cvg infoboxes)

find: \[198(0-9)\]

substitute: [198$1 in video gaming|198$1]

That doesn't work. If I change the find to: \[198[0-9]\] then it finds years ('1983' for example), but doesn't catch the digit for substitution (it actually puts '$1').

I finally got it working with: \[198(0|1|2|3|4|5|6|7|8|9)\], which substitutes correctly but that's a bit yukky looking. So what's wrong with (0-9) ? Thanks for any help. Marasmusine 22:10, 25 September 2006 (UTC)

You might want to make use of the code for digits '\d'. As in:
\[198(\d)\]
or:
\[(\d{4})\]
Links such as '[1983 in video gaming|1983]' are sometimes called 'easter egg links' because it looks just like [1983] and has an unpredictable outcome. Some editors think that they are not a particularly good idea. You can see them discussed from time to time in wp:mosdate and wp:context.
Converting formats like '[August 11] [2000]' to '[August 11] [2000 in video gaming|2000]', as you have done, is not a good idea. This is because it breaks the date preference mechanism. This is an understandable mistake that many people make because of the unfortunate way that dates are handled in Wikipedia. Please can others help Marasmusine with this.
Regards bobblewik 18:57, 26 September 2006 (UTC)
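As a side note on the original question, the difference between (0-9), [0-9] and (\d) can be checked with a short script. The sketch below uses Python's re module purely for illustration; AWB uses .NET regular expressions, where the replacement refers to the group as $1 rather than \1. The sample text is made up, and the example only demonstrates the capture syntax, not whether the resulting link is a good idea.

```python
import re

text = "Released in [1983]."

# (0-9) is a group matching the literal characters "0-9", so no year matches.
assert re.search(r"\[198(0-9)\]", text) is None

# [0-9] matches one digit but captures nothing, so there is no group 1 to reuse.
assert re.search(r"\[198[0-9]\]", text).groups() == ()

# (\d) both matches the digit and captures it for the replacement.
linked = re.sub(r"\[198(\d)\]", r"[198\1 in video gaming|198\1]", text)
print(linked)
```

The alternation (0|1|2|...|9) worked because the parentheses capture whichever single digit matched; (\d) or ([0-9]) does the same thing more compactly.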

Feature request or just a question

Is there a way for a AWB to start with a list of articles on main namespace, and add messages to the article's talk pages? If not, could you please add it, that would really help with adding project banners on talk page headers. Please advise. - Ganeshk (talk) 05:39, 26 September 2006 (UTC)

In the list context menu there is an option called "Convert to talk" which converts the articles to the appropriate talk page. Martin 09:34, 26 September 2006 (UTC)
Martin, I don't see that menu option on my AWB. I see following only:
  • Filter out non main space
  • Filter
  • Sort alphabetically
  • Save list to text file
  • Launch Database scanner
  • Launch list comparer
Please advise. Regards, Ganeshk (talk) 18:48, 7 October 2006 (UTC)
Looks like you've got an old version of AWB. I see it on my v.3.0.4.1. MaxSem 19:19, 7 October 2006 (UTC)
The context menu is the one you get when you right-click the list, not the main menu. Martin 19:20, 7 October 2006 (UTC)
Found it. Never thought of right-clicking. Thanks Martin. - Ganeshk (talk) 18:49, 11 October 2006 (UTC)

Namespace changes on dawiki

Hi, I would like to notify you that there have been some namespace changes on dawiki. I would therefore like you to change /AWB/WikiFunctions/Variables.cs to the following:

                case LangCodeEnum.da:
                    Namespaces[-2] = "Media:";
                    Namespaces[-1] = "Speciel:";
                    Namespaces[1] = "Diskussion:";
                    Namespaces[2] = "Bruger:";
                    Namespaces[3] = "Brugerdiskussion:";
                    Namespaces[4] = "Wikipedia:";
                    Namespaces[5] = "Wikipedia-diskussion:";
                    Namespaces[6] = "Billede:";
                    Namespaces[7] = "Billedediskussion:";
                    Namespaces[8] = "MediaWiki:";
                    Namespaces[9] = "MediaWiki-diskussion:";
                    Namespaces[10] = "Skabelon:";
                    Namespaces[11] = "Skabelondiskussion:";
                    Namespaces[12] = "Hjælp:";
                    Namespaces[13] = "Hjælp-diskussion:";
                    Namespaces[14] = "Kategori:";
                    Namespaces[15] = "Kategoridiskussion:";
                    Namespaces[100] = "Portal:";
                    Namespaces[101] = "Portal diskussion:";

                    strsummarytag = "ved brug af AWB";
                    break;

Thanks. --Maitch 14:57, 29 September 2006 (UTC)

Ok. I'll update it. Martin 14:03, 30 September 2006 (UTC)

Problem with AWB/I.E.

Now I only use I.E. for AWB related purposes on this machine, but I have for some time had a situation where AWB can get stuck on "Loading page to check we are logged in...." After this IE seems unable to access enWP, although other sites seem fine (including ang.wp). Sometimes clearing the cache/cookies appears to help, sometimes rebooting. I have just cleared everything, installed the latest AWB and re-installed I.E.; nothing seems to work, so I'm reinstalling the dotnet framework. Has anyone else had this problem? Rich Farmbrough, 21:36 29 September 2006 (GMT).

Oh yes, and sometimes when it clears I get stacked "You are not logged in" dialogue boxen. Rich Farmbrough, 21:45 29 September 2006 (GMT).
I've not seen this before, but the next version will have an improved mechanism for detecting login status etc., so hopefully it will be fixed anyway. Martin 14:03, 30 September 2006 (UTC)
If that includes recognising that a "please login" winforms messagebox is already up and doesn't pop up that would be great (and fix a bug of mine where the plugin nudges AWB to restart and AWB pops up another login box, leaving me with 10+ of em when I get up ;)) --kingboyk 10:55, 1 October 2006 (UTC)

Just a little more info, in case anyone else experiences it: The fix is (in I.E.) "delete files" and check "delete all offline content". Not a real problem now that I can consistently overcome it. Rich Farmbrough, 20:08 4 October 2006 (GMT).

Customizing

Could you make a menu where we can choose the "general fixes" we want? On the other languages there are fixes we don't want (such as deleting more than one "return at the line", and subst'itution of "Vienne", etc.). Thank you very very much for this future feature! 86.213.165.71 13:39, 30 September 2006 (UTC)

also not everybody wants disambig at the bottom, so this would be good idea ;] gregul 17:53, 30 September 2006 (UTC)
  1. again about "ignore external/interwiki links, ... images, ... " – is ignoring the whole image section done on purpose? (now it ignores the complete [[image:...]] section) I think the ignoring could end at the "|" character, because further on there's the description of the image, which can contain matched strings
  2. could you expand the above feature to "ignore inside double brackets", i.e. between [[ and | ?
  3. how about ignoring galleries as well? As in 1. above, only the names of the images included in the gallery, not their descriptions; I think that could be included too
  4. would it be possible (besides modifying the sources myself) to ignore images on other-language wikis? "[[Image:" is named differently there ("[[Grafika:" on plwiki), so in fact that option isn't working completely
  5. how can I avoid the situation where I have found some articles in the database but most of them are redundant, because I always set "ignore external/interwiki links, ...", so they won't be changed anyway gregul 21:25, 30 September 2006 (UTC)
These 5 things are important, so please answer what do you think gregul
The general fixes are designed for the en wikipedia, they would be far too complex to customise, as many rely on the fact that other functions have already happened. However, as most wikis use the same format, the general fixes should largely work ok with most projects, if they don't, then don't use them. Having said that, I can change things like moving disambig tags to only work on en wiki. Martin 10:41, 1 October 2006 (UTC)

Date wikifying bug

A recent edit of Arsenal F.C. using AWB [2] exposed what is probably a bug. AWB looked to wikify all dates in the article; two of the article titles used in the {{cite web}} templates had dates within them, so the code changed from:

{{cite web
 | title=Arsenal Holdings plc Results for the year ended 31 May 2006

to:

{{cite web
 | title=Arsenal Holdings plc Results for the year ended [[31 May]] 2006

This broke the links when they were displayed. I have since reverted these two specific changes, though I kept the rest. [3] Could this bug please be fixed? Thanks. Qwghlm 18:50, 1 October 2006 (UTC)

AWB does not wikify dates, the user operating the software did this, there is no bug. Martin 19:04, 1 October 2006 (UTC)
OK, sorry for the bother. Thanks for clearing that up - I was misled by the edit summary. Qwghlm 19:56, 1 October 2006 (UTC)

I wouldn't call these links broken, they link fine and display fine. Rich Farmbrough, 20:12 4 October 2006 (GMT).

placing categories in the right place

can you include into general cleanup that in :pl.wiki (and others) a category name is [(k|K)ategoria and not only [Kategoria ? and perform change from small (if present) to big letter

[4] – of course I can use a regex, but that would force me to do "general fix" after "find and replace" to have this placed in the right order

gregul 12:18, 2 October 2006 (UTC)

Yeah please, and the first letter of the name of the article in the category too! And for the French Wikipedia, if any letter of the article title is an accented letter, then write the title in the category without the accented letters (ex : [[Catégorie:Travail du bois]] in Élagage des conifères => [[Catégorie:Travail du bois|Elagage des coniferes]]). Thank you very very much! 81.51.91.236 12:31, 2 October 2006 (UTC)
it seems (I'm not sure because I've got a lot of regexes included) that the new AWB changes [[kategoria: to [[Kategoria:, but the first letter of the name still remains small; this could also be applied to [[grafika: etc. (images, for all languages), and maybe to other constructions that I don't remember at this time gregul

Help Creating Regex Expression for Bot

I'd like some help creating a regular expression to make some changes to an edit I recently made with my bot. Specifically, I'd like to replace:

{{Merge-date|MONTH YEAR|RANDOM TEXT}} with {{Merge|RANDOM TEXT|date=MONTH YEAR}}

{{Mergeto-date|MONTH YEAR|RANDOM TEXT}} with {{Mergeto|RANDOM TEXT|date=MONTH YEAR}}

{{Mergefrom-date|MONTH YEAR|RANDOM TEXT}} with {{Mergefrom|RANDOM TEXT|date=MONTH YEAR}}

Thanks, alphaChimp(talk) 01:18, 3 October 2006 (UTC)

I know there's probably a better way, but this worked for me:
Find: {{Merge-date\|(.*?|)\|(.*?|)}}, and replace with {{Merge\|$2|date=$1}}
Find: {{Mergeto-date\|(.*?|)\|(.*?|)}}, and replace with {{Mergeto\|$2|date=$1}}
Find: {{Mergefrom-date\|(.*?|)\|(.*?|)}}, and replace with {{Mergefrom\|$2|date=$1}}
Mets501 (talk) 01:35, 3 October 2006 (UTC)
Don't forget if you're doing this in a case-sensitive fashion, the first letter can be upper or lower case and still be pointing to the same template (due to Wikipedia's Mediawiki configuration). [Mm] would cover that. --kingboyk 11:32, 3 October 2006 (UTC)
Thanks guys. I'm not running it case sensitive. You can check out the effects on Alphachimpbot. One correction to the above syntax (which is awesome) though: {{Merge\|$2|date=$1}} should be {{Merge|$2|date=$1}}, etc. (just remove the "\") alphaChimp(talk) 17:41, 3 October 2006 (UTC)
Single replacement version: Find: {{(Merge[a-z]*)-date\|(.*?|)\|(.*?|)}}, and replace with {{$1|$3|date=$2}} (assuming there are no other merge*-date templates, which would also be caught in this pattern). -- JHunterJ 17:54, 3 October 2006 (UTC)
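For anyone wanting to try the single-replacement idea outside AWB, the sketch below restates it with Python's re module. Two things are assumptions of the example rather than part of the thread: the parameters are matched with [^|{}]* instead of .*? so a match cannot run past the end of the template, and the names MERGE_DATE and fix_merge_tags are invented. Python writes group references as \1 where AWB's .NET syntax uses $1.

```python
import re

# Matches {{Merge-date|...|...}}, {{Mergeto-date|...|...}} and
# {{Mergefrom-date|...|...}}; IGNORECASE covers a lower-case first letter.
MERGE_DATE = re.compile(
    r"\{\{(Merge[a-z]*)-date\|([^|{}]*)\|([^|{}]*)\}\}",
    re.IGNORECASE,
)


def fix_merge_tags(wikitext):
    """Move the date from the first positional parameter to date=..."""
    return MERGE_DATE.sub(r"{{\1|\3|date=\2}}", wikitext)


print(fix_merge_tags("{{Mergeto-date|October 2006|Some article}}"))
```

As noted in the thread, any other template whose name fits Merge*-date would be rewritten too, so the pattern is only safe if no such templates exist.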

Restricting access on en.wikinews

Based on the Typos going in Wikinews:AutoWikiBrowser/Typos, I assumed that if I set up wikinews:Wikinews:AutoWikiBrowser/CheckPage I'd get blocked from editing. Can this be introduced without too much trouble, or is it another page name I need to use to restrict access? --Brianmc 12:44, 3 October 2006 (UTC)

By chance I have just (yesterday) implemented the checkpage to work on other projects, it will be in the next version. Martin 14:06, 3 October 2006 (UTC)

noinclude

Hello, On this edit, AWB tried to move the "Category:Days in 2004" to the bottom, outside of the <noinclude></noinclude>. --After Midnight 0001 02:34, 4 October 2006 (UTC)


div tag move

I made this change yesterday with AWB. As you can see in the diff, the div tag was moved when it shouldn't have been. Is this an easy fix? If it makes a difference, I was not using "apply general fixes". --Kbdank71 10:26, 4 October 2006 (UTC)

What's going on in this edit?

Check it out for yourself: [5]. Unicoding, general fixes, and all the other options were disabled. Alphachimp 12:47, 4 October 2006 (UTC)

That's a very rare bug, will only occur with maths tags inside image links using certain settings, but I have fixed it now anyway in the version I just released. thanks Martin 13:17, 4 October 2006 (UTC)
OK, thanks. I really hope the error wasn't proliferated any further than that one article. (If you check the article, you'll see that the first pass by Alphachimpbot had no difficulties) Alphachimp 14:29, 4 October 2006 (UTC)

Problem with CheckPage

In Main.cs in code

                       else
                       {
                           MessageBox.Show(UserName + " is not enabled to use this.", "Problem", MessageBoxButtons.OK, MessageBoxIcon.Exclamation);
                           System.Diagnostics.Process.Start(Variables.URLShort + "/wiki/Wikipedia:AutoWikiBrowser/CheckPage");
                           return false;
                       }

"/wiki/Wikipedia:" should be replaced with "/wiki/Project:", or it will send non-enwiki users to the non-existent page on en:. MaxSem 17:44, 5 October 2006 (UTC)

I'll change that, thanks. Martin 13:33, 6 October 2006 (UTC)

Latest version

I don't know if it's due to changes you've made in the latest version, whether it's because I cleared my IE cache manually again, or because I turned off image display, but I've just had an AWB run tag 4,400 pages without crashing. That's the best I've managed for a long time. I'll give you the credit Martin, so - thanks very much! :) --kingboyk 13:09, 6 October 2006 (UTC)

I added a timeout function when loading/saving, that may have been it (as well as quite a bit of tweaking), I think the servers have been quick recently as well. Thanks! Martin 13:33, 6 October 2006 (UTC)
Does that mean AWB is pretty much guaranteed not to stall now? My "nudge" feature no longer needed?? --kingboyk 23:57, 6 October 2006 (UTC)

Re: International issue

Problem reported here could be fixed by the following patch:

Index: C:/Projects/AWB/AWB/WikiFunctions/Variables.cs
===================================================================
--- C:/Projects/AWB/AWB/WikiFunctions/Variables.cs	(revision 396)
+++ C:/Projects/AWB/AWB/WikiFunctions/Variables.cs	(working copy)
@@ -50,6 +50,11 @@
         public static Dictionary<int, string> NamespacesCaseInsensitive = new Dictionary<int, string>(24);

         /// <summary>
+        /// Contains unlocalized names of namespaces
+        /// </summary>
+        public static Dictionary<int, string> GenericNamespaces = new Dictionary<int, string>(24);
+
+        /// <summary>
         /// Gets a URL of the site, e.g. "http://en.wikipedia.org/w/".
         /// </summary>
         public static string URL
@@ -656,10 +661,34 @@
                     break;
             }

+            GenericNamespaces[-2] = "Media:";
+            GenericNamespaces[-1] = "Special:";
+            GenericNamespaces[1] = "Talk:";
+            GenericNamespaces[2] = "User:";
+            GenericNamespaces[3] = "User talk:";
+            GenericNamespaces[4] = "Project:";
+            GenericNamespaces[5] = "Project talk:";
+            GenericNamespaces[6] = "Image:";
+            GenericNamespaces[7] = "Image talk:";
+            GenericNamespaces[8] = "MediaWiki:";
+            GenericNamespaces[9] = "MediaWiki talk:";
+            GenericNamespaces[10] = "Template:";
+            GenericNamespaces[11] = "Template talk:";
+            GenericNamespaces[12] = "Help:";
+            GenericNamespaces[13] = "Help talk:";
+            GenericNamespaces[14] = "Category:";
+            GenericNamespaces[15] = "Category talk:";
+            GenericNamespaces[100] = "Portal:";
+            GenericNamespaces[101] = "Portal talk:";
+
             NamespacesCaseInsensitive.Clear();
             foreach (KeyValuePair<int, string> k in Namespaces)
             {
-                NamespacesCaseInsensitive.Add(k.Key, Tools.CaseInsensitive(k.Value));
+                string s = Tools.CaseInsensitive(k.Value);
+                if (GenericNamespaces.ContainsKey(k.Key) && GenericNamespaces[k.Key] != k.Value)
+                    s = "(" + s.TrimEnd(':') + "|" + Tools.CaseInsensitive(GenericNamespaces[k.Key].TrimEnd(':')) + "):";
+
+                NamespacesCaseInsensitive.Add(k.Key, s);
             }
         }
     }

Index: C:/Projects/AWB/AWB/WikiFunctions/Parsers.cs
===================================================================
--- C:/Projects/AWB/AWB/WikiFunctions/Parsers.cs	(revision 396)
+++ C:/Projects/AWB/AWB/WikiFunctions/Parsers.cs	(working copy)
@@ -429,7 +429,7 @@

             foreach (Match m in catregex.Matches(ArticleText))
             {
-                x = cat + m.Groups[1].Value.Replace("_", " ") + "]]";
+                x = cat + m.Groups[m.Groups.Count-1].Value.Replace("_", " ") + "]]";
                 ArticleText = ArticleText.Replace(m.Value, x);
             }

Hope it helps. MaxSem 14:20, 6 October 2006 (UTC)
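The idea behind the patch can be sketched in a few lines: accept either the localized or the generic (English) namespace prefix, and take the page name from the last capture group, because the added alternation introduces an extra group. The sketch below is in Python, not the project's C#; case_insensitive is only a hypothetical stand-in for Tools.CaseInsensitive (whose exact behaviour is not shown here), and the Danish sample text is made up.

```python
import re


def case_insensitive(name):
    # Hypothetical stand-in for Tools.CaseInsensitive: accept either case
    # for the first letter and match the rest literally.
    return "[%s%s]%s" % (name[0].upper(), name[0].lower(), re.escape(name[1:]))


def namespace_pattern(localized, generic):
    """Regex fragment matching the localized prefix or, if different, the generic one."""
    loc = case_insensitive(localized.rstrip(":"))
    if localized == generic:
        return loc + ":"
    gen = case_insensitive(generic.rstrip(":"))
    return "(" + loc + "|" + gen + "):"


def category_names(wikitext, localized, generic):
    cat = re.compile(r"\[\[" + namespace_pattern(localized, generic) + r"(.*?)\]\]")
    # The alternation may add a capture group, so read the last group,
    # mirroring m.Groups[m.Groups.Count-1] in the patch.
    return [m.group(cat.groups).replace("_", " ") for m in cat.finditer(wikitext)]


print(category_names("[[Kategori:Fysik]] [[category:Kemi]]", "Kategori:", "Category:"))
```

Reading the last group rather than group 1 keeps the same code working whether or not the alternation was added for a given language.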

Thanks. Martin 19:30, 6 October 2006 (UTC)

Request comment

User:Tobias Conradi is disrupting the checkpage, please can someone else comment Wikipedia_talk:AutoWikiBrowser/CheckPage#AWB_out_of_WP_policy. Martin 19:37, 6 October 2006 (UTC)

Added to the section

I have added to an existing section above. Please respond there.- Ganeshk (talk) 18:50, 7 October 2006 (UTC)

Insert or insert tag context menus

Could we get context menu items for these regularly added categories please Martin? -

Thanks. --kingboyk 15:01, 9 October 2006 (UTC)

Bot timer

I think you missed my post above. —Mets501 (talk) 15:08, 9 October 2006 (UTC)

Regex help needed

I'm a complete noob with regexes. Is it possible to do the following change with AWB:

Interstate X.svg or Interstate_X.svg to I-X.svg, where X is a one-to-four character string

If so, how would I do this? If not, is there any workaround to achieve a variable-like effect? Thank you. --NE2 22:11, 9 October 2006 (UTC)

Yes, find: Interstate( |_)(.*?|).svg and replace with I-$2.svg. This is completely untested but should work. —Mets501 (talk) 01:14, 10 October 2006 (UTC)
Thank you for the speedy reply. There's a problem though; that also finds [[Interstate 495 (Massachusetts)|I-495]] and replaces it with [[I-495 (Massachusetts)|I-495]]. I believe this is because the variable can be any length, so it finds this false positive as long as there is a .svg somewhere after it. --NE2 01:32, 10 October 2006 (UTC)
Interesting, try find: Interstate( |_)([0-9]|)([0-9]|)([0-9]|)([0-9]|).svg and replace with I-$2$3$4$5.svg. —Mets501 (talk) 02:35, 10 October 2006 (UTC)
I actually wanted Interstate( |_)([0-9A-Z]|)([0-9A-Z]|)([0-9A-Z]|)([0-9A-Z]|).svg, since there are routes like I-H201 and I-35E. Thank you. --NE2 03:14, 10 October 2006 (UTC)
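A more compact way to express the same restriction is a bounded character class with an escaped dot. The sketch below checks it in Python on made-up sample strings; SHIELD and shorten_shield_names are names invented for the example, and in AWB the replacement would be written I-$1.svg instead of I-\1.svg.

```python
import re

# One to four digits or capital letters between "Interstate " (or
# "Interstate_") and a literal ".svg".
SHIELD = re.compile(r"Interstate[ _]([0-9A-Z]{1,4})\.svg")


def shorten_shield_names(wikitext):
    return SHIELD.sub(r"I-\1.svg", wikitext)


print(shorten_shield_names("[[Image:Interstate 95.svg|20px]]"))
print(shorten_shield_names("[[Interstate 495 (Massachusetts)|I-495]]"))
```

Because the name must be followed immediately by ".svg", ordinary article links such as the Massachusetts example are left alone.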

"Messages can only be appended to talk pages"

I'm trying to nominate a large number of images for deletion, but I can't add the "message", since it only lets me do it on talk pages. Is there any way to bypass this? --NE2 01:51, 10 October 2006 (UTC)

I'll change this to work on all pages, as it is a fairly common question. Martin 11:17, 11 October 2006 (UTC)

Problem with category sorting

On Interstate 444, AWB wanted to replace [[Category:Interstate 44| 4]] with [[Category:Interstate 44|4]]. --NE2 04:08, 10 October 2006 (UTC)

Ok, I'll fix this. thanks Martin 11:17, 11 October 2006 (UTC)
Thank you. --NE2 11:41, 11 October 2006 (UTC)
was [[Category:Interstate 44|4]] wrong ? what's the difference between this and previous with extra space ? gregul
Note how in Category:Interstate 44, I-444 is sorted in the "blank" section. Removing the space would move it below US 277. Category:Interstate 95 is a better example; all the state articles are first. --NE2 21:35, 11 October 2006 (UTC)
I guess, it changes [[Kategoria:Argentyna| ]] as well, to [[Kategoria:Argentyna|Argentyna]] in pl:Argentyna pl:wikipedysta:gregul


IRCMonitor tweak?

Hi Bluemoose: I figure the IRCMonitor included with AWB should have an option "Only if comment matches regex:" similar to the existing "Only if title matches regex:". Perhaps you would consider that in a future release? Thanks. –Outriggr § 04:27, 12 October 2006 (UTC)

I haven't worked on this for a while, but it is a worthwhile suggestion. thanks Martin 18:09, 12 October 2006 (UTC)

br

Can AWB automatically change <br /> tags to <br />? — AnemoneProjectors (talk) 13:01, 12 October 2006 (UTC)

It isn't really needed - MediaWiki does it automatically when translating wikitext to HTML. MaxSem 14:03, 12 October 2006 (UTC)

bold bug

it's definitely in "apply general fixes", because after unsetting it, it doesn't try to make a change like this [6] gregul

I've done a load more tweaking to make things work better in other languages, so this will be fixed in next release. Martin 18:09, 12 October 2006 (UTC)

List of recently used configurations

Patch is there. MaxSem 17:49, 12 October 2006 (UTC)

Thanks, just committed it. Martin 18:09, 12 October 2006 (UTC)

Hanging

With the newest version, 3.0.4.1, I'm getting frequent "hangs" after I click on the "Save" button. Sometimes, yes, it's just slow, and if I wait a minute it'll automatically move on to the next article in the list, but sometimes it's gotten stuck for several minutes, until I click on "Ignore", at which point it wakes up and moves on. When I check the related article, the save is getting processed, such that if I instead click "Stop" and then restart the process, it skips the article as having already been handled. The problem seems to be occurring about 50% of the time today... Is it just a problem with sluggish servers, or is there perhaps some other bug going on? --Elonka 19:10, 12 October 2006 (UTC)

(update) Another odd behavior worth mentioning, is that it will display the correctly "saved" article in the upper panel, before it freezes. So it's definitely getting some feedback from the server. --Elonka 19:16, 12 October 2006 (UTC)
Quite possibly due to server problems, but I have fixed a few things anyway, so hopefully the problem will disappear one way or other. Martin 15:10, 14 October 2006 (UTC)

Help with regular expressions

Hello all! I'm just wondering if it's possible to do the following with AutoWikiBrowser using expressions...

Convert 41° 35' 08.1000" N<br />70° 32' 25.3000" W to {{coor dms|41|35|08.1000|N|70|32|25.3000|W|type:airport}}

For the N/S set, the first number can be 1-3 digits, the second 1-2 digits, the third 1-2 and 0-4 (##.###). The same pattern repeats for the E/W set.

I've just been searching for °, and it's been working, but I have some 1,000 articles to do and it's quite tedious to have to manually reformat each entry. Thadius856 02:57, 13 October 2006 (UTC)

The following works (I just used it on Winder-Barrow Airport):
([0-9]|)([0-9]|)° ([0-9]|)([0-9]|)' ([0-9]|)([0-9]|)(.|)([0-9]|)([0-9]|)([0-9]|)([0-9]|)" ([A-Z]|)( |)<br />( |)([0-9]|)([0-9]|)([0-9]|)° ([0-9]|)([0-9]|)' ([0-9]|)([0-9]|)(.|)([0-9]|)([0-9]|)([0-9]|)([0-9]|)" ([A-Z]|)
to
{{coor dms|$1$2|$3$4|$5$6$7$8$9$10$11|$12|$15$16$17|$18$19|$20$21$22$23$24$25$26|$27|type:airport}}
If there are other optional spaces, use more ( |) and don't forget to increment the variable numbers. --NE2 03:02, 13 October 2006 (UTC)

Works great. Thanks! Just for the record, I was wrong before... the N/S coordinate only goes to 90 deg at the poles (so there's no need for a 28th digit too, as N/S can never be 3 digits). It seems you already knew this, though. :) Thadius856 20:54, 13 October 2006 (UTC)

I haven't tested it, but you might also try:
([0-9]{1,2})° ([0-9]{1,2})' ([0-9]{1,2})(\.[0-9]{0,4}|)" ([NS]?) *<br /> *([0-9]{1,3})° ([0-9]{1,2})' ([0-9]{1,2}\.?[0-9]{0,4})" ([EW]?)
to
{{coor dms|$1|$2|$3$4|$5|$6|$7|$8|$9|type:airport}}
A little shorter, and I think the period needs the backslash to escape it, even with the other version. -- JHunterJ 21:06, 13 October 2006 (UTC)

Both seem to be working. However, I'm running into a problem with (some) articles where somebody input it across two lines.

coordinates = 33° 42' 50.8000" N
96° 40' 25.2000" W |

I tried setting both versions to multiline="true" and have had no luck. So, I attempted to modify NE2's version (placing the original as singleline="true" and my version as multiline="true"), but apparently I'm not advanced enough to figure it out. Here's what I tried:

([0-9]|)([0-9]|)° ([0-9]|)([0-9]|)' ([0-9]|)([0-9]|)(.|)([0-9]|)([0-9]|)([0-9]|)([0-9]|)" ([A-Z]|)([0-9]|)([0-9]|)([0-9]|)° ([0-9]|)([0-9]|)' ([0-9]|)([0-9]|)(.|)([0-9]|)([0-9]|)([0-9]|)([0-9]|)" ([A-Z]|)
to
{{coor dms|$1$2|$3$4|$5$6$7$8$9$10$11|$12|$13$14$15|$16$17|$18$19$20$21$22$23$24|$25|type:airport}}

It looks to me like it should work, but it's not finding it to make changes in the first place. :\ Thadius856 21:14, 13 October 2006 (UTC)

Note: An example to play with would be Grayson County Airport Thadius856 21:16, 13 October 2006 (UTC)
([0-9]{1,2})° ([0-9]{1,2})' ([0-9]{1,2})(\.[0-9]{0,4}|)" ([NS]?) *(?:<br />|\s+) *([0-9]{1,3})° ([0-9]{1,2})' ([0-9]{1,2}\.?[0-9]{0,4})" ([EW]?)
to
{{coor dms|$1|$2|$3$4|$5|$6|$7|$8|$9|type:airport}}
works on the example page. -- JHunterJ 22:14, 13 October 2006 (UTC)
Here's the expression I used for tidying the RamBot articles:
<datagridFAR find="(\d+)°(\d+)'(\d+)"\s*(N|S)o(u|r)th,\s*(\d+)°(\d+)'(\d+)"\s*(W|E)(est|ast)" replacewith="{{coor dms|$1|$2|$3|$4|$6|$7|$8|$9|city}}" />
However be aware it's not tested for the southern or eastern hemispheres. Rich Farmbrough, 11:35 14 October 2006 (GMT).
It doesn't appear to handle decimals in the seconds, but I do like \d better than [0-9] though, and it handles spelled-out compass points. Again untested:
(\d{1,2})°\s*(\d{1,2})'\s*(\d{1,2}(?:\.\d{0,4})?)"\s*(N(?:orth)?|S(?:outh)?)(?:\s*<br />\s*|\s+)(\d{1,3})°\s*(\d{1,2})'\s*(\d{1,2}(?:\.\d{0,4})?)"\s*(E(?:ast)?|W(?:est)?)
to
{{coor dms|$1|$2|$3|$4|$5|$6|$7|$8|type:airport}}
--JHunterJ 11:51, 14 October 2006 (UTC)
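
For anyone who wants to try an expression of this shape outside AWB, here is a minimal Python sketch. It is not AWB's own code: AWB uses .NET regular expressions with $1-style replacements, whereas Python writes backreferences as \1, and the names LAT, LON, SEP and to_coor_dms are made up for illustration. The pattern is split into pieces only for readability.

```python
import re

# Latitude: degrees, minutes, seconds (optional decimals), then N/S or North/South.
LAT = r"""(\d{1,2})°\s*(\d{1,2})'\s*(\d{1,2}(?:\.\d{0,4})?)"\s*(N(?:orth)?|S(?:outh)?)"""
# Longitude: same shape, up to three degree digits, then E/W or East/West.
LON = r"""(\d{1,3})°\s*(\d{1,2})'\s*(\d{1,2}(?:\.\d{0,4})?)"\s*(E(?:ast)?|W(?:est)?)"""
# The two halves may be separated by a <br /> tag or by any whitespace,
# including a line break, so coordinates split across two lines still match.
SEP = r"(?:\s*<br />\s*|\s+)"

COORD_RE = re.compile(LAT + SEP + LON)

# Python spells the backreferences \1..\8 where AWB would use $1..$8.
TEMPLATE = r"{{coor dms|\1|\2|\3|\4|\5|\6|\7|\8|type:airport}}"


def to_coor_dms(wikitext: str) -> str:
    """Replace every matching coordinate pair with a coor dms template call."""
    return COORD_RE.sub(TEMPLATE, wikitext)


if __name__ == "__main__":
    sample = "coordinates = 33° 42' 50.8000\" N\n96° 40' 25.2000\" W |"
    print(to_coor_dms(sample))
```

Because \s already matches a newline, the two-line case needs no multiline or singleline option here. Note that spelled-out compass points are passed through unchanged (for example "North" rather than "N"), so the template would have to accept them.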

Mathematics articles and Unicode

I have added a note to the caution "Don't do anything controversial with it" to notify do-gooders that the Wikipedia mathematics community has repeatedly confronted the issue of automatic conversion of entities to Unicode, and firmly objects. Individuals writing articles are free to use UTF-8 characters if they prefer, but existing entities (like “&theta;”) should not be replaced. (Because many characters do not have HTML entity names, I keep a private page with a large table to use in my own edits, and inform newcomers of Code2000 and the upcoming STIX fonts.) We are not anti-Unicode, nor anti-bot; this consensus involves issues special to mathematics, such as coexistence with TeX markup and MathML. Your cooperation will be appreciated. --KSmrqT 05:01, 13 October 2006 (UTC)

I'll change the code so maths articles are ignored automatically. Though I have to say, there is hardly a firm objection in the given discussions. In fact, the logic used against unicode is highly flawed, as unicode is already used virtually everywhere it could be. Martin 09:23, 13 October 2006 (UTC)
Many thanks. One underlying problem is that mathematics markup for Wikipedia is an ugly mishmash of TeX, wiki, HTML, and UTF-8. There are bugs and limitations in the first two, and browser problems with the latter. The mathematics community outside of Wikipedia has standardized on LaTeX, and a project is in the works to make a better converter than texvc; it's called blahtex, and can produce lovely MathML. It already works well by itself, but use of MathML in Wikipedia output requires the pages generated by MediaWiki to be valid XML, and the developers have been slow to make the necessary fixes. The goal is to be able to use LaTeX markup exclusively, for both inline and display equations, and have it always look readable, attractive, and consistent. Since the TeX is going to use names like theta, it is preferable to use named entities for the same characters where possible.
For example, the fundamental trigonometric identity is lovely as a display when marked up as TeX.
<math> \sin^2 \theta + \cos^2 \theta = 1 \,\!</math>
But it is highly unsatisfactory inline, where the font sizes, font faces, and baselines clash with the surrounding text. In fact, TeX theta alone looks bad inline, coming out differently depending on whether it is converted to a PNG image or not. Instead we can write sin² θ+cos² θ = 1 in an inline context, using wiki markup.
sin<sup>2</sup> &theta;+cos<sup>2</sup> &theta; = 1
This illustrates the kind of crap Wikipedia mathematicians have to live with today. And whenever a well-meaning bot gleefully changes all our &theta; markup to θ, it not only makes our life more difficult in the present, but complicates our conversion to all-TeX markup (with beautiful consistent MathML typesetting) in the future.
Bear in mind that I am only relaying one of the arguments from our extended discussions, and not trying to do justice to all the issues raised. But I hope this helps others better appreciate that the request is not Luddite.
As an example of a wiki markup bug, nested superscripts should be able to use the obvious markup.
''A''<sup>''B''<sup>''C''</sup></sup>
No such luck, as the bug converts this to ABC. Instead, we must use a dodge like
''A''<sup>''B''<span><sup>''C''</sup></span></sup>
to get the desired ABC.
So thanks for your kindness in amending the bot; we can use all the help we can get. --KSmrqT 10:55, 13 October 2006 (UTC)
One likely way forward is to use AWB, once Blahtex is ready to rock an' roll, to convert all the current maths display methods. This would mean that there is little harm in unicodifying at present. What do you think? Rich Farmbrough, 11:28 14 October 2006 (GMT).
You mean like changing a&nbsp;+&nbsp;b&nbsp;=&nbsp;c to <math>a+b=c</math>? That would be really hard, I think. —Mets501 (talk) 16:39, 14 October 2006 (UTC)
No, changing &theta; to θ makes life more difficult for mathematical editors even in the present. People who do not write a lot of mathematics think WYSIWYG is a boon. Not necessarily. Do a time-and-motion study, in the style of GOMS, for typing the name versus selecting a special character from a menu. Then add in the penalty of inconsistency, using \theta in displayed TeX versus a UTF-8 character inline. If we had a two-view editor, in the mold of Lilac, then we might get the best of both worlds. I don't see that happening here any time soon. --KSmrqT 22:58, 14 October 2006 (UTC)

white space removal

Hi! I have a suggestion for the 'skip' section: could a checkbox saying "skip if only whitespace is removed" be created, to help bots decrease the server load? ST47Talk 10:23, 13 October 2006 (UTC)

Hi, the "Skip when no changes" does do this to a certain extent, but to be honest, a bot should be automatically skipping articles when the change it is making is not done. The only time this may not be possible is when the change is part of the "General fixes", but that is why I made the "More" skip options. Martin 10:37, 13 October 2006 (UTC)
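
As a rough illustration of the check being requested, here is a small Python sketch. It is not how AWB implements its skip options, and the function name only_whitespace_changed is hypothetical. It treats an edit as skippable when the text before and after differs only in whitespace.

```python
def only_whitespace_changed(before: str, after: str) -> bool:
    """Return True when the two texts differ, but only in their whitespace."""
    if before == after:
        # Nothing changed at all; that case is covered by "Skip when no changes".
        return False

    def strip_ws(text: str) -> str:
        # str.split() with no argument splits on any run of whitespace,
        # so joining the pieces removes spaces, tabs and newlines alike.
        return "".join(text.split())

    return strip_ws(before) == strip_ws(after)


if __name__ == "__main__":
    print(only_whitespace_changed("Some text  \n\n\n", "Some text\n"))
    print(only_whitespace_changed("Some text", "Some other text"))
```

A bot could call such a check just before saving and skip the page when it returns True, which avoids a save that only trims blank lines or trailing spaces.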

stub on bottom

It isn't good if "section stub" is moved to the end, because a section stub relates to a section, not to the whole article. Could this option be unset in some way? In other languages, moving stubs to the end isn't welcome. gregul

Actually, AWB has some code to prevent this. Could you provide a sample page where AWB misbehaves? MaxSem 11:22, 14 October 2006 (UTC)
It looks like he's talking about other languages, where the word for section is different. --NE2 23:00, 14 October 2006 (UTC)
I see. So far, the "Apply general fixes" option is for en WP only. MaxSem 07:31, 15 October 2006 (UTC)
Many "general fixes" are useful everywhere, but yes, the name is {sekcja stub}. I also thought about not moving {stub}, or placing it after the article (before categories); many admins' bots move it this way. gregul

Error with 3.0.41

"The system cannot find the file specified. (Exception from HRESULT: 0x80070002)" Problem occurring at beginning of run, where I am having problems with AWB confirming that login and authorisation are there. (Similar to my problem above, but harder to shift.)

Now: Unhandled exception: perhaps this data will help.


************** Exception Text **************
System.NullReferenceException: Object reference not set to an instance of an object.
   at WikiFunctions.Browser.WebControl.set_Status(String value)
   at WikiFunctions.Browser.WebControl.IncrememntTime(Object sender, EventArgs e)
   at System.Windows.Forms.Timer.OnTick(EventArgs e)
   at System.Windows.Forms.Timer.TimerNativeWindow.WndProc(Message& m)
   at System.Windows.Forms.NativeWindow.Callback(IntPtr hWnd, Int32 msg, IntPtr wparam, IntPtr lparam)

Loaded assemblies in a comment here. I'd look at the code, but I can't seem to sync tortoise SVN anymore.

Regards, Rich Farmbrough, 14:28 14 October 2006 (GMT).

I've fixed the NullReferenceException. Googling for "HRESULT: 0x80070002" it seems this error is normally caused by a problem with the .NET framework installation or (rather strangely) if the computer has some kind of adware running on it. See [7] and [8]. thanks Martin 15:05, 14 October 2006 (UTC)
Brilliant. Will leave feedback later. Rich Farmbrough, 16:15 14 October 2006 (GMT).
When is the next release? Rich Farmbrough, 17:23 15 October 2006 (GMT).
No fixed date, probably within the next week. Martin 19:23, 15 October 2006 (UTC)

Bug fix for other projects

I've been using AWB on a number of projects, especially Commons. AWB knows where to look for its files, but it would be helpful if the shortcut in the edit summary was not WP:AWB but COM:AWB (the short version there). This would also be true for Wikinews (WN), Wikibooks (WB) and so on. Thanks.--Nilfanion (talk) 02:22, 15 October 2006 (UTC)

Fixed for Commons and Meta, but this system requires some rewriting. Currently, the summary tag and project namespace depend on language only, making no distinction between WP and WS, for example. I'll write code to load namespaces directly from the server. MaxSem 07:42, 15 October 2006 (UTC)
BTW, there is no such redirect as commons:COM:AWB, and I can't find a local AWB page on Commons. Perhaps a link to the en WP page should be given? MaxSem 09:21, 15 October 2006 (UTC)
Heh, we are going to write up the commons local page soon.--Nilfanion (talk) 13:48, 15 October 2006 (UTC)

Wow, rewrote a lot of code. Can someone more proficient in English check the new messages before I commit them:

  • An error occured while loading project information from the server. Please make sure that your internet connection works and such combination of project/language exist.
  • Error loading namespaces.
  • Defaulting to the English Wikipedia settings.

Thanks. MaxSem 10:22, 15 October 2006 (UTC)

There is an English problem with this sentence:
  • Please make sure that your internet connection works and such combination of project/language exist.
It might be corrected in different ways, depending on what it should mean.
  1. Please make sure that your internet connection works and that such a combination of project and language exists.
  2. Please make sure that your internet connection works and that such combinations of project and language exist.
Or you might mean something else. I'm not sure how to interpret the "/", whether as "and", "or", or something else. --KSmrqT 13:06, 15 October 2006 (UTC)
I'd really rather hardcode the namespaces in than load them up at runtime. Martin 10:49, 15 October 2006 (UTC)
The problem is that such hardcoding would need to be much more complicated than it is currently. Different projects of the same language may have a different set of namespaces, and with growing numbers of new projects, especially Wikiversities, keeping this list up to date would become harder and harder. Take a look at my implementation here. MaxSem 11:13, 15 October 2006 (UTC)
There are currently about 700 Wikimedia projects, without looking at wikia etc. Rich Farmbrough, 12:02 15 October 2006 (GMT).
What would be good is if the namespaces were only loaded if they were not hardcoded in already, for example we could hardcode the namespaces for wikipedia projects, as these are the most common, but make it load the namespaces for other projects at runtime. Martin 12:55, 15 October 2006 (UTC)
I'll think about a namespace cache. MaxSem 15:31, 15 October 2006 (UTC)
A combination of your code plus the existing code could do this very simply. Martin 15:51, 15 October 2006 (UTC)
Existing code typically supports only wikipedias in other languages. For example, no: Namespaces[4] = "Wikipedia:";, and therefore AWB will not function fully on no.wikibooks, no.wikisource and no.wikiquote. Adding nested switch statements to fix this will complicate the code even more. MaxSem 16:49, 15 October 2006 (UTC)
I just committed what I meant; this way the namespaces are only loaded for sites that are not wikipedia, but otherwise your method is used to get them. Martin 19:23, 15 October 2006 (UTC)
Looks good. MaxSem 20:06, 15 October 2006 (UTC)
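
For readers following the design discussed in this thread, here is a minimal Python sketch of the idea: hardcoded namespaces for Wikipedia projects, a runtime load with a cache for everything else, and a fall-back to the English Wikipedia settings on error. It is not the committed AWB code (which is C#). The names HARDCODED_WIKIPEDIA, get_namespaces, fetch and cache are all hypothetical, and the fetch callable stands in for whatever actually asks the server.

```python
from typing import Callable, Dict, Tuple

Namespaces = Dict[int, str]

# Hypothetical hardcoded table, keyed by language, for Wikipedia projects only.
# The "no" entry follows the Namespaces[4] = "Wikipedia:" example in the thread.
HARDCODED_WIKIPEDIA: Dict[str, Namespaces] = {
    "en": {4: "Wikipedia:"},
    "no": {4: "Wikipedia:"},
}


def get_namespaces(
    project: str,
    lang: str,
    fetch: Callable[[str, str], Namespaces],
    cache: Dict[Tuple[str, str], Namespaces],
) -> Namespaces:
    """Return namespaces for a project/language pair.

    Wikipedia projects with a hardcoded entry never touch the network.
    Anything else is loaded once through `fetch` and then served from `cache`.
    """
    if project == "wikipedia" and lang in HARDCODED_WIKIPEDIA:
        return HARDCODED_WIKIPEDIA[lang]

    key = (project, lang)
    if key not in cache:
        try:
            cache[key] = fetch(project, lang)
        except OSError:
            # Assumed failure mode: loading failed, so default to the
            # English Wikipedia settings, as the proposed message says.
            return HARDCODED_WIKIPEDIA["en"]
    return cache[key]
```

The cache is passed in rather than kept as a global so that a caller can decide how long it lives, for example per session or saved between runs.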