Wikipedia talk:List of Wikipedians by number of edits/Archive 4


Any updates soon?

I know the stats behind this page haven't been updated since May 16, 2005 (according to [1]), but the data here is even older, last updated on April 27, 2005. Any chance we could update this list to the most recent dataset? Also, any word on when the dataset will be updated? - jredmond 21:28, 15 Jun 2005 (UTC)

Check with User:ClockworkSoul, as he did the last two updates. --tomf688(talk) 19:32, Jun 18, 2005 (UTC)
Just a note - User:ClockworkSoul has made no edits since 21 May. I haven't tried to reach him by email. -- Rick Block (talk) 19:51, Jun 18, 2005 (UTC)
Damn. I'd be in this list right now if there was an update - I'd be a WP:1000 person! Come on ClockworkSoul :P, update this baby. Harro5 July 2, 2005 09:19 (UTC)
The CSV file that's supposed to be updated weekly is from May 27, 2005. 2004-12-29T22:45Z July 5, 2005 06:46 (UTC)

There was a database dump from just before the 1.5 release (June 15th, I think). Not sure why this never made it to the stats pages (and hence the CSV). However, the stats script is incompatible with 1.5, so there will be no updates down this route for a while. Pcb21| Pete 5 July 2005 07:26 (UTC)

According to HTTP HEAD, StatisticsUsers.csv was "Last-Modified: Fri, 27 May 2005 14:46:17 GMT". The last database dump was 20050623.
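(For anyone repeating the check, the Last-Modified header can be read with a HEAD request; a one-line sketch, with the path shown only as a placeholder for wherever StatisticsUsers.csv actually lives:)
# -I sends a HEAD request; substitute the real location of the CSV file
curl -sI 'http://download.wikimedia.org/.../StatisticsUsers.csv' | grep -i 'Last-Modified'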

Who is in charge of creating the CSV file (who used to do it)? Who is in charge of creating the database dumps? We need to talk to these people to see when the new versions will be made.

Also, I have no idea where to get a full list of bots. If anyone can point to one, or create one, that would be useful.

I'd be willing to write a script to generate the table, either from the CSV or the database dump, but if it's from the db dump I might not be able to run it. My server, which is connected to the internet, only allows a certain amount of traffic, and Verizon still hasn't provided me with DSL. I might be able to use a friend's, though, if someone tells me when the next dump is going to be available. anthony 14:24, 11 July 2005 (UTC)

Brion Vibber for the dumps themselves. Erik Zachte will be able to tell you his plans for making a new stats script that works with 1.5. His plans may or may not include the CSV by-product of course. Pcb21| Pete 16:04, 11 July 2005 (UTC)
I've written scripts to generate the tables given the csv file. Anyone think it's worth updating the article to the May 27 csv data? I'll post the scripts somewhere soon. -- Rick Block (talk) 23:14, July 11, 2005 (UTC)
There must be half-a-dozen such scripts by now :) Classic bicycle-shed problem - we can all solve the easy bit but none of us the hard bit. Update if you want to... Also, remember that Kate's Tool (http://kohl.wikimedia.org/~kate/cgi-bin/count_edits.cgi) is an easy way to get your own edit count. Pcb21| Pete 07:40, 12 July 2005 (UTC)
Welp, Verizon just informed me that I'm not getting DSL. So unless someone wants to give me developer access, you can count me out. anthony 21:47, 18 July 2005 (UTC)

Note there's a new place you can run SQL queries on database dumps: see the lovely WikiSign site set up by Benutzer:Filzstift!! It has an English dump from June 23. — Catherine\talk 20:29, 19 July 2005 (UTC)

Nice (for other uses), but to gather these stats from an SQL query you'd need the full history dump, not just the current articles. anthony 23:54, 20 July 2005 (UTC)

I see that there is now an XML dump, as of 7/13. If someone runs "wget http://download.wikimedia.org/wikipedia/en/20050713_pages_full.xml.gz -q -O - | gunzip -c | grep -e '<contributor>\|<title>' >count.txt" it would be simple to generate the table from that file, and that file could be compressed small enough for someone else to download. anthony 00:28, 21 July 2005 (UTC)

I'm doing that now, let's see how big a file I get. (I have access to a modest-sized Linux machine at MIT, so it has plenty of bandwidth.) The output file is gonna be huge - it's up to 6MB and it's only up to "Abbreviation". Does someone have a program which will grovel over that file and produce StatisticsUsers.csv, or some useful equivalent? If necessary, I could hack up a program to do it (in C, of course, so it'll be blindingly fast :-). Oh, BTW, your command above had a bug in it - the "gunzip" element of the pipeline should have been just "gunzip -c -d -f". Noel (talk) 20:24, 21 July 2005 (UTC)
PS: An optimization would be to include a "grep -v -e '<ip>'" stage in the pipe as well; I assume we have no interest in counting anons. I don't suppose there's a grep that allows you to do both at once, is there? I suppose it would be easy enough to whip up a specialized program that did it, all in one fell swoop. Actually, it would probably be easy to write that as a front-end on the count program, and pipe the output of the gunzip straight into the counter (once it's debugged). Noel (talk) 20:46, 21 July 2005 (UTC)
Hmm. Seems like that might not be a good optimization. Looking at the output, I see lines like: "<contributor><ip>TimShell</ip></contributor>" which presumably result from the reassignment of an edit from an anon to an editor. Hmmmm... Noel (talk) 21:47, 21 July 2005 (UTC)
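(A single awk pass can apply both the include and the exclude patterns at once - a sketch only, assuming the one-line record forms quoted in this thread; note it would also drop reassigned-edit lines like the <ip>TimShell</ip> one above, which share the <ip> tag:)
# keep <title> and <contributor> lines, drop <ip> records, in one pass
gawk '(/<title>/ || /<contributor>/) && !/<ip>/' pages_full.xml > count.txt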
Yea, I d/l'd it last night, gzipped - 6GB I think it was - and left grep running on my server; hadn't had a chance to check to see if it was done yet. If it is, yer more than welcome to use the output.txt. I was gonna write a similar small prog to parse it the rest of the way. Who?¿? 20:34, 21 July 2005 (UTC)
Hmm, well there's no point both of us doing it! If you have the output file already (run it through grep -v as above, to filter out the useless anon lines, and see how big the result is), there's no need for me to do this run, I'll cancel mine. Noel (talk) 20:46, 21 July 2005 (UTC)
PS: It's 40MB (plus anons) at "Christmas Tree", so the output file will likely be about 300-500MB. Noel (talk) 20:50, 21 July 2005 (UTC)
Well, it terminated, but I think it may have aborted (that's the problem with running this over a single TCP connection for a mongo database); the resultant file is only 55MB long. How long is your output file? I can always try and run mine again later tonight, if needed. Still, enough for me to whip up the counting program. Noel (talk) 21:47, 21 July 2005 (UTC)
Sorry, I hadn't had a chance to get to the serv yet. Yea, I d/l'd the entire dump instead of parsing it on the fly, so as not to bog down the server, so I got the whole 6GB compressed; I'm sure the output will be in the range you said. I should have it here shortly to see, and I'll provide a url; it'll be txt, so won't be long in d/l. Who?¿? 22:43, 21 July 2005 (UTC)
No good, my server cried at about 80% grep completion, so I have a 25mb output file with lots of dups and anons. Can't re-try it right now either. So I have to defer back to Noel or anthony. Sorry guys. Who?¿? 23:56, 21 July 2005 (UTC)
Not to worry; I'll try it later tonight, when things are less loaded, and hopefully it'll complete. If not, I'll see if I can get shell access to run it directly on download once I have it running; why take the data to Mohammed, right? It's parsing the input file OK now; I just have to allocate an array of counters and increment the appropriate one, depending on which namespace the article is in, which I have to figure out (should be easy to code up). Noel (talk) 00:56, 22 July 2005 (UTC)

"The output file is gonna be huge - it's up to 6MB and it's only up to 'Abbreviation'." Hmm, it'll probably zip up pretty good, but maybe it'll still be unreasonably big. "Does someone have a program which will grovel over that file and produce StatisticsUsers.csv, or some useful equivalent?" I figure it wouldn't be hard to make a perl script, if nothing else. You'd have to check things a little better - for instance, any article with "<contributor>" in it would be included - but other than that just set up a hash table of contributors and increment for each one. The title I suppose was unnecessary for this usage, but I can think of a few other uses for that info. "Oh, BTW, your command above had a bug in it - the 'gunzip' element of the pipeline should have been just 'gunzip -c -d -f'." Ah, right, I was testing this in two stages at first, and then forgot to take out the filename when I combined it into one. "PS: It's 40MB (plus anons) at 'Christmas Tree', so the output file will likely be about 300-500MB." That's reasonable in itself, but it'll probably compress extremely well. You could even run it back through gzip in the pipe if you're short on disk space. "Well, it terminated, but I think it may have aborted" You might be able to quickly find the last article title using the "tac" command. Anyway, I don't know if wget can be rigged up to do a file resume to stdout using a specified byte range. Then again, maybe the problem is with the web server. Damn, this is annoying not having my DSL connection :). anthony 22:28, 21 July 2005 (UTC)
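(The hash-table counting step described above takes only a few lines of awk - a rough sketch, assuming one-line <contributor><username>NAME</username>...</contributor> records like those quoted in this thread; the multi-line contributor form in the dump would need a slightly smarter parser:)
# tally edits per username from the grepped dump, then sort by count
gawk -F'</?username>' '/<username>/ { count[$2]++ }
  END { for (u in count) print count[u] "," u }' count.txt |
  sort -t, -k1,1nr > counts.csv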

ROTFLMAO! I'm a total fossil, I don't know from Perl. It's being written in C, but that's OK, I can emit C code about as fast as I can type. It's almost done (see above). You don't need to do a hash table, there's that nice <id> field there that you can use to index an array. Noel (talk) 00:56, 22 July 2005 (UTC)
OK, it's working. What format output should it produce - the exact same format as StatisticsUsers.csv, or should I keep going and try to post-process it into something that's ready to paste into the page? Who used to do the processing from StatisticsUsers.csv to what's in the page, anyway? Anyone know? Also, should I retry the dump, or does someone who had a dump on their machine want to get the program and run it locally? Noel (talk) 03:33, 22 July 2005 (UTC)
Hmmm, problem - the database lines that "grep" filtered out don't include a timestamp, so I can't produce the "last 30 days" numbers; all I can get are grand totals. What now? Noel (talk) 06:05, 22 July 2005 (UTC)
Ah, I see; there's also a <timestamp> entry in the database, I'll have to pull those out too. In fact, I think I'll do what I talked about above - tweak the front end a bit so I can get rid of the grep stage entirely, and just feed the database dump straight into this program. Should be fine; as it's written in C, it's quite fast (processes a 55MB file in about 2 seconds of real time). Noel (talk) 16:07, 22 July 2005 (UTC)
User:ClockworkSoul has done the most recent updates starting with the .csv file, but has been missing since May. I recently wrote a bash script to generate the lists that go in the page given the .csv file. If you'd like the source let me know. If the .csv file will be updated regularly, I'd be willing to do the post processing (and I'll post the source someplace). -- Rick Block (talk) 18:42, July 22, 2005 (UTC)
Sounds great. If you don't mind, could you generate the csv (in the same format) as well? anthony 23:24, 22 July 2005 (UTC)

Here's some recent output from my program (run on a 375MB chunk of the database dump, which is why the numbers are small):

en,0,0,1,0,0,0,Tbc
en,55,0,2,0,0,0,Maveric149
en,3,0,1,0,0,0,Stephen Gilbert
en,54,0,3,0,0,0,Koyaanis Qatsi
en,3,0,1,0,0,0,RoseParks
en,24,0,1,0,0,0,Andre Engels
en,24,0,1,0,0,0,JimboWales
en,11,1,1,0,0,0,Liftarn
en,42,0,1,0,0,0,Ams80
en,33,1,1,0,0,0,Ahoerstemeier
en,22,0,3,0,0,0,CatherineMunro
en,4,0,1,0,0,0,TUF-KAT
en,17,0,2,0,0,0,Angela
en,1,0,1,0,0,0,Efghij
en,1,0,2,0,0,0,Aravindet
en,3,0,2,0,0,0,Frihet
en,9,0,1,0,0,0,RedWolf
en,4,0,1,0,0,0,Dehumanizer
en,7,0,1,0,0,0,Marcika
en,9,0,1,0,0,0,Anville
en,8,0,2,0,0,0,Quadell
en,37,3,3,1,0,0,Mustafaa
en,0,0,1,0,0,0,MDMullins
en,2,0,2,1,0,0,D prime
en,0,0,2,0,0,0,LinkBot
en,4,0,2,0,0,0,Philomax 2

which as you can see looks just like the old StatisticsUsers.csv, except that the last two columns are zero - this is because I didn't feel like generating rankings (which would be a fair amount more work for me). Rick Block and I have been discussing this and he said he could work around it in his post-processor (which takes the pseudo-StatisticsUsers.csv and coughs up the page contents). My program has a zillion flags on it to control what it collects and what gets printed, but I've set it up so that if none are specified, it spits out what looks like the old StatisticsUsers.csv data (i.e. main total, main last 30 days, non-main total, non-main last 30), as above. Noel (talk) 03:07, 23 July 2005 (UTC)

Very cool. This script should work for the rankings, except the seventh field will be the ranking 30 days ago instead of 7 days ago:
awk -F, 'BEGIN {OFS=","} {print $2-$3,$1,$2,$3,$4,$5,$6,$7,$8}' pseudo-StatisticsUsers.csv |
  sort -rn |
  awk -F, 'BEGIN {OFS=","} {print $3,$2,$4,$5,$6,$7,++a,$9}' |
  sort -rn |
  awk -F, 'BEGIN {OFS=","} {print $2,$1,$3,$4,$5,++a,$7,$8}' > StatisticsUsers.csv
Might be buggy, and might take a long time, though. I don't know. It worked with a sample file I modified from the output you gave. That gives the ranks for main namespace edits. If you want I could modify it to give rankings for total edits instead (slightly longer, and I think you'd have to run awk and sort an extra time). anthony 04:04, 23 July 2005 (UTC)
The real list doesn't count bots, so it's a little more complicated than this (not much). I'm thinking about making the positional change column in WP:1000 be relative to the last posted version, with newcomers to the list listed as "new" (rather than with a positional change). I have a version in bash that does this, somewhat slowly but not ridiculously slowly (3-4 minutes on my Mac). I'll post the source soon. -- Rick Block (talk) 04:35, July 23, 2005 (UTC)
Yeah, I was just trying to generate the CSV, not the list. Also, I've asked this before, quite a while ago, but I'm not sure if it wasn't answered or I've just forgotten the answer. How are you getting the list of bots? anthony 10:56, 23 July 2005 (UTC)
I can help reduce the running time of the post-processor; my program has an argument to only list people with at least N mainspace/other/total edits (you select which), so that can greatly reduce the size of the output list, which is the #1 factor in how long operations which include a sort take. Also, if you want to produce the ranking as of last week, there's also a flag you can use to set the "recent" period to N days. I think the list of "bots" was constructed by hand, because there were some people who used bots from their main account (back in the old days), and I think those people had special footnotes in this page. For now, it's not in the database dump, so I can't produce it. Noel (talk) 13:16, 23 July 2005 (UTC)
I've reimplemented the bash scripts in awk, and they run much faster. Any suggestions for where I should post the source? I'm thinking Wikipedia:List of Wikipedians by number of edits/scripts or some such. There are two, one for the main namespace edit list and the other for the all namespaces list. They both include a hard coded list of bots (and user accounts including bot edits). If Noel can generate the csv file, I can update the article. -- Rick Block (talk) 17:11, July 23, 2005 (UTC)
I've posted the awk versions at user:Rick Block/wp1000. -- Rick Block (talk) 20:41, July 23, 2005 (UTC)

Progress! I figured out why it wasn't working to pipe the output of the wget | gzip into grep - some flipping visigoth (vandal is such an overused word, and this moron really outdid himself) entered a single line of text into the database that was more than 256KB long. Somehow I doubt grep is set up to handle lines that long! So I took my database-reading program and hacked out from it a simple filter that just reads the database and discards the <text> entries and sends everything else straight through. I started another wget | gunzip piped into that, and that's running now; I'm saving the output on my machine (it should be much smaller than the full XML dump). Then tomorrow I can run my program to generate the fake StatisticsUsers.csv from it. More later... Noel (talk) 07:57, 24 July 2005 (UTC)
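(For anyone reproducing this without the C filter, roughly the same effect is possible in gawk, which copes with very long lines; a sketch only, assuming <text ...> and </text> delimit the revision text, and ignoring the self-closing <text/> case:)
# drop everything from <text ...> through </text>, pass the rest through
gawk '/<text[ >]/ { intext = 1 }
  !intext { print }
  /<\/text>/ { intext = 0 }' pages_full.xml > stripped.xml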

Now I've hit a block I don't think I can work around. I'm getting the following error message from gunzip: "gunzip: stdin: unexpected end of file" - and it occurred in the exact same place on two consecutive runs. In other words, it's not a random network error causing it. Either there's a TCP bug, or wget has a problem, or something. Anyone have any suggestions? If someone has a copy of the full XML dump on their machine, I can provide the simple filter program (it just uses stdio, so it will run on anything) to filter out the database entries I need; that will be a much smaller file, which I can then copy on over. Noel (talk) 16:14, 24 July 2005 (UTC)
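(One workaround worth trying, though untested here: fetch the dump to disk with resume support instead of piping it over one long-lived connection -)
# -c resumes a partial download, so a dropped connection costs nothing;
# rerun the command until the file is complete, then gunzip locally
wget -c http://download.wikimedia.org/wikipedia/en/20050713_pages_full.xml.gz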
I will be using Noel's program on my laptop in a few; it's a lot more powerful than my pc and server. So I should have a good dump and output fairly soon. Who?¿? 22:35, 24 July 2005 (UTC)
Evidently there is an error in that dump 20050713_pages_full.xml.gz, so I'm gonna try it on the current http://download.wikimedia.org/wikipedia/en/pages_full.xml.gz dated 16 JUL. I tried to repair the error, but it's definitely too big for that, and no telling what it will erase. Who?¿? 01:01, 25 July 2005 (UTC)
Well, after lots of errors, here's a parsed version of the dump, pre-csv, just users and contribs, using Noel's program: http://questdesign.net/output.xml.gz (169mb). Who?¿? 12:36, 25 July 2005 (UTC)
Note that 169mb is the uncompressed size. anthony 12:59, 25 July 2005 (UTC)

pages_full seems to be the same file as 0713_pages. They have the same date and the same size. Just downloaded your output.xml.gz. I'll see what I can do with it. anthony 12:41, 25 July 2005 (UTC)

I thought it was a little bit small. It only seems to get up to the letter F, and even then, I don't think it's strictly in alphabetical order.

[anthony@mcfly anthony]$ grep \<title output.xml | wc -l
10022
[anthony@mcfly anthony]$ wc -l all_titles_in_ns0
1048867 all_titles_in_ns0

I guess the gzipped xml file is broken. anthony 12:49, 25 July 2005 (UTC)

Yea, sorry about that, Noel just pointed that out. Even if the dumps are the same, they both have the same error; I tried it on 2 different machines with 4 different unzip programs. Gets an error in about the same spot each time. Need to find a different dump, preferably newer. Who?¿? 13:01, 25 July 2005 (UTC)

Nomination for deletion

On July 9 2005, this article was nominated for deletion. The result was keep. See Wikipedia:Votes for deletion/List of Wikipedians by number of edits for a record of the discussion. – Rich Farmbrough 01:20, 17 July 2005 (UTC)

Updating the list

Hello, I've opened a case regarding updating the list at Computer help desk/Dmcdevit 20050718. It seems to me the best way to solve this problem is to generate a new SQL script that runs against the database. Downloading over 20 gigs of data for one report seems like overkill. Unfortunately I don't have the time to create the new script right now, but if someone sees this before I can get to it and they do create it, please attach the SQL to the case above, so that if this happens again a starting point for another solution is archived. Triddle 16:34, July 21, 2005 (UTC)
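(As a starting point, the core of such a script might look like the sketch below - hypothetical, assuming the MediaWiki 1.5 schema with revision and page tables and SQL access to a database slave; a real version would also need the bot list, the per-namespace breakdowns, and the recent-edits window:)
# count main-namespace edits per user; "db-slave" is a stand-in hostname
mysql -h db-slave enwiki -e "
  SELECT rev_user_text, COUNT(*) AS edits
  FROM revision JOIN page ON rev_page = page_id
  WHERE page_namespace = 0
  GROUP BY rev_user_text
  ORDER BY edits DESC
  LIMIT 1000;"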

Good plan. I didn't know we had a computer help desk... Of course, I'm not allowed to post there anyway. anthony 10:30, 26 July 2005 (UTC)
Wasn't this done by Jamesday with this script? Who?¿? 10:36, 26 July 2005 (UTC)

Length of List

With the growth in number of users, it would be useful to show more than just the top 1000, maybe 2000 or 3000 users (we could put this in daughter articles if this makes the project page too big). NoSeptember 03:44, 22 July 2005 (UTC)

That would be a good idea, but I'd prefer it on a sub-page, possibly, so as not to extend the load time of the current page. Who?¿? 03:46, 22 July 2005 (UTC)
Good idea. The list has previously been extended from 200 and then 500 people, so a further extension to, say, 2500 to take account of greater user numbers seems appropriate. I think all users with >1000 all-namespace edits should be listed. Pcb21| Pete 21:44, 24 July 2005 (UTC)
For the record, in the original list you only needed 100 edits, and at that time mav had made 10% of the edits to WP! old data here Pcb21| Pete 21:48, 24 July 2005 (UTC)

Data

Thank you for generating the list! (Question now changed since I've realised my error.) OK, I misread that: my article contributions on the list are 4000 or so, but according to WP:KT I haven't yet hit 4,000, so is it including template+image pages? -- Joolz 23:30, 25 July 2005 (UTC)

The source used to generate the updated list was posted, and counts only article contributions. Kate's tool must be using a different database. -- Rick Block (talk) 00:13, July 26, 2005 (UTC)
The other anomaly I noticed is if you look at my article contributions in June (I use myself as an example merely because that's who I checked) and then look at my contributions to all namespaces in June, the article ones are twice as many as the all-namespace ones, but all namespaces includes articles (surely), so it's a little odd -- Joolz 00:26, 26 July 2005 (UTC)
This one is a bug in the script. The article count includes article edits made since June (including July), the all namespaces count only includes edits made in June. I don't have access to the database and can't rerun the script. I'll leave a note on user:Jamesday's talk page about this. -- Rick Block (talk) 00:59, July 26, 2005 (UTC)
This list doesn't exclude deleted revisions, Kate's tool might. I haven't checked to see if deleted articles count as deleted revisions or not. Jamesday 04:56, 26 July 2005 (UTC)

Well done on getting the update out. I'm sure any remaining statistical gremlins will be sorted out before long. -- Solipsist 06:28, 26 July 2005 (UTC)

That's incredible, I'm number 3? I know I'm in the top ten but I was thinking closer to 10 than 1. But if the numbers are right, well, that's really cool. Everyking 06:56, 26 July 2005 (UTC)

I updated Wikipedia:List of Wikipedians by number of recent edits for the dates 1 JUN - 24 JUL from the data given on this page. See how y'all like it. I didn't mark all the bots and left some out. If someone feels like adjusting that, that would be great. I only did main namespace for now; I guess I could do all namespaces at some point. Who?¿? 10:20, 26 July 2005 (UTC)

Another update maybe?

Are there going to be regular updates here, or is it too much work? I don't know the computer work involved, but from the discussions above it sounded like people had things figured out. Anyways, just wondering. - Trevor MacInnis(Talk | Contribs) 15:54, 20 September 2005 (UTC)

Jamesday did the last update using direct (developer) access to the database. He posted the script somewhere, so I think pretty much all it would take is someone with developer access to re-run the script. -- Rick Block (talk) 18:34, 20 September 2005 (UTC)
He's also usually pretty busy, but here is the link for the script Wikipedia:List of Wikipedians by number of edits/script. Who?¿? 20:07, 20 September 2005 (UTC)
It's been two months and a week. It could really use an update maybe. JobE6 16:01, 1 October 2005 (UTC)
I tried to d/l the full dump again, but there is an issue with d/l'ing it or something; we get an error at the same spot. So basically I think we need a developer to run the script. I wouldn't ask atm though, they are pretty busy with a few nasty bugs. Who?¿? 17:28, 1 October 2005 (UTC)
Is there any way to get a partial dump, just of the tables needed to run the script? — Stevie is the man! Talk | Work 15:40, 11 October 2005 (UTC)
I tried to get the two latest dumps, thought I had one, but it was corrupt. You can't get a partial one because you need the entire edit history, from which we use scripts or programs to filter out the contributors. I tried again a few days ago, and plan to try again here shortly. Who?¿? 19:06, 11 October 2005 (UTC)
The edit history isn't separated from the page content (that is, in separate tables) in the database? — Stevie is the man! Talk | Work 01:35, 12 October 2005 (UTC)
I haven't opened up the entire db in a while, so I don't know all of the tables. I don't know if you can get just the history w/o direct access to a slave to run a query. It's not an important query, so not one that we would ask to be run. If you look on http://download.wikimedia.org/wikipedia/en/ , I don't see a dump of just the history. The only way to do it is to have a copy of the entire dump, which for some reason fails on d/l. I read something about changing d/l options, but I don't remember where. Who?¿? 03:14, 12 October 2005 (UTC)

A "blasphemous" proposal

I expect to be beaten down now (hopefully only virtually), but, in light of the recurring problems with updating this page, and the new layout of Kate's tool (which discourages editcountitis rather strongly), are we sure we really need this page any longer? Lectonar 10:22, 14 October 2005 (UTC)

Yes, we are sure. Ambi 10:41, 14 October 2005 (UTC)
It's not essential, but it'd be nice. Grutness...wha? 11:19, 14 October 2005 (UTC)
Our opportunities for having random, purposeless, curious, no-stress fun on the Wikipedia are so limited already, so why take away one of the few such opportunities? An incredibly big deal is made out of editcountitis etc. - really, we hear so damn much about it. But seriously, how many times have you actually seen it in action other than in cases where the user is a newbie with 500 edits? Once you break the 1200~1500 edit barrier nobody gives a damn anymore. We really should start concentrating on real issues instead of this sort of non-existent problem. (PS: I don't know if my message sounds a little too informal, but I am only speaking this openly because I respect you.) Regards, --Sn0wflake 17:57, 14 October 2005 (UTC)
I agree that editcountitis is mostly an imagined problem. I also agree that seeing contributors' edit number standings is kind of interesting/fun. I do think it is one good measure of how much work contributors have done, and it thus serves as a kind of reward for the hard work. — Stevie is the man! Talk | Work 20:56, 14 October 2005 (UTC)

Just because Kate doesn't like his counter does not make that dislike into official policy. One ex-user changes nothing, SqueakBox 18:02, 14 October 2005 (UTC)

It's fun. Keep. Even if we do not need it, it just shows facts; how to form an opinion out of this is another thing. If you think pure counting is bad, then provide another stat? de: has an alternative stat, but sadly I think they stopped this simple one. Tobias Conradi (Talk) 02:57, 17 October 2005 (UTC)

We really do, because I absolutely do not understand WP:KT, but this page is (more or less) comprehensible even to technophobes like me. I don't understand the figures, but at least I can come away with something... Jdcooper 11:08, 10 January 2006 (UTC)

Keep. What other way could I justify whining on my user page about spending too much time here? It is fun, and one of the few tangible rewards I get. Elf | Talk 18:04, 10 January 2006 (UTC)

Error?

  • Main namespace: 11942
  • all spaces: 9069

for me: Tobias Conradi (Talk) 02:57, 17 October 2005 (UTC)

I think they're reversed. Acegikmo1 04:16, 17 October 2005 (UTC)
Me too. the "Article-space" number for me tallies very closely with Kate's tools' total for all namespaces and vice versa. Grutness...wha? 04:48, 17 October 2005 (UTC)
Yea, definitely, I just added ranks. I will swap them now. «»Who?¿?meta 07:09, 17 October 2005 (UTC)
ISTR that ranks aren't normally given to bots, either, so you might want to change that too! :) Grutness...wha? 07:16, 17 October 2005 (UTC)
I knew someone was gonna say that :) I was lazy this time. I have a generic script to do it, i'll see about weeding them out. Unless someone else does first. «»Who?¿?meta 07:19, 17 October 2005 (UTC)
Ok, doing it now, but I have to go flag all the bots that aren't "flagged" as bots. «»Who?¿?meta 07:37, 17 October 2005 (UTC)
Done. I think I got them all. Some of them don't have flags, so I had to check their userpages. There are 2 more, but their userpages don't indicate that they are bots. Something else odd though: User:John Price doesn't exist, but has a contribs link which shows 0. But Kate's shows it has contribs. Not sure what's up with that, unless it's just me. «»Who?¿?meta 09:11, 17 October 2005 (UTC)

Where am I?

Can someone please check both lists and tell me where I am? I can't seem to find it. I know I should be on the list because on September 11th I reached 5,000 edits. Can anyone find it? Look under the name Moe Epsilon please. — Moe ε 22:23, 18 October 2005 (UTC)

You're right, there was a problem loading part of the list part way through the process. I'll prepare an updated list because others may also have been affected. Your counts were: all namespaces: 5542, all in September: 757; main: 5090, main September 667. Jamesday 03:47, 22 October 2005 (UTC)
I checked the original data, before I added rankings, and I don't see you. My bot isn't on there either, so I'm not sure. «»Who?¿?meta 23:13, 18 October 2005 (UTC)

You can't find your name because the list hasn't been updated in a long time. That's the issue. It says that it was updated on October 15th, 2005, but that's not the case. Svest 01:34, 22 October 2005 (UTC)

Yes, it was updated on the 15th by Jamesday. Check the history. It was data from September to early October. «»Who?¿?meta 01:39, 22 October 2005 (UTC)
Yes, I am sorry for the inconvenience! You are right. Svest 01:42, 22 October 2005 (UTC)

Clarification required

The message at the top says the list is out of date. Then it says it is current as at 15 October. Seems pretty up to date to me. Am I missing something? JackofOz 00:56, 22 October 2005 (UTC)

It was updated but someone forgot to remove the message. See the conversation above this one. --tomf688{talk} 00:59, 22 October 2005 (UTC)
Thank you. JackofOz 01:16, 22 October 2005 (UTC)
What date is out of date, just out of curiosity? (1895 edits for me now, my quest for 2000 before January 1, 2006 is on! Very rationally of course :) ). -- Hurricane Eric - my dropsonde 02:26, 26 October 2005 (UTC)

IP addresses

If it's of interest to anyone, I've parsed the full dump from c. 20 October 2005 and done a ranking for IPs, some of which have names for historical/technical reasons. I've listed those with over 1500 edits here. Interesting to note that the top "true" IP is Microsoft. Rich Farmbrough 20:44, 30 October 2005 (UTC)

Differing figures

Curiously, this page claims I have more edits than Kate's counter does, which I have been monitoring for a while. I am 223rd. Any ideas? SqueakBox 17:12, 9 November 2005 (UTC)

  • I suspect some of your edits have been deleted. Check the newest version of Kate's tools, which features the deleted edit counter too. -- WB 04:36, 25 November 2005 (UTC)

Time for an update?

It's been over a month since the last update. Time for another one, I think. Gator (talk) 14:13, 21 November 2005 (UTC)

Agree with that. :) --Andylkl (talk) (contrib) 10:33, 25 November 2005 (UTC)
Agree. JackofOz 13:17, 25 November 2005 (UTC)
How can you launch the script for generating this list? The dump on download.wikimedia.org does not include the user table, which is necessary for getting names and so on. Thanks. --Porao 08:17, 27 November 2005 (UTC)
What exactly do you need to do to retrieve the data? If it's easy enough, any of us can do it. Can anyone include a comprehensive walkthrough? -- WB 22:07, 27 November 2005 (UTC)
The script (here) requires SQL access to a slave copy of the master database. Developers have this, and pretty much nobody else. -- Rick Block (talk) 00:31, 28 November 2005 (UTC)
Um... now contact the developers then? That seems like the only solution... -- WB 01:28, 28 November 2005 (UTC)
Do users who were missed out from the last update need to do anything to ensure they are included this time? CLW 09:17, 28 November 2005 (UTC)

I updated it. Man that sucked. I see why it's been so long. The provided script pretty much does not work now. I'll design some new queries in a few weeks, and begin monthly updates. Let me know if you find any bugs in this version. --Gmaxwell 19:55, 29 November 2005 (UTC)

I've completely redesigned the script, and nice pretty new (and hopefully correct) numbers are posted. If desired, I can change the page to transclude the tables and set a cronjob to automatically update the table pages once a month. --Gmaxwell 22:47, 29 November 2005 (UTC)
Gmaxwell, that's a good idea. Could you do that? --maru (talk) contribs 03:53, 29 January 2006 (UTC)

18:59, 29 November 2005 Update

Sadly, must be something wrong with the update. I'm not in it, and have 8500 article and 14000 all edits. sniff. --Tagishsimon (talk)

You're there now. Minor bug I discovered as soon as I put it up; I was working on it when you wrote here... --Gmaxwell 19:57, 29 November 2005 (UTC)
Darn, the namespace breakup is wrong. I've rewritten the procedure from scratch. Running my new counter now. --Gmaxwell 20:31, 29 November 2005 (UTC)
Done, and rank is back. It *should* be accurate now. If it's not, it's all my fault. --Gmaxwell 22:47, 29 November 2005 (UTC)
Your work's much appreciated; thanks. (But you've lost the links from the user names to the User pages:) --Tagishsimon (talk)
There were links? ha. Well in any case you'll be really impressed with what I've done now.. check back in later. --Gmaxwell 02:54, 30 November 2005 (UTC)
Just a question: would weekly be too much work for you? We are all suffering some degree of editcountitis. haha. -- WB 06:10, 30 November 2005 (UTC)
Well, I would have said yes earlier today... but I've totally rewritten it again; it now gives much more detailed information (and includes anons, although they and bots aren't given ranks), which has resulted in it probably being too slow to run more often than monthly on English Wikipedia. here is an example of the new output, though I'm still going through changes. It may turn out to be too costly to even use at all (including anons means I pretty much must read and sort all 30,000,000 revisions). --Gmaxwell 06:25, 30 November 2005 (UTC)
That output is impressive.-gadfium 07:22, 30 November 2005 (UTC)

Are you sure you marked all bots? like, SimonP, Olivier... mikka (t) 06:30, 30 November 2005 (UTC)

SimonP and Olivier are not bots... -- WB 06:33, 30 November 2005 (UTC)
Sure, that's what they want you to think! Mindspillage (spill yours?) 06:40, 30 November 2005 (UTC)
Well, heh.. There *are* bots which are not marked. But that's because they don't have the bot flag. My bot Roomba is one of them. --Gmaxwell 07:10, 30 November 2005 (UTC)

My rating dropped because the bots are numbered now. Shame. I wanted to be 50th on a top 50 website and boast that I've contributed 1/2500th of the interweb. JFW | T@lk 14:38, 30 November 2005 (UTC)

Confused

Not being a big one on edit count, I hadn't really checked my edit count tally for a while. When I did, thinking I'd try and figure out when my 5000th edit would have been, I was a little confused, as my edit count had actually seemed to go down. A few weeks later (i.e., today) I just figured out why, thanks to the WP:1000 page. For some reason, as of 29 November, I'm on there twice, at #1099 (with 3990 edits) and at #1340 (with 2119 edits). Can anyone explain this to me? Is there some way of rectifying this? Please let me know on my talk page if possible. I wub yew all. Proto t c 15:17, 2 December 2005 (UTC)

answered on user's talk page Lectonar 15:24, 2 December 2005 (UTC)
Yes. *shamed* I am hella dumb. Proto t c 15:33, 2 December 2005 (UTC)

Crash!

Not sure what's going on with this page, but it now causes IE to crash my computer. Since it was updated earlier, I've tried to open it four times, and each time IE seized up. Has something done a Bad Thing to the page? Grutness...wha? 09:41, 11 December 2005 (UTC)

Yeah, I tried to look at it earlier too, and the same thing happened. Somebody who can access it, please go find some text in there to cut out so the rest of us can load the page. Everyking 10:35, 11 December 2005 (UTC)
It's ok in Firefox. I think the problem is possibly that it is so massive now; not sure how to fix it other than reverting back. Martin 11:19, 11 December 2005 (UTC)
Could cut back the list to include fewer people? Have a longer list on a different page? Maybe something like that. Everyking 11:43, 11 December 2005 (UTC)
The fact that it is almost 3mb probably has something to do with your problems! I left Gmaxwell a note asking him if he could split it up or something, as it is just unmanageable for me to do. Martin 12:01, 11 December 2005 (UTC)
Yup. If ever a page needed splitting into 36 or so separate pages, this is it. --Tagishsimon (talk)
The new format is so large and data-intensive my computer nearly crashed. That, and I'll be darned if I can find the number of "total edits" anywhere. My vote is to return to the OLD FORMAT. Badagnani 04:38, 19 December 2005 (UTC)
100% agreement. Ambi 12:35, 21 December 2005 (UTC)
I second that. --Andylkl [ talk! | c ] 12:37, 21 December 2005 (UTC)
I third, fourth, and fifth it. Do it, and stat, please. ナイトスタリオン 13:30, 21 December 2005 (UTC)

Almost the same here; my Firefox browser hung for 10 minutes or so after loading and clicking to edit the page. It's almost untouchable for me at the moment. :/ --Andylkl [ talk! | c ] 12:28, 21 December 2005 (UTC)

For me browsing the page works. I am using Firefox. --Roland2 13:55, 21 December 2005 (UTC)

Sorting

Maybe I am missing something, but it might be an idea for the introduction to have a bit more explanation about the rank/sorting order. I thought the first table used to be sorted by the total number of edits in the Article space, but that doesn't appear to be true any more. The second table does seem to be sorted by the number of edits in all namespaces.

Similarly I assume the 'rate' items are something like 'number of edits per day' in each namespace. -- Solipsist 15:13, 11 December 2005 (UTC)

I think the sort is by the sum of the "total" and "this month" numbers, which seems to be some sort of bug (either the "total" isn't the actual total, or the sort order is wrong - pick one). -- Rick Block (talk) 17:13, 11 December 2005 (UTC)
Yes it's definitely wrong. 62.31.55.223 19:57, 11 December 2005 (UTC)
I was wondering about the sorting order as well.. -- WB 22:21, 11 December 2005 (UTC)
Pile on, how is this sorted? xaosflux Talk/CVU 18:16, 13 December 2005 (UTC)
So, eh, how is this sorted? Who sorts it? -- Ec5618 21:01, 11 January 2006 (UTC)
Yes, it's sorted on the sum... which is a bug. I've fixed the bug, but the DB server wasn't staying up long enough to finish the query at that point in time. They are more stable now, so I can run it again... but I need to know what form people want first. :) Ideally I'd like to include a little more information than the original (at least have a rank number that skips bots...). I agree that the table is a browser killer. --Gmaxwell 06:32, 30 January 2006 (UTC)
  • Thanks for the explanation. I have added a note to the page to explain that the sorting is a little bit off. Thanks for your work at maintaining the updates! Johntex\talk 00:10, 1 February 2006 (UTC)
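(For reference, the intended ordering can be reproduced from the raw data with a plain sort; a sketch, assuming the pseudo-StatisticsUsers.csv layout shown earlier on this page, with the main-namespace total in field 2:)
# sort by total main-namespace edits, highest first, ignoring "this month"
sort -t, -k2,2nr pseudo-StatisticsUsers.csv | head -n 1000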

Split this article?

One for the main namespace, the other for all namespaces. Since it is too long to edit (over 2 MB), splitting it in two would make it much easier to edit. — Yaohua2000 18:52, 13 December 2005 (UTC)

good idea - even then it would be too big in its current form. Do we need the split by namespace? See more comments under "Crash!" above. Grutness...wha? 01:57, 14 December 2005 (UTC)
Why not, when the script generates it, insert a heading every 50 rows? You'll get a nice little edit button for a 50 row section, rather than the whole thing. Joe D (t) 03:06, 14 December 2005 (UTC)
You'd still get the problem that it's impossible to load on some browsers. And simply splitting it every fifty rows might well make the job of updating it quite a chore. Grutness...wha? 03:53, 14 December 2005 (UTC)

Can we please make this a list again? It's the huge tables that take forever to render. --MarkSweep (call me collect) 21:33, 22 December 2005 (UTC)

WTF does "rate" mean?

Could someone please explain (on the project page) what the "rate" number refers to? ··gracefool | 23:01, 11 January 2006 (UTC)

template

Does anyone know why the WP shortcut template at the top of the page is broken? I've tried fixing it and nothing seems to work. --Alhutch 19:19, 19 January 2006 (UTC)

It seems to be because the page is too large. violet/riga (t) 19:40, 19 January 2006 (UTC)

What does this page do?

Does anyone actually know what this page does, and how it is organised? Several editors have asked how the list is sorted, several editors have wondered what this page is about. No answers have been given, anywhere, to my knowledge. If no-one can actually answer that question, I'm not sure what the point of the article is. -- Ec5618 19:55, 19 January 2006 (UTC)

It's time to update it anyway :)—Ëzhiki (ërinacëus amurënsis) 20:04, 19 January 2006 (UTC)
hmmm, I know what it's meant to do, but it doesn't seem to do it well; I have over 22,000 edits, not 12,000. Martin 20:24, 19 January 2006 (UTC)
And someone should probably explain where the difference comes from, how the rankings are calculated, etc. In other words, make this an article, not a backwater page visited solely by editcounting freaks. The explanation should be top notch, with the regular updates being less important. -- Ec5618 20:37, 19 January 2006 (UTC)
I've finally got more than 1500 edits. Update the page dammit! :) Stevage 19:57, 26 January 2006 (UTC)
Right. If no-one can explain the purpose, layout or operation of this page, it shouldn't exist. Would anyone mind if I nominate it for deletion? -- Ec5618 20:37, 26 January 2006 (UTC)
Yes. It will be fine once updated properly. violet/riga (t) 20:39, 26 January 2006 (UTC)
A deletion nomination would be a huge waste of time of the community. Have you read the previous deletion debate, Wikipedia:Votes for deletion/Wikipedia:List of Wikipedians by number of edits, which resulted in an overwhelming keep vote? What is to understand about this page? Some people want to keep track of edit counts. I look at edits in various namespaces when I consider a RfA (though not from this list). NoSeptember talk 20:45, 26 January 2006 (UTC)
Then, please, what is the purpose of this page, what does it mean, and who operates it, and how? The layout is not clear, as evidenced by virtually every post above. -- Ec5618 21:33, 26 January 2006 (UTC)
It gets updated whenever someone gets around to running some programs to produce this data (check the article history to get an idea of who has updated it in the past). Does its purpose need to be any more than just trivia for the curious? Put it on your watchlist and wait until someone updates it, it is hardly a high priority thing to do. ;-) NoSeptember talk 21:53, 26 January 2006 (UTC)
I actually get that the page isn't hurting anyone; the point is that no-one seems to know what the numbers actually mean. See #Sorting above, for example. Is the list corrupted, or does it use some sort of vague ranking system to decide that an editor with 2000 edits ranks below another with 1950 edits? How is the list sorted? And why does no-one seem to know how it is sorted? Why is no-one updating this beast, and how would one go about updating it? Who do I talk to about updating this thing? Who do I talk to about proposals to change the layout? Is it technically possible to split the article, so that it takes less time to load?
This page is frustratingly mysterious, which is rather unique on Wikipedia. Anyone can edit it, but no-one has the know-how to actually do anything useful. Without knowing more about the concept behind this page, discussing it is quite useless, which is why I'd like to see some information. -- Ec5618 22:05, 26 January 2006 (UTC)
From reading the archives, I get the impression that a lot of this is ad hoc. Someone will decide to write a better program and take the page over. Who did it last? Off hand, I don't know. NoSeptember talk 22:16, 26 January 2006 (UTC)
The script is provided, in fact there are several renditions in the history. You tell me, what do you want? I can run it again whenever. ... I forget about this page, so I don't come around and answer questions often. --Gmaxwell 06:30, 30 January 2006 (UTC)

Size

Due to the new table format, this page causes some computers I use to lock up almost completely. It looks nice, but it doesn't work that well for everyone. Ingoolemo talk 17:41, 27 January 2006 (UTC)


New page formatting

Above, Gmaxwell asks for input on the best way to format the output from the script. Clearly the main problem at the moment is just that the tables are too large to sit comfortably in an average browser. I would say the items to consider would be:

  1. Remove the daily rate cells
  2. Remove the table formatting and go back to plain text layout
  3. Reduce the max rank back to 1000
  4. Merge the two tables and add an allnamespace ranking column in the first table
  5. Reintroduce the change in rank since last month/when the script was last run

For my part, I'd recommend doing 1, 2 and 4, and if that doesn't get it small enough, try 3 too. It would seem a shame to lose the table formatting, but I suspect that could be the biggest thing that is choking the browsers. -- Solipsist 07:57, 30 January 2006 (UTC)

I'd say do 1, 2 and 4. The list went to 1500 before with no problems - it's the formatting that's stuffed things up. Grutness...wha? 08:56, 30 January 2006 (UTC)
I don't see any reason not to do 3 other than simple vanity -- i.e., the people ranked in the second 1,000 not wanting to get kicked off the page. In fact, I'd be willing to go even smaller, say 500, except WP:1000 is still a shortcut to this page. So 1,000 seems a logical choice. (Full disclosure: In the present version I'm listed at 1,089 and 1,000, respectively, in the two tables.) - dcljr (talk) 23:55, 30 January 2006 (UTC)
Well, given all the caveats and general pointlessness of counting edits, I've often thought that the main benefit of this page is that it tends to encourage people to do some more editing before the next update comes out, in the hope of improving their rank. This slight competitiveness is probably a good thing. In that case, we would want a list long enough to encourage the most people. A short list would be rather exclusive, but that might make people try harder to get on it. A long list might get more editors watching their edit position (I kind of like the way the current list includes some anon IP editors). I imagine the length should grow as Wikipedia gets bigger, and there is probably some objective way to decide how big it should be, such as including the top 10% of active editors or something, but it is easiest just to decide whether we want a long list or a short list and then pick a round number. -- Solipsist 06:52, 1 February 2006 (UTC)
  • I think we should do numbers 1, 2, and 5. I see no problem with going back to the plain style; things seemed to work fine then. We could also consider splitting it into two pages (one for articles, one for all namespaces). Johntex\talk 00:41, 31 January 2006 (UTC)
  • I agree that 1, 2, and 5 are wise courses of action. If we continue to have size issues, 3 may be worth serious consideration. – ClockworkSoul 16:11, 11 February 2006 (UTC)

Numbers

Why are the edit counts on this page higher than the ones on Interiot's Contributions Tree Tool which appears to be more up to date? Rmhermen 02:19, 1 February 2006 (UTC)

This page doesn't count edits to deleted articles, which might make some difference. Not sure whether Interiot's tool does or not, but even if it doesn't, some articles may have been deleted between the time this page was last updated and the time Interiot ran his tally, which could account for the difference. Grutness...wha? 07:19, 1 February 2006 (UTC)
The page notes that the last month of edits were double-counted as well. - BanyanTree 03:27, 11 February 2006 (UTC)

My name

Since my username was changed from SWD316 to Moe Epsilon could someone replace my old name with my new name? MOE Epsilon 03:08, 17 February 2006 (UTC)

Table size

Without any changes to the data, I removed all formatting and converted it to wiki table format and the main Namespace table went from 1,566K down to 475K. Couldn't we just do this? -- Iantalk 09:39, 20 February 2006 (UTC)

Yes, I support this also. The current table even overloads my computer, which has a 1.5 GHz processor and an enormous amount of RAM. Ingoolemo talk 08:55, 27 February 2006 (UTC)
I also support this. While I can load up this page eventually, it takes a while. --PS2pcGAMER (talk) 09:03, 27 February 2006 (UTC)

Can this be updated

Can someone update this article? --James 03:36, 27 February 2006 (UTC)

I have requested assistance from Gmaxwell, and he has kindly agreed to try to update this page within the next few days. Johntex\talk 23:41, 1 March 2006 (UTC)
I just wish somebody would trim it back so I could load it without my browser crashing. Everyking 00:28, 2 March 2006 (UTC)
Could this be a multipage list somehow? –Shoaler (talk) 15:16, 5 March 2006 (UTC)
It shouldn't have to be. It used to load fine until the page size was increased so dramatically. --tomf688{talk} 15:26, 5 March 2006 (UTC)
Maybe we should move up the cutting line and update the rankings. Deryck C. 16:34, 11 March 2006 (UTC)

Gmaxwell, thanks for the update. The info for all namespaces seems to actually be edits in the main namespace though. Either that or the data is about 2 months behind. Can this be fixed? -- Y Ynhockey (Talk) Y 05:03, 12 March 2006 (UTC)

I was in the process of uploading it still. :) I had intended to clean out both headings, but only nixed the one; the old 2mbish version made my browser choke. It should all be good now. --Gmaxwell 05:29, 12 March 2006 (UTC)

Updated

It's updated now, as of sometime today (I didn't notice when it finished). It's running faster than it used to, so I'll try to update it more often. We're back to the plain format because the excessive tables format was a failure. :) I'll probably move this over to toolserver next time I update it, which will allow me to go back to a more complex format without knocking things over. Let me know if you find any problems. Oh, and I changed the criteria for inclusion in all namespaces: you had to have a rank over 2,500. There are now 3,461 users with over 1,500 total edits. --Gmaxwell 05:29, 12 March 2006 (UTC)


My (non)entry

I don't seem to be listed. Can anyone explain why? --Light current 02:29, 28 March 2006 (UTC)

I see you at #471 on the first list and #335 on the second list. EWS23 | (Leave me a message!) 02:39, 28 March 2006 (UTC)

Ahh! I see. Edit count is the number of article edits, not total edits. This is not made clear / I missed it at the top of the page! --Light current 02:43, 28 March 2006 (UTC)