Talk:Usage share of web browsers/Archive 2
This is an archive of past discussions about Usage share of web browsers. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
What about stats from w3schools.com
Hello, I am wondering why you do not include statistics from "w3schools.com". According to Alexa it has a much higher traffic ranking (presently #2000 worldwide) than any of the other services mentioned here. (see their statistics page) There must be a reason. :-) Best regards --Marbot (talk) 10:31, 8 December 2008 (UTC)
- Further down on their results page is the following:
- W3Schools is a website for people with an interest for web technologies. These people are more interested in using alternative browsers than the average user. The average user tends to use Internet Explorer, since it comes preinstalled with Windows. Most do not seek out other browsers. These facts indicate that the browser figures above are not 100% realistic. Other web sites have statistics showing that Internet Explorer is used by at least 80% of the users.
- So, W3Schools' data doesn't seem to be as accurate as the other websites we have listed here. For example, IE and Firefox have about 50% each. Xenon54 11:08, 8 December 2008 (UTC)
- Fair enough. You are right. However, I find it amazing, that more progressive users may generate enough traffic for a page to make it rank at about 2000. --Marbot (talk) 13:12, 8 December 2008 (UTC)
- There is no relation between the traffic to the site reporting the stats, and the total traffic measured by the service the reports are based on. A single site's usage (w3schools.com) is not a representative sample of browser usage share, where the other sites are reporting on the usage share measured at tens of thousands of different websites.
BE CAREFUL! The data that this article shows differ from the w3schools.com data. It is imperative to fix this.
While the disclaimers concerning W3Schools are very valid, I note with interest that their numbers match those for my own website much better than those currently given in the article, in particular by having Firefox clearly ahead of Explorer instead of the other way around. (And, no, unlike W3Schools, I do not serve a niche of web experts.) We should bear in mind the risk that the systematic errors mentioned in the article may over-favour Explorer. 94.220.255.11 (talk) 22:44, 19 November 2009 (UTC)
Hourly data
A few weeks ago I added a link to http://safalra.com/website/web-browser-market-share/ after it was on ComputerWorld because I thought it was useful, but I lost the bookmark and came back here to find it and it was gone. The history says Wikiolap deleted it because he said it "provides information how to break into Net Application system without paying", but that's not right. It doesn't say how to hack into the system, and you can't get the other data you have to pay for, like data for US states. It just says how to change the options for their public graphs so they show different browsers, so I don't see what's wrong with it. --82.33.205.102 (talk) 17:18, 15 December 2008 (UTC)
- I've just received a message through my website about this (I created the referenced page), presumably from the anonymous editor above (although it came from a different IP address, and they seem to be on a fixed IP). Actually, I'd have to disagree with one of their minor points — I don't think it's hugely useful, as I don't think anyone really needs hourly data (although daily data can be obtained using a similar technique, and is probably more useful). I'm mildly offended by the 'breaking into' comment, but that's Wikipedia for you — unnecessary external links are a constant issue, so I can understand over-reacting. To '82.33.205.102': add it back in if you want, but I'm not particularly bothered — just don't start an edit war over this. —Safalra (talk) 18:10, 15 December 2008 (UTC)
- As both sets of graphs have now stopped working, I think the point is moot. —Safalra (talk) 09:08, 10 January 2009 (UTC)
IE shells
Shouldn't we mention the problem of the IE shells? More "browsers" with different GUIs send the "same" user agent and are therefore recognized as IE, although they are different browsers. It's really the same as with browsers sharing Gecko: you can install 5 Gecko browsers with the same engine (but each with the engine built in) and 5 Trident shells/browsers (layout engine already installed). mabdul 0=* 20:58, 20 December 2008 (UTC)
Differentiating between desktop and embedded
I think it would be useful to split off the embedded browser market share from the overwhelmingly dominant desktop share.
Opera Mini, NetFront, Pocket IE, Nokia S60, Safari for iPhone/iPod, Playstation 3 Browser and PSP Browser all compete on platforms like PDAs, phones and gaming devices. Their market share on those platforms is not evident from the 0.00 and 0.01 variously attributed to them here.
— Nicholas (reply) @ 13:17, 4 February 2009 (UTC)
Make Colors in Pie Chart Match Prominent Colors of each Browser
I like the pie chart, but I thought of one way to potentially improve it. Use prominent colors of each browser as the color for that browser's pie slice. For example the blue pie slice already closely (exactly?) matches the color of the "e" in the Internet Explorer desktop icon. However, some of the other pie slices don't match their browser that well. For example, I think instead of yellow for the Mozilla Firefox pie slice, a better choice would be the orange-ish color of the fox's head in the Firefox desktop icon. What do you think? I don't use the other browsers. Perhaps other people could choose more appropriate colors for the other browsers as well. —Preceding unsigned comment added by 70.157.123.148 (talk) 15:47, 11 March 2009 (UTC)
- Previously, the graphic did use orange for Firefox (yellow was Chrome), but when Jdm64 replaced that graphic with the current SVG version, he changed the colors. I'd prefer an effort to "match" browsers and colors, as well. --Groggy Dice T | C 09:05, 6 June 2009 (UTC)
Another counter
Here's another counter: http://gs.statcounter.com/ . Rwxrwxrwx (talk) 12:32, 15 March 2009 (UTC)
confused about w3c
http://www.w3schools.com/browsers/browsers_stats.asp doesn't seem to match up with the page. Am I reading it wrong? John.n-IRL 15:09, 19 May 2009 (UTC)
- I think you are confusing W3Schools and W3 Counter. -- Schapel (talk) 17:24, 19 May 2009 (UTC)
Notability vs reliability. NPOV
Net Application (Hitlinks) are possibly the most well known and most used stat counter but have often been criticised for being extremely unreliable and generally inaccurate despite popularity. They are given prominence in this article (mainly by the pie-chart at the top). Given that it's certainly the most NOTABLE company, giving it prominence is possibly in line with the fact that Wikipedia places a value on notability, but given that it is a commercial company should it be given prominence in a WP article over other such companies? ɹəəpıɔnı 06:39, 3 June 2009 (UTC)
- Who says NetApplications is "extremely unreliable and generally inaccurate"? Is there evidence that this is so? -- Schapel (talk) 11:23, 3 June 2009 (UTC)
- "tracking firm admits data is skewed", "not the first or even second time the numbers published by Net Applications have "mysteriously" changed from one day to the next" ɹəəpıɔnı 05:48, 7 June 2009 (UTC)
- I'm sure all the browser stats are skewed one way or another. I can't imagine how to get a truly representative sample of the usage of web browsers. Net Applications data do not change "mysteriously." The data are run through quality assurance. You're not showing that Net Applications data are "extremely unreliable and generally inaccurate" at all. -- Schapel (talk) 12:52, 7 June 2009 (UTC)
- That is precisely my point. I wasn't implying that Net Applications are inaccurate relative to the others, I was arguing against giving them prominence in the article relative to other sources. All stats sources probably have some inaccuracies, Net Applications and their competitors alike. Does Wikipedia have reason to believe Net Applications is a better source than others? If not, why does the pie-chart show their stats?
- Every single browser article on Wikipedia takes its stats from this page, and giving Net Applications prominence ensures that their stats are always used in every one of these articles.
- I cannot say they are significantly less reliable than any of their competitors, but I have given two references to articles commenting on anomalous statistics from Net Applications suggesting that they can occasionally be off. If there are no references to say that they are significantly more reliable, or encounter fewer anomalies in their statistics, than competitors, I don't see the benefit to Wikipedia of using them as an exclusive source in other articles. Giving them prominence in this one ensures that they will be used as such. ɹəəpıɔnı 19:31, 7 June 2009 (UTC)
- Having changed the marketshare table in Microsoft Windows from being NA-only to including NA, Xiti, W3C, and OneStat, (need to update it!) I am sympathetic to your concerns. However, I think you are mistaken in believing that the prevalence of Net Applications in other articles is due to its pride of place in this article. Rather, both this article and those articles are reflecting their prevalence in the media.
- As for why Net Applications is so dominant in the media, I've thought about writing something up on a userpage about why the media prefers NA. But a big part of it, I believe, is this: regular monthly updates. It gives more chances for Net Applications to be written up than those who release their figures more infrequently. It allows reporters to plan ahead for the release of their latest figures, as opposed to outfits who release reports more irregularly. Even in stories that aren't directly about browser or OS usage stats, this gives NA an edge. If a journalist is writing up a story on Steve Jobs' health, and wants to look up-to-date, would he rather say, "according to Net Applications, OS X's marketshare last month was...", or "according to OneStat, OS X's marketshare eight months ago was..."? --Groggy Dice T | C 21:01, 7 June 2009 (UTC)
How about maintaining a table and pie-chart showing and averaging the latest results from all the major counters, and use those as the main features of the page? It would cancel out any biases. Rwxrwxrwx (talk) 10:49, 8 June 2009 (UTC)
- @Groggy Dice, I take your point about NA's notability in media contributing to their use as a source here, it's a good one. I was speaking from experience of people making points on talk pages referencing this article, but you're probably right that it's not as oft referenced as bloggers/etc. in article references are.
- As for the regular updates, I'm not sure how true that is (that this sets them apart). Take the link provided by someone two sections above this - seems a fairly new statistics source as it has yet to be added to the article that I can tell, but an example of a regular source.
- @Rwxrwxrwx, not a bad idea though I wonder how easy it would be to maintain such. I was merely suggesting toning down the prominence rather than bringing other sources up to the same level, but that's an interesting idea all the same. ɹəəpıɔnı 15:28, 8 June 2009 (UTC)
- This has been suggested before (cannot find thread right now), but there were several problems with this suggestion. For one, different providers use different methodologies and track different number of web sites, so averaging across them would not be very meaningful. Second, deriving any non-trivial statistical measure is dangerously close to original research. So either we don't have any graphs at all, or keep one for specific provider, but not derived ones. Wikiolap (talk) 15:31, 8 June 2009 (UTC)
- The most important problem with NetApplications is that it's not global. They do not take into account a good half of the world. It becomes very obvious when you look at StatCounter data, which provide quite accurate country-by-country statistics where you can see that IE has extremely large popularity in China and Asia. Safari has almost no popularity in Europe and Asia, while Opera has over 40% in CIS countries, which results in 10% for Europe and about 4% worldwide, while Firefox isn't really the choice of Eastern Europeans. Elk Salmon (talk) 19:27, 18 June 2009 (UTC)
- If NetApplications were "not global", the numbers would not match closely with the other global sources. However, NetApplications' data do, but StatCounter's do not. If anything, it looks like StatCounter and W3Counter are "not global". The statistical term for this is that the data are skewed, likely due to an unrepresentative sample. The article explains why W3Counter's data are skewed towards less popular browsers. We should figure out why StatCounter's are also. -- Schapel (talk) 14:54, 18 June 2009 (UTC)
- Still, it looks more like NetApplications makes its stats only from several western countries led by the USA and Canada, where the popularity of Apple products is relatively high. Just like OneStat. I just gave an example. Taking into account the active internet population of Eastern Europe and Asia, Safari cannot reach so high a position, as Apple products are completely unpopular in Europe. As Safari has 8% in North America, it cannot be 8% worldwide. The large internet population of Europe and Asia will reduce this number significantly. Likewise, IE has a dominant position in Asia while Opera has a dominant position in the CIS. The world is very different. It's like the CIS is all about Yandex, China is all about Baidu, the USA has large Live/Bingo popularity with no dominant position for Google, while the rest of the world is all about Google. Same as the USA is about Linux, Blackberry, iPhone and Windows Mobile on phones with not, while the CIS and Europe are all about Symbian with somewhat large popularity of Windows Mobile and zero popularity of Linux and Blackberry. NetApplications is absolutely obviously one-sided. Elk Salmon (talk) 19:27, 18 June 2009 (UTC)
- That doesn't make a whole lot of sense. NetApplications matches OneStat and TheCounter fairly closely, and all show global share. StatCounter is even more heavily skewed away from IE and towards Firefox and Opera than Adtech, which shows European usage only. If anything, it looks like StatCounter is "not global". Of course, they are all skewed to some degree one way or another, but W3Counter, Adtech, and StatCounter are among the most highly skewed stats on the page. NetApplications is relatively balanced, as are TheCounter and OneStat. -- Schapel (talk) 20:27, 18 June 2009 (UTC)
- There are two issues. Firstly, NA are more English-language oriented and more oriented towards western websites. So when they say they are reporting global stats, they are not lying. They are in fact reporting stats of visitors FROM all countries, but TO websites primarily located in the US, because Hitslinks is primarily marketed in the US. So while visitors shown in NA's stats are from all over the world, they are visitors to websites more oriented towards the English-speaking world, rather than smaller local sites in national languages. Instead of comparing NA/StatCounter to OneStat and TheCounter, try comparing them to the individual stats of smaller local sites in the national languages of any given country. Yandex.ru is probably the biggest example; it matches StatCounter quite closely.
- The second issue is the type of websites Hitlinks and such services are marketed at - NA tends to market at corporate customers which will likely lead to higher stats for IE, while marketing at blogs may lead to lower IE stats for StatCounter (as the more technically proficient users will be blog readers and not use IE, while those on corporate networks will probably be stuck with IE). Both lead to inaccuracies, my point here is all are inaccurate so none should be given prominence. ɹəəpıɔnı 02:01, 20 June 2009 (UTC)
Statistics from Wikipedia
Is it possible to include browser stats from Wikipedia-hits?
- I was also looking for this data, would be interesting. --193.166.137.75 (talk) 07:22, 3 July 2009 (UTC)
- I agree with you, would be very interesting. Anyone know how to get this data?????? --Marceloml (talk) 23:17, 18 August 2009 (UTC)
Ordering of stat providers
Shouldn't these be put in alphabetical order? (ADTECH at the top, WebSideStory at the bottom) 82.33.205.162 (talk) 14:29, 8 July 2009 (UTC)
Summary of known sample/bias for stat providers
Would it not be useful to include a brief summary of known sampling issues or bias? e.g. Net Applications is mostly for English sites, AT Institute mostly French, whether they are for one country or many, etc? 82.33.205.162 (talk) 14:29, 8 July 2009 (UTC)
IE8 compatibility mode
IE8 renders (almost exactly) the same as IE7, and declares itself as such in the User-Agent (but with an identifier that it's actually IE8 IIRC) - reference article for discussion here by Net Applications which gives a breakdown of IE8 mode vs IE8 in IE7 mode.
Given that IE8 is effectively being IE7 in this mode (link above says about 10% of the time in that study), how should this be handled? I see options as:
- Lump in with IE7
- Lump in with IE8
- Break out separately
82.33.205.162 (talk) 14:29, 8 July 2009 (UTC)
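To make the three options concrete, here is a minimal sketch of how a log parser might treat them. It assumes that IE8 in compatibility mode reports `MSIE 7.0` while still carrying a `Trident/4.0` token, which is taken here to be the identifier mentioned above. The function name `classify_ie` and the `policy` values are hypothetical and do not come from any stats provider.

```python
import re

def classify_ie(user_agent, policy="separate"):
    """Classify an IE user-agent string under one of three policies.

    policy (hypothetical names):
      "ie7"      - lump compatibility-mode IE8 in with IE7
      "ie8"      - lump compatibility-mode IE8 in with IE8
      "separate" - break it out as its own category
    """
    match = re.search(r"MSIE (\d+)\.", user_agent)
    if match is None:
        return "not IE"
    version = int(match.group(1))
    # Assumption: the Trident/4.0 token is only sent by the IE8 engine.
    is_ie8_engine = "Trident/4.0" in user_agent
    if version == 7 and is_ie8_engine:
        labels = {"ie7": "IE7", "ie8": "IE8", "separate": "IE8 (compat mode)"}
        return labels[policy]
    return f"IE{version}"

compat = "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.0; Trident/4.0)"
for policy in ("ie7", "ie8", "separate"):
    print(policy, "->", classify_ie(compat, policy))
```

Whichever option is chosen, a parser that ignores the `Trident/4.0` token silently ends up with the first one.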
Usage share of web browsers: June/July 2009
We need an update of the usage share of web browsers for June/July 2009, including an updated pie chart. There is no doubt that these statistics have changed dramatically, considering that Firefox 3.5 has recently been released. —Preceding unsigned comment added by A9l8e7n (talk • contribs) 00:04, 24 July 2009 (UTC)
- Unfortunately, the source we have been using for these statistics is Net Applications, and they are still conducting their "review" of the June numbers, and the July numbers won't be out for another week.
- However, as far as the pie chart is concerned, the release of FF3.5 probably won't affect it much, since 3.5 will mostly gobble share from earlier Firefoxes, and the pie chart doesn't break down by versions. --Groggy Dice T | C 18:39, 25 July 2009 (UTC)
- Exactly, this is getting in the way of showing how many users are actually using firefox 3.5.--A9l8e7n (talk) 19:26, 25 July 2009 (UTC)alen
- Perhaps using a source with more regular updates in the top image would suit your purposes better? I personally believe having a single image from a single source at the top of a page representing multiple (often conflicting) sources is grossly subjective, but cycling the sources of that image might make it less so at least. ɹəəpıɔnı 05:44, 31 July 2009 (UTC)
Summary table
I don't think the summary table should be showing the mean of all stats. The main problem is that some are global stats, and some are for specific geographical locations. If the purpose is to give the latest stats from each source, then let's do just that, and leave the median out. We should also probably consider leaving out non-global stats, because they are listed in the table without any explanation of what they are. -- Schapel (talk) 04:31, 4 August 2009 (UTC)
- Well, except for AT Internet Institute, all the sources from the current table are global. I still think that the median is valuable. Removing "AT-II" from the table -- ok. But, the median could be useful in creating a new chart. The current one uses Net Apps, but a more reasonable data source would be the median of all global sources (because how would we determine what 'one' source should represent the chart). This would be similar to the OS market share page. Also, I think that some of the smaller sources should even be removed from the page entirely. The page has become excessively disorderly with the amount of tables. Having more sources adds little value to the page, but an easy to read summary table and carefully picked sources would improve the page. Jdm64 (talk) 07:58, 4 August 2009 (UTC)
- I agree that some sources should be removed. mail.ru, StatCounter, and W3 Counter would be my picks for removal. -- Schapel (talk) 01:03, 21 August 2009 (UTC)
- As you mentioned, one is non-global. Another is not a one-month stat like the others, and thus skews to older data. It would seem to me that all of these can stay in the summary table, and simply not be included in the "median" calculation. Then again, neither should data older than the month expressly stated in the "median" table row, and yet we are. That seems like an error. And, of course, that's not the median that's being calculated -- it's the mean. (And rightly so, IMHO, but the label should be corrected.) Gnassar (talk) 11:49, 18 October 2009 (UTC)
- I agree stats from different months should not be included in the median or even the mean for that matter because these stats often change from month to month by whole percentage points. I think we should only include stats from the current month in the median and mean and include the latest stats from other sources in a separate summary table. 4 of the 7 sources update their stats monthly, the rest update sporadically if at all. If we have 2 tables we can show a meaningful median and mean on one with the last month and have the other table just summarize the other sources' latest stats. Whether or not the sources are globally even is not very important IMHO as long as they have a relatively large sample size; sources covering mostly Europe balance out sources covering mostly North America. In any case the Europe only sources are old and therefore would not factor in anyway. But that's just my idea. I would appreciate feedback on it. Thorenn (talk) 15:52, 1 December 2009 (UTC)
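The two-table idea can be sketched roughly as follows. All figures and source names below are hypothetical placeholders, not real data from any provider, and `split_by_month` is a name invented for this illustration. Only sources reporting for the current month feed the mean and median; the rest are listed separately.

```python
from statistics import mean, median

# Hypothetical IE shares (percent); "month" is the period each source covers.
sources = [
    {"name": "Source A", "month": "2009-11", "ie": 63.6},
    {"name": "Source B", "month": "2009-11", "ie": 56.6},
    {"name": "Source C", "month": "2009-11", "ie": 58.0},
    {"name": "Source D", "month": "2009-11", "ie": 55.3},
    {"name": "Source E", "month": "2009-03", "ie": 60.1},
    {"name": "Source F", "month": "2008-12", "ie": 67.5},
    {"name": "Source G", "month": "2009-06", "ie": 61.2},
]

def split_by_month(rows, current_month):
    """Return (rows for current_month, all other rows)."""
    current = [r for r in rows if r["month"] == current_month]
    stale = [r for r in rows if r["month"] != current_month]
    return current, stale

current, stale = split_by_month(sources, "2009-11")
ie_values = [r["ie"] for r in current]

# Table 1: summary statistics over same-month sources only.
print("mean:", mean(ie_values), "median:", median(ie_values))
# Table 2: latest figures from the other sources, listed without aggregation.
for row in stale:
    print(row["name"], row["month"], row["ie"])
```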
Wikipedia Stats of browser usage share
How can we find stats of wikipedia browser usage share??? Would be good to put these stats in this article --Marceloml (talk) 01:42, 19 October 2009 (UTC)
Prefetching, is Gecko the only one?
"Gecko-based browsers (such as Firefox) can prefetch linked web pages, potentially increasing hits. Link prefetching in Gecko-based browsers is used on pages with enhanced markup, including Google search results."
Does IE 8 do this now as well? 68.13.126.138 (talk) 00:12, 28 October 2009 (UTC)
These sites differentiate between visits and hits. They are reporting browser share by percentage of visitors, not by percentage of hits. —Preceding unsigned comment added by 76.124.117.20 (talk) 00:36, 7 November 2009 (UTC)
is ADTECH used for average?
In the summary section under the table it says: "Note that The Counter value is an average of the past 18 months, and the AT Internet Institute value applies only to Europe. Neither is therefore included in the monthly summary calculations."
But ADTECH is also Europe specific, so it shouldn't be used in the mean and median calculations either. If this is already being done, it should say so here, right? Just wanted to see what's going on with it. - The Talking Sock talk 23:59, 22 November 2009 (UTC)
TheCounter math
TheCounter reports cumulative numbers from Feb 2008. This makes its reports incompatible with everybody else's, especially for the purpose of the Summary table, where it is excluded from the median. I took the cumulative numbers for October 2009 and subtracted the cumulative numbers for September 2009; this results in pure October numbers. Note that IE8 falls under "Netscape compatible" there, because they didn't bother to include IE8 in the parsing algorithm (and the user agent does say "Mozilla/4.0 (compatible; MSIE 8.0 ...)"). I did not do it for all of their stats, just for the summary table, so now the numbers can be included in the mean and median calculations. Wikiolap (talk) 02:28, 25 November 2009 (UTC)
TheCounter is a substandard site. No point in including it at all. 95.133.48.57 (talk) 19:25, 9 January 2010 (UTC)
Feb 2009?
Can we get a more recent pie chart at the top of the page? Mathiastck (talk) 01:28, 2 December 2009 (UTC)
The counter .com
I think it's not fair to average the data from The counter .com to others, as what they measure is quite different and will show a much greater share for older browsers and browsers which used to be more popular, and a much lower share for newer browsers and browsers which used to be less popular. It defeats the purpose of having reliable data. Furthermore, The counter .com looks and feels outdated as it doesn't even recognize Google Chrome/Chromium, so I have big doubts about the reliability of it. 12:04, 18 January 2010 (UTC)
Why is "The Counter" getting included in this list? By not specifically recognizing the third most popular and fastest growing browser, it is less useful than the other sources. At least put it at the bottom of the list. —Preceding unsigned comment added by 99.247.48.243 (talk) 14:50, 19 January 2010 (UTC)
The counter dot com data, from the links on the right of the table, is not monthly as the links imply, but it starts on feb 01 for 2008 and ends on the month listed in the link such as september 08 which is actually a 243 days survey. The same seems to be true for all the other links.
60.231.195.143 (talk) 04:46, 19 December 2008 (UTC) silver_xxx
ps I reread the top posts and this has been handled correctly —Preceding unsigned comment added by 60.231.195.143 (talk) 04:49, 19 December 2008 (UTC)
I derived the monthly data for thecounter.com data for the month of November and I got a very strange result:
Browser | Share |
---|---|
Internet Explorer (all versions) | 42.77057471% |
Firefox | 21.01947469% |
Safari | 9.95037636% |
Opera | 1.025331744% |
Netscape compatible | 24.30565329% |
Other (Netscape+Konqueror+Unknown) | 0.929411471% |
Strangely enough, the Netscape compatible category is about equal to Firefox's share plus three percentage points, and three points is about equal to Chrome's share in other sources. Could thecounter.com be double counting Firefox as Netscape compatible? I know Netscape has been using Gecko since 6.x so I suppose it is possible, and it has already been established that Chrome could be counted under WebKit as Safari as well as being technically "Netscape compatible". By the way, I got these numbers by taking the difference between monthly totals as the total hits for the month in question, then I took the difference between months for each browser version. I then calculated the percentage of each browser version by dividing each of their monthly differences by the total monthly difference and multiplying by 100. I added all the Internet Explorer percentages together and lumped all Netscape versions with Konqueror and Unknown for Other. In case this is confusing, I'll show you how I got the Netscape compatible value: the difference between November and October's visitors is 92266913-89104911=3162002, but there was an error of 5-4=1 so it's actually 3162001. Next, the monthly difference for Netscape compatible was found to be 4428036-3659491=768545, and so the percentage was found to be 768545/3162001*100=24.30565328726967512027984810884.
If you don't believe these numbers then go to http://www.thecounter.com/stats/2009/November/index.php and find out for yourself. Either my whole process is flawed and this point is moot or the method thecounter.com uses is flawed and we may need to exclude it from the summary table calculations or adjust for its flaws. Thorenn (talk) 22:04, 3 December 2009 (UTC)
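For anyone who wants to re-run the arithmetic, the derivation described above can be sketched as below. The function name `monthly_share` and its parameter names are invented for this illustration. The input figures are the ones quoted in the comment for "Netscape compatible", and the `total_correction` of 1 is the "5-4=1" adjustment mentioned there.

```python
def monthly_share(cum_now, cum_prev, total_now, total_prev, total_correction=0):
    """Percentage share for a single month from two cumulative snapshots.

    cum_now / cum_prev:     cumulative hits for one browser category
    total_now / total_prev: cumulative hits across all categories
    total_correction:       amount subtracted from the total difference
    """
    month_hits = cum_now - cum_prev
    month_total = (total_now - total_prev) - total_correction
    return month_hits / month_total * 100

# Figures quoted above: November vs October cumulative counts.
share = monthly_share(
    cum_now=4428036, cum_prev=3659491,
    total_now=92266913, total_prev=89104911,
    total_correction=1,
)
print(f"Netscape compatible: {share:.4f}%")
```

This only reproduces the subtraction and division; it says nothing about whether thecounter.com's categories double count any browser.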
Objectiveness of Summary Graph and Stats
Done
Considering this article goes into detail about the different sources, does it not seem a bit subjective to include a graph and statistics in the Summary with only one source taken into account? (Especially since all the sources can be drastically different). Babydutka (talk) 20:13, 29 January 2009 (UTC)
- ... Well, there is the issue that various sources have quite different time chunks - people updating the graph tended to do it frequently which was nice. Unfortunately, Net Applications hasn't updated in a couple of weeks while they do some mysterious review. In my opinion the pie chart should switch to StatCounter - and perhaps note more prominently this is a single data source. —Preceding unsigned comment added by 162.99.35.70 (talk) 18:29, 14 July 2009 (UTC)
- There should be a diagram which includes several stats sites, so there is a possibility of getting the best result. --87.78.23.122 (talk) 01:40, 29 July 2009 (UTC)
- I would second basing the graph on StatCounter. Net Applications always seemed too US-centric, so odds are that even with their switch in how they calculate shares, they still overestimate IE's global share. The Arkady (talk) 21:23, 13 August 2009 (UTC)
- I'm open to doing the same as the Usage share of desktop operating systems page where the chart data is from the median. The addition of a summary would also be good. I'm still waiting for Net App to come out with fresh data so I can update the pages. Jdm64 (talk) 05:08, 29 July 2009 (UTC)
- Doing a median or any other derivation is problematic. Different sources have different scales (by orders of magnitude), the methodology is different (apples and oranges), etc. Also, trying to derive any measure will be dangerously close to doing Original research. Having a graph with clear attribution to its source is at least unambiguous. It serves as an example. Wikiolap (talk) 17:55, 29 July 2009 (UTC)
- Well, the desktop os page currently uses the median, and I think it works fairly well. The percentages are all close and there was already a summary table, so using the median in the graph was natural. Will it work on this page? Maybe? I'd like to see at least a summary table because it would make retrieving useful information from the page much easier. Jdm64 (talk) 20:01, 29 July 2009 (UTC)
- I don't particularly care *which* one you guys use, but I'd like to note that right now the graph claims to use the median, when that is clearly wrong: the median for IE, for example, is 58%, not 64%. The graph should make clear which source it is using at the moment. Heck, I have no idea if the numbers even match the pie slices.—Preceding unsigned comment added by 162.99.35.251 (talk • contribs) 14:49, 20 October 2009
- Using the median figures significantly overestimates FF usage (by 3%+) compared with the more conventional mean average. Is there a justifiable reason for this, given that there are both high and low extremes (Net Applications vs Clicky) in the set, so the mean wouldn't be skewed? --Psdie (talk) 23:26, 13 May 2010 (UTC)
- "more conventional mean average" The more conventional method in this case is to use median. With the mean, outliers can drastically skew the figures. citation. ~a (user • talk • contribs) 23:47, 13 May 2010 (UTC)
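A small illustration of the mean versus median point, using Python's standard `statistics` module. The five figures below are hypothetical values chosen for the sketch, not data from any of the sources discussed here.

```python
from statistics import mean, median

# Hypothetical shares (in percent) reported by five sources for one browser.
# The last value is a deliberate outlier.
shares = [30.0, 31.0, 32.0, 33.0, 50.0]

print(mean(shares))    # 35.2
print(median(shares))  # 32.0
```

With a single extreme value the mean moves by several points while the median stays at the middle source. When there are extremes on both sides, as Psdie suggests, the two measures can end up much closer together.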
Mail.ru stats
- IE: 48.76
- Opera: 26.39
- Firefox: 20.67
- Safari: 0.56
These stats are quite surprising. Less than half use IE!? Opera more popular than Firefox!? And does no one have Macs in Russia? Even SeaMonkey's beating Safari! Anyone know how reliable this site is? 71.155.236.174 (talk) 08:39, 4 July 2009 (UTC)
- My issue with mail.ru stats is that they don't keep history - the link always points to today's data, so it is impossible to verify how the stats looked last month or last year. Wikiolap (talk) 03:20, 6 July 2009 (UTC)
- I don't think there's any question of reliability when it comes to a site providing its own statistics of its own visitors, although mail.ru's stats are a little different than StatCounter's (who have IE at around 33%) which I'd usually consider the most reliable global stats source. Opera's usage in that part of the world is the main reason YUI have Opera in the "A-Grade" category of their graded browser support guidelines.
- The question of mail.ru stats history is one I've been trying to figure out myself, I wasn't aware they didn't keep a history at all, I just thought their stats history was quite difficult to find as I don't speak Russian. Are you sure they delete them completely? ɹəəpıɔnı 03:38, 6 July 2009 (UTC)
- I wasn't able to locate historic stats from mail.ru (I do speak Russian). Perhaps they exist somewhere, but well hidden. Unless someone can find a link to them, we are risking breaking WP:V. Wikiolap (talk) 04:21, 6 July 2009 (UTC)
- I found this which seems to be almost the same. If I'm interpreting it all wrong, apologies - as I said, no russian - but it seems to show stats for mail.ru, whereas the stat.mail.ru site would presumably show stats for all mail.ru domains, including the likes of blogs.mail.ru and video.mail.ru. ɹəəpıɔnı 06:28, 6 July 2009 (UTC)
Look at real browser stats in Russia, Ukraine or Belarus, before saying stupid things:
http://gs.statcounter.com/#browser-RU-monthly-200910-201001-bar
http://gs.statcounter.com/#browser-UA-monthly-200910-201001-bar
http://gs.statcounter.com/#browser-BY-monthly-200910-201001-bar
95.133.48.57 (talk) 19:29, 9 January 2010 (UTC)
Hmm, looks like the mail.ru stats were deleted on Jan. 3 without any discussion. Local statcounter data for Russia can be found here. Currently, Opera has 36% browser share, followed closely by IE and Firefox. Esn (talk) 09:07, 19 May 2010 (UTC)
Removing TheCounter
I think TheCounter data should be removed from this article. They exclude Google Chrome, and apparently also Internet Explorer 8. Furthermore, their way of reporting statistics is incompatible with other websites. We have enough sources, so if there are no objections I will remove TheCounter. Mushroom (Talk) 12:22, 18 January 2010 (UTC)
- I don't see either of the above points as grounds to remove TheCounter. If we were to only include statistics from companies deemed (in the SUBJECTIVE opinion of an arbitrary Wikipedia editor) "reliable", we may as well remove every table from the page. It is not Wikipedia's place as an encyclopaedia to judge the performance of companies included in its articles - it just documents objectively. TheCounter seems to meet inclusion requirements for Wikipedia (e.g. WP:Notability) - beyond that, I don't see any reason not to include it. ɹəəpıɔnı 17:35, 18 January 2010 (UTC)
- Actually I don't think it is notable, and that would be another reason to remove it. The same is true for most other sources, by the way, except maybe Net Applications which seems to be the only one notable enough to have its own article. Anyway, since you disagree with my proposal I will leave it there. Mushroom (Talk) 18:01, 18 January 2010 (UTC)
- I am not opposed to removing TheCounter from the article. Editors of the Usage share of operating systems also decided to remove this source. Wikiolap (talk) 18:26, 19 January 2010 (UTC)
- I agree we should remove it. It has multiple issues that make it difficult to work with and which make it incompatible with the rest of the stats. The only reason I did not remove it when I changed the layout of the summary table was that it updates monthly although even that has issues. The main problem is that it has a historical usage share table, the removal of which would probably be disruptive if it were done wholesale; perhaps we should move it to its own section since it isn't outdated as far as having a lack of new data goes. The summary table can be changed to exclude it from the calculations of mean and median like the non-monthly reports. On that note does any one object to those reports being moved to the older reports section since they are no longer relevant to the present year? Thorenn (talk) 19:39, 19 January 2010 (UTC)
- I concur with this proposal, except TheCounter shouldn't be included in the summary table at all because of its bad sampling method; it would only be misleading. I also think the calculation method should be included in each section, and those sources that do not publish their methodology, if any, should be removed. At this point it's only reported in w3counter. --Sapeli (talk) 05:52, 20 January 2010 (UTC)
- I agree TheCounter has a bad sampling method but it would be problematic to remove it from the summary table; at least in its current form. Perhaps we should label the main sources as something along the lines of "Monthly updating sources with data from only the relevant month" or place an end-note to that effect. As for your second comment, I am not sure what you mean by "methodology". All the sources give information on how their data is generated: StatCounter Net Applications StatOwl and of course W3 Counter. —Preceding unsigned comment added by Thorenn (talk • contribs) 15:36, 20 January 2010 (UTC)
- Any counter included here should be reliable and up-to-date. If The Counter doesn't meet those standards, it should go. Regarding the summary table, it was originally introduced simply to provide average figures for feeding to the pie-chart. Any counter which does not contribute to the average figures should not appear there; all it does is clutter up the table. Rwxrwxrwx (talk) 15:52, 20 January 2010 (UTC)
- There are many sources in the article that do not have up-to-date information. It has been this way for years. We should retain older sources of information so we can see the change over time. TheCounter is an important source in this respect because it was one of the few sources that had information in the first several years of the millennium. -- Schapel (talk) 16:18, 20 January 2010 (UTC)
- I phrased that badly; what I meant was that any counter which is not reliable and up-to-date should not be included in the summary table or be regarded in the same light as those which are reliable and up-to-date. You're right that if it's historically significant it should be kept, in some form. Rwxrwxrwx (talk) 16:28, 20 January 2010 (UTC)
- Since we already have an "Older reports" section, we could just move TheCounter and other unreliable or outdated sources there, and remove them from the summary table. This way no information would be lost (except from the table, but it wouldn't be very significant), and we would solve the problem. By keeping TheCounter at the top of the list we are implying that it is one of the most reliable sources, which it clearly is not. Mushroom (Talk) 16:55, 20 January 2010 (UTC)
- If we move TheCounter's historical usage share table to the Older Reports section we would have to rename it to include unreliable sources because TheCounter's data is technically up to date while still being unusable for calculations. If we reserve the Summary Table for sources used to derive mean and median then the Other sources part of the table will need to be removed. This is not problematic for most of the sources listed because they are listed again in their historical usage share tables but the recently added Wikimedia stats are not listed elsewhere on the page and I don't know where else it could be put should we choose to retain it. Thorenn (talk) 18:10, 20 January 2010 (UTC)
- TheCounter's data is not up-to-date, because they have no data for January 2010. How about we wait until February 1, 2010, and if there still isn't any TheCounter data for 2010, we move it to the Older Reports section, which is exactly where it will belong. -- Schapel (talk) 20:00, 20 January 2010 (UTC)
- Of course they have no data for January 2010; the month is not over yet! None of the sources give preliminary data (with the exception of StatCounter) unless you are a paying member. Therefore, as far as having data from the previous month TheCounter's data is up-to-date. Had it not been it would be a simple matter to move it to the Other sources part of the Summary Table and move its historical usage share table to the bottom like the rest of the old sources that no longer update. Since it technically qualifies as up-to-date, moving or removing it is a non-trivial matter since we have to redefine what sources are suitable for inclusion in the article and where they should be in the page. Thorenn (talk) 20:31, 20 January 2010 (UTC)
- Every month for the past several years, TheCounter has had data available on the first of the month. They have never before waited until the end of the month to provide data. In the past, it has been updated every hour of the month. The fact that this has not been happening for 20 days and hasn't been fixed yet means it's unlikely for it to be fixed soon. -- Schapel (talk) 21:36, 20 January 2010 (UTC)
- But it has been updated and has been since the 31 of December; what wasn't updated was Wikipedia's link to the December data. I have fixed that now so it links to this instead of this. Thorenn (talk) 22:19, 20 January 2010 (UTC)
- But every month for the past several years, TheCounter data for the current month has been updated on the TheCounter site every hour. On January 20, 2010, there is still no data for January 2010 at the TheCounter site (the link gives a 404 error). Therefore, the TheCounter site is not up-to-date. If this continues until February 1, 2010, I propose we rightfully move TheCounter data to the Older Reports section. -- Schapel (talk) 01:51, 21 January 2010 (UTC)
- If this is the case then yes we can take that action then. In the meanwhile I have moved the stats in the Other sources part of the table to the Older reports section so it does not clutter up the Summary Table. —Preceding unsigned comment added by Thorenn (talk • contribs) 17:36, 21 January 2010 (UTC)
Swapping Chrome and Safari
Chrome keeps gaining usage share and Safari is practically stagnant in comparison. Should there be a point at which we should swap the places of Chrome and Safari for the whole page? If so, have we already passed that point? If we do this it will require a major overhaul of the page layout because columns cannot be interchanged easily. Every month we add at least another 4 rows of data so every month we spend without a decision on this means more work if we eventually do decide to swap them. If you are interested in this page please weigh in on this issue. Personally I think we should do it but section by section and involving multiple editors if possible perhaps using this template. Thorenn (talk) 17:46, 1 February 2010 (UTC)
- I'm totally for it. I might be able to do the conversion in one go using this converter. Jdm64 (talk) 22:04, 1 February 2010 (UTC)
- I did one table, others can help with the rest Jdm64 (talk) 04:37, 2 February 2010 (UTC)
- I did the rest, all the historical usage tables are now consistent in the Chrome, Safari order Dangrossman (talk) 08:40, 2 February 2010 (UTC)
Thank you Jdm64 and Dangrossman for your work, but there are still the older report sections that haven't been changed. Do you think we should convert those to the new order too for consistency, or leave them as is because they do not show Chrome's recent rise? Thorenn (talk) 14:45, 2 February 2010 (UTC)
- I'd say keep how they are. The tables should be sorted from left to right, larger to smaller for the most recent data. Jdm64 (talk) 05:13, 3 February 2010 (UTC)
I think that during the conversion, the asterisks that mark the older, unweighted data for NetApplications were lost. Please fix this, and look carefully for other subtle unintentional changes. Thanks. -- Schapel (talk) 13:18, 3 February 2010 (UTC)
Accuracy Notes
I feel the accuracy notes may be misleading. Correct me if I'm wrong, but all 5 services this page pulls stats from are image/javascript-based trackers. None are web log analyzers. In that light, some of the sources of overestimation listed do not actually occur:
"A web browser that refreshes the webpage at a regular time interval." -- These services all report browser share by visitors, not by page views. Refreshing the page does not alter these numbers.
"A feed reader that requests the RSS or Atom feed at a regular time interval." -- A feed reader is only downloading the feed, XML content, it's not executing any JavaScript or images these trackers could use. I have never seen an RSS feed contain one of these website trackers -- it doesn't make sense, as so many people use feed readers that wouldn't parse embedded JavaScript. The top feed readers would strip it out for security reasons. People that want feed stats get them by proxying their feed through something like FeedBurner, not by pasting a web tracker into the feed bodies.
"Extra files like CSS hacks and JavaScript hacks are often sent to Internet Explorer." -- These are not log analyzers, those requests are never recorded by these services.
"Many types of software, such as Web validators or crawlers, fetch web pages, and send fictitious user-agent strings to appear more like normal traffic." -- Web crawlers do not execute the tracking JavaScript and run the code that inserts the tracking image into the page, they are not counted by these types of services.
And from the underestimate section:
"Generally, the more faithfully a browser implements HTTP's cache specifications, the more it shall be under-reported relative to browsers which implement those specifications poorly" -- These companies send appropriate headers to tell the browser not to cache tracking code, so the opposite is true -- a faithful implementation of cache specification will result in not caching and recording the visit.
"User Agents are not guaranteed to be a certain format. As an example of the inconsistency almost every User Agent pretends to be Mozilla 5.0." -- The fact that almost every user agent includes that as PART of the otherwise browser-specific string does not make it inconsistent.
Dangrossman (talk) 08:54, 2 February 2010 (UTC)
- As the article says, "Measuring browser usage in the number of requests (page hits) made by each user agent can be misleading." That's why usage share is measured by tracking visitors, not page hits in the log. I'm not sure why people keep elaborating about how using page hits is misleading. -- Schapel (talk) 12:51, 2 February 2010 (UTC)
Only headers, no footers?
I just noticed that all the footers from the tables were removed. These were useful for the longer tables that do not fit entirely in the browser window, so you could see which columns are for which browsers without scrolling back up to see the headers. Could we add the footers back to the tables that don't fit entirely in the window? -- Schapel (talk) 16:47, 5 February 2010 (UTC)
- I have added footers to the two tables that don't fit in my browser window. -- Schapel (talk) 15:05, 9 February 2010 (UTC)
W3Counter change
It looks like W3Counter retroactively added the total percentages of the top five browsers (in addition to the top ten browser versions) to all their stats. From what I can tell, the table in the article lists these new numbers from January 2009 to present, but has the older data inaccurately calculated by summing up the browser versions prior to January 2009. Someone should go back and fix the 2007 and 2008 data in the table with the updated numbers. -- Schapel (talk) 21:20, 5 February 2010 (UTC)
- I have corrected all the numbers in the table with the updated numbers. -- Schapel (talk) 15:05, 9 February 2010 (UTC)
Gecko column in the StatOwl table
What's up with the Gecko column in the StatOwl table? Where does the data come from? What browsers does Gecko represent? It clearly isn't Gecko-based browsers. No other tables in the article have a column named Gecko, either. Can someone explain this mysterious column? -- Schapel (talk) 00:43, 7 March 2010 (UTC)
- I suppose I should remove it if we can't verify the information. -- Schapel (talk) 14:06, 9 March 2010 (UTC)
- But Gecko is in these charts! Maybe they identify Minefield builds? Maybe it is another browser. I don't know. I didn't find any results about that! mabdul 15:42, 9 March 2010 (UTC)
- What charts? Where? The only other item I can find in the charts and graphs is Other, and this number does not match the number reported for Gecko. Where does the data come from? I suppose I should remove it if we can't verify the information. -- Schapel (talk) 16:35, 9 March 2010 (UTC)
- I will look again. Maybe they changed their stats. I know that there was a Gecko percentage! mabdul 12:51, 13 April 2010 (UTC)
- The Gecko column seems to refer to all non-Firefox Gecko-based browsers not Minefield builds. Although no other tables in the article have a column named Gecko they do have other labels which may refer to the same thing; Net Applications has Mozilla, W3Counter has Other Mozilla, and StatCounter has Mozilla listed in its raw data. The values in those categories are roughly similar to the Gecko column and drilling down gives http://statowl.com/web_browser_usage_by_subversion.php?timeframe=last_6&interval=month&chart_id=4&fltr_br=&fltr_os=&fltr_se=&fltr_cn=&trends=[]gecko|1&x=110&y=38 so I think it should stay. Thorenn (talk) 16:01, 9 March 2010 (UTC)
Some Possible New Data Sources
http://marketshare.hitslink.com/browser-market-share.aspx?qprid=0
http://getclicky.com/marketshare/global/web-browsers/
Just some I noticed and thought I'd put up for debate. Any others anyone has in mind are of course always welcome. —Preceding unsigned comment added by Zamadatix (talk • contribs) 18:53, 11 March 2010 (UTC)
- We've had the first source in the article for years. The second source might be good to add, but it seems like data only for individual days is available. I would prefer if we could get the average data over the period of a month or a quarter (three months) so the data would not fluctuate so much. -- Schapel (talk) 16:00, 12 March 2010 (UTC)
- The second source looks promising if we average the data over a month or quarter ourselves. On the other hand, the data starts only last November, and in the middle of it at that. I suppose we could put a footnote about that though. On a related note, what are your opinions on including Wikimedia stats in the summary table now that we have found monthly stats up to the present with a good likelihood of consistent data for the future? Do you think it would be valid? According to Alexa Internet, Wikipedia is the 6th most trafficked website in the world but is underrepresented among people over 55, people who didn't go to college, and people with children, and overrepresented among people under 24 and people without children. It is also overrepresented among people accessing it at school; in all other groups it is "similar to the general internet population". Do you think that makes it acceptable enough to be compared to our other sources? Keep in mind we did put it in when we had the data from November 2009. Thorenn (talk) 19:01, 12 March 2010 (UTC)
- I sent an email requesting that the raw data be provided as a .txt file (since that's what the flash app loads, except it's on a local directory of the server) or by some other means. Hopefully I will get a response soon. Zamadatix (talk) 16:33, 14 March 2010 (UTC)
- The response was "Hi, sorry we don't have that available at this time. It is something we plan to add in the future though." Zamadatix (talk) 21:02, 14 March 2010 (UTC)
- When I said we could average the data over a month or quarter ourselves, I meant take the data for each day in a given month or quarter and average it over the period of the month or quarter in question. The information already there is unwieldy to access for calculations but it is possible to work with. Also, if we do include it, what would we do with the November 2009 data since it is not for the whole month? Thorenn (talk) 11:56, 15 March 2010 (UTC)
- Include everything and make a note that the data for Nov 09 isn't fully available... mabdul 14:29, 15 March 2010 (UTC)
- I have found the text file containing the data for the graph, it can be found here. I have also started to build the table for the historical usage share by averaging the daily data over each respective month. So far I have the historical monthly share for IE and Firefox which extends back to Late September not Mid-November as we thought. Apparently we were looking at the wrong graph it is Top families not Top versions. Also there is no option to view just one month or one quarter like the other sources so we won't be able to have month or quarter specific links in the tables. My preliminary work can be found here, feel free to add to it. If you want to know how I produced this work I will post instructions to replicate it on that page shortly. Thorenn (talk) 19:20, 16 March 2010 (UTC)
- This is what I have so far. If there are no objections to its current state I will put it in the article after the note about the partial month data is added. To see my raw data click here.
Clicky.com (Late September 2009 to present)
Period | Internet Explorer | Firefox | Chrome | Safari | Opera | Mozilla |
---|---|---|---|---|---|---|
February 2010 | 49.92% | 32.80% | 8.09% | 7.54% | 1.41% | 0.23% |
January 2010 | 50.71% | 32.89% | 7.79% | 6.85% | 1.51% | 0.23% |
December 2009 | 50.87% | 33.40% | 7.70% | 6.17% | 1.58% | 0.25% |
November 2009 | 52.21% | 32.97% | 7.80% | 5.15% | 1.55% | 0.31% |
October 2009 | 53.10% | 32.71% | 7.68% | 4.66% | 1.45% | 0.37% |
September 2009 [Note 1] | 54.58% | 31.96% | 7.44% | 4.25% | 1.34% | 0.41% |
- ^ The September average involves only data from September 21st onward as this is when data was first collected.
Thorenn (talk) 01:07, 19 March 2010 (UTC)
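The per-month averaging of daily figures described in this thread could be done along the following lines. This is a minimal sketch: the input format (a list of date and share pairs) and the name `monthly_averages` are assumptions made for illustration, since the layout of Clicky's raw text file is not shown here, and the sample values are invented.

```python
from collections import defaultdict
from datetime import date


def monthly_averages(daily):
    """Average daily share values per (year, month).

    `daily` is an iterable of (date, share_percent) pairs (assumed format).
    """
    totals = defaultdict(lambda: [0.0, 0])  # (year, month) -> [sum, count]
    for day, share in daily:
        bucket = totals[(day.year, day.month)]
        bucket[0] += share
        bucket[1] += 1
    return {month: total / count for month, (total, count) in totals.items()}


# Invented sample values for one browser, for illustration only.
sample = [
    (date(2009, 9, 21), 54.0),
    (date(2009, 9, 22), 55.0),
    (date(2009, 10, 1), 53.0),
]
print(monthly_averages(sample))
```

A partial month such as September 2009 is averaged over only the days that are present, which matches the footnote on the table above.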
- I'd drop Konqueror. We don't include it in any of our other tables, even though NetApp does have data on it. The percentage is smaller than the margin of error of the data (or far too close to it). Other than that, it's ok. Jdm64 (talk) 02:36, 19 March 2010 (UTC)
- Then replace Konq with Other and add the rest to it! mabdul 10:49, 19 March 2010 (UTC)
- There is a category called Other/Unknown in the Clicky data but it is all marked as zero; Other is probably less than 0.01 percent in the data, so if the Konqueror data were replaced and added to Other it would be practically the same as just Konqueror. It is probably easier just to drop Konqueror, as I have done above. I have also added a possible note for the September data and changed the URL to point to the appropriate flash graph. After I get feedback on this and there are no further issues I will transfer it to the article and modify the summary table to include it. Thorenn (talk) 17:19, 19 March 2010 (UTC)
- Something is totally wrong with their data! Don't include them! As you can see, I expanded the table with the sum for the months and some are above 100% WITHOUT the Konqueror data. Really strange! mabdul 18:00, 19 March 2010 (UTC)
- It appears that the fact that it doesn't add up is due to rounding errors; for example, this table:
October 2009
IE | Firefox | Chrome | Safari | Opera | Mozilla | Sum w/o Konq | Sum w/ Konq |
---|---|---|---|---|---|---|---|
53.09645161 | 32.71419355 | 7.679677419 | 4.661290323 | 1.452580645 | 0.372258065 | 99.97645161 | 99.99645161 |
I fixed the number format. There's no reason to have numbers like 07.34. I'd also drop the sum/other column. None of our other tables have it, but also because the values go over 100%. It's ok, that things don't add up to 100% as that would usually mean an "other" category, but going over 100% shows rounding or double counting errors. W3Counter only shows the top 10 and adds up to about 95%, and that's fine. Jdm64 (talk) 19:45, 19 March 2010 (UTC)
- Am I the only one who can see October/Firefox = 32.71419355 in one table and 33.70% in the other. That's not a rounding error, it's an error error. I don't see any numeric data at the ref given so I can't tell who's right. I also don't see how a 1% error gives us 110% total. I think adding the 'totals' column temporarily in a spreadsheet version could show up error errors like this, and that may be worth it if anyone has the time (and access to the raw data to sort out typos if found) --Nigelj (talk) 20:04, 19 March 2010 (UTC)
- OK, I have now fixed the table/column I summed up. Maybe I will find time to look at the raw file. mabdul 21:20, 19 March 2010 (UTC)
- Sorry for the error I made, I accidentally put the values for the last day of each month instead of the average for each month for Firefox. As you can see I have fixed this error and the totals are more reasonable now; anything else that is inconsistent should be attributable to rounding errors. In case you want to verify my data the raw data provided by Clicky is available here and my derived data can be found here.
- Ok, looks good. I don't see any glaring errors. I'd still drop "sum" because none of the other tables show that. Jdm64 (talk) 20:24, 20 March 2010 (UTC)
- jep, let it free ;) mabdul 20:49, 20 March 2010 (UTC)
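The temporary totals column suggested in this thread amounts to a simple sanity check: sum each row and flag any row that goes noticeably above 100%. The sketch below uses the October 2009 figures quoted above; the helper name `exceeds_100` and the 0.05 tolerance are arbitrary choices for illustration.

```python
def exceeds_100(shares, tolerance=0.05):
    """Return True if a row of percentage shares sums to more than 100 (plus tolerance)."""
    return sum(shares) > 100.0 + tolerance


# October 2009 averages quoted above (IE, Firefox, Chrome, Safari, Opera, Mozilla).
october_raw = [53.09645161, 32.71419355, 7.679677419, 4.661290323, 1.452580645, 0.372258065]
october_rounded = [round(value, 2) for value in october_raw]

# Same row, but with the mismatched Firefox figure of 33.70 mentioned above.
october_with_typo = [53.10, 33.70, 7.68, 4.66, 1.45, 0.37]

print(round(sum(october_rounded), 2))  # 99.97
print(exceeds_100(october_rounded))    # False
print(exceeds_100(october_with_typo))  # True
```

Rounding each value to two decimals shifts a row's total by a few hundredths at most, so a total that is a full point over 100% points to a data entry error rather than to rounding.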
Wow, nice work :)! i think we can add this to the Summary table now? 72.241.145.229 (talk) 13:27, 21 March 2010 (UTC)
- Before we add it to the summary table, let's clean up the table to make it like the other tables in the article and then put it in the article. Is it okay if I remove the Sum column now? -- Schapel (talk) 14:25, 21 March 2010 (UTC)
- I did this already. Was only added to indicate that something was wrong! mabdul 15:47, 21 March 2010 (UTC)
- We also need to indicate where the data comes from. We should link to the text files that contain the data. -- Schapel (talk) 16:20, 21 March 2010 (UTC)
- Yeah, the problem is that we have only a single raw file. We should also add this to the summary table and correct the mean and median. Can somebody explain to me in which order we list the different tables? mabdul 16:41, 21 March 2010 (UTC)
- I think it makes sense for the order in the summary table to be the same as the order of the detailed tables in the article, and the order of the detailed tables seems to be the ones with the data going farther back in time come first. That would put the Clicky.com data at the end of the section after Wikimedia and at the bottom of the summary table. -- Schapel (talk) 18:39, 21 March 2010 (UTC)
mmh. ok, by this logic the order (at that moment) is wrong. but why this order? it doesn't make real sense to me. alpha order or something else would be better. mabdul 19:33, 21 March 2010 (UTC)
- The tables are in chronological order, just as the order of the rows within the tables, and the order of the sections in the article. I think it's been that way for years. Why mix chronological order with alphabetical order? -- Schapel (talk) 20:54, 21 March 2010 (UTC)
- I do think you misunderstood me. Why not order them all in alpha order? It doesn't make sense to me in this order. Why chronological order (since they started to report)? That doesn't say anything about the quality (i.e. statowl has a us-focused stats) nor the numbers of sites they represented! mabdul 20:51, 23 March 2010 (UTC)
Wikimedia stats
I just added some more stats to the Wikimedia table and moved it into the main section. There are plenty more where they came from. Here is a good index page for them if anyone's interested. --Nigelj (talk) 23:41, 12 March 2010 (UTC)
- Shouldn't we add Iceweasel to the Firefox data? mabdul 16:03, 21 March 2010 (UTC)
Older stats
Does anybody know why the stats from some pages (in the "older reports" section) are no longer current? Were the companies taken over? Or are there other reasons? Maybe we should leave a note (since the companies/pages don't have their own Wikipedia articles). mabdul 02:09, 13 March 2010 (UTC)
- Which one in specific? Most of the sites on this article did not report a reason for retirement. ~a (user • talk • contribs) 02:29, 13 March 2010 (UTC)
- How about adding stats older than these? I found some while searching for information about Cello: old Yahoo stats and maybe [2] [3]. More will follow. mabdul 11:23, 28 March 2010 (UTC)
Stats from other countries/continents
Shouldn't we reflect that FF and Opera have a higher usage share in Europe, that Safari and IE have a lower one, and that IE6 is dominant in some countries (like Korea)? There are different stats out there! mabdul 22:22, 26 March 2010 (UTC)
- Yes, I agree. I have added StatCounter's data. Sandro kensan (talk) 20:52, 24 May 2010 (UTC)
Mobile Browser stats
Shouldn't we create a new article for (or integrate into this article) browser stats for mobile browsers / phones? This is something really new, but mobile browser usage will grow (and take a larger percentage of the whole global market). mabdul 12:52, 13 April 2010 (UTC)
Pie chart too complicated to update
The existing pie chart is really complicated to update; I've given up trying. It involves updating an obscure template with the figures, and also using R (which very few people have installed, or the knowledge to use) to generate the new SVG file. What's wrong with using the simple solution, the Google Charts API (or another similar service)? Just use the following URL (for April 2010 data):
... and a PNG file is generated with everything we need. Rwxrwxrwx (talk) 16:18, 6 May 2010 (UTC)
- The chart is not hard to update. The source code for the R script that I use to generate the chart is in the description. All it takes is updating the numbers and rerunning the script. R is free and open source and is found on all major systems (lin/mac/win). If you can't update the chart that's ok, because that's what I've been doing -- since it's my chart. Jdm64 (talk) 20:17, 6 May 2010 (UTC)
- Upon reading this section I decided to try out the Google Charts API and I ended up creating this:[4]. Perhaps this is unsuitable for this article, but I was wondering if it would be useful in the Market adoption and Usage Share sections of the Firefox and IE articles. I would appreciate any feedback on this matter. Thorenn (talk) 23:58, 7 May 2010 (UTC)
European Data from StatCounter
I have added European data from StatCounter. The table can be moved to a specific section such as Country/Continent stats. Sandro kensan (talk) 16:44, 23 May 2010 (UTC)
Wikimedia 'Squid reports' happening again
I just found that Squid reports for March and April 2010 have appeared in that series. A few months ago we updated these figures and moved them into the 'current' section, for inclusion in the summary table. Then they dried up after Feb '10, so we moved them out to 'historical' again. Now they're back, is it time to do it all again? Surely someone here can find out who maintains these figures; are they going to be updated; and can we rely on them being updated before we go through the whole business again? --Nigelj (talk) 10:01, 1 June 2010 (UTC)
Maybe ping Erik_Zachte? He's credited at the very bottom of those reports 200.101.119.195 (talk) 12:58, 1 June 2010 (UTC)
- Done --Nigelj (talk) 14:20, 1 June 2010 (UTC)
I would like to see monthly reports for at least two months in a row before we use them in the summary table. -- Schapel (talk) 13:14, 1 June 2010 (UTC)
I run this report manually from time to time for all missing months. Someday I would like to have this fully automated, and publish the report soon after completion of month. One of those things on my todo list that never reaches the top. There is always extra work to do before partial automation can be turned into full automation, e.g. extra checks and safeguards that now are done manually, autocorrective behavior (e.g. on incomplete or missing dates in squid logs) etc. Also once in a while I make small changes before next run (e.g. add new scan patterns for mobile platforms, like iPad recently). Expect May report soon. Erik Zachte (talk) 14:38, 1 June 2010 (UTC)
- Thanks for the response, Erik, and thanks for producing and maintaining the stats. We do find these useful here, as they represent a large, worldwide sample of people with, presumably, very little selection bias. Hopefully, now we know that they are likely to be available for the foreseeable future, we can persuade each other that they're worth promoting back to 'current' and monitoring. Thanks again for the effort you make to produce them in this easily-consumed way. --Nigelj (talk) 15:16, 1 June 2010 (UTC)
I've just found the May data online as promised, so I have updated the table, and taken the liberty of moving it into the current, rather than the out-of-date, section. Thanks Erik. --Nigelj (talk) 16:43, 3 June 2010 (UTC)
Updating summary table
Updating the summary table piecemeal is confusing and results in many more edits than necessary, because some editors put the latest data in the summary table but not in the other tables, and others don't understand when the median should be updated. The table is hard to read when it combines two different months. Why don't we simply not update the summary table at all until all the data for a month is in? -- Schapel (talk) 18:59, 3 June 2010 (UTC)
- Given that people have itchy fingers when the data is available, that might be hard to implement. I would suggest removing the mean from the table, as it is only the median which gets used further (in the pie chart). Also, remove the IE "example" figure from the lead section (like I did before but was quickly reverted). Also, as discussed above, make the pie-chart update process simpler and open to all instead of relying on one expert user to keep it updated using specialised software. Rwxrwxrwx (talk) 12:03, 7 June 2010 (UTC)
- The example is important because some people can't understand a completely abstract explanation and need a concrete example. The mean is useful because people who have experience with statistics can see how skewed the data are. The pie chart should be redone using SVG that can be updated by anyone with a text editor. I will change the note regarding updating the table and see how it goes. -- Schapel (talk) 10:18, 1 July 2010 (UTC)
- Why not add an HTML comment in the table sections to indicate that everybody should wait until the development status --> maybe some people will wait o.O Give it a try! mabdul 10:51, 1 July 2010 (UTC)
World Wide Web Survey 1994-1998
How come the other browsers in the GVU WWW user survey (January 1994 to October 1998) aren't listed? There are also other surveys at [5]. Smallman12q (talk) 14:02, 5 June 2010 (UTC)
Piechart
Two points. First, I noticed today that the percentages used in the piechart File:Web browser usage share.svg (including 1.29% for 'other') do not add up to 100%. I don't know how the R software is fudging this, but at present something isn't right.
Second point is that I noticed this because I have just written a new SVG version of that piechart that uses client-side ECMAScript to create 'dynamic' SVG to generate the segments from the same input data. This works fine in Firefox and Chromium, and I have put a copy at http://www.mistweb.net/BrowserUsageShare.svg to test it. The problem is that when I try to upload it to Commons, I get an error, "This file contains HTML or script code that may be erroneously interpreted by a web browser." I have asked about this at http://commons.wikimedia.org/wiki/Commons_talk:Media_for_cleanup#Dynamic_SVG.2C_ECMAScript but I expect the problem may be permanent and is a protection for the server-side code that Commons uses to generate PNGs for non-compliant browsers.
I wonder what people here think. I could arrange for my SVG script to output a final version of the SVG markup if anyone was interested. So far creating it this afternoon has just been an interesting exercise for me, but I'm happy to tidy it up some more if it will be of any use here. --Nigelj (talk) 19:20, 1 July 2010 (UTC)
- I really don't see the complication with R. And it's not necessarily fudging it; it's just that even the "other" value uses the median, so the values don't add up to 100%. But this is perfectly fine because we are supposed to be compiling information from sources, not messing with it or making our own source. R treats the numbers as values, not percentages. It then creates the percentages to display based on the sum, so the displayed percentages are normalized to the sum. Jdm64 (talk) 20:46, 1 July 2010 (UTC)
- Well, one important difference of the fudging is whether 50.53% is a tiny bit more than half (as in my diagram) or quite a lot more than half (as it appears in the current one). If that percentage drops to 49% soon, then using the current method that will still look like more than half. It's out by over 2%. Where does the 1.30% 'Other' figure come from? I don't see it in the summary table. --Nigelj (talk) 15:53, 5 July 2010 (UTC)
- The "other" is the median of what remains of 100% for each source: I summed each source's data, subtracted that from 100%, and then took the median. The "other" median isn't in the table (I think it should be). I could raise the "other" value to make the chart add up to 100%, but then it would no longer match what we're gathering from the sources. Currently the median "other" is 1.29%, but to correct the chart it would have to be 3.38% (which would make it larger than Opera, when it should be smaller), which is 163% larger than what it should be. The other sections, by contrast, are only 2% from their median value, and only Internet Explorer is noticeable. I side with taking the approach that introduces the least amount of deviation from the data of the source table. Jdm64 (talk) 20:12, 5 July 2010 (UTC)
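The normalization discussed in this thread can be sketched roughly as follows. The figures 50.53%, 1.29% and 3.38% come from the comments above; the total of 97.91 for the charted medians is inferred from them as 100 - (3.38 - 1.29), and the lumped "Remaining browsers" value is a placeholder derived from that total, not data from the article. The function name `normalize` is illustrative and is not the R script used for the chart.

```python
def normalize(values):
    """Scale raw values so they sum to 100, as a pie chart does with its input."""
    total = sum(values.values())
    return {name: 100.0 * v / total for name, v in values.items()}

# Median values in percent. "Remaining browsers" lumps together everything
# other than IE and Other; it is derived (97.91 - 50.53 - 1.29), not quoted.
medians = {
    "Internet Explorer": 50.53,
    "Remaining browsers": 46.09,
    "Other": 1.29,
}

shares = normalize(medians)
inflation = 100.0 / sum(medians.values()) - 1.0  # relative growth of every slice

print(f"IE slice as drawn: {shares['Internet Explorer']:.2f}%")
print(f"Relative inflation of each slice: {inflation:.2%}")
```

Under these assumptions every slice is drawn about 2% larger (in relative terms) than its median, which puts the IE slice at roughly 51.6% rather than 50.53%.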
Please restore the piechart. --79.44.80.151 (talk) 09:45, 1 August 2010 (UTC)
StatOwl in Summary table
I note that the StatOwl numbers have a note: "92% of sites monitored by StatOwl serve predominantly United States market." which I presume explains why their IE and Safari numbers are the highest while their Firefox, Chrome and Opera numbers are the lowest. Also StatOwl is the slowest in getting their most recent data up. Can we rename the Summary table to 'Global summary table' and drop StatOwl? -- Limulus (talk) 22:03, 2 March 2010 (UTC)
Global summary table
would look like this... it might be worth while to drop the median row and just bold the entries that are median... -- Limulus (talk) 22:13, 2 March 2010 (UTC)
Source | Latest report | Internet Explorer | Firefox | Chrome | Safari | Opera |
---|---|---|---|---|---|---|
Net Applications | February 2010 | 61.58% | 24.23% | 5.61% | 4.45% | 2.35% |
W3Counter | February 2010 | 48.70% | 32.10% | 6.80% | 5.60% | 2.10% |
Stat Counter | February 2010 | 54.50% | 31.83% | 6.71% | 4.08% | 1.97% |
Mean† | February 2010 | 54.93% | 29.39% | 6.37% | 4.71% | 2.14% |
Median† | February 2010 | 54.50% | 31.83% | 6.71% | 4.45% | 2.10% |
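A minimal sketch of how the Mean and Median rows above can be recomputed from the three source rows. The figures are copied from the table; the names `BROWSERS`, `SOURCES` and `summarize` are illustrative and do not come from the discussion.

```python
from statistics import mean, median

# February 2010 figures copied from the table above (percent).
BROWSERS = ["Internet Explorer", "Firefox", "Chrome", "Safari", "Opera"]
SOURCES = {
    "Net Applications": [61.58, 24.23, 5.61, 4.45, 2.35],
    "W3Counter": [48.70, 32.10, 6.80, 5.60, 2.10],
    "Stat Counter": [54.50, 31.83, 6.71, 4.08, 1.97],
}

def summarize(sources):
    """Return {browser: (mean, median)} across all sources, rounded to 2 places."""
    columns = zip(*sources.values())  # one tuple of readings per browser
    return {
        browser: (round(mean(col), 2), round(median(col), 2))
        for browser, col in zip(BROWSERS, columns)
    }

summary = summarize(SOURCES)
for browser, (avg, med) in summary.items():
    print(f"{browser}: mean {avg:.2f}%, median {med:.2f}%")
```

With only three sources the median is always one of the reported values, which is why the alternate layout can mark the median entries in place instead of adding a separate row.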
Global summary table (alternate)
Source | Latest report | Internet Explorer | Firefox | Chrome | Safari | Opera |
---|---|---|---|---|---|---|
Net Applications | February 2010 | 61.58% | 24.23% | 5.61% | *4.45%* | 2.35% |
W3Counter | February 2010 | 48.70% | 32.10% | 6.80% | 5.60% | *2.10%* |
Stat Counter | February 2010 | *54.50%* | *31.83%* | *6.71%* | 4.08% | 1.97% |
Mean† | February 2010 | 54.93% | 29.39% | 6.37% | 4.71% | 2.14% |
- ^† The values in italics are median.
Though considering most are from Stat Counter, I wonder if including median is actually worthwhile... -- Limulus (talk) 22:17, 2 March 2010 (UTC)
- The current list of sources are all reliable. We have so few sources we can't afford one less. If you can replace it, then we can talk about that separately. Also the median should stay, I'd drop the mean. Jdm64 (talk) 05:27, 3 March 2010 (UTC)
- The mean doesn't seem to account for the probable different number of hits for each tracker, making it a little misleading as an indicator.
- Why would the mean need to take into account the different number of hits for each tracker? The results of each tracker are in percentages, not hits, so there's no need to weight the data. -- Schapel (talk) 11:05, 13 April 2010 (UTC)
- Agree with Jdm64. Wikiolap (talk) 06:39, 3 March 2010 (UTC)
- I am not concerned that StatOwl is 'unreliable', I am concerned that it has a severe US-centric bias and thus skews the overall numbers when talking about global stats. I am saying that it would make the numbers MORE accurate to remove StatOwl. -- Limulus (talk) 06:49, 3 March 2010 (UTC)
- The whole point of the summary table is to give the user as much reliable information as possible. Then it's up to them to make the final decision on what to believe. Also, the US is a major consumer of the internet and many who come to this English page are going to be from the US. Hence, having a US-centric source is justifiable so long as we state that fact. But more importantly, having more sources allows us to level out any bias in the data. This is clearly seen by the fact that the mean/median is closest to Stat Counter and NetApp even with the bias of StatOwl. Would you also propose we remove W3Counter because the demographics of their monitored sites are more tech savvy than the general population? I could find fault with all the sources if I tried hard enough. But then what? We'd have nothing to draw any summary from, especially a useful one, because the median would be coming from so few sources. Jdm64 (talk) 10:08, 3 March 2010 (UTC)
- I must agree with Limulus, there is no point of having incorrect data just so you can have more of it. StatOwl clearly does not provide accurate numbers and is in no way any help to anyone. Also StatOwl uses worldwide trends but is most commonly visited by US browsers, so it will not provide US or Worldwide information accurately. Having 1 less is better than having 1 more bad one. Zamadatix (talk) 15:29, 7 March 2010 (UTC)
- They're all "incorrect". There's not any way we're going to get a 100% unbiased sample. The best we can do is have multiple sources with differing biases. The differing biases tend to cancel each other out, and this canceling effect will increase with more samples. The best we can do is not add extremely biased sources, such as data collected from the server logs from one site (e.g. w3schools, mail.ru, wikipedia). -- Schapel (talk) 16:54, 7 March 2010 (UTC)
- Yep, that's basically what I was trying to say. Although, I would argue that server logs from one site would be valid if that site is extremely popular. Would you not add the server logs from Google's servers? They're the number one visited site on the Internet (alexa top 500) -- I'd say that's a good sampling. Wikipedia is around number 6, and I'd say that's also enough traffic to be unbiased. It is true that the other smaller sites that give their server logs out wouldn't be reliable. Jdm64 (talk) 23:29, 7 March 2010 (UTC)
- Considering that different browsers default to different search engines, I would imagine that search engine logs would be among the most biased of samples for browser usage. It's not the amount of traffic that's important. It's that the sample should be a representative sample. -- Schapel (talk) 00:21, 8 March 2010 (UTC)
- Microsoft sites would be biased. But Google having 86% of searches (net app) and nearly everyone doing searches, I'd say that's representative. A site would have a bias in the logs if the content is representative of the types of people that use a specific browser. For instance, Arstechnica and other tech sites are slanted towards firefox (48%+). Bing, Microsoft and other windows centric sites would have higher IE shares. My point is that sites that are more general in content or are sites that everyone visits would have less bias. Google, Wikipedia, Youtube, Facebook and Twitter are sites that the vast majority of people using every browser visit on a daily basis. Jdm64 (talk) 03:14, 8 March 2010 (UTC)
- Google and YouTube promote Chrome - so there is definitely bias. I wouldn't include Google stats. Facebook stats would be neutral (if only we could get its logs...) Wikiolap (talk) 05:45, 8 March 2010 (UTC)
- Google promotes Chrome. But if somebody installs and uses chrome then wouldn't that actually increase the usage share of chrome? This increase would be for every site that the user visits. So there is really no bias, but what we might see is that the new chrome convert only visits certain sites. This gets at a major flaw in the statistics. Server logs can only ever report 'access share' instead of real usage share. But we must use the access share to approximate the usage share. Jdm64 (talk) 06:11, 8 March 2010 (UTC)
- Then why would Microsoft statistics be biased towards IE then ? Once someone has IE - they can use it on any site, so Microsoft sites should not be biased towards IE. I actually don't care much whether or not we include Google/Microsoft stats (we don't have them anyway, so this is largely theoretical discussion), I just want to have same playing field - both have their own browsers, so if we think one is biased, then the other one is biased too. Wikiolap (talk) 17:01, 8 March 2010 (UTC)
- Content. Just as tech sites are biased toward Firefox, Microsoft's content is Windows/IE centered and the default browser is IE. Similarly Apple would have a bias for Safari because the content is about their products. Yes, Google makes a browser, but is the content of their site centered around their products that relate to OS or browser -- no. Similarly Linux sites would have a strong bias for Firefox because of the types of people that are drawn to the content of the site. All this to say that StatOwl shouldn't be removed. Jdm64 (talk) 19:22, 8 March 2010 (UTC)
Someone removed StatOwl from the summary table. I would like to see a consensus reached before we remove any entries from the table. -- Schapel (talk) 17:03, 4 June 2010 (UTC)
- I would like to see StatOwl removed from the table. The table is supposed to show a global representation of the usage of web browsers, but the US-centric StatOwl throws off the median and mean. StatOwl is not necessarily inaccurate, but doesn't have the global representation that the others have which produces these figures which are around 10% out.--Baina90 (talk) 21:10, 5 June 2010 (UTC)
I would either like StatOwl removed or an indication on the main chart that it does not represent a global summary but a US leaning (but not US specific) one. Overall I think the question of what the chart does represent should be clarified especially if as is currently the case it might be viewed as misleading. —Preceding unsigned comment added by 87.74.74.230 (talk) 21:51, 7 August 2010 (UTC)
- There is already an existing note about StatOwl: 92% of sites monitored by StatOwl serve predominantly United States market. Wikiolap (talk) 16:28, 8 August 2010 (UTC)
- Not on the chart just the summary table therefore making the chart misleading. —Preceding unsigned comment added by 87.74.74.22 (talk) 19:38, 15 September 2010 (UTC)
Wikimedia is not a web usage statistics site.
Why is Wikimedia included?
Those statistics are about Wikipedia only. 86.83.239.142 (talk) 17:34, 2 August 2010 (UTC)
- Wikimedia should be included because it is huge, and it helps to get an approximate number for browser usage. Juanjosepablos (talk) 19:28, 24 September 2010 (UTC)
- Wikimedia is indeed huge and thus relevant. Among the most visited Websites, Wikipedia has a unique audience: users looking for information and encyclopedic content. As such, Wikimedia is indeed very interesting. It should be noted that other Websites in the top 20 are search engines and social networks, thus a very different audience. Except Amazon, which is pretty interesting too. Dodoïste (talk) 23:46, 24 September 2010 (UTC)
Monthly stats more interesting
I'm wondering why stats that are reported quarterly are listed first, while other stats that are reported monthly are listed later. Aren't the monthly stats more interesting and therefore more noteworthy than quarterly stats? I suggest moving the quarterly stats to the bottom. Daniel.Cardenas (talk) 02:19, 2 October 2010 (UTC)
- The stats are listed in chronological order of first stat. The quarterly stats are more interesting, because month-to-month variations in measured browser usage have more noise, and you can see a period of time that is three times longer. We should convert the longer monthly tables to quarterly. -- Schapel (talk) 12:09, 2 October 2010 (UTC)
- What about previous years in quarterly format and recent data in monthly buckets? Daniel.Cardenas (talk) 18:52, 2 October 2010 (UTC)
- Why? The recent monthly variations are mostly noise, with the exception of IE usage falling and Chrome usage rising. With a smaller sample size and greater variation of weekdays (when IE usage is higher) vs. weekends (when alternative browser usage is higher), monthly data will have more of an error, and thus more noise. By reporting data three times as often, each monthly measurement represents one-third of the amount of change that a quarterly measurement does, and thus less signal. The signal-to-noise ratio of monthly reports is therefore much lower than quarterly reports. This is why for monthly stats for browsers other than IE and Chrome (which are the ones changing the most), whether the number for a particular browser is higher or lower than the previous month is approximately random. If someone wants the detailed monthly reporting, they can always go directly to the source of the data, which is a click away. -- Schapel (talk) 13:18, 3 October 2010 (UTC)
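The signal-to-noise argument above can be made concrete with a simplified model. This is a sketch under assumptions that are not stated in the comment itself: usage changes by a steady amount each month, and each monthly measurement carries independent noise of equal size. The function name `snr_gain` is hypothetical.

```python
import math

def snr_gain(months_per_bucket: int) -> float:
    """Factor by which signal-to-noise improves when readings are averaged
    into buckets of `months_per_bucket` months.

    Assumes a steady linear trend and independent, equal-variance noise
    in every monthly reading.
    """
    # The change between consecutive buckets grows linearly with bucket length.
    signal_gain = months_per_bucket
    # The standard error of a bucket average shrinks with the square root of n.
    noise_reduction = math.sqrt(months_per_bucket)
    return signal_gain * noise_reduction

print(round(snr_gain(3), 2))  # quarterly buckets compared with monthly readings
```

Under these assumptions a quarterly figure has roughly five times the signal-to-noise ratio of a monthly one. Real measurement noise may be correlated from month to month, in which case the gain would be smaller.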
- >with the exception of IE usage falling and Chrome usage rising.
- Isn't that the most interesting data? Yes, Firefox's share is mostly flat, so up and down ticks in monthly data don't mean much.
- >By reporting data three times as often, each monthly measurement represents one-third of the amount of change that a quarterly measurement does, and thus less signal.
- Do you agree there is signal in the I.E. and chrome monthly data?
- >If someone wants the detailed monthly reporting, they can always go directly to the source of the data, which is a click away.
- Or we can be good to our Wikipedia readers and display the more recent data that our readers are interested in.
- Daniel.Cardenas (talk) 16:08, 3 October 2010 (UTC)
- I'm still not understanding what is "interesting" about the monthly data. We all know Chrome use is rising and IE use is falling. We don't need to see monthly updates to know that. If you do want monthly stats for NetApplications or TheCounter, it's a click-thru to monthly stats and other kinds of data. If you want daily stats, you can click through to GetClicky data. It will be the same after W3Counter and StatCounter are converted to quarterly. -- Schapel (talk) 12:41, 4 October 2010 (UTC)
- Well, let's find out who "we" is. Could other people chime in and indicate if they like the monthly stats? Thx, Daniel.Cardenas (talk) 16:27, 4 October 2010 (UTC)
- I prefer keeping quarterly stats. I would even prefer to have yearly stats, because this is an encyclopedia, not a news source, and the information we put in should remain relevant for years to come, when monthly or even quarterly variations will not be very important. Wikiolap (talk) 18:14, 4 October 2010 (UTC)
- I think, while there are people here willing to do the updates, the monthly updated table with medians is a very valuable resource here. I don't see how that could be maintained monthly if the figures were all blurred into quarters. What is the problem that we are trying to solve here? Is it too much work to update the tables monthly? It's not noise, as there is clear 'signal' in all the monthly figures (try looking at yearly climate figures to see what noise looks like :-). Are the history tables too long? They could all be shortened to show a rolling year, in my opinion; the page archives have the full history. Surely it's not that IE is about to go below 50% in our medians, and some people want to delay that moment for a couple of months? I don't believe that could be true, so what are we trying to fix? --Nigelj (talk) 18:35, 4 October 2010 (UTC)
- It's not really a matter of trying to fix something. It's a question of which way of doing things is better. Quarterly information seems to have the advantage: it's less work, it makes it easier to spot trends, it makes the tables smaller so we can see a longer period of time. As far as I have seen, the only advantage to monthly updates is that they are "interesting" and "valuable" in the opinions of some. In what specific way are they "interesting" and "valuable"? -- Schapel (talk) 18:48, 4 October 2010 (UTC)
- It's an encyclopedia: "interesting" and "valuable" is what we do. --Nigelj (talk) 18:56, 4 October 2010 (UTC)
- If it's truly valuable, you should be able to come up with something concrete it is valuable for. I can't think of anything. Can anyone else? Monthly data is valuable for the practical purpose of ____________. -- Schapel (talk) 15:03, 5 October 2010 (UTC)
- ...of following the story. By your reasoning, there would be no need to publish individual football scores, let alone attend the games. Just publish quarterly aggregates, or end of season league tables. --Nigelj (talk) 15:12, 5 October 2010 (UTC)
- No, this is absolutely incorrect, and shows that you do not understand my reasoning at all. There is no measurement error in football. If one team beat another team 28-27, we are absolutely sure about which team won -- the one that scored 28 points. In usage share, there are measurement errors. This means we cannot reason in this absolute way about the numbers presented. What we want to do to be able to reason about browser trends is to reduce the amount of noise in the data. This can be done by averaging over longer time periods, and by averaging stats collected from a variety of sources. This is what we have been doing in this article for years. -- Schapel (talk) 15:30, 5 October 2010 (UTC)
- >it's less work
- People are interested in doing the work, so I don't see this as a value.
- >it makes it easier to spot trends, it makes the tables smaller so we can see a longer period of time.
- The proposal is to use quarterly buckets for older data. Or if people prefer use yearly buckets for older data.
- Daniel.Cardenas (talk) 19:08, 4 October 2010 (UTC)
- I end up cleaning up many of the edits that others do, so I'm interested in a way of doing things that is less work. I think changing all the usage stats throughout Wikipedia to use quarterly stats would lead to far less work, and it would also more easily show important trends, instead of emphasizing random fluctuations in measurements. -- Schapel (talk) 15:03, 5 October 2010 (UTC)
- If you're finding it a slog, just try not doing it. There are thousands of WP editors, and quite a few of us here, watching this page. You'll see - it'll still get done. --Nigelj (talk) 15:18, 5 October 2010 (UTC)
- I've tried this before, and I find it means I have to come in and fix things later. A few months ago I had to fix several months of Clicky data that was wrong. Sorry, been there, done that. -- Schapel (talk) 15:30, 5 October 2010 (UTC)
- Can we get back to which is more useful for the reader? "Are the history tables too long?" Yes, as it is right now the page is WAY too long. "Readers may tire of reading a page much longer than about 30 to 50 KB" (here) and "Long and sprawling lists of statistics may be confusing to readers and reduce the readability and neatness of our articles" (here). ~a (user • talk • contribs) 16:14, 5 October 2010 (UTC)
- It's an encyclopedia: "interesting" and "valuable" is what we do. --Nigelj (talk) 18:56, 4 October 2010 (UTC)
- It's not really a matter of trying to fix something. It's a question of which way of doing things is better. Quarterly information seems to have the advantage: it's less work, it makes it easier to spot trends, it makes the tables smaller so we can see a longer period of time. As far as I have seen, the only advantage to monthly updates is that they are "interesting" and "valuable" in the opinions of some. In what specific way are they "interesting" and "valuable"? -- Schapel (talk) 18:48, 4 October 2010 (UTC)
- What about previous years in quarterly format and recent data in monthly buckets? Daniel.Cardenas (talk) 18:52, 2 October 2010 (UTC)
How about:
- deleting the Mozilla, Gecko, Netscape, and other columns that are always close to zero in current data. Don't delete them in older data where they have a significant percentage.
- 4 or fewer monthly buckets
- 2 years of quarterly data
- 2007 and earlier to be yearly data?
Daniel.Cardenas (talk) 18:45, 5 October 2010 (UTC)
- Let's make it simple and simply convert the tables in Historical usage share to quarterly. There's very little information in the recent monthly measurements (they can be approximately predicted from previous data, and suffer from random fluctuations), and it would be even more work than it currently is for editors, because the rows would need to be converted from monthly data to quarterly data at some point. They would also be hard to read if different rows showed different periods of time.
- I find it odd that you're now suggesting that some data be deleted outright (!) from the tables. Might not some readers find Mozilla, Gecko, and Netscape use interesting? My take on it is that if there's a column and row for the data in the tables, and there's a piece of data that goes there, it should be placed there. Currently, missing data in the table means the data really is missing. It would be confusing if it could mean that either it's missing or it meets some arbitrary criterion of "insignificant". -- Schapel (talk) 20:52, 5 October 2010 (UTC)
- I agree with Daniel Cardenas. I don't understand why Net Applications, which is reported quarterly, is listed first, while other stats that are reported monthly are listed later. I think monthly statistics are significant, and I would prefer to move quarterly stats out of the summary table. -- 151.82.203.193 (talk) 21:35, 5 October 2010 (UTC)
- Sorry, now I'm logged in. --Sandro kensan (talk) 21:38, 5 October 2010 (UTC)
- What exactly is significant about them? I can't understand why some think they're important. No one will explain. Can you give a specific example of significant information that would be lost if we converted to a quarterly format? And why that information is significant? -- Schapel (talk) 12:35, 6 October 2010 (UTC)
- >Might not some readers find Mozilla, Gecko, and Netscape use interesting?
- They would find it interesting when it had significant share but if it is close to zero how interesting is that? Interesting in the older tables when share is greater than half a percent. Suggest delete Netscape column in statowl and statcounter tables. Are you saying you prefer those columns not be deleted? Daniel.Cardenas (talk) 01:48, 6 October 2010 (UTC)
- I don't think we should delete data just because you personally find it uninteresting. There should be a good, objective reason. -- Schapel (talk) 12:35, 6 October 2010 (UTC)
- Because zero percent is not very notable. But if people find it interesting then it should be left. Daniel.Cardenas (talk) 19:44, 6 October 2010 (UTC)
If the raw data tables are too long, there is probably something at Help:Collapsing#Collapsing tables by default that would allow us to hide all but the last year's data, for example, with a show link for anyone who's interested in it. This way nothing is deleted, and nothing needs to be recalculated into quarterly or annual summaries - Just move a row into the collapsed section of each table each time a new one is added to the live section. I don't think that the width of the data tables is a serious problem at the moment. --Nigelj (talk) 15:14, 6 October 2010 (UTC)
- This collapsing approach does not have several of the benefits of aggregating the data into quarters. It would not allow one to see the data over a period of many years at once on a normal monitor, and it would not smooth out the noise from the monthly fluctuations. Aggregating by quarters emphasizes the important trends; it doesn't just make the tables shorter. -- Schapel (talk) 16:52, 7 October 2010 (UTC)
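For illustration only, the quarterly aggregation discussed in this thread could be sketched roughly as below. The monthly figures are made up, not taken from any stats provider, and the names `quarter_of` and `quarterly_average` are hypothetical. It also assumes a plain unweighted mean of the monthly shares, which may differ from how a provider computes its own quarterly numbers.

```python
from statistics import mean

# Hypothetical monthly usage-share figures (percent) for one browser.
# These values are invented for illustration and are not real data.
monthly = {
    "2010-01": 24.0, "2010-02": 24.3, "2010-03": 24.6,
    "2010-04": 24.9, "2010-05": 24.0, "2010-06": 24.6,
}

def quarter_of(month_key):
    """Map a 'YYYY-MM' key to a 'YYYY-Qn' quarter label."""
    year, month = month_key.split("-")
    return f"{year}-Q{(int(month) - 1) // 3 + 1}"

def quarterly_average(monthly_shares):
    """Group monthly shares by quarter and take the unweighted mean of each group."""
    buckets = {}
    for key, share in monthly_shares.items():
        buckets.setdefault(quarter_of(key), []).append(share)
    return {quarter: round(mean(values), 2) for quarter, values in buckets.items()}

quarterly = quarterly_average(monthly)
for quarter, share in sorted(quarterly.items()):
    print(quarter, share)
```

Each quarterly row replaces three monthly rows, so month-to-month swings are averaged out and the table gets shorter. The monthly detail is no longer visible, which is the trade-off being argued over above.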