
Talk:Load (computing)

From Wikipedia, the free encyclopedia

Waiting, states, calculations


Is there a contradiction in this article?

In the "Calculating Load Average" section, it states:

 each process that is using the CPU or waiting for disk activity
 adds 1 to the load number

In the "Important things to note" section, it states:

 processes which are awaiting I/O, "sleeping" or "blocked" don't
 contribute to load average

Doesn't "waiting for disk activity" mean that a process is blocked waiting for disk I/O?

In general, it would be very helpful if this article explained in which "states" a process contributes to the load number.

Beric 21:45, 22 February 2006 (UTC)[reply]

Optimum load average should be the same as the number of processors (or, in the case of hyperthreading, virtual processors), i.e. 2 for a dual-processor system, 4 for a dual-processor system with hyperthreading, etc. —Preceding unsigned comment added by Mchicago (talkcontribs) 19:08, 27 February 2006

A clearer, more detailed, and authoritative explanation of waiting, states, and calculation (including analysis of the operating system code) can be found in the reference listed under External Links (http://www.teamquest.com/resources/gunther/display/5/index.htm). Anyone contemplating changes to this section (I lack the necessary detailed knowledge) should take a close look at that reference. 71.126.183.125 16:18, 6 February 2007 (UTC)[reply]

Confusion here seems to stem from the fact that Linux calculates load average differently from most UNIX systems: on most systems, only running or runnable threads count towards the load, but Linux also includes threads in certain types of sleep (e.g., uninterruptible disk I/O). This difference becomes most significant during storage system stalls, such as NFS server failure, where the load average on Linux will spike because many threads will be blocked in NFS I/O indefinitely. I've updated the article a bit to reflect this distinction, but I know of no other UNIX system that takes this approach.

It is my firmly held belief that most Unix systems (any that I've come across, certainly) behave in the way which is described to be specific to Linux, in that processes in short-term wait, such as waiting for disk I/O to complete, will contribute to the calculated load on the machine. This is certainly true for NetBSD, almost certainly true for all the relatives (the other BSD flavors) given the common heritage. I'm also pretty sure that both Solaris and HP-UX also behave in this manner, i.e. include processes in short-term uninterruptible wait as contributing to the load on the machine. —Preceding unsigned comment added by 158.38.62.77 (talk) 11:15, 26 November 2009 (UTC)[reply]

212.59.34.129 (talk) 08:37, 19 November 2009 (UTC) Harald: I'd like to know how multi-core CPUs fit into the following sentence:[reply]

 In a system with four CPUs, a load average of 3.73 would indicate that there were, on average, 3.73 processes ready to run, and each one could be scheduled into a CPU.

I'd assume that a dual-CPU, dual-core system counts as being able to run 4 processes in parallel.

Instantaneous percentage of CPU utilization on Windows?!?


The author states that "On Microsoft Windows PC systems, the system load is given as an instantaneous percentage of CPU utilization." That's not possible; there is no CPU "speedometer" on any processor chipset that I know of. An operating system can only recognize two basic states: idle (0% utilization) or not idle (100% utilization). To report a utilization between 0% and 100%, it must average the two basic states over some time period. Most operating systems do this by sampling the current state during each clock interrupt and incrementing one of the two state counters (most operating systems also have sub-counters under Not Idle, such as Wait I/O in UNIX or processing nested interrupts). The reported utilization is then the change in the busy counter over some sampling interval divided by the total samples during that interval. As for what time interval Windows utilities such as perfmon and the Logs & Alerts Service use, I have no clue. —Preceding unsigned comment added by 171.159.64.10 (talkcontribs) 13:01, 9 June 2006

So you're just complaining about how it's worded?--Bobby D. DS. 23:17, 29 October 2006 (UTC)[reply]

Nope. He was pointing out that just calculating the CPU load average could be meaningless on those systems that change CPU frequency on the fly. If, for example, you see 50%, it's 50% of what? (Of the maximum CPU capacity at that particular frequency.) It's like measuring speed from an erratically moving car. If that is to be fixed, it would have to somehow poll the frequency that the CPU is running at. —Preceding unsigned comment added by 70.54.202.152 (talkcontribs) 13:03, 1 November 2006

I suspect part of the problem is that the term "load average" is very well defined, but the term "load" is vague at best. This article is currently mostly about the Unix load average, and not about "Load (computing)". Whether we should change this article to be more general, or simply rename it to Load average and create a new Load (computing) stub with the rest of the content, I dunno. --DragonHawk 02:06, 4 January 2007 (UTC)[reply]

Windows systems (home versions) don't have a load or system load average; the Task Manager and the system performance monitor poll the CPU at a set interval (user-defined or system default) for changes and report that. It defines CPU usage, not quite system load. Miaviator278

Processes or jobs?


The article talks about "each process that is using CPU". I think it's more common to talk about "each job", which isn't necessarily synonymous. For example, a process with two threads can count as two jobs. —Preceding unsigned comment added by 86.17.154.122 (talkcontribs) 04:30, 31 July 2006

I think the problem is that each UNIXoid operating system will use different terms. For example, on Linux your jobs are called "tasks".—Preceding unsigned comment added by 62.31.146.25 (talkcontribs) 11:42, 27 August 2006


I don't know about you guys, but wherever I have found the word "job" before (related to something to be executed by a computer, that is) it referred to a batch file containing some non-interactive work to be done in a certain sequence (serial or parallel or both). A job usually comprises several processes and has nothing to do with the load average we are talking about here. —Preceding unsigned comment added by 70.54.202.152 (talkcontribs) 13:07, 1 November 2006

For unix, there are specific meanings to the terms "job" and "process". A job is in the job table, a process is in the process table. See `jobs` for a list of jobs for the current terminal, and `ps` for a list of processes for the current system. Don't be confused by cron terminology; jobs aren't processes. --87.194.236.208 (talk) 15:20, 4 May 2008 (UTC)[reply]

Load Average on non-*nix systems


Load Average is also used in the Tenex operating system developed at BBN and its descendant, Digital Equipment Corporation's TOPS20. It was not originally used in DEC's VMS operating system for VAX- and AXP-based systems, but a drop-in device driver was made available to the computing community, and just about every VMS system connected to the Internet has it installed. It is often used on VMS systems to compute a threshold at which ANONYMOUS FTP service will not be allowed. It is also used in VMS Clusters to shunt a cluster-wide service to the least loaded system. The VMS Load Average driver actually computes the load for several system variables; I'd have to go back and look at the code or the documentation to find out what those variables are.

I also remember seeing the Load Average used on TOPS20 before seeing it on BSD Unix.

-HWM —Preceding unsigned comment added by TenexHacker (talkcontribs) 03:38, 3 September 2006

Error in example of load average?

For example a load average of "3.73 7.98 0.50" on a single CPU system can be interpreted as...

the CPU was overloaded by 373%

Shouldn't it be overloaded by 273%?

To me, a load of 2.00 would mean that it's overloaded by 100%, not 200%. —Preceding unsigned comment added by Vmardian (talkcontribs) 13:25, 21 November 2006

The author did OK until they said "For example a load average of "3.73 7.98 0.50" on a single CPU system can be interpreted as:
the CPU was overloaded by 373% (needed to do 373% as much work as it can do in a minute) during the last minute."
This is NOT correct (which is why I deleted that section previously, but it reappeared).
Yes, completely totally wrong. Wrong wrong wrong wrong. That is not what load average means at all. 96.42.113.114 (talk) 20:32, 21 August 2012 (UTC)[reply]
As the author correctly stated earlier, load is how many processes are waiting to run (plus the one actually running) on the system, e.g. if there are 9 waiting and 1 running the load will be 10.
As only one process can run at a time on a single-core processor (obviously multi-processor or multi-core will do better), the others have to wait, dependent on a multitude of factors from what the process is to the nice (priority) value, etc.
But let's take a simple example, ignoring the overhead as the CPU switches tasks.
You have 10 small, identical programs running on a single-core CPU. They only take 5% CPU when they run as they are not CPU intensive, but they are designed to run for 1 second. Only 1 can run at a time (it's a single CPU, single core, remember) - so what is the CPU usage (%) and the load?
CPU utilization is 5% because only one can run at a time but the load is 10.
Don't forget multi-tasking is an illusion of programs running concurrently. They only get a share and are switched to other waiting tasks.
I was going to suggest trying a low-load device like a sound player, but when I started the second xmms playing and played a mp3, the load dropped from 0.41 to 0.0 on my amd64 4000+ notebook running pclinuxos 2.6.16.27.tex1 #1 Thu Aug 10 20:13:42 CDT 2006 i686 Mobile AMD Athlon(tm) 64 Processor 4000+ unknown GNU/Linux
So it looks like top or something else is broken. XMMS playing an MP3 takes 1.5% CPU, but as there are 143 tasks with X and KDE running, something is seriously wrong.
Anyway, I hope you get the idea.
15:38, 13 February 2007 (UTC) —marrandy
Your example is flawed. Running processes do not use only a percentage of the CPU. The running process is always using 100%. When we say that a process takes 5% of the CPU we mean that, over a defined period, it is running for 5% of the time. Each of your 1 second long processes with 5% utilisation is in reality running for 1/20 of a second and waiting for something (e.g. disk I/O or sleeping) for 19/20 of a second.
In other words, it's possible to have a Load Average below 1.0 and have 100% CPU usage (single process using all CPU with no other processes waiting) but it's not possible to have a Load Average above 1.0 with less than 100% CPU usage (doesn't make sense to have lots of processes blocked with idle CPU). At least, that's how I read this. —Preceding unsigned comment added by 66.92.218.124 (talk) 20:21, 18 October 2007 (UTC)[reply]
Wrong. So very wrong. Processes waiting on IO, which are thus using ZERO CPU time, add to the load average. Thus it is entirely possible to have a high load average with less than 100% CPU usage. This is an indication that you are IO bound, NOT CPU bound. This is why blindly upgrading your CPU just because load average is high is the wrong thing to do. This is why load average is meaningless on its own, and mostly meaningless even with other context. It tells you nothing you can't more easily figure out just by looking at CPU usage in top and the blinking lights of your hard drive. 96.42.113.114 (talk) 20:32, 21 August 2012 (UTC)[reply]
seems to give a much clearer idea of the concept of load and load average. I quote an important point:

"It is important to remember that a CPU is a discrete state machine. It really can be at only 100%, executing an instruction, or at 0%, waiting for something to do. There is no such thing as using 45% of a CPU. The CPU percentage is a function of time." Hackeye (talk) 05:03, 24 November 2007 (UTC) hackeye[reply]

The profile of the load depends on when those processes want their 1/20th of a second slot. If they all do something for 1/20th of a second then sleep for 19/20ths, and you start them at exactly the same time, the load will start at 10 (1 running, 9 waiting for the CPU), drop to 9 after 1/20th of a second (1 running, 8 waiting, 1 sleeping), then 8 and so on until after 0.5 seconds all the processes are sleeping. In this case, the average of the load over the whole second is (10+9+...+1)/20 = 2.75. If each process is designed to run in a different 1/20th of a second, e.g. the first process runs straight away, the second waits 1/20th of a second then runs, the third waits 2/20ths of a second then runs, etc., the load will be 1 while there is a process running and 0 while there isn't. Over the second, the average load will be 0.5 in this case.
This is why your media player example appears to be broken. Linux counts a process waiting for disk I/O towards the load average even though, if a processor became available, the process would not be able to run. I imagine that one media player spends a lot of its time waiting for the disk to deliver blocks from the media file. A second media player probably introduces contention somewhere else where the waiting processes would not be counted towards the load. Either that, or the media files used in the second test were still in the disk cache from the first test, thus massively reducing the required IO wait time.
Jeremypnet 13:09, 12 March 2007 (UTC)[reply]


Why does a load of 7.9 over 5 minutes mean half the time the processor is in use? That could make some sense for 15 minutes. It's clear this example needs revising; I'll see if I can find a new example online.
theonhighgod 16/03/07
The article itself says that the load is calculated as an exponentially damped/weighted moving average. As I understand it, that means a load of 5 in the last minute counts as more significant than a load of 5 in the minute before that. However, the example "A load of 3.73 means that during the last minute, the CPU was overloaded by 273%" assumes an arithmetic average and hence is flawed.
PS: I still admit that the example, even if not absolutely correct, might help to give a new reader a good idea of this complex matter :D
212.55.216.242 16:11, 15 August 2007 (UTC)[reply]


I totally agree on the example being wrong. What if you have 1 highest-priority job, which will need 1 minute of CPU time, and 3 lowest-priority jobs, which need 5 seconds of CPU time each? What if the lowest-priority jobs are only scheduled when the highest-priority job is either blocked, sleeping or done?
Then you would have a load average of 4 for the first minute. Remember, the total CPU time is 1 minute 15 seconds, so you'd only need a 25% faster CPU to do all the work in one minute. The example in the article is undoubtedly wrong and misleading.
I'd say you'd be able to calculate the percentage of CPU load if you knew which jobs ran in a given period and the CPU time each consumed, giving a total CPU time consumed over that period: CPU time consumed / time passed = percentage of CPU load. But I guess that's just not directly available on a Unix system; /proc/stat only lists cumulative times since system start. So there should be no talk of a percentage in this part.

—Preceding unsigned comment added by 147.88.200.112 (talk) 07:22, 10 June 2008 (UTC)[reply]

Unix or windows or both


This page needs to decide if it is about unix (& clones) load averages, or about the concept of computer load. There isn't currently a page about unix load averages, and most of the unix related stuff here is reasonable enough to start one. The windows section needs to be expanded or explained. The "Other Meanings" section would probably be better shifted to their respective pages, or made into separate stub pages. --CalPaterson (talk) 15:29, 4 May 2008 (UTC)[reply]

The windows section needs to be deleted, really. Windows has no use for load averages. 89.240.240.107 (talk) 13:19, 12 May 2008 (UTC)[reply]

"Load average is not CPU utilization" Section


I just removed a section titled "Load average is not CPU utilization" added by Sp0 a few months ago. The text of the section was:

Even though the statements in the previous section might suggest that load average is related to CPU utilization because the section relates CPU to load average, load average does not measure CPU utilization of processes. One reason it does not do this is because load averages computations of processes are in a wrong order to relate to trend information of CPU utilization. In other words, the calculations and numbers directly produced by load averages do not compute numbers in an order from more CPU intensive to less CPU intensive or vice versa, nor do they give computations of numbers that would give another way that would result in direct information about CPU utilization. In summary, the functions of load average give numbers based on load queue of processes. The next section uses the same reference to suggest that load average is not very important or vital to system performance information until or unless the a system's CPU is heavily loaded to around 100%. Then, at levels close to 100%, load average could be very important or significant to determinacy of system performance; however, this would be because average load numbers give direct information about process queue length not CPU utilization -- which is something they do not give direct information about.

Without making any comment on the accuracy of this information, there are two severe problems with this: First, it is almost completely incomprehensible due to grammatical errors and a generally rambling style. Second, the section existed to contradict the rest of the article. Sp0, if you believe the article is factually inaccurate, the correct response is to edit or replace the existing text to make it accurate, and preferably also to include citations of reliable sources to prevent an edit war. -- Tyler (talk) 18:24, 8 January 2010 (UTC)[reply]

Misleading and wrong...


This article has so many sections that are misleading or wrong that I don't even know where to start... Almost want to delete much of it, yet too lazy to rewrite...

  1. Adding to the note by the previous user, "Load average is not CPU utilization" is correct. Load average is not necessarily CPU utilization. If the system was hanging on disk all the time, and the CPU was hitting high iowait, it's not exactly "utilizing" the CPU.
  2. Statements like this are just wrong: "(no processes had to wait for a turn)". We already know processes waited during the one-minute average; just because we change averaging windows doesn't magically change history. An average does not define minimums and maximums.
  3. To use the word "overloaded" and "underloaded" is correct but also highly misleading. Your system may not be conceptually underloaded at all even if the system has under 1.00 load average. Loads spike and drop, once again, average does not define minimums and maximums. Whether it's underloaded or not depends on what is running and how it's affecting the system. To give a specific example, let's say we have a voice communication program and every time you say something, it uses more CPU. But your CPU can't handle the increased load and thus sends stuttering messages. So, the communication program overloads your CPU. But you don't talk 100% of the time, so, it may be underloaded most of the time. Then to say your CPU is underloaded but your CPU couldn't handle the communication program is highly confusing.
  4. "This means that this CPU could have handled all of the work scheduled for the last minute if it were 1.73 times as fast" is also untrue. Load average, once again, is not purely dependent on CPU. That implies that load average only depends on CPU.

---Grumps (talk) 06:50, 15 October 2012 (UTC)[reply]

Confusion on Hz definition


"On Linux systems, the load-average is not calculated on each clock tick, but driven by a variable value that is based on the Hz frequency setting and tested on each clock tick. (Hz variable is the pulse rate of particular Linux kernel activity. 1Hz is equal to one clock tick; 10ms by default.) Although the Hz value can be configured in some versions of the kernel, it is normally set to 100. The calculation code uses the Hz value to determine the CPU Load calculation frequency. Specifically, the timer.c::calc_load() function will run the algorithm every 5 * Hz, or roughly every five seconds. Following is that function in its entirety:"

The above paragraph in the article is very confusing. I think it is confusing Hz with the Linux kernel interrupt timer.

>> 1Hz is equal to one clock tick; 10ms by default.

If the definition of hertz is "cycles per second" as defined by the Hertz article on Wikipedia, then how can 1 Hz be 10ms? 1 Hz would be 1 cycle per second, which would mean that the clock ticks once per second. I think the 10ms is actually referring to the Linux kernel interrupt timer which is normally set to 100, or roughly every 10 ms. The kernel interrupt will fire 100 times a second.

>> Although the Hz value can be configured in some versions of the kernel, it is normally set to 100.

Huh? I thought it just said that 1 Hz is equal to one clock tick? I guess what it is referring to is the Linux timer interrupt frequency again.

>> Specifically, the timer.c::calc_load() function will run the algorithm every 5 * Hz, or roughly every five seconds.

That makes sense if the Hz value is 1 (one cycle per second). — Preceding unsigned comment added by 69.84.133.248 (talk) 15:24, 25 April 2013 (UTC)[reply]


Notes


There is currently one note regarding an error in a paper. This should probably be reported to the author of that paper instead of sitting on a wiki page. What's also missing is some reasoning behind the error that was flagged.

It would be nice if the wiki page recommended a correct reference for a complete description of the concept. — Preceding unsigned comment added by 86.124.79.201 (talk) 07:26, 14 March 2015 (UTC)[reply]

There is no error.

194.166.103.157 (talk) 21:41, 17 March 2015 (UTC)[reply]

definition is ridiculous


The term "system load" (the title of this article) cannot be defined using the term itself.

In UNIX computing, the system load is a measure of the amount of computational work that a computer system performs. The load average represents the average system load over a period of time. It conventionally appears in the form of three numbers which represent the system load during the last one-, five-, and fifteen-minute periods.

This is an unacceptable definition.

I am capable of providing a substantial rewrite to this article and creating one that is much more useful in explaining how this works on Unix systems (and probably the article should be retitled, as 'system load' is a dramatically different term on Unix (& Unix-like) systems than it is on mainframe and Windows systems, as well as realtime systems like embedded ones). I note also that this article has "needed references" since 2010.

As somebody who has been an editor of this fine encyclopedia since 2004, I am hesitant to even bother making these changes as I am aware the piranhas will attack and say "well you cannot just go and edit this thing without sources." I have been a Unix engineer for twenty years and I am a source. You are unlikely to find sources to corroborate things in this article as Unix is a moving target. As kernels in Linux have progressed from 2.2 to 2.4 to 2.6 and the 3 kernels, system load has changed dramatically and how we measure it has changed.

I therefore also feel that the tag demanding this article be better referenced is preposterous. The tag should be removed, and people who have their dander up about sources on this article should perhaps be asked to undander themselves. ... jane avriette:talk 19:15, 28 October 2015 (UTC)[reply]

I am a retired editor, but I have made a pledge to provide requests for comments when asked. Unfortunately the point about references cannot be bypassed. However, taggers generally do not understand each topic and think that we need 1000 references; the Wikipedia policy on references does not say we need a lot, but anything written must be backed up by a reliable third-party source. You are quite entitled to rewrite it using your knowledge, but you would need to provide a reference that backs up what you're saying. In this day and age finding references is not as hard as it was 10 years ago; granted, this topic will have fewer references than, say, a well-published subject, but there are references. My suggestion is to make a sub test page of the article, write up your changes as you think the page should be, and then request comment on that. As long as it complies with the principal standards of Wikipedia it cannot be objected to. You will find there are some editors who will quote obscure policies to block you, but these policies are all derived from the core policies, which I can quote later if there are issues. Good luck on the rewrite, as it does need one, as do a lot of other articles. Also, if you think the subject matter, i.e. the title, is not right, then suggest a new one and we can have it moved if supported. Andrewcrawford (talk - contrib) 17:24, 29 October 2015 (UTC)[reply]
these "obscure policies" are why i stopped editing. ... jane avriette:talk 20:18, 18 November 2015 (UTC)[reply]