Talk:Call stack/Archive 1
This is an archive of past discussions about Call stack. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page.
Moving to 'call stack'
I plan on moving this article to give it the title 'call stack' soon. This is the name by which the stack is usually known, as is made clear by, for instance, this data on Google hits for various exact phrases "X stack":
Phrase | Hits | Comments |
---|---|---|
"function stack" | 7,620 | |
"call stack" | 170,000 | |
"execution stack" | 28,000 | |
"control stack" | 16,000 | |
"call return stack" | 551 | |
"return stack" | 23,200 | (includes "call/return stack") |
"return address stack" | 5,280 | |
"address stack" | 8,700 | (includes return address stack) |
"procedure stack" | 954 | |
"data stack" | 26,600 | |
"parameter stack" | 5,260 | |
"frame stack" | 7,020 | |
"runtime stack" | 11,800 | (usually like "runtime stack overflow") |
This should be done with a move, not a copy-paste, to preserve the edit history. (that may require help via WP:RM, or maybe not). There'll be a few fixups to do too. -R. S. Shaw 22:11, 16 February 2006 (UTC)
- Sounds fine. Anything that makes it clear that the stack's primary purpose is tracking function returns, and not necessarily params/locals, is fine by me. --Mgreenbe 00:48, 17 February 2006 (UTC)
History
It would be nice to see some material about the history of these concepts. For instance, who came up with the concept (and the name) "activation record". I presume it must have appeared in the first Algol 60 implementation? Tsferreira 17:59, 13 March 2006 (UTC)
Stack unwinding
The sentence mentioning "stack unwinding" may accurately summarise the stub article, but it is not right. Stack unwinding is usually performed in the context of an exception, where control is transferred up the stack. The tricky part is making sure that you destruct and/or deallocate the appropriate objects. For example, see [1] or [2], just the first two hits for a google search on "stack unwinding". There are other contexts, e.g. stack unwinding for C code. So I think that this sentence needs to be removed now, and replaced soon with a proper treatment of "stack unwinding", which I'm sorry to say probably belongs in its own article. --Mike Van Emmerik 11:24, 21 January 2006 (UTC)
- Agreed, and the sentence in the section on winding ("The Forth programming language allows explicit winding of the call stack (called there the 'return stack')") is completely wrong. Forth doesn't work that way; there's no frame to wind or unwind. Alex 21:49, 28 March 2006 (UTC)
Return processing
In the recently added section "Return processing", the last sentence says essentially that there is typically no cleanup required by the caller. This is true for RISC processors. For CISC processors using PUSH and POP instructions, it is also true for the callee-pop calling convention, but not for the caller-pop convention. Or of course it is trivially true if there happen to be no arguments pushed, or the arguments are passed in registers (but when there are enough parameters, some arguments are passed on the stack anyway).
In other words, there is a large set of programs (CISC architecture, nonzero arguments, and caller-pop convention as used by C) for which this statement is misleading. I can't immediately think of a change that is compact but not misleading. --Mike Van Emmerik 00:42, 9 April 2006 (UTC)
- I've revised it to cover this variation. -R. S. Shaw 21:17, 10 April 2006 (UTC)
Processor Support
This article should include a discussion of built-in processor support for stack pointers and/or call frames, e.g. which chip architecture introduced this concept, how it works, etc. —The preceding unsigned comment was added by Tzadikv (talk • contribs) 14:57, 10 May 2007 (UTC).
Hardware stack
Shouldn't this be merged with the Hardware stack discussion in Stack? —Preceding unsigned comment added by 99.235.205.120 (talk) 21:21, 21 February 2008 (UTC)
Stuff to merge in
If anyone wants to merge, here is a (low quality) article I wrote for our company wiki, which explains how a function stack is managed in detail. It's not yet encyclopedic quality.
The stack of a process is a data structure that manages several things:
- Storage for lexical variable containers
- Control flow management, especially BSR-type (branch-to-subroutine) calls
- Parameter passing
- Function return value passing
Stack Structure
The stack structure is very simple:
- Preallocated memory region
- Base pointer
- Stack pointer
The base pointer always stays the same.
It's a LIFO structure: when more data is added to the stack, the stack pointer increases to mark its end.
When data is removed, the stack pointer decreases.
All data is stored in preallocated space.
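The discipline described above can be sketched in C. The names here (`region`, `base`, `sp`, `push`, `pop`) are illustrative only; a real call stack is managed by the compiler and CPU, not by user code like this:

```c
#include <assert.h>

/* A minimal sketch of the structure described above: a preallocated
 * region, a base pointer that never changes, and a stack pointer
 * that marks the current end of the stack. */
enum { STACK_SIZE = 1024 };
static int region[STACK_SIZE];  /* preallocated memory region */
static int *base = region;      /* base pointer: always stays the same */
static int *sp = region;        /* stack pointer: marks the end */

static void push(int value) { *sp++ = value; }  /* add data: pointer grows */
static int pop(void)        { return *--sp; }   /* remove data: pointer shrinks */
```

Pushing 1, 2, 3 and then popping yields 3, 2, 1: last in, first out, with the base pointer untouched throughout.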
Scopes in the stack
In order to understand the way the stack is used for running high-level languages, two fundamental scoping principles must be understood.
The Lexical Scope
Lexical scoping has to do with the place of definition - what blocks are nested in what other blocks.
It is namespace oriented.
The Dynamic Scope
Dynamic scoping has to do with the call chain - which function called which other function.
An example
int fib (int n) {
    if (n < 2) {
        return n;
    }
    return fib(n-1) + fib(n-2);
}

void foo (int n) {
    int f = fib(n);
    printf("fib(%d) = %d", n, f);
}
The lexical scopes in this example are illustrated by the fact that each function has its own lexical variable called n.
The dynamic scope is illustrated by the fact that foo calls fib, and fib calls itself.
The stack for these calls is constructed as follows:
- foo is called with a number:
- Space on the stack is allocated for foo's return value
- A pointer to the instruction after the call to foo is placed on the stack
- The single parameter to foo is placed on the stack
- The instruction pointer is set to the address of foo
- foo's lexical scope is "entered":
- space for the integer f is allocated on the stack
- fib is called with a number:
- space on the stack is allocated for fib's return value
- the pointer to the instruction after the call to fib (in this case the assignment to f) is placed on the stack
- the single parameter to fib, n, is placed on the stack
- the instruction pointer is set to the address of fib
- fib checks if the number is smaller than 2
- If it is
- The space allocated for fib's return value is set to n
- fib's storage (parameters, lexical variables) are deleted from the stack
- the instruction pointer is set to the pointer to the place after the call, as found on the stack
- If it isn't
- two calls to fib are made, as illustrated, and the sum of their return values is returned
- fib returns a value
- the instruction after the call to fib in foo cleans up the return value, and eventually assigns the value to the space in the variable f (also on the stack)
- a call to printf is made, much like the call to fib
- foo's lexical scope is "left"
- the storage space for 'f' is deleted from the stack
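Conceptually, each call in the walkthrough above leaves a frame like the following on the stack. This is a sketch only; real frame layouts are ABI- and platform-specific, and the struct name is mine:

```c
#include <assert.h>

/* Conceptual layout of one frame for a call to fib(n), following
 * the walkthrough above (not a real ABI layout). */
struct fib_frame {
    int   return_value;    /* space allocated by the caller for the result */
    void *return_address;  /* pointer to the instruction after the call */
    int   n;               /* the single parameter */
};
```

A debugger or a tool like pstack walks exactly this kind of structure: the saved return addresses identify the callers, and type information unpacks the parameter and local slots.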
By analyzing this data structure, programs like pstack can look at the pointers to the callers, and the instruction pointer register, and determine which functions the places in the code belong to.
From then on, the function pointers are looked up in the symbol table, and resolved to names. The type information can then be used to unpack the allocated spaces on the stack. If the full core of the program is available, these parameters can be recursively traversed, pretty printed, displayed, and so forth.
Analyzing structures such as the stack is the job of the debugger.
alloca
Alloca works by dynamically allocating additional arbitrary space on the stack for any purpose. This memory does not need to be deleted, because when the function returns, the stack space will eventually be reused for calls to other functions.
This is also why it's wrong to let data allocated using alloca outlive the function that allocated it.
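A sketch of the point above. The function `format_id` is a made-up example, and `alloca` here is the glibc/BSD extension (declared in `<alloca.h>` on glibc), not standard C:

```c
#include <alloca.h>   /* non-standard header; location varies by platform */
#include <stdio.h>
#include <string.h>

/* The buffer lives in this function's stack frame, so it disappears
 * when format_id returns; that is exactly why its address must never
 * be handed to anything that outlives this call. Returning a plain
 * value computed from it is fine. */
static int format_id(int n) {
    char *buf = alloca(32);          /* released implicitly on return */
    snprintf(buf, 32, "id-%d", n);
    return (int)strlen(buf);         /* returns a value, not the pointer */
}
```

Returning `buf` itself (or storing it in a global) would leave a dangling pointer into reused stack space.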
Block structures
Lexical variables are allocated in this manner:
void foo() {
    int x;          /* allocates sizeof(int) on the stack */
    int moose[10];  /* allocates 10*sizeof(int) on the stack; the array
                       lives directly in the frame (no separate pointer
                       is stored) */
    for (x = 0; x < 10; x++) {
        int y;      /* conceptually allocated on the stack each time the
                       loop body is entered, and released when it is
                       left, so y's space is reused every iteration */
    }
}                   /* when the function returns, the stack pointer moves
                       back past sizeof(int) + 10*sizeof(int), releasing
                       the lexical variables (x and moose) in the
                       function's body */
/* For every call to foo(), x and moose will be allocated and deallocated
   once; y will conceptually be allocated and deallocated 10 times (in
   practice a compiler usually reserves all of this space in a single
   stack-pointer adjustment). */
Structural details
The actual structure of the stack varies greatly from platform to platform.
It should be noted that to support things like alloca, more bookkeeping about the structure must be kept, and the top of the stack frame must be known at all times in order to account for the dynamic sizing.
These details are beyond the concept illustrated here; in practice only assembly programmers and compiler/debugger authors really need to care about them anyway.
Stack overflow
The stack has a limited amount of preallocated space.
Scenarios in which too much space is needed include:
- Deeply recursive call chains
- Call chains to functions with large amounts of lexical variable data
- Heavy use of alloca
When the stack space is exhausted, the program can no longer make calls to functions.
Furthermore, in some situations stack space is needed just to throw an error, so these situations get very ugly very fast.
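For illustration, a function like the following (my example, not from the text above) consumes roughly a kilobyte of stack per call. At small depths it is harmless; recursing millions of levels deep would typically exhaust the preallocated stack region and crash:

```c
/* Each call places about 1 KiB of lexical-variable data on the stack,
 * so a deep enough recursion exhausts the stack. The addition after
 * the recursive call keeps this from being a tail call, so a compiler
 * cannot reuse the frame (see the next section). */
static int deep(int n) {
    char frame_data[1024];        /* per-call stack usage */
    frame_data[0] = (char)n;      /* keep the array live */
    if (n <= 0)
        return frame_data[0];
    return deep(n - 1) + (frame_data[0] != 0);
}
```

With this definition, deep(n) simply returns n for small non-negative n, but something like deep(10000000) would likely overflow a default-sized stack.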
Tail Call Optimization
Tail call optimization works by omitting the stack frame of the caller when it is known that the caller will never be needed again. This is morally equivalent to replacing the call with a 'goto' into the called function. It can be applied by certain compilers in scenarios such as this:
void moose () {
    int x = function_with_no_param();
}

int function_with_no_param () {
    int default_value = ...;
    return function_with_param(default_value);
}

int function_with_param (int default_value) {
    ...
    return x;
}
When function_with_no_param makes the call to function_with_param, the stack frame for function_with_no_param can be thrown away, since there is nothing else for it to do. This can significantly save on stack space. For example, if you were to calculate the length of a linked list as such:
int length (list) {
    return length_recursive(list, 0);
}

int length_recursive (list, offset) {
    if (is_end(list)) {
        return offset;
    }
    return length_recursive(list->next, offset+1);
}
and your compiler performed tail call optimization, then this example would use a constant amount of stack space no matter how long the list is, much as if you had written a loop:
int length (list) {
    int l;
    for (l = 0; !is_end(list); list = list->next) {
        l++;
    }
    return l;
}
nuffin 12:32, September 12, 2005 (UTC)
- I strongly disagree that this should be merged in. They are 100% code/compiler optimizations (all are done actually by inlining [branch collapse] in modern compilers). By the time the CPU is involved, the stack frame doesn't exist generally because a subroutine wasn't actually called (it doesn't actually even exist). Anytime a compiler or smart human can recognize that a branch is actually a constant expression of some sort, such as in all of these examples, then this is possible. It actually has nothing to do with the call stack at all beyond the fact that you're reducing the number of stack frames. In fact, its other (arguably much more important) impact is reducing the branch load on the CPU. It only belongs in an article about code/compiler optimization techniques (which is exactly what Tail call optimization is about). Merging it here is akin to merging the entire Meat pie article into the Pastry article because of a purely tangential relationship. -- 67.182.14.249 (talk) 01:23, 29 October 2013 (UTC)
Merging with other wikipedia articles
I disagree with the merging proposal, just put references
- I also think it's not good to merge this with Subroutine. Subroutine is a basic programming concept whose main audience would be relatively unfamiliar with programming. The stack is an advanced topic about internals related to storage allocation, debugging, compiling, exception handling and other things beyond subroutines. -R. S. Shaw 20:22, 28 July 2005 (UTC)
- I disagree with the proposed merge also. Merging Stack unwinding here makes much more sense than merging the two stack articles with Subroutine. --Mike Van Emmerik 21:42, 7 August 2005 (UTC)
- I'm in favor of directing stack unwinding to either here, exception handling, or continuation. Stack frame should surely go here. The article badly, badly needs pictures and good explanation. --Mgreenbe 23:13, 3 January 2006 (UTC)
- Merged with stack unwinding. Stack frame might need some work. Strongly disagree that Subroutine should be merged. —EatMyShortz 08:58, 21 January 2006 (UTC)
- IMO Call Stack is a pretty good article as is (haven't read the whole thing) and should remain largely as is. There should also be a Stack Frame article that goes into greater detail about what's in the stack frame, like a few of the more popular architectures' implementations (Call Stack has little detail which I think is good, but the detail _should_ be somewhere). I agree with the first point on Subroutine. It's a concept all on its own. I'm not sure whether Stack unwinding should be part of Call Stack or not, need some more thought on that, but I lean towards one article there. —Von Fugal 15:43, 5 April 2008 (UTC)
- I agree that it should not be merged. Call stack (saving return addresses) is logically distinct, though often combined, with stacking of arguments or local variables. It happens that on many processors it is convenient to do all on one stack, but the explanations should be separate. Gah4 (talk) 20:39, 10 June 2008 (UTC)
Merger proposal
Most, if not all, of the text in the article Stack-based memory allocation appears to be covered in this article. I propose merging it into this article; are there any thoughts on this? Hertzsprung 15:04, 14 August 2007 (UTC)
I don't think it is a good idea to merge them, since they are different concepts: the call stack is a method that uses stack-based memory allocation, but they are not the same thing. JorgeFierro (talk) 23:26, 11 September 2008 (UTC)
- Agreed. A call stack is stack-based memory allocation, but stack-based memory allocation is not a call stack. I'd consider merging stack-based memory allocation with dynamic memory allocation and sticking the Frankenstein result into memory allocation (a redirect page at the time of this writing); individual articles on both concepts seems unnecessary when both have only a little bit of content. However, it just isn't a matter of import. (Seems like a huuuuuge amount of Wikipedia regards trivialities like this...) --Jtgibson (talk) 00:26, 5 October 2008 (UTC)
NOT A GOOD IDEA —Preceding unsigned comment added by 128.243.253.112 (talk) 23:59, 10 November 2008 (UTC)
Evaluation Stack
I'm no true expert, but the call stack (also known as the environmental stack) is a different structure from the evaluation (arithmetic) stack.
It is true that in some implementations they are mixed together, but this is a conceptual flaw of the implementation.
But as I said, I am no expert. As for my information source: http://www.sbql.pl/Topics/Environment%20stack.html
- The functions of tracking calls, evaluating arithmetic expressions, allocating temporary storage to a subroutine, and passing parameters are all separate functions. Each can be addressed by using a stack structure. The stacks used for these functions can all be separate, or can be brought together in any combination. The decision in an implementation to combine any or all of these stacks is an engineering decision, which will have better or worse efficiencies under different conditions. There is nothing conceptually wrong with combining any of these functions into the same stack. The functions are conceptually separate, but implementations can be more effective by combining some or all of them. -R. S. Shaw 18:42, 1 July 2006 (UTC)
- Exactly! I support your well formulated analysis and description. 83.255.35.198 (talk) 13:27, 24 March 2011 (UTC)
Performance analysis: call instruction vs. entire function call
"The reason is if a subroutine call instruction appears on the call stack for a certain fraction of execution time, its possible removal would save that much time." I don't think sampling the call stack is intended to measure the amount of time spent in the call instruction, but rather to measure the amount of time spent in the subroutine, which (except in the case of absolutely trivial functions) is much greater. This is in line with what is said on the Deep sampling page about eliminating "extraneous function calls". John lindgren (talk) 13:24, 16 June 2011 (UTC)
Error in "Structure" section ?
Original text "For example, if a subroutine named DrawPoint is currently running, having just been called by a subroutine DrawLine having been called by a DrawSquare subroutine, the top part of the call stack might be laid out like this (where the stack is growing towards the top):" and the image below it do not match. The upper stack frame in the image is for DrawLine, but the text specifies that it should be for DrawPoint. Am I right?
Cybevnm (talk) 21:47, 28 October 2011 (UTC)
- Yes, you are right. The text should not have described a "DrawPoint" as running; DrawLine is the code that is in the middle of execution when DrawLine's frame is at the top. When the illustration first went in, that is what the text said, but somebody changed it since then to have the DrawPoint reference. I have fixed it. -R. S. Shaw (talk) 19:41, 29 October 2011 (UTC)
"Activation record" redirect
Computers can have activation records and recursive function calls without using stacks. Computers built on the S/360 and AS/400 architectures use a linked list of activation records either in static memory or dynamically allocated from a heap-like area (using GETMAIN on the S/360 and Call Return Elements on the AS/400) instead of a stack. The caller's activation record holds the registers and parameters, and the callee's activation record holds its local variables. Architectures using this method typically use Branch and Link instructions for calls and might also have queue or linked list instructions to manipulate this data structure instead of the Push and Pop found on an architecture with a call stack. 69.54.60.34 (talk) 04:40, 11 November 2011 (UTC)
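The linked-list arrangement described in the comment above can be sketched in C. This is my illustration of the general idea, not the actual S/360 GETMAIN or AS/400 Call Return Element layouts; each record simply links back to its caller's record instead of relying on contiguous stack order:

```c
#include <stdlib.h>

/* Activation records kept as a heap-allocated linked list instead of
 * a contiguous stack: a back-link to the caller replaces stack order. */
struct activation_record {
    struct activation_record *caller;  /* link to the caller's record */
    void *return_address;              /* saved by a branch-and-link call */
    int locals[4];                     /* callee's local variables */
};

/* Allocate a record on call, linking it to the caller's record. */
static struct activation_record *enter(struct activation_record *caller) {
    struct activation_record *r = malloc(sizeof *r);
    r->caller = caller;
    r->return_address = 0;
    return r;
}

/* Free the record on return and resume with the caller's record. */
static struct activation_record *leave(struct activation_record *r) {
    struct activation_record *caller = r->caller;
    free(r);   /* unlike a stack, records need not be freed in LIFO order */
    return caller;
}
```

Recursion works naturally here because each call gets a fresh record, even though no stack pointer is involved.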
The introduction
For some reason, the second sentence just cracks me up. I have no idea why. -— Isarra (talk) 19:40, 11 January 2012 (UTC)
Incorrect statement
Article currently reads "The mixing of code (return addresses) and data (parameters, return values) in a call stack is a security risk, possibly exploitable through buffer overflows (in which article the risk and exploitation are explained)."
This is clearly an attempted reference to the exploit of buffer overflows by overwriting return addresses, allowing the exploit to snag control of the process. However, return addresses are not code, so to say "code (return addresses)" is incorrect. Return addresses are data about code, but this is (clearly) not the same thing. I don't think there's a way to fix this without significantly expanding the Security section, or referencing a buffer overflow article. It also might be worth noting that many current architectures don't allow execution of code on the stack (ref W^X, for example), making this even more obviously wrong.
- Not true; executable stacks are a security risk exploitable by buffer overflow attacks. Consider the situation where the buffer overflow not only overwrites the return address, but pushes arbitrary code onto the stack. If the attacker can somehow figure out the current stack pointer, they can overwrite the return to return into their arbitrary code, which is arguably more dangerous than just returning into other code in the program. mkehrt (talk) 01:58, 31 October 2008 (UTC)
- Good point about W^X. I added a mention of W^X to the article, and made sure the article no longer confuses "code" with "return addresses".
- I also mentioned the attack mkehrt described, even though W^X makes it irrelevant.
- Fixed. --DavidCary (talk) 18:51, 15 October 2014 (UTC)
Return address alternatives
This article claims that "Using a stack to save the return address has important advantages over alternatives."
What are those alternatives?
Should we describe those alternatives in this article? If not, what is a more appropriate Wikipedia article(s) for describing those alternatives? --DavidCary (talk) 16:51, 15 October 2014 (UTC)
- Alternatives that have been used include keeping the return address in a register. This is probably still being used on a few small embedded processors (which may have hardware support for say 8 registers for stacking return addresses, allowing a maximum nesting depth of eight). Another is saving the return address right next to the code of the called or calling routine. This was used in System/360 and earlier machines, as can be seen at the Hello World example in IBM Basic assembly language and successors#Examples. Doing that is neither reentrant nor allows recursion, and uses a writable area for code. The machine could have used a stack but IBM had an allergy to stacks (NIH; stacks were a Burroughs thing).
- If this stuff belongs in WP, I'd say it would better be in "Calling convention" than here. --R. S. Shaw (talk) 23:10, 15 October 2014 (UTC)
- Good point, R. S. Shaw. So I changed that sentence so this article now says "Using a stack to save the return address has important advantages over alternative calling conventions.".
- Alas, the Wikipedia "calling convention" article currently only lists calling conventions that *do* use a stack to save the return address, so it currently *appears* that there isn't really any alternative, but that's more of a problem with the Wikipedia "calling convention" article. --DavidCary (talk) 15:00, 17 December 2014 (UTC)
- Well, the Calling convention article is fine in that regard, here's a short quote from its lead section:
- ... return value is delivered from the callee back to the caller (on the stack, in a register, or within the heap)
- Rest of the article also describes how processor registers can be used for that purpose, for example in different architectures. — Dsimic (talk | contribs) 19:23, 21 December 2014 (UTC)
- Dear Dsimic, the "return address" that a caller somehow gives to a subroutine is different from any "return value" that a subroutine gives to a caller, right?
- Yes, the "calling convention" article does a great job covering the many ways a "return value is delivered from the [subroutine] back to the caller".
- However, in the current "calling convention" article, I only see one way a "return address" is delivered from the caller to the subroutine, even though this "call stack" article vaguely alludes to "alternatives" and R. S. Shaw points out one specific example of one alternative.
- Perhaps talk:calling convention#Standard Entry and Exit sequences. would be a better place to continue this discussion? --DavidCary (talk) 19:58, 21 December 2014 (UTC)
- It's about passing the return address implicitly to the subroutine on caller's behalf, not by the caller itself... Just think of the way high-level code is written: arguments are passed and return values are used, but the called subroutine comes back on its own. Thus, when a subroutine does its job, it knows where to return (that's where the return address kicks in), and it may return its result back to where it was called from; then, it's up to the caller to pop the return value off the stack, use the register value, etc.
- Calling convention article discusses the ways return addresses are passed around, please have a look at Calling convention § MIPS, Calling convention § SPARC and Calling convention § ARM sections, for example. Went ahead and clarified that a bit in its lead section. — Dsimic (talk | contribs) 20:20, 21 December 2014 (UTC)
Wrong rendering
http://en.m.wikipedia.org/wiki/File:Call_stack_layout.svg has a strange piece of text on the left. 185.19.20.132 (talk) 09:53, 29 April 2015 (UTC)
- Thanks, User:Offnfopt. 185.19.20.132 (talk) 16:46, 2 May 2015 (UTC)
Awkward / Unclear section in Description
In the Description section, the following passage,
...Since there is only one in this important context, it can be referred to as the stack (implicitly, "of the task"); however, in the Forth programming language the data stack or parameter stack is accessed more explicitly than the call stack and is commonly referred to as the stack (see below)...
is awkward and unclear to me. I don't know what 'the Fourth programming language' is. Also, what is 'this important context'? The word 'however' probably should not be used, because it implies that a different word will be used to describe this new kind of stack. This makes for confusing wording.
172.76.108.5 (talk)Bradley — Preceding undated comment added 19:06, 3 August 2015 (UTC)
- Before anything else, it's about the Forth programming language. Does that help? — Dsimic (talk | contribs) 06:25, 2 September 2015 (UTC)
Citations
Section 5 on Inspection says "Taking regular-time samples of the call stack can be useful in profiling the performance of programs." This section is marked 'dubious' and that does not sound right to me. Below are some references to this technique:
- These code examples of Java performance problems have been assembled, in part, to demonstrate the usefulness of this technique.
- Case Study of Performance Tuning / Mike Dunlavey /
- Building Better Applications / A Theory of Efficient Software Development / Mike Dunlavey / ISBN 978-0442017408
- Performance tuning with instruction-level cost derived from call-stack sampling / Mike Dunlavey / ACM SIGPLAN Notices 42, 8 (August, 2007), pp. 4-8
Erikostermueller (talk) 02:23, 3 October 2016 (UTC)
- Long story short, I think it's safe to remove that 'dubious' tag, and I think the references you gave above should be added to the section.
- That 'dubious' apparently was added by John lindgren (talk · contribs) in 2011, I think, see diff. However, at this time the text was this:
- Taking random-time samples of the call stack can be very useful in optimizing performance of programs. The reason is if a subroutine call instruction appears on the call stack for a certain fraction of execution time, its possible removal would save that much time.
- John lindgren used the tag to mark the second sentence of this text, and he added a corresponding comment to the talk page, now archived. See Talk:Call stack/Archive 1#Performance analysis: call instruction vs. entire function call. The text (and that second sentence in particular) was rewritten in 2013 by an anonymous user, commenting "Fixed the bogus performance analysis logic". See diff. After that it was cleaned up and refactored to the version that is now in the article.
- I am going to remove that tag. Perhaps you could find the time to work in the references you found?
- Tea2min (talk) 09:02, 3 October 2016 (UTC)
- Looks good to me, thanks. John lindgren (talk) 06:23, 17 February 2017 (UTC)
- I was looking in Stack-based Memory Allocation and it seemed like it did not have nearly enough information. The Call Stack page has all of the information that would be listed in the "Structure" section of the Stack-based memory allocation. 2600:6C5A:517F:E6F0:84D1:B06E:7261:FB6 (talk) 03:49, 17 January 2018 (UTC)Arsh