In asm for CPUs, an optimized tailcall is just a jump to the function entry point. I.e. running the whole function as a loop body in the case of tail-recursion. (Without pushing a return address, so when you reach the base-case it's just one return back to the ultimate parent.) I'm going to take a wild guess that Erlang / BEAM bytecode is remotely similar, even though I know nothing about it specifically. When execution reaches the top of a function, it doesn't know whether it got there by recursion or a call from another function and would thus have to allocate more space if it needed any. If you want to reuse already-allocated stack space, you'd have to further optimize the tail-recursion into an actual loop inside the function body, not recursion at all anymore. Or to put it another way, to tailcall anything, you need the callstack in the same state it was in on function entry. Jumping instead of calling loses the opportunity to do any cleanup after the called function returns, because it returns to your caller, not to you. But can't we just put the stack-cleanup in the recursion base-case that does actually return instead of tailcalling? Yes, but that only works if the "tailcall" is to a point in this function after allocating new space is already done, not the entry point that external callers will call. Those 2 changes are exactly the same as turning tail-recursion into a loop.
(Disclaimer: This is a guess) Tail-recursion calls does not mean that it cannot perform any other call previously or use the stack in the meantime. In that case, the allocated stack for those calls must be deallocated before performing the tail-recursion. The call_last deallocates surplus stack before behaving like call_only . You can see an example if you erlc -S the following code:
{function, fun1, 1, 2}.
{label,1}.
{line,[{location,"test.erl",4}]}.
{func_info,{atom,test},{atom,fun1},1}.
{label,2}.
{test,is_nonempty_list,{f,3},[{x,0}]}.
{get_list,{x,0},{x,1},{x,2}}.
{test,is_eq_exact,{f,1},[{x,1},{integer,1}]}.
{move,{x,2},{x,0}}.
{call_only,1,{f,2}}. % No stack allocated, no need to deallocate it
{label,3}.
{test,is_nil,{f,1},[{x,0}]}.
{move,{atom,ok},{x,0}}.
return.
{function, funN, 0, 5}.
{label,4}.
{line,[{location,"test.erl",10}]}.
{func_info,{atom,test},{atom,funN},0}.
{label,5}.
{allocate_zero,1,0}. % Allocate 1 slot in the stack
{call,0,{f,7}}. % Leaves the result in {x,0} (the 0 register)
{move,{x,0},{y,0}}.% Moves the previous result from {x,0} to the stack because next function needs {x,0} free
{call,0,{f,7}}. % Leaves the result in {x,0} (the 0 register)
{test_heap,4,1}.
{put_list,{x,0},nil,{x,0}}. % Create a list with only the last value, [B]
{put_list,{y,0},{x,0},{x,0}}. % Prepend A (from the stack) to the previous list, creating [A, B] ([A | [B]]) in {x,0}
{call_last,1,{f,2},1}. % Tail recursion call deallocating the stack
{function, list, 0, 7}.
{label,6}.
{line,[{location,"test.erl",15}]}.
{func_info,{atom,test},{atom,list},0}.
{label,7}.
{move,{literal,[1,2,3,4]},{x,0}}.
return.
EDIT:
To actually answer your questions: The thread's memory is used for both the stack and the heap, which use the same memory block in opposite sides, growning towards each other (the thread's GC triggers when they meet). "Allocating" in this case means increasing the space used for the stack, and if that space is not going to be used anymore, it must be deallocated (returned to the memory block) in order to be able to use it again later (either as heap or as stack).
2条答案
按热度按时间gxwragnw1#
In asm for CPUs, an optimized tailcall is just a jump to the function entry point. I.e. running the whole function as a loop body in the case of tail-recursion. (Without pushing a return address, so when you reach the base-case it's just one return back to the ultimate parent.)
I'm going to take a wild guess that Erlang / BEAM bytecode is remotely similar, even though I know nothing about it specifically.
When execution reaches the top of a function, it doesn't know whether it got there by recursion or a call from another function and would thus have to allocate more space if it needed any.
If you want to reuse already-allocated stack space, you'd have to further optimize the tail-recursion into an actual loop inside the function body, not recursion at all anymore.
Or to put it another way, to tailcall anything, you need the callstack in the same state it was in on function entry. Jumping instead of calling loses the opportunity to do any cleanup after the called function returns, because it returns to your caller, not to you.
But can't we just put the stack-cleanup in the recursion base-case that does actually return instead of tailcalling? Yes, but that only works if the "tailcall" is to a point in this function after allocating new space is already done, not the entry point that external callers will call. Those 2 changes are exactly the same as turning tail-recursion into a loop.
b91juud32#
(Disclaimer: This is a guess)
Tail-recursion calls does not mean that it cannot perform any other call previously or use the stack in the meantime. In that case, the allocated stack for those calls must be deallocated before performing the tail-recursion. The
call_last
deallocates surplus stack before behaving likecall_only
.You can see an example if you
erlc -S
the following code:I've annotated the relevant parts:
EDIT:
To actually answer your questions:
The thread's memory is used for both the stack and the heap, which use the same memory block in opposite sides, growning towards each other (the thread's GC triggers when they meet).
"Allocating" in this case means increasing the space used for the stack, and if that space is not going to be used anymore, it must be deallocated (returned to the memory block) in order to be able to use it again later (either as heap or as stack).