c - Stack assignment to a thread
I've been trying to piece together how stack memory is handed out to threads. I haven't been able to piece the whole thing together. I tried to go to the code, but I'm more confused, so I'm asking for your help.
I asked this question a little while ago. So assume that particular program (therefore, all threads are within the same process). If I write printfs for each beginning of the stack pointer, and how much is allocated for them, then I get stuff like the table at the end of this message, where the first column is a time_t usec, the second doesn't matter, the third is the tid of the thread, the fourth is the guard size, then the begin of the stack, the end of the stack (sorted by beginning of stack), the last but one is the allocated stack (8 megs by default), and the last column is the difference between the end of the first allocated stack and the beginning of the next stack.
This means that (I think), if it is 0, then the stacks are contiguous; if positive, since the stack grows down in memory, then it means that there is "free space" of that many MBs between a tid and the next (in memory). If negative, this means that memory is being reused. So this may mean that that stack space has been freed before this thread was created.
My problem is: what exactly is the algorithm that assigns stack space to threads (at a higher level than the code), and why do I sometimes get contiguous stacks, and sometimes not, and sometimes get values like 7.94140625 and 0.0625 in the last column?
This is all Linux 2.6, C and pthreads.
This may be a question we will have to iterate on to get it right, and I apologize for that, but I'm telling you what I know right now. Feel free to ask for clarifications.
Thanks for this. The table follows.
time_usec  col2  tid    guard  begin      end        allocated  diff_mb
52815      14    14786  4096   92549120   100941824  8392704    0
52481      14    14784  4096   100941824  109334528  8392704    0
51700      14    14777  4096   109334528  117727232  8392704    0
70747      14    14806  4096   117727232  126119936  8392704    8.00390625
75813      14    14824  4096   117727232  126119936  8392704    0
51464      14    14776  4096   126119936  134512640  8392704    8.00390625
76679      14    14833  4096   126119936  134512640  8392704    -4.51953125
53799      14    14791  4096   139251712  147644416  8392704    -4.90234375
52708      14    14785  4096   152784896  161177600  8392704    0
50912      14    14773  4096   161177600  169570304  8392704    0
51617      14    14775  4096   169570304  177963008  8392704    0
70028      14    14793  4096   177963008  186355712  8392704    0
51048      14    14774  4096   186355712  194748416  8392704    0
50596      14    14771  4096   194748416  203141120  8392704    8.00390625
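For readers checking the last column: the values in the table are consistent with (end of this row's stack minus begin of the next row's stack) divided by 2^20. That formula is inferred from the numbers, not stated in the post, and the helper name `stack_diff_mib` is made up for illustration. A minimal sketch under that assumption:

```c
/* Hypothetical helper: reproduces the table's last column from two rows.
 * Assumption (inferred from the data, not stated by the author):
 *   diff = (end of this stack - begin of next stack) / 2^20 */
double stack_diff_mib(long long end_this, long long begin_next)
{
    return (double)(end_this - begin_next) / (1024.0 * 1024.0);
}
```

With this convention, tid 14833 followed by tid 14791 gives (134512640 - 139251712) / 2^20 = -4.51953125, and tid 14806 followed by tid 14824 gives (126119936 - 117727232) / 2^20 = 8.00390625. So under this reading a negative value is a hole between two stacks, and a positive value of one full mapping size means the next row sits at the same addresses, which is worth comparing against the sign interpretation given in the prose.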
First, by stracing a simple test program that launches a single thread, we can see the syscalls it uses to create a new thread. Here's the simple test program:
    #include <pthread.h>
    #include <stdio.h>

    /* Thread body: does nothing and exits immediately. */
    void *test(void *x)
    {
        (void)x;
        return NULL;
    }

    int main()
    {
        pthread_t thr;
        printf("start\n");
        pthread_create(&thr, NULL, test, NULL);
        pthread_join(thr, NULL);
        printf("end\n");
        return 0;
    }
And the relevant portion of its strace output:
    write(1, "start\n", 6start
    )                  = 6
    mmap2(NULL, 8392704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0xf6e32000
    brk(0)                                  = 0x8915000
    brk(0x8936000)                          = 0x8936000
    mprotect(0xf6e32000, 4096, PROT_NONE)   = 0
    clone(child_stack=0xf7632494, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0xf7632bd8, {entry_number:12, base_addr:0xf7632b70, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}, child_tidptr=0xf7632bd8) = 9181
    futex(0xf7632bd8, FUTEX_WAIT, 9181, NULL) = -1 EAGAIN (Resource temporarily unavailable)
    write(1, "end\n", 4end
    )                    = 4
    exit_group(0)                           = ?
We can see that it obtains a stack from mmap with PROT_READ|PROT_WRITE protection and MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK flags. It then protects the first (i.e., lowest) page of the stack, to detect stack overflows. The rest of the calls aren't really relevant to the discussion at hand.
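The two calls from the trace can be repeated from user space. The following is only a sketch of that mmap-then-mprotect pattern, not the threading library's actual code; the function name `alloc_guarded_stack` is made up for illustration, and the `MAP_STACK` fallback is an assumption for systems whose headers lack the flag. Note that the size in the trace, 8392704 bytes, is 8 MiB plus one 4096-byte page.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

#ifndef MAP_STACK
#define MAP_STACK 0   /* assumption: treat the flag as a no-op where it is missing */
#endif

/* Hypothetical helper: map `size` bytes for a stack and make the lowest
 * `guard` bytes inaccessible, mirroring the mmap2/mprotect pair in the trace.
 * Returns the lowest address of the mapping, or NULL on failure. */
void *alloc_guarded_stack(size_t size, size_t guard)
{
    void *base = mmap(NULL, size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* The stack grows down, so an overflow runs into the lowest page first. */
    if (mprotect(base, guard, PROT_NONE) != 0) {
        munmap(base, size);
        return NULL;
    }
    return base;
}
```

Calling `alloc_guarded_stack(8392704, 4096)` asks for the same size and guard as in the trace; the address that comes back is whatever the kernel picks, which is the subject of the rest of this answer.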
So, then, how does mmap allocate the stack? Well, let's start at mmap_pgoff in the Linux kernel; this is the entry point for the modern mmap2 syscall. It delegates to do_mmap_pgoff after taking some locks. This then calls get_unmapped_area to find an appropriate range of unmapped pages.
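Unfortunately, this then calls a function pointer stored in the process's memory descriptor (the mm_struct, not the vma) - this is probably so that 32-bit and 64-bit processes can have different ideas of which addresses can be mapped. In the case of x86, this pointer is set in arch_pick_mmap_layout, which switches based on whether it's using a 32-bit or 64-bit architecture for this process.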
So let's look at the implementation of arch_get_unmapped_area then. It first gets some reasonable defaults for its search from find_start_end, then tests to see if the address hint passed in is valid (for thread stacks, no hint is passed). It then starts scanning through the virtual memory map, starting from a cached address, until it finds a hole. It saves the end of the hole for use in the next search, then returns the location of this hole. If it reaches the end of the address space, it starts again from the start, giving it one more chance to find an open area.
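The search described above can be modelled in a few lines of user-space C. This is a toy model, not kernel code: the names `toy_mm`, `toy_map` and `toy_unmap` are invented, mappings live in a small sorted array, and the rule that unmapping below the cached address pulls the cache back is an assumption made for the sketch.

```c
#include <stddef.h>

#define TOY_MAX_MAPS 64

/* Toy address space: sorted, non-overlapping mappings inside [lo, hi). */
struct toy_mm {
    unsigned long lo, hi;     /* searchable range */
    unsigned long cache;      /* where the next search starts */
    size_t n;                 /* number of mappings in use */
    struct { unsigned long start, end; } map[TOY_MAX_MAPS];
};

/* First-fit search starting at the cached address. Returns the start of
 * the new mapping, or 0 if no hole of `len` bytes exists. */
unsigned long toy_map(struct toy_mm *mm, unsigned long len)
{
    unsigned long addr = mm->cache;
    int restarted = (addr == mm->lo);
    size_t i, j;

    if (mm->n == TOY_MAX_MAPS)
        return 0;
    for (;;) {
        for (i = 0; i < mm->n; i++) {
            if (mm->map[i].end <= addr)
                continue;                 /* mapping lies below the cursor */
            if (addr + len <= mm->map[i].start)
                break;                    /* hole before mapping i is big enough */
            addr = mm->map[i].end;        /* skip past this mapping */
        }
        if (addr + len <= mm->hi) {
            for (j = mm->n; j > i; j--)   /* insert at i, keeping the array sorted */
                mm->map[j] = mm->map[j - 1];
            mm->map[i].start = addr;
            mm->map[i].end = addr + len;
            mm->n++;
            mm->cache = addr + len;       /* remember the end of the hole */
            return addr;
        }
        if (restarted)
            return 0;
        restarted = 1;                    /* one more pass, from the bottom */
        addr = mm->lo;
    }
}

/* Remove the mapping that starts at `start`; pull the cache back if the
 * freed area lies below it (assumed behaviour for this sketch). */
void toy_unmap(struct toy_mm *mm, unsigned long start)
{
    size_t i, j;

    for (i = 0; i < mm->n; i++) {
        if (mm->map[i].start != start)
            continue;
        for (j = i; j + 1 < mm->n; j++)
            mm->map[j] = mm->map[j + 1];
        mm->n--;
        if (start < mm->cache)
            mm->cache = start;
        return;
    }
}
```

Starting from an empty `toy_mm`, three equal-sized `toy_map` calls return back-to-back increasing addresses. After a `toy_unmap` of the middle one, the next `toy_map` of the same size returns the freed address again, and the one after that lands above the third mapping. This is the same mix of contiguous and reused ranges that shows up in the question's table.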
So as you can see, normally, it will assign stacks in an increasing manner (for x86; x86-64 uses arch_get_unmapped_area_topdown and will likely assign them decreasing). However, it also keeps a cache of where to start a search, so it might leave gaps depending on when areas are freed. In particular, when a mmaped area is freed, it might update the free-address-search-cache, so you might see out-of-order allocations there as well.
That said, this is all an implementation detail. Do not rely on any of this in your program. Just take what addresses mmap hands out and be happy :)