c - Stack assignment to a thread


I've been trying to piece together how stack memory is handed out to threads. I haven't been able to piece the whole thing together. I tried going to the code, but I'm more confused, so I'm asking for help.

I asked this question a little while ago. Assume one particular program (therefore, all threads are within the same process). If I write printfs for the beginning of each stack pointer, and how much is allocated for each, I get output like the table at the end of this message. The first column is a time_t usec, the second doesn't matter, the third is the tid of the thread, the fourth is the guard size, then the beginning of the stack and the end of the stack (sorted by beginning of stack), then the allocated stack size (8 megs by default), and the last column is the difference between the end of the first allocated stack and the beginning of the next stack.

This means (I think) that if it is 0, the stacks are contiguous; if positive, since the stack grows down in memory, there is "free space" of that many MBs between one tid and the next (in memory). If negative, memory is being reused. That may mean that the stack space was freed before this thread was created.

My problem is: what is the algorithm that assigns stack space to threads (at a higher level than the code), and why do I sometimes get contiguous stacks and sometimes not, and sometimes get values like 7.94140625 and 0.0625 in the last column?

This is all Linux 2.6, C and pthreads.

This may be a question we will have to iterate on to get right, and I apologize for that, but I'm telling you what I know right now. Feel free to ask for clarifications.

Thanks for this. The table follows.

    usec   (n/a)  tid    guard  stack begin  stack end   stack size  diff (MB)
    52815  14     14786  4096   92549120     100941824   8392704     0
    52481  14     14784  4096   100941824    109334528   8392704     0
    51700  14     14777  4096   109334528    117727232   8392704     0
    70747  14     14806  4096   117727232    126119936   8392704     8.00390625
    75813  14     14824  4096   117727232    126119936   8392704     0
    51464  14     14776  4096   126119936    134512640   8392704     8.00390625
    76679  14     14833  4096   126119936    134512640   8392704     -4.51953125
    53799  14     14791  4096   139251712    147644416   8392704     -4.90234375
    52708  14     14785  4096   152784896    161177600   8392704     0
    50912  14     14773  4096   161177600    169570304   8392704     0
    51617  14     14775  4096   169570304    177963008   8392704     0
    70028  14     14793  4096   177963008    186355712   8392704     0
    51048  14     14774  4096   186355712    194748416   8392704     0
    50596  14     14771  4096   194748416    203141120   8392704     8.00390625
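For reference, the values in the last column are consistent with (end of this stack - beginning of the next stack) divided by 2^20. A minimal sketch of that arithmetic follows; the sign convention is an assumption inferred from the numbers in the table, and `stack_row` and `gap_mib` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdio.h>

#define MIB (1024.0 * 1024.0)

/* One table row, reduced to the two address columns. */
struct stack_row {
    unsigned long begin;  /* lowest address of the stack mapping */
    unsigned long end;    /* one past the highest address */
};

/*
 * Assumed convention (inferred from the table, not from the program that
 * produced it): end of this stack minus beginning of the next, in MiB.
 * 0 means the two mappings touch, a negative value means there is a hole
 * between them, a positive value means the address ranges overlap.
 */
static double gap_mib(struct stack_row cur, struct stack_row next)
{
    return ((double)cur.end - (double)next.begin) / MIB;
}

/* Print the computed column for every adjacent pair of rows. */
static void print_gaps(const struct stack_row *rows, size_t n)
{
    for (size_t i = 0; i + 1 < n; i++)
        printf("%lu %lu %.8f\n", rows[i].begin, rows[i].end,
               gap_mib(rows[i], rows[i + 1]));
}
```

Feeding it the rows for tids 14791 and 14785 gives -4.90234375, which matches the table: under this convention there is a hole of roughly 4.9 MB between those two stacks.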

First, by stracing a simple test program that launches a single thread, we can see the syscalls used to create a new thread. Here's a simple test program:

    #include <pthread.h>
    #include <stdio.h>

    /* Thread body: does nothing and exits immediately. */
    void *test(void *x)
    {
        return NULL;
    }

    int main()
    {
        pthread_t thr;
        printf("start\n");
        pthread_create(&thr, NULL, test, NULL);
        pthread_join(thr, NULL);
        printf("end\n");
        return 0;
    }

And the relevant portion of its strace output:

    write(1, "start\n", 6start
    )                  = 6
    mmap2(NULL, 8392704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0xf6e32000
    brk(0)                                  = 0x8915000
    brk(0x8936000)                          = 0x8936000
    mprotect(0xf6e32000, 4096, PROT_NONE)   = 0
    clone(child_stack=0xf7632494, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0xf7632bd8, {entry_number:12, base_addr:0xf7632b70, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}, child_tidptr=0xf7632bd8) = 9181
    futex(0xf7632bd8, FUTEX_WAIT, 9181, NULL) = -1 EAGAIN (Resource temporarily unavailable)
    write(1, "end\n", 4end
    )                    = 4
    exit_group(0)                           = ?

As we can see, it obtains a stack from mmap with PROT_READ|PROT_WRITE protection and MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK flags. It then protects the first (i.e., lowest) page of the stack, to detect stack overflows. The rest of the calls aren't really relevant to the discussion at hand.
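The mmap2 and mprotect calls from the trace can be reproduced by hand. The following is a minimal sketch, assuming Linux with glibc; `alloc_thread_stack`, `free_thread_stack` and `page_size` are hypothetical helper names, not part of pthreads, and this is not the code glibc itself runs.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* exposes MAP_ANONYMOUS and MAP_STACK */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

#ifndef MAP_STACK
#define MAP_STACK 0            /* only a hint; fall back to no flag if absent */
#endif

/* Size of one page, so the guard can be a whole number of pages. */
static size_t page_size(void)
{
    return (size_t)sysconf(_SC_PAGESIZE);
}

/*
 * Hypothetical helper: map `size` bytes the way the trace shows, then make
 * the lowest `guard` bytes inaccessible so that running off the bottom of
 * the stack faults instead of corrupting a neighbouring mapping.
 * Returns the lowest address of the mapping, or NULL on failure.
 */
static void *alloc_thread_stack(size_t size, size_t guard)
{
    assert(guard < size);
    void *base = mmap(NULL, size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
    if (base == MAP_FAILED)
        return NULL;
    if (guard > 0 && mprotect(base, guard, PROT_NONE) != 0) {
        munmap(base, size);
        return NULL;
    }
    return base;
}

/* Release a mapping obtained from alloc_thread_stack(). */
static int free_thread_stack(void *base, size_t size)
{
    return munmap(base, size);
}
```

Calling `alloc_thread_stack(8392704, page_size())` asks for the same size as in the trace. The kernel picks the address, since no hint is passed, which is exactly the part the rest of this answer looks at.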

So, then, how does mmap allocate the stack? Well, let's start at mmap_pgoff in the Linux kernel; this is the entry point for the modern mmap2 syscall. It delegates to do_mmap_pgoff after taking some locks. This then calls get_unmapped_area to find an appropriate range of unmapped pages.

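Unfortunately, this then calls a function pointer stored in the process's memory descriptor (mm->get_unmapped_area); this is probably so that 32-bit and 64-bit processes can have different ideas of which addresses can be mapped. In the case of x86, this pointer is set in arch_pick_mmap_layout, which switches based on whether it's using a 32-bit or 64-bit architecture for this process.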

So let's look at the implementation of arch_get_unmapped_area then. It first gets some reasonable defaults for its search from find_start_end, then tests to see if the address hint passed in is valid (for thread stacks, no hint is passed). It then starts scanning through the virtual memory map, starting from a cached address, until it finds a hole. It saves the end of the hole for use in the next search, then returns the location of this hole. If it reaches the end of the address space, it starts again from the start, giving it one more chance to find an open area.
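The search described above can be modelled in user space. This is a toy sketch, not the kernel's code: `toy_mm`, `toy_get_unmapped_area` and `toy_unmap` are hypothetical names, the mapped areas live in a small sorted array instead of the kernel's VMA structures, and pulling the cached address back on unmap is an assumption about how the cache is updated.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_AREAS 64

struct area { unsigned long start, end; };      /* mapped range [start, end) */

struct toy_mm {
    struct area areas[MAX_AREAS];   /* sorted by start, non-overlapping */
    size_t count;
    unsigned long base, limit;      /* searchable address range */
    unsigned long cache;            /* where the next search begins */
};

static void toy_init(struct toy_mm *mm, unsigned long base, unsigned long limit)
{
    assert(base > 0 && base < limit);   /* 0 is reserved as "no hole found" */
    mm->count = 0;
    mm->base = base;
    mm->limit = limit;
    mm->cache = base;
}

/* Scan upwards from `from` for the first hole of `len` bytes; 0 if none. */
static unsigned long toy_scan(const struct toy_mm *mm, unsigned long from,
                              unsigned long len)
{
    unsigned long addr = from;
    for (size_t i = 0; i < mm->count; i++) {
        const struct area *a = &mm->areas[i];
        if (a->end <= addr)
            continue;               /* area lies entirely below the candidate */
        if (addr + len <= a->start)
            break;                  /* the hole in front of this area fits */
        addr = a->end;              /* otherwise skip past this area */
    }
    return (addr + len <= mm->limit) ? addr : 0;
}

/* Record [start, end) as mapped, keeping the array sorted. */
static void toy_insert(struct toy_mm *mm, unsigned long start, unsigned long end)
{
    size_t i = mm->count;
    assert(mm->count < MAX_AREAS);
    while (i > 0 && mm->areas[i - 1].start > start) {
        mm->areas[i] = mm->areas[i - 1];
        i--;
    }
    mm->areas[i].start = start;
    mm->areas[i].end = end;
    mm->count++;
}

/* First fit from the cached address, with one retry from the base. */
static unsigned long toy_get_unmapped_area(struct toy_mm *mm, unsigned long len)
{
    unsigned long addr = toy_scan(mm, mm->cache, len);
    if (addr == 0 && mm->cache != mm->base)
        addr = toy_scan(mm, mm->base, len);
    if (addr == 0)
        return 0;
    toy_insert(mm, addr, addr + len);
    mm->cache = addr + len;         /* next search starts after this hole */
    return addr;
}

/* Unmap the area starting at `start`. Assumption: freeing below the cached
 * address pulls the cache back, so the hole is found by the next search. */
static void toy_unmap(struct toy_mm *mm, unsigned long start)
{
    for (size_t i = 0; i < mm->count; i++) {
        if (mm->areas[i].start != start)
            continue;
        if (start < mm->cache)
            mm->cache = start;
        for (size_t j = i + 1; j < mm->count; j++)
            mm->areas[j - 1] = mm->areas[j];
        mm->count--;
        return;
    }
}
```

In this model, three allocations of the same size come back contiguous and increasing. If the middle one is then unmapped, the next allocation lands in the freed hole and the one after that continues past the highest mapping, which is the kind of out-of-order pattern seen in the question's table.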

So as you can see, normally, it will assign stacks in an increasing manner (for x86; x86-64 uses arch_get_unmapped_area_topdown and will likely assign them in decreasing order). However, it also keeps a cache of where to start a search, so it might leave gaps depending on when areas are freed. In particular, when an mmaped area is freed, it might update the free-address-search cache, so you might see out-of-order allocations there as well.
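To see what order your own system hands stacks out in, you can have each thread report its stack bounds. This is a minimal sketch, assuming glibc (pthread_getattr_np is a GNU extension) and compilation with -pthread; `stack_info`, `record_stack` and `sample_stacks` are hypothetical names, and exactly what the reported range includes (for example the guard area) may differ between glibc versions.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for pthread_getattr_np */
#endif
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdio.h>

#define MAX_THREADS 16

struct stack_info {
    void *lo;        /* lowest address reported for the stack */
    size_t size;     /* reported stack size in bytes */
    size_t guard;    /* reported guard size in bytes */
};

/* Thread body: look up this thread's own stack attributes and store them. */
static void *record_stack(void *arg)
{
    struct stack_info *info = arg;
    pthread_attr_t attr;
    if (pthread_getattr_np(pthread_self(), &attr) == 0) {
        pthread_attr_getstack(&attr, &info->lo, &info->size);
        pthread_attr_getguardsize(&attr, &info->guard);
        pthread_attr_destroy(&attr);
    }
    return NULL;
}

/* Start n threads, join them all, and print one line per stack.
 * Returns the number of threads that were actually created. */
static int sample_stacks(struct stack_info *out, int n)
{
    pthread_t thr[MAX_THREADS];
    int started = 0;

    assert(n >= 0 && n <= MAX_THREADS);
    for (int i = 0; i < n; i++) {
        out[i].lo = NULL;
        out[i].size = 0;
        out[i].guard = 0;
        if (pthread_create(&thr[i], NULL, record_stack, &out[i]) != 0)
            break;
        started++;
    }
    /* Join only after every thread has been created, so none of the
     * stacks has been handed back while later ones are being set up. */
    for (int i = 0; i < started; i++)
        pthread_join(thr[i], NULL);
    for (int i = 0; i < started; i++)
        printf("thread %d: stack at %p, size %zu, guard %zu\n",
               i, out[i].lo, out[i].size, out[i].guard);
    return started;
}
```

Calling `sample_stacks` with an array of a few entries prints one line per thread; whether the addresses go up or down, and whether they are contiguous, depends on the kernel and architecture you run it on.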

That said, this is all an implementation detail. Do not rely on any of this in your program. Just take whatever addresses mmap hands out and be happy :)

