The PTHREAD_STACK_MIN/RLIMIT_STACK/minstack computation is now in sysdeps/nptl/pthread_early_init.h. I think not much has changed since 2012.
I agree that the current behavior is not ideal. musl has a heuristic to address the problem (https://git.musl-libc.org/cgit/musl/commit/?id=d5142642b8e6c45449158efdb8f8e87af4dafde8):
> the new strategy pthread_create now uses is to only put TLS on the
> application-provided stack if TLS is smaller than 1/8 of the stack
> size or 2k, whichever is smaller. this ensures that the application
> always has "close enough" to what it requested, and the threshold is
> chosen heuristically to make sure "sane" amounts of TLS still end up
> in the application-provided stack.
>
> if TLS does not fit the above criteria, pthread_create uses mmap to
> obtain space for TLS, but still uses the application-provided stack
> for actual call frame stack. this is to avoid wasting memory, and for
> the sake of supporting ugly hacks like garbage collection based on
> assumptions that the implementation will use the provided stack range.
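To make the threshold concrete, here is a minimal sketch of the check as the commit message describes it. It is written from the quoted text only, not copied from musl: the helper name and the macro are hypothetical, and reading "2k" as 2048 bytes is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: "2k" in the commit message means 2048 bytes. */
#define TLS_ON_STACK_LIMIT 2048

/*
 * Hypothetical helper (not a musl function): returns true when TLS of
 * tls_size bytes would be placed on an application-provided stack of
 * stack_size bytes, i.e. when it is smaller than both 1/8 of the stack
 * and the fixed limit. Otherwise TLS would come from a separate mmap
 * and the provided stack would hold only call frames.
 */
bool tls_fits_on_provided_stack(size_t tls_size, size_t stack_size)
{
    return tls_size < stack_size / 8 && tls_size < TLS_ON_STACK_LIMIT;
}
```

Under this reading, a 785760-byte PT_TLS never lands on the provided stack regardless of stack size, while a 64-byte one does for any stack larger than 512 bytes.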
Here is a one-liner for people to investigate large p_memsz(PT_TLS):
for i in /usr/bin/*(.) /usr/lib/x86_64-linux-gnu/*; do readelf -Wl $i 2>/dev/null | awk '$1=="TLS" && strtonum($6) > 64 {printf "%d\t", strtonum($6); exit(1)}' || echo $i; done
(Adjust paths and `> 64` as needed. The `*(.)` glob qualifier is zsh syntax and strtonum requires gawk.)
It seems that Chrome has a smaller PT_TLS now. On my machine, these DSOs have a large p_memsz(PT_TLS):
libtsan.so.2.0.0 has a 785760-byte p_memsz(PT_TLS) due to a large vector clock (TSAN runtime v3 has a much smaller one).
liblsan.so.0.0.0 (56240)
libglog.so.0.6.0 (30457)
libmetis.so.5.1.0 (28920)
...