
david_chisnall /

npub13gc…k50d

2024-05-28 07:29:41

in reply to nevent1q…82sz

How many threads does the Linux kernel spawn typically? Last time I attached a debugger straight after boot it was 900 (I think ZFS was the largest single consumer). The ones for NFS have larger stacks because NFS has some deep calls.

With that many, adding 4 KiB to each kernel stack adds about 3.5 MiB of wired memory. On a typical desktop, that’s noise and no one cares. For embedded systems (consumer routers and so on) it’s much more of a problem, and that’s where the pushback came from the last time someone suggested bumping the default.
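To make the arithmetic concrete, here is a back-of-the-envelope sketch. The thread count of 900 and the 4 KiB increase are the figures from this post; the function name is just illustrative.

```python
KIB = 1024
MIB = 1024 * KIB


def extra_wired_bytes(kernel_threads: int, extra_stack_bytes: int) -> int:
    """Extra wired memory if every kernel thread's stack grows by extra_stack_bytes."""
    return kernel_threads * extra_stack_bytes


# Figures from the post: roughly 900 kernel threads, 4 KiB more stack each.
extra = extra_wired_bytes(kernel_threads=900, extra_stack_bytes=4 * KIB)

# 900 * 4 KiB = 3600 KiB, a little over 3.5 MiB.
print(f"{extra // KIB} KiB = {extra / MIB:.2f} MiB")
```

The cost scales linearly with the number of kernel threads, so the same change is much cheaper on a system that spawns far fewer of them, but such a system usually has far less RAM to begin with.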

For the threads associated with userspace threads, it’s different. A userspace thread will have at least a page of userspace stack, a kernel thread structure, and typically a page table page for the stack and its guard page (unless they’re very densely packed), so the memory overhead of a new thread is quite small. If you have a modern x86 machine with AVX-512, you have around 3 KiB just for the CPU state on context switch (kernel threads don’t have FPU state unless they opt in, which most don’t).

Java VMs implement N:M threading, so they don’t typically create a lot of kernel threads for a lot of Java threads. The same is true of Go.

The NT kernel was designed at a time when a workstation might have only 4 MiB of RAM, and so it wires just enough disk driver to be able to pull pages back in. All of the metadata required to find a page is stored in the invalid PTE. This means that page-table pages can also be paged out, with each step on the page-table walk faulting and bringing back more of the page table until the real page is loaded. Linux and FreeBSD both store extra metadata for paged memory, which is why it’s fairly easy to support things like CHERI and MTE, whereas on Windows it requires significant rearchitecture of the virtual memory subsystem. The NT VM subsystem is slightly larger, in lines of code, than a minimal build of the Linux kernel. I completely understand why the NT choices made sense in the early ’90s but I would not encourage anyone to copy them. Needing a few MiB more wired memory in exchange for a drastically simpler and more flexible virtual memory model is absolutely the right trade this century.
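The "metadata lives in the invalid PTE" idea can be sketched with a toy encoding. This is not the real NT PTE format: the field widths, the names, and the notion of a pagefile index plus page offset are all assumptions chosen for illustration. The point is only that once the valid bit is clear, the hardware ignores the remaining bits, so the OS can reuse them to record where the page went.

```python
VALID_BIT = 1 << 0

# Hypothetical layout of a 64-bit not-present entry (NOT the real NT format):
#   bit 0       valid (0 = not present, hardware ignores the other bits)
#   bits 1..4   which pagefile holds the page
#   bits 12..63 page-sized offset within that pagefile
PAGEFILE_INDEX_SHIFT = 1
PAGEFILE_INDEX_BITS = 4
PAGEFILE_OFFSET_SHIFT = 12
PAGEFILE_OFFSET_BITS = 52


def encode_paged_out(pagefile_index: int, page_offset: int) -> int:
    """Build an invalid PTE that records where the page was written."""
    if not 0 <= pagefile_index < (1 << PAGEFILE_INDEX_BITS):
        raise ValueError("pagefile index out of range")
    if not 0 <= page_offset < (1 << PAGEFILE_OFFSET_BITS):
        raise ValueError("page offset out of range")
    return (page_offset << PAGEFILE_OFFSET_SHIFT) | (
        pagefile_index << PAGEFILE_INDEX_SHIFT
    )


def decode(pte: int):
    """Return ('resident', frame) or ('paged_out', pagefile_index, page_offset)."""
    if pte & VALID_BIT:
        return ("resident", pte >> PAGEFILE_OFFSET_SHIFT)
    index = (pte >> PAGEFILE_INDEX_SHIFT) & ((1 << PAGEFILE_INDEX_BITS) - 1)
    offset = pte >> PAGEFILE_OFFSET_SHIFT
    return ("paged_out", index, offset)


pte = encode_paged_out(pagefile_index=2, page_offset=0x1234)
print(decode(pte))
```

Because the entry itself says where the page is, no separate wired side table is needed to service the fault. The flip side is that every spare bit is already spoken for, which leaves little room for extra per-page state of the kind CHERI or MTE needs.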

Author Public Key

npub13gcwraghdcw9xzkg3tky24feu0lzkhtlt57wpdn58y4m4tyr9qdsxdk50d
