Why Does My eBPF Program Work on One Kernel but Fail on Another?

musicale 3 hours ago

Contempt for stable kernel data structures and APIs (and forget about any sort of kernel ABI) might make things easier for certain kernel developers, but it offloads a constant maintenance burden onto many other people, such as eBPF, driver, and kernel extension developers.

This sort of asymmetry is why system modules, and platforms in general, should absorb pain in order to benefit their many clients, rather than doing the opposite.

Could be worse though - some platforms (cough, iOS) are happy to break user apps every year and offload a constant maintenance burden onto many thousands of app developers, when a more stable ABI would save developers (and users) billions of dollars in aggregate.

beng-nl 3 hours ago

In Linux’s defense, the userland ABI is stable, which is no small feat in terms of absorbing pain in order to benefit their many users.
Not sure why the trade-off consideration led to a different outcome for in-kernel APIs, but given the work done to ensure the stability of the userland ABI, I’m sure there is thought behind it.
- musicale 3 hours ago
  
  > userland abi is stable
  The system call interface per se is relatively stable. Then there's all that stuff that has been dumped into /proc...
- DSMan195276 2 hours ago
  
  Well from a certain POV it's like stabilizing APIs internal to your application - nobody else should be calling them so "stabilizing" them just creates unnecessary maintenance work. Obviously in practice certain things like eBPF or externally-maintained drivers can break this model, but then they don't really want people doing those things vs. merging code into the kernel.

jstrong 10 minutes ago

wow that sounds like a PITA to deal with

Upvoter33 5 hours ago

There's some research on this topic, e.g., https://depsurf.github.io

linuxftw 5 hours ago

This is all covered in the eBPF documentation. CO-RE was introduced over 6 years ago.

mackman 4 hours ago

CO-RE only works on kernels that support BTF. This post introduces one workaround, which is to generate BTF data for kernels without it. That's still only half the problem though. You also need to write your eBPF program so every kernel's verifier passes it, even though each kernel's eBPF verifier has different bugs, capabilities, and complexity limits. I maintain a large eBPF program that supports 4.14 through 6.14. We implemented our own version of CO-RE before CO-RE really existed. In reality, it's a lot more work than "compile once, run everywhere."
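To make the "kernels that support BTF" condition concrete, here is a minimal userspace sketch in C, not anyone's actual loader. It assumes the usual location where BTF-enabled kernels expose their type information, /sys/kernel/btf/vmlinux, and falls back to an externally generated BTF file. The helper names and the fallback argument are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Kernels built with CONFIG_DEBUG_INFO_BTF expose their own BTF here. */
#define VMLINUX_BTF_PATH "/sys/kernel/btf/vmlinux"

/* Returns true if a readable file exists at `path`. */
bool btf_available_at(const char *path)
{
    assert(path != NULL);
    return access(path, R_OK) == 0;
}

/*
 * Hypothetical helper: decide where CO-RE relocation data should come from.
 * `fallback_btf` is an externally generated BTF file for this kernel,
 * or NULL if none was shipped. Returns NULL when no BTF can be found.
 */
const char *pick_btf_source(const char *fallback_btf)
{
    if (btf_available_at(VMLINUX_BTF_PATH))
        return VMLINUX_BTF_PATH;
    if (fallback_btf != NULL && btf_available_at(fallback_btf))
        return fallback_btf;
    return NULL; /* no BTF: CO-RE relocations cannot be resolved */
}
```

With libbpf, my understanding is that a fallback file like this is normally handed over through the `btf_custom_path` field of `struct bpf_object_open_opts`. Note that this only addresses type relocation; it does nothing for verifier differences between kernels.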
- roblabla 3 hours ago
  
  Yeah same, we maintain some eBPF probes spanning 4.11 to the latest kernel, and holy hell, it's really bad. The worst offenders are some old Red Hat kernels with half-baked backports of eBPF features, containing a bunch of weird bugs or behavior that isn't perfectly in line with mainline...
  Here's a fun bug we recently had: we had to ban subtractions in our program (replacing them with an __asm__ macro) because of a bug in Linux kernels 5.7.0 to 5.10.10, which had the (indirect) impact of the verifier not properly tracking valid min/max values[0]. The worst part is, it didn't cause the verifier to reject our program outright. Instead, it used that wrong information to optimize out some branches it thought were never reachable, making for a really wonky situation to debug where the program followed an impossible control flow[1], resulting in it returning garbage to user-space.
  All this to say, CO-RE is really only half the problem. Supporting every kernel in existence is still a huge effort. Still worth it compared to the alternative of writing a Linux kernel driver though!
  [0]: https://github.com/torvalds/linux/commit/bc895e8b2a64e502fbb...
  [1]: https://github.com/torvalds/linux/blob/bc895e8b2a64e502fbba7...
- linuxftw 4 hours ago
  
  Kernels without BTF data are ancient at this point. BTF was added in 4.18, that was in 2018. 2018! If you're running a kernel older than that, you don't need BPF, you need a whole new operating system.
  Yes, each kernel version might have different features between then and now. You have to pick a minimum supported version and write against that.
  
  roblabla 3 hours ago
  
  Many, many distributions didn't embed the BTF information until fairly recently. openSUSE did it in 15.4, released in 2023. At $WORK, we have many customers running on distros that didn't have embedded BTF, such as RHEL 7 (yes, they pay for extended maintenance).
  I really wish customers would update to a newer distro, but I also understand why they don't. So it's up to me to adapt.
  > You have to pick a minimum supported version and write against that.
  What we end up doing is progressively enabling features based on what's available in the kernel. Every eBPF program we write is compiled multiple times with a couple of different flags to enable/disable certain features. It works decently well, and allows using the most capable data structures/helpers based on the kernel version.
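  A rough userspace sketch of what picking a precompiled variant could look like, in C. This is an illustration under assumptions, not the setup described above: the variant names are hypothetical, and the two thresholds are just well-known mainline milestones (bounded loops in 5.3, the BPF ring buffer in 5.8).

  ```c
  #include <assert.h>
  #include <stdio.h>
  #include <sys/utsname.h>

  /* Same packing as the kernel's KERNEL_VERSION() macro. */
  #define KVER(a, b, c) (((a) << 16) + ((b) << 8) + ((c) > 255 ? 255 : (c)))

  /* Hypothetical build variants, each compiled with different feature flags. */
  enum probe_variant {
      VARIANT_BASELINE,      /* oldest supported kernels */
      VARIANT_BOUNDED_LOOPS, /* verifier accepts bounded loops (mainline 5.3+) */
      VARIANT_RINGBUF        /* BPF ring buffer map available (mainline 5.8+) */
  };

  /* Parse a release string such as "5.10.0-21-amd64". Returns 0 on failure. */
  unsigned int kver_from_release(const char *release)
  {
      unsigned int major = 0, minor = 0, patch = 0;

      assert(release != NULL);
      if (sscanf(release, "%u.%u.%u", &major, &minor, &patch) < 2)
          return 0;
      return KVER(major, minor, patch);
  }

  /* Pick the most capable variant the given kernel version should accept. */
  enum probe_variant pick_variant(unsigned int kver)
  {
      if (kver >= KVER(5, 8, 0))
          return VARIANT_RINGBUF;
      if (kver >= KVER(5, 3, 0))
          return VARIANT_BOUNDED_LOOPS;
      return VARIANT_BASELINE;
  }

  /* Convenience wrapper: query the running kernel via uname(2). */
  enum probe_variant pick_variant_for_running_kernel(void)
  {
      struct utsname u;

      if (uname(&u) != 0)
          return VARIANT_BASELINE; /* be conservative if uname fails */
      return pick_variant(kver_from_release(u.release));
  }
  ```

  A plain version check like this is only a first approximation. Distro kernels with backported eBPF features report an old version number while supporting newer features, so probing for the feature itself (for example, trying to create the map type) is likely more robust.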

jeffrallen 7 hours ago

Feels like yet another example of "essential complexity driven by too much churn in infrastructural code".

I wonder why no one needs to write this article about dtrace probes? Is it because they are less used? Less capable? More stable? Better engineered?

Probably all of the above, alas.

heinrichhartman 6 hours ago

From my experience most DTrace users rely on DTrace "providers" [1] and Static Trace Points [2] rather than directly probing kernel structs. Also these days the Solaris kernel is not moving all that much.
[1] https://www.illumos.org/books/dtrace/chp-syscall.html#chp-sy...
[2] https://www.illumos.org/books/dtrace/chp-sdt.html#chp-sdt
- toast0 an hour ago
  
  DTrace isn't limited to Solaris. Per Wikipedia, it's in FreeBSD, NetBSD, macOS (but you can't use it with SIP enabled), and Windows. And lots of userland stuff too.