In my personal experience, buffered and unbuffered writers are different enough that I think it’s a bit of a mistake to make them indistinguishable to the type system. An unbuffered writer sends the data out of process immediately. A buffered writer usually doesn’t, so sleeping after a write (or just doing something else and not writing more for a while) will delay the write indefinitely. An unbuffered writer does not do this.
This means that plenty of algorithms are correct with unbuffered writers and are incorrect with buffered writers. I’ve been bitten by this and diagnosed bugs caused by this multiple times.
Meanwhile an unbuffered writer has abysmal performance if you write a byte at a time.
I’d rather see an interface (trait, abstract class, whatever the language calls it) for a generic writer, with appropriate warnings that you probably don’t want to use it unless you take specific action to address its shortcomings, and subtypes for buffered and unbuffered writers.
And there could be a conditional buffering wrapper that temporarily adds buffering to a generic writer and is zero-cost if applied to an already buffered writer. A language with enforced borrowing semantics like Rust could make this very hard to misuse. But even Python could do it decently well, e.g.:
    w: MaybeBufferedByteWriter
    with io.LocalBuffer(w) as bufwriter:
        bufwriter.write(b"some bytes")  # lands in the temporary buffer; flushed when the block exits
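To make the idea concrete, here is a minimal Python sketch of what such a wrapper could look like. LocalBuffer is hypothetical (not an existing io API), and the "already buffered" check is just one possible heuristic; the pieces it builds on (io.BufferedWriter, detach) are standard library.

    import io

    class LocalBuffer:
        """Hypothetical: temporarily buffer a raw writer for the duration of a
        `with` block; a no-op (zero cost) if the writer is already buffered."""

        def __init__(self, writer, size=io.DEFAULT_BUFFER_SIZE):
            self._writer = writer
            self._size = size
            self._tmp = None

        def __enter__(self):
            if isinstance(self._writer, io.BufferedIOBase):
                return self._writer          # already buffered: hand it back unchanged
            self._tmp = io.BufferedWriter(self._writer, buffer_size=self._size)
            return self._tmp

        def __exit__(self, exc_type, exc, tb):
            if self._tmp is not None:
                self._tmp.flush()
                self._tmp.detach()           # release the raw writer without closing it
            return False

The key property is the early return in __enter__: applying it to an already buffered writer costs nothing, so callers can wrap defensively.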
> This means that plenty of algorithms are correct with unbuffered writers and are incorrect with buffered writers. I’ve been bitten by this and diagnosed bugs caused by this multiple times.
But write() on POSIX is also a buffered API. Until your program calls fsync / fdatasync, Linux isn't required to actually flush anything to the underlying storage medium. And even then, many consumer storage devices will lie and return from fsync immediately, before data has actually been flushed.
All the OSes that I know of will eagerly write data instead of waiting for fsync, but there's no guarantee the data will be persisted by the time your write() call returns. It usually isn't. If you're relying on write() to durably flush data to disk, you've probably got correctness / data corruption bugs lurking in your code that will show up if power goes out at the wrong time.
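For illustration, durable persistence in Python takes an explicit fsync after the write (a small sketch; the file name is made up):

    import os

    def durable_append(path: str, data: bytes) -> None:
        fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o644)
        try:
            os.write(fd, data)   # immediately visible to other processes on this mount
            os.fsync(fd)         # only after this returns is the data (probably) on disk
        finally:
            os.close(fd)

    durable_append("journal.log", b"record\n")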
I’m not talking about data loss if the host crashes. I’m talking about a much broader sense of correctness.
Imagine an RPC server that uses persistent connections. If you reply using a buffered writer and forget to flush, then your tail latency blows up, possibly to infinity. It’s very easy to imagine situations involving multiple threads or processes that simply don’t work if buffers aren’t flushed on time.
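Something like this Python-flavoured sketch (the framing is invented, but the failure mode is real):

    import socket

    def serve(conn: socket.socket) -> None:
        out = conn.makefile("wb", buffering=8192)   # buffered writer, lives as long as the connection
        for request in iter(lambda: conn.recv(4096), b""):
            out.write(b"response for " + request)
            out.flush()   # forget this and the reply sits in the buffer until it fills; the client just waits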
Imagine a program that is intended to crash on error but writes a log message first. If it buffers and doesn’t flush, then the log message will almost always get lost, whereas if the writer is unbuffered or is flushed, then the message will usually not get lost.
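And the crash-logging case, sketched in Python ("crash.log" is made up):

    import os

    log = open("crash.log", "a", buffering=8192)   # block-buffered log writer

    def fatal(msg: str) -> None:
        log.write(msg + "\n")
        log.flush()    # without this, the message usually dies in the buffer...
        os.abort()     # ...because a hard crash never flushes userspace buffers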
I wouldn't call that "buffered", since a `write` is guaranteed to be visible immediately to other processes that can see the same mount. It's only the disk that still needs to be informed to really pick up (could we say "read"?) the changes.
This has bitten me a few times when a Linux system crashed, so there was no final call to fsync, implicit or otherwise.
Great point. It's like the early days when remote procedure calls were intended to happen "transparently", but the fact that networking is involved in some procedure calls and not others makes them very different in key ways that should not be hidden.
I absolutely agree, and I'd add that the ergonomics of the new interface feel very awkward and almost leaky.
Buffered and unbuffered IO should just be entirely separate things, with separate interfaces. Then, as you mention, the standard library can provide an adapter in at least one direction, maybe both.
This seems like a blunder to me.
It's an interesting choice, but every writer now needs to handle:
1) vectored i/o (array of arrays, lots of fun for cache lines)
2) buffering
3) a splat optimization for compression? (skipped over in this post, but mentioned in an earlier one)
I'm skeptical here, but I guess we will see if adding this overhead on all I/O is a win. Devirtualization helps _sometimes_ but when you've got larger systems it's entirely possible you've got sync and async I/O in the same optimization space and lose out on optimization opportunities.
In practice, I/O stacks tend to consist of a lot of composition, and in many cases, leak a lot of abstractions. Buffering is one part, corking/backpressure is another (neither of which is handled here, but I might be mistaken). In some cases, you've got meaningful framing on streams that needs to be maintained (or decorated with metadata).
If it works out, I suppose this will be a new I/O paradigm. In fairness, nobody has _really_ solved I/O yet, so maybe a brave new swing is what we need.
I agree with your last point about the lack of composition here.
While it's true that writers need to be aware of buffering to make use of fancy syscalls, implementing that should be an option, but not a requirement.
Naively this would mean implementing one of two APIs in an interface, which ruins the direct performance. So I see why the choice was made, but I still hope for something better.
It's probably not possible with Zig's current capabilities, but I would ideally like to see a solution that:
- Allows implementations to know at comptime what the interface actually implements, and optimize for that (is buffering supported? Can you get access to the buffer in place for zero copy?).
- For the generic version (which is in the vtable), choose one of the methods and wrap it (at comptime).
There's so many directions to take Zig into (more types? more metaprogramming? closer to metal?) so it's always interesting to see new developments!
> While it's true that writers need to be aware of buffering to make use of fancy syscalls, implementing that should be an option, but not a requirement.
Buffering is implemented and handled in the vtable struct itself; the writers (implementations of the interface) don't actually have to know or care about it, other than passing through the user-provided buffer when initializing the vtable.
If you don't want buffering, you can pass a zero-length buffer upon creation, and it'll get optimized out. This optimization doesn't require devirtualization because the buffering happens before any virtual function calls.
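As a rough analogy in Python (not the actual Zig API, just the shape of the design): the base type owns the buffer and only makes a "virtual" call when it drains, and a zero-capacity buffer degenerates to calling drain directly.

    class Writer:
        def __init__(self, buffer_size: int = 0):
            self._buf = bytearray()
            self._cap = buffer_size

        def write(self, data: bytes) -> None:
            if self._cap == 0:                  # zero-length buffer: unbuffered path
                self.drain(data)
                return
            self._buf += data
            if len(self._buf) >= self._cap:     # only now do we cross the "vtable" boundary
                self.flush()

        def flush(self) -> None:
            if self._buf:
                self.drain(bytes(self._buf))
                self._buf.clear()

        def drain(self, data: bytes) -> None:   # the one method concrete writers implement
            raise NotImplementedError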
I wonder if this change will improve the design of buffering across IO implementers, because buffering now needs consideration up front rather than being treated as a feature bolted on the side.
It’s a good sacrifice if the redesign, whilst more complicated, avoids an oversimplified abstraction that ends up restricting optimisation opportunities.
I'm not usually in the "defending C++" camp, but when I see this:
...I can't help but think of this:

Writing code to build a vtable and having it implicitly run at compile time is pretty neat, though!

The Zig std library often builds vtables for structs in an effort to minimize runtime cost for the typical non-virtual cases. I feel it leads to a lot of boilerplate just to get virtual functions. Worse, you have to study the Zig code on a case-by-case basis to figure out how to even use this ad-hoc virtual function scheme. Surely Zig can introduce virtual function support in a more ergonomic way than this, given how widely it's used in real-life code and throughout Zig's own std library.