I would concur. The author also mentions that file IO isn't async, which makes me conclude that the author hasn't grasped how much the Linux kernel has moved on since the mid-2000s.
I suspect that the author has an incomplete view of the current kernel / userland interface as well as the inner workings of how a kernel actually "does what it does".
The Linux kernel isn't "primarily" sync. The kernel itself has been almost entirely async except for some small parts, like the bottom half of interrupt handlers, which can't be pre-empted. Even in the early days of Linux the scheduler was pre-emptive and used "wait channels" to create a synchronous interface for the asynchronous kernel. Back then, if a process was in kernel mode it couldn't be pre-empted until it returned to user mode, but this is no longer the case, and with PREEMPT_RT that holds even for RT processes / kernel tasks.
Now, the POSIX API is largely synchronous, but this is mainly because of the history of UNIX (and Multics before it).
That particular paragraph could be phrased better, so I'll adjust it. What I meant to say is that you can't handle file IO asynchronously the way you can with sockets, i.e. by polling for readiness. For file IO, reads and writes are always reported as ready, which makes epoll/kqueue/etc. effectively useless.
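For anyone who wants to see the "always ready" behaviour for themselves, here is a minimal sketch using Python's select.select() on a regular file. The helper name regular_file_readiness is made up for this example, and it assumes a POSIX system (on Windows, select only accepts sockets).

```python
import os
import select
import tempfile


def regular_file_readiness(path):
    """Return (readable, writable) as reported by select() with a zero timeout.

    Hypothetical helper for illustration; not part of any library.
    """
    fd = os.open(path, os.O_RDWR | os.O_NONBLOCK)
    try:
        # A timeout of 0 means "report the current state, do not wait".
        readable, writable, _ = select.select([fd], [fd], [], 0)
        return (fd in readable, fd in writable)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"hello")
        tmp.flush()
        print(regular_file_readiness(tmp.name))
```

On Linux I'd expect both flags to come back true whether or not the data is actually in the page cache, which is exactly why the readiness model tells you nothing useful for disk files. O_NONBLOCK is passed only to make the point that it has no effect here.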
With io_uring, you don't poll a file descriptor for readiness and then attempt a non-blocking operation and hope it works. You submit a request and later get a response.
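To make the difference in shape concrete: Python's standard library has no io_uring binding, so the sketch below only imitates the submit-then-complete pattern with a thread pool and os.pread. It is not io_uring and has none of its performance properties; submit_reads and reap are names invented for this example.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor, as_completed


def submit_reads(pool, fd, requests):
    """Queue one positional read per (offset, length); return {future: offset}."""
    return {
        pool.submit(os.pread, fd, length, offset): offset
        for offset, length in requests
    }


def reap(pending):
    """Collect completions in whatever order they finish; return {offset: bytes}."""
    return {pending[fut]: fut.result() for fut in as_completed(pending)}


if __name__ == "__main__":
    with tempfile.TemporaryFile() as tmp:
        tmp.write(b"0123456789")
        tmp.flush()
        with ThreadPoolExecutor(max_workers=2) as pool:
            pending = submit_reads(pool, tmp.fileno(), [(0, 4), (4, 6)])
            # The caller is free to do other work here before reaping.
            results = reap(pending)
        print(results)
```

Note there is no "is it ready yet?" step anywhere: the caller hands over the whole request up front and only ever sees finished results, which is the part that doesn't map onto an epoll-style event loop.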
To be fair to the author, io_uring has not made its way very far into userspace libraries yet. Even if the language you use has async primitives, and it’s on paper a great fit for io_uring, there’s a slim to zero chance the underlying libraries you’re using are actually using it yet. (Unless you’ve gone out of your way to target io_uring and write your own code for it.)
Yeah, it's a huge challenge. A big part of the overhead of read(2) is that historically each read required a buffer allocated to it before you start the slow request, and the way languages implemented non-blocking I/O followed that API. io_uring avoids the problem with buffer pools, but that completely changes the I/O API that most languages have settled on.
io_uring can poll filesystem descriptors for readiness. epoll, select, etc. come from the networking world, and aio etc. come from the disk world, so both have their own blinkers / limitations.
And I guess I could have phrased it better when I mentioned newer Linux developments, as io_uring does a far more comprehensive job of handling not just filesystem descriptors and network descriptors, but also other types of character devices.