If you’re working on Linux systems and need to handle a lot of I/O operations efficiently, you’ve probably heard about io_uring. I’ve been using it in my Zig projects for a while now.
io_uring first appeared in Linux 5.1, and it’s quickly become the go-to solution for modern Linux applications that need to handle I/O operations. I’ve switched from traditional I/O methods to io_uring in my projects, and the performance improvements have been pretty impressive. Here’s why I love using it:
- Performance: On fast storage and networks, io_uring can sustain millions of I/O operations per second with microsecond-level latency. Perfect for those high-throughput applications where every microsecond counts.
- Reduced System Calls: Remember how we used to worry about system call overhead? io_uring cuts most of it by using shared memory rings between user space and kernel space: a single submit call can hand over a whole batch of requests, and with kernel-side polling even that call can be skipped. It’s like having a direct line to the kernel!
- Unified API: Whether you’re doing file I/O, network operations, or something else, io_uring handles it all through a unified interface. No more juggling different APIs for different types of I/O.
- Asynchronous Design: Want kernel-side polling? Fixed buffers? Zero-copy operations? io_uring has got you covered. It’s like having a Swiss Army knife for I/O operations.
- Scaling I/O: Working with multiple cores? High concurrency? io_uring is designed for massive concurrency and parallelism.
For my latest Zig library, I decided to use io_uring through liburing (the C library wrapper) because, well, why reinvent the wheel when you can leverage these awesome features? Below, I’ve documented the API with practical Zig code examples that I’ve used in my projects. I’ve included everything from basic setup to advanced features, along with real-world examples that I’ve found useful.
Liburing API
- Setup and Teardown
- Submission Queue Management
- Completion Queue Management
- I/O Operations
- File Operations
- Network Operations
- Buffer Management
- Event Management
Setup and Teardown
io_uring_queue_init
int io_uring_queue_init(unsigned entries, struct io_uring *ring, unsigned flags);
Initializes an io_uring instance with the specified number of entries.
Parameters:
- entries: Number of submission queue entries (by default the completion queue is sized at twice this number)
- ring: Pointer to the io_uring structure to initialize
- flags: Initialization flags (IORING_SETUP_IOPOLL, IORING_SETUP_SQPOLL, etc.)
Returns:
- 0 on success
- Negative error code on failure
Example:
var ring: c.io_uring = undefined;
const entries: u32 = 8;
const ret = c.io_uring_queue_init(entries, &ring, 0);
if (ret < 0) {
    // Handle error
}
defer c.io_uring_queue_exit(&ring);
io_uring_queue_exit
void io_uring_queue_exit(struct io_uring *ring);
Cleans up an io_uring instance.
Parameters:
- ring: Pointer to the io_uring structure to clean up
Submission Queue Management
io_uring_get_sqe
struct io_uring_sqe *io_uring_get_sqe(struct io_uring *ring);
Gets the next available submission queue entry.
Parameters:
- ring: Pointer to the io_uring structure
Returns:
- Pointer to an SQE on success
- NULL if no SQE is available
Example:
const sqe = c.io_uring_get_sqe(&ring);
if (sqe == null) {
    // Handle error (the submission queue is full)
}
io_uring_submit
int io_uring_submit(struct io_uring *ring);
Submits the prepared SQEs to the kernel.
Parameters:
- ring: Pointer to the io_uring structure
Returns:
- Number of SQEs submitted on success
- Negative error code on failure
Example:
const ret = c.io_uring_submit(&ring);
if (ret < 0) {
    // Handle error
}
Completion Queue Management
io_uring_wait_cqe
int io_uring_wait_cqe(struct io_uring *ring, struct io_uring_cqe **cqe_ptr);
Waits for a completion queue entry to be available.
Parameters:
- ring: Pointer to the io_uring structure
- cqe_ptr: Pointer to store the CQE pointer
Returns:
- 0 on success
- Negative error code on failure
Example:
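var cqe_ptr: ?*c.io_uring_cqe = null;
const ret = c.io_uring_wait_cqe(&ring, &cqe_ptr);
if (ret < 0 or cqe_ptr == null) {
    // Handle error
}
const cqe = cqe_ptr.?;
// cqe.res holds the operation's result (a negative errno on failure).
// Mark the entry as consumed once you are done with it.
defer c.io_uring_cqe_seen(&ring, cqe);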
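To tie submission and completion together, here is a minimal sketch of one full round trip. It assumes liburing's headers are installed, that `c` is a `@cImport` of `liburing.h`, and that the program is built with something like `zig build-exe demo.zig -lc -luring`. The helper name `nopRoundTrip` and the tag value are made up for illustration, and a no-op request stands in for real I/O.

```zig
const std = @import("std");
const c = @cImport({
    @cInclude("liburing.h");
});

/// Submits one no-op request tagged with `tag` and returns the tag that
/// comes back in the matching completion. (Hypothetical helper.)
fn nopRoundTrip(ring: *c.io_uring, tag: u64) !u64 {
    const sqe = c.io_uring_get_sqe(ring);
    if (sqe == null) return error.SubmissionQueueFull;
    c.io_uring_prep_nop(sqe);
    sqe.*.user_data = tag; // copied unchanged into the CQE

    if (c.io_uring_submit(ring) < 0) return error.SubmitFailed;

    var cqe: [*c]c.io_uring_cqe = null;
    if (c.io_uring_wait_cqe(ring, &cqe) < 0 or cqe == null) return error.WaitFailed;
    // Mark the entry as consumed so its slot in the completion queue can be reused.
    defer c.io_uring_cqe_seen(ring, cqe);

    if (cqe.*.res < 0) return error.OperationFailed;
    return cqe.*.user_data;
}

pub fn main() !void {
    var ring: c.io_uring = undefined;
    if (c.io_uring_queue_init(8, &ring, 0) < 0) return error.QueueInitFailed;
    defer c.io_uring_queue_exit(&ring);

    const tag = try nopRoundTrip(&ring, 42);
    std.debug.print("completed request with user_data={d}\n", .{tag});
}
```

The `user_data` field is how a completion is matched back to the request that produced it, which matters as soon as more than one request is in flight.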
I/O Operations
io_uring_prep_readv
void io_uring_prep_readv(struct io_uring_sqe *sqe, int fd, const struct iovec *iovecs, unsigned nr_vecs, off_t offset);
Prepares a readv operation.
Parameters:
- sqe: Submission queue entry to prepare
- fd: File descriptor to read from
- iovecs: Array of iovec structures
- nr_vecs: Number of iovec structures
- offset: File offset to read from
Example:
var buffer: [1024]u8 = undefined;
var iov: c.iovec = .{
    .iov_base = &buffer[0],
    .iov_len = buffer.len,
};
c.io_uring_prep_readv(sqe, fd, &iov, 1, 0);
io_uring_prep_writev
void io_uring_prep_writev(struct io_uring_sqe *sqe, int fd, const struct iovec *iovecs, unsigned nr_vecs, off_t offset);
Prepares a writev operation.
Parameters:
- sqe: Submission queue entry to prepare
- fd: File descriptor to write to
- iovecs: Array of iovec structures
- nr_vecs: Number of iovec structures
- offset: File offset to write to
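Since writev has no example of its own, here is a rough sketch that gathers two buffers into a single write and reads them back with readv. A pipe is used only to keep the example self-contained. The helper names `submitAndWait` and `gatherThroughPipe` are hypothetical, and the sketch assumes `c` imports `liburing.h` and the program links against libc and liburing.

```zig
const std = @import("std");
const c = @cImport({
    @cInclude("liburing.h");
    @cInclude("unistd.h"); // pipe(), close()
});

/// Submits everything queued so far, waits for one completion, and returns
/// its result (the byte count for reads and writes).
fn submitAndWait(ring: *c.io_uring) !usize {
    if (c.io_uring_submit(ring) < 0) return error.SubmitFailed;
    var cqe: [*c]c.io_uring_cqe = null;
    if (c.io_uring_wait_cqe(ring, &cqe) < 0 or cqe == null) return error.WaitFailed;
    defer c.io_uring_cqe_seen(ring, cqe);
    if (cqe.*.res < 0) return error.OperationFailed;
    const n: usize = @intCast(cqe.*.res);
    return n;
}

/// Gathers two buffers into one writev, then reads the bytes back with readv.
fn gatherThroughPipe(ring: *c.io_uring, dst: []u8) ![]u8 {
    var fds: [2]c_int = undefined;
    if (c.pipe(&fds) != 0) return error.PipeFailed;
    defer _ = c.close(fds[0]);
    defer _ = c.close(fds[1]);

    var part1 = "hello, ".*;
    var part2 = "io_uring".*;
    var src = [_]c.iovec{
        .{ .iov_base = &part1, .iov_len = part1.len },
        .{ .iov_base = &part2, .iov_len = part2.len },
    };
    const wsqe = c.io_uring_get_sqe(ring);
    if (wsqe == null) return error.SubmissionQueueFull;
    c.io_uring_prep_writev(wsqe, fds[1], &src, src.len, 0);
    _ = try submitAndWait(ring);

    var sink = [_]c.iovec{.{ .iov_base = dst.ptr, .iov_len = dst.len }};
    const rsqe = c.io_uring_get_sqe(ring);
    if (rsqe == null) return error.SubmissionQueueFull;
    c.io_uring_prep_readv(rsqe, fds[0], &sink, sink.len, 0);
    const n = try submitAndWait(ring);
    return dst[0..n];
}

pub fn main() !void {
    var ring: c.io_uring = undefined;
    if (c.io_uring_queue_init(8, &ring, 0) < 0) return error.QueueInitFailed;
    defer c.io_uring_queue_exit(&ring);

    var buf: [64]u8 = undefined;
    const text = try gatherThroughPipe(&ring, &buf);
    std.debug.print("read back: {s}\n", .{text});
}
```

Note that the iovec arrays and the buffers they point to have to stay alive until the request has completed, which is why they live in the same function that waits for the completion.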
File Operations
io_uring_prep_openat
void io_uring_prep_openat(struct io_uring_sqe *sqe, int dfd, const char *path, int flags, mode_t mode);
Prepares an openat operation.
Parameters:
- sqe: Submission queue entry to prepare
- dfd: Directory file descriptor
- path: Path to open
- flags: Open flags
- mode: File mode
Example:
c.io_uring_prep_openat(sqe, cwd.fd, "test.txt", c.O_RDONLY, 0);
io_uring_prep_close
void io_uring_prep_close(struct io_uring_sqe *sqe, int fd);
Prepares a close operation.
Parameters:
- sqe: Submission queue entry to prepare
- fd: File descriptor to close
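The open and close operations pair up naturally: the completion of an openat carries the new file descriptor in its result, which can then be handed to a close request. The sketch below assumes a kernel recent enough to support both operations through io_uring, that `c` imports `liburing.h`, and that `/proc/self/exe` exists. That path is chosen only because it is present on most Linux systems, and the helper names are hypothetical.

```zig
const std = @import("std");
const c = @cImport({
    @cInclude("liburing.h");
});

/// Submits the queued request, waits for its completion, and returns the
/// raw result, turning negative results into a Zig error.
fn submitAndWait(ring: *c.io_uring) !i32 {
    if (c.io_uring_submit(ring) < 0) return error.SubmitFailed;
    var cqe: [*c]c.io_uring_cqe = null;
    if (c.io_uring_wait_cqe(ring, &cqe) < 0 or cqe == null) return error.WaitFailed;
    defer c.io_uring_cqe_seen(ring, cqe);
    if (cqe.*.res < 0) return error.OperationFailed;
    return cqe.*.res;
}

/// Opens `path` read-only relative to the current directory. The new file
/// descriptor is delivered as the completion's result.
fn openThroughRing(ring: *c.io_uring, path: [*:0]const u8) !i32 {
    const sqe = c.io_uring_get_sqe(ring);
    if (sqe == null) return error.SubmissionQueueFull;
    // The path string must stay valid until the request has been submitted.
    c.io_uring_prep_openat(sqe, c.AT_FDCWD, path, c.O_RDONLY, 0);
    return try submitAndWait(ring);
}

/// Closes `fd` through the ring instead of calling close() directly.
fn closeThroughRing(ring: *c.io_uring, fd: i32) !void {
    const sqe = c.io_uring_get_sqe(ring);
    if (sqe == null) return error.SubmissionQueueFull;
    c.io_uring_prep_close(sqe, fd);
    _ = try submitAndWait(ring);
}

pub fn main() !void {
    var ring: c.io_uring = undefined;
    if (c.io_uring_queue_init(8, &ring, 0) < 0) return error.QueueInitFailed;
    defer c.io_uring_queue_exit(&ring);

    const fd = try openThroughRing(&ring, "/proc/self/exe");
    std.debug.print("opened fd {d}\n", .{fd});
    try closeThroughRing(&ring, fd);
}
```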
Network Operations
io_uring_prep_accept
void io_uring_prep_accept(struct io_uring_sqe *sqe, int fd, struct sockaddr *addr, socklen_t *addrlen, int flags);
Prepares an accept operation.
Parameters:
- sqe: Submission queue entry to prepare
- fd: Socket file descriptor
- addr: Pointer to store the client address
- addrlen: Pointer to store the address length
- flags: Accept flags
io_uring_prep_connect
void io_uring_prep_connect(struct io_uring_sqe *sqe, int fd, const struct sockaddr *addr, socklen_t addrlen);
Prepares a connect operation.
Parameters:
- sqe: Submission queue entry to prepare
- fd: Socket file descriptor
- addr: Server address
- addrlen: Address length
Buffer Management
io_uring_register_buffers
int io_uring_register_buffers(struct io_uring *ring, const struct iovec *iovecs, unsigned nr_iovecs);
Registers buffers for fixed I/O operations.
Parameters:
- ring: Pointer to the io_uring structure
- iovecs: Array of iovec structures
- nr_iovecs: Number of iovec structures
Returns:
- 0 on success
- Negative error code on failure
Event Management
io_uring_prep_poll_add
void io_uring_prep_poll_add(struct io_uring_sqe *sqe, int fd, unsigned poll_mask);
Prepares a poll operation.
Parameters:
- sqe: Submission queue entry to prepare
- fd: File descriptor to poll
- poll_mask: Poll events to monitor
Example:
c.io_uring_prep_poll_add(sqe, fd, c.POLLIN);
io_uring_prep_timeout
void io_uring_prep_timeout(struct io_uring_sqe *sqe, struct __kernel_timespec *ts, unsigned count, unsigned flags);
Prepares a timeout operation.
Parameters:
- sqe: Submission queue entry to prepare
- ts: Timeout duration
- count: Number of completions to wait for
- flags: Timeout flags
Example:
var ts: c.__kernel_timespec = .{
    .tv_sec = 1,
    .tv_nsec = 0,
};
c.io_uring_prep_timeout(sqe, &ts, 1, 0);
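One detail that is easy to miss: a timeout that fires because its timer expired completes with `-ETIME` in `cqe.res`, so that value is the expected outcome rather than a failure. The following sketch arms a pure timer (a `count` of 0, so other completions are not counted) and inspects the result. The helper name `waitTimer` is invented for this example, and `c` is assumed to import `liburing.h`.

```zig
const std = @import("std");
const c = @cImport({
    @cInclude("liburing.h");
});

/// Arms a timer for `ms` milliseconds and returns the completion result.
/// (Hypothetical helper for illustration.)
fn waitTimer(ring: *c.io_uring, ms: i64) !i32 {
    var ts: c.__kernel_timespec = .{
        .tv_sec = @divTrunc(ms, 1000),
        .tv_nsec = @rem(ms, 1000) * std.time.ns_per_ms,
    };
    const sqe = c.io_uring_get_sqe(ring);
    if (sqe == null) return error.SubmissionQueueFull;
    // count = 0: ignore other completions and behave as a plain timer.
    c.io_uring_prep_timeout(sqe, &ts, 0, 0);
    if (c.io_uring_submit(ring) < 0) return error.SubmitFailed;

    var cqe: [*c]c.io_uring_cqe = null;
    if (c.io_uring_wait_cqe(ring, &cqe) < 0 or cqe == null) return error.WaitFailed;
    defer c.io_uring_cqe_seen(ring, cqe);
    return cqe.*.res;
}

pub fn main() !void {
    var ring: c.io_uring = undefined;
    if (c.io_uring_queue_init(8, &ring, 0) < 0) return error.QueueInitFailed;
    defer c.io_uring_queue_exit(&ring);

    const res = try waitTimer(&ring, 50);
    if (res == -c.ETIME) {
        std.debug.print("timer expired\n", .{});
    } else {
        std.debug.print("unexpected result: {d}\n", .{res});
    }
}
```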
Advanced Features
Fixed Buffers
Fixed buffers allow you to register a set of buffers with the kernel, which can then be referenced by index in your I/O operations. The kernel pins and maps the memory once at registration time, so each request skips that per-I/O setup cost.
int io_uring_register_buffers(struct io_uring *ring, const struct iovec *iovecs, unsigned nr_iovecs);
int io_uring_unregister_buffers(struct io_uring *ring);
Example:
// Register buffers
var buffers: [2]c.iovec = .{
    .{
        .iov_base = &buffer1[0],
        .iov_len = buffer1.len,
    },
    .{
        .iov_base = &buffer2[0],
        .iov_len = buffer2.len,
    },
};
if (c.io_uring_register_buffers(&ring, &buffers, buffers.len) < 0) {
    // Handle error
}
// Use fixed buffers in read operation
c.io_uring_prep_read_fixed(sqe, fd, &buffer1[0], buffer1.len, 0, 0); // last argument is the buffer index
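For a version of the fixed-buffer flow that can be run end to end, here is a sketch under a few assumptions: `c` imports `liburing.h`, the program links against libc and liburing, and `/proc/self/exe` is readable (it is used only as a file that is likely to exist). The helper name `readFixed` is hypothetical.

```zig
const std = @import("std");
const c = @cImport({
    @cInclude("liburing.h");
    @cInclude("unistd.h"); // close()
});

/// Reads from the start of `fd` into `buf`, which must lie inside the
/// registered buffer at `index`. Returns the number of bytes read.
fn readFixed(ring: *c.io_uring, fd: c_int, buf: []u8, index: c_int) !usize {
    const sqe = c.io_uring_get_sqe(ring);
    if (sqe == null) return error.SubmissionQueueFull;
    c.io_uring_prep_read_fixed(sqe, fd, buf.ptr, @intCast(buf.len), 0, index);
    if (c.io_uring_submit(ring) < 0) return error.SubmitFailed;

    var cqe: [*c]c.io_uring_cqe = null;
    if (c.io_uring_wait_cqe(ring, &cqe) < 0 or cqe == null) return error.WaitFailed;
    defer c.io_uring_cqe_seen(ring, cqe);
    if (cqe.*.res < 0) return error.OperationFailed;
    const n: usize = @intCast(cqe.*.res);
    return n;
}

pub fn main() !void {
    var ring: c.io_uring = undefined;
    if (c.io_uring_queue_init(8, &ring, 0) < 0) return error.QueueInitFailed;
    defer c.io_uring_queue_exit(&ring);

    // Register one buffer; it becomes registered buffer index 0.
    var storage: [4096]u8 = undefined;
    var iov = [_]c.iovec{.{ .iov_base = &storage, .iov_len = storage.len }};
    if (c.io_uring_register_buffers(&ring, &iov, iov.len) < 0) return error.RegisterFailed;
    defer _ = c.io_uring_unregister_buffers(&ring);

    const fd = c.open("/proc/self/exe", c.O_RDONLY);
    if (fd < 0) return error.OpenFailed;
    defer _ = c.close(fd);

    const n = try readFixed(&ring, fd, &storage, 0);
    std.debug.print("read {d} bytes via registered buffer 0\n", .{n});
}
```

The address passed to `io_uring_prep_read_fixed` still has to point inside the registered region; the index tells the kernel which registration to check it against.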
Kernel-Side Polling
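Kernel-side polling (IORING_SETUP_SQPOLL) starts a kernel thread that watches the submission queue, so the kernel picks up new requests without a system call for each submission, reducing latency. Don’t confuse it with IORING_SETUP_IOPOLL, which busy-polls the device for completions instead of waiting for interrupts and only works on files opened with O_DIRECT.
// Enable kernel-side submission polling during initialization
const ret = c.io_uring_queue_init(entries, &ring, c.IORING_SETUP_SQPOLL);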
Zero-Copy Operations
Zero-copy operations allow data to be transferred between file descriptors without copying through user space. With splice, at least one of the two file descriptors must be a pipe.
// Prepare a splice operation (zero-copy between file descriptors)
c.io_uring_prep_splice(sqe, fd_in, off_in, fd_out, off_out, len, flags);
Performance Considerations
Ring Size
The size of the submission and completion queues affects performance:
- Larger rings allow more operations to be queued
- Smaller rings use less memory
- Typical sizes range from 32 to 4096 entries
Batch Processing
For optimal performance, batch multiple operations:
// Prepare multiple operations
for (0..10) |i| {
    const sqe = c.io_uring_get_sqe(&ring) orelse break;
    c.io_uring_prep_read(sqe, fd, &buffers[i], buffer_size, offset + i * buffer_size);
}
// Submit all at once
_ = c.io_uring_submit(&ring);
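Submitting a batch is only half of the pattern; the completions have to be drained as well. As a rough sketch, the code below queues several requests, hands them over with a single `io_uring_submit_and_wait` call, and then consumes one completion per request. No-op requests stand in for the reads above so the example runs without any files, and the helper name `runBatch` is hypothetical. `c` is assumed to import `liburing.h`.

```zig
const std = @import("std");
const c = @cImport({
    @cInclude("liburing.h");
});

/// Queues up to `count` no-op requests tagged 0, 1, 2, ..., submits them in
/// one call, and returns the sum of the tags seen in the completions.
fn runBatch(ring: *c.io_uring, count: u32) !u64 {
    var queued: u32 = 0;
    while (queued < count) : (queued += 1) {
        const sqe = c.io_uring_get_sqe(ring);
        if (sqe == null) break; // ring is full: submit what we have
        c.io_uring_prep_nop(sqe);
        sqe.*.user_data = queued;
    }

    // One system call submits the whole batch and waits for all of it.
    if (c.io_uring_submit_and_wait(ring, queued) < 0) return error.SubmitFailed;

    var tag_sum: u64 = 0;
    var seen: u32 = 0;
    while (seen < queued) : (seen += 1) {
        var cqe: [*c]c.io_uring_cqe = null;
        if (c.io_uring_wait_cqe(ring, &cqe) < 0 or cqe == null) return error.WaitFailed;
        defer c.io_uring_cqe_seen(ring, cqe);
        if (cqe.*.res < 0) return error.OperationFailed;
        // Completions may arrive in any order; user_data identifies the request.
        tag_sum += cqe.*.user_data;
    }
    return tag_sum;
}

pub fn main() !void {
    var ring: c.io_uring = undefined;
    if (c.io_uring_queue_init(8, &ring, 0) < 0) return error.QueueInitFailed;
    defer c.io_uring_queue_exit(&ring);

    const sum = try runBatch(&ring, 4);
    std.debug.print("sum of completed tags: {d}\n", .{sum});
}
```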
Buffer Management
- Use fixed buffers for frequently accessed data
- Align buffers to page boundaries for optimal performance
- Consider using huge pages for large buffers
Error Handling
Common Error Codes
- -EAGAIN: Resource temporarily unavailable
- -EBADF: Invalid file descriptor
- -EFAULT: Bad address
- -EINVAL: Invalid argument
- -ENOMEM: Out of memory
Error Handling Example
const ret = c.io_uring_submit(&ring);
if (ret < 0) {
    switch (-ret) {
        c.EAGAIN => std.debug.print("Resource temporarily unavailable\n", .{}),
        c.EBADF => std.debug.print("Invalid file descriptor\n", .{}),
        c.EFAULT => std.debug.print("Bad address\n", .{}),
        c.EINVAL => std.debug.print("Invalid argument\n", .{}),
        c.ENOMEM => std.debug.print("Out of memory\n", .{}),
        else => std.debug.print("Unknown error: {}\n", .{ret}),
    }
    return error.IoUringError;
}
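Checking the return value of `io_uring_submit` only catches problems with the submission itself. Failures of an individual operation show up later, as a negative errno in that operation's `cqe.res`. The sketch below shows this with a deliberately invalid file descriptor: the submit succeeds, and the error arrives with the completion. The helper names and the Zig error names are invented for this example, and `c` is assumed to import `liburing.h`.

```zig
const std = @import("std");
const c = @cImport({
    @cInclude("liburing.h");
});

/// Maps a CQE result to a byte count or a Zig error. (Hypothetical helper.)
fn checkResult(res: i32) !usize {
    if (res >= 0) {
        const n: usize = @intCast(res);
        return n;
    }
    return switch (-res) {
        c.EAGAIN => error.WouldBlock,
        c.EBADF => error.BadFileDescriptor,
        c.EFAULT => error.BadAddress,
        c.EINVAL => error.InvalidArgument,
        c.ENOMEM => error.OutOfMemory,
        else => error.Unexpected,
    };
}

/// Queues a readv on file descriptor -1 and returns the checked result.
fn readFromBadFd(ring: *c.io_uring) !usize {
    var buf: [16]u8 = undefined;
    var iov = [_]c.iovec{.{ .iov_base = &buf, .iov_len = buf.len }};
    const sqe = c.io_uring_get_sqe(ring);
    if (sqe == null) return error.SubmissionQueueFull;
    c.io_uring_prep_readv(sqe, -1, &iov, iov.len, 0);

    // The submission is accepted; the failure is reported in the CQE.
    if (c.io_uring_submit(ring) < 0) return error.SubmitFailed;

    var cqe: [*c]c.io_uring_cqe = null;
    if (c.io_uring_wait_cqe(ring, &cqe) < 0 or cqe == null) return error.WaitFailed;
    defer c.io_uring_cqe_seen(ring, cqe);
    return checkResult(cqe.*.res);
}

pub fn main() !void {
    var ring: c.io_uring = undefined;
    if (c.io_uring_queue_init(8, &ring, 0) < 0) return error.QueueInitFailed;
    defer c.io_uring_queue_exit(&ring);

    if (readFromBadFd(&ring)) |n| {
        std.debug.print("read {d} bytes\n", .{n});
    } else |err| {
        std.debug.print("request failed: {s}\n", .{@errorName(err)});
    }
}
```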
Best Practices
- Initialize Once: Create and initialize the io_uring instance once and reuse it
- Batch Operations: Group related operations and submit them together
- Use Fixed Buffers: For frequently accessed data, use fixed buffers
- Handle Errors: Always check return values and handle errors appropriately
- Clean Up: Always call io_uring_queue_exit when done
- Monitor Performance: Use tools like perf to monitor io_uring performance
Reference: This guide is based on the liburing library by Jens Axboe. The liburing library provides a C API for Linux’s io_uring interface, making it easier to use io_uring in applications.
Conclusion
io_uring provides a powerful and efficient way to handle I/O operations in Linux applications. By leveraging its features like scatter/gather I/O, fixed buffers, and kernel-side polling, you can achieve significant performance improvements over traditional I/O methods.
The liburing API makes it easy to use io_uring in your applications, whether you’re writing in C, Zig, or other languages that can interface with C libraries. With the examples and best practices provided in this guide, you should be well-equipped to start using io_uring in your projects.
Happy coding!