Part 3: Putting the F in FFI
10 minute read
If there’s one thing the VA-API documentation doesn’t convey too well, it’s how absolutely massive the API surface is. The main decode loop example makes it look simple and straightforward (“configure, upload, run”), but a lot of complexity hides in the configure and upload phases.
Wrapping all of it is going to be a large project, so some divide-and-conquer will be required. It would be nice to immediately jump into decoding AV1, but as of writing this I am unfortunately unable to jump, so we’ll have to take this one step at a time.
The previous paragraph should give you an idea of how much I’ve procrastinated while writing this post.
Building the staircase to Greatness
As we’ve discussed earlier, VA-API and its modules can be split into two large groups: the core functions (further separable by codec) and the platform-specific glue libraries. It is also capable of much more than video coding, as we will discover, but for now the first priority will be simple media decoding.
- As we would like to be able to test our work as we go, we’ll have to begin by implementing at least one platform-specific environment.
- Once that is done, we can continue by working through the easiest codec on the list until we have a functional decoder for it.
- To verify, we will also need some way to get at the decoded frame data. It will likely be easiest to display the image in a window.
- Once we’re at this point, we can decide whether to extend platform or codec support, or branch out towards other capabilities of the library, like encoding or image processing.
All my computers run Wayland now, so the first platform to implement should either be X11 through XWayland or straight Wayland-native processing. Configuring VAAPI to call directly into the Direct Rendering Manager - skipping the desktop environment - would also be possible, though I would really prefer being able to draw to a window with as little friction as possible, so it wouldn’t be a good first platform.
The platform libraries don’t contain much exposed functionality, so let’s scan through the capabilities of both.
X marks the way(land)
Starting with the more broadly available X11 platform, the header file is at va/va_x11.h
/*
* Returns a suitable VADisplay for VA API
*/
VADisplay vaGetDisplay(Display *dpy);
/*
* Output rendering
* Following is the rendering interface for X windows,
* to get the decode output surface to a X drawable
* It basically performs a de-interlacing (if needed),
* color space conversion and scaling to the destination
* rectangle
*/
VAStatus vaPutSurface(VADisplay dpy, VASurfaceID surface, Drawable draw,
                      short srcx, short srcy, unsigned short srcw, unsigned short srch,
                      short destx, short desty, unsigned short destw, unsigned short desth,
                      VARectangle *cliprects, unsigned int number_cliprects,
                      unsigned int flags);
Nice and simple. A way to convert an X11 display handle into a VA-API-branded display handle, and a monster function to draw a surface to something drawable with a load of additional clipping and mapping capabilities.
For Wayland, the story is a little different, though much better documented:
/**
* \defgroup api_wayland Wayland rendering API
*
* @{
*
* Theory of operations:
* - Create a VA display for an active Wayland display ;
* - Perform normal VA-API operations, e.g. decode to a VA surface ;
* - Get wl_buffer associated to the VA surface ;
* - Attach wl_buffer to wl_surface ;
*/
/**
* \brief Returns a VA display wrapping the specified Wayland display.
*
* This functions returns a (possibly cached) VA display from the
* specified Wayland @display.
*
* @param[in] display the native Wayland display
* @return the VA display
*/
VADisplay
vaGetDisplayWl(struct wl_display *display);
/**
* \brief Returns the Wayland buffer associated with a VA surface.
*
* This function returns a wl_buffer handle that can be used as an
* argument to wl_surface_attach(). This buffer references the
* underlying VA @surface. As such, the VA @surface and Wayland
* @out_buffer have the same size and color format. Should specific
* color conversion be needed, then VA/VPP API can fulfill this
* purpose.
*
* The @flags describe the desired picture structure. This is useful
* to expose a de-interlaced buffer. If the VA driver does not support
* any of the supplied flags, then #VA_STATUS_ERROR_FLAG_NOT_SUPPORTED
* is returned. The following flags are allowed: \c VA_FRAME_PICTURE,
* \c VA_TOP_FIELD, \c VA_BOTTOM_FIELD.
*
* @param[in] dpy the VA display
* @param[in] surface the VA surface
* @param[in] flags the deinterlacing flags
* @param[out] out_buffer a wl_buffer wrapping the VA @surface
* @return VA_STATUS_SUCCESS if successful
*/
VAStatus
vaGetSurfaceBufferWl(VADisplay dpy, VASurfaceID surface,
                     unsigned int flags, struct wl_buffer **out_buffer);
/**
* \brief Returns the Wayland buffer associated with a VA image.
*
* This function returns a wl_buffer handle that can be used as an
* argument to wl_surface_attach(). This buffer references the
* underlying VA @image. As such, the VA @image and Wayland
* @out_buffer have the same size and color format. Should specific
* color conversion be needed, then VA/VPP API can fulfill this
* purpose.
*
* The @flags describe the desired picture structure. See
* vaGetSurfaceBufferWl() description for more details.
*
* @param[in] dpy the VA display
* @param[in] image the VA image
* @param[in] flags the deinterlacing flags
* @param[out] out_buffer a wl_buffer wrapping the VA @image
* @return VA_STATUS_SUCCESS if successful
*/
VAStatus
vaGetImageBufferWl(VADisplay dpy, VAImageID image,
                   unsigned int flags, struct wl_buffer **out_buffer);
This platform doesn’t offer as many clipping and scaling capabilities at presentation time, but I imagine that is mostly because the Wayland protocol is much leaner and leverages GPU programming APIs instead of cramming everything and the print spooler into itself. Though having to attach these buffers to the window manually will mean needing to call into Wayland functions directly, which might add to the complexity.
Another potential issue would be needing to match pixel formats manually, since both buffer conversion functions mention “Should specific color conversion be needed, then VA/VPP can fulfill this purpose.” So for simplicity’s sake let’s stick to X and XWayland for initial development.
Initial Commit
I foresee this project growing quite large, with accompanying tools and multiple crates, so let’s create a workspace for them.
Welcome to the world, Vaudeville. Let’s define the workspace by creating a Cargo.toml file in the root directory:
[workspace]
resolver = "3"

[workspace.package]
version = "0.0.1"
authors = ["Karcsesz <git@karcsesz.hu>"]
description = "A VA-API library for Rust"
license = "MIT OR Apache-2.0"
repository = "https://code.thishorsie.rocks/Karcsesz/vaudeville"
readme = "README.md"
keywords = ["vaapi", "video", "api"]
categories = ["api-bindings", "encoding", "hardware-support", "multimedia"]
edition = "2024"
Resol-what?
Cargo currently has three slightly incompatible behaviours for resolving which versions of dependencies to use and for unifying optional features between two separate imports of the same library, normally decided by the Rust edition a crate uses. However, a workspace has no edition of its own, so Cargo falls back to the backwards-compatible resolver = "1". That could prank us in the long run if we’re expecting the behaviour of resolver version 3 (introduced with the 2024 edition), so it is recommended to set the resolver version explicitly in workspaces. Read more
Note that I’ve specified a lot of package-like values, but instead of putting them in the [package] section, I’ve put them in the [workspace.package] one. This will let the packages in the workspace share these values, pulling from a common source instead of having to duplicate version numbers and licence data for each of them.
If we now add the main crate:
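That’s one cargo invocation from the workspace root, presumably along these lines (assuming a library crate; the exact command wasn’t preserved here):

cargo new vaudeville --lib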
Then when we open the freshly created manifest:
[package]
name = "vaudeville"
version.workspace = true
authors.workspace = true
description.workspace = true
license.workspace = true
repository.workspace = true
readme.workspace = true
keywords.workspace = true
categories.workspace = true
edition.workspace = true
We’ll see that the fields are automatically set to inherit the workspace’s values.
Let’s bang out a temporary README to go along with it.
FFI Separation
I’d also like to separate the FFI bindings. Just going to call it vaudeville-ffi instead of the more idiomatic vaapi-sys, because I will likely tailor it to work with Vaudeville first and foremost, as opposed to the more generic “just bindgen the lib” crate others would expect from something called vaapi-sys. Also, I don’t feel confident taking such an important name in the global namespace.
This latter one shouldn’t just inherit every package field from the workspace, so we’ll need to edit the Cargo.toml for it.
[package]
name = "vaudeville-ffi"
description = "VA-API FFI bindings for use with Vaudeville"
keywords = ["vaapi", "video", "api", "ffi"]
categories = ["external-ffi-bindings", "encoding", "hardware-support", "multimedia"]
version.workspace = true
authors.workspace = true
license.workspace = true
repository.workspace = true
readme.workspace = true
edition.workspace = true

[dependencies]
And now we can start by pasting the macro from the previous part in its rightful place at /vaudeville-ffi/src/macros/dyload.rs.
Rust has no direct syntax for a “pub macro_rules!”; instead, we have to annotate the macro with #[macro_export], which exports it at the crate root.
//! Macro for loading functions from a dynamic library
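For readers who skipped the previous part, here’s a minimal sketch of the shape such a macro can take: it generates one function-pointer type per binding, a struct that owns the loaded libloading::Library together with the resolved pointers, and thin unsafe wrapper methods. The grammar, names and error handling below are illustrative, not the exact macro from last time.

#[macro_export]
macro_rules! dyload {
    (
        $struct_name:ident from $lib_path:literal {
            $(
                fn $func:ident($($arg:ident: $arg_ty:ty),* $(,)?) $(-> $ret:ty)?;
            )+
        }
    ) => {
        // One raw function-pointer type per bound function.
        $(
            pub type $func = unsafe extern "C" fn($($arg_ty),*) $(-> $ret)?;
        )+

        // Owns the library so the resolved pointers stay valid.
        pub struct $struct_name {
            _lib: ::libloading::Library,
            $( $func: $func, )+
        }

        impl $struct_name {
            /// Loads the shared library and resolves every declared symbol.
            pub unsafe fn load() -> Result<Self, ::libloading::Error> {
                let lib = unsafe { ::libloading::Library::new($lib_path) }?;
                Ok(Self {
                    $(
                        $func: {
                            // Copy the raw pointer out; `lib` stays alive inside the struct.
                            let symbol = unsafe { lib.get::<$func>(stringify!($func).as_bytes()) }?;
                            *symbol
                        },
                    )+
                    _lib: lib,
                })
            }

            // Thin wrappers so callers don't have to poke at the fields directly.
            $(
                pub unsafe fn $func(&self, $($arg: $arg_ty),*) $(-> $ret)? {
                    unsafe { (self.$func)($($arg),*) }
                }
            )+
        }
    };
}

With that shape, a platform module boils down to a single dyload! invocation listing the entry points it needs.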
X11 integration
With that done and dusted, we can finally test the macro by binding the X11 platform functions inside /vaudeville-ffi/src/x11.rs:
use std::ffi::{c_short, c_uint, c_ulong, c_ushort, c_void};
use crate::dyload;

dyload! {
    // ...the vaGetDisplay and vaPutSurface declarations from va_x11.h, ported to Rust types...
}
We can use the special typedefs from std::ffi to match most function parameters, but there are a few that we have to port over from C ourselves.
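Concretely, the C parameter types from the two declarations above map over like this (a quick reference sketch):

use std::ffi::{c_short, c_uint, c_ulong, c_ushort, c_void};

// C type            ->  Rust equivalent from std::ffi
// short             ->  c_short
// unsigned short    ->  c_ushort
// unsigned int      ->  c_uint
// unsigned long     ->  c_ulong   (what XID, and therefore Drawable, boils down to)
// void *            ->  *mut c_void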
Display is going to be a special handle only used for this function, so I’ll just define it right in the same file. The C version is imported from X11/Xlib.h:
/*
* Display datatype maintaining display specific data.
* The contents of this structure are implementation dependent.
* A Display should be treated as opaque by application code.
*/
typedef struct _XDisplay Display;
Should be treated as opaque by application code, fair enough. A pub type to *mut c_void should do for now. Drawable is similar, a simple typedef to XID which is a typedef to unsigned long. The rest of the types are going to be used all over the codebase, which warrants better separation.
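Before moving on, the two X11-only aliases in Rust terms look something like this (a sketch; spelling out the intermediate XID alias is purely a readability choice):

use std::ffi::{c_ulong, c_void};

// Xlib's Display is opaque to us, so a raw pointer alias will do for now.
pub type Display = *mut c_void;

// Drawable is a typedef to XID, which in turn is an unsigned long.
pub type XID = c_ulong;
pub type Drawable = XID;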
VADisplay is another simple case, typedef to void*. Now we could also just define this as a *mut c_void too, but that would let users accidentally pass all sorts of funky types also resolving to just *mut c_void without any resistance from the compiler. But if we wrap the pointer in a #[repr(transparent)] struct, it will pass through FFI barriers just the same, while telling the compiler that it is its own distinct type which should be respected as such.
I’ll also toss in a NonNull wrapper, because we don’t want to accidentally mistake vaGetDisplay returning NULL as a valid VADisplay.
use std::{ffi::c_void, ptr::NonNull};

#[repr(transparent)]
pub struct VADisplay(NonNull<c_void>);
This will require a NullableVADisplay to be defined too, to be returned by vaGetDisplay. Here we can make use of a guaranteed optimisation, where Option<NonNull> pointers are guaranteed to have NULL represent None. So we just need to write:
//...
pub type NullableVADisplay = Option<VADisplay>;
Then we can specify that vaGetDisplay can return NULL:
// inside the dyload! declaration list:
fn vaGetDisplay(dpy: Display) -> NullableVADisplay;
VASurfaceID and its friends are typedefs to VAGenericID, which is a typedef to unsigned int. Going to utilise the same newtype trick to separate the two, but with some From implementations to allow opt-in conversion between the base variant and the specialised IDs.
use std::ffi::c_uint;

#[repr(transparent)]
#[derive(Clone, Copy)]
pub struct VAGenericID(c_uint);

#[repr(transparent)]
#[derive(Clone, Copy)]
pub struct VASurfaceID(VAGenericID);

impl From<VAGenericID> for VASurfaceID {
    fn from(id: VAGenericID) -> Self { Self(id) }
}

impl From<VASurfaceID> for VAGenericID {
    fn from(id: VASurfaceID) -> Self { id.0 }
}

//...and the same newtype-plus-From pattern for the other specialised IDs (VAImageID and friends)
VARectangle is our first compound type:
/** \brief Structure to describe rectangle. */
typedef struct _VARectangle VARectangle;
In Rust, you can mark a struct as #[repr(C)], and its fields will be laid out with the same ordering and padding rules the platform’s C compiler uses.
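Per va.h, _VARectangle holds two int16_t coordinates plus a uint16_t width and height, so in Rust it comes out to roughly this (a sketch):

#[repr(C)] // same field order and padding as the C struct
pub struct VARectangle {
    pub x: i16,
    pub y: i16,
    pub width: u16,
    pub height: u16,
}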
And finally we have VAStatus which is…
typedef int VAStatus; /** Return status type from functions */
/** Values for the return status */
#define VA_STATUS_SUCCESS                       0x00000000
#define VA_STATUS_ERROR_OPERATION_FAILED        0x00000001
#define VA_STATUS_ERROR_ALLOCATION_FAILED       0x00000002
/* ...a couple dozen more error codes... */
/**
 * \brief An invalid/unsupported value was supplied.
 *
 * This is a catch-all error code for invalid or unsupported values.
 * e.g. value exceeding the valid range, invalid type in the context
 * of generic attribute values.
 */
#define VA_STATUS_ERROR_INVALID_VALUE           0x00000019
/** \brief An unsupported filter was supplied. */
#define VA_STATUS_ERROR_UNSUPPORTED_FILTER      0x00000020
/** \brief An invalid filter chain was supplied. */
#define VA_STATUS_ERROR_INVALID_FILTER_CHAIN    0x00000021
/** \brief Indicate HW busy (e.g. run multiple encoding simultaneously). */
#define VA_STATUS_ERROR_HW_BUSY                 0x00000022
/** \brief An unsupported memory type was supplied. */
#define VA_STATUS_ERROR_UNSUPPORTED_MEMORY_TYPE 0x00000024
/** \brief Indicate allocated buffer size is not enough for input or output. */
#define VA_STATUS_ERROR_NOT_ENOUGH_BUFFER       0x00000025
/** \brief Indicate an operation isn't completed because time-out interval elapsed. */
#define VA_STATUS_ERROR_TIMEDOUT                0x00000026
…something I’m just going to use an enum for right now.
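A first pass could look something like this (a sketch: the variant list is abridged and the discriminant values are the ones from the header above):

#[repr(i32)] // VAStatus is a typedef to int
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum VAStatus {
    Success = 0x0000_0000,
    ErrorOperationFailed = 0x0000_0001,
    ErrorAllocationFailed = 0x0000_0002,
    // ...one variant per VA_STATUS_* code...
    ErrorTimedout = 0x0000_0026,
}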
Here be UB!
Here we have to think about a very interesting bit of undefined behaviour that Rust kept around. What if libva adds a new error variant on their side, and we don’t track it on the Rust side? Rust defines that “an enum must have a valid discriminant […]”; in other words, it is UB for an enum value to hold anything not in its variant list. This allows the compiler to eliminate catch-all match arms it deems unneeded, which would break the following code:
let some_value: VAStatus = get_a_status();
match some_value {
    VAStatus::Success => println!("all good"),
    // ...an arm for every other variant we know about...
    _ => println!("unknown status"), // the compiler is free to delete this arm
}
Since the compiler can verify that we have already matched every variant it knows about, it optimises the fallback straight out, meaning a value outside the known variants suddenly matches nothing at all. If the arms return values, we can end up reading uninitialised memory, and Rust’s safety guarantees collapse in on themselves like a house of cards.
Unfortunately even something like #[non_exhaustive] isn’t enough for us: it only forces downstream crates to write a wildcard arm and doesn’t change the enum’s validity requirements or the generated code. So we will eventually have to switch to a more robust system, likely involving macros and two separate types: one safe for Rust code, and one that can traverse the FFI boundary. Read more
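To sketch the direction that split could take (every name here is illustrative, nothing final): a raw, FFI-safe carrier type that accepts whatever integer the driver returns, converted on the Rust side into an enum with an explicit catch-all variant.

use std::ffi::c_int;

// FFI-safe: any int the driver hands back is a valid value of this type.
#[repr(transparent)]
#[derive(Clone, Copy)]
pub struct RawVAStatus(pub c_int);

// Rust-safe: what the rest of the crate gets to match on.
#[derive(Debug, Clone, Copy)]
pub enum Status {
    Success,
    Timedout,
    // ...
    Unrecognised(c_int), // future libva codes land here instead of causing UB
}

impl From<RawVAStatus> for Status {
    fn from(raw: RawVAStatus) -> Self {
        match raw.0 {
            0x0000_0000 => Status::Success,
            0x0000_0026 => Status::Timedout,
            other => Status::Unrecognised(other),
        }
    }
}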
Workspaced dependencies
A quick compile check shows us that we’re missing a dependency. Since we’re in a workspace, I’m going to make use of its common dependency management capabilities. To do that, first we have to add a [workspace.dependencies] section to the workspace’s TOML file (remember how [workspace.package] worked?) and define the version of libloading that we would like to import.
# ...
edition = "2024"

[workspace.dependencies]
libloading = "0.8.6"
Followed by defining the fact that libloading is required in our FFI crate in its manifest.
# ...
edition.workspace = true

[dependencies]
libloading = { workspace = true }
Note how instead of specifying a version, we’ve written that the dependency should be pulled from the workspace. Aaaand…
Now our code compiles!
Warnings everywhere!
I hate seeing code warnings, so let’s clean them up next. Thankfully they’re all style warnings about our function and type names not following Rust’s naming conventions, which we can’t really change without breaking the library loading logic. So we’re just going to edit the macro to tell the compiler to ignore these lints.
Add a quick #[allow(non_camel_case_types)] to the loop creating the type $func lines, and some #[allow(non_snake_case)] attributes to the helper method definitions and the structure declaration…
And we’re clean!
More Macro: Attribute propagation
It would be very useful if we could declare attributes and docstrings on the functions defined through the dyload! macro and have them propagate to the generated function definitions. Thankfully the /// docstring syntax is just sugar for a #[doc = "…"] attribute, so we can match both quite easily.
//...
$(#[$attr:meta])*
//...
Then it’s just a matter of expanding the matches in the right places:
//...
pub struct $struct_name {
    $(
        $(#[$attr])*
        $func: $func,
    )+
}
//...
$(
    #[allow(non_snake_case)]
    $(#[$attr])*
    pub unsafe fn $func( /* ... */ ) $(-> $ret)? { /* ... */ }
)+
//...
After adding some documentation to the implemented functions, I think it’s time to wrap up for today. In the next part, we’re going to start implementing some core VA-API objects, and figure out a way to manage their lifetimes.