Libcamera Aims to Make Embedded Cameras Easier


The V4L2 (Video for Linux 2) API has long offered an open source alternative to proprietary camera/computer interfaces, but it’s beginning to show its age. At the Embedded Linux Conference Europe in October, the V4L2 project unveiled a successor called libcamera. V4L2 co-creator and prolific Linux kernel contributor Laurent Pinchart outlined the early-stage libcamera project in a presentation called “Why Embedded Cameras are Difficult, and How to Make Them Easy.”

V4L and V4L2 were developed when camera-enabled embedded systems were far simpler. “Maybe you had a camera sensor connected to a SoC, with maybe a scaler, and everything was exposed via the API,” said Pinchart, who runs an embedded Linux firm called Ideas on Board and is currently working for Renesas. “But when hardware became more complex, we disposed of the traditional model. Instead of exposing a camera as a single device with a single API, we let userspace dive into the device and expose the technology to offer more fine-grained control.”

These improvements were extensively documented, enabling experienced developers to implement more use cases than before. Yet the spec placed much of the burden of controlling the complex API on developers, with few resources available to ease the learning curve. In other words, “V4L2 became more complex for userspace,” explained Pinchart.

The project planned to add a layer called libv4l to address this. The libv4l userspace library was designed to mimic the V4L2 kernel API and expose it to apps “so it could be completely transparent in tracking the code to libc,” said Pinchart. “The plan was to have device specific plugins provided by the vendor and it would all be part of the libv4l file, but it never happened. Even if it had, it would not have been enough.”

Libcamera, which Pinchart describes as “not only a camera library but a full camera stack in user space,” aims to ease embedded camera application development, improving both on V4L2 and libv4l. The core piece is a libcamera framework, written in C++, that exposes kernel driver APIs to userspace. On top of the framework are optional language bindings for languages such as C.

The next layer up is a libcamera application layer that translates to existing camera APIs, including V4L2, GStreamer, and the Android Camera Framework, which Pinchart said would not contain the usual vendor-specific Android HAL code. As for V4L2, “we will attempt to maintain compatibility as a best effort, but we won’t implement every feature,” said Pinchart. There will also be a native libcamera app format, as well as plans to support Chrome OS.

Libcamera keeps the kernel level hidden from the upper layers. The framework is built around the concept of a camera device, “which is what you would expect from a camera as an end user,” said Pinchart. “We will want to implement each camera’s capabilities, and we’ll also have a concept of profiles, which is a higher view of features. For example, you could choose a video or point-and-shoot profile.”

Libcamera will support multiple video streams from a single camera. “In videoconferencing, for example, you might want a different resolution and stream than what you encode over the network,” said Pinchart. “You may want to display the live stream on the screen and, at the same time, capture stills or record video, perhaps at different resolutions.”

Per-frame controls and a 3A API

One major new feature is per-frame controls. “Cameras provide controls for things like video stabilization, flash, or exposure time which may change under different lighting conditions,” said Pinchart. “V4L2 supports most of these controls but with one big limitation. Because you’re capturing a video stream with one frame after another, if you want to increase exposure time you never know precisely at what frame that will take effect. If you want to take a still image capture with flash, you don’t want to activate a flash and receive an image that is either before or after the flash.”

With libcamera’s per-frame controls, you can be more precise. “If you want to ensure you always have the right brightness and exposure time, you need to control those features in a way that is tied to the video stream,” explained Pinchart. “With per-frame controls you can modify all the frames that are being captured in a way that is synchronized with the stream.”
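The idea of tying controls to individual frames can be illustrated with a toy model. The sketch below is not libcamera's API: `Request`, `ToyCamera`, and the control names `exposure_us` and `flash` are hypothetical names chosen for illustration. It only shows the principle that each queued capture request carries the controls for its own frame, so the application knows exactly which frame a change applies to.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Request:
    """One capture request: a frame number plus the controls for that frame.

    Hypothetical type for illustration only, not a libcamera class.
    """
    frame: int
    controls: dict = field(default_factory=dict)


class ToyCamera:
    """Hypothetical stand-in for a camera that honors per-request controls."""

    def __init__(self, defaults):
        self.state = dict(defaults)  # controls currently in effect
        self.pending = deque()       # requests queued by the application

    def queue(self, request):
        self.pending.append(request)

    def capture_all(self):
        """Process queued requests in order, applying each request's
        controls before its frame is captured."""
        frames = []
        while self.pending:
            req = self.pending.popleft()
            self.state.update(req.controls)
            frames.append((req.frame, dict(self.state)))
        return frames


camera = ToyCamera({"exposure_us": 10_000, "flash": False})
camera.queue(Request(0))
camera.queue(Request(1, {"flash": True}))   # flash requested for frame 1 only
camera.queue(Request(2, {"flash": False}))
frames = camera.capture_all()
for number, controls in frames:
    print(number, controls)
```

In this model the flash setting is attached to the request for frame 1, so there is no ambiguity about which captured image was lit, which is the guarantee Pinchart describes as missing from a plain set-control call on a running stream.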

Libcamera also offers a novel approach to a given camera’s 3A controls: auto exposure, autofocus, and auto white balance. To provide a 3A control loop, “you can have a simple implementation with 100 lines of code that will give you barely usable results or an implementation based on two or three years of development by device vendors where they really try to optimize the image quality,” said Pinchart. Because most SoC vendors refuse to release the 3A algorithms that run in their ISPs under an open source license, “we want to create a framework and ecosystem in which open source re-implementations of proprietary 3A algorithms will be possible,” said Pinchart.
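To make the “simple implementation” end of that spectrum concrete, here is a minimal auto-exposure loop in Python. It is a sketch under stated assumptions, not an algorithm from libcamera or any vendor: the target brightness of 118, the exposure limits, and the toy sensor model `simulate_sensor` are all made up for illustration. Each iteration measures the mean brightness of a frame and rescales the exposure time toward the target.

```python
def mean_brightness(frame):
    """Average pixel value of a frame given as a flat list of 0-255 values."""
    return sum(frame) / len(frame)


def next_exposure(exposure_us, brightness, target=118.0,
                  min_us=100, max_us=33_000):
    """Scale exposure by target/brightness, clamped to assumed sensor limits."""
    if brightness <= 0:
        return max_us  # completely dark frame: use the longest exposure
    proposed = exposure_us * target / brightness
    return int(round(min(max_us, max(min_us, proposed))))


def simulate_sensor(exposure_us, scene_gain=0.004):
    """Toy sensor (assumption): brightness grows linearly with exposure
    and saturates at 255."""
    value = min(255.0, exposure_us * scene_gain)
    return [value] * 16


exposure = 1_000
for _ in range(5):
    frame = simulate_sensor(exposure)
    exposure = next_exposure(exposure, mean_brightness(frame))

final_brightness = mean_brightness(simulate_sensor(exposure))
print(exposure, round(final_brightness, 1))
```

A loop like this converges on a uniform synthetic scene but ignores metering regions, gain, flicker, and scene changes, which is roughly the gap between “barely usable” and a tuned vendor implementation.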

Libcamera will provide a 3A API that will translate between standard camera code and a vendor-specific component. “The camera needs to communicate with kernel drivers, which is a security risk if the image processing code is closed source,” said Pinchart. “You’re running untrusted 3A vendor code, and even if they’re not doing something behind your back, it can be hacked. So we want to be able to isolate the closed source component and make it operate within a sandbox. The API can be marshaled and unmarshaled over IPC. We can limit the system calls that are available and prevent the sandboxed component from directly accessing the kernel driver. Sandboxing will ensure that all the controls will have to go through our API.”
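The marshaling idea can be sketched in a few lines. The example below is purely illustrative and does not reflect libcamera's actual IPC format: the JSON encoding, the `ALLOWED_CONTROLS` table, and the control names are assumptions. It shows the trusted side decoding a message from the isolated component and refusing anything outside a fixed set of permitted controls and ranges, so the untrusted code never touches the driver directly.

```python
import json

# Hypothetical allowlist: control name -> (minimum, maximum) permitted value.
ALLOWED_CONTROLS = {
    "exposure_us": (100, 33_000),
    "analogue_gain": (1.0, 16.0),
}


def marshal(controls):
    """Serialize a control dict to bytes for transport over an IPC channel."""
    return json.dumps(controls, sort_keys=True).encode("utf-8")


def unmarshal(payload):
    """Decode a message on the trusted side and validate every control."""
    controls = json.loads(payload.decode("utf-8"))
    if not isinstance(controls, dict):
        raise ValueError("control message must be an object")
    for name, value in controls.items():
        if name not in ALLOWED_CONTROLS:
            raise ValueError(f"control not permitted: {name}")
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError(f"control {name} must be numeric")
        low, high = ALLOWED_CONTROLS[name]
        if not low <= value <= high:
            raise ValueError(f"control {name} out of range: {value}")
    return controls


# A well-formed request from the sandboxed component is accepted.
message = marshal({"exposure_us": 8_000, "analogue_gain": 2.0})
applied = unmarshal(message)

# A request for something outside the API is rejected.
try:
    unmarshal(marshal({"raw_register_write": 255}))
    rejected = False
except ValueError:
    rejected = True

print(applied, rejected)
```

In a real sandbox the bytes would cross a process boundary and the isolated process would also have its system calls restricted; this sketch covers only the validation step at the API boundary.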

The 3A API, combined with libcamera’s sandboxing approach, may encourage more SoC vendors to further expose their ISPs, just as some have begun to open up their GPUs. “We want the vendors to publish open source camera drivers that expose and document every control on the device,” he said. “When you are interacting with a camera, a large part of that code is device agnostic. Vendors implement a completely closed source camera HAL and supply their own buffer management and memory location and other tasks that don’t add any value. It’s a waste of resources. We want as much code as possible that can be reused and shared with vendors.”

Pinchart went on to describe libcamera’s cam device manager, which will support hot plugging and unplugging of cameras. He also explained libcamera’s pipeline handler, which controls memory buffering and communications between MIPI-CSI or other camera receiver interfaces and the camera’s ISP.

“Our pipeline handler takes care of the details so the application doesn’t have to,” said Pinchart. “It handles scheduling, configuration, signal routing, the number of streams, and locating and passing buffers.” The pipeline handler is flexible enough to support an ISP with an integrated CSI receiver (and without a buffer pool) or other complicated ISPs that can have a direct pipeline to memory.

Watch Pinchart’s entire ELC talk below: