OPW Intern Develops QR Code for Linux Kernel Oops Messages

Linux kernel debugging may soon be a bit easier for kernel developers and users in the field thanks to the work of Outreach Program for Women intern Teodora Băluţă.

Băluţă spent her three-month internship this year designing a program to capture and send Linux kernel crash and error messages, called oops, as QR codes. Though her internship ended in March, she’s also continuing to develop an Android app that will help process the kernel panic trace by taking a photo of the QR code, decoding it, and reporting it to a central database, such as bugzilla.kernel.org or kerneloops.org.

“People have been very enthusiastic about it. Everyone is coming up with ideas about the RFC I had sent,” said Băluţă, a senior computer science major at Automatic Control and Computer Science, Politehnica University of Bucharest, Romania. “A kernel developer, Levente Kurusa, also did some pull requests on github.com, where I keep the project, and improved the QR encoding. I was very happy with the response I got; I never expected to get such help and positive reactions.”

A new kernel debugging tool

Kernel developers use oops to find and fix programming errors that cause problems in the Linux kernel. Compressing them into QR codes solves several problems with this process, said PJ Waskiewicz, the Linux kernel developer who mentored Băluţă’s internship with Intel, funded by The Linux Foundation.

Oops messages can often be very large and tend to scroll off screen, especially on heavily loaded systems and systems with large numbers of CPUs, he said. “Or, in some instances, the panic is lost on hard lockups, since the framebuffer could be taken by the running window manager, and the panic is scribbled to the virtual terminal elsewhere.”

This makes it difficult for kernel developers and users in the field to quickly and accurately copy a kernel panic to aid in debugging. As a result, bug reports sometimes have incomplete or missing panic traces.

“Getting good kernel panics and panic traces is very important for effective debugging,” Waskiewicz said.

For example, he said, “if I’m working on my laptop while traveling, and hit some obscure bug that I only see once every 3-4 months, I want to make sure to get a clean capture of the panic. I don’t have a serial console hooked up, I may be running some graphics workload that hangs the display, so the panic may have dumped the backtrace, but there’s no way to extract it.”

A QR code can encapsulate an entire oops and will still always fit on screen. Using KMS (kernel mode setting), it can also write the QR code to a portion of the framebuffer that will always be visible, so you never lose the panic, Waskiewicz said. And in a QR code format, you don’t need to be a kernel developer to find, read and report an oops – just take a picture of the crash and send it to your favorite bug reporting software so a kernel developer can dig into it.

How to catch an oops

To capture oops as QR codes, Băluţă first had to research existing QR code libraries that were compatible with the kernel and could also handle a large amount of text. She also studied algorithms capable of compressing all of that data and ended up using the zlib implementation already included in the kernel source. Finally, she had to insert hooks into the various oops paths in the kernel to catch the outgoing messages that comprise the full oops message.

“So now we have a compressed form of the QR code which I would write to framebuffer,” Băluţă said. “It’s fairly simple actually.”
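The compression step can be illustrated outside the kernel. The following is a minimal sketch in Python, not Băluţă's implementation: it uses Python's standard zlib module as a stand-in for the kernel's zlib, and the sample text is made up rather than real kernel output. The names compress_oops, fits_in_qr and SAMPLE_OOPS are hypothetical. It assumes the largest QR symbol (version 40, lowest error correction level, byte mode), which holds 2,953 bytes, and checks whether the compressed message fits in a single code.

```python
import zlib

# Capacity of the largest QR symbol (version 40, error correction level L)
# in byte mode. Smaller versions or stronger error correction hold less.
QR_MAX_BYTES = 2953


def compress_oops(text: str, level: int = 9) -> bytes:
    """Compress an oops message with zlib (hypothetical helper)."""
    return zlib.compress(text.encode("utf-8"), level)


def fits_in_qr(payload: bytes, limit: int = QR_MAX_BYTES) -> bool:
    """Return True if the payload fits in a single QR code of that capacity."""
    return len(payload) <= limit


# Made-up, repetitive text standing in for a backtrace; NOT real kernel output.
SAMPLE_OOPS = "\n".join(
    f"[  123.{i:06d}] sample_function_{i % 7}+0x{i:x}/0x200 [sample_module]"
    for i in range(120)
)

if __name__ == "__main__":
    raw = SAMPLE_OOPS.encode("utf-8")
    packed = compress_oops(SAMPLE_OOPS)
    print("raw bytes:       ", len(raw), "fits:", fits_in_qr(raw))
    print("compressed bytes:", len(packed), "fits:", fits_in_qr(packed))
```

In this sketch the uncompressed sample is too large for a single code, while the compressed form fits. Real oops text will compress differently, so the sizes here say nothing about actual kernel messages.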

Learning how to write to the framebuffer was the most challenging part of the project, she said, because she wasn’t very familiar with the kernel code itself. Waskiewicz became her guide in navigating the kernel by pointing out, for example, the various existing compression implementations in lib/.

“He answered my questions over Hangouts and email, was very patient with me and explained kernel internals when I had problems,” Băluţă said. “He was very open to discuss the ideas I had and I liked the fact that I had this freedom of trying things out.”

After graduation at the end of May, Băluţă plans to pursue her master’s degree in computer science and continue on the path toward becoming a Linux kernel developer.

“OPW was an eye-opening experience because I got to work and talk with really cool and different people: Marina (Zhurakhinskaya) from the GNOME Foundation who coordinated everything, Sarah (Sharp at Intel) was very thorough and helped all candidates throughout the application process, Greg KH, PJ, Paul McKenney (and everyone on the mailing list!) who reviewed our patches and corrected our silly mistakes,” Băluţă said. “I gained the confidence to do the things I want to do.”