
Shrinking the Linux Kernel and File System for IoT

At last year’s Embedded Linux Conference Europe, Sony’s Tim Bird warned that the stalled progress in reducing Linux kernel size meant that Linux was ceding the huge market in IoT edge nodes to real-time operating systems (RTOSes). At this February’s ELC North America event, another figure who has long been at the center of the ELC scene — Free Electrons’ Michael Opdenacker — summed up the latest kernel shrinkage schemes as well as future possibilities. Due perhaps to Tim Bird’s exhortations, ELC 2017 had several presentations on reducing footprint, including Rob Landley’s Tutorial: Building the Simplest Possible Linux System.

Like Bird, Opdenacker bemoaned the lack of progress, but said there are plenty of ways for embedded Linux developers to reduce footprint. These range from using newer technologies such as musl, toybox, and Clang to revisiting other approaches that developers sometimes overlook.

In his talk, Opdenacker explained that the traditional motivation for shrinking the kernel was to speed boot time or to copy a Linux image from low-capacity storage. In today’s IoT world, this has been joined by the requirement to fit very small endpoints with limited resources. These aren’t the only reasons, however. “Some want to run Linux as a bootloader so they don’t have to re-create bootloader drivers, and some want to run the whole system in internal RAM or cache,” said Opdenacker. “A small kernel can also reduce the attack surface to improve security.”

Stalled efforts such as the Linux Kernel Tinification project have largely done their job, said Opdenacker. Although size has edged up slightly over the years, you can still call upon a variety of techniques to run your kernel in as little as 4MB of RAM.

“You’d think that since the Tinification project is not that active, the kernel would grow exponentially, but it’s still under control, so maybe we could reverse the trend,” said Opdenacker. “With more aggressive work, 2-3MB may be achievable. Still, there has not been much new in this area since ELC Europe 2015.”

Although Josh Triplett’s Tinification patches, which remove functionality via configuration settings, have themselves been removed from the linux-next tree, they are still available for experimentation. The main reason they were dropped: kernel developers are hesitant to rip out too much plumbing because of the potential for bugs.

“Removing functionality may no longer be the way to go, as the complexity of kernel configuration parameters is already difficult to manage,” said Opdenacker. “Kernel developers don’t like to remove features. In the future, we may see new approaches that automatically detect and remove unused features like system calls, command-line options, /proc contents, and kernel command-line parameters. You would trace your system and see what you use at runtime and then remove the code you don’t need.”

Shrinking the kernel

Meanwhile, there are still plenty of ways to reduce footprint. One of the easiest is to shrink kernel size at compile time. First, use a recent compiler, said Opdenacker. For example, building a Linux 4.10 kernel for the ARM Versatile board, gcc 6.2 gives you almost a half-percent size reduction over gcc 4.7. That may not be much, but, “every byte can count,” he added.

Then there are compiler optimizations. With gcc, for example, you can use the -Os option to reduce size. Since gcc 4.7, users have also been able to run optional Link Time Optimization (LTO), which can remove unused code when applied at the end of the compile, “when linking all the object files together to optimize things like inlining across various objects,” said Opdenacker. In one test, running gcc 6.2 with LTO reduced the size of the stripped binary by 2.6 percent (x86_64) to 2.8 percent (32-bit ARM).
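
As a rough sketch of the idea (demo.c stands in for any single-file C program; it is not the OggEnc test from the talk), you can compare an ordinary build against a size-optimized, link-time-optimized one:

$ gcc -O2 -o demo demo.c               # ordinary speed-oriented build
$ gcc -Os -flto -o demo-small demo.c   # optimize for size, with link-time optimization
$ size demo demo-small                 # compare the text/data/bss sizes of the two binaries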

A few years ago, there was keen interest in an LLVM Linux project that used the Clang front end to the LLVM compiler to compile the Linux kernel for performance and size optimizations. “It is possibly better than what you can get with gcc LTO today, but the project has been stalled since 2015,” said Opdenacker. In response, an audience member suggested the project was still alive.

Using the Clang front end for the LLVM compiler can bring even more footprint savings than gcc LTO. Opdenacker ran some tests using a program called OggEnc that consists of a single C file. He then compared Clang 3.8.1 with gcc 6.2 on x86_64, and saw a 5 percent reduction “out of the box without doing anything.” Gcc, however, can offer greater reductions when compiling very small programs, he added.
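
A hedged sketch of that sort of comparison, assuming both compilers are installed (the file name is again illustrative):

$ gcc -Os -o demo-gcc demo.c
$ clang -Os -o demo-clang demo.c
$ strip demo-gcc demo-clang            # compare the stripped, on-disk sizes
$ ls -l demo-gcc demo-clang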

Opdenacker also mentioned some patches proposed by Andi Kleen in 2012, built around gcc LTO. They promised performance improvements and a reduction of as much as 6 percent of unused code on ARM systems. “Unfortunately, the patches caused some new problems so they weren’t accepted,” he added. “The kernel developers were afraid of creating new bugs that were hard to track down. But maybe it’s worth trying again.”

Another compiler technique available to ARM users is to compile with the Thumb instruction set (-mthumb), which mixes 16- and 32-bit instructions, instead of the all-32-bit ARM instruction set (-marm). Some compilers, such as Ubuntu’s, will compile to Thumb by default, said Opdenacker. Using OggEnc, the Thumb compile was 6.8 percent smaller than the ARM compile. He conceded, however, that this was not a definitive test, as his compiler also compiled parts of the program using the ARM set.
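
As an illustrative sketch with an ARM cross toolchain (the toolchain prefix and file name are assumptions, not from the talk):

$ arm-linux-gnueabihf-gcc -Os -marm -o demo-arm demo.c       # all 32-bit ARM instructions
$ arm-linux-gnueabihf-gcc -Os -mthumb -o demo-thumb demo.c   # mixed 16/32-bit Thumb instructions
$ size demo-arm demo-thumb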

Since Linux 3.18, developers have been able to reduce kernel size by using the “make tinyconfig” command, which combines “make allnoconfig” with a few additional size-reducing settings. “It uses gcc’s optimize-for-size option, so the code may be slower but it’s smaller,” said Opdenacker. “You turn on XZ kernel compression, and you save about 6 to 10KB.”
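
A minimal sketch of that workflow, run from a kernel source tree with an ARM cross toolchain on the PATH (the result is a starting configuration to grow from, not a bootable kernel for real hardware):

$ make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- tinyconfig
$ make ARCH=arm CROSS_COMPILE=arm-linux-gnueabihf- -j$(nproc)
$ size vmlinux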

You can find several tinification opportunities in the Linux kernel by looking for obj-y in kernel Makefiles, which marks code that is always included in the kernel binary. “For example, you may be able to compile the kernel without ptrace support, which on ARM takes up 14KB,” said Opdenacker.
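
For instance, a quick way to survey always-built code from the top of a kernel source tree:

$ grep -n "obj-y" kernel/Makefile     # core code that is always built in
$ grep -rl "obj-y" arch/arm/kernel/   # the same survey for the ARM architecture code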

It’s a good idea to “study your compile logs and see if everything is really needed,” said Opdenacker. “You can decide how useful it is and how difficult it is to remove. You can also look for size regressions using the bloat-o-meter script, which compares two vmlinux builds to see what has increased in size between versions.”
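
bloat-o-meter ships in the scripts/ directory of the kernel source; a typical invocation compares two uncompressed kernel images (file names here are illustrative):

$ cp vmlinux vmlinux.old                        # keep the previous build for comparison
$ make                                          # rebuild after changing the configuration or updating the tree
$ ./scripts/bloat-o-meter vmlinux.old vmlinux   # list the symbols that grew or shrank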

User space reductions

To reduce user space footprint on simpler systems, Opdenacker suggests that instead of busybox, developers try the toybox set of Linux command-line utilities, which is now baked into Android. “Toybox has the same applications and mostly the same features as busybox, but uses only 84KB instead of 100KB,” he added. “If you just want a shell with just a few command-line utilities, toybox could save you a few thousand bytes, though it’s less configurable.”
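
Building toybox takes only a few minutes; a minimal sketch:

$ git clone https://github.com/landley/toybox
$ cd toybox
$ make defconfig
$ make
$ ls -l toybox     # one multi-call binary; the command names are symlinked to it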

Another technique is to switch C/POSIX standard library implementations. The newer musl libc uses less space than uclibc and glibc. Opdenacker described one test on the hello.c program, under gcc 6.3 with busybox, in which musl used only 7.3KB, vs. 67KB for uclibc-ng 1.0.22 and 49KB using glibc with gcc 6.2.
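
As a rough illustration of that kind of measurement, assuming the musl-gcc wrapper (installed by the musl-tools package on Debian/Ubuntu) is available:

$ printf '#include <stdio.h>\nint main(void){ puts("hello"); return 0; }\n' > hello.c
$ gcc -Os -static -o hello-glibc hello.c
$ musl-gcc -Os -static -o hello-musl hello.c
$ strip hello-glibc hello-musl
$ ls -l hello-glibc hello-musl     # compare the stripped, statically linked sizes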

For reducing file system size, Opdenacker recommends booting on initramfs for small file systems. “It lets you boot earlier because you don’t have to initialize file-system and storage drivers.” For bigger RAM sizes, he suggests using compressed file systems such as SquashFS, JFFS2, or ZRAM.
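
For example, a SquashFS image with XZ compression can be built from a staged root directory (the paths are illustrative):

$ mksquashfs rootfs/ rootfs.sqsh -comp xz -b 256K
$ ls -lh rootfs.sqsh     # compare against the size of the uncompressed tree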

“There’s still significant room for improvement in user and kernel space reduction,” concluded Opdenacker. However, when he asked if the community should resurrect the Kernel Tinification project, he was met with a somewhat tepid response.

“These days you can’t even buy an 8MB RAM card,” said one attendee. “It’s an interesting exercise, but I don’t know that there’s a whole lot of payback.” Another developer noted that one problem with the Tinification project was that it “removed things that weren’t needed in really small memory configurations, but the minute you go to the cloud all that stuff is required.”

If Linux has indeed run into some practical limits to reducing footprint, there will be new opportunities for simpler RTOSes, including several open source platforms like Zephyr and FreeRTOS, to operate small footprint endpoints on microcontrollers. Yet that does not mean Linux is only useful for IoT gateways. With the growth of AI- and multimedia-related IoT nodes, Linux may be the only game in town. Meanwhile, it’s good to know there are some new tricks available to create the minimalist embedded masterpiece of your dreams.

Watch the full video below:

https://www.youtube.com/watch?v=ynNLlzOElOU&list=PLbzoR-pLrL6pSlkQDW7RpnNLuxPq6WVUR


More Unknown Linux Commands

A roundup of the fun and little-known utilities termsaver, pv, and calendar. termsaver is an ASCII screensaver for the console, and pv measures data throughput and simulates typing. Debian’s calendar comes with a batch of different calendars, and instructions for making your own.

Figure 1: Star Wars screensaver.

Terminal Screensaver

Why should graphical desktops have all the fun with fancy screensavers? Install termsaver to enjoy fancy ASCII screensavers like matrix, clock, starwars, and a couple of not-safe-for-work screens. More on the NSFW screens in a moment.

termsaver is included in Debian/Ubuntu, and if you’re using a boring distro that doesn’t package fun things (like CentOS), you can download it from termsaver.brunobraga.net and follow the simple installation instructions.
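
For instance, either of these should get it installed (the PyPI package name is assumed to match the project name):

$ sudo apt install termsaver     # Debian/Ubuntu
$ sudo pip install termsaver     # elsewhere, from PyPI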

Run termsaver -h to see a list of screens:

 randtxt        displays word in random places on screen
 starwars       runs the asciimation Star Wars movie
 urlfetcher     displays url contents with typing animation
 quotes4all     displays recent quotes from quotes4all.net
 rssfeed        displays rss feed information
 matrix         displays a matrix movie alike screensaver
 clock          displays a digital clock on screen
 rfc            randomly displays RFC contents
 jokes4all      displays recent jokes from jokes4all.net (NSFW)
 asciiartfarts  displays ascii images from asciiartfarts.com (NSFW)
 programmer     displays source code in typing animation
 sysmon         displays a graphical system monitor

Then run your chosen screen with termsaver [screen name], e.g. termsaver matrix, and stop it with Ctrl+c. Get information on individual screens by running termsaver [screen name] -h. Figure 1 is from the starwars screen, which runs our old favorite Asciimation Wars.

The not-safe-for-work screens pull in online feeds. They’re not my cup of tea, but the good news is termsaver is a gaggle of Python scripts, so they’re easy to hack to connect to any RSS feed you desire.

pv

The pv command is one of those funny little utilities that lends itself to creative uses. Its intended use is monitoring data copying progress, like when you run rsync or create a tar archive. When you run pv without options the defaults are:

  • -p, progress bar.
  • -t, timer; total elapsed time.
  • -e, ETA; estimated time to completion. This is often inaccurate, as pv cannot always know the size of the data you are moving.
  • -r, rate counter, or throughput.
  • -b, byte counter.

This is what an rsync transfer looks like:

$ rsync -av /home/carla/ /media/carla/backup/ | pv 
sending incremental file list
[...]
103GiB 0:02:48 [ 615MiB/s] [  <=>

Create a tar archive like this example:

$ tar -czf - /file/path | (pv > backup.tgz)
 885MiB 0:00:30 [28.6MiB/s] [  <=>

With the -d option, pv watches all the open file descriptors of a running process. To see maximum activity, monitor a Web browser process; it is amazing how much activity that generates:

$ pv -d  3095                                                                                                             
  58:/home/carla/.pki/nssdb/key4.db:    0 B 0:00:33 
  [   0 B/s] [<=>                                                                           ] 
  78:/home/carla/.config/chromium/Default/Visited Links:  
  256KiB 0:00:33 [   0 B/s] [<=>                                                      ] 
  ] 
  85:/home/carla/.con...romium/Default/data_reduction_proxy_leveldb/LOG:  
  298 B 0:00:33 [   0 B/s] [<=>                                       ] 

Somewhere on the Internet I stumbled across a most entertaining way to use pv to echo back what I type:

$ echo "typing random stuff to pipe through pv" | pv -qL 8
typing random stuff to pipe through pv

The normal echo command prints the whole line at once. Piping it through pv makes it appear as though it is being re-typed. I have no idea if this has any practical value, but I like it. The -q option suppresses pv’s usual progress display, and -L controls the speed of the playback, in bytes per second.

pv is one of those funny little old commands that has acquired a giant batch of options over the years, including fancy formatting options, multiple output options, and transfer speed modifiers. man pv reveals all.

/usr/bin/calendar

It’s amazing what you can learn by browsing /usr/bin and other command directories, and reading man pages. /usr/bin/calendar on Debian/Ubuntu is a modification of the BSD calendar, but it omits the moon and sun phases. It retains multiple calendars including calendar.computer, calendar.discordian, calendar.music, and calendar.lotr. On my system, the man page lists different calendars than actually exist in /usr/share/calendar. This example displays the Lord of the Rings calendar for the next 60 days:

$ calendar -f /usr/share/calendar/calendar.lotr  -A 60
Apr 17  An unexpected party
Apr 23  Crowning of King Ellesar
May 19  Arwen leaves Lorian to wed King Ellesar
Jun 11  Sauron attacks Osgilliath

The calendars are plain text files so you can easily create your own. The easy way is to copy the format of the existing calendar files. man calendar contains detailed instructions for creating your own calendar file.
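
Here is a minimal sketch of a personal calendar file, reusing the -f and -A options shown above; each entry is just a date, a tab, and the text:

$ printf '04/17\tWater the plants\n06/01\tAnnual backup-restore drill\n' > mycalendar
$ calendar -f mycalendar -A 60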

Once again we come to the end too quickly. Take some time to cruise your own filesystem to dig up interesting commands to play with.

Learn more about Linux through the free “Introduction to Linux” course from The Linux Foundation and edX.

Kubernetes is King in Container Survey

Kubernetes is in, container registries are a dime a dozen, and maximum container density isn’t the only thing that matters when running containers.

Those are some of the insights gleaned by Sysdig, maker of on-prem and in-cloud monitoring solutions, from its customers about how they’re using containers in 2017.

Using a snapshot of Sysdig’s services that encompassed 45,000 running containers, Sysdig’s 2017 Docker Usage Report shows that container adoption is getting diversified by workload, and it covers some of the hot-or-not aspects of the new container stack.

Read more at InfoWorld

Salaries for Storage, Networking Pros Continue to Rise

While 2016 saw U.S. tech salaries remain essentially flat year-over-year, key skills, especially in the areas of storage and networking, did warrant increases, according to the annual tech salary report from careers site Dice.com.

Their recent survey polled 12,907 employed technology professionals online between October 26, 2016 and January 24, 2017. The survey found that, overall, technology salaries in the U.S. were essentially flat year-over-year (-1 percent) at $92,081 in 2016, a slight dip from $93,328 in 2015. However, there are some notable exceptions across the country, with specific skill areas like storage and networking seeing increases, says Bob Melk, president of Dice.com.

Both the storage and networking sectors, the categories where Dice has found the most salary increases overall, are undergoing major disruption that’s fueling the salary increases, Melk says.

Read more at CIO

Why You Shouldn’t Use ENV Variables for Secret Data

The twelve-factor app manifesto recommends that you pass application configs as ENV variables. However, if your application requires a password, SSH private key, TLS Certificate, or any other kind of sensitive data, you shouldn’t pass it alongside your configs.

When you store your secret keys in an environment variable, you are prone to accidentally exposing them—exactly what we want to avoid. Here are a few reasons why ENV variables are bad for secrets:

Read more at Diogo Monica

QA in Production

Gathering operational data about a system is common practice, particularly metrics that indicate system load and performance such as CPU and memory usage. This data has been used for years to help teams who support a system learn when an outage is happening or imminent. When things become slow, a code profiler might be enabled in order to determine which part of the system is causing a bottleneck, for example a slow-running database query.

I’ve observed a recent trend that combines the meticulousness of this traditional operational monitoring with a much broader view of the quality of a system. While operational data is an essential part of supporting a system, it is also valuable to gather data that helps provide a picture of whether the system as a whole is behaving as expected. I define “QA in production” as an approach where teams pay closer attention to the behaviour of their production systems in order to improve the overall quality of the function these systems serve.

Read more at Martin Fowler

DNS Record Will Help Prevent Unauthorized SSL Certificates

In a few months, publicly trusted certificate authorities will have to start honoring a special Domain Name System (DNS) record that allows domain owners to specify who is allowed to issue SSL certificates for their domains.

The record allows a domain owner to list the CAs that are allowed to issue SSL/TLS certificates for that domain. The reason for this is to limit cases of unauthorized certificate issuance, which can be accidental or intentional, if a CA is compromised or has a rogue employee.

Read more at PCWorld

9 Ways to Harden Your Linux Workstation After Distro Installation


So far in this series, we’ve walked through security considerations for your SysAdmin workstation from choosing the right hardware and Linux distribution, to setting up a secure pre-boot environment and distro installation. Now it’s time to cover post-installation hardening.

What you do depends greatly on your distribution of choice, so it is futile to provide detailed instructions in a blog series such as this one. However, here are some essential steps you should take:

  • Globally disable firewire and thunderbolt modules

  • Check your firewalls to ensure all incoming ports are filtered

  • Make sure root mail is forwarded to an account you check

  • Set up an automatic OS update schedule, or update reminders

In addition, you may also consider some of these nice-to-have steps to further harden your system:

  • Check to ensure sshd service is disabled by default

  • Configure the screensaver to auto-lock after a period of inactivity

  • Set up logwatch

  • Install and use rkhunter

  • Install an Intrusion Detection System

As I’ve said before, security is like driving on the highway — anyone going slower than you is an idiot, while anyone driving faster than you is a crazy person. The guidelines in this series are merely a basic set of core safety rules that is neither exhaustive nor a replacement for experience, vigilance, and common sense. You should adapt these recommendations to suit your environment.

Blacklisting modules

To blacklist the firewire and thunderbolt modules, add the following lines to /etc/modprobe.d/blacklist-dma.conf:

blacklist firewire-core 

blacklist thunderbolt

The modules will be blacklisted upon reboot. It doesn’t hurt to do this even if you don’t have these ports (though it doesn’t do anything either).
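
After a reboot, you can confirm that neither module is loaded (no output means none are):

lsmod | grep -E 'firewire|thunderbolt'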

Root mail

By default, root mail is just saved on the system and tends to never be read. Make sure you set your /etc/aliases to forward root mail to a mailbox that you actually read, otherwise you may miss important system notifications and reports:

# Person who should get root’s mail 

root:                  bob@example.com

Run newaliases after this edit and test it out to make sure that it actually gets delivered, as some email providers will reject email coming in from nonexistent or non-routable domain names. If that is the case, you will need to play with your mail forwarding configuration until this actually works.
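
One quick way to run that test, assuming a local mail command (from mailutils or bsd-mailx) is installed:

newaliases

echo "root mail forwarding test" | mail -s "root alias test" root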

Firewalls, sshd, and listening daemons

The default firewall settings will depend on your distribution, but many of them will allow incoming sshd ports. Unless you have a compelling legitimate reason to allow incoming ssh, you should filter that out and disable the sshd daemon.

systemctl disable sshd.service 

systemctl stop sshd.service

You can always start it temporarily if you need to use it.

In general, your system shouldn’t have any listening ports apart from responding to ping. This will help safeguard you against network-level 0-day exploits.
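
You can audit what is currently listening with ss, which shows TCP and UDP listening sockets, numeric ports, and the owning processes:

ss -tulnp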

Automatic updates or notifications

It is recommended to turn on automatic updates, unless you have a very good reason not to do so, such as fear that an automatic update would render your system unusable (it’s happened in the past, so this fear is not unfounded). At the very least, you should enable automatic notifications of available updates. Most distributions already have this service automatically running for you, so chances are you don’t have to do anything. Consult your distribution documentation to find out more.
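
On Debian and Ubuntu, for example, the unattended-upgrades package handles this; Fedora offers dnf-automatic. A sketch for the Debian/Ubuntu case:

apt install unattended-upgrades

dpkg-reconfigure --priority=low unattended-upgrades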

You should apply all outstanding errata as soon as possible, even if something isn’t specifically labeled as “security update” or has an associated CVE code. All bugs have the potential of being security bugs and erring on the side of newer, unknown bugs is generally a safer strategy than sticking with old, known ones.

Watching logs

You should have a keen interest in what happens on your system. For this reason, you should install logwatch and configure it to send nightly activity reports of everything that happens on your system. This won’t prevent a dedicated attacker, but is a good safety-net feature to have in place.

Note that many systemd-based distros no longer automatically install the syslog server that logwatch needs (since systemd relies on its own journal), so you will need to install and enable rsyslog to make sure /var/log is not empty before logwatch will be of any use.
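
A minimal sketch for a Debian/Ubuntu system (the nightly report otherwise comes from logwatch’s own daily cron job):

apt install rsyslog logwatch

systemctl enable --now rsyslog

logwatch --detail high --mailto root --range yesterday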

Rkhunter and IDS

Installing rkhunter and an intrusion detection system (IDS) like aide or tripwire will not be that useful unless you actually understand how they work and take the necessary steps to set them up properly (such as keeping the databases on external media, running checks from a trusted environment, remembering to refresh the hash databases after performing system updates and configuration changes, and so on). If you are not willing to take these steps and adjust how you do things on your own workstation, these tools will introduce hassle without any tangible security benefit.

We do recommend that you install rkhunter and run it nightly. It’s fairly easy to learn and use, and though it will not deter a sophisticated attacker, it may help you catch your own mistakes.
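
A sketch of the basic workflow on a Debian/Ubuntu system:

apt install rkhunter

rkhunter --propupd     # record a baseline of file properties

rkhunter --check --sk  # run a scan; --sk skips the keypress between test sections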

So far, this series has walked through distro installation and some pre- and post-installation security guidelines. In the next article, we’ll cover some of the best storage options for backing up your workstation, and then we’ll dive into some more general best practices around web browser security, SSH and private keys, and more.

Workstation Security

Read more:

3 Security Features to Consider When Choosing a Linux Workstation

How to Choose the Best Linux Distro for SysAdmin Workstation Security

4 Security Steps to Take Before You Install Linux

Security Tips for Installing Linux on Your SysAdmin Workstation

LLVM-Powered Pocl Puts Parallel Processing on Multiple Hardware Platforms

Open source implementation of OpenCL automatically deploys code across numerous platforms, speeding machine learning and other jobs.

LLVM, the open source compiler framework that powers everything from Mozilla’s Rust language to Apple’s Swift, emerges in yet another significant role: an enabler of code deployment systems that target multiple classes of hardware for speeding up jobs like machine learning.

To write code that can run on CPUs, GPUs, ASICs, and FPGAs—hugely useful with machine learning apps—it’s best to use the likes of OpenCL, which allows a program to be written once, then automatically deployed across different types of hardware.

Read more at InfoWorld

Game of Nodes: Network Operators vs. Cloud Operators

There’s a whirlwind of information on the topic of network commoditization. Seriously. Just search in your favorite search engine for “SDN,” “NFV,” or “telco cloud.” You will find dozens of open source projects, communities, forums, architectural definitions, standards, standard bodies, news articles, press releases, and blog sites dedicated to the aforementioned.

With such a myriad of information, you’d think people out there would have a deep understanding of why these topics are creating so much noise. I make it a point to ask everyone I meet this simple question: “In a few sentences, what does telco cloud, NFV, or SDN mean to you?” 

Read more at The New Stack