
Finding Files with mlocate: Part 2

In the previous article, we discussed some ways to find a specific file out of the thousands that may be present on your filesystems and introduced the locate tool for the job. Here we explain how the important updatedb tool can help.

Well Situated

Incidentally, you might get a little perplexed when trying to look up the manuals for updatedb and the locate command. Even though the package is actually called mlocate and the binary sits at /usr/bin/updatedb on my filesystem, you'll probably want to use variations of the following man commands to find what you're looking for:

# man locate


# man updatedb


# man updatedb.conf

Let’s look at the important updatedb command in a little more detail now. It’s worth mentioning that, after installing the locate utility, you will need to initialize your file-list database before doing anything else. You have to do this as the “root” user in order to reach all the relevant areas of your filesystems or the locate command will complain. Initialize or update your database file, whenever you like, with this command:

# updatedb

Obviously, the first time this command is run it may take a little while to complete, but when I’ve installed the locate command afresh I’ve almost always been pleasantly surprised at how quickly it finishes. After a hop, a skip, and a jump, you can immediately query your file database. However, let’s wait a moment before doing that.

We’re dutifully informed by its manual that the database created as a result of running updatedb resides at the following location: /var/lib/mlocate/mlocate.db.
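If you're curious, a quick sanity check will confirm that the database exists and show when it was last refreshed (the path is the one quoted above, so adjust it if your distribution keeps the file elsewhere):

# ls -lh /var/lib/mlocate/mlocate.db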

If you want to change how updatedb is run, then you need to adjust its config file, which, as a reminder, should live at /etc/updatedb.conf. Listing 1 shows the contents of it on my system:

PRUNE_BIND_MOUNTS = "yes"

PRUNEFS = "9p afs anon_inodefs auto autofs bdev binfmt_misc cgroup cifs coda configfs 
cpuset debugfs devpts ecryptfs exofs fuse fusectl gfs gfs2 hugetlbfs inotifyfs iso9660 
jffs2 lustre mqueue ncpfs nfs nfs4 nfsd pipefs proc ramfs rootfs rpc_pipefs securityfs 
selinuxfs sfs sockfs sysfs tmpfs ubifs udf usbfs"

PRUNENAMES = ".git .hg .svn"

PRUNEPATHS = "/afs /media /net /sfs /tmp /udev /var/cache/ccache /var/spool/cups 
/var/spool/squid /var/tmp"

Listing 1: The innards of the file /etc/updatedb.conf which affects how our database is created.

The first thing that my eye is drawn to is the PRUNENAMES section. As you can see, by stringing together a list of directory names, delimited with spaces, you can tell updatedb to ignore them. One caveat is that only directory names can be skipped, and you can’t use wildcards. As we can see, the otherwise-hidden files in a Git repository (everything under a .git directory) are a good example of putting this option to use.
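For instance, if you also wanted the sprawling node_modules directories that JavaScript projects generate to be skipped (a hypothetical addition, not part of the stock config), you would simply extend that line in /etc/updatedb.conf:

PRUNENAMES = ".git .hg .svn node_modules"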

If you need to be more specific, then, again using spaces to separate your entries, you can instruct the locate command to ignore certain paths. Imagine, for example, that you’re generating a whole host of temporary files overnight which are only valid for one day. You’re aware that this is a special directory of sorts which employs a familiar naming convention for its thousands of files. It would take the locate command a relatively long time to process the subtle changes every night, adding unnecessary stress to your system. The solution is, of course, to simply add it to your faithful “ignore” list.
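To illustrate, suppose those short-lived files land under /scratch/overnight (a made-up path for the sake of the example). Appending it to the existing PRUNEPATHS line in /etc/updatedb.conf and then refreshing the database would look like this:

PRUNEPATHS = "/afs /media /net /sfs /tmp /udev /var/cache/ccache /var/spool/cups /var/spool/squid /var/tmp /scratch/overnight"

# updatedb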

Well Appointed

As seen in Listing 2, the file /etc/mtab offers not just a list of the more familiar filesystems such as /dev/sda1 but also a number of others that you may not immediately remember.

/dev/sda1 /boot ext4 rw,noexec,nosuid,nodev 0 0

proc /proc proc rw 0 0

sysfs /sys sysfs rw 0 0

devpts /dev/pts devpts rw,gid=5,mode=620 0 0

/tmp /var/tmp none rw,noexec,nosuid,nodev,bind 0 0

none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0

Listing 2: A mashed up example of the innards of the file /etc/mtab.

Some of the filesystems shown in Listing 2 contain ephemeral content and indeed content that belongs to pseudo-filesystems, so it is clearly important to ignore their files — if for no other reason than because of the stress added to your system during each overnight update.
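Incidentally, those overnight updates typically arrive courtesy of cron. Many distributions ship a daily job such as /etc/cron.daily/mlocate, but if yours doesn't, an /etc/crontab-style entry along these lines (the 3 a.m. timing is an arbitrary choice) achieves the same thing:

0 3 * * * root /usr/bin/updatedb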

In Listing 1, the PRUNEFS option takes care of ignoring those filesystems, ditching the ones that aren't suitable in most cases. As you can see, there are quite a few different filesystems to consider:

PRUNEFS = "9p afs anon_inodefs auto autofs bdev binfmt_misc cgroup cifs coda configfs 
cpuset debugfs devpts ecryptfs exofs fuse fusectl gfs gfs2 hugetlbfs inotifyfs iso9660 jffs2 
lustre mqueue ncpfs nfs nfs4 nfsd pipefs proc ramfs rootfs rpc_pipefs securityfs selinuxfs 
sfs sockfs sysfs tmpfs ubifs udf usbfs"

The updatedb.conf manual succinctly informs us of the following in relation to the PRUNE_BIND_MOUNTS option:

“If PRUNE_BIND_MOUNTS is 1 or yes, bind mounts are not scanned by updatedb(8).  All file systems mounted in the subtree of a bind mount are skipped as well, even if they are not bind mounts.  As an exception, bind mounts of a directory on itself are not skipped.”
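To see what that means in practice, here is a minimal sketch (the paths are invented for illustration): create a bind mount, refresh the database, and the bind-mounted view of the files should then be absent from your results, even though the originals under the source directory are still indexed:

# mount --bind /srv/data /mnt/data-view

# updatedb

# locate /mnt/data-view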

Assuming that makes sense, you should note one thing before moving on to some locate command examples. Apart from some versions, the updatedb command can also be told to ignore certain “non-directory files.” However, this does not apply everywhere, so don’t blindly copy and paste config between versions if you use such an option.

In Need of Modernization

As mentioned earlier, there are times when finding a specific file needs to be so quick that it’s at your fingertips before you’ve consciously recalled the command. This is the irrefutable beauty of the locate command.

And, if you’ve ever sat in front of a horrendously slow Windows machine watching the hard disk light flash manically as if it were suffering a conniption due to the indexing service running, then I can assure you that the performance that you’ll receive from the updatedb command will be a welcome relief.

You should bear in mind that, unlike with the find command, there’s no need to remember the base paths under which your file might be residing. By that, I mean that all of your (hopefully) relevant filesystems are immediately accessed with one simple command, and remembering paths is almost a thing of the past.

In its simplest form, the locate command looks like this:

# locate chrisbinnie.pdf

There’s also no need to escape hidden files that start with a dot or indeed expand a search with an asterisk:

# locate .bash

Listing 3 shows us what has been returned, in an instant, from the many partitions the clever locate command has scanned previously.

/etc/bash_completion.d/yum.bash

/etc/skel/.bash_logout

/etc/skel/.bash_profile

/etc/skel/.bashrc

/home/chrisbinnie/.bash_history

/home/chrisbinnie/.bash_logout

/home/chrisbinnie/.bash_profile

/home/chrisbinnie/.bashrc

/usr/share/doc/git-1.5.1/contrib/completion/git-completion.bash

/usr/share/doc/util-linux-ng-2.16.1/getopt-parse.bash

/usr/share/doc/util-linux-ng-2.16.1/getopt-test.bash

Listing 3: The search results from running the command: “locate .bash”

I suspect that the following usage has altered slightly from back in the day, when the slocate command (or possibly the original locate command) was more popular, but you can receive different results by adding an asterisk to that query, like so:

# locate .bash*

In Listing 4, you can see the difference from Listing 3. Thankfully, the results make more sense now that we can see them together. In this case, the addition of the asterisk is asking the locate command to return files beginning with .bash as opposed to all files containing that string of characters.

/etc/skel/.bash_logout

/etc/skel/.bash_profile

/etc/skel/.bashrc

/home/d609288/.bash_history

/home/d609288/.bash_logout

/home/d609288/.bash_profile

/home/d609288/.bashrc

Listing 4: The search results from running the command: “locate .bash*” with the addition of an asterisk.

Stay tuned for next time when we learn more about the amazing simplicity of using the locate command on a day-to-day basis.

Learn more about essential sysadmin skills: Download the Future Proof Your SysAdmin Career ebook now.

Chris Binnie’s latest book, Linux Server Security: Hack and Defend, shows how hackers launch sophisticated attacks to compromise servers, steal data, and crack complex passwords, so you can learn how to defend against these attacks. In the book, he also talks you through making your servers invisible, performing penetration testing, and mitigating unwelcome attacks. You can find out more about DevSecOps and Linux security via his website (http://www.devsecops.cc).

What Is OpenHPC?

High performance computing (HPC)—the aggregation of computers into clusters to increase computing speed and power—relies heavily on the software that connects and manages the various nodes in the cluster. Linux is the dominant HPC operating system, and many HPC sites expand upon the operating system’s capabilities with different scientific applications, libraries, and other tools.

As HPC began developing, it became apparent that there was considerable duplication and redundancy among the HPC sites compiling HPC software, and sometimes dependencies between the different software components made installations cumbersome. The OpenHPC project was created in response to these issues. OpenHPC is a community-based effort to solve common tasks in HPC environments by providing documentation and building blocks that can be combined by HPC sites according to their needs.

Read more at OpenSource.com

How Enterprise IT Uses Kubernetes to Tame Container Complexity

Running a few standalone containers for development purposes won’t rob your IT team of time or patience: A standards-based container runtime by itself will do the job. But once you scale to a production environment and multiple applications spanning many containers, it’s clear that you need a way to coordinate those containers to deliver the individual services. As containers accumulate, complexity grows. Eventually, you need to take a step back and group containers along with the coordinated services they need, such as networking, security, and telemetry.

That’s why technologies like the open source Kubernetes project are such a big part of the container scene.

Kubernetes automates and orchestrates Linux container operations. It eliminates many of the manual processes involved in deploying and scaling containerized applications. 
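As a tiny taste of that automation, here is a hedged sketch using the kubectl client (the deployment name "web" and the nginx image are placeholders): one command deploys a containerized application, a second scales it out, and a third wires up the networking in front of the replicas:

$ kubectl create deployment web --image=nginx

$ kubectl scale deployment web --replicas=3

$ kubectl expose deployment web --port=80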

Read more at EnterprisersProject

DevOps, Agile, and Continuous Delivery: What IT Leaders Need to Know

Enterprises across the globe have implemented the Agile methodology of software development and reaped its benefits in terms of shorter development times. Agile has also helped streamline processes in multilevel software development teams. And the methodology builds in feedback loops and drives the pace of innovation. Over time, DevOps and continuous delivery have emerged as more holistic, mature approaches to managing the software development life cycle (SDLC), with a view to improving speed to market, reducing errors, and enhancing quality.

In this guide, we will talk more about Agile, how it grew, and how it extended into DevOps and, ultimately, continuous delivery.

Read more at TechGenix

11 Top Tools to Assess, Implement, and Maintain GDPR Compliance

The European Union’s General Data Protection Regulation (GDPR) goes into effect in May 2018, which means that any organization doing business in or with the EU has six months from this writing to comply with the strict new privacy law. The GDPR applies to any organization holding or processing personal data of EU citizens, and the penalties for noncompliance can be stiff: up to €20 million (about $24 million) or 4 percent of annual global turnover, whichever is greater. Organizations must be able to identify, protect, and manage all personally identifiable information (PII) of EU residents even if those organizations are not based in the EU.

Some vendors are offering tools to help you prepare for and comply with the GDPR. What follows is a representative sample of tools to assess what you need to do for compliance, implement measures to meet requirements, and maintain compliance once you reach it.

Read more at CSO

How AMD Wants to Provide ‘Supercomputing for All’

At the SC17 supercomputing conference in Denver Nov. 13, AMD and some of its ecosystem partners announced the availability of a suite of new, high-performance systems powered by AMD EPYC CPUs (central processing units) and AMD Radeon Instinct GPUs (graphics processing units) to accelerate the use of supercomputing in smaller data centers.

AMD combines this portfolio with new software, including the new ROCm 1.7 open platform with updated development tools and libraries, enabling complete AMD EPYC-based PetaFLOPS systems.

Supports Various Environments

By supporting both heterogeneous supercomputing systems and memory-bound, CPU-driven, high-performance platforms with EPYC, AMD claims it can address the needs of multiple workloads with up to a 3X advantage in performance per dollar for the EPYC 7601 versus Intel’s Xeon Platinum 8180M. 

Read more at eWeek

How OpenChain Can Transform the Supply Chain

OpenChain is all about increasing open source compliance in the supply chain. This issue, which many people initially dismiss as a legal concern or a low priority, is actually tied to making sure that open source is as useful and frictionless as possible. In a nutshell, because open source is about the use of third-party code, compliance is the nexus where equality of access, safety of use, and reduction of risk can be found. OpenChain accomplishes this by building trust between organizations.

Many companies today understand open source and act as major supporters of open source development; however, addressing open source license compliance in a systematic, industry-wide manner has proven to be a somewhat elusive challenge. The global IT market has not seen a significant reduction in the number of open source compliance issues in areas such as consumer electronics over the past decade.

Read more at OpenSource.com

LTS Linux Kernel 4.14: No Regressions

Linus Torvalds released version 4.14 of the Linux kernel on Sunday, Nov. 12 — which was a week later than expected. The delay was due to some reverts that would have made the projected Nov. 5 release too early.

One of the unsettling reverts was regarding an AppArmor patch that was causing a regression, a big no-no according to Torvalds, who stated the first rule of Linux kernel development: “we don’t cause regressions.” After some back and forth, Linus reverted the offending commit himself and the problem was temporarily solved.

And now the new kernel is here: Linux 4.14 is the 2017 Long-Term Stable (LTS) release of the kernel and will be supported for about two years. Greg Kroah-Hartman made the announcement in his blog and added that he would be supporting 4.14 with stable kernel patch backports “unless it is a horrid release,” which, despite the delaying issues, doesn’t seem to be the case.

Something else that was pending and has finally been addressed in this kernel is the removal of in-tree kernel firmware. This will help better enforce placing firmware blobs in the linux-firmware.git repository. Before David Woodhouse created that Git repository, proprietary firmware blobs were submitted to an in-tree kernel firmware/ directory, but it has been dormant for years. Deleting it gets rid of any ambiguity and lightens the kernel source by some 100,000 lines.

Zstd (Zstandard) is also something new that has been integrated into kernel 4.14. Zstd is a compression technology for filesystems that achieves compression ratios similar to zlib's but is much faster. Zstd was originally developed at Facebook and has already been tested extensively in production environments.
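You don't need a filesystem to feel the difference. With the ordinary gzip and zstd command-line tools installed (and "bigfile" standing in for any large file of your own), a rough comparison of speed and resulting sizes looks like this:

# time gzip -k bigfile

# time zstd -k bigfile

# ls -lh bigfile*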

Other stuff that’s new in kernel 4.14

  • A virtual machine shake-up has led to improvements in the speed and performance of KVM, Xen, and Microsoft’s Hyper-V. Interestingly enough, in the case of the latter, most changes have come not from Redmond but from Red Hat engineers.

  • The Raspberry Pi now has HDMI CEC support built into the mainline kernel. CEC, or “Consumer Electronics Control,” allows users to control devices over HDMI using a single controller; think using one remote to control both your Pi and a TV connected to it.

  • Several fixes to EFI support ensure that reboots are handled correctly and, by enabling the wiping of RAM after a warm reboot, that they are now more secure.

To find out more, check out the writeups at Kernel Newbies and Phoronix.

You can learn more about the Linux kernel development process and read featured developer profiles in the new 2017 Linux Kernel Development Report. Download the free report now.

The CNCF Just Got 36 Companies to Agree to a Kubernetes Certification Standard

The Cloud Native Computing Foundation (CNCF) announced today that 36 members have agreed to a set of certification standards for Kubernetes, the immensely popular open source container orchestration tool. This should make it easy for users to move from one version to another without worry, while ensuring that containers under Kubernetes management will behave in a predictable way.

The group of 36 is agreeing to a base set of APIs that have to underlie any version of Kubernetes a member creates, to guarantee portability. Dan Kohn, executive director at CNCF, says that they took a subset of existing Kubernetes project APIs, which are treated as a conformance test that the members who have signed on are guaranteeing to support. In practice, this means that when you spin up a new container, regardless of who created the version of Kubernetes, it will behave in a consistent way, he said.

Read more at TechCrunch

Autodesk’s Shift to Open Source and Inner Source

Autodesk is undergoing a company-wide shift to open source and inner source. And that’s on top of the culture change that both development methods require.

Inner source means applying open source development practices and methodologies to internal projects, even if the projects are proprietary. And the culture change required to be successful can be a hard shift from a traditional corporate hierarchy to an open approach. Even though they’re connected, all three changes are distinct heavy lifts.

Autodesk began by hiring Guy Martin as Director of Open Source Strategy in the Engineering Practice at Autodesk, which was designed to transform engineering across the company. Naturally, open source would play a huge role in that effort, including spurring the use of inner source. But neither would flourish if the company culture didn’t change. And so the job title swiftly evolved to Director of Open @ADSK.

Read more at The Linux Foundation