
Know Your Storage: Block, File & Object

Dealing with the tremendous amount of data generated today is a major challenge, both for the companies that create or consume that data and for the tech companies building the storage systems to hold it.

“Data is growing exponentially each year, and we find that the majority of data growth is due to increased consumption and industries adopting transformational projects to expand value. Certainly, the Internet of Things (IoT) has contributed greatly to data growth, but the key challenge for software-defined storage is how to address the use cases associated with data growth,” said Michael St. Jean, principal product marketing manager, Red Hat Storage.

Every challenge is an opportunity. “The deluge of data being generated by old and new sources today is certainly presenting us with opportunities to meet our customers’ escalating needs in the areas of scale, performance, resiliency, and governance,” said Tad Brockway, General Manager for Azure Storage, Media and Edge.

Trinity of modern software-defined storage

There are three different kinds of storage solutions — block, file, and object — each serving a different purpose while working with the others.

Block storage is the oldest form of data storage, where data is stored in fixed-length blocks or chunks of data. Block storage is used in enterprise storage environments and is usually accessed over a Fibre Channel or iSCSI interface. “Block storage requires an application to map where the data is stored on the storage device,” according to SUSE’s Larry Morris, Sr. Product Manager, Software Defined Storage.

Block storage is virtualized in storage area networks and software-defined storage systems: abstracted logical devices reside on a shared hardware infrastructure and are created and presented to the host operating system of a server, virtual server, or hypervisor via protocols like SCSI, SATA, SAS, FCP, FCoE, or iSCSI.

“Block storage splits a single storage volume (like a virtual or cloud storage node, or a good old fashioned hard disk) into individual instances known as blocks,” said St. Jean.

Each block exists independently and can be formatted with its own data transfer protocol and operating system, giving users complete configuration autonomy. Because block storage systems aren’t burdened with the investigative file-finding duties of file storage systems, block storage is faster. Pairing that speed with configuration flexibility makes block storage ideal for raw server storage or rich media databases.

Block storage can be used to host operating systems, applications, databases, entire virtual machines, and containers. Traditionally, block storage can only be accessed by the individual machine, or the machines in a cluster, to which it has been presented.
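
To make this concrete, here is a minimal sketch of putting a newly presented block device to use on a Linux host; the device name /dev/sdb is an assumption for illustration:

# List the block devices the host can see; assume the new volume appears as /dev/sdb
lsblk
# Lay a filesystem onto the raw device (this is where file storage gets layered on top of block storage)
sudo mkfs.ext4 /dev/sdb
# Mount it so the operating system and applications can use it
sudo mkdir -p /mnt/data
sudo mount /dev/sdb /mnt/data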

File-based storage

File-based storage uses a filesystem to map where the data is stored on the storage device. It’s a dominant technology used on direct- and network-attached storage systems, and it takes care of two things: organizing data and representing it to users. “With file storage, data is arranged on the server side in the exact same format as the clients see it. This allows the user to request a file by some unique identifier — like a name, location, or URL — which is communicated to the storage system using specific data transfer protocols,” said St. Jean.

The result is a type of hierarchical file structure that can be navigated from top to bottom. File storage is layered on top of block storage, allowing users to see and access data as files and folders, but restricting access to the blocks that stand up those files and folders.

“File storage is typically represented by shared filesystems like NFS and CIFS/SMB that can be accessed by many servers over an IP network. Access can be controlled at a file, directory, and export level via user and group permissions. File storage can be used to store files needed by multiple users and machines, application binaries, databases, virtual machines, and can be used by containers,” explained Brockway.
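
As a quick illustration of the shared-filesystem model Brockway describes, here is a minimal sketch of mounting an NFS export on a Linux client; the server name and export path are assumptions for illustration:

# Mount an NFS export from a file server; many clients can mount this same export over the IP network
sudo mkdir -p /mnt/shared
sudo mount -t nfs fileserver.example.com:/exports/shared /mnt/shared
# The shared data now appears as an ordinary directory tree of files and folders
ls /mnt/shared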

Object storage

Object storage is the newest form of data storage. It provides a repository for unstructured data that separates content from indexing and allows multiple files to be concatenated into an object. An object is a piece of data paired with any associated metadata that provides context about the bytes it contains (things like how old or big the data is); the data and metadata together make the object.

One advantage of object storage is the unique identifier associated with each piece of data. Accessing the data involves using the unique identifier and does not require the application or user to know where the data is actually stored. Object data is accessed through APIs.

“The data stored in objects is uncompressed and unencrypted, and the objects themselves are arranged in object stores (a central repository filled with many other objects) or containers (a package that contains all of the files an application needs to run). Objects, object stores, and containers are very flat in nature — compared to the hierarchical structure of file storage systems — which allow them to be accessed very quickly at huge scale,” explained St. Jean.

Object stores can scale to many petabytes to accommodate the largest datasets and are a great choice for images, audio, video, logs, backups, and data used by analytics services.
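
Since objects are addressed by identifier through an API rather than by filesystem path, access typically goes over HTTP. Here is a minimal sketch using the widely supported S3-style command line interface; the bucket and key names are hypothetical:

# Store a backup as an object; 'my-bucket' and the key are hypothetical names
aws s3 cp backup.tar.gz s3://my-bucket/backups/2018-10-01.tar.gz
# Retrieve the object later by its identifier (bucket + key), without knowing where the data physically lives
aws s3 cp s3://my-bucket/backups/2018-10-01.tar.gz ./restored.tar.gz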

Conclusion

Now you know about the various types of storage and how they are used. Stay tuned to learn more about software-defined storage as we examine the topic in the future.

Join us at Open Source Summit + Embedded Linux Conference Europe in Edinburgh, UK on October 22-24, 2018, for 100+ sessions on Linux, Cloud, Containers, AI, Community, and more.

5 Things You Should Be Monitoring

Whether you’re a developer building websites or internal applications, or an administrator building the infrastructure to back them, your job doesn’t stop once they’re up and running. Machine failure, releases containing bugs, and growth in usage can all lead to problems that need to be dealt with. To detect them, you need monitoring.

But monitoring can do more than just send you alerts about the things that are going wrong. It can also help you debug those problems and prevent them in the future. So what things should you be monitoring?

1. Latency

Faster web pages lead to happier users. The opposite is also true: increased latency leads to user dissatisfaction, and it can also be the first warning sign that your system is strained. Launching resource-intensive features and serving growing numbers of user requests both add load, and as servers die, the remaining ones carry more of it. In fact, latency tends to increase nonlinearly in response to load due to increased contention, so small latency increases today could indicate bigger latency increases in the future; early awareness gives you some time to fix any issues.

Latency is generally measured from two perspectives: your users and your system.
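
For the user perspective, one simple way to sample end-to-end latency is to time a complete request from a client. A minimal sketch with curl; the URL is a placeholder:

# Time a full request as a user would experience it (connection setup plus transfer)
curl -o /dev/null -s -w 'connect: %{time_connect}s total: %{time_total}s\n' https://example.com/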

Read more at O’Reilly

Distributed Tracing Infrastructure with Jaeger on Kubernetes

Kubernetes has become the de facto orchestrator for microservices infrastructure and deployment. The ecosystem is extremely rich and one of the fastest growing in the open-source community. A monitoring infrastructure with Prometheus, ElasticSearch, Grafana, Envoy/Consul, and Jaeger/Zipkin makes up a solid foundation for metrics, logging, dashboards, service discovery, and distributed tracing across the stack.

Distributed Tracing

Distributed tracing enables capturing requests and building a view of the entire chain of calls, all the way from user requests to interactions between hundreds of services. It also enables instrumenting application latency (how long each request took), tracking the lifecycle of network calls (HTTP, RPC, etc.), and identifying performance issues by gaining visibility into bottlenecks.

The following sections go over enabling distributed tracing with Jaeger for gRPC services in a Kubernetes setup. The Jaeger GitHub org has a dedicated repo with various deployment configurations of Jaeger for Kubernetes. These are excellent examples to build on, and I will try to break down each Jaeger component and its Kubernetes deployment.
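
If you want to experiment before building out the full Kubernetes deployment, the project also publishes an all-in-one image that runs the agent, collector, query service, and UI in a single container. A minimal local sketch; the port mappings follow the image's documented defaults:

# Run Jaeger all-in-one locally: agent on 6831/udp, web UI on 16686
docker run -d --name jaeger -p 6831:6831/udp -p 16686:16686 jaegertracing/all-in-one:latest
# Then open http://localhost:16686 to search for traces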

Read more at Medium

Mozilla Identifies 10 Open Source Personas: What You Need to Know

Participating in open source communities—or in any open organization, for that matter—means collaborating with others who might not operate the same way you do. Their motivations may differ. Their governance models might seem foreign. Their goals might not immediately speak to you. So if you’re going to work together, you’ll need to develop a clear sense of what makes the project tick—and decide quickly whether working together is best for your team and your business.

Similarly, if you’re instigating an open source project, you should ask yourself, “what kind of community do I want to attract?” Then you can plan for and signal that accordingly.

Earlier this year, Mozilla and Open Tech Strategies released the first version of a tool we hope will help with this. Our recent report, “Open Source Archetypes,” identifies 10 general types (or “archetypes”) of open communities in their strategic contexts. The report offers narratives describing these archetypes, explains what motivates them, and outlines the strategic benefits of working with them. We also cite some examples of each archetype and offer insights into the various licensing models, governance models, and community standards that characterize them.

Read more at OpenSource.com

Open Source Eases AT&T’s Technical Burden

At this week’s AT&T Spark event in San Francisco, Amy Wheelus, vice president of Network Cloud at the carrier, explained that AT&T is an “open source-first shop.”

AT&T’s embrace of the open source community was echoed by Wheelus’ colleague Catherine Lefèvre, associate vice president for Network Cloud and infrastructure at AT&T Labs, who said the carrier’s work with that ecosystem is very collaborative. 

…As part of a panel discussion tied to the open source topic, Arpit Joshipura, general manager for networking and orchestration at the Linux Foundation, said that while this new operating model is a big change for legacy telecom vendors, he also sees an opportunity in this upheaval for those traditional vendors to oversee actual deployments.

“There is an opportunity in the open source world for those traditional vendors to be systems integrators,” Joshipura said. He explained that those vendors have a long history of knowing exactly what telecom operators and networks need in terms of support and could lead in making “those open source projects more distributed and hardened for telecom.”

Read more at SDxCentral

ACM’s Code of Ethics Offers Updated Guidelines for Computing Professionals

The Association for Computing Machinery (ACM) has released an update to its Code of Ethics and Professional Conduct geared at computing professionals. The update was done “to address the significant advances in computing technology and the degree [to which] these technologies are integrated into our daily lives,” explained ACM members Catherine Flick and Michael Kirkpatrick, writing on Reddit.

This marks the first update since 1992 to the Code, which the ACM maintains “expresses the conscience of the profession.” The goal is to ensure it “reflects the experiences, values and aspirations of computing professionals around the world,’’ Flick and Kirkpatrick said.

The Code was written to guide computing professionals’ ethical conduct and applies to anyone using computing technology “in an impactful way.” It also serves as a basis for remediation when violations occur. The Code contains principles developed as statements of responsibility, in the belief that “the public good is always the primary consideration.”

Ethical Decision Making

In its entirety, the ACM says the Code “is concerned with how fundamental ethical principles apply to a computing professional’s conduct. The Code is not an algorithm for solving ethical problems; rather it serves as a basis for ethical decision-making.”

It is divided into four sections: General Ethical Principles; Professional Responsibilities; Professional Leadership Principles; and Compliance with the Code.

The General Ethical Principles section discusses the role of a computer professional, saying they should contribute to society, with an acknowledgement “that all people are stakeholders in computing.” This section addresses the “obligation” of computing professionals to use their skills for the benefit of society.

“An essential aim of computing professionals is to minimize negative consequences of computing, including threats to health, safety, personal security, and privacy,’’ the code advises. “When the interests of multiple groups conflict, the needs of those less advantaged should be given increased attention and priority.”

Computing professionals should perform high-quality work and maintain professional competence. They should also take diversity and social responsibility into consideration in their efforts and engage in pro bono or volunteer work benefitting the public good, the ACM recommends.

They should also try to avoid harm, in areas including “unjustified physical or mental injury, unjustified destruction or disclosure of information, and unjustified damage to property, reputation, and the environment.” To minimize the possibility of unintentionally or indirectly hurting others, computing professionals are advised to follow “generally accepted best practices unless there is a compelling ethical reason to do otherwise.” They should also carefully consider the consequences of “data aggregation and emergent properties of systems,” the ACM advises.

Computing professionals should also be honest, trustworthy, and transparent. They should “provide full disclosure of all pertinent system capabilities, limitations, and potential problems to the appropriate parties. Making deliberately false or misleading claims, fabricating or falsifying data, offering or accepting bribes, and other dishonest conduct are violations of the Code,” the ACM stresses. This also applies to honesty about their qualifications and any limitations in their ability to complete a task. They should also be fair and not discriminate against others, and “credit the creators of ideas, inventions, work and artifacts, and respect copyrights, patents, trade secrets, license agreements, and other methods of protecting authors’ works.”

With a nod to the ability of technology to collect, monitor and disseminate personal information, another call to action under the Ethical Principles section is respecting the privacy, rights and responsibilities associated with collecting and using personal information. Use of personal information should only be done for “legitimate ends and without violating the rights of individuals and groups,” the Code states.

A Position of Trust

Noting that computing professionals “are in a position of trust,” the Code says they have “a special responsibility to provide objective, credible evaluations and testimony to employers, employees, clients, users, and the public.” Consequently, these individuals “should strive to be perceptive, thorough, and objective when evaluating, recommending, and presenting system descriptions and alternatives.”

The Code also stresses that “extraordinary care should be taken to identify and mitigate potential risks in machine learning systems.” Other mandates in the Professional Responsibilities section include maintaining high standards of competence, conduct and ethical practice. Computing professionals should also only perform work in areas in which they are competent. They should also design and implement systems that are “robustly and usably secure,” the Code states.

The Professional Leadership Principles section, as the name suggests, deals with the attributes of a leader. These principles deal with the importance of ensuring computing work is done, again, with the public good in mind, and having procedures and attitudes oriented toward the welfare of society. Doing so, the Code suggests, will “reduce harm to the public and raise awareness of the influence of technology in our lives.”

Leaders should also enhance the quality of work life, articulate, apply and support the Code’s principles and create opportunities for people to grow as professionals. They should use care when changing or discontinuing support for systems/features, and help users understand “that timely replacement of inappropriate or outdated features or entire systems may be needed.”

Lastly, the ACM urges compliance with the Code’s principles and says violations should be treated “as inconsistent with membership in the ACM.”

10 Ways to Learn More about Open Source Software and Trends

When Forrester released its 2016 report “Open Source Powers Enterprise Digital Transformation,” some people in the open source community were surprised by the results. They weren’t surprised that 41 percent of enterprise decision makers called open source a high priority and planned to increase use of open source in their organizations. They were concerned that the other 59 percent didn’t seem to understand the role open source would play in the future of the enterprise.

Paul Miller, one of the analysts behind the report, wrote, “The myth that open source software is exclusively written by and for lonely – rather odd – individual geeks remains remarkably prevalent. And yet it’s a myth that is almost entirely wrong. Again and again, we encounter executives who do not grasp how much their organization already depends on open source. More importantly, they do not see the key role that open source technologies and thinking will play in enabling their efforts to transform into a customer-obsessed business that really can win, serve, and retain customers.”

Fast-forward to today, and open source skills are among the most in-demand: 83 percent of hiring managers surveyed for the 2018 Open Source Jobs report said hiring open source talent was a priority this year, up from 76 percent last year. 

Read more at Enterprisers Project

The Best Linux Apps for Chrome OS

Slowly but surely, Google is bringing support for Linux applications to Chrome OS. Even though the feature is primarily aimed at developers, like those who want to get Android Studio running on a Pixelbook, there are plenty of apps that can benefit normal users. We already have a guide about installing Linux apps on Chrome OS, but if you’re not sure what to try, this post may point you in the right direction.

This isn’t a simple compilation of the best Linux apps, because plenty of those lists exist already. Instead, the goal here is to recommend apps for tasks that web apps or Android applications can’t adequately handle. For example, serious photo editing isn’t really possible through the web, and the options on the Play Store are limited, but Gimp is perfect for it.

Read more at Android Police

Beginner’s Guide: How To Install Ubuntu Linux 18.04

Two surprising things happened this year in my personal tech life. Dell’s XPS 13 laptop became my daily driver, finally pulling me away from Apple’s MacBook Pro. Then I ditched Windows 10 in favor of Ubuntu. Now I’ve gone down the Linux rabbit hole and become fascinated with the wealth of open source (and commercial!) software available, the speed and elegance of system updates and the surprising turn of events when it comes to gaming on Linux. I’ve also seen a rising interest in Linux inside my community, so I wanted to craft a guide to help you install Ubuntu on your PC of choice. … Read more at Forbes.

How to Use the Netplan Network Configuration Tool on Linux

For years Linux admins and users have configured their network interfaces in the same way. For instance, if you’re an Ubuntu user, you could either configure the network connection via the desktop GUI or from within the /etc/network/interfaces file. The configuration was incredibly easy and never failed to work. The configuration within that file looked something like this:

auto enp10s0
iface enp10s0 inet static
    address 192.168.1.162
    netmask 255.255.255.0
    gateway 192.168.1.100
    dns-nameservers 1.0.0.1 1.1.1.1

Save and close that file. Restart networking with the command:

sudo systemctl restart networking

Or, if you’re using a non-systemd distribution, you could restart networking the old-fashioned way like so:

sudo /etc/init.d/networking restart

Your network will restart and the newly configured interface is good to go.

That’s how it’s been done for years. Until now. With certain distributions (such as Ubuntu Linux 18.04), the configuration and control of networking have changed considerably. Instead of that interfaces file and the /etc/init.d/networking script, we now turn to Netplan. Netplan is a command line utility for the configuration of networking on certain Linux distributions. Netplan uses YAML description files to configure network interfaces and, from those descriptions, generates the necessary configuration options for any given renderer tool.
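
You can watch that generation step happen without touching the live network. Here's a hedged sketch, assuming the default networkd renderer on Ubuntu Server 18.04:

# Translate the YAML descriptions into backend configuration without applying it
sudo netplan generate
# With the networkd renderer, the generated files land under /run/systemd/network
ls /run/systemd/network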

I want to show you how to use Netplan on Linux to configure a static IP address and a DHCP address. I’ll be demonstrating on Ubuntu Server 18.04. One word of warning: the .yaml files you create for Netplan must be consistent in their spacing, otherwise they’ll fail to work. You don’t have to use a specific amount of spacing for each line; it just has to remain consistent.

The new configuration files

Open a terminal window (or log into your Ubuntu Server via SSH). You will find the new configuration files for Netplan in the /etc/netplan directory. Change into that directory with the command cd /etc/netplan. Once in that directory, you will probably only see a single file:

01-netcfg.yaml

You can create a new file or edit the default. If you opt to edit the default, I suggest making a copy with the command:

sudo cp /etc/netplan/01-netcfg.yaml /etc/netplan/01-netcfg.yaml.bak

With your backup in place, you’re ready to configure.

Network Device Name

Before you configure your static IP address, you’ll need to know the name of the device to be configured. To do that, you can issue the command ip a and find out which device is to be used (Figure 1).

Figure 1: Finding our device name with the ip a command.

I’ll be configuring ens5 for a static IP address.
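
For reference, the relevant portion of the ip a output looks something like the following; the interface name, state, and addresses shown here are illustrative:

2: ens5: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 state UP
    inet 192.168.1.91/24 brd 192.168.1.255 scope global dynamic ens5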

Configuring a Static IP Address

Open the original .yaml file for editing with the command:

sudo nano /etc/netplan/01-netcfg.yaml

The layout of the file looks like this:

network:
    version: 2
    renderer: networkd
    ethernets:
        DEVICE_NAME:
            dhcp4: yes/no
            addresses: [IP/NETMASK]
            gateway4: GATEWAY
            nameservers:
                addresses: [NAMESERVER, NAMESERVER]

Where:

  • DEVICE_NAME is the actual device name to be configured.

  • yes/no is an option to enable or disable dhcp4.

  • IP is the IP address for the device.

  • NETMASK is the netmask for the IP address.

  • GATEWAY is the address for your gateway.

  • NAMESERVER is the comma-separated list of DNS nameservers.

Here’s a sample .yaml file:

network:
    version: 2
    renderer: networkd
    ethernets:
        ens5:
            dhcp4: no
            addresses: [192.168.1.230/24]
            gateway4: 192.168.1.254
            nameservers:
                addresses: [8.8.4.4,8.8.8.8]

Edit the above to fit your networking needs. Save and close that file.

Notice the netmask is no longer configured in the form 255.255.255.0. Instead, the prefix length is appended to the IP address in CIDR notation, so 192.168.1.230 with a 255.255.255.0 netmask becomes 192.168.1.230/24.

Testing the Configuration

Before we apply the change, let’s test the configuration. To do that, issue the command:

sudo netplan try

The above command validates the configuration before making it permanent. Netplan attempts to apply the new settings to the running system; if they succeed, you will see Configuration accepted and the settings are kept. Should the new configuration fail, Netplan will automatically revert to the previous working configuration.

Applying the New Configuration

If you are certain of your configuration file, you can skip the try option and go directly to applying the new options. The command for this is:

sudo netplan apply

At this point, you can issue the command ip a to see that your new address configurations are in place.

Configuring DHCP

Although you probably won’t be configuring your server for DHCP, it’s always good to know how to do this. For example, you might not know what static IP addresses are currently available on your network. You could configure the device for DHCP, get an IP address, and then reconfigure that address as static.

To use DHCP with Netplan, the configuration file would look something like this:

network:
    version: 2
    renderer: networkd
    ethernets:
        ens5:
            addresses: []
            dhcp4: true
            optional: true

Save and close that file. Test the file with:

sudo netplan try

Netplan should succeed and apply the DHCP configuration. You could then issue the ip a command, get the dynamically assigned address, and then reconfigure a static address. Or, you could leave it set to use DHCP (but seeing as how this is a server, you probably won’t want to do that).

Should you have more than one interface, you could name the second .yaml configuration file 02-netcfg.yaml. Netplan will apply the configuration files in numerical order, so 01 will be applied before 02. Create as many configuration files as needed for your server.
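
For example, a second file for a second NIC might look like the following minimal sketch; the file name follows the numbering convention above, and the interface name ens6 is an assumption:

# /etc/netplan/02-netcfg.yaml (hypothetical second interface, configured for DHCP)
network:
    version: 2
    renderer: networkd
    ethernets:
        ens6:
            dhcp4: true

Test and apply it with sudo netplan try and sudo netplan apply, just as before.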

That’s All There Is

Believe it or not, that’s all there is to using Netplan. Although it is a significant change to how we’re accustomed to configuring network addresses, it’s not all that hard to get used to. But this style of configuration is here to stay… so you will need to get used to it.

Learn more about Linux through the free “Introduction to Linux” course from The Linux Foundation and edX.