
Open FinTech Forum Offers Tips for Open Source Success

2018 marks the year that open source disrupts yet another industry, and this time it’s financial services. The first-ever Open FinTech Forum, happening October 10-11 in New York City, focuses on the intersection of financial services and open source. It promises to provide attendees with guidance on building internal open source programs along with an in-depth look at cutting-edge technologies being deployed in the financial sector, such as AI, blockchain/distributed ledger, and Kubernetes.

Several factors make Open FinTech Forum special, but the in-depth sessions on day 1 especially stand out. The first day offers five technical tutorials, as well as four working discussions covering open source in an enterprise environment, setting up an open source program office, ensuring license compliance, and best practices for contributing to open source projects.

Enterprise open source adoption has its own set of challenges, but it becomes easier if you have a clear plan to follow. 

Read more at The Linux Foundation

How GitHub Can Be The Most Powerful Ticketing Tool

 


Compared to other ticketing tools, GitHub Issues is the only platform that gives you complete freedom to define whatever label types you want. All other tools take an opinionated stance on label types, such as priority, severity, component, epic, etc. Given that GitHub hosted some 25 million active public repositories at the end of 2017, and that most of these public (that is, open-source) projects are managed through GitHub, you might wonder whether any best practices have emerged. Are there best practices unique to GitHub that cannot be applied to other tools? Should we all switch to GitHub to manage our projects there?

In the past few years, I’ve contributed to the development of two developer platforms – CodinGame and Tech.io. Together they total more than 1 million developers. I’ve recently co-founded another company, Anaxi with the mission of delivering actionable business intelligence for the whole software engineering organization. So, part of my job is identifying growing trends in software development tools. In other words, I think about this kind of thing quite a lot!

First, let’s analyze the 20 most popular open-source projects, how they are structured, and which labels they use.

Then, we will try to find common patterns, to help us understand when and how those best practices can be applied to your project.

Finally, we will compare with other existing project management tools, so you can decide whether GitHub is worth using as your project management tool.

The Top Projects on GitHub

We analyzed the 30 highest-velocity open-source projects on GitHub listed by the Linux Foundation, and selected what we felt were the most well-organized projects. We then analyzed the labels they used to organize their issues, and especially what we call their label categories.

What do we mean by a label category? Some labels follow a pattern like area/networking, area/hosting, area/backup: “area” is the label category, under which there can be many labels.

Some projects use “/” to define categories, as in the example above, while others use “:”, as Tensorflow and Angular do.
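To make the convention concrete, here is a small shell sketch that extracts the category prefix from a label, handling both separator styles. The label names are illustrative, not taken from any particular project:

```shell
# Extract the category prefix from a label name.
# Handles both separator conventions described above:
#   "/" (Kubernetes style) and ":" (Tensorflow/Angular style).
label_category() {
  case "$1" in
    */*) printf '%s\n' "${1%%/*}" ;;   # area/networking -> area
    *:*) printf '%s\n' "${1%%:*}" ;;   # type: bug -> type
    *)   printf '%s\n' "$1" ;;         # plain label, no category
  esac
}

label_category "area/networking"   # area
label_category "type: bug"         # type
label_category "wontfix"           # wontfix
```

Grouping a repository’s labels by this prefix is enough to recover the category structure discussed below.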

Here is the list of the projects we selected along with the label categories they use:

[Table: the selected projects and the label categories they use]

By the way, did you notice that most projects are backed by big corporations? Tensorflow and AngularJS by Google, React by Facebook, Moby by Docker, Ansible by Red Hat and ElasticSearch by Elastic. Of the 30 projects we analyzed, 9 were backed by foundations and only 6 were not backed by any entity. We analyzed them all and kept the best in class in terms of project organization, ending up (not on purpose) with one of each: Kubernetes by the CNCF (a foundation), and DefinitelyTyped (not backed).

OK, let’s dive in and see what is interesting here.

The Patterns and Best Practices

We identified 7 different types of label categories that those projects were using. If you think it would make more sense to organize the categories differently, don’t hesitate to leave a response – we’re all ears. But here is our take on it.

Type or Kind:

In other project management tools, you would get bug, feature, task or subtask here. But as you can see, GitHub projects expand far beyond this with experimental, discussion, technical-debt, failing-test, docs and much more. That could be interesting to any company: right now, we generally file everything that is not a bug or a feature as a task, but being able to specify exactly what a ticket is should be valuable, whatever the size of your team.

Status/State or Triage/Resolution or Lifecycle:

We grouped these label categories together, as they all convey the state of the ticket, but each one has its nuances. That’s why Kubernetes uses all three – Status, Triage and Lifecycle. Triage and Resolution are used similarly; they explain how the ticket reached its Status/State. Lifecycle, meanwhile, describes how the ticket is evolving within that state, with values such as active, frozen, rotten and stale. If you have a large number of tickets to handle, this becomes very handy: you can set up a process to resurface tickets, or stow them away so you can focus on what matters.

Priority or Severity or Frequency or Workaround:

We put all four of them together, as they address the same overall question, but here too there are important nuances. You could think of severity, frequency and workaround as three different dimensions that together explain the priority. And better prioritization can have a massive effect on your business.

Component or Area or Feature:

Unlike most of the other categories, these three are used interchangeably: you won’t see a project using both Component and Area, or Area and Feature, for instance. The interesting part is that some projects use subcategories, such as area/platform/... or area/os/…, or distinct label categories for the same purpose, like browser:.. or cli:.. This gives you two levels of labels to define more precisely which part of the code a ticket is really about. Within your team, one person could be responsible for a subcategory while another owns a full category and all its subcategories; the result is better delegation and accountability. This is especially helpful for big teams with big projects.

Difficulty or Size or Experience or Need:

This part is about what is required to solve this ticket. It can be time, experience or other dependencies. Other project management tools, like Pivotal Tracker, enable you to put estimates on tickets. The point here is that you can also add information about what level of experience is required for the ticket, which is pretty important for open-source projects. You will want newcomers to start with easy tasks. Angular also uses the label category “needs” with values such as breaking change, browser fix, docs, feedback, investigation, jquery fix, merging, more info, public api, review, squashing, test, work. It enables you to be more explicit about the work to be done to move the ticket forward. Typically, breaking change is very useful to know, so you can prepare the community or whole team for this change.

Milestone Related:

On GitHub, you can add milestones, but some projects add further label categories related to milestones, such as milestone/needs-approval, milestone/needs-attention or milestone/removed. This creates opportunities to discuss and decide; realistically, part of a weekly meeting could be dedicated to milestone/needs-attention. Ansible also uses labels indicating the versions a ticket can affect – affects_1.2, affects_2.3, etc. This is pretty important when you have clients and customers using previous versions (backward-compatibility issues), which is true of pretty much every piece of software unless it is a cloud-hosted SaaS product.

Pull Request Related:

GitHub is first and foremost used for code versioning, and therefore pull requests. Angular typically has the following label categories:

  • PR action: cleanup, discuss, merge, merge-assistance, review

  • PR state: WIP, blocked

  • PR target: master-only, master & patch, patch-only

Kubernetes has a do-not-merge/ category with values like blocked-paths, hold and work-in-progress.

These labels clarify why a ticket is in its current state. My personal feeling is that this doesn’t apply to most non-open-source software projects, though. Feel free to disagree!

So which ticketing tool is best?

First, let’s note that any additional information you ask your team to provide is an extra effort for them. If they don’t feel it is worth it, they will naturally not make the effort, and the information will simply be useless. So, if you’re considering changing your tools and processes, you’d better adopt only what is worthwhile for your particular project.

Let’s take 3 different examples to illustrate the point. In no way do I claim that the points below apply to your particular case; only you know what would or would not work for your team.

If you’re a startup with fewer than 30 engineers

I would think expanding the ticket types to documentation and refactoring still offers value. Your deployment processes shouldn’t be that complex, so you shouldn’t need additional labels around status and pull requests. Similarly, milestones shouldn’t hold many intricacies. Being able to record frequency, severity and whether there is a workaround still holds value, but in startup mode you would rather maximize time spent on output than on a clean process. Finally, the product shouldn’t be so complex that you need several layers of components. So overall, the flexibility GitHub offers is a nice-to-have, but you can clearly have very clean processes without it, by using other tools that offer better user experiences – like Trello, for instance.

If you have several teams working on several products

If you want some kind of visibility over all your projects, you’d better use the same tool across all your teams, so that you can build consistent reporting on top of it. Beyond that, the situation is pretty similar to the startup case; it really depends on the team sizes.

If you have large teams working on the same products

Typically, this would apply to any complex software, or indeed to most open-source projects. In that case, all the points listed with the label categories apply, and any additional label category could add value. The point is no longer individual output, but the best communication for the best collective output. That’s actually why a lot of enterprises build their own in-house ticketing tools, with little success. GitHub could be a great solution for them. There is a UX and visibility problem with GitHub, however, but it can be fixed. How? Read on.

But GitHub is missing a lot…?

In terms of project management features, GitHub doesn’t offer the best experience, although it has progressed a lot on this point lately. But we’re still missing quite a lot:

  • The interface lets you input labels only as text; there is no picker, so you have to know the structure of the labels beforehand.

  • You cannot make any label required when creating an issue.

  • Labels are great if you want to monitor the tickets that carry them, but unfortunately you can’t do that on GitHub.

  • Visibility over your projects on GitHub is, to be honest, pretty much absent.
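One way to paper over some of these gaps is to script against GitHub’s REST API, which does let you query open issues by label. A rough sketch follows; kubernetes/kubernetes and the lifecycle/stale label are just examples, while the /repos/{owner}/{repo}/issues endpoint and its labels query parameter are part of GitHub’s public v3 API:

```shell
OWNER=kubernetes
REPO=kubernetes
LABEL="lifecycle/stale"

# The "/" in category-style labels must be percent-encoded in the query string.
ENCODED=$(printf '%s' "$LABEL" | sed 's|/|%2F|g')
URL="https://api.github.com/repos/$OWNER/$REPO/issues?labels=$ENCODED&state=open"
echo "$URL"

# Count open issues carrying that label (requires curl and jq;
# only the first page of results, as pagination is omitted here):
# curl -s "$URL" | jq length
```

A cron job running queries like this is a crude but workable substitute for the per-label monitoring GitHub itself lacks.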

That’s why Jira is the most widely used tool in the world. You can partially customize your workflow (though unfortunately not to the extent GitHub allows); it offers a good-enough UX for managing tickets on a day-to-day basis, and a set of reports that gives you some kind of visibility.

What if you could integrate with GitHub and use your label categories as if they had been part of GitHub from the start (with pickers)? Then you could also monitor the issues for each of those labels and get the visibility you were missing. Better yet, what about integrating GitHub and Jira, so that if you use GitHub for some projects and Jira for others, you have a single interface for your reporting and the visibility you need?

I hope this article will help you think about your processes and, more precisely, the labels you use.

3 Docker Compose Features for Improving Team Development Workflow

A developer today is bombarded with a plethora of tools covering every possible problem you might have, but selecting which tools to use is The New Problem. Even in container-land, we’re swimming in an ocean of tool choices, most of which didn’t exist a few years ago.

I’m here to help. I make a living out of helping companies adopt a faster and more efficient workflow for developing, testing, packaging, and shipping code to servers. Today that means containers, but it’s often not just the tool that’s important; it’s the way you use it and the way you scale it in a team.

For now, let’s focus on Docker Compose. It has become the de facto standard for managing container-based developer environments across any major OS. For years, I’ve consistently heard about teams tossing out a list of tools and scripts this single tool replaces. That’s the reason people adopt Compose. It works everywhere, saves time, and is easy to understand.

Read more at O’Reilly

Blockchain Development Made Easy: Getting Started with Hyperledger Iroha

Our ‘Blockchain development made easy’ series continues with Hyperledger Iroha, a simple blockchain platform you can use to make trusted, secure, and fast applications. What are the advantages and how can developers get started with it? We talked to Makoto Takemiya, co-founder and co-CEO of Soramitsu about what’s under this project’s hood.

JAXenter: Is Hyperledger Iroha some sort of supplement to Fabric and Sawtooth? How does that work?

Makoto Takemiya: Hyperledger Iroha is an open-source, distributed ledger supported by an open source community of developers. Hyperledger Iroha has its own technical properties and vision, which are just as important as the vision and technical characteristics of the other blockchain platforms governed by the Hyperledger Project, run by the Linux Foundation. There are numerous use cases and different applications, so all platforms are important for users to be able to test and select the blockchain platform that performs best in their specific use case.

Iroha contributes to the diversity of the Hyperledger frameworks. Hyperledger Iroha is written in C++ and has a small set of commands and queries focused on enabling financial applications, digital asset management, and digital identity use-cases for enterprises of any size.

Read more at Jaxenter

The State of Machine Learning in Business Today

Artificial Intelligence (AI), Machine Learning, and Deep Learning are all topics of considerable interest in news articles and industry discussions these days. However, for the average person, or for senior business executives and CEOs, it becomes increasingly difficult to parse out the technical differences that distinguish these capabilities. …

I met last week with Ben Lorica, Chief Data Scientist at O’Reilly Media, and a co-host of the annual O’Reilly Strata Data and AI Conferences. O’Reilly recently published their latest study, The State of Machine Learning Adoption in the Enterprise. Noting that “machine learning has become more widely adopted by business”, O’Reilly sought to understand the state of industry deployments on machine learning capabilities, finding that 49% of organizations reported they were exploring or “just looking” into deploying machine learning, while a slight majority of 51% claimed to be early adopters (36%) or sophisticated users (15%). Lorica went on to note that firms identified a range of issues that make deployment of machine learning capabilities an ongoing challenge. These issues included a lack of skilled people, and ongoing challenges with lack of access to data in a timely manner.

Read more at Forbes

How to Create SSH Tunneling or Port Forwarding in Linux

SSH tunneling (also referred to as SSH port forwarding) is simply routing local network traffic through SSH to remote hosts. This implies that all your connections are secured using encryption. It provides an easy way of setting up a basic VPN (Virtual Private Network), useful for connecting to private networks over insecure public networks like the Internet.

SSH tunneling can also be used to expose local servers behind NATs and firewalls to the Internet over secure tunnels, as implemented in ngrok.

SSH sessions permit tunneling network connections by default, and there are three types of SSH port forwarding: local, remote and dynamic port forwarding.

In this article, we will demonstrate how to quickly and easily set up SSH tunneling using the different types of port forwarding in Linux.
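For reference, the three forwarding types map to three ssh flags. The commands below are illustrative (the host names and ports are made up) and each requires a reachable SSH server:

```shell
# Local forwarding (-L): connections to localhost:8080 on this machine
# are tunneled through gateway.example.com to intranet.example.com:80.
ssh -L 8080:intranet.example.com:80 user@gateway.example.com

# Remote forwarding (-R): connections to port 8080 on the remote host
# are tunneled back to port 3000 on this machine (the ngrok-style case).
ssh -R 8080:localhost:3000 user@gateway.example.com

# Dynamic forwarding (-D): a local SOCKS proxy on port 1080 routes
# arbitrary traffic through the remote host (the basic-VPN case).
ssh -D 1080 user@gateway.example.com
```

Adding -N (no remote command) and -f (background after authentication) is common when the tunnel is all you need.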

Read more at Tecmint

Building Security into Linux-Based Azure Sphere

It’s still a bit unsettling to see a Microsoft speaker at a Linux Foundation conference. Yet, at the recent Linux Security Summit, Ryan Fairfax, Microsoft’s head of OS development for Azure Sphere, quickly put the audience at ease with his knowledge of tuxified tech. His presentation on “Azure Sphere: Fitting Linux Security in 4 MiB of RAM” fits into the genre of stories in which developers are challenged to strip down their precious code to the spartan essentials for IoT.

As we saw last year in Michael Opdenacker’s presentation about reducing the Linux kernel and filesystem for IoT, Linux can be made to run — just barely — in as little as 4MB of RAM. That was Microsoft’s target for Azure Sphere OS, the open source Linux-based distribution at the heart of the Azure Sphere platform for IoT. Azure Sphere also includes a proprietary crypto/secure boot stack called the Microsoft Pluton Security Subsystem, which runs on an MCU, as well as the Azure Sphere Security Service, a turnkey cloud service for secure device-to-device and device-to-cloud communication.

Last week, Seeed launched the first dev kit for Azure Sphere. The Azure Sphere MT3620 Development Kit features MediaTek’s MT3620, a 500MHz Cortex-A7/Cortex-M4F hybrid SoC that runs the lightweight Azure Sphere OS on a single Cortex-A7 core. The SoC’s 4MB of RAM is the only RAM on Seeed’s Grove-compatible dev board. Other SoC vendors besides MediaTek will offer their own Cortex-A/Cortex-M SoCs for Azure Sphere, says Microsoft.

Major shrinkage

Fitting an entire Linux stack into 4MB was a tall order considering that “most of us hadn’t touched Linux in 10 years,” said Fairfax. Yet, the hard part of creating Azure Sphere OS was not so much the kernel modification, as it was the development of the rest of the stack. This includes the custom Linux Security Module, which coordinates with the Cortex-M4’s proprietary Pluton security code using a mailbox-based protocol.

“We decided early on to go with Linux,” said Fairfax. “Most of our changes to the kernel were small, and the core Linux features ‘just worked’ even with limited resources. That’s a credit to the effort of the community and flexibility of the kernel.”

Fairfax’s team started working on Azure Sphere in secret in 2016 after struggling to convince Microsoft leadership that working with a Linux kernel “was viable,” said Fairfax. The project was unveiled in April 2018, and the first public preview will be released soon.

One of the main goals of Azure Sphere was to bring security to the MCU world where “security is basically nonexistent,” said Fairfax. Microsoft somewhat confusingly refers to the MediaTek MT3620 as an MCU rather than an application processor due to its inclusion of Cortex-M4 MCU cores. In part, this may be a marketing ploy since Microsoft intends to compete directly with the Cortex-M oriented Amazon FreeRTOS.


Azure Sphere OS sits on top of the MCU’s Pluton stack, architecturally speaking, and the base layer is a security stack based on Arm TrustZone. This is followed by the custom Linux kernel, which in turn is topped by a cloud services update connectivity layer. The top level is for POSIX and real-time I/O app containers.

The custom kernel is currently based on mainline Linux 4.9. Patches are merged upstream every month, and there are plans to upgrade to LTS branches yearly.

The first step in reducing the kernel was “to avoid putting text into memory,” said Fairfax. To do this, the OS depends a lot on Execute-In-Place (XiP) technology, which is commonly integrated in MCU flash controllers. “XiP lets you take a flash region and map it into the address space in read only, but also in a mode where you can execute it as code.”
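For context, execute-in-place support in a mainline ARM kernel is switched on through Kconfig options such as the following. These option names come from mainline Linux; the talk does not say exactly which options Azure Sphere uses, and the flash address is board-specific and purely illustrative:

```
CONFIG_XIP_KERNEL=y
# Physical flash address where the XiP kernel image lives (board-specific).
CONFIG_XIP_PHYS_ADDR=0x08000000
# XiP-aware MTD flash drivers.
CONFIG_MTD_XIP=y
```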

In addition, “we tuned the kernel to make things modular so we could turn things off,” explained Fairfax. “We tuned cache sizes and patched to tweak default sizes.”

The team turned off a lot of the memory tracking options and things like kallsyms. They reluctantly cut sysfs, which saved almost 1MB but was, for Fairfax, the coder’s equivalent of the writer’s injunction to kill your darlings. In the end, much of the kernel space was taken up by the network stack and hardware drivers.

A lightweight Linux Security Module

Initially, the Azure Sphere OS team tried using an SSH server with a fixed root password for security, but they quickly realized that this “was not going to cut it long term,” said Fairfax. To reduce the attack surface, they experimented with different security models, including “baking things into the file system and leveraging set UID and SGID to create predictable environments.”

These approaches caused some IPC problems and were otherwise flawed because “they put all the burden at build time,” said Fairfax. “Any mistake would propagate through the system and leave you vulnerable.”

Fairfax and his team revisited existing Linux technologies that might help make permissions more granular and “create a model where apps can access resources with the principle of least privilege,” said Fairfax. They finally decided on a stripped-down version of the Linux Security Module (LSM) framework, a set of kernel extensions that “would reduce attack surface by taking certain features completely off the table. There’s no shell or user account management, which really isn’t relevant for an IoT device, and there’s no sophisticated job and process management.”

Fairfax also added fields that created an app identity for every task. “Applications and kernel modules can use these new fields for extended access control,” said Fairfax. “Values are immutable — once set, they inherit by default.”

The developers “experimented a lot with file systems,” said Fairfax. They tried the read-only cramfs with XiP patches, as well as writable file systems like ext2, jffs2, and yaffs, but “they all took hundreds of kilobytes to initialize, or about 1/16th of the total system memory available.” In the end, they ported the ultra-lightweight littlefs from Arm’s Mbed OS to Linux as a VFS module.

One problem with securing a Linux IoT device is that “Linux treats the entire GPIO infrastructure as a single resource,” said Fairfax. “In the real world not everything connected to your chip has the same sensitivity. I might have one GPIO pin that toggles an LED saying I’m connected to the network, which is not super sensitive, but another GPIO might open the solenoid on my furnace to start gas flow, which is more worrisome.” To compensate, the team extended existing features like GPIO with more granular, per-resource access control.

User and application model

If Azure Sphere’s kernel is not radically different from any other extremely reduced Linux kernel, the user mode differs considerably. “The current Linux model is not designed for resource constrained environments,” said Fairfax. “So we built a custom init called the application manager that loads apps, configures their security environments, and launches them. It’s the only traditional process that runs on our system — everything else is part of an application.”

Azure Sphere applications are self-describing and independently updatable. In fact, “they’re actually their own independent file systems,” explained Fairfax. “They run isolated from each other and cannot access any resource from another app.”

There are initially four pre-loaded system applications: network management, updates, command and control via USB, and hardware crypto and RNG acceleration. GDBServer is optional, and OEMs can “add one or two apps that contain their own business logic,” said Fairfax.

One Azure Sphere rule is that “everything is OTA updatable and everything is renewable,” said Fairfax. In addition, because “quick OTA is critical” in responding to new threats, the team is aiming for OTA security patch updates within 24 hours of public disclosure, a feat they achieved in the case of the KRACK vulnerability. Microsoft will manage all the OS updates, but OEMs control their own app updates.

The Microsoft team tried hard to find a way to run containers, including using LXC, but “we couldn’t get it to fit,” said Fairfax. “Containers are great, but they have some serious RAM overhead.”  They also tried using namespaces to create self-contained apps but found that “many peripherals such as GPIO don’t play right with namespaces.”

For now, “we have pivoted off of containers and are focused on isolating apps and making sure that our permission model is sane,” said Fairfax. “We ensure that a buffer overrun in an application only gives you what that application can already do. We build each app as its own file system so they mount or unmount as part of install or uninstall. There’s no copying of files around for installation.

“Each application has metadata in the file system that says: ‘Here’s how to run me and here’s what I need,’” continued Fairfax. “By default, all you get is compute and RAM — even network access must be declared as part of the manifest. This helps us reason about the security state and helps developers to do least privilege in apps.”

Future plans call for revisiting namespaces to create “something like a container,” and there’s a plan to “reduce cap_sys_admin or make it more granular,” says Fairfax. He also wants to explore integrating parts of SELinux or AppArmor. More immediately, the team plans to upstream some of the work in memory improvements and file systems, which Fairfax says “are applicable elsewhere even if you’re talking about something like a Raspberry Pi.”

You can find more information about Azure Sphere on Microsoft’s product page, and you can watch the complete presentation below.

Join us at Open Source Summit + Embedded Linux Conference Europe in Edinburgh, UK on October 22-24, 2018, for 100+ sessions on Linux, Cloud, Containers, AI, Community, and more.

Open Source Culture Starts with Programs and Policies

More than anything, open source programs are responsible for fostering “open source culture,” according to a survey The New Stack conducted with The Linux Foundation’s TODO Group. By creating an open source culture, companies with open source programs see the benefits we’ve previously reported, including increased speed and agility in the development cycle, better license compliance and more awareness of which open source projects a company’s products depend on.

But what is open source culture, why is it important and how do we measure it? Based on survey data and reporting from this summer’s Open Source Summit, we believe open source programs support a corporate culture that prioritizes DevOps and microservices architecture, and enables developers to quickly use and participate in internal and external projects. It’s no longer sufficient to measure a company’s open source culture by counting what percentage of their technology stack is open source. Businesses interested in improved developer efficiency should examine their participation in open source projects and support a culture that nurtures code sharing and collaboration on externally maintained projects.

Defining Open Source Culture

Open source culture is more than just reusing free code on GitHub to get products to market faster. It is an ethos that values sharing. The culture embraces an approach to software development that emphasizes internal and external collaboration, an increasing focus on core competencies instead of core infrastructure, and implementation of DevOps processes commonly associated with microservices and cloud native technologies.

Read more at The New Stack

New Video Applications Will Represent Majority of Edge Traffic by 2020, Survey Finds

In an effort to identify early edge applications, we recently partnered with IHS Markit to interview edge thought leaders representing major telcos, manufacturers, MSOs, equipment vendors, and chip vendors, hailing from open source, startups, and large corporations all over the globe. The survey revealed that edge application deployments are still young, but that they will require new innovation and investment, much of it involving open source.

The research investigated not only which applications will run on the edge, but also deployment timing, revenue potential and existing and expected barriers and difficulties of deployment. Presented onsite at ONS Europe by IHS Markit analyst Michael Howard, the results represent an early look at where organizations are headed in their edge application journeys.

Key findings indicate:

Video, other big-bandwidth applications, and connected things that move will drive the top services and expected revenue.

Read more at The Linux Foundation

Site Reliability Engineering (SRE): A Simple Overview

Curious about site reliability engineering (SRE)?

The following overview is for you. It covers some of the basics of SRE: what it is, how it’s used, and what you need to keep in mind before adopting SRE methods.

In the book Site Reliability Engineering, contributor Benjamin Treynor Sloss—the originator of the term “Site Reliability Engineering”—explains how SRE emerged at Google….

The attributes of SRE

…site reliability engineers need a holistic understanding of the systems and the connections between those systems. “SREs must see the system as a whole and treat its interconnections with as much attention and respect as the components themselves,” Schlossnagle says.

In addition to an understanding of systems, site reliability engineers are also responsible for specific tasks and outcomes. These are outlined in the following seven principles of SRE written by the contributors of The Site Reliability Workbook.

1. Operations is a software problem — “The basic tenet of SRE is that doing operations well is a software problem. SRE should therefore use software engineering approaches to solve that problem.”

Read more at O’Reilly