
Your First Machine Learning Project in Python Step-By-Step

Do you want to do machine learning using Python, but you’re having trouble getting started?

In this post, you will complete your first machine learning project using Python.

In this step-by-step tutorial you will:

  1. Download and install Python SciPy and get the most useful package for machine learning in Python.
  2. Load a dataset and understand its structure using statistical summaries and data visualization.
  3. Create 6 machine learning models, pick the best and build confidence that the accuracy is reliable.
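As a taste of step 2, loading a dataset and summarizing it can be done with nothing but the standard library. This is an illustrative sketch only: the inline CSV and column names are placeholders, not the tutorial's actual dataset.

```python
# A minimal sketch of "load a dataset and understand its structure"
# using only the Python standard library.
import csv
import io
import statistics

# Placeholder data standing in for a real CSV file on disk.
data = """sepal_length,sepal_width,species
5.1,3.5,setosa
4.9,3.0,setosa
7.0,3.2,versicolor
6.4,3.2,versicolor
"""

rows = list(csv.DictReader(io.StringIO(data)))

# Understand the structure: shape and per-column summaries.
print(f"{len(rows)} rows, {len(rows[0])} columns")
for col in ("sepal_length", "sepal_width"):
    values = [float(r[col]) for r in rows]
    print(col,
          "mean=%.2f" % statistics.mean(values),
          "min=%.1f" % min(values),
          "max=%.1f" % max(values))
```

In practice the tutorial uses SciPy-stack packages (pandas, matplotlib, scikit-learn) for these steps, which reduce each summary or plot to a one-liner.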

If you are a machine learning beginner and looking to finally get started using Python, this tutorial was designed for you.

Let’s get started!

How Do You Start Machine Learning in Python?

The best way to learn machine learning is by designing and completing small projects.

Python Can Be Intimidating When Getting Started

Python is a popular and powerful interpreted language. Unlike R, Python is a complete language and platform that you can use both for research and for developing production systems.

There are also a lot of modules and libraries to choose from, providing multiple ways to do each task. It can feel overwhelming.

The best way to get started using Python for machine learning is to complete a project.

  • It will force you to install and start the Python interpreter (at the very least).
  • It will give you a bird’s-eye view of how to step through a small project.
  • It will give you confidence, maybe to go on to your own small projects.

Read more at Machine Learning Mastery

Xen Project Celebrates Unikraft Unikernel Project’s One Year Anniversary

It has been one year since the Xen Project introduced Unikraft as an incubator project. In that time, the team has made great strides in simplifying the process of building unikernels through a unified and customizable code base.

Unikraft is an incubation project under the Xen Project, hosted by the Linux Foundation, focused on easing the process of building unikernels, which compile source code into a lean operating system that includes only the functionality required by the application logic. As containers increasingly become the way cloud applications are built, there is a need to drive even more efficiency into the way these workloads run. The ultra-lightweight nature and small trusted compute base of unikernels make them ideal not only for cloud applications, but also for fields where resources are constrained or safety is critical.

Unikraft tackles one of the fundamental downsides of unikernels: despite their clear potential, building them is often manual, time-consuming work carried out by experts. Worse, the work, or at least chunks of it, often needs to be redone for each target application. Unikraft’s goal is to provide an automated build system where non-experts can easily and quickly generate extremely efficient and secure unikernels without having to touch a single line of code. Further, Unikraft explicitly supports multiple target platforms: not only virtual machines for Xen and KVM, but also OCI-compliant containers and bare metal images for various CPU architectures.

Over the last year the lead team at NEC Laboratories Europe along with external contributors from companies like ARM and universities such as University Politehnica of Bucharest have made great strides in developing and testing Unikraft’s base functionality, including support for a number of CPU architectures, platforms, and operating system primitives. Notable updates include support for ARM64.

The Unikraft community continues to grow. Over the last year, we’ve seen impressive momentum in terms of community support and involvement:

  • Contributions from outside the project founders (NEC) now make up 25% of all contributions.

  • Active contributors rose 91%, from 12 contributors to 23.

  • The initial NEC code contribution was around 86KLOC: since then around 34KLOC of code have been added and/or modified.

An upcoming milestone for the project is the Unikraft v0.3 release, which will ship in February. This release includes:

  • Xenstore and Xen bus support

  • ARM32 support for Xen

  • ARM64 support for QEMU/KVM

  • x86_64 bare metal support

  • Networking support, including an API that allows for high-speed I/O frameworks (e.g., DPDK, netmap)

  • A lightweight network stack (lwip)

  • Initial VFS support along with a simple but performant in-RAM filesystem

We are very excited about this coming year, where the focus will be on automating the build process and supporting higher-layer functionality and applications:

  • External standard libraries: musl, libuv, zlib, openssl, libunwind, libaxtls (TLS), etc.

  • Language environments: JavaScript (V8), Python, Ruby, C++

  • Frameworks: Node.js, PyTorch, Intel DPDK

  • Applications: lighttpd, nginx, SQLite, Redis, etc.  

Looking forward, in the first half of 2019 Unikraft will concentrate its efforts on supporting an increasing number of programming languages and applications, and on actively creating links to other unikernel projects, in order to ensure that the project delivers on its promise. Stay tuned for what’s in store. If you want to take Unikraft out for a spin, contribute, or simply find out more about Unikraft, please head over to the project’s website.

Also, if you are attending FOSDEM, February 2nd and 3rd, please stop by room AW1.121 for the talk “Unikraft: Unikernels Made Easy,” given by Simon Kuenzer. Simon, a senior systems researcher at NEC Labs and the lead maintainer of Unikraft, will be speaking all about Unikraft and giving a comprehensive overview of the project, where it’s been and what’s in store.  

Want to learn more about Unikraft and connect with the Xen community at large? Registration for the annual Xen Project Developer and Design Summit is open now! Check out information on sponsorships, speaking opportunities and more here.

This article originally appeared at Xen Project.

A Hitchhiker’s Guide to the Blockchain Universe

Despite the significant potential of blockchain, it is also difficult to find a consistent description of what it really is. A Google search for “blockchain technical papers” returns nothing but white papers for the first three screens; not a single paper is peer-reviewed [10]. One of the best discussions of the technology itself is from the National Institute of Standards and Technology, but at 50-plus pages, it is a bit much for a quick read [9].

The purpose of this article is to look at the basics of blockchain: the individual components, how those components fit together, and what changes might be made to solve some of the problems with blockchain technology. This technology is far from monolithic; some of the techniques can be used (at surprising savings of resources and effort) if other parts are cut away.

Because there is no single set of technical specifications, some systems that claim to be blockchain instances will differ from the system described here. Much of this description is taken from the original blockchain paper [6]. While details may differ, the main ideas stay the same. …

While there are lots of different ways to implement a blockchain, all have three major components. The first of these is the ledger, which is the series of blocks that are the public record of the transactions and the order of those transactions. Second is the consensus protocol, which allows all of the members of the community to agree on the values stored in the ledger. Finally, there is the digital currency, which acts as a reward for those willing to do the work of advancing the ledger. These components work together to provide a system that has the properties of stability, irrefutability, and distribution of trust that are the goals of the system.
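The first component, the ledger, can be illustrated with a few lines of Python. This is a toy sketch of a hash-linked chain of blocks, not any real blockchain implementation; it deliberately omits the consensus protocol and the digital currency described above.

```python
# A minimal, illustrative hash-linked ledger: each block records some
# transactions plus the hash of the previous block, so tampering with
# any earlier block invalidates every link after it.
import hashlib
import json

def block_hash(block):
    # Serialize deterministically before hashing.
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def append_block(chain, transactions):
    prev = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"prev_hash": prev, "transactions": transactions})

def verify(chain):
    # Every block must reference the hash of its predecessor.
    return all(chain[i]["prev_hash"] == block_hash(chain[i - 1])
               for i in range(1, len(chain)))

chain = []
append_block(chain, ["alice pays bob 5"])
append_block(chain, ["bob pays carol 2"])
print(verify(chain))   # True

chain[0]["transactions"] = ["alice pays mallory 500"]
print(verify(chain))   # tampering breaks the chain: False
```

The point of the sketch is the stability property: rewriting history requires recomputing every subsequent hash, which is what the consensus protocol and proof-of-work reward structure make prohibitively expensive in a real system.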

Read more at ACM Queue

Docker and Kubernetes in High Security Environments

This is a brief summary of parts of my master’s thesis and the conclusions to draw from it. This Medium story focuses on containerized application isolation. The thesis also covers segmentation of cluster networks in Kubernetes, which is not discussed in this story.

Container orchestration and cloud-native computing have gained a lot of traction in recent years. Adoption has increased to such a level that even enterprises in finance, banking, and the public sector are interested. These organizations differ from other businesses in having extensive information security and IT security requirements.

One important aspect is how containers could be used in production environments while maintaining system separation between applications. Because such enterprises use private clouds powered by bare-metal virtualization, the loss of separation upon migrating to a container-orchestrated environment is not negligible. It is in this scope that my thesis was written, with the Swedish Police Authority as the target client.

The specific research question that the thesis explores is the following:

How can Docker and Kubernetes support the separation of applications for the Swedish Police Authority compared with virtual machines powered by the bare-metal hypervisor ESXi?

That question has a lot to unwrap. To break this down, let’s start by looking in to the common denominator — the applications.

Read more at Medium

Future-Proof Your Career with AI

AI is the fastest growing field in enterprise tech. Here’s how to get an AI job you will love.

AI job listings have become the fastest growing category on LinkedIn, and Indeed is packed with listings. But most job requisitions seek a computer scientist type with a PhD in neural networks or some other years-long study. The trick is to look past those, and you’ll find that what many companies need can’t be outsourced or given to a freshly minted college grad: an IT pro with enterprise-scale experience who also knows how to deliver on a machine learning project.

Machine learning is where the jobs are

Here’s the secret: There are plenty of AI-related jobs that aren’t advanced science but simply applying new machine learning features from cloud services giants to familiar IT environments. “Most ML jobs aren’t about advancing ML technology and algorithms,” says Ross Mead, founder and CEO of robotics software startup Semio and an industry consultant in AI with a PhD from the University of Southern California. “The money in AI for most companies is using ML for better business intelligence.” 

That means using turnkey ML packages to analyze internal data—customer behavior, sales, etc.—to look for patterns that indicate likely business success. Machine learning is different from deep learning, the more esoteric field of AI that it is often confused with. 

Read more at HPE

Remote Code Execution in apt/apt-get

tl;dr I found a vulnerability in apt that allows a network man-in-the-middle (or a malicious package mirror) to execute arbitrary code as root on a machine installing any package. The bug has been fixed in the latest versions of apt. If you’re worried about being exploited during the update process, you can protect yourself by disabling HTTP redirects while you update. To do that, run:

$ sudo apt update -o Acquire::http::AllowRedirect=false
$ sudo apt upgrade -o Acquire::http::AllowRedirect=false

If your current package mirrors redirect by default (meaning you can’t update apt when using that flag), you’ll need to pick different mirrors or download the package directly. Specific instructions for upgrading on Debian can be found here; Ubuntu’s announcement can be found here.

As a proof of concept, below is a video of me exploiting the following Dockerfile:

FROM debian:latest

RUN apt-get update && apt-get install -y cowsay

Read more at Max Justicz

SAP: One of Open Source’s Best Kept Secrets

SAP has been working with open source for decades and has now established an open source program office (OSPO) to further formalize the coordination of its open source activities and expand its engagement with the open source communities. “SAP was one of the first industry players to formally define processes for open source consumption and contribution,” says Peter Giese, director of the Open Source Program Office.

Even so, many people do not yet consider SAP to be a company that embraces open source engagement and contributions.

“In the past, we may not have been active enough in sharing our open source activities,” says Giese.

Now, SAP is shining a spotlight on its work in open source. Transparency is an essential part of the new open source mandate, beginning with an explanation of what the company has been up to and where it is headed with open source.

How SAP came to adopt open source

“In 1998, SAP started to port the R/3 system, our market-leading ERP system, to Linux,” says Giese. “That was an important milestone for establishing Linux in the enterprise software market.”

Porting a system to Linux was just a first step, and a successful one. The action spurred an internal discussion and exploration of how and where to adopt Linux going forward.

Read more at The Linux Foundation

Container Storage Interface (CSI) for Kubernetes GA

The Kubernetes implementation of the Container Storage Interface (CSI) has been promoted to GA in the Kubernetes v1.13 release. Support for CSI was introduced as alpha in the Kubernetes v1.9 release and promoted to beta in the Kubernetes v1.10 release.

The GA milestone indicates that Kubernetes users may depend on the feature and its API without fear of backward-incompatible changes in future releases causing regressions. GA features are protected by the Kubernetes deprecation policy.

Why CSI?

Kubernetes provided a powerful volume plugin system prior to CSI, but adding support for new volume plugins was challenging: volume plugins were “in-tree,” meaning their code was part of the core Kubernetes code and shipped with the core Kubernetes binaries. Vendors wanting to add support for their storage system to Kubernetes (or even fix a bug in an existing volume plugin) were forced to align with the Kubernetes release process. In addition, third-party storage code caused reliability and security issues in core Kubernetes binaries, and the code was often difficult (and in some cases impossible) for Kubernetes maintainers to test and maintain.

CSI was developed as a standard for exposing arbitrary block and file storage systems to containerized workloads on Container Orchestration Systems (COs) like Kubernetes. With the adoption of the Container Storage Interface, the Kubernetes volume layer becomes truly extensible. Using CSI, third-party storage providers can write and deploy plugins exposing new storage systems in Kubernetes without ever having to touch the core Kubernetes code. This gives Kubernetes users more options for storage and makes the system more secure and reliable.

Read more at Kubernetes Blog

Using more to View Text Files at the Linux Command Line

There are a number of utilities that enable you to view text files when you’re at the command line. One of them is more.

more is similar to another tool I wrote about called less. The main difference is that more only allows you to move forward in a file.

While that may seem limiting, it has some useful features that are good to know about. Let’s take a quick look at what more can do and how to use it.

The basics

Let’s say you have a text file and want to read it at the command line. Just open the terminal, pop into the directory that contains the file, and type this command:

more <filename>

Read more at OpenSource.com

How Companies Are Building Sustainable AI and ML Initiatives

In 2017, we published “How Companies Are Putting AI to Work Through Deep Learning,” a report based on a survey we ran aiming to help leaders better understand how organizations are applying AI through deep learning. We found companies were planning to use deep learning over the next 12-18 months. In 2018, we decided to run a follow-up survey to determine whether companies’ machine learning (ML) and AI initiatives are sustainable—the results of which are in our recently published report, “Evolving Data Infrastructure.”

The current generation of AI and ML methods and technologies rely on large amounts of data—specifically, labeled training data. In order to have a longstanding AI and ML practice, companies need to have data infrastructure in place to collect, transform, store, and manage data. On one hand, we wanted to see whether companies were building out key components. On the other hand, we wanted to measure the sophistication of their use of these components. In other words, could we see a roadmap for transitioning from legacy cases (perhaps some business intelligence) toward data science practices, and from there into the tooling required for more substantial AI adoption?

Here are some notable findings from the survey:

  • Companies are serious about machine learning and AI. Fifty-eight percent of respondents indicated that they were either building or evaluating data science platform solutions. Data science (or machine learning) platforms are essential for companies that are keen on growing their data science teams and machine learning capabilities.

Read more at O’Reilly