
Using Apache Hadoop to Turn Big Data Into Insights

The Apache Hadoop framework for distributed processing of large data sets is supported and used by a wide-ranging community — including businesses, governments, academia, and technology vendors. According to John Mertic, Director of ODPi and the Open Mainframe Project at The Linux Foundation, Apache Hadoop provides these diverse users with a solid base and allows them to add on different pieces depending on what they want to accomplish.

John Mertic, Director, ODPi and Open Mainframe Project
As a preview to Mertic’s talk at Apache: Big Data Europe in Seville, Spain, we spoke with him about some of the challenges facing the project and its goals for growth and development.

Apache Hadoop has a large and diverse community and user base. What are some of the various ways the project is being used for business and how can the community meet those needs?

If you think of a use case where a business needs to answer a question with data, the chances that they are using Apache Hadoop are fairly high. The platform is evolving to become the go-to strategy for working with data in a business. Hadoop’s ability to turn data into business insights speaks to the flexibility and depth of both Hadoop and the Big Data ecosystem as a whole.

The Big Data community can help to increase the adoption of Apache Hadoop through consistent encouragement of interoperability and standardization across Hadoop offerings. These efforts will not only help to mitigate the risks associated with implementing such differing platforms, but also streamline new development, promote open source architectures, and eliminate functionality confusion.   

What is the most common misconception about Apache Hadoop?

The most common misconception about Apache Hadoop is that it is just a project of The Apache Software Foundation, containing only YARN, MapReduce, and HDFS. In reality, as it's brought to market by platform providers like Hortonworks, IBM, Cloudera, or MapR, Hadoop can be equipped with 15 to 20 additional projects that vary across platform vendors, such as Hive, Ambari, and HCFS. To use an analogy, Apache Hadoop is like Mr. Potato Head: you start with a solid base and can add different pieces depending on what you are trying to accomplish. What an end user thinks of as Apache Hadoop is often much more than the core project itself, which can make it seem quite amorphous.

What are its strengths, and what value does it bring to users?

The Hadoop ecosystem enables a multitude of strategies for dealing with and capitalizing on data in any enterprise environment. The breadth and depth of the evolving platform now enables businesses to consider this growing ecosystem as part of their strategy for data management.

Can you describe some of the current challenges facing the project?

There certainly are compatibility gaps with Apache Hadoop and, while technologists are tackling some of these by creating innovative new projects, I think having a tighter feedback loop of real-life usage from businesses, to help the technologists closest to the project understand the challenges and opportunities, will be crucial to increasing adoption. Feeding those use cases directly from users to the projects can help solidify and mature them quickly.

The effects of the broad ecosystem, most commonly end-user confusion and mismatched enterprise software expectations, surface when end users come to Hadoop from the mature world of enterprise data warehouses with the same expectations but don't find the same stability in this newer ecosystem.

What are the project’s goals and strategies for growth?

ODPi’s goals for the Big Data community at-large are to solve end-user challenges more directly, remove the investment risks for legacy companies considering a move to Hadoop through universal standardization, and connect the technology more directly to business outcomes for potential enterprise users.

Attending Apache: Big Data Europe? Join Apache project members and speakers at the ODPi Community Lounge!

Kubernetes: An Overview

Kubernetes is an open source container management platform designed to run enterprise-class, cloud-enabled, and web-scalable IT workloads. It is built upon the foundation Google laid over 15 years of running containerized applications.

Though their popularity is a mostly recent trend, the concept of containers has existed for over a decade. Mainstream Unix-based operating systems (OS), such as Solaris, FreeBSD, and Linux, have had built-in support for containers, but it was Docker that truly democratized containers by making them manageable and accessible to both development and IT operations teams. Docker has demonstrated that containerization can drive the scalability and portability of applications. Developers and IT operations are turning to containers for packaging code and dependencies written in a variety of languages. Containers are also playing a crucial role in DevOps processes, having become an integral part of build automation and continuous integration and continuous deployment (CI/CD) pipelines.
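To make the packaging idea concrete, a container image is usually described declaratively. The minimal Dockerfile below is only a sketch for a hypothetical Python application; the file names app.py and requirements.txt are illustrative, not taken from any project mentioned here:

```dockerfile
# Sketch: package a hypothetical Python app and its dependencies
# into a self-contained, portable image (file names are illustrative).
FROM python:3-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY app.py .
CMD ["python", "app.py"]
```

Because the image bundles the runtime and dependencies, the same artifact can move unchanged from a developer laptop through a CI/CD pipeline into production.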

While core container implementations center on the life cycle of individual containers, production applications typically involve workloads with dozens of containers running across multiple hosts. The complexity of managing many hosts and containers in production demands a new set of management tools. Popular solutions include Docker Datacenter, Kubernetes, and Mesosphere DC/OS.
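Orchestrators like Kubernetes handle this by letting you declare a desired state, such as how many replicas of a container should run, and then placing those replicas across the cluster's hosts for you. A minimal sketch of such a declaration, using current Kubernetes API conventions (the image name is illustrative):

```yaml
# Sketch of a Kubernetes Deployment: the scheduler spreads the
# requested replicas across available hosts and replaces failed ones.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: web
        image: nginx:1.21
```

The operator never starts containers by hand; the platform continuously reconciles actual state against this declaration.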

Read more at The New Stack

Microsoft Open Sources its Azure Container Service Engine and Launches Deeper Kubernetes Integration

The open source Kubernetes container management project is probably the most popular of the various competing container management services available today. The Cloud Native Computing Foundation, which plays host to the open source side of Kubernetes, is hosting its first Kubernetes conference this week and, unsurprisingly, we'll see quite a bit of container-related news in the next few days.

First up is Microsoft, which is not only making the source code of the engine at the core of its Azure Container Service (ACS) available, but also launching a preview of its native integration of Kubernetes for ACS. In addition, Microsoft is also continuing to bet on Mesosphere’s DC/OS and updating that service to the latest release of DC/OS.

Read more at Tech Crunch

 

Move Over Bitcoin, The Blockchain Is Only Just Getting Started

It’s easy to think we’ve reached peak Bitcoin, but the blockchain at the heart of cryptocurrencies contains the seeds of something revolutionary.

The blockchain is a decentralised electronic ledger with duplicate copies on thousands of computers around the world. It cannot be altered retrospectively, allowing asset ownership and transfer to be recorded without external verification.
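The tamper-evidence described above can be illustrated with a toy hash chain in Python. This is a deliberate simplification, not a real blockchain: there is no consensus protocol and no distribution across machines, only the hash-linking that makes retrospective edits detectable:

```python
import hashlib
import json

def block_hash(block):
    # Hash the block's contents, including the previous block's hash,
    # so each block commits to the entire history before it.
    payload = json.dumps(block, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def add_block(chain, data):
    prev = chain[-1]["hash"] if chain else "0" * 64
    block = {"data": data, "prev": prev}
    block["hash"] = block_hash({"data": data, "prev": prev})
    chain.append(block)

def verify(chain):
    # Recompute every hash; any retrospective edit breaks the links.
    prev = "0" * 64
    for b in chain:
        if b["prev"] != prev:
            return False
        if b["hash"] != block_hash({"data": b["data"], "prev": b["prev"]}):
            return False
        prev = b["hash"]
    return True

chain = []
add_block(chain, {"from": "alice", "to": "bob", "amount": 5})
add_block(chain, {"from": "bob", "to": "carol", "amount": 2})
print(verify(chain))               # True: the ledger is internally consistent
chain[0]["data"]["amount"] = 500
print(verify(chain))               # False: retrospective edits are detectable
```

In a real blockchain, thousands of nodes hold copies of the chain and a consensus mechanism decides which new blocks are appended, which is what removes the need for external verification.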

Investors have now realised the blockchain is bigger than Bitcoin. In the first quarter of 2016, venture-capital investment in blockchain startups overtook that in pure-play Bitcoin companies for the first time, according to industry researcher CoinDesk, which has tallied $1.1 billion (£840m) in deals to date.

Read more at Wired

How to Deploy a Fault Tolerant Cluster with Continuous or High Availability

Some companies cannot afford to have their services go down. In the event of a server outage, a cellular operator might experience billing-system downtime, causing lost connections for all of its clients. Recognizing the potential impact of such situations leads to the idea of always having a plan B.

In this article, we shed light on different ways of protecting against server failures, as well as on architectures used to deploy VMmanager Cloud, a control panel for building a High Availability cluster.

Read complete article at HowtoForge

First 64-Bit Orange Pi Slips in Under $20

The open spec Orange Pi PC 2 runs Linux or Android on a quad-core Cortex-A53 Allwinner H5 SoC, and offers GbE, a 40-pin RPi-compatible interface, and three USB host ports.



Shenzhen Xunlong is keeping up its prolific pace in spinning off new Allwinner SoCs into open source SBCs, and now it has released its first 64-bit ARM model, one of the cheapest quad-core Cortex-A53 boards around. The Orange Pi PC 2 runs Linux or Android on a new Allwinner H5 SoC featuring four Cortex-A53 cores and a more powerful Mali-450 GPU. The Orange Pi PC 2, which sells at Aliexpress for $19.98, or $23.33 including shipping to the U.S., updates the quad-core Cortex-A7 Allwinner H3 based Orange Pi PC, which came in 14th out of 81 SBCs in our hacker boards reader survey.

Read more at HackerBoards

The DevOpsification of Security

In December 2009, Google was the target of a series of highly coordinated, sophisticated advanced persistent threat (APT) attacks in which state-sponsored hackers from China stole intellectual property and sought to access, and potentially modify, Google source code: the company's crown jewels. Dubbed Operation Aurora, the attack proved to be a referendum at Google on the layered, perimeter-based security model.

Five years later, in 2014, Google published a paper titled “BeyondCorp: A New Approach to Enterprise Security,” which detailed the company's radical security overhaul, transitioning to a trustless model where all applications live on the public Internet. Google wrote:

Virtually every company today uses firewalls to enforce perimeter security. However, this security model is problematic because, when that perimeter is breached, an attacker has relatively easy access to a company's privileged intranet. As companies adopt mobile and cloud technologies, the perimeter is becoming increasingly difficult to enforce. Google is taking a different approach… We are removing the requirement for a privileged intranet and moving our corporate applications to the Internet.

Yet while much of the world is in the throes of adopting the open, on-demand IT paradigm characterized by agility and elasticity that Google helped define, security has yet to be reimagined in the image of cloud and DevOps, much less Google.

Read more at The New Stack

A Practical Guide to Nmap (Network Security Scanner) in Kali Linux

In the second Kali Linux article, we will discuss the network tool known as 'nmap'. While nmap isn't a Kali-only tool, it is one of the most useful network mapping tools in Kali.


Nmap, short for Network Mapper, is maintained by Gordon Lyon (more about Mr. Lyon here: http://insecure.org/fyodor/) and is used by many security professionals all over the world. The utility works on both Linux and Windows and is driven from the command line (CLI). However, for those a little less comfortable with the command line, there is a wonderful graphical front end for nmap called zenmap.
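To give a rough sense of what a TCP connect scan does under the hood (nmap itself is far more capable and efficient, and also supports SYN scans, OS detection, and much more), here is a minimal sketch in Python using only the standard library:

```python
import socket

def scan_ports(host, ports, timeout=0.5):
    """Attempt a full TCP connection to each port; open ports accept it.

    This mirrors the idea behind a TCP connect scan, greatly simplified.
    """
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(timeout)
            # connect_ex returns 0 when the connection succeeds
            if s.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports

# Example: check a few well-known ports on the local machine
print(scan_ports("127.0.0.1", [22, 80, 443]))
```

Only scan hosts you own or have permission to test; unsolicited port scanning may violate acceptable-use policies or local law.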

Read complete article at Tecmint

How We Built a Metering & Chargeback System to Incentivize Higher Resource Utilization -Michael Benedict & Vinu Charanya, Twitter

This talk by Vinu Charanya and Michael Benedict at LinuxCon North America goes into fascinating detail on the metering and chargeback system Twitter engineers built to incentivize higher resource utilization, taking both a technical and a social approach.

 

The Linux Foundation Issues 2016 Guide to Open Source Cloud Projects

The Linux Foundation today released its third annual “Guide to the Open Cloud” report on current trends and open source projects in cloud computing.

Guide to the Open Cloud Report
The report aggregates and analyzes industry research to provide insights on how trends in containers, microservices, and more shape cloud computing today. It also defines the open source cloud and cloud native computing and discusses why the open cloud is important to just about every industry.

“From banking and finance to automotive and healthcare, companies are facing the reality that they’re now in the technology business. In this new reality, cloud strategies can make or break an organization’s market success. And successful cloud strategies are built on Linux and open source software,” according to the report.

A list of 75 projects at the end of the report serves as a directory for IT managers and practitioners looking to build, manage, and monitor their cloud resources. These are the projects to know about, try out, and contribute to in order to ensure your business stays competitive in the cloud.

The projects are organized into key categories of cloud infrastructure including IaaS, PaaS, virtualization, containers, cloud operating systems, DevOps, configuration management, logging and monitoring, software-defined networking (SDN), software-defined storage, and networking for containers.

New this year is the addition of a section on container management and automation tools, which is a hot area for development as companies race to fill the growing need to manage highly distributed, cloud-native applications. Traditional DevOps CI/CD tools have also been collected in a separate category, though functionality can overlap.

These additions reflect a movement toward the use of public cloud services and microservices architectures, which is changing the nature of open source cloud computing.

“A whole new class of open source cloud computing projects has now begun to leverage the elasticity of the public cloud and enable applications designed and built to run on it,” according to the report.

To learn more about current trends in cloud computing and to see a full list of the most useful, influential, and promising open source cloud projects, download the report now.