2016 was an exciting year in information security. There were mega-breaches, tons of new malware strains, inventive phishing attacks, and new laws dealing with digital security and privacy. Each of these developments helped bring the security community to where we are now: on the cusp of 2017.
Even so, not everything that happened in 2016 was equally significant. Some moments clearly stood out above the rest.
Some years quietly sneak by – 2016, not so much. It’s safe to say there are always forces reshaping the HPC landscape, but this year’s bunch seemed like a noisy lot.
Among the noisemakers: TaihuLight, DGX-1/Pascal, Dell EMC & HPE-SGI et al., KNL to market, OPA-IB chest thumping, Fujitsu-ARM, new U.S. President-elect, BREXIT, JR’s Intel Exit, Exascale (whatever that means now), NCSA@30, whither NSCI, Deep Learning mania, HPC identity crisis…You get the picture.
Far from comprehensive and in no particular order – except perhaps starting with China’s remarkable double helping atop the Top500 List – here’s a brief review of ten 2016 trends and a few associated stories covered in HPCwire…
Raspberry Pi Founder Eben Upton proudly announced the availability of the Debian-based Raspbian GNU/Linux distribution with the recently introduced PIXEL desktop environment for PC and Mac.
As you may be aware, Raspbian is the official Linux-based operating system for Raspberry Pi single-board computers. PIXEL, launched in September 2016, is Raspbian’s new desktop interface, based on the LXDE (Lightweight X11 Desktop Environment) project.
“I have been up against tough competition all my life. I wouldn’t know how to get along without it.” – Walt Disney
PostgreSQL vs. MySQL. MongoDB vs. Cassandra. Solr vs. Elasticsearch. ReactJS vs. AngularJS. If you have an open source project that you are passionate about, chances are a competing project exists and is doing similar things, with users as passionate as yours. Despite the “we’re all happily sharing our code” vibe that many individuals in open source love to project, open source business, like any other, is filled with competition. Unlike other business models, however, open source presents unique challenges and opportunities when it comes to competition.
Read more at OpenSource.com
Today, Google is the best-known and most widely used search engine on the World Wide Web (WWW). If you want to gather information from the millions of servers on the Internet, it is the number one and most reliable tool for the job, and much more besides.
Most people around the world use Google search through a graphical web browser. However, command-line geeks who are glued to the terminal for their day-to-day system tasks have a harder time reaching Google search from the command line, and this is where Googler comes in handy.
Googler is a powerful, feature-rich and Python-based command line tool for accessing Google (Web & News) and Google Site Search within the Linux terminal.
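Googler is usually driven straight from the shell, but it can also be scripted. Below is a minimal sketch in Python that shells out to googler and parses its JSON output; it assumes googler is installed and on your PATH, and that the -n (result count) and --json flags, and the title/url result fields, behave as described in the project’s documentation.

```python
import json
import subprocess

# Run a Google search through googler and capture structured results.
# Assumes googler is installed; -n limits the result count and --json
# asks googler to emit the results as a JSON array.
result = subprocess.run(
    ["googler", "--json", "-n", "3", "linux kernel"],
    capture_output=True, text=True, check=True,
)

for hit in json.loads(result.stdout):
    print(hit["title"])
    print(hit["url"])
    print()
```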
With emerging technology, there is often a sense that old means inadequate: an older tool must surely lack the features and performance the business requires. Cloud technology changes so quickly, do we still need something like Swift, which predates OpenStack itself?
To answer this question, we must understand Swift’s unique architecture. Only with Swift can we harness the power of the BLOB.
A central concept in Swift is the Binary Large OBject (BLOB). Instead of block storage, data is divided into some number of binary streams. Any file, of any format, can be reduced to a series of ones and zeros, a step sometimes referred to as serialization. Start at the first bit of a file and count ones and zeros until you have a block, a megabyte or even five gigabytes. This becomes an object. The next run of bits becomes another object, and so on until there is no more file to divide. These objects can be stored locally or sent to a Swift proxy server. The proxy server sends each object on to a series of storage servers, where memcached accepts it at memory speeds, a definite advantage in the days before inexpensive solid-state drives.
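To make the idea concrete, here is a toy sketch of that chunking step in Python. It simply cuts a binary stream into fixed-size pieces; Swift’s real segmentation is more involved, and the 1 MB size and input path here are arbitrary illustrations.

```python
# Toy illustration of serializing a file into fixed-size objects.
# The chunk size and input path are arbitrary, for demonstration only.
CHUNK = 1024 * 1024  # 1 MB per object

def to_objects(path):
    """Yield successive fixed-size binary objects from a file."""
    with open(path, "rb") as f:
        while True:
            piece = f.read(CHUNK)
            if not piece:
                break
            yield piece

for i, obj in enumerate(to_objects("/tmp/example.dat")):
    print(f"object {i}: {len(obj)} bytes")
```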
These independent objects can be placed anywhere, as long as they can be brought back together in the same order, which is what Swift does on our behalf through its services. Swift uses three services to track the blobs, where they are stored, and who owns them (a brief client-side sketch follows the list):
Object Servers
Container Servers
Account Servers
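From a client’s point of view, all of this machinery hides behind the proxy server. The sketch below uses the python-swiftclient library to store and fetch an object; the auth URL and the test:tester/testing credentials are the conventional defaults of a local tempauth development cluster, so treat them as placeholder assumptions for your own deployment.

```python
from swiftclient.client import Connection

# Connect to a Swift proxy; these tempauth credentials are the usual
# defaults of a local development cluster (an assumption, adjust to taste).
conn = Connection(
    authurl="http://127.0.0.1:8080/auth/v1.0",
    user="test:tester",
    key="testing",
)

# Create a container, upload an object into it, then read it back.
conn.put_container("photos")
conn.put_object("photos", "cat.jpg", contents=b"...object bytes...")

headers, body = conn.get_object("photos", "cat.jpg")
print(headers.get("content-length"), len(body))
```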
These services can be deployed on the same system or individually across several systems. This allows the Swift cluster to scale to meet the changing needs of the storage. The three services are independent of one another and distribute their data among the available nodes. This distribution has led to the use of the term “ring services.” The distribution among the object, container, and account rings is not round-robin, as the name might imply. Instead, Swift uses an algorithm that combines the device partition index and device weights to determine which nodes should store the object and its replicas.
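The following toy Python sketch conveys the flavor of partition-based placement; it is not Swift’s actual ring-builder code. It hashes an object’s path, keeps the top bits as a partition index, and looks the partition up in a precomputed partition-to-device table. A real ring assigns partitions to devices in proportion to their weights and spreads replicas across failure domains; the round-robin table here is a deliberate simplification.

```python
import hashlib

# Toy partition-based placement, in the spirit of Swift's ring
# (not the real implementation). 2**PART_POWER partitions total.
PART_POWER = 8

def partition_for(path: str) -> int:
    """Hash the object path and keep the top PART_POWER bits."""
    digest = hashlib.md5(path.encode()).digest()
    return int.from_bytes(digest[:4], "big") >> (32 - PART_POWER)

# A real ring weights devices so bigger disks get more partitions;
# here we just deal partitions out round-robin for brevity.
devices = ["node1/sdb", "node2/sdb", "node3/sdb"]
ring = {p: devices[p % len(devices)] for p in range(2 ** PART_POWER)}

part = partition_for("/AUTH_test/photos/cat.jpg")
print(f"partition {part} -> {ring[part]}")
```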
The Object Servers are responsible for storing the actual blobs. Each object is stored as a file, while its metadata is stored in extended attributes (xattrs). As long as the local filesystem supports xattrs, you should be able to use it for local storage. Each node can even use its own filesystem; there is no need for the entire cluster to be uniform.
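Here is a small Python illustration of the xattr idea on Linux: metadata travels with the file itself, with no separate database needed. The user.swift.content-type attribute name is purely illustrative (Swift’s real metadata keys differ), and the snippet assumes an xattr-capable filesystem such as ext4 or XFS.

```python
import os

# Write a toy "object" to disk, then attach metadata to it as an
# extended attribute. Requires Linux and an xattr-capable filesystem.
path = "/tmp/object.data"
with open(path, "wb") as f:
    f.write(b"...object bytes...")

# The attribute name below is illustrative, not Swift's actual key;
# unprivileged processes must stay in the user.* namespace.
os.setxattr(path, b"user.swift.content-type", b"image/jpeg")
print(os.getxattr(path, b"user.swift.content-type"))
```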
The objects are stored relative to a container. The Container Server keeps a database of which objects live in which containers, and it also tracks the total number of objects and the amount of storage each container uses.
The third of the “ring services” tracks container ownership and is maintained by the Account Server.
While the most common deployment of Swift is that each new node runs all three services, it can be easily changed as necessary. Some services may be more active than others, and the node resource demands can be different per ring as well. The flexibility of Swift means we can change our cluster to meet the storage demands for size or speed as necessary. We can deploy more Object Servers without the need to use resources for additional Account Servers.
Swift’s architecture frees us from the common constraints found in NAS systems. We can store any data, anywhere we want, on whatever hardware we want. There is no vendor lock-in. Rackspace developed a forward-thinking solution to cloud storage, and as an open source tool it has revolutionised enterprise storage.
I discuss Swift in more detail in my recent Linux Foundation webinar on OpenStack: Exploring Object Storage with Ceph and Swift.
Is virtualization still as strategically important as it was, now that we are in the age of containers? According to a Red Hat survey of 900 enterprise IT administrators, systems architects, and IT managers across geographic regions and industries, the answer is a resounding yes. Virtualization adoption remains on the rise, and is integrated with many cloud deployments and platforms.
Red Hat’s survey showed that most respondents are using virtualization to drive server consolidation, speed up provisioning, and provide infrastructure for developers to build and deploy applications. According to a Red Hat post: “Over the next two years, respondents indicated that they expect to increase both virtualized infrastructure and workloads by 18 percent and 20 percent, respectively. In terms of application mix, the most commonly virtualized workloads among respondents were web applications, including websites (73 percent), web application servers (70 percent) and databases (67 percent).”
At the same time, virtualization does face challenges. Nearly 40 percent of respondents to Red Hat’s survey called out budgets and costs as a key challenge, likely related to the cost implications of migrating workloads to and maintaining virtualization environments. That is precisely where free and open source virtualization solutions are making an enormous difference. Open virtualization tools can be part of a broader strategy to provide developers and applications with the best possible infrastructure, integrating with containers, private clouds and public clouds.
The Linux Foundation recently announced the release of its 2016 report, “Guide to the Open Cloud: Current Trends and Open Source Projects.” This third annual report provides a comprehensive look at the state of open cloud computing. You can download the report now, and one of the first things to notice is that it aggregates and analyzes research, illustrating how trends in containers, microservices, and more shape cloud computing. In fact, from IaaS to virtualization to DevOps configuration management, it provides descriptions and links to categorized projects central to today’s open cloud environment.
In this series of posts, we are calling out many of these projects, by category, providing extra insights on how the overall category is evolving. Below, you’ll find a collection of several important virtualization tools and the impact that they are having, along with links to their GitHub repositories, all gathered from the Guide to the Open Cloud:
KVM (Kernel-based Virtual Machine) is a full virtualization solution for Linux on x86 hardware containing virtualization extensions (Intel VT or AMD-V). It consists of a loadable kernel module, kvm.ko, that provides the core virtualization infrastructure, and a processor-specific module, kvm-intel.ko or kvm-amd.ko. It can run multiple virtual machines running unmodified Linux or Windows images. KVM mailing lists.
Linux Containers (LXC) are lightweight virtual machines enabled by functions within the Linux kernel, including cgroups, namespaces and security modules. Userspace tools coordinate kernel features and manipulate container images to create and manage system or application containers. LXC on GitHub.
LXD is Canonical’s container hypervisor and a new user experience for LXC. Developed in Go, it runs unmodified Linux operating systems and applications with VM-style operations. LXD on GitHub.
Xen Project, a Linux Foundation project, develops virtualization technologies for a number of different commercial and open source applications including server virtualization, Infrastructure as a Service (IaaS), desktop virtualization, security applications, embedded and hardware appliances on x86 and ARM CPU architectures, and supports a wide range of guest operating systems. Xen Project Git repositories.
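As a quick practical aside to the KVM entry above, here is a small Python sketch that checks whether a Linux host is ready to run KVM guests. It assumes the standard /proc/cpuinfo feature flags (vmx for Intel VT-x, svm for AMD-V) and the /dev/kvm device node that appears once the KVM modules are loaded.

```python
import os

# Check the two usual prerequisites for KVM on a Linux host:
# 1) CPU hardware virtualization flags in /proc/cpuinfo
# 2) the /dev/kvm device created by kvm.ko + kvm-intel.ko/kvm-amd.ko
with open("/proc/cpuinfo") as f:
    flags = f.read()

hw_virt = "vmx" in flags or "svm" in flags
kvm_ready = os.path.exists("/dev/kvm")

print(f"hardware virtualization: {hw_virt}")
print(f"/dev/kvm present: {kvm_ready}")
```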
If you’ve ever gone to an event that required a ticket, chances are you’ve done business with Ticketmaster. The ubiquitous ticket company has been around for 40 years and is the undisputed market leader in its field.
To stay on top, the company is trying to ensure its best product creators can focus on products, not infrastructure. The company has begun to roll out a massive public cloud strategy that uses Kubernetes, an open source platform for deploying and managing application containers, to keep everything running smoothly, and it sent two of its top technologists to deliver a keynote at CloudNativeCon 2016 in Seattle explaining their methodology.
Continuous Self-Disruption
The company was the first to disrupt the ticket industry when it was founded in 1976 at Arizona State University, and its leaders are perfectly aware that as a ubiquitous market leader, Ticketmaster is ripe to be disrupted itself. So, since 2013, the company has undergone a continuous process of “self-disruption” in an effort to stay ahead of any competition.
“It is great to be the market leader, but it is also a terrifying place to be,” said Justin Dean, Ticketmaster’s SVP of Platform and Technical Operations, during the keynote. “Ticketmaster, our ecosystem, has a huge surface area. One little piece of that surface area could be an entire business for a startup or for a small company. For us, what we have to do as a company is really optimize for speed and agility.”
This approach has included a shift from a private cloud implementation, with over 22,000 virtual machines across seven global data centers, to the public cloud and AWS. It also means a major commitment to containerization. Dean and his co-presenter, Kraig Amador, both joked that they have every version of every piece of software created over the past 40 years running somewhere inside the company, including an emulated version of VAX software from the 1970s, which runs Ticketmaster’s original groundbreaking system.
Dean said the company has 21 different ticketing systems and more than 250 unique products. As part of their transformation, Ticketmaster has created more than 65 cross-functional software product teams, and they need a system that lets those teams focus on creating new products. This is where Kubernetes comes in.
Let the Makers Make
“Our goal in all of this is to let the makers make,” Dean said. “We have an amazing company of makers, creators, visionaries, innovators: people who can focus on delivering products to market, and that is where they should figure out the next big thing, to power our business and make it better. We do not want to burden them with also having to figure out how to deploy infrastructure to support their software.”
Amador, Ticketmaster’s Senior Director of Core Platform, is leading the effort to fully implement Kubernetes at the company. He said their work is far from over, but early returns have been very promising.
Amador explained that after an extensive internal product audit and team evaluation, Ticketmaster’s DevOps team has built tools to gauge the health of each piece of code and help make sure all these different products are running smoothly and independently.
Independence is of major importance, Amador said. Ticketmaster essentially invites a DDoS attack on its own servers every time tickets for a popular concert or event go on sale. So, when a service gets overwhelmed, which happens all the time, he said, Kubernetes is there to get things running again.
“By putting in the Kubernetes and leveraging the pod health checks, Kubernetes can catch that for us and bring it back up for us,” Amador said. “We don’t have to go in there and manually manage it anymore, it just kind of does its own thing.”
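The keynote did not show code, but the mechanism Amador describes is Kubernetes’ liveness probing: the kubelet polls an endpoint inside the container and restarts it when the probe fails. Below is a minimal, generic sketch of such a health endpoint using Python’s standard library; the port and /healthz path are common conventions, not anything specific to Ticketmaster’s setup.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Minimal health endpoint of the kind a Kubernetes livenessProbe
# (httpGet) polls. If it stops answering 200, the kubelet restarts
# the container. Port 8080 and /healthz are conventions, not rules.
class Health(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/healthz":
            self.send_response(200)
            self.end_headers()
            self.wfile.write(b"ok")
        else:
            self.send_response(404)
            self.end_headers()

if __name__ == "__main__":
    HTTPServer(("0.0.0.0", 8080), Health).serve_forever()
```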
Dean said Ticketmaster was very deliberate in its move to public cloud; the company needs to build a system that can not only do $25 billion in commerce every year but also have room to grow seamlessly.
“We have to ensure we have the right strategies,” Dean said. “One of those is really ensuring that we are betting big in the right communities. We definitely feel that the Kubernetes community is the community that we want to be a part of. And we want to encourage others to join us along the journey and add more anchors of big companies into the community so that it can continue to thrive, grow so that some of these problems get solved and we divide and conquer.”
A template for working collaboratively with the business in today’s rapidly changing technology environment.
Everywhere I go lately, the cloud seems to be on the agenda as a topic of conversation. Not surprisingly, along with all the focus, attention, and money the cloud is receiving comes the hype and noise we’ve come to expect in just about every security market these days. Given this, along with how new the cloud is to most of us in the security world, how can security professionals make sense of the situation? I would argue that it depends largely on what type of situation we’re referring to, exactly. And therein lies the twist.
Rather than approach this piece as “20 questions security professionals should ask cloud providers,” I’d like to take a slightly different angle. It’s a perspective I think will be more useful to security professionals grappling with issues and challenges introduced by the cloud on a daily basis. For a variety of reasons, organizations are moving both infrastructure and applications to the cloud at a rapid rate – far more rapidly than anyone would have forecast even two or three years ago.
Hortonworks’ Scott Gnau talks about Apache Spark vs. Hadoop and data in motion.
Hortonworks has built its business on big data and Hadoop, but the Hortonworks Data Platform provides analytics and supports a range of technologies beyond core Hadoop, including MapReduce, Pig, Hive, and Spark. Hortonworks DataFlow, meanwhile, offers streaming analytics and uses technologies like Apache NiFi and Kafka.
InfoWorld Executive Editor Doug Dineley and Editor at Large Paul Krill recently spoke with Hortonworks CTO Scott Gnau about how the company sees the data business shaking out, the Spark vs. Hadoop face-off, and Hortonworks’ release strategy and efforts to build out the DataFlow platform for data in motion.