
NEXmark: A Benchmarking Framework for Processing Data Streams

ApacheCon North America is only a few weeks away, happening May 16-18 in Miami. This year, it’s particularly exciting because ApacheCon will be a little different in how it’s set up to showcase the wide variety of Apache topics, technologies, and communities.

Apache: Big Data is part of the ApacheCon conference this year. Ismaël Mejía and Etienne Chauchot, of Talend, are giving a joint presentation on NEXmark, a unified framework for evaluating Big Data processing systems with Apache Beam. In this interview, they share some highlights of the talk and other thoughts on these topics.

LinuxCon: Who should attend your talk? Who will get the most out of it?

Etienne: Our talk is about NEXmark, which comes from a research paper that tried to evaluate the streaming systems for streaming semantics. This paper was adopted by Google into a suite of jobs, pipelines we’re calling them. It was contributed to the community, but it didn’t integrate well with all the Apache stuff, so we took the job and we improved on it and we’re going to present this story.

Ismaël: And for the audience question, we will just define the concepts that are specific to Beam, so basic big data knowledge is required.

LinuxCon: Is it only focused on Apache Beam or is it on Big Data in general?

Etienne: In the Big Data world there are two big families: batch and streaming. We will treat both cases because Beam is a unified model for both. Then there are many Apache products involved also.

Apache Beam is an abstraction to execute the pipelines, or jobs. But we also need different Apache products, or runners as we call them, so we can run Beam code on Apache Flink, Apache Spark, or Apache Apex. We also integrate with Apache data stores, like Cassandra.

Ismaël: The main goal of this benchmark suite is to reproduce cases that use the advanced semantics of Beam and that cover the whole streaming space as well.

LinuxCon: So you are both involved in Apache Beam? How long have you been involved in that?

Etienne: Since December, myself.

Ismaël: I’ve been involved since June of last year. I’m already a committer, that’s the good news, as of two weeks ago.

LinuxCon: What are the main highlights? You talk about the runner, is there anything specific or new technology or new logic that you are unveiling as part of your talk?

Etienne: The big thing is that there is a new unified solution to evaluate Big Data using both streaming and batch and that’s quite new. Attendees will also learn the concepts of Beam and the API.

LinuxCon: So what’s your overall aim?

Etienne: There is one aim: that people will know they can take this and use it to evaluate their own infrastructure, too. For example, you might want to use a Big Data framework from Apache, such as Spark, maybe version one or version two, and you want to evaluate the differences. You can take this suite and run it, and then you will have some criteria on which to decide. The second thing that could be of interest is using the advanced semantics of Beam, things like timers and other new features.
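To make the idea of comparing two versions with the same suite concrete, here is a minimal sketch in plain Python. It does not use Beam or the actual NEXmark code. The `Bid` record, the "highest bid per auction" query, and the two `engine_v1` / `engine_v2` functions are hypothetical stand-ins for two versions of a processing engine. The harness runs the same query on the same input through each one, checks that the answers agree, and records the elapsed time as one possible criterion.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Bid:
    """Hypothetical auction bid record (illustrative, not the NEXmark schema)."""
    auction: int
    price: int


def make_bids(n: int) -> List[Bid]:
    """Generate a deterministic synthetic list of n bids over 10 auctions."""
    return [Bid(auction=i % 10, price=(i * 37) % 1000) for i in range(n)]


def engine_v1(bids: List[Bid]) -> Dict[int, int]:
    """Stand-in for 'version one': highest bid per auction via an explicit loop."""
    best: Dict[int, int] = {}
    for bid in bids:
        if bid.price > best.get(bid.auction, -1):
            best[bid.auction] = bid.price
    return best


def engine_v2(bids: List[Bid]) -> Dict[int, int]:
    """Stand-in for 'version two': sort by price so the last write per auction wins."""
    return {b.auction: b.price for b in sorted(bids, key=lambda b: b.price)}


def compare(
    engines: Dict[str, Callable[[List[Bid]], Dict[int, int]]],
    bids: List[Bid],
) -> Dict[str, float]:
    """Run every engine on the same input, verify agreement, return timings in seconds."""
    results = {}
    timings = {}
    for name, run in engines.items():
        start = time.perf_counter()
        results[name] = run(bids)
        timings[name] = time.perf_counter() - start
    outputs = list(results.values())
    if any(out != outputs[0] for out in outputs[1:]):
        raise ValueError("engines disagree on the query result")
    return timings


if __name__ == "__main__":
    timings = compare({"v1": engine_v1, "v2": engine_v2}, make_bids(100_000))
    for name, seconds in timings.items():
        print(f"{name}: {seconds:.4f} s")
```

A real comparison would swap the stand-in functions for the same pipeline executed on each runner or version, and would typically repeat each run several times before drawing conclusions from the timings.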

LinuxCon: Is this the first time you’re presenting?

Etienne: I went to Apache: Big Data in Vancouver last year and Seville also. It was a really nice atmosphere. But this is the first time I’m going to present something, so it’s going to be cool.

Ismaël: This will be the second time I have attended ApacheCon. I’ve already been to the one in Seville, Europe. I’ve noticed that it’s a family atmosphere. That’s why I feel very confident in this kind of environment, and it’s very interesting for me. I mean in addition to the very interesting technical talks. But this is my first time speaking at ApacheCon.

LinuxCon: When is your talk? What date and time is it?

Etienne: It will be on Wednesday, May 17 at 2:30 pm.

Learn first-hand from the largest collection of global Apache communities at ApacheCon 2017, May 16-18 in Miami, Florida. ApacheCon features 120+ sessions including five sub-conferences: Apache: IoT, Apache Traffic Server Control Summit, CloudStack Collaboration Conference, FlexJS Summit and TomcatCon. Secure your spot now! Linux.com readers get $30 off their pass to ApacheCon. Select “attendee” and enter code LINUXRD5. Register now.

How Amazon and Red Hat Plan to Bridge Data Centers

“There’s a lot of innovation on AWS. This makes OpenShift more attractive to more developers, but it’s also a storefront for Amazon features and products,” Red Hat CEO Jim Whitehurst told Fortune during an interview at the Red Hat Summit tech conference in Boston. Whitehurst said he started discussing this plan with AWS chief executive Andy Jassy in January.

Red Hat is not alone in trying to woo corporate users with better ties to AWS. Last fall, VMware and Amazon said they were working on a way to deploy VMware workloads on AWS, for example.

Read more at Fortune

The Case for Containerizing Middleware

It’s one thing to accept the existence of middleware in a situation where applications are being moved from a “legacy,” client/server, n-tier scheme into a fully distributed systems environment. For a great many applications whose authors have long ago moved on to well-paying jobs, containerizing the middleware upon which they depend may be the only way for them to co-exist with modern applications in a hybrid data center.

It’s why it’s a big deal that Red Hat is extending its JBoss Fuse middleware service for OpenShift. It’s also why Cloud Foundry’s move last December to make its Open Service Broker API an open standard can be viewed as a necessary event for container platforms.

Read more at The New Stack

Scaling Agile and DevOps in the Enterprise

In a recent Continuous Discussions (#c9d9) video podcast, expert panelists discussed scaling Agile and DevOps in the enterprise.

Our expert panel included: Gary Gruver, co-author of “Leading the Transformation, A Practical Approach to Large-Scale Agile Development,” and “Starting and Scaling DevOps in the Enterprise”; Mirco Hering, a passionate Agile and DevOps change agent; Rob Hirschfeld, CEO at RackN; Steve Mayner, Agile coach, mentor and thought leader; Todd Miller, delivery director for Celerity’s Enterprise Technology Solutions; and, our very own Anders Wallgren and Sam Fell.

During the episode, the panelists discussed lessons learned with regards to leadership, teams and the pipeline and patterns that can be applied for scaling Agile and DevOps in the Enterprise.

The full post can be found on the Electric Cloud blog

Learn How to Fix a Django Bug from Beginning to End

For those who are starting to code and want to make open source software, sometimes starting is hard. The idea of contributing to that fancy and wonderful library that you love can sound a little bit scary. Lucky for us, many of those libraries have room for whoever is willing to start. They also give us the support that we need. Pretty sweet, right?

Do you know that famous Python framework, Django? There’s one section on their bug tracker website called Easy pickings. It was made for anyone willing both to get started in open source and to contribute to an amazing library.

Read more at OpenSource.com

Cloud Computing Continues to Influence HPC

Traditionally, HPC applications have been run on special-purpose hardware, managed by staff with specialized skills. Additionally, most HPC software stacks are rigid and distinct from other more widely adopted environments, and they demand a special skillset from the researchers who want to run the applications, who often need to become programmers themselves. The adoption of cloud technologies increases the productivity of your research organization by making its activities more efficient and portable. Cloud platforms such as OpenStack provide a way to collapse multiple silos into a single private cloud while making those resources more accessible through self-service portals and APIs. Using OpenStack, multiple workloads can be distributed among the resources in a granular fashion that increases overall utilization and reduces cost.

Read more at insideHPC

Serverless Security Implications—From Infra to OWASP

By its very nature, Serverless (FaaS) addresses some of today’s biggest security concerns. By eliminating infrastructure management, it pushes its security concerns to the platform provider. Unfortunately, attackers won’t simply give up, and will instead adapt to this new world. More specifically, FaaS will move attackers’ focus from the servers to the application concerns OWASP highlights, and defenders should adapt priorities accordingly.

This post touches on which security concerns Serverless helps, and which ones it doesn’t. Each of these bullets is probably worthy of a full post of its own (which I may write later on!), but in this post I’ll keep remediation and risk management details light, in favor of covering the bigger picture.

Read more at Snyk

4 Ways to Take Control of your Wi-Fi Connections on Linux

Easy connection to the Internet over Wi-Fi is no longer a privilege denied Linux users. With a recent distribution on a fairly recent laptop, connecting your Linux laptop to an available Wi-Fi network is often as easy as it is with your phone.

But just getting something to work is only the first step. With a little extra effort, you can optimize your Wi-Fi connections on Linux for the best speed and improved privacy. 

Read more at PCWorld

OSEN Podcast: Tim Mackey, Black Duck

I spoke with Tim Mackey, Technology Evangelist from Black Duck. Tim spent a few years at Citrix working on XenServer and CloudStack, where he, like me and many others, started thinking about how to get code from project to product. Tim and I talked about open source risk management, the current state of IT and open source, Xen vs. KVM flashbacks, and more.

Read more at OSEN

Catch Up With The Linux Foundation at OpenStack Summit in Boston

The Linux Foundation will be at OpenStack Summit in Boston — one of the largest open cloud infrastructure events in the world — with many conference sessions, intensive training courses, giveaways, and a chance to win a free OpenStack training course or a Raspberry Pi 3 Starter Kit.

Stop by The Linux Foundation training booth for fun giveaways, including webcam covers and stickers, as well as two free ebooks: Open Source in the Enterprise and SysAdmin’s Essential Guide to Linux Workstation Security.

You can also enter the raffle for a chance to win either a free LFS252 OpenStack Administration Fundamentals course OR a Raspberry Pi 3 Starter Kit. The winners will be announced Thursday, May 11 at 10:45 a.m. Eastern time at The Linux Foundation Training booth (#C19).

The Linux Foundation is also looking forward to an array of conference events — including intensive OpenStack training, many project-focused presentations, and the Women of OpenStack Lunch.

Event Highlights

Be sure to stop by these conference booths to chat and learn more: OPNFV (Booth C15), FD.io (C16), OpenDaylight (C17), Cloud Foundry (C18), and Cloud Native Computing Foundation (C20).