Industry leaders and experts across financial services, technology and open source will come together for thought-provoking insights and conversations about how to best leverage open source software to solve industry challenges. SAN FRANCISCO, June 29, 2022 — The Linux Foundation, the nonprofit organization enabling mass innovation through open source, and co-host the Fintech Open Source…
Interact with multiple hosts simultaneously, on a per-playbook basis with Ansible’s serial keyword.
Read More at Enable Sysadmin
New features bringing unmatched query performance to open data lakehouses
Today, the Delta Lake project announced the Delta Lake 2.0 release candidate, which includes a collection of new features with vast performance and usability improvements. The final release of Delta Lake 2.0 will be made available later this year.
Delta Lake has been a Linux Foundation project since October 2019 and is the open storage layer that brings reliability and performance to data lakes via the “lakehouse architectures”, the best of both data warehouses and data lakes under one roof. In the past three years, lakehouses have become an appealing solution to data engineers, analysts, and data scientists who want to have the flexibility to run different workloads on the same data with minimal complexity and no duplication – from data analysis to the development of machine learning models. Delta Lake is the most widely-used lakehouse format in the word and currently sees over 7M downloads per month (and continues to grow).
Delta Lake 2.0 will bring some major improvements to query performance for Delta Lake users, such as support for change data feed, Z-order clustering, idempotent writes to Delta tables, column dropping, and many more (get more details in the Delta Lake 2.0 RC release notes). This enables any organization to build highly performant lakehouses for a wide range of data and AI use cases.
The announcement of Delta Lake 2.0 came on stage during Data + AI Summit 2022 keynote as Michael Armbrust, distinguished engineer at Databricks and a co-founder of the Delta Lake project, showed how the new features will dramatically improve performance and manageability compared to previous versions and other storage formats. Databricks had initially open sourced Delta Lake and has, with the Delta Lake community, been continuously contributing new features to the project. The latest set of features included in v2.0 have been first made available to Databricks customers, ensuring they are “battle-tested” for production workloads before being contributed to the project.
Databricks is not the only organization actively contributing to Delta Lake – developers from over 70 different organizations have been collaborating and contributing new features and capabilities.
“The Delta Lake project is seeing phenomenal activity and growth trends indicating the developer community wants to be a part of the project. Contributor strength has increased by 60% during the last year and the growth in total commits is up 95% and the average line of code per commit is up 900%. We are seeing this upward velocity from contributing organizations like Uber Technologies, Walmart, and CloudBees, Inc., among others,”
— Executive Director of the Linux Foundation, Jim Zemlin.
The Delta Lake community is inviting you to explore Delta Lake and join the community. Here are a few useful links to get you started:
The post Delta Lake project announces the availability of 2.0 Release Candidate appeared first on Linux Foundation.
At last week’’s Open Source Summit North America, Robin Ginn, Executive Director of the OpenJS Foundation, relayed a principle her mentor taught: “1+1=3”. No, this isn’t ‘new math,’ it is demonstrating the principle that, working together, we are more impactful than working apart. Or, as my wife and I say all of the time, teamwork makes the dream work.
This principle is really at the core of open source technology. Turns out it is also how I look at the Open Programmable Infrastructure project.
Stepping back a bit, as “the new guy” around here, I am still constantly running across projects where I want to dig in more and understand what it does, how it does it, and why it is important. I had that very thought last week as we launched another new project, the Open Programmable Infrastructure Project. As I was reading up on it, they talked a lot about data processing units (DPUs) and infrastructure processing units (IPUs), and I thought, I need to know what these are and why they matter. In the timeless words of The Bobs, “What exactly is it you do here?”
What are DPUs/IPUs?
First – and this is important – they are basically the same thing, they just have different names. Here is my oversimplified explanation of what they do.
In most personal computers, you have a separate graphic processing unit(s) that helps the central processing unit(s) (CPU) handle the tasks related to processing and displaying the graphics. They offload that work from the CPU, allowing it to spend more time on the tasks it does best. So, working together, they can achieve more than each can separately.
Servers powering the cloud also have CPUs, but they have other tasks that can consume tremendous computing power, say data encryption or network packet management. Offloading these tasks to separate processors enhances the performance of the whole system, as each processor focuses on what it does best.
In order words, 1+1=3.
DPUs/IPUs are highly customizable
While separate processing units have been around for some time, like your PC’s GPU, their functionally was primarily dedicated to a particular task. Instead, DPUs/IPUs combine multiple offload capabilities that are highly customizable through software. That means a hardware manufacturer can ship these units out and each organization uses software to configure the units according to their specific needs. And, they can do this on the fly.
Core to the cloud and its continued advancement and growth is the ability to quickly and easily create and dispose of the “hardware” you need. It wasn’t too long ago that if you wanted a server, you spent thousands of dollars on one and built all kinds of infrastructure around it and hoped it was what you needed for the time. Now, pretty much anyone can quickly setup a virtual server in a matter of minutes for virtually no initial cost.
DPUs/IPUs bring this same type of flexibility to your own datacenter because they can be configured to be “specialized” with software rather than having to literally design and build a different server every time you need a different capability.
What is Open Programmable Infrastructure (OPI)?
OPI is focused on utilizing open software and standards, as well as frameworks and toolkits, to allow for the rapid adoption and use of DPUs/IPUs. The OPI Project is both hardware and software companies coming together to establish and nurture an ecosystem to support these solutions. It “seeks to help define the architecture and frameworks for the DPU and IPU software stacks that can be applied to any vendor’s hardware offerings. The OPI Project also aims to foster a rich open source application ecosystem, leveraging existing open source projects, such as DPDK, SPDK, OvS, P4, etc., as appropriate.”
In other words, competitors are coming together to agree on a common, open ecosystem they can build together and innovate, separately, on top of. The are living out 1+1=3.
I, for one, can’t wait to see the innovation.
A special thanks to Yan Fisher of Red Hat for helping me understand open programmable infrastructure concepts. He and his colleague, Kris Murphy, have a more technical blog post on Red Hat’s blog. Check it out.
Click here to add your own text
Learn how to install software with RHEL’s package manager using the dnf command or the GNOME Software app.
Read More at Enable Sysadmin
In a new white paper, the Cardea Project at Linux Foundation Public Health demonstrates a complete, decentralized, open source system for sharing medical data in a privacy-preserving way with machine readable governance for establishing trust.
The Cardea Project began as a response to the global Covid-19 pandemic and the need for countries and airlines to admit travelers. As Covid shut down air travel and presented an existential threat to countries whose economies depended on tourism, SITA Aero, the largest provider of IT technology to the air transport sector, saw decentralized identity technology as the ideal solution to manage a proof of Covid test status for travel.
With a verifiable credential, a traveler could hold their health data and not only prove they had a specific test at a specific time, they could use it—or a derivative credential—to prove their test status to enter hotels and hospitality spaces without having to divulge any personal information. Entities that needed to verify a traveler’s test status could, in turn, avoid the complexity of direct integrations with healthcare providers and the challenge of complying with onerous health data privacy law.
Developed by Indicio with SITA and the government of Aruba, the technology was successfully trialed in 2021 and the code specifically developed for the project was donated to Linux Foundation Public Health (LFPH) as a way for any public health authority to implement an open source, privacy-preserving way to manage Covid test and vaccination data. The Cardea codebase continues to develop at LFPH as Indicio, SITA, and the Cardea Community Group extend its features and applications beyond Covid-related data.
On May 22, 2022 at the 15th KuppingerCole European Identity and Cloud Conference in Berlin, SITA won the Verifiable Credentials and Decentralized Identity Award for its implementation of decentralized identity in Aruba.
The new white paper from the Cardea Project provides an in-depth examination of the background to Cardea, the transformational power of decentralized identity technology, how it works, the implementation in Aruba, and how it can be deployed to authenticate and share multiple kinds of health data in privacy-preserving ways. As the white paper notes:
“…Cardea is more than a solution for managing COVID-19 testing; it is a way to manage any health-related process where critical and personal information needs to be shared and verified in a way that enables privacy and enhances security. It is able to meet the requirements of the 21st Century Cures Act and Europe’s General Data Protection Regulation, and in doing so enable use cases that range from simple proof of identity to interoperating ecosystems encompassing multiple cloud services, organizations, and sectors, where data needs to be, and can be, shared in immediately actionable ways.
Open source, interoperable decentralized identity technology is the only viable way to manage both the challenges of the present—where entire health systems can be held at ransom through identity-based breaches—and the opportunities presented by a digital future where digital twins, smart hospitals, and spatial web applications will reshape how healthcare is managed and delivered.”
This article was originally published on the Linux Foundation Public Health project’s blog.
The post Sharing Health Data while Preserving Privacy: The Cardea Project appeared first on Linux Foundation.
So, I am old enough to remember when the U.S. Congress temporarily intervened in a patent dispute over the technology that powered BlackBerries. A U.S. Federal judge ordered the BlackBerry service to shutdown until the matter was resolved, and Congress determined that BlackBerry service was too integral to commerce to be allowed to be turned off. Eventually, RIM settled the patent dispute and the BlackBerry rode off into technology oblivion.
I am not here to argue the merits of this nearly 20-year-old case (in fact, I coincidentally had friends on both legal teams), but it was when I was introduced to the idea of companies that purchase patents with the goal of using this purchased right to extract money from other companies.
Patents are an important legal protection to foster innovation, but, like all systems, it isn’t perfect.
At this week’s Open Source Summit North America, we heard from Kevin Jakel with Unified Patents. Kevin is a patent attorney who saw the damage being done to innovation by patent trolls – more kindly known as non-practicing entities (NPEs).
Kevin points out that patents are intellectual property designed to protect inventions, granting a time-bound legal monopoly, but they are only a sword, not a shield. You can use it to stop people, but it doesn’t give you a right to do anything. He emphasizes, “You are vulnerable even if you invented something. Someone can come at you with other patents.”
Kevin has watched a whole industry develop where patents are purchased by other entities, who then go after successful individuals or companies who they claim are infringing on the patents they now legally own (but is not something they invented). In fact, 88% of all high-tech patent litigation is from an NPE.
NPEs are rational actors using the legal system to their advantage, and they are driven by the fact that almost all of the time the defendant decides to settle to avoid the costs of defending the litigation. This perpetuates the problem by both reducing the risk to the NPEs and also giving them funds to purchase additional patents for future campaigns.
In regards to open source software, the problem is on the rise and is only going to get worse without strategic, consistent action to combat it.
Kevin started Unified Patents with the goal of solving this problem without incentivizing further NPE activity. He wants to increase the risk for NPEs so that they are incentivized to not pursue non-existent claims. Because NPEs are rational actors, they are going to weigh risks vs. rewards before making any decisions.
How does Unified Patents do this? They use a three-step process:
Detect – Patent Troll Campaigns
Disrupt – Patent Troll Assertions
Deter – Further Patent Troll Investment
Unified Patents works on behalf of 11 technology areas (they call them Zones). They added an Open Source Zone in 2019 with the help of the Linux Foundation, Open Invention Network, and Microsoft. They look for demands being filed in court, and then they selectively pick patent trolls out of the group and challenge them, attempting to disrupt the process. They take the patent back to the U.S. Patent and Trademark Office and see if the patent should have ever existed in the first place. Typically, patent trolls look for broad patents so they can sue lots of companies, making their investment more profitable and less risky. This means it is so broad that it probably should never have been awarded in the first place.
The result – they end up killing a lot of patents that should have never been issued but are being exploited by patent trolls, stifling innovation. The goal is to slow them down and eventually bring them to a stop as quickly as they can. Then, the next time they go to look for a patent, they look somewhere else.
And it is working. The image below shows some of the open source projects that Unified Patents has actively protected since 2019.
The Linux Foundation participates in Unified Patents’ Open Source Zone to help protect the individuals and organizations innovating every day. We encourage you to join the fight and create a true deterrence for patent trolls. It is the only way to extinguish this threat.
Learn more at unifiedpatents.com/join.
And if you are a die-hard fan of the BlackBerry’s iconic keyboard, my apologies for dredging up the painful memory of your loss.