The Linux Foundation would like to reiterate its statements and analysis of the application of US Export Control regulations to public, open collaboration projects (e.g. open source software, open standards, open hardware, and open data) and the importance of open collaboration in the successful, global development of the world’s most important technologies. At this time, we have no information to believe recent Executive Orders regarding WeChat and TikTok will impact our analysis for open source collaboration. Our members and other participants in our project communities, which span many countries, are clear that they desire to continue collaborating with their peers around the world.
As a reminder, we would like to point anyone with questions to our prior blog post on US export regulations, which also links to our more detailed analysis of the topic. Both are available in English and Simplified Chinese for the convenience of our audiences.
The Linux Foundation has partnered with edX to update the Open Source Jobs Report, which was last produced in 2018. The report examines the latest trends in open source careers, which skills are in demand, what motivates open source job seekers, and how employers can attract and retain top talent. In the age of COVID-19, this data will be especially insightful both for companies looking to hire more open source talent, as well as individuals looking to advance or change careers.
The report is anchored by two surveys, one of which explores what hiring managers are looking for in employees, and one focused on what motivates open source professionals. Ten respondents to each survey will be randomly selected to receive a US$100 gift card to a leading online retailer as a thank you for participating!
All those working with open source technology, or hiring folks who do, are encouraged to share your thoughts and experiences. The surveys take around 10 minutes to complete, and all data is collected anonymously. Links to the surveys are at the top and bottom of this post.
Linux Foundation Executive Director Jim Zemlin recently spoke at Cloud Native + Open Source Virtual Summit China 2020. We’d now like to republish his opening comments and a guide on how to get involved with the TARS project, the open source microservices framework.
The pandemic has thrown our global society into a health and economic crisis. It seems like there are conflicts every day from all over the world. Today, I want to remind you that open source is one of the great movements where collaboration, working together, and getting along is the essence of what we do.
Open source is not a zero-sum game, but it has had an incredible impact on us in a net positive way. I like to remind everyone that open source is public goods that will be freely available to everyone worldwide, no matter what wind of political or economic change brings us. The LF is dedicated to all of that.
Today, we are working hard to help folks during hard times, expanding our mentorship programs with over a quarter of million dollars of new donations to allow people to come in and train themselves on new skills during this tough time. We had a wonderful set of virtual events with thousands of people from hundreds of companies from countries worldwide working together.
We want to bring the power of open source to help during these times and have several new initiatives that we are working on. Most notably, our recently launched LFPH initiative, which has started with seven members: Cisco, doc.ai, Geometer, IBM, NearForm, Tencent, and VMware, and it’s hosting exposure notification projects such as Covid-Shield and Covid-Green, which are currently being deployed in Canada, Ireland, and several U.S. states to help find and reduce the spread of COVID-19.
We are also working on a considerable number of new initiatives, which I will talk about. Still, I like to remind you of what we are here to talk about, which is cloud computing, and how much cloud computing has impacted all of us. Microservices are an essential part of that. In China we are seeing the TARS project; the microservices framework is really taking off.
Two years ago, TARS joined the Linux Foundation, and ever since its community has been growing and new projects and contributors have been coming in. The TARS project provides a mature, high-performance microservices framework that supports multiple programming languages. We will talk more about the TARS Foundation in a little bit, but the microservices ecosystem has been growing and quickly turning applications and ideas in scale.
In addition to TARS, we have been seeing amazing work going on in the open source community. It begins with things such as the Software Package Data Exchange specification (SPDX), which was recently contributed as an international specification to the ISO/IEC JTC 1 for approval. This will help us track the usages of open source software across a complex global supply chain and reaffirm our commitment to the global movement.
We also see growth and projects with recent releases, such as our networking project, the Open network automation platform Frankfurt release, which is being used to automate the networks and edge computing service for telecommunication providers, cloud providers, and enterprises.
We‘ve seen new projects join our organization. One good example is MLflow — this project was contributed to our organization from Data Brick. This project has had an impressive community with over 200 contributors, which has been downloaded more than 2 million times. MLflow is part of the LF AI initiative. It will be a neutral home and open governance model to broaden the adaptation and contribution of things like MLflow. We have also seen new projects come to our organization, such as the FinOps Foundation, the consortium of financial companies. We are working together to grow the use of open source throughout our global financial system.
It’s impressive to see all the different projects that have been coming. And today, I’d like to introduce the TARS Foundation formally. TARS has been an amazing project, and in just the last few years, I’ve noticed that developers here in China are for the first time incubating and sharing new open-source projects in China and the rest of the world.
And the rest of the world is watching the progress of open source projects and seeing fantastic work. We are so proud of the work that is coming out of TARS.
You know, Just like the Linux Foundation is about more than Linux, the TARS Foundation is more than just TARS. It’s a microservices ecosystem.
Unfortunately, because of the COVID-19 pandemic, we had to cancel the Linux Foundation Member Summit this Spring, and we were unable to announce the TARS Foundation at that time.
But today, the Linux Foundation is proud to announce again that the TARS project has become the TARS Foundation, an open-source microservice foundation within the overall framework of the Linux Foundation, and its outcome has been rapid growth for both the TARS project and projects associated with TARS. TARS has really taken off, and it’s just amazing to see the amount of development.
We hope the TARS foundation will create a neutral home for additional projects for solving significant problems surrounding microservices, including but not limited to:
Agile development, DevOps best practices, and the comprehensive governance that we have will enable multi-languages, high performance, scalable solutions.
It is my pleasure to present what the TARS Foundation has achieved in the open source community.
There are many companies whose contributions are instrumental in establishing TARS’ microservices ecosystem. The TARS Foundation is proof of that. Currently, the TARS Foundation has Arm and Tencent as premier members and five general members: AfterShip, Ampere, API7, Kong, and Zenlayer.
In terms of TARS applications, it serves more than 100 companies from different industries, including Edge, E-sport, Fintech, Streaming, E-commerce, Entertainment, Telecommunication, Education, and more.
Furthermore, the TARS Foundation is striving to expand its microservices ecosystem, and it’s incorporating more functions such as Testing, Gateway, and Edge, to name a few. So far, the TARS Foundation has more than 30 projects.
Developers around the world are starting to realize that the TARS project is amazing and contribute as such. There are 12,000 developers actively using TARS. Also, 150 developers contribute code to TARS projects, from companies like Arm, Tencent, Google, Microsoft, Vmware, Webank, TAL, China Literature, iFlytek, Longtu Game, and many more.
An overview of the TARS framework and how you can contribute to the open source microservices community
What is TARS?
TARS is a new generation distributed microservice application framework that was created in 2008. It provides developers and enterprises with a complete set of solutions to build, release, deploy, and maintain stable and reliable applications that run at scale.
“a neutral home for open source microservices projects that empower any industry to quickly turn ideas into applications at scale”.
The TARS Foundation’s goal is to address the most common problems related to microservices application, including solving multi-programming language interoperability issues, mitigating transfer issues, maintaining data storage consistency, and ensuring high performance while supporting a growing number of requests.
Many companies have successfully used TARS framework from diverse industries such as fintech, esports, edge computing, online streaming, e-commerce, and education, to name a few.
Here is a complete timeline of the TARS Foundation’s development:
Initially developed by Tencent, the world’s largest online gaming company, the TARS project has created an open source microservices platform for modern enterprises to realize innovative ideas quickly with the user-friendly technology in the TARS framework.
In March 2020, the TARS project transitioned into the TARS Foundation under the Linux Foundation umbrella, aiming to support microservices development through DevOps best practices, comprehensive service governance, high-performance data transfer, storage scalability with massive data requests, and built-in cross-language interoperability. TARS has a mission to support the rapid growth of contributions and membership for a community focused on building a robust microservices platform.
The TARS Foundation provides a great platform for developers who are interested in contributing to an open source project. The organization extends different opportunities for developers to contribute to open source projects and the possibility to take on leadership roles and create major contributions in the broader open source community.
There are Contributor, Committer, Maintainer, and Ambassador roles in their open source ecosystem, each having different requirements and responsibilities.
How to become a Contributor
To get involved with TARS open source projects, you can first become a Contributor by participating in software construction and having at least one pull request merged into the source code.
There are several ways for software developers to engage with the TARS community and become contributors:
Help other users and answer questions.
Submit meaningful issues.
Use TARS projects in production to increase testing scenarios.
Improve technical documentations.
Publish articles on applications and case studies related to TARS projects.
Make changes to the code and test it on your local machine.
Commit those changes.
Push the committed code to GitHub.
Open a new pull request to submit your changes for review.
Your changes will be merged into the master branch if accepted.
Now you did it! You’ve become a TARS Contributor, and you will receive a Contributor t-shirt!
How to become a Committer
A Committer is a contributor who has made distinct contributions to the TARS repositories and has accomplished at least one essential construction project or has repaired critical bugs. He or she can also take on some leadership opportunities.
The Committer is expected to:
Display excellent ability to make technical decisions.
Have successfully submitted and merged five pull requests.
Have contributed to the improvement of project code quality and performance.
Have implemented significant features or fixed major bugs.
After meeting the above requirements, you can submit a Committer request:
STEP 1: Provide your proof of the above criteria under Repo ISSUE.
STEP 2: Submit your pull request after you receive a response with instructions
STEP 3: Once your application is accepted, you will become a TARS Committer!
As a Committer, you are able to:
Control the code quality as a whole.
Respond to the pull requests submitted by the community.
Mentor contributors to promote collaborations in the open source community.
Attend regular meetings for committers.
Know about project updates and trends in advance.
How to become a Maintainer
Maintainers are responsible for devising the subprojects in the TARS community. They will take the lead to make decisions associated with project development while holding power to merge branches. They should demonstrate excellent judgment and a sense of responsibility for subprojects’ well-being, as they need to define or approve design strategies suitable for developing subprojects.
The Maintainer is expected to:
Have a firm grasp of TARS technology.
Be proactive in organizing technical seminars and put forward construction projects.
Be able to handle more complicated problems in coding.
Get unanimously approved by the technical support committee (TSC).
As a Maintainer, you have the right to:
Devise and decide the top-level technical design of subprojects.
Define the technical direction and priority of sub-projects.
Participate in version releases and ensure code quality.
Guide Contributors and Committers to promote collaborations in the open source community.
How to become an Ambassador
Passionate about open source technology and community, Ambassadors promote and support extensive use of TARS technology to a wider audience of software developers. Ambassadors’ expertise and involvement in TARS projects will also acquire greater recognition in the community.
The Ambassador can:
Become a general member of the TARS Foundation.
Participate in TARS Foundation’s projects as a contributor, lecturer, or blogger.
Engage with developers by presenting at community events or sharing technology articles on online media platforms.
Ultimately, the TARS Foundation encourages a contributor to becoming a member of the governing board and the Technical Support Committee (TSC). At this level, you will focus on the organization’s strategic directions and decision-making as a whole.
Contributing to open source projects has many benefits. It strengthens your development skills, and your code is reviewed by other developers who can give a new perspective. You are also making new connections and even lifelong friendships with like-minded developers in the process of contributing. This is the open source model that has built many tech innovations that all of us enjoy today. Its sustainability depends on a free exchange of ideas and technology in our global community. Open source value and innovations are embedded in developers like you who can attempt development challenges and share insights with the broader community.
Software Package Data Exchange (SPDX) is an open standard for communicating software bill of materials (SBOM) information that supports accurate identification of software components, explicit mapping of relationships between components, and the association of security and licensing information with each component. The SPDX format has recently been submitted by the Linux Foundation and the Joint Development Foundation to the JTC1 committee of the ISO for international standards approval.
A group of eight healthcare industry organizations, composed of five medical device manufacturers and three healthcare delivery organizations (hospital systems), recently participated in the first-ever proof of concept (POC) of the SPDX standard for healthcare use.
This blog post is a summary of the results of this initial trial.
Why do we care about SBOMs and the medical device industry?
A Software Bill of Materials (SBOM) is a nested inventory or a list of ingredients that make up the software components used in creating a device or system. This is especially critical in the medical device industry and within healthcare delivery organizations to adequately understand the operational and cyber risks of those software components from their originating supply chain.
Some cyber risks come from using components with known vulnerabilities. Known vulnerabilities are a widespread problem in the software industry, such as known vulnerabilities in the Top 10 Web Application Security Risks from the Open Web Application Security Project (OWASP). Known vulnerabilities are especially concerning in medical devices since the exploitation of those vulnerabilities could lead to loss of life or maiming. One-time reviews don’t help, since these vulnerabilities are typically found after the component has been developed and incorporated. Instead, what is needed is visibility into the components of a medical device, similar to how food ingredients are made visible.
A measured path towards using SBOMs in the medical device industry
In June 2018, the National Telecommunications and Information Administration (NTIA) engaged stakeholders across multiple industries to discuss software transparency and to participate in a limited proof of concept (POC) to determine if SBOMs can be successfully produced by medical device manufacturers and consumed by healthcare delivery organizations. That initial POC was successfully concluded in the early fall of 2019.
Despite the limited scope, the NTIA POC results demonstrated that industry-agnostic standard formats can be leveraged by the healthcare vertical and that industry-specific formats are unnecessary.
Next, the participants in the NTIA POC explored whether a standardized SBOM format could be used for sharing information between medical device manufacturers and healthcare delivery organizations. For this next phase, the NTIA stakeholders engaged the Linux Foundation’s SPDX community to work with the NTIA Healthcare working group. The goal was to demonstrate through a proof of concept whether the open source SPDX SBOM format would be suitable for healthcare and medical device industry uses. The first phase of that trial was conducted in early 2020.
This NTIA framing document defines specific baseline data elements or fields that should be used to identify software components in any SBOM format, which can be mapped into corresponding field elements in SPDX:
(7.1) Relationship: CONTAINS
The 2020 POC conducted by NTIA working group had a stated objective to determine if SBOMs generated by Medical Device Manufacturers (MDMs) using SPDX could be ingested into SIEM (Security, Information and Event Management) solutions operated by the participating Healthcare Delivery Organizations (HDOs).
The MDMs included in this POC included Abbott, Medtronic, Philips, Siemens, and Thermo Fisher. The HDOs included Cedars-Sinai, Christiana Care, Mayo Clinic, Cleveland Clinic, Johns Hopkins, New York-Presbyterian, Partners/Mass General, and Sutter Health.
Execution and implementation of the SPDX SBOMs
The participating HDOs provided an inventory of the deployed medical devices in use within their organizations.
A best-effort approach was used to determine software identity as the names that software packages are known by are “ambiguous” and could be misinterpreted.
An example SPDX was created along with a guidance document for the MDMs to follow for use with the medical devices identified by the HDO inventory exercise.
The MDMs produced 17 distinct SPDX-based SBOMs manually and with generator tooling.
The SBOMs were delivered via secure transfer using enterprise Box accounts, simulating delivery via secure customer portals offered by each MDM.
Consumption of the SBOMs in the SPDX POC
As a result of the 2020 POC, all participating HDOs successfully ingested the SPDX SBOM into their respective SIEM solutions, immediately making the data searchable to identify security vulnerabilities across a fleet of products. This information can also be converted into a human-readable, tabular format for other data analysis systems.
Multiple HDOs are already collaborating with vendor partners to explore direct ingestion into medical device asset/risk management solutions as part of their device procurement. One of the HDOs is working with one of their vendor partners to explore direct ingestion into a healthcare Vendor Risk Management (VRM) solution, and another has developed a ”How-To Guide,” focusing on how to correctly parse out the Packages fields using regular expressions (regex).
As a positive indicator of SPDX’s suitability when used with asset management systems, two HDOs have begun configuring their respective internal tracking systems to track software dependencies and subcomponents. Additionally, multiple HDOs are collaborating with vendor partners to manage devices into medical device asset/risk management solutions through the device’s life by allowing for periodic updates and an audit trail.
Ongoing considerations for SPDX-based SBOMs for medical devices in healthcare organizations
Risk management, vulnerability management, and legal considerations are ongoing at the participating HDOs related to the use of SPDX-based SBOMs.
All of the responding HDOs are exploring vulnerability identification upon procurement (i.e., SIEM through initial ingestion of the SBOM) and on an on-going basis (i.e., SIEM, CMDB/CMMS, VRM). The participating HDOs intend to explore mitigation plan / compensating control exercises that will be performed to identify vulnerable components, measure exploitability, implement risk reduction techniques, and document this data alongside the SBOM.
The SPDX community intends to learn from these exercises and improve future versions of SPDX specification to include requested information determined to be needed to manage risk effectively.
Vulnerability management at HDOs
An HDO is already working with its Biomed team to manually perform vulnerability management processes on information extracted from SBOM data.
Another is working with their Vulnerability Management team to evaluate correlated SBOM data to credentialed/non-credentialed scans of the same device, which may prove useful in an information audit use case. A second HDO is currently working with their Vulnerability Management team on leveraging the SBOM data to supplement regular scanning results.
Participating HDOs have been developing SBOM product security language to add cybersecurity safeguards to the contract documentation.
The original POC was able to validate the conclusions of the NTIA Working Group that proprietary SBOM formats specific to healthcare industry verticals are not needed. This 2020 POC showed that the SPDX standard could be used as an open format for SBOMs for use by healthcare industry providers. Additionally, the ability to import the SPDX format into SIEM solutions will help HDOs adequately understand the operational and cyber risks of medical device software components from their originating supply chain.
There is work ahead to improve automation of SPDX-based SBOMs, including the automated identification of software components and determining which component vulnerabilities are exploitable in a given system. Participating HDOs intend to perform compensating control exercises to identify and implement risk reduction techniques building on this information. HDOs are also evaluating how SPDX can support other improvements to vulnerability management. In summary, this POC showed that SPDX could be an essential part of addressing today’s operational and cyber risks.
Kate Stewart is a Senior Director of Strategic Programs, responsible for the Open Compliance program at the Linux Foundation encompassing SPDX, OpenChain, Automating Compliance Tooling related projects. In this interview, we talk about the latest release and the role it’s playing in the open source software supply chain.
Here is a transcript of our interview.
Swapnil Bhartiya: Hi, this is Swapnil Bhartiya, and today we have with us, once again, Kate Stewart, Senior Director of Strategic Programs at Linux Foundation. So let’s start with SPDX. Tell us, what’s new going on in there in this specification?
Kate Stewart: Well, the SPDX specification just a month ago released auto 2.2 and what we’ve been doing with that is adding in a lot more features that people have been wanting for their use cases, more relationships, and then we’ve been working with the Japanese automotive-made people who’ve been wanting to have a light version. So there’s lots of really new technology sitting in the SPDX 2.2 spec. And I think we’re at a stage right now where it’s good enough that there’s enough people using it, we want to probably take it to ISO. So we’ve been re-formatting the document and we’ll be starting to submit it into ISO so it can become an international specification. And that’s happening.
Swapnil Bhartiya: Can you talk a bit about if there is anything additional that was added to the 2.2 specification. Also, I would like to talk about some of the use cases since you mentioned the automaker. But before that, I just want to talk about anything new in the specification itself.
Kate Stewart: So in the 2.2 specifications, we’ve got a lot more relationships. People wanted to be able to handle some of the use cases that have come up from containers now. And so they wanted to be able to start to be able to express that and specify it. We’ve also been working with the NTIA. Basically they have a software bill of materials or SBoM working groups, and SPDX is one of the formats that’s been adopted. And their framing group has wanted to see certain features so that we can specify known unknowns. So that’s been added into the specification as well.
And then there are, how you can actually capture notices since that’s something that people want to use. The license has called for it and we didn’t have a clean way of doing it and so some of our tool vendors basically asked for this. Not the vendors, I guess there are partners, there are open source projects that wanted to be able to capture this stuff. And so we needed to give them a way to help.
We’re very much focused right now on making sure that SPDX can be useful in tools and that we can get the automation happening in the whole ecosystem. You know, be it when you build a binary to ship to someone or to test, you want to have your SBoM. When you’ve downloaded something from the internet, you want to have your SBoM. When you ship it out to your customer, you want to be able to be very explicit and clear about what’s there because you need to have that level of detail so that you can track any vulnerabilities.
Because right now about, I guess, 19… I think there was a stat from earlier in the year from one of the surveys. And I can dig it up for you if you’d like, but I think 99% of all the code that was scanned by Synopsys last year had open source in it. And of which it was 70% of that whole build materials was open source. Open source is everywhere. And what we need to do is, be able to work with it and be able to adhere to the licenses, and transparency on the licenses is important as is being able to actually know what you have, so you can remediate any vulnerabilities.
Swapnil Bhartiya: You mentioned a couple of things there. One was, you mentioned tooling. So I’m kind of curious, what sort of tooling that is already there? Whether it’s open source or open source be it basically commercialization that worked with the SPDX documents.
Kate Stewart: Actually, I’ve got a document that basically lists all of these tools that we’ve been able to find and more are popping up as the day goes by. We’ve got common tools. Like, some of the Linux Foundation projects are certainly working with it. Like FOSSology, for instance, is able to both consume and generate SPDX. So if you’ve got an SPDX document and you want to pull it in and cross check it against your sources to make sure it’s matching and no one’s tampered with it, the FOSSology tool can let you do that pretty easily and codes out there that can generate FOSSology.
Free Software Foundation Europe has a Lindt tool in their REUSE project that will basically generate an SPDX document if you’re using the IDs. I guess there’s actually a whole bunch more. So like I say, I’ve got a document with a list of about 30 to 40, and obviously the SPDX tools are there. We’ve got a free online, a validator. So if someone gives you an SPDX document, you can paste it into this validator, and it’ll tell you if it’s a valid SPDX document or not. And we’re looking to it.
I’m finding also some tools that are emerging, one of which is decodering, which we’ll be bringing into the Act umbrella soon, which is looking at transforming between SPDX and SWID tags, which is another format that’s commonly in use. And so we have tooling emerging and making sure that what we’ve got with SPDX is usable for tool developers and that we’ve got libraries right now for SPDX to help them in Java, Python and Go. So hopefully we’ll see more tools come in and they’ll be generating SPDX documents and people will be able to share this stuff and make it automatic, which is what we need.
Another good tool, I can’t forget this one, is Tern. And actually Tern, and so what Tern does is, it’s another tool that basically will sit there and it will decompose a container and it will let you know the bill of materials inside that container. So you can do there. And another one that’s emerging that we’ll hopefully see more soon is something called OSS Review Toolkit that goes into your bill flow. And so it goes in when you work with it in your system. And then as you’re doing bills, you’re generating your SBoMs and you’re having accurate information recorded as you go.
As I said, all of this sort of thing should be in the background, it should not be a manual time-intensive effort. When we started this project 10 years ago, it was, and we wanted to get it automated. And I think we’re finally getting to the stage where it’s going to be… There’s enough tooling out there and there’s enough of an ecosystem building that we’ll get this automation to happen.
This is why getting it to ISO and getting the specification to ISO means it’ll make it easier for people in procurement to specify that they want to see the input as an SPDX document to compliment the product that they’re being given so that they can ingest it, manage it and so forth. But by it being able to say it’s an ISO standard, it makes the things a lot easier in the procurement departments.
OpenChain recognized that we needed to do this and so they went through and… OpenChain is actually the first specification we’re taking through to ISO. But for SPDX, we’re taking it through as well, because once they say you need to follow the process, you also need some for a format. And so it’s very logical to make it easy for people to work with this information.
Swapnil Bhartiya: And as you’ve worked with different players, different of the ecosystem, what are some of the pressing needs? Like improve automation is one of those. What are some of the other pressing needs that you think that the community has to work on?
Kate Stewart: So some of the other pressing needs that we need to be working on is more playbooks, more instructions, showing people how they can do things. You know, we figured it out, okay, here’s how we can model it, here’s how you can represent all these cases. This is all sort of known in certain people’s heads, but we have not done a good job of expressing to people so that it’s approachable for them and they can do it.
One of the things that’s kind of exciting right now is the NTIA is having this working group on these software bill of materials. It’s coming from the security side, but there’s various proof of concepts that are going on with it. One of which is a healthcare proof of concept. And so there’s a group of about five to six device manufacturers, medical device manufacturers that are generating SBoMs in SPDX and then there are handing them into hospitals to go and be able to make sure they can ingest them in.
And this level of bringing people up to this level where they feel like they can do these things, it’s been really eye-opening to me. You know, how much we need to improve our handholding and improve the infrastructure to make it approachable. And this obviously motivates more people to be getting involved. From the vendors and commercial side, as well as the open source, but it wouldn’t have happened, I think, to a large extent for SPDX without this open source and without the projects that have adopted it already.
Swapnil Bhartiya: Now, just from the educational awareness point of view, like if there’s an open source project, how can they easily create SBoM documents that uses the SPDX specification with their releases and keep it synced?
Kate Stewart: That’s exactly what we’d love to see. We’d love to see the upstream projects basically generate SPDX documents as they’re going forward. So the first step is to use the SPDX license identifiers to make sure you understand what the licensing should be in each file, and ideally you can document with eTags. But then there’s three or four tools out there that actually scan them and will generate an SPDX document for you.
If you’re working at the command line, the REUSE Lindt tool that I was mentioning from Free Software Foundation Europe will work very fast and quickly with what you’ve got. And it’ll also help you make sure you’ve got all your files tagged properly.
If you haven’t done all the tagging exercising and you wonder [inaudible 00:09:40] what you got, a scan code works at the command line, and it’ll give you that information as well. And then if you want to start working in a larger system and you want to store results and looking things over time, and have some state behind it all so like there’ll different versions of things over time, FOSSology will remember from one version to another and will help you create these [inaudible 00:10:01] off of bill materials.
Swapnil Bhartiya: Can you talk about some of the new use cases that you’re seeing now, which maybe you did not expect earlier and which also shows how the whole community is actually growing?
Kate Stewart: Oh yeah. Well, when we started the project 10 years ago, we didn’t understand containers. They weren’t even not on the raw mindset of people. And there’s a lot of information sitting in containers. We’ve had some really good talks over the last couple of years that illustrate the problems. There was a report that was put out from the Linux Foundation by Armijn Hemel, that goes into the details of what’s going on in containers and some of the concerns.
So being able to get on top of automating, what’s going on with concern inside a container and what you’re shipping and knowing you’re not shipping more than you need to, figuring out how we can improve these sorts of things is certainly an area that was not initially thought about.
We’ve also seen a tremendous interest in what’s going on in IOT space. And so that you need to really understand what’s going on in your devices when they’re being deployed in the field and to know whether or not, effectively is vulnerability going to break it, or can you recover? Things like that. The last 10 years we’ve seen tremendous spectrum of things we just didn’t anticipate. And the nice thing about SPDX is, you’ve got a use case that we’re not able to represent. If we can’t tell you how to do it, just open an issue, and we’ll start trying to figure it out and start to figure if we need to add fields in for you or things like that.
Swapnil Bhartiya: Kate, thank you so much for taking your time out and talking to me today about this project.
SODA Foundation is an open source project under Linux Foundation that aims to establish an open, unified, and autonomous data management framework for data mobility from the edge, to core, to cloud. We talked to Steven Tan, SODA Foundation Chair, to learn more about the project.
Here is a transcript of the interview:
Swapnil Bhartiya: Hi, this is Swapnil Bhartiya, and today we have with us Steven Tan, chair of the SODA foundation. First of all, welcome to the show.
Steven Tan: Thank you.
Swapnil Bhartiya: Tell us a bit about what is SODA?
Steven Tan: The foundation is actually a collaboration among vendors and users to focus on data management for, how do you call, autonomous data mesh management. And the point of this whole thing is how do we serve the users? Because a lot of our users are getting a lot of data challenges, and that’s what this foundation is for. To get users and vendors together to help to address these data challenges.
Swapnil Bhartiya: What kind of data are we talking about?
Steven Tan: The data that we’re talking about is referring to anything like data protection, data governance, data replication, data copy management and stuff like that. And also data integration, how to connect the different data silos and stuff.
Swapnil Bhartiya: Right. But are we talking about enterprise data or are we talking consumer data? Like there is a lot of data with Facebook, Google, and Gmail, and then there are a lot of enterprise data, which companies … Sorry, as an enterprise, I might put something on this cloud, I can put it on this cloud. So can you please clarify what data are we talking about?
Steven Tan: Actually, the data that we’re talking about is … It depends on the users. There’re all kinds of data. Like for example, I mean, in the keynote that I gave two days ago, the example I gave was from Toyota. So Toyota use case is actually car data. So car data refers to things like the car sensor data, videos, map data and stuff. And then we have users like China Unicom. I mean, they have enterprise companies going to the cloud and so on. So they’ve all kinds of enterprise data over there. And then we also have other users like Yahoo Japan, and they have like a website. So the data that you’re talking about is web data, consumer data and stuff like that. So it’s across the board.
Swapnil Bhartiya: Oh, so it’s not as specific to an industry or any space or sector, okay. But why do you need it? What is the problem that you see in the market and in the current sphere that you’re like, hey, we should create something like that?
Steven Tan: So the problem that came, I mean the reason why all these companies came together is that they are building data centers that are from small to big. But a lot of the challenges that you have is like, it’s hard for a single project to address. It’s not like a business where we have a specific problem and then we need this to be solved and so on, it’s not like that. A lot of it is like, how do you connect the different pieces together in the data center together?
So there’s nothing like, no organization like that that can help them solve this kind of problem. Like how do you have, in order to address the data of … Or how do you address things like taking care of data protection and data privacy at the same time? And at the same time, you want to make sure that this data can be governed properly. So there isn’t any single organization that can help to take care of this kind of stuff, so we’re helping these users understand their problems and then come together and then we plan projects and roadmaps based on their problems and try to address them through these projects in the SODA foundation.
Swapnil Bhartiya: And you gave an example of data from the cars and all these things. Does that also mean that open source has helped solving a lot of problems by breaking down a lot of silos so that there’s a lot of interaction between different silos, which were like earlier separated and isolated? Today, as you mentioned, we are living in a data driven world. No matter what we do all the way from the Ring, to what we are doing right now, talking to each other, to the product that we’ll create in the end. But most of this data is living in their own silos. There may be a lot of value in that data, which cannot be extracted because one, it is locked into the silos. The second problem is that these days, data is kind of becoming the next oil. These companies are trying to capture all the data, irrespective of the fact of what value do they see in that data today? And by leveraging machine learning and deep learning, they can in the future … So how do you look at that, and how is SODA foundation going to break those silos, without compromising on our privacy, yet allow companies … Because the fact is, as much as I prefer my privacy, I also want Google Maps to tell me the fastest route where I want to go.
Steven Tan: Right. So I think there are certain, I mean, there are different levels of privacy that we’re going to take care of. And in terms of like, first of all, there are all kinds of … I mean, in terms of the different countries or different States or different provinces like in different countries, there are different kinds of regulations and so on. So first of all, like the data silos you talk about. Yes, that’s one of the key problems that we’re trying to solve. How to connect all the different data silos so as to reduce fragmentation, and then try to minimize the so called dark data that you’re talking about, and then extract all the values over there. So that’s one of the things that we try to get here. I mean, we try to connect all the different pieces, like in the different … The data may be sitting in the edge in the data center or different data centers and in the cloud. We try to connect all these pieces together.
I mean, that’s one of the first things that we tried to do. And then we tried to have data policies. I think this is a critical piece of things that a lot of the solutions out there don’t address. You have data policies, but it may be the data policies just for a single vendor solution. But once the data gets out, that solution then is out of control. So what we’re trying to do here is say, how do you have data policies across different solutions, so no matter where the data is it’s governed the same way, consistently? That’s the key. So then you can talk about how can you really protect the data in terms of privacy or govern the data or control the data? And in terms of the, I mentioned about the regions, right? So you know where the data is, and you know what kind of regulations that need to be taken care of and you apply it right there. That’s how it should work.
Swapnil Bhartiya: When we look at the kind of a scenario you talked about, I see it as two-fold. One is there is a technology problem and the second is people problem. So is SODA foundation going to deal with both, or are you going to just deal with the technology aspect of it?
Steven Tan: The technology part that we talk about, we try to define in terms of the API and so on to all the data policies and so on, and try to get as many companies to support this as possible. And then the next thing that we try to do is actually try to work with standards organizations to try to make this into a standard. I mean, that’s what we’re trying to do here.
And then government aspects, there are certain organizations that we are talking to. Like there’s the CESI, it’s China Electronic Standards organizations that we’re talking to that’s trying to work things into their … Actually, I’m not sure about China, because it’s, I mean, we don’t know about their sphere of influence within the CSI and so on. And then for the industry standards, there’s [inaudible 00:09:05] and so on, we’re trying to work with them and trying to get it to work.
Swapnil Bhartiya: Can we talk about the ecosystem that you’re trying to build around SODA foundation? One would be the participants who are actually contributing either the code or the vision, and then the users community who would actually be benefiting from it?
Steven Tan: So the ecosystem that we are trying to build, that’s the core part, which is actually the framework. So the framework, I mean, this part will be more of the data vendors or the storage vendors that will be involved in trying to build this ecosystem. And then the outer part, what I call the outer part of the ecosystem will be things like the platforms. Things like Kubernetes, VMware, all these different vendors, and then networking kind of stuff that you need to take care of like the big data analytics and stuff.
And then for the users, actually, if you can see from the SODA end-user advisory committee, I mean, that’s where most of our users are participating in the communication. So most of these users, I mean, they are from different regions and different countries and different industries. So we try to serve, I mean, whichever participant is interested in, they can participate in this thing. But the main thing is that because they may be from different industries, but actually most of the issues that they have is still the same thing. So there are some commonalities among all these users.
Swapnil Bhartiya: We are in the middle of 2020, because of COVID-19 everything has slowed down, things have changed. What does your roadmap, what does your plan look like? The structure, the governance and the plan for ’21 or end of the year?
Steven Tan: We are very, how do you call it? Very community-driven or focused kind of organization. We hold a lot of meetups and events and so on where we get together the users and the vendors and so on and the community in general. So with this COVID-19 thing, a lot of the plans has been upset. I mean, it’s in chaos right now. So most of the things are like what everybody is doing, moving online. So we are having some webinars and stuff, even as of right now when we are talking, we are having a mini summit going on with the Open Source Summit North America right now.
So for the rest of this year, most of our events will be online. We’re going to have some webinars and some meetups, you can find it out from our website. And the other plans that we have is that we are going to have, we just released the SODA federal release, which is the 1.0 release. And through the end of this year, we’re going to have two more releases, the G release and the H release at the end of this year. G release is going to be in September, and H is in the end of the year. And we’re trying to engage our users with things like the POC testing for the federal. Because each release that we have, we try to get them to do the testing, and then so that’s the way of them trying to provide feedback to us. Whether that works for them or how can we improve to make the code work for what they need.
Swapnil Bhartiya: Awesome. So thank you so much for taking your time out and explaining more about SODA foundation, and I look forward to talking to you again because I can see that you have a very exciting pipeline ahead. So thank you.
Steven Tan: Thank you, thank you very much.