PingCAP had high hopes that its TiKV project would develop into a building block for the next generation of distributed systems by providing reliable high quality and practical storage foundation. To accomplish that, it decided to contribute TiKV to the Cloud Native Computing Foundation (CNCF) to make it vendor-neutral and widely used across organizations. It seems headed in that direction, especially now that the project recently graduated, further demonstrating its maturity and sustainability. On behalf of the Linux Foundation, Swapnil Bhartiya, founder and host of TFiR, sat down with two members of the TiKV project, Siddon Tang and Calvin Weng, to learn more about the project’s evolution.
Here is a transcript of the discussion:
Swapnil Bhartiya: What is TiKV project and what problem are you trying to solve? Siddon Tang: TiKV is an open source, distributed transactional key value database. TiKV is inspired by Google Spanner and HBase, but the design is simpler and more practical. Why did we develop TiKV at PingCAP? We want to build a distributed database with SQL compatibility. We built the SQL and then we wanted to build a distributed key-value storage layer that supported our database. At first, we tried to use HBase, but its performance was not what we expected so we decided to build our own distributed key-value database. That’s how TiKV started.
Calvin Weng: It was originally created to complement TiDB, but we soon realized that the TiKV project could be decoupled from TiDB and serve as a unified distributed storage layer that supported distributed transactions, horizontal scalability, and cloud-native architecture.
We also realized that with the amount of data we generate, there could be a demand for such a solution in the cloud-native communities. So, we contributed it to the CNCF to develop it as a building block for the next generation of distributed systems by providing a reliable high quality and practical storage foundation.
Swapnil Bhartiya: How is CNCF helping the TiKV project and the community? Calvin Weng: Thanks for the question. I am a liaison between the CNCF and the TiKV project. The CNCF has been immensely helpful in shaping TiKV into what it is today in terms of both the project and the community. There are a few things that I would like to elaborate on and the first is neutrality. CNCF provides a neutral home to projects like ours, so that developers from different organizations are willing to collaborate, contribute and eventually become the leaders in the project. This is very important for the broader community to perceive TiKV as a vendor-neutral and universal project that belongs to the community instead of a single company like PingCAP. People will feel comfortable adopting it or developing their own apps on TiKV.
Another important aspect is exposure, which includes publicity and marketing support that we get from CNCF so that we are known by the broader community. More people and more companies could get involved, which also means more adoption.
Last but not least is diversity in the maintainer and the contributor structure. This is a very important criterion for CNCF graduation.
Swapnil Bhartiya: Since you mentioned graduation, can you talk about what it means for a project like TiKV to become a graduated project? How does it affect the project and what does it mean for its users? Calvin Weng: TiKV has a lot of adoptions. There are more than 1,000 deployments in production. It is battle-tested. Moving from incubation to graduation is a very solid and convincing validation of the technology, its open governance, its vision, maturity and sustainability.
From a user’s perspective, graduation means the credibility and reliability of the project. It means that the TiKV project is a mature enough project for cloud-native architecture. It also means that the TiKV community is an active and healthy community. It boosts the confidence of users.
Swapnil Bhartiya: One last question before we wrap this up: can you talk about the roadmap of the TiKV project? Siddon Tang: Our focus is on making it faster, easier-to-use, and cost-effective. We just released the 4.0 version and in the next major release of 5.0, we want it to be more cloud-friendly and be able to smoothly run on AWS S3, AWS EBS Cloud Disk or any other cloud storage. We are also working on making TiKV handle different workloads. We are also working on adding support for other database engines so it can support different workloads. The long-term goal is to introduce AI so it can use different engines to certify different workloads.
“Good words are worth much, and cost little.”George Herbert
Although effective communication is an essential life skill, it is the most critical element in any business . Lack of accurate communication is the common cause of any organization’s issues, causing conflicts, reducing client relationships, team effectiveness, and profitability . According to the Project Management Institute (PMI), ineffective communication is the main contributor to project failure one-third of the time. It has a negative impact on project success more than half of the time .
In open source projects where there is a diverse and world spread community, effective communication is the key to projects’ success. Using the right technology is crucial for that. So, which tools do open source communities use for communication?
Open Source community communication by example
The Ubuntu community uses mailing lists for development and team coordination. The mailing lists are split into announcements and news, support, development, testing and quality assurance, and general (such as translation, marketing, and documentation) . Despite the mailing lists, IRC (Internet Relay Chat) channels are used for informal daily chats and short-term coordination tasks . If someone wants to know what is going on on Ubuntu, but doesn’t want to subscribe to the high traffic mailing list, the web forumcan be used to get support and discuss the future of Ubuntu. Finally, Ask Ubuntu can be used to ask technical questions.
Mailing lists are the main communication channels in the Linux Kernel. For newcomers that would like to learn more about the Linux kernel development, there is the kernelnewbies resource and #kernelnewbies IRC channel on OFTC. This online resource provides information on basic kernel development questions. Additionally, the kernelnewbies IRC channel is a vehicle for contributors to ask questions in real-time and get help from experts in the kernel community. The Linux Kernel Mailing List (LKML) is where most development discussions and announcements are made. Kernel developers send patches to the mailing lists as outlined in the Submitting patches: the essential guide to getting your code into the kernel. The archives from each mailing list can be found at https://lore.kernel.org/lists.html.
Shuah Khan, a Linux Fellow, mentioned in an interview  that before contributing to the Linux Kernel, it’s important to subscribe to the kernel-related mailing lists “to understand the dynamics.” Khan said, “The process works like this: you walk into a room. People are gathering in small groups and are talking to each other. You have to break into one of these conversations. That is the process of watching the mailing lists, watching the interaction, and learning from that before you start sending out a patch.”
OpenStack has many communication channels such as IRC channels for both public meetings and projects as well as mailing lists. The mailing lists are used to asynchronously communicate and share information, team communication, and cross-project communication. Additionally, mailing lists in OpenStack are used to communicate with non-developer community members of OpenStack .
IRC channels are one of the most important communication methods in Gnome. They are a google place to know what the community is talking about and also ask for help. There are many channels on Discourse, including discussions about Gnome’s sub-projects, community-related topics, internationalization, etc. Similar to other communities, mailing lists can be used for discussing specific topics. Finally, PlanetGnome and GnomeNews can be used to follow the latest news of the project.
So, where does communication occur in open source projects?
As observed in our previous discussion, mailing lists seem to be the most used communication method. Previous work has also found that “mailing lists are the bread and butter of project communications”  and that “the developer mailing list is the primary communication channel for an OSS project” . However, as we have previously mentioned, mailing lists are not the only communication channel used in OSS. Other channels (such as IRC channels and forums) also play an important role.
Guzzy et. al  mention that when more than one communication repository exists, the policy of most OSS is to transfer all official decisions and useful discussions to the mailing lists, so that they can later be retrieved. Thus, traceability and transparency of information is an important matter here.
The benefit of using mailing lists is that it is an asynchronous form of communication, and it is an easy resource to share information with the entire community. Additionally, mailing lists allow people that are in different timezones to engage, as well as people that have different levels of English proficiency, may better manage it in text messages .
However, mailing lists might also have their disadvantages. Previous work  found that developers have problems maintaining awareness of each other’s work when discussing on the mailing lists. Additionally, recovering traceability links among different communication repositories might help researchers and community members to have a more complete picture of the development process.
What are the common DOs and DON’Ts when using OSS mailing lists?
Given that mailing lists are one of the common ways to communicate in open source projects, it is worth knowing how to communicate in mailing lists. Although each project has its own set of rules, certain conventions should be followed.
Prefix the subject with topic tags in square brackets. This makes email threads easier for readers to categorize and decide what they should read quickly. For example, OpenStack has documentation  establishing how to prefix the subject, i.e., community members should use [docs] to address any kind of documentation discussions that are cross-projects and so on.
Sometimes it’s appropriate to change the subject rather than start a new thread.
Exceptions: Linux Kernel mailing lists use “bottom post” protocol (writing the message below the original text) rather than “top post” (writing the message above the original text of an email, which is what most mail clients are set to do by default.)
Plain text: Send your email as plain text only! Please, don’t send HTML emails.
Line wrapping: Lines should be wrapped at 72 characters or fewer.
Always useinline replies, i.e., break the original message by replying to each specific part of the message.
When replying to long discussions,trim your message and leave only the relevant parts to the reply.
Avoid cross-posting, i.e., posting the same message to many mailing lists at the same time.
Exceptions: The Linux Kernel maintains mailing lists for each subsystem, and patches are often sent to multiple mailing lists for review and discussion. However, avoid “top posting” on a Linux Kernel mailing list.
Avoid sending the wrong topic to the wrong mailing list. Make sure that your topic is the topic of the mailing list.
Setting up your email client
The Linux Kernel has great documentation on setting different email clients according to the rules mentioned above.
How to minimize the harm caused by conflicts?
Even if the code of conduct is applied, conflicts might exist. Many actions can be taken in case of dispute, and here are some examples:
Gather information about the situation
If someone has violated the code of conduct, you should carefully analyze the situation according to the experience working with that person . It is essential to read the past comments and interactions with that person to have an unbiased perspective about what happened. Stephanie Zvan  has mentioned that the best way to avoid a conflict is not to get pulled into an argument. It is important to focus on what you need to do instead of getting sidetracked into dealing with others’ behaviors.
Take appropriate actions
Two ways to respond to the code of conduct violation is that the moderator of the community (i) in a thoughtful way explain in public how the person’s behavior affected the community, or (ii) privately reach out to the person and explain how that behavior was negative .
“A code of conduct that isn’t (or can’t be) enforced is worse than no code of conduct at all: it sends the message that the values in the code of conduct aren’t actually important or respected in your community.”Ada Initiative
Open source projects are, in large part, successful due to the collaborative nature of projects. Thus, start conversations that lead to collaboration. That means, give feedback, support each other’s communication, and share your ideas.
There is no additional cost to being transparent and authentic with your community. In that way, it is easy to keep your team informed, empowered, and focused on one specific goal or task.
About the author:
Isabella Ferreira is an Advocate at TARS Foundation, a cloud-native open-source microservice foundation under the Linux Foundation.
The Linux Foundation consolidated its projects around AI, ML & Data by bringing them under the umbrella of the LF AI & Data Foundation. Swapnil Bhartiya, founder and host at TFiR.io, sat down with Ibrahim Haddad, Executive Director of LF AI & Data to discuss this consolidation.
Transcript of the discussion:
Swapnil Bhartiya: A lot of consolidation is happening within the Linux Foundation around AI/ML projects. Can you talk about what AI/ML & data projects are there under the Linux Foundation umbrella right now?
Ibrahim Haddad: So, if you think of Linux Foundation, it is kind of a foundation of foundations. There are multiple umbrella foundations. There’s the CNCF (Cloud Native Computing Foundation), there’s LF Edge, there’s the Hyperledger project, automotive, et cetera. And LF AI & Data is one of these umbrella foundations. We share the same goal, which is to accelerate the development of open-source projects and innovation. However, we each do it in our specific domains.
We’re focused on AI, machine learning, deep learning, and the data aspects of AI. The LF AI & Data Foundation was initially kicked off as LF Deep Learning in March of 2018. We grew a bit, and we started to host projects in other subdomains within the AI umbrella. And then we rebranded again to LF AI & Data to reflect the additional growth in our portfolio.
As of today, we host 22 projects across multiple domains of machine learning, deep learning, data models, and trusted AI. We have, I believe, 36 numbered companies that are involved in our foundation.
Swapnil Bhartiya: Within the Linux Foundation, there are a lot of projects that at times overlap, and then there are gaps as well. So, within the AI/ML space, where do you still see gaps that need to be bridged and overlaps that need consolidation?
Ibrahim Haddad: When a project is contributed to the foundation, we see under which umbrella it fits, however it’s the decision of the project where they want to go, we only offer guidance. If projects do overlap under the same umbrella, it’s their call to make. In terms of consolidation, we’re actually in the process of doing this at least in the AI space. We recently announced the formation of LF AI & Data, which consolidates two projects – LF AI Foundation and ODPi.
Swapnil Bhartiya: Can you also talk about what are the new goals or new areas that the Foundation is focusing on after this consolidation and merger?
Ibrahim Haddad: The first one is increasing the collaboration between the projects that are on the data side and the traditional open-source AI projects that we host. We host about seven projects that focus on the data and 15 projects in the general AI domain. One of the activities we launched, which we are going to accelerate in 2021, is creating integration across different projects so that companies see a tighter integration within projects inside the foundation.
The second area is trusted AI to build trust and a responsible AI system, which is really a hot topic across industry verticals including governments, NGOs and companies. They all are putting emphasis on building fair systems, systems that don’t create bias, systems that are transparent, systems that are robust. Building trust with the consumer of these systems is a very critical thing. So trusted and responsible AI would be a key area in addition to the integration and growing the data/AI collaborations.
KubeCon, November 17, 2020 – The Linux Foundation, the nonprofit organization enabling mass innovation through open source, today announced it will host the Servo web engine. Servo is an open source, high-performance browser engine designed for both application and embedded use and is written in the Rust programming language, bringing lightning-fast performance and memory safety to browser internals. Industry support for this move is coming from Futurewei, Let’s Encrypt, Mozilla, Samsung, and Three.js, among others.
“The Linux Foundation’s track record for hosting and supporting the world’s most ubiquitous open source technologies makes it the natural home for growing the Servo community and increasing its platform support,” said Alan Jeffrey, Technical Chair of the Servo project. “There’s a lot of development work and opportunities for our Servo Technical Steering Committee to consider, and we know this cross-industry open source collaboration model will enable us to accelerate the highest priorities for web developers.”
The Linux Foundation is home to many of the world’s most important open source projects, and also home to many of the top open source experts. Our instructor-led training courses are taught by hands-on practitioners who have used, built, and contributed to these projects for years. Instructor-led courses work differently to eLearning courses in that they take place at a specific time and are led by a teacher in real-time. The courses typically involve 3-4 full, consecutive days of instructional and lab time, meaning you can complete the training quickly and in a highly structured format. Having a live instructor also means you have the opportunity to ask questions and interact in real-time.
To increase access to this training, through November 24 all instructor-led training courses are discounted by 30-50%!
The Linux Foundation, the nonprofit organization enabling mass innovation through open source, and Cloud Native Computing Foundation® (CNCF®), which builds sustainable ecosystems for cloud-native software, today announced the Certified Kubernetes Security Specialist (CKS), previously announced to be in development in July, is now generally available.
CKS is a two-hour, performance-based certification exam that provides assurance that a certificant has the skills, knowledge, and competence on a broad range of best practices for securing container-based applications and Kubernetes platforms during build, deployment, and runtime. The exam is taken remotely with a live proctor monitoring via webcam and screen sharing. Candidates for CKS must hold a current Certified Kubernetes Administrator (CKA) certification to demonstrate they possess sufficient Kubernetes expertise before sitting for the CKS. The certification remains valid for two years from the date it is awarded.
“The microservice architectural style is an approach to developing a single application as a suite of small services, each running in its own process and communicating with lightweight mechanisms, often an HTTP resource API. These services are built around business capabilities and independently deployable by fully automated deployment machinery. There is a bare minimum of centralized management of these services, which may be written in different programming languages and use different data storage technologies.“
James Lewis and Martin Fowler (2014) 
It is expected that in 2020, the global cloud microservices market will grow at a rate of 22.5%, with the US market projected to maintain a growth rate of 27.4% . The tendency is that developers will move away from locally hosted applications and shift into the cloud. Consequently, this will help businesses minimize downtime, optimize resources, and reduce infrastructure costs. Experts also predict that by 2022, 90% of all applications will be developed using microservices architecture . This article will help you to learn what microservices are and how companies have been using it nowadays.
What are microservices?
Microservices have been widely used around the world. But what are microservices? Microservice is an architectural pattern in which the application is based on many small interconnected services. They are based on the single responsibility principle, which according to Robert C. Martin is “gathering things that change for the same reason, and separate those things that change for different reasons” . The microservices architecture is also extended to the loosely coupled services that can be developed, deployed, and maintained independently .
Moving away from monolithic architectures
Microservices are often compared to traditional monolithic software architecture. In a monolithic architecture, a software is designed to be self-contained, i.e., the program’s components are interconnected and interdependent rather than loosely coupled. In a tightly-coupled architecture (monolithic), each component and its associated components must be present in order for the code to be executed or compiled . Additionally, if any component needs to be updated, the whole application needs to be rewritten.
That’s not the case for applications using the microservices architecture. Since each module is independent, it can be changed without affecting other parts of the program. Consequently, reducing the risk that a change made to one component will create unanticipated changes in other components.
Companies might run in trouble if they cannot scale a monolithic architecture if their architecture is difficult to upgrade or the maintenance is too complex and costly . Breaking down a complex task into smaller components that work independently from each other is the solution to this problem.
How developers around the world build their microservices
Microservices are well known for improving scalability and performance. However, are those the main reasons that developers around the world build their microservices? The State of Microservices 2020 research project  has found out how developers worldwide build their microservices and what they think about it. The report was created with the help of 660 microservice experts from Europe, North America, Central and South America, the Middle East, South-East Asia, Australia, and New Zealand. The table below presents the average rating on questions related to the maturity of microservices .
Average rating (out of 5)
Setting up a new project
Maintenance and debugging
Efficiency of work
Solving scalability issues
Solving performance issues
As observed on the table, most experts are happy with microservices for solving scalability issues. On the contrary, maintenance and debugging seem to be a challenge for them.
Although there are plenty of options to choose to deploy microservices, most experts use Amazon Web Services (49%), followed by their own server. Additionally, 62% prefer AWS Lambda as a serverless solution.
Most microservices used by the experts use HTTP for communication, followed by events and gRPC. Additionally, most experts use RabbitMQ for message-brokers, followed by Kafka and Redis.
Also, most people work with microservices continuous integration (CI). In the report, 87% of the respondents use CI solutions such as GitLab CI, Jenkins, or GitHub Actions.
The most popular debugging solution among 86% of the respondents was logging, in which 27% of the respondents ONLY use logs.
Finally, most people think that microservice architecture will become either a standard for more complex systems or backend development.
Successful use cases of Microservices
Many companies have changed from a monolithic architecture to microservices.
In 2001, development delays, coding challenges, and service interdependencies didn’t allow Amazon to address its growing user base’s scalability requirements. With the need to refactor their monolithic architecture from scratch, Amazon broke its monolithic applications into small, independent, and service-specific applications .
In 2001, Amazon decided to change to microservices, which was years before the term came into fashion. This change led Amazon to develop several solutions to support microservices architectures, such as Amazon AWS. With the rapid growth and adaptation to microservices, Amazon became the most valuable company globally, valued by market cap at $1.433 trillion on July 1st, 2020 .
Netflix started its movie-streaming service in 2007, and by 2008 it was suffering scaling challenges. They experienced a major database corruption, and for three days, they could not ship DVDs to their members . This was the starting point when they realized the need to move away from single points of failure (e.g., relational databases) towards a more scalable and reliable distributed system in the cloud. In 2009, Netflix started to refactor its monolithic architecture into microservices. They began by migrating its non-customer-facing, movie-coding platform to run on the cloud as an independent microservices . Changing to microservices allowed Netflix to overcome its scaling challenges and service outages. Despite that, it allowed them to reduce costs by having cloud costs per streaming instead of costs with a data center . Today, Netflix streams approximately 250 million hours of content daily to over 139 million subscribers in 190 countries .
After launching Uber, they struggled to develop and launch new features, fix bugs, and rapidly integrate new changes. Thus, they decided to change to microservices, and they broke the application structure into cloud-based microservices. In other words, Uber created one microservice for each function, such as passenger management and trip management. Moving to microservices brought Uber many benefits, such as having a clear idea of each service ownership. This boosted speed and quality, facilitating fast scaling by allowing teams to focus only on the services they needed to scale, updating virtual services without disrupting other services, and achieving more reliable fault tolerance .
It’s all about scalability!
A good example of how to provide scalability is by looking at China. With its vast number of inhabitants, China had to adapt by creating and testing new solutions to solve new challenges at scale. Statistics show that China serves roughly 900 million Internet users nowadays . During the 2019 Single’s Day (the equivalent of Black Friday in China), the peak transaction of Alibaba’s various shopping platforms was 544,00 transactions per second. The total amount of data processed on Alibaba Cloud was around 970 petabytes . So, what’s the implication of these numbers of users in technology?
Many technologies have emerged from the need to address scalability.For example, Tars was created in 2008 by Tencent and contributed to the Linux Foundation in 2018. It’s being used at scale and enhanced for ten years . Tars is open source, and many organizations are significantly contributing and extending the framework’s features and value . Tars supports multiple programming languages, including C++, Golang, Java, Node.js, PHP, and Python; and it can quickly build systems and automatically generate code, allowing the developer to focus on the business logic to improve operational efficiency effectively. Tars has been widely used in Tencent’s QQ, WeChat social network, financial services, edge computing, automotive, video, online games, maps, application market, security, and many other core businesses. In March of 2020, the Tars project transitioned into the TARS Foundation, an open source microservice foundation to support the rapid growth of contributions and membership for a community focused on building an open microservices platform .