
Google’s Open Source Report Card Highlights Game-Changing Contributions

Ask people about Google’s relationship to open source, and many of them will point to Android and Chrome OS — both very successful operating systems and both based on Linux. Android, in particular, remains one of the biggest home runs in open source history. But, as Josh Simmons from Google’s Open Source Programs Office will tell you, Google also contributes a slew of useful open source tools and programs to the community each year. Now, Google has issued its very first “Open Source Report Card,” as announced by Simmons on the Google Open Source Blog.

“We’re sharing our first Open Source Report Card, highlighting our most popular projects, sharing a few statistics and detailing some of the projects we’ve released in 2016. We’ve open sourced over 20 million lines of code to date and you can find a listing of some of our best known project releases on our website,” said Simmons.

Open source projects emerge from all over Google, many of them produced under the company’s famous “20 percent time” policy, which advises employees to spend 80 percent of their time on Google-centric projects and 20 percent on their own creative projects. Google reports that its GitHub footprint includes more than 84 organizations and 3,499 repositories, 773 of which were created this year.

Simmons has also rounded up Google’s most popular open source projects, as follows:

  • Android — A software stack for mobile devices that includes an operating system, middleware and key applications.

  • Chromium — A project encompassing Chromium, the software behind Google Chrome, and Chromium OS, the software behind Google Chrome OS devices.

  • Angular — A web application framework for JavaScript and Dart focused on developer productivity, speed and testability.

  • TensorFlow — A library for numerical computation using data flow graphs, with support for scalable machine learning across platforms from data centers to embedded devices.

  • Go — A statically typed and compiled programming language that is expressive, concise, clean and efficient.

  • Kubernetes — A system for automating deployment, operations and scaling of containerized applications now at the Cloud Native Computing Foundation.

  • Polymer — A lightweight library built on top of Web Components APIs for building encapsulated re-usable elements in web applications.

  • Protobuf — An extensible, language-neutral and platform-neutral mechanism for serializing structured data.

  • Guava — A set of Java core libraries that includes new collection types (such as multimap and multiset), immutable collections, a graph library, functional types, an in-memory cache, and APIs/utilities for concurrency, I/O, hashing, primitives, reflection, string processing and much more.

  • Yeoman — A robust and opinionated set of scaffolding tools including libraries and a workflow that can help developers quickly build beautiful and compelling web applications.

Among recent open source contributions from Google, Kubernetes and TensorFlow are having particularly notable impact.

Google’s Open Source Report Card also delves into the most popular languages that Googlers use. These are summarized in order, with open source strongly represented:

  • JavaScript

  • Java

  • C/C++

  • Go

  • Python

  • TypeScript

  • Dart

  • PHP

  • Objective-C

  • C#

The Open Source Report Card is not the only way to put metrics on Google’s open source activities. GitHub, in partnership with Google, has produced a new open dataset on Google BigQuery, a low-cost analytics data warehouse service in the cloud, so that anyone can get data-driven insights based on more than 2.8 million open source GitHub repositories.

“Many things can be gleaned using the open source GitHub dataset on BigQuery,” Simmons notes, “like usage of tabs versus spaces and the most popular Go packages. What about how many times Googlers have committed to open source projects on GitHub? We can search for Google.com email addresses to get a baseline number of Googler commits.”

“With this we learn that Googlers have made 142,527 commits to open source projects on GitHub since the start of the year. This dataset goes back to 2011 and we can tweak this query to find out that Googlers have made 719,012 commits since then. Again, this is just a baseline number as it doesn’t count commits made with other email addresses.”
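The baseline query Simmons describes can be sketched with the `bq` command-line tool. This is a hypothetical sketch, not the exact query Google ran: the table name follows the published `bigquery-public-data` layout, so verify it against the current dataset listing before running, and note that filtering on google.com addresses undercounts commits made from other email addresses, as Simmons points out.

```shell
# Count commits in the public GitHub dataset whose author used a
# google.com address. Requires a Google Cloud account configured
# for the bq CLI; table name and schema are assumptions to verify.
bq query --use_legacy_sql=false '
  SELECT COUNT(*) AS googler_commits
  FROM `bigquery-public-data.github_repos.commits`
  WHERE author.email LIKE "%@google.com"'
```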

In recent months, Google also has open sourced other useful tools, many of them tested and hardened in-house. They include machine learning applications, 3D visualization tools, and more. In case you missed any of these, the Open Source Report Card highlights some of the most interesting examples, with expanded discussions of the projects at the bottom of the report card page.

Get started in open source with The Linux Foundation’s Introduction to Linux, Open Source Development, and GIT course.

Ranking the Web With Radical Transparency

Ranking every URL on the web in a transparent and reproducible way is a core concept of the Common Search project, says Sylvain Zimmer, who will be speaking at the upcoming Apache: Big Data Europe conference in Seville, Spain.

The web has become a critical resource for humanity, and search engines are its arbiters, Zimmer says. However, the only search engines currently available are for-profit entities, so the Common Search project is creating a nonprofit engine that is open, transparent, and independent.

We spoke with Zimmer, who founded Jamendo, dotConferences, and Common Search, to learn more about why nonprofit search engines are important, why Apache Spark is such a great match for the job, and some of the challenges the project faces.

Sylvain Zimmer, Founder of Common Search

Linux.com: Could you provide some background on the Common Search project? Why is a nonprofit search engine needed?

Sylvain Zimmer: Search engines are the arbiters of the web: they decide which websites and what information we get when we search for something online. As many studies have shown, it is easy to misinform or manipulate an audience with tailored search results.

We think it is critical for the Internet and ultimately for our society to have a healthy diversity in its sources of information. That means having both commercial and non-commercial search engines available, so that we can compare their results and watch out for biases.

Linux.com: The website mentions “radical transparency” as a core value of the project. Can you explain what that means and why it’s important?

Sylvain: Indeed, the cornerstone of Common Search is the transparency and reproducibility of our results.

Being transparent means that you can actually understand why our top search result came first, and why the second had a lower ranking. This is why people will be able to trust us and be sure we aren’t manipulating results. However, for this to work, it needs to apply not only to the results themselves but to the whole organization. This is what we mean by “radical transparency.” Being a nonprofit doesn’t automatically clear us of any ulterior motives; we need to go much further.

As a community, we will be able to work on the ranking algorithm collaboratively and in the open, because the code is open source and the data is publicly available. We think that this means the trust in the fairness of the results will actually grow with the size of the community.

As Eric S. Raymond said, “given enough eyeballs, all bugs are shallow.” We think this also applies to search engine results!

Linux.com: Why did you choose Apache Spark?  What are some features of Spark that make it well suited for this project?

Sylvain: When we choose languages and frameworks for building Common Search, we consider two factors: technology and community. We want to use technologies that are close to the state-of-the-art and well suited for the unique challenges of a search engine, but we also want to position ourselves within vibrant communities that can be a source of talented contributors.

Spark is quite unique, because it fits both needs almost perfectly: it was built specifically for fast, large-scale data processing, and it is one of the most active Apache projects. It also supports Python, which is our main back-end language, so choosing it made a lot of sense.

Linux.com: What are some challenges that remain to be solved?

Sylvain: We definitely need to raise awareness about the need for a nonprofit search engine. It seems quite obvious when explained clearly, but many people are still unaware of the dangers of having only for-profit search engines on the market.

One of the biggest challenges though is just to make people believe we can actually build a useful service as a nonprofit. I think few people believed Wikipedia would survive vandalism and grow to become one of the top destinations on the web, but it actually happened!

What we need the most right now is for many new contributors to join the project. We made sure the project is very welcoming for newcomers with lots of documentation, simple tutorials, and easy issues to get started on. This is really our main focus because each new contributor gets us closer to a better, fairer web 🙂

Hear from leading open source technologists from Cloudera, Hortonworks, Uber, Red Hat, and more at Apache: Big Data and ApacheCon Europe on November 14-18 in Seville, Spain. Register Now >>

Docker: Making the Internet Programmable

Docker, and containers in general, are hot technologies that have been getting quite a bit of attention over the past few years. Even Solomon Hykes, Founder, CTO, and Chief Product Officer at Docker, started his keynote with the assumption that people attending LinuxCon Europe know that Docker does containers. So instead of focusing on what Docker does, Hykes used his time to talk about Docker’s purpose, saying, “It really boils down to one small sentence. We’re trying to make the Internet programmable.”

Hykes described this idea of making the Internet programmable with three key points. First, they are focused on building “tools of mass innovation” designed to allow people to create and innovate on a very large scale. Second, applications and cloud services are allowing the idea of the Internet as a programmable platform to be realized, and they want to make this accessible to more people. Third, they are accomplishing all of this by building the Docker stack with open standards, open infrastructure, and a development platform with commercial products on top of the stack.

Docker is still a relatively small company at only 250 people, and as a small company with a big goal, Hykes credits open source with allowing them to achieve so much with few people. He even goes as far as saying that “Docker would not be possible without open source full stop. That’s because there’s not enough of us to solve all the technical problems we need to solve to make the Internet programmable.” Hykes says that they process 1200 patches per week, which feels a bit like drinking from a fire hose. While he says that this volume isn’t quite at the Linux scale, he says, “it’s closer to Linux than 99% of projects out there,” so they have been borrowing heavily from the various processes that have allowed Linux development to scale.

Although Hykes demonstrated the beta releases of Docker for Mac and Docker for AWS, the big news from this keynote was the introduction of InfraKit, which he described as “a tool kit to create and manage infrastructure that’s scalable, self-healing, declarative, and it embeds years and years of experience operating real systems at really large scale.” InfraKit originated from the March acquisition of Conductant, the team behind the Aurora Project. But, he cautions the audience, “Don’t check it out quite yet because it’s not open source right now. We actually thought since we’re at an open source conference, we could open source something live on stage.” Luckily, the demo gods were smiling on him, and Hykes was able to successfully open the repository for InfraKit live on stage before the end of his keynote.

Watch the complete keynote below to see the Docker for Mac / Docker for AWS demos and to learn more about Docker and open source. 

https://www.youtube.com/watch?v=p_2NDz0K0uc&list=PLbzoR-pLrL6ovByiWK-8ALCkZoCQAK-i_

Watch 25+ keynotes and technical sessions from open source leaders at LinuxCon + ContainerCon Europe. Sign up now to access all videos!


Technology Changes Us, Changes Society, and Changes Governments

We shape our technology. Our technology shapes us. It’s not a one-way trip, but a continual feedback loop. Dr. Ainissa Ramirez, in her inspirational LinuxCon North America keynote, claims that the greatest cultural shifts come from new technologies.

Dr. Ramirez says, “Technology is a faster mover than legislation. If you want to change the culture, if you want to steer how people behave, legislation can certainly do that. But if you really want to do that quickly, just do that with code.”

“If you don’t think so, think of all the people who four weeks ago had a different life before Pokemon GO. They’re completely addicted. That all happened with lines and lines of code. That’s the impact that technology has, and since technology has this impact, it’s very important for us to look at the stance that we have between technology and humans.”

Unpredictable Outcomes

The effects of new technologies are unpredictable. We don’t know where they will take us. Putting face masks on football helmets led to the widespread problem of concussions and brain damage. The telegraph was the pebble in the water that created multiple ripples of change. It was the precursor of the Internet. It shrank the planet. It changed language. Sending a telegraph was so expensive that “You couldn’t expound on the weather and all these other things that you would usually do in a letter. You had to have a very terse, sparse style.”

“Now, if you look at books written before the telegraph, A Tale of Two Cities, The Scarlet Letter, you’ll see that these poets, these writers will write with exquisite details in very long, elaborate sentences. If you look at books written after the telegraph, they have a terse, sparse style. In fact, a quintessential example is Hemingway. Hemingway was directly impacted by the telegraph because he worked in a newsroom where they had telegraphs, and this is where he developed that style, and he went on to inspire all of us.”

Dr. Ramirez describes a number of technologies that have made giant changes in us, changed societies, and changed entire governments: cell phones, the Google Brain, the printing press, Twitter, Facebook… “Now, you may be saying, ‘Look, I’m just a code diva or a code dude. I don’t do that. Technology doesn’t do that.’ …We need you more than ever. You already have the right mindset for how to look at technology. You share it. You’re open. You’re inclusive. You know that many, many minds make things better than one mind. Since we already know that technology is going to be the thing that’s writing democracy, you serve as democracy’s heroes, but I have to remind you that with this great power comes a great responsibility.”

Don’t miss the presentations by leading open source technologists from Cloudera, Hortonworks, Uber, Red Hat, and more at Apache: Big Data and ApacheCon Europe on November 14-18 in Seville, Spain. Register Now >>

Exclusive: Blockchain Platform Developed by Banks to be Open-Source

A blockchain platform developed by a group that includes more than 70 of the world’s biggest financial institutions is making its code publicly available, in what could become the industry standard for the nascent technology.

The Corda platform has been developed by a consortium brought together by New York-based financial technology company R3. It represents the biggest shared effort among banks, insurers, fund managers and other players to work on using blockchain technology in the financial markets.

Corda’s code will be contributed on Nov. 30 to the Hyperledger project – a cross-industry project led by the non-profit Linux Foundation to advance blockchain technology by coming up with common standards.

Read more at Reuters


Secure Your Containers with this One Weird Trick

Did you know there is an option to drop Linux capabilities in Docker? Using the docker run --cap-drop option, you can lock down root in a container so that it has limited access within the container. Sadly, almost no one ever tightens the security on a container or anywhere else.
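As a sketch of the option the post describes (the image and capability here are illustrative assumptions, not from the original post): the strictest baseline is to drop every capability, then grant back only what the containerized service actually needs.

```shell
# Start the container with zero capabilities, then add back only
# CAP_NET_BIND_SERVICE, which a web server such as nginx needs in
# order to bind a port below 1024.
docker run --cap-drop=ALL --cap-add=NET_BIND_SERVICE -p 80:80 nginx
```

In practice, running with `--cap-drop=ALL` and no `--cap-add` flags at all, then adding capabilities back one at a time as the application fails without them, is a simple way to discover the minimal set.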

The Day After is Too Late

There’s an unfortunate tendency in IT to think about security too late. People only buy a security system the day after they have been broken into.

Dropping capabilities can be low hanging fruit when it comes to improving container security.

Read more at Red Hat blog

Humanity’s War on Latency: Semaphore to Silicon Photonics and Beyond

For most of humanity’s existence, communication has been incredibly slow. For millennia the only way of transmitting information between two humans was via speech or crude drawings. About 5,000 years ago written language and papyrus increased the transmission distance and bandwidth of human-human communication, but latency was still pretty bad, since messages were delivered by hand.

Somewhere around 300 BC, though—at least according to recorded history—things started to get interesting. Ancient Greece, as described by the historian Polybius, used a technology called the hydraulic telegraph to communicate between Sicily and Carthage—a distance of about 300 miles—in the First Punic War.

Read more at Ars Technica

How to Incorporate Open Source into Computer Science Classes

This year at the Grace Hopper Conference I’m moderating a panel on why, and how, to incorporate open source into computer science classes. The panelists are four computer science instructors—all women—who have already used open source projects in their classrooms.

I’ve asked these four talented instructors to tell you a little about themselves, what teaching open source has meant for them and their students, and what you’ll hear at the Grace Hopper Celebration of Women in Computing, which is the world’s largest gathering of women technologists. This year the event is at the George R. Brown Convention Center in Houston, Texas from October 19-21.

Read more at OpenSource.com

How to Benchmark Your Linux System

If you go out looking for PC benchmark results, there’s a very strong chance the tests won’t perfectly translate to performance under Linux, since they were likely run in Windows. This is particularly true if certain hardware has limited support in the Linux kernel. However, there are still plenty of tests you can run in Linux, and the vast majority of them are free.

Testing in Linux

Linux users can find an easy-to-use test for their systems in the Gnome Disks utility, which comes with both the Gnome 3 and Ubuntu’s Unity desktops. Though the utility is most often used to administer disk partitions and software RAID, it features a built-in benchmark. It’s pretty basic, but will suffice for a general overview. Simply search for Disks in Ubuntu’s dash (or Gnome’s Activities panel) to find the utility.
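If you prefer the command line, a rough equivalent of the Disks read benchmark can be had with hdparm, assuming it is installed; substitute your own device node for the example below.

```shell
# -T times cached reads (effectively memory/cache bandwidth),
# -t times buffered disk reads (raw sequential read throughput).
# Requires root; /dev/sda is a placeholder for your disk.
sudo hdparm -Tt /dev/sda
```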

Read more at PCWorld

Iotop – Monitor Linux Disk I/O Activity and Usage on a Per-Process Basis

Iotop is a free, open source utility, similar to the top command, that provides an easy way to monitor Linux disk I/O usage and prints a table of current I/O utilization by process or thread on the system.

The iotop tool is written in Python and requires the kernel’s I/O accounting support to monitor and display per-process I/O. It is a very useful tool for system administrators tracing which specific processes may be causing heavy disk reads or writes.
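A typical invocation looks like the following sketch (flag names per the iotop man page; the tool needs root privileges to read the kernel’s I/O accounting data).

```shell
# -o    show only processes or threads actually doing I/O right now
# -P    aggregate per process instead of per thread
# -b    non-interactive batch mode, suitable for logging
# -n 3  exit after three sampling iterations
sudo iotop -o -P -b -n 3
```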


Read more at Tecmint