Ubuntu One users can very easily move their data to ownCloud and stay within the free software world instead of having to go to a proprietary provider.
Read more at Muktware
It’s been a busy week. A week ago I flew out to Napa, CA for two days of discussions with various kernel people (ok, and some PostgreSQL people too) about all things VM and FS/IO related. I learned a lot. These short, focused conferences have far more value to me personally these days than the conferences of years ago, with a bunch of tracks and day after day of presentations.
I gave two sessions relating to testing; there are some good write-ups on LWN. It was more of an extended Q&A than a presentation, so I got a lot of useful feedback (especially afterwards in the hallway sessions). A couple of people asked if trinity was doing certain things yet, which led to some code walkthroughs and a lot of brainstorming about potential solutions.
By the end of the week I was overflowing with ideas for new things it could be doing, and have already started on some of the code. One feature I’d had in mind for a while (children doing root operations) but hadn’t gotten around to writing can be done in a much simpler way, which opens the door to a bunch more interesting things. I might end up rewriting the current ioctl fuzzing (which isn’t finding many bugs right now anyway) once this stuff has landed, because I think it could be doing much more ‘targeted’ things.
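For anyone unfamiliar with trinity’s core idea, it boils down to calling system calls with random arguments and seeing what breaks. Here’s a deliberately naive sketch of that approach (a toy, not trinity’s actual code, which layers type-aware argument generation, sandboxing, and logging on top of this):

    /* Toy random-syscall fuzzer: fork a child, have it issue one
     * random syscall with random arguments, and repeat. Running as
     * an unprivileged user in a throwaway VM is strongly advised. */
    #include <stdlib.h>
    #include <time.h>
    #include <unistd.h>
    #include <sys/syscall.h>
    #include <sys/wait.h>

    #define MAX_SYSCALL_NR 350   /* roughly the x86-64 table size */

    int main(void)
    {
        srandom(time(NULL) ^ getpid());

        for (;;) {   /* loop forever; interrupt with ^C */
            pid_t child = fork();
            if (child == 0) {
                /* The child does the dangerous part, so a crash or
                 * kill only takes down the child. */
                long nr = random() % MAX_SYSCALL_NR;
                syscall(nr, random(), random(), random(),
                        random(), random(), random());
                _exit(0);
            }
            waitpid(child, NULL, 0);
        }
    }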
It was good to meet up with a bunch of people I’ve interacted with online for a while and discuss some things. I was surprised to learn that Sasha Levin is actually local to me, yet we both had to fly 3,000 miles to meet.
Two sessions at LSF/MM were especially interesting outside of my usual work.
The PostgreSQL session, where they laid out their pain points with the kernel’s IO, was enlightening; they started off with a quick overview of PostgreSQL’s process model and how things interact. The session felt like it went off in a bunch of random directions at once, but the end goal (getting a test case kernel devs can run without needing a full PostgreSQL setup) seemed to be reached the following day.
The second session I found interesting was the “Facebook Linux problems” session. As mentioned in the LWN write-up, one of the issues was this race in the pipe code: “This is *very* hard to trigger in practice, since the race window is very small”. Facebook was hitting it 500 times a day. That got me thinking about a whole bunch of “testing at scale” problems. A lot of the testing I do right now is tiny in comparison: I do stress tests and fuzz runs on a handful of machines, and most of it is done by hand. Doing this kind of thing on a bigger scale makes it impractical to do in a non-automated way. But given that I’ve been buried alive in bugs with just this small number of machines, it has left me wondering: would I find a load more bugs with more machines, or would it just mean the mean time between reproducing issues gets shorter? (Given the reproducibility problems I’ve sometimes had with fuzz testing, the latter wouldn’t necessarily be a bad thing.) More good thoughts on this topic can be found in a post Google made a few years ago.
Coincidentally, I’m almost through reading How Google Tests Software, which is a decent book, but without a huge amount of “this is useful, I can apply this” knowledge. It’s very focused on the testing of various web apps, with no real mention of the testing of Android, Chrome, etc. (The biggest insights in the book aren’t actually testing related, but are the descriptions of Google’s internal re-hiring processes when people move between teams.)
The Collaboration Summit followed from Wednesday onwards. One highlight for me was learning that the tracing code has something coming in 3.15/3.16 that I’ve been hoping for for a while. At last year’s kernel summit, Andi Kleen suggested it might be interesting if trinity had some interaction with ftrace to get traces of “what the hell just happened”. The tracing changes landing over the next few months will allow that to be a bit more useful. Right now, we can only do that on a global, system-wide basis, but with that moving to be per-process, things can get a lot more useful.
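To make the ftrace interaction concrete, here’s roughly what steering the tracer from a test harness looks like; a minimal sketch that assumes debugfs is mounted in the usual place and the kernel has the function tracer built in:

    /* Sketch: restrict the ftrace function tracer to one PID, then
     * turn recording on. Results can be read back afterwards from
     * /sys/kernel/debug/tracing/trace. */
    #include <stdio.h>
    #include <stdlib.h>

    #define TRACEFS "/sys/kernel/debug/tracing/"

    static void write_file(const char *name, const char *val)
    {
        char path[256];
        snprintf(path, sizeof(path), TRACEFS "%s", name);
        FILE *f = fopen(path, "w");
        if (!f) { perror(path); exit(1); }
        fputs(val, f);
        fclose(f);
    }

    int main(int argc, char **argv)
    {
        if (argc < 2) {
            fprintf(stderr, "usage: %s <pid>\n", argv[0]);
            return 1;
        }
        write_file("current_tracer", "function"); /* pick a tracer */
        write_file("set_ftrace_pid", argv[1]);    /* limit to one PID */
        write_file("tracing_on", "1");            /* start recording */
        /* ... run or signal the workload under test here ... */
        write_file("tracing_on", "0");
        return 0;
    }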
Another interesting talk was the llvmlinux session. I haven’t checked in on this project in a while, so was surprised to learn how far along they are. Apparently all the necessary llvm changes to build the kernel are either merged, or very close to merging. The kernel changes still have a ways to go, but this too has improved a lot since I last looked. Some good discussion afterwards about the crossover between things like clang’s static analysis warnings and the stuff I’m doing with Coverity.
Speaking of, I left early on Friday to head back to San Francisco to meet up with Coverity. Lots of good discussion about potential workflow improvements, false positive/heuristic improvements etc. A good first meeting if only to put faces to names I’ve been dealing with for the last year. I bugged them about a feature request I’ve had for a while (that a few people the days preceding had also nagged me about); the ability to have per-subsystem notification emails instead of the one global email. If they can hook this up, it’ll save me a lot of time having to manually craft mails to maintainers when new issues are detected.
A busy, busy week, with so many new ideas that I felt like my head was full by the time I got on the plane to get back.
Taking it easy for a day or two, before trying to make progress on some of the things I made notes on last week.
This article was re-published with permission from Dave Jones’ blog.
My overall impression is that it was a good week, except that by Thursday the combination of 14-hour days and jet lag was catching up with me in a big way. From the point of view of the PostgreSQL project, though, I think it was very positive. On Monday, Andres and I had an hour-and-a-half slot; we used about an hour and fifteen minutes of that time. Our big complaint was with the Linux kernel’s fsync behavior, but we talked about some other issues as well, including double buffering, transparent huge pages, and zone reclaim mode.
Since I’ve already written a whole blog post (linked above) about the fsync issues, I won’t dwell further on that here, except to say that our explanation prompted some good discussion and I think that the developers in the room understood the problem we were complaining about and felt that it was a real problem which deserved to be addressed. The discussion of double-buffering was somewhat less satisfying; I don’t think it’s very clear what the best way forward is there. One possible solution is to have a way for PostgreSQL to evict pages from its cache back into the kernel cache without marking them dirty, but this is quite understandably scary from the kernel’s point of view, and I’m not very sure that the performance would be good anyway.
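For anyone who hasn’t read that earlier post, the pattern at issue boils down to something like the following; a deliberately simplified sketch with an arbitrary file name and sizes, not PostgreSQL’s actual checkpoint code:

    /* Buffered writes return quickly because the kernel just dirties
     * page cache; the eventual fsync() then flushes everything at
     * once, which is where the latency spikes come from. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        char block[8192];
        memset(block, 'x', sizeof(block));

        int fd = open("datafile", O_WRONLY | O_CREAT, 0644);
        if (fd < 0) { perror("open"); return 1; }

        /* ~800MB of writes that mostly just dirty the page cache. */
        for (int i = 0; i < 100000; i++)
            if (write(fd, block, sizeof(block)) < 0) {
                perror("write");
                return 1;
            }

        /* The "checkpoint": one call that can stall for many seconds
         * while all that dirty data is forced out to disk. */
        if (fsync(fd) != 0)
            perror("fsync");

        close(fd);
        return 0;
    }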
On the topic of transparent huge pages (THP), we made the point, already well-known to many PostgreSQL users, that they destroy performance on some PostgreSQL workloads. When we see users with excessive system time usage, we simply recommend that they shut THP off. This problem was familiar to many; apparently, the code for transparent huge pages is quite complex and is known to cause problems for some other workloads as well. It has been improved in more recent kernels, so the problem cases may now be fewer, but it is far from clear that all of the bugs have been swatted. One interesting idea that was floated was to add some sort of mutex so that only one process will attempt THP compaction at a time, to prevent the time spent compacting from ballooning out of control on machines with many processors. If you are about to do THP compaction but the mutex is held by another process, don’t wait for the mutex, but just skip compaction.
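The floated idea is essentially a trylock. In userspace terms it would look something like this; just an analogy, since the real change would be an in-kernel lock in the compaction path:

    /* Sketch of "only one compactor at a time, everyone else skips":
     * if the lock is already held, fall back to ordinary allocation
     * instead of queueing up behind the current compactor. */
    #include <pthread.h>

    static pthread_mutex_t compact_lock = PTHREAD_MUTEX_INITIALIZER;

    static void do_compaction(void)
    {
        /* stand-in for the expensive defragmentation work */
    }

    static void maybe_compact(void)
    {
        if (pthread_mutex_trylock(&compact_lock) == 0) {
            do_compaction();
            pthread_mutex_unlock(&compact_lock);
        }
        /* else: someone is already compacting; skip rather than wait */
    }

    int main(void)
    {
        maybe_compact();
        return 0;
    }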
On the topic of zone reclaim mode, nearly everyone seemed to agree that the current kernel behavior of setting vm.zone_reclaim_mode to 1 on some systems hurts more people than it helps. No one objected to the idea of changing the kernel so that 0 is always the default. A setting of 1 can improve things for certain workloads where the whole working set fits within a single memory node, but most people (and certainly all the database people in the room) seemed to feel that was a relatively uncommon scenario.
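For anyone curious about their own machines, the current value is easy to check through procfs; a trivial sketch:

    /* Print the current zone reclaim setting; on kernels with the
     * proposed default, this should show 0. */
    #include <stdio.h>

    int main(void)
    {
        FILE *f = fopen("/proc/sys/vm/zone_reclaim_mode", "r");
        int mode;

        if (f && fscanf(f, "%d", &mode) == 1)
            printf("vm.zone_reclaim_mode = %d\n", mode);
        if (f)
            fclose(f);
        return 0;
    }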
Before and after our Monday session, I got a chance to hear about some other kernel efforts that were underway. There was discussion of whether 32-bit systems needed to be able to handle disk drives with more than 2^32 4K pages (i.e. >16TB), with the conclusion being that it might make sense to support accessing files on such filesystems with O_DIRECT, but reworking the kernel page cache to support it more fully was probably not sensible. Among other problems, a 32-bit system won’t have enough memory to fsck such a volume. Persistent memory, which does not lose state when the system loses power, was also discussed. I learned about shingled magnetic recording (SMR), a technique created to work around the fact that drive write heads can’t be made much smaller and still write readable data. Such drives will have a write head larger than the read head, and each track will partially overwrite the previous track. The drive is therefore divided into zones, each of which can be written in append-only fashion, or the whole zone can be erased. This presents new challenges for filesystem developers (and will doubtless work terribly for database workloads!). Dave Jones talked about a tool called Trinity, which makes random Linux system calls in an attempt to crash the kernel. He’s been very successful at crashing the kernel this way; many bugs have been found and fixed.
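Since O_DIRECT was the suggested access method for such volumes, here’s a minimal sketch of what direct access looks like; the file name is a stand-in, and the alignment shown is the usual requirement for direct I/O:

    /* Read one block while bypassing the kernel page cache. Direct
     * I/O generally requires block-aligned buffers, offsets, and
     * sizes, hence posix_memalign(). */
    #define _GNU_SOURCE   /* for O_DIRECT */
    #include <fcntl.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>

    int main(void)
    {
        void *buf;
        if (posix_memalign(&buf, 4096, 4096) != 0)
            return 1;

        int fd = open("bigfile", O_RDONLY | O_DIRECT);
        if (fd < 0) { perror("open"); return 1; }

        ssize_t n = read(fd, buf, 4096); /* straight from the device */
        printf("read %zd bytes, bypassing the page cache\n", n);

        close(fd);
        free(buf);
        return 0;
    }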
On Tuesday, there was nothing specific to PostgreSQL, but there were discussions of transparent huge pages, the memcg (Linux memory control group) interface, NUMA-related issues, and more discussion of topics from Monday. On Wednesday, we went from 80 or so people for LSF/MM to maybe 400 for the Linux Collaboration Summit; those numbers might be off (I’m just guessing), but there were certainly a lot more people there. Most of the day was taken up with keynotes, including corporate attempts to promote open source, an interesting-sounding project called AllJoyn, a talk on container virtualization, and more. I was interested by the fact that a significant number of attendees were not technical; for example, there was an entire legal track covering trademarks, licensing, and so on.
On Thursday, Andres and I had another opportunity to talk about PostgreSQL. This was cast as a broader discussion that would include not only PostgreSQL developers but also MySQL, MariaDB, and MongoDB developers, as well as the LSF/MM kernel developers. It seemed to me that, for the most part, we’re all struggling with the same set of issues, although in slightly different ways. The MongoDB developer explained that MongoDB uses mmap() with the MAP_PRIVATE flag to create its private cache, equivalent to our shared_buffers; to minimize double buffering, they occasionally unmap and remap entire files. They use a second set of memory mappings, this one with MAP_SHARED, to copy changes back to disk, mirroring our checkpoint process. They weren’t quite sure whether the Linux kernel was to blame for the performance problems they were seeing while performing that operation, but their description of the problem matched what we’ve seen with PostgreSQL quite closely. Several developers from forks of MySQL were also present and reported similar problems in their environments. The databases vary somewhat in how they interact with the kernel: MySQL uses direct I/O, we use buffered I/O via read() and write(), and MongoDB uses mmap(). Despite that, there was some unity of concerns.
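As I understood the description, the dual-mapping scheme looks roughly like the following; this is my reconstruction with illustrative names and sizes, not MongoDB’s actual code:

    /* Map the same file twice: a MAP_PRIVATE view as the working
     * cache (modifications stay local via copy-on-write) and a
     * MAP_SHARED view for writing changes back at "checkpoint"
     * time. Assumes "datafile" exists and is at least a page long. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define LEN 4096

    int main(void)
    {
        int fd = open("datafile", O_RDWR);
        if (fd < 0) { perror("open"); return 1; }

        char *cache  = mmap(NULL, LEN, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE, fd, 0);
        char *shared = mmap(NULL, LEN, PROT_READ | PROT_WRITE,
                            MAP_SHARED, fd, 0);
        if (cache == MAP_FAILED || shared == MAP_FAILED) {
            perror("mmap");
            return 1;
        }

        /* Modify the private cache; the file itself is untouched. */
        memcpy(cache, "new row version", 16);

        /* "Checkpoint": copy the dirty range into the shared view
         * and flush it to disk. */
        memcpy(shared, cache, 16);
        msync(shared, LEN, MS_SYNC);

        munmap(cache, LEN);
        munmap(shared, LEN);
        close(fd);
        return 0;
    }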
Aside from the database-related stuff, the most interesting session I attended on Thursday was one by Steve Rostedt, regarding a kernel facility called ftrace of which I had not previously been aware. I’m not sure how useful this will be for debugging PostgreSQL problems just because of the volume of output it creates; I think that for many purposes perf will remain my tool of choice. Nevertheless, ftrace has some very interesting capabilities that I think could be used to uncover details that can’t be extracted via perf. I’m intrigued by the possibility of using it to get better latency measurements than what I’ve been able to get out of perf, and there may be other applications as well once I think about it more.
There wasn’t much left to talk about on Friday; quite a few people left Thursday night, or Friday morning, I think. But I spent some time listening to Roland McGrath talk about ongoing work on glibc, and there were some more talks about NUMA issues that I found interesting.
There’s more to say, but this blog post is too long already, so I’d better stop writing before everyone stops reading. In closing, I’d like to thank Mel Gorman very much for inviting us and for his very positive attitude about addressing some of the problems that matter to PostgreSQL and other databases, and I’d also like to thank Dave Chinner, James Bottomley, Jan Kara, Rik van Riel, and everyone else whose names I am unfortunately forgetting for their interest in this topic. Thanks!
Linux and open source software have demonstrated that collaborative development is a successful model for rapid innovation in the tech sector. Now that model is being applied in other arenas, from health care to city government to education.
In Raleigh, North Carolina, the open source way is transforming city government through increased access to government data and lower barriers to citizen participation. It’s no coincidence that the city is also home to Red Hat, the first billion-dollar company built on Linux.
Jason Hibbets, a project manager in corporate marketing at Red Hat and community manager for Opensource.com, has been an integral part of the city’s open transformation. In his keynote at ApacheCon in Denver next week, Hibbets will define the five principles of an open source city and share his experiences with civic hacking in Raleigh, and a government-focused unconference called CityCamp.
“What’s really inspiring to me is that open source started as a software development model and has evolved as a way to positively change our society,” Hibbets says in the Q&A, below. “Open source is truly changing the world, and it’s doing so by being a great way of doing things even beyond software.”
Here he discusses the similarities between open source software and open city government; how he got involved in Raleigh’s open government movement; and some lessons learned along the way.
Linux.com: How do you apply open source philosophy and governance principles outside of technology?
Jason Hibbets: I use many of the principles from open source and what has been defined as the open source way, such as transparency, rapid prototyping, and collaboration, in my work with both software and non-software communities. To help others understand how I apply the open source philosophy to everything I do, I share examples from my own experiences and strive to maintain a “default to open” attitude.
My day job as community manager and project manager for Opensource.com gives me insight into businesses, groups, and individuals who are applying the open source philosophy beyond technology every day. Their work and their projects include everything from tinkering with open hardware projects and using 3D printers for making prosthetics, to innovating how kids learn with programs like Scratch from MIT and extracting data from PDFs with efforts like DocHive. There are a lot of parallels between the stories that we share on Opensource.com and the work I do outside of that in my local community. The great thing is they build on one another.
In open source software projects, code is the great equalizer – or as Facebook’s Mark Zuckerberg so succinctly wrote in a letter to investors: “Code wins arguments.” Is there an equivalent to code in an open source city?
Hibbets: In some ways, cities do operate like a software project, and code is called just that: codes (or laws, rules, and regulations). There are people who write city code, and who propose new code, just as in a software project. Ideally, the principles of a meritocracy – that the best ideas win, or in this case, that the best code wins – are supported in the open government movement. Determining what the best idea is, however, involves more time and risk.
In a software project, new code can be tested and the results are immediate, so you can quickly evaluate which code is superior. When you change city code, however, it may take a long time before you can see the effects, and it’s not as easy to go back and undo the harm from bad code. Changes in traffic patterns, building codes, or local environmental regulations can have a huge impact on the citizenry, but it could be months or even years before the full effects are known. Many of the barriers that passionate citizens and government IT shops face are related to changing entrenched cultures that have developed to manage or minimize that risk. Transparency and collaboration may be easily achievable for an open source city, but rapid prototyping is more difficult, and that slows down decision-making. That means people can’t just bring their best ideas to the table; they must also become agents of cultural and community change. For us, that usually means not only demonstrating the benefits of open source and open data, but also demonstrating that these changes carry little risk.
Operating in an open source product and development environment, you come to expect quick decisions and rapid prototyping. My tips for those who want to bring their open source know-how to government: be prepared for a slower pace; understand how your government works, including policy making and the public process; and start forming relationships and partnerships with influencers and talented people who can help the best ideas win.
What other parallels exist between an open source project and open government?
Hibbets: The first three that come to mind are transparency, collaboration, and participation. These principles are essential to both open source and open government. Many of us are familiar with how this works in open source projects, but not so much in government.
Transparency is important, because as citizens we should know how our tax dollars are being spent, how our representatives are voting, and what transpires in public meetings. Collaboration is critical to a better government. Our communities, our governments, need ideas from all citizens. On an open source project, collaboration is what makes everything tick because cultivating ideas from many participants is what generates the best results.
In a democracy it’s not us versus them, so participation is essential. And, participation is not limited to voting. The government is “ours,” of the people, and how it operates and what it does should be a participatory process.
We’re starting to see government agencies bridge the gap from in-person meetings to online collaboration and participation, gathering feedback from more citizens. As we think about the bigger picture, though, we need to be keen on open standards. Governments (and citizens) should not be limited to proprietary formats. Open standards create a level playing field and allow for interoperability. Deciding on and implementing open standards should be at the top of the list for government IT departments.
What inspired you to take what you’ve learned about open source communities and collaboration and apply it to your own city government in Raleigh?
Hibbets: Open source and my city are a blend of my passions. When I first started discovering the open government movement, I realized I could take my knowledge of open source and open source communities and apply it to my local government. For a number of years I used my open source know-how to promote transparency and participation in my neighborhood watch program and other neighborhood organizations.
My involvement with open source in my local government went to a new level when I got involved with an event called CityCamp, an international unconference series that brings open source and technology to local municipalities. I then had the privilege of co-chairing a group of volunteers, of passionate citizens, to create CityCamp Raleigh. We’ve evolved this over the years into CityCamp North Carolina, with more of a state-wide focus. Now in its fourth year, we’re looking to inspire change in our state government by encouraging it to use more open source solutions and create more open data.
For more on my journey through this process, from neighborhood watch to CityCamp and beyond, plus a ton of great tips for how to be more involved in open government, check out my book, The Foundation for an Open Source City. When I reflect back, CityCamp was the catalyst and spark, not only for me but for Raleigh.
What valuable lessons have you learned about open source projects and about city government in doing so?
Hibbets: The first lesson I’ve learned is that having a core group of like-minded people working on the project with you is critical. Just like open source projects, you need a nucleus of dedicated leaders and contributors to be successful. This is no different in the open government world. And to be successful in open government, this core group needs to establish key relationships and partnerships with elected officials, city staff, and other stakeholders.
The second lesson I’ll share is to embrace incremental progress. Earlier, I talked about transparency, collaboration, and participation. On the open government front, open source advocates may need to make compromises in order to move the needle and make some progress. For example, some tools that governments are implementing may not use open source software, but may get more citizens involved or engaged.
There are several platforms for engaging citizens online that foster participation, allow people to collaborate, and provide transparency around what ideas are winning and what’s being voted on. But many, if not all, are not built on open source software or licensing. Do we scrap the platform because it’s not under an open source license? Or, do we choose to make incremental progress so that open source is an alternative in the future?
In many local governments, officials and leaders are looking for turnkey solutions to reduce the burden on the IT department. There is an opportunity there for start-ups to embrace the open source way and find a business model that works with the public sector.
What can the open source software community learn from open government projects? How can it contribute?
Hibbets: There are many similarities between these two communities, but one advantage that open government projects have is the ability to draw from a highly diverse community with a wide range of skills, talents, and resources. While a number of larger open source projects do this very well, many smaller open source projects struggle to get project managers, designers, writers, marketers, and people with other valuable skill sets to participate. I think the reason is that government projects have a strong and direct relationship to what the general public needs on a basic, daily level (think transparent access to information, and ways for citizens to report non-emergency issues), whereas open source projects tend to be more focused on technology solutions.
As far as contributing to open government projects, it’s just like any other open source community: identify your passion, find a community that fits that passion, and figure out a way to bring your skills to the community. One community that I’m involved in is Code for America. They have a variety of different projects and programs for people to get involved in. Because a lot of my skills are around community organizing and storytelling, as opposed to writing code, I help organize my local Code for America Brigade in Raleigh and share our progress through blogging and social media.
What else do you plan to cover in your ApacheCon keynote? Who should attend and why?
Hibbets: My keynote is for anyone who wants to take their open source knowledge and apply it outside of technology. I will share stories from my experience with the citizens and local government of Raleigh, so attendees will gain insight into how they can replicate these ideas and projects in their own city or town. The ideas I share will also be applicable to other disciplines, like healthcare and education.
What’s really inspiring to me is that open source started as a software development model and has evolved as a way to positively change our society. Open source is truly changing the world, and it’s doing so by being a great way of doing things even beyond software.
The clear winner inside the Galaxy S5 is Qualcomm, securing not only the top processor spot with the Snapdragon 801, but also a handful of other key sockets.
Canonical has announced that it will be ending its Ubuntu One cloud storage service. The Ubuntu One music store is also being shut down…
Valve has published its patched branch of Mesa for SteamOS. Mesa is an open-source implementation of the OpenGL specification, a system for rendering interactive 3D graphics. Read more at Muktware.
Kernfs, a split-out of the sysfs logic intended to be more useful to other kernel subsystems wishing to have a virtual file system, will get better in Linux 3.15…
While the voglperf code has been public for some time within Git, the first release of Voglperf was tagged on Tuesday evening by a Valve developer…
While the Linux 3.15 kernel is introducing a large number of new features, it’s also doing away with some old drivers and older x86 platforms…