Linux lacks testing methodologies

Author: Jay Lyman

How can potential buyers judge the differences in performance among applications running on various Linux distributions? Linux kernel stability and reliability testing is quite sophisticated, thanks mainly to efforts such as the Linux Test Project, but measuring application performance on Linux is more difficult. The Open Source Development Labs is calling for application vendors to put their products to the test for scalability, security and clustering. In keeping with the open source approach, the lab is also calling on vendors to share their testing and results.

OSDL lab manager and open source test-giver Tim Witham is on a mission to push Linux performance testing up to higher-level, real-world applications, producing reliable, repeatable, comparable data that lets users compare operating systems and open source applications in a transparent fashion.

Witham said everybody seems to have a different idea of what performance metrics mean. For developers, it may be compile turnaround time; for database administrators, it is a question of speed, support for threaded I/O, and consistency in response time. One of OSDL’s first significant findings, according to Witham, was the identification and fixing of a few I/O scheduling issues that did not show up on “normal” I/O tests.
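
Response-time consistency is a good example of a metric that is easy to name but rarely reported. As a minimal, hypothetical sketch (not an OSDL tool; the URL and sample count are placeholders), a few lines of Python can capture the spread of latencies as well as the average:

```python
# Minimal sketch: measure the spread of response times, not just the
# mean, since consistency in response time is itself a metric.
# The URL and sample count are hypothetical placeholders.
import statistics
import time
import urllib.request

URL = "http://localhost/"   # server under test (placeholder)
SAMPLES = 100

latencies = []
for _ in range(SAMPLES):
    start = time.perf_counter()
    urllib.request.urlopen(URL).read()
    latencies.append(time.perf_counter() - start)

print(f"mean:  {statistics.mean(latencies) * 1000:.1f} ms")
print(f"stdev: {statistics.stdev(latencies) * 1000:.1f} ms")
print(f"worst: {max(latencies) * 1000:.1f} ms")
```

Two systems with the same average can behave very differently under load; reporting the deviation and the worst case alongside the mean captures exactly the consistency Witham describes.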

And while performance metrics may not mean anything beyond someone’s own private workload, the industry must have more in the way of standard tests for higher, enterprise-level applications, according to Witham and others.

When workloads approach that higher level, Linux and open source testing seems to trail proprietary software and traditional testing in the availability of standard application tests and data. Artificial loads get tricky, according to Witham, because real-world applications are too complex. Simpler applications, on the other hand, are not an effective means of measurement unless big changes are made to them. “The time to set up [a test] becomes dominant on that,” Witham said.

Witham said the biggest challenge of developing tests and results at OSDL has been the amount of data that must be reviewed for the tests.

Arthur Hicken, executive vice president of application testing and tools company Parasoft, added that it is often hard to compare Linux to Windows even on identical hardware because the operating system setups vary so greatly.

“Plus, many vendors don’t put the same effort into different ports of their software,” Hicken said. “For a long time, Oracle had bad Linux support (i.e. slow) because they weren’t concerned about Linux.”

That support may have improved, with Oracle among others providing better software, support, and test data, but the information seems to be coming in the form of pieces and parts from various vendors and application developers. What appears to be lacking is readily accessible assessment testing that can make or break corporate IT decisions.

PostgreSQL Advocacy Volunteer Josh Berkus agreed that “We could really use tests that allow us to demonstrate equivalency, or even superiority, to proprietary products. An example of this would be TPC benchmarks for applications built on OSS databases. Or a truly independent desktop applications test comparing, for example, Microsoft Word, Mac Word, OpenOffice.org, AbiWord, and WordPerfect.”

Berkus referred to ongoing work with OSDL, adding that the lab has been a big help in the push to make PostgreSQL as good or better than all of the leading proprietary databases for high-end business applications. “We’d love to be able to compare some of the other databases – both OSS and proprietary – to ours with hard numbers.”

The higher level

Witham indicated that Linux and open source testing may not be measuring up at the higher level of overall system-and-software configuration testing.

“I think we’ve got a lot of good unit tests,” Witham said. “But there needs to be more of this higher-level [testing of the] system together, including the whole application stack: Apache and JBoss and PostgreSQL — that’s way more code than is in the kernel.”

Witham said OSDL and its members have tried, with the lab’s Database Test Suite, to produce the complex environments necessary to bring open source testing up to par.
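
That suite drives elaborate, multi-client workloads against server-class databases, but the core idea, sustained small transactions with throughput recorded over a fixed window, can be shown in miniature. This toy sketch uses Python’s built-in sqlite3 purely as a self-contained stand-in and is not OSDL’s suite:

```python
# Toy stand-in for a database throughput test: run small transactions
# for a fixed window and report transactions per second. A real suite
# such as OSDL's drives a server-class database with many concurrent
# clients; sqlite3 just keeps this sketch self-contained.
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)",
                 [(i, 1000) for i in range(100)])
conn.commit()

WINDOW = 5.0  # seconds of sustained load
done = 0
start = time.perf_counter()
while time.perf_counter() - start < WINDOW:
    # One "transaction": move a unit of balance between two accounts.
    conn.execute("UPDATE accounts SET balance = balance - 1 WHERE id = ?",
                 (done % 100,))
    conn.execute("UPDATE accounts SET balance = balance + 1 WHERE id = ?",
                 ((done + 1) % 100,))
    conn.commit()
    done += 1

print(f"{done / WINDOW:.0f} transactions/sec")
```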

“The application people want this,” Witham said. “That means there has to be a different way. We stepped forward with the database testing.”

Witham pointed to other areas in need of better metrics: Web performance testing in a more complex environment; Java performance testing to be shared with the open source community; and real-world types of tests for vertical applications such as accounts receivable.

Not open source compliant

Another common challenge to measuring the performance of Linux distributions and other open source software centers on the community’s dedication to transparency, according to Witham. Developers who write open source code want to see the source for testing tools, too. But the main industry-standard tests are proprietary and are owned by consortiums or companies that attach restrictions to their use — mainly because the tests have marketing value. “A lot of common metrics products require an audit,” Witham said, “which does not comply with the open source approach. They are very useful for comparing systems before and after a change, but what it means for us is we can’t use them for open source. You can’t load Oracle or DB2 and run the numbers and share the information.”

Witham also complained that while higher-level testing can be done on a proprietary operating system with a “ton of people,” it is more difficult for the open source side of the world to come up with good measurements.

“Our whole goal is to automate and do automatic testing with the kernel drops,” Witham said. “We’d like to bring in higher levels of software with a kernel drop and run an Apache performance test against it. So we’re trying to automate.”
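
A minimal sketch of that kind of automation, assuming the stock ApacheBench tool (ab, which ships with Apache httpd) and a server already running the new kernel drop, might wrap the benchmark and log one throughput figure per drop; the host URL and log path here are hypothetical:

```python
# Sketch of automated Apache performance testing against a kernel drop.
# Assumes ApacheBench ("ab") is installed and that a server built on the
# new kernel is already serving SERVER_URL; the URL and log path are
# hypothetical placeholders.
import datetime
import subprocess

SERVER_URL = "http://test-host/"   # machine running the kernel drop
LOG = "apache-perf.log"

result = subprocess.run(
    ["ab", "-n", "10000", "-c", "50", SERVER_URL],
    capture_output=True, text=True, check=True,
)

# Pull the requests-per-second line out of ab's report and append it
# to a running log, one entry per kernel drop.
for line in result.stdout.splitlines():
    if line.startswith("Requests per second"):
        with open(LOG, "a") as log:
            log.write(f"{datetime.date.today()} {line.strip()}\n")
        print(line.strip())
```

Run against successive kernel drops, a log like this yields the comparable, repeatable numbers Witham is after.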

Berkus agreed that the need for more paid labor for testing may put the open source community at a disadvantage. “The problem is that testing, fixing an application, retesting, tuning, and retesting again is an extremely tedious process, and nobody wants to participate in it for long. Test development is liable to become a Mozilla-length project in itself,” Berkus said, adding that the boredom of testing makes it an on-again, off-again proposition that is sidetracked by “work on more exciting new features.”

“Another problem, which may be insoluble, is that there is no authority which both the OSS and proprietary worlds respect,” Berkus added. “Most of the institutes are notorious for showing test results favorable to whomever paid for the test, and the mainstream business press pretty much ignores any studies issued by non-profit, OSS-friendly centers.”

Calling all application developers

Witham said there is little motivation to create the kind of comprehensive system-software tests that are needed to make Linux performance metrics as useful and widespread as their proprietary counterparts. “The application-level folks believe the current tests aren’t anywhere near where they need to be,” Witham said. “[But] the kernel people, it doesn’t interest them. It’s not their itch.”

Witham pointed out that those writing and developing the kernel are not the best people to write and develop tests for it anyway, so there is a need for others to fill the void.

“That’s where [application developers] could step forward for Linux to make it better,” Witham said. “This is a way you can participate. It’s really a place where end users and application developers can contribute to the overall process.”

Witham said it is really the same application people who are calling for better Linux and open source performance metrics who must put themselves to the test. “I don’t think as many application people have stepped forward to work on and contribute to testing [as should],” Witham said.

Witham argued that more application developers must look at conducting systems-level tests; they are the logical ones to do it, since they are the ones who know the applications. “How do you simulate user apps with a majority of code not in the test pack?” Witham asked. “The only people who can do it are those writing the apps.”

PostgreSQL’s Berkus also said organizations that want to help OSS projects with testing should be willing to back that help with official, online publication of test results, “when the OSS project is ready for them, of course.”

As OSDL does, testing agencies must also be willing to work with members of the open source project to make sure the test is fair, according to Berkus, who pointed to the manual setup and tuning beyond the installation script that testing OSS applications requires.

Witham, who indicated OSDL is planning an addition to the JBoss test layer to exercise the Java environment, said it would increase efficiency to build automatic testing into kernel and application development. “The earlier in the process you can test, the smaller the amount you throw away,” Witham said.

Witham was critical of “the old way,” in which one vendor submits an application only to be told by another vendor what’s right or wrong. “That’s OK, but you have no visibility or ability to change,” he said. “You do in open source and Linux. What we need is early testing and data.”