Open source databases gaining ground, analysts say

Author: Jay Lyman

Open source databases have a bright outlook in low-end, Unix-friendly vertical markets, including telecommunications and retail. They have grown from niche use to widespread utilization, and while challenges remain, database deployment is emerging as more of a force for the growth and use of open source and Linux software, according to analysts and vendors.

A report that Wayne Kernochan wrote for his former employer, Aberdeen Group, and that was sponsored by Berkeley DB vendor Sleepycat Software, puts the open source database market at $100 million worldwide, out of a total database market worth $10.5 billion in 2003. The report indicates that open source databases will not increase market share as strongly as they did in the 1990s, when year-over-year growth held steady at 35 percent.

Open source databases have risen from being simply alternatives to actually attracting enterprise-level use, thanks to increased control, flexibility and other advantages. Despite remaining issues, including a perceived lack of support, the open source database is also widening its prospects via deployment on Windows, where open source database vendors are stepping up their efforts.

“I think the database aspect is more important to Linux than is generally said,” Kernochan said. “It’s not make or break, but having an open source database in there and seeing the advantages may well cause developers and organizations to look more favorably on Linux.”

The analyst said Linux in embedded devices such as consumer electronics and cash machines provides a larger role for open source databases, too.

Kernochan, who also referred to Sun Microsystems’ use of Berkeley DB as the embedded data store in Sun Java Enterprise and Desktop systems, said open source database users appreciated the ease of installation, transparency, speed, and how rarely they needed support. When support was needed, one respondent to Kernochan’s survey said their open source database support was “a joy to work with.”

“The first place Linux will really make an impact on enterprise is hosting database systems,” said Bill Claybrook, vice president of Linux strategy for the Harvard Research Group. “I think you’ll see more and more of that happening. Oracle and IBM are pushing it, so we’ll probably see more and more.” Claybrook said it appears that open source databases such as PostgreSQL and MySQL have found their niches to a good degree already.

One of the main advantages of open source databases is the ability of users to get patches from the Web faster than they could with a proprietary database vendor, according to Claybrook, who pointed to the scheduling of such tasks and the control afforded by open source as key advantages.

Most people Claybrook had interviewed for his own research, he said, “had worked with a proprietary database, went to open source, and just liked it so much better.”

Aggressiveness questioned

Nevertheless, some factors are holding back open source databases. In addition to a perceived, and sometimes real, lack of scalability, open source databases do not have the same kind of hooks into other business applications, according to Claybrook. For instance, “Oracle has a tie-in with the applications to run business on Oracle, too,” he said. This tight coupling of applications to a proprietary database is among the reasons switching databases is so painful.

Open source database vendors continue to add functionality and features, but not aggressively, Claybrook said. Kernochan agreed on relatively slow progress by both MySQL and PostgreSQL. “MySQL AB indicates that it is contemplating improvements in areas such as stored procedures, but the next version of MySQL may arrive sometime in 2006, and the company is not committing to any particular improvements,” Kernochan said. “PostgreSQL users indicate that improvements in some areas are contemplated, but the next version of PostgreSQL may come as late as two years from now, and the community that drives PostgreSQL development is not committing to any particular improvements.”

The database vendors take issue with these statements. MySQL Senior Product Manager Alex Roedling said MySQL is actually chosen not only for its ease of use and added control, but also for its scalability and reliability. MySQL handles mission-critical database applications for enterprise customers including Sabre’s Travelocity, Yahoo! Finance and other Web properties, Cox Communications, and others.

Roedling also said MySQL is continuously enhancing and regularly releasing new features and products, including MySQL Cluster for high availability database applications, which is being previewed at this week’s MySQL Users Conference in Orlando. Roedling pointed out that MySQL subqueries are currently available for download in MySQL 4.1 alpha and stored procedures can be had in the MySQL 5.0 development release. Views and Triggers are set to come out with MySQL 5.1, which will be available in alpha before the end of the year, according to Roedling.

For PostgreSQL’s part, advocacy volunteer Josh Berkus complained that the Aberdeen report contained factual errors and did not accurately reflect the open source database’s feature development.

“For example, the report claims that PostgreSQL has no support for Java procedures, something we’ve had for two years,” Berkus said. “While a few of the errors favor us (for example, the report mentions Tablespaces, which are still in beta), most are not so flattering.”

Berkus, who said Kernochan had not talked to anyone with any standing in the PostgreSQL community, conceded PostgreSQL may have difficulty promoting its progress on features, but said the open source database is indeed enterprise-ready.

“Overall, we’re happy to be taken seriously as an ‘enterprise’ database, but not so thrilled with the report,” Berkus said.

Kernochan found that database users overwhelmingly said they are content with their database, whether it is open source or proprietary, and are unlikely to seek a change anytime soon.

PostgreSQL’s Berkus agreed that users are reluctant to switch databases absent a truly compelling reason, mainly because porting applications and infrastructure to a new database requires retraining, refactoring code, and significant downtime in many cases.

Nevertheless, open source database users are swayed by lower TCO, easier administration, and added control, according to MySQL’s Roedling, who referred to a Forrester report on the subject.

To Windows and beyond

While open source databases such as MySQL, PostgreSQL, and Berkeley DB are thriving as replacements for databases that used to run on SCO’s low-end Unix platform, they are also penetrating the Windows market in significant numbers, according to Kernochan.

“Open source databases have made great strides,” Kernochan said. “If they can keep this going, it makes them attractive in the wider-area Windows environment.”

MySQL is deployed about 45 percent of the time on Linux and 40 percent on Windows, with the rest spread across numerous other platforms, including AIX, Solaris, Mac OS X, Netware, HP-UX and others, Roedling reported. PostgreSQL is most commonly deployed with Linux, but will soon be offering more for the Windows world, Berkus said.

“We don’t support native Windows yet,” he said. “When we release our Windows version this year, we’ll see, although the Cygwin port is very popular,” Berkus added, referring to PostgreSQL’s own survey that indicates more than 21 percent of PostgreSQL users are pairing it with Windows or Cygwin environments.

Aberdeen study sponsor and Berkeley DB maker Sleepycat Software does not have data on deployments by operating system, but it gets approximately 1,500 downloads per day, with about 85 percent of them being the Unix/Linux version and the bulk of the rest the Windows version, Vice President of Marketing Rex Wang said.

Berkeley DB offers many of the same key features as MySQL and PostgreSQL, but in a very different package: a library that runs directly inside the application.

“It is also accessed via simple programmatic interfaces,” Kernochan wrote. “It does not support SQL, ODBC, or JDBC.”
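
As a rough illustration of what a library-style database with a plain programmatic interface looks like, as opposed to a server queried through SQL, the sketch below uses Python's standard `dbm.dumb` module. It is a stand-in chosen for convenience and is not Berkeley DB or its API; the file name, the keys, and the `open_store` and `demo` helpers are hypothetical.

```python
import dbm.dumb
import os
import tempfile


def open_store(path):
    """Open (or create) a key/value store that lives inside this process.

    No server is started and no SQL is parsed: the application calls the
    library directly. dbm.dumb is only a stand-in for an embedded store.
    """
    return dbm.dumb.open(path, "c")


def demo():
    """Write, read, and delete records through simple get/put-style calls."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sessions")  # hypothetical file name
        with open_store(path) as store:
            # "put": keys and values are plain byte strings
            store[b"user:42"] = b"alice"
            store[b"user:43"] = b"bob"
            # "get": look a record up directly by key
            value = store[b"user:42"]
            # "delete": remove a record by key
            del store[b"user:43"]
            remaining = sorted(store.keys())
    return value, remaining


if __name__ == "__main__":
    print(demo())
```

The point of the sketch is the shape of the interaction: the application links the storage code in and reads or writes records by key, with no query language or client/server connection in between.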

Sleepycat said that the use of any open source software increasingly means the use of its Berkeley DB as well. “This is because Berkeley DB is so pervasive with other open source software,” Wang said. “Berkeley DB is used by Apache Web server (68 percent of all sites), all commercial versions of Linux, all versions of BSD Unix, sendmail (75 percent of Internet email traffic), Mozilla, OpenLDAP, Perl, Python, Movable Type, and OpenOffice.org.”

“We are the most widely used open source database, with over 200 million copies of Berkeley DB in deployment. So Berkeley DB is almost always already in use at a customer site, even if they don’t realize it,” Wang added.