
Community Blogs

Cloud Operating System - what is it really?

A recent article, “Are Cloud Operating Systems the Next Big Thing?”, suggests that a Cloud Operating System should simplify the application stack, the idea being that the language runtime executes directly on the hypervisor, without an operating system kernel.

Other approaches to cloud operating systems focus on optimising operating system distributions for the cloud with automation in mind. The concepts of IaaS (Infrastructure as a Service), PaaS (Platform as a Service) and SaaS (Software as a Service) remain in the realm of conventional computing paradigms.

None of these approaches address the core benefits of the cloud. The cloud is a pool of resources, not just another “single” computer. When we think of a computer, it has a processor, persistent storage and memory. A conventional operating system exposes compute resources based on these physical limitations of a single computer. 

There are numerous strategies to create the illusion of a larger compute platform, such as load balancing across a cluster of compute nodes. Load balancing is most commonly performed at the network level, with applications and operating systems having limited visibility of the overall compute platform. This means an application cannot determine the available compute resources and scale across the cloud accordingly.

To fully embrace the cloud concept, a platform is required that can automatically scale application components with additional cloud compute resources. Amazon and Google both have solutions that provide some of these capabilities; however, internal enterprise solutions are somewhat limited. Many organisations embrace the benefits of a hosted cloud within the mega data centres around the world, but many companies have a requirement to host applications internally.

As network speeds increase, a real “Cloud Operating System” becomes feasible: one where an application can start a thread that executes not on a separate processor core, but somewhere within the cloud.

A complete paradigm shift is required to comprehend the possibilities of an operating system providing distributed parallel processing. Virtualisation takes this cloud paradigm to another level: a virtualisation layer abstracts the hardware, and a platform operating system presents compute resources to a Cloud Operating System.

In the same way that a conventional operating system determines which CPU core is most appropriate to execute a specific process or thread, a cloud operating system should identify which instance of the cloud execution component is most appropriate to execute a task.

A cloud operating system with multiple execute instances on numerous hosts can schedule tasks based on the available resources of each execute instance. Even with task scheduling abstracted to this higher layer, the underlying operating system is still required to optimise performance using techniques such as symmetric multiprocessing (SMP), processor affinity and thread priorities.
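The scheduling decision described in the last two paragraphs can be sketched in a few lines. This is an illustrative model only; `ExecuteInstance` and `pick_instance` are invented names, not part of any existing cloud operating system:

```python
from dataclasses import dataclass

@dataclass
class ExecuteInstance:
    """One cloud execution component running on some host."""
    name: str
    free_cores: int
    free_mem_mb: int

def pick_instance(instances, cores_needed, mem_needed_mb):
    """Choose the instance with the most headroom that can fit the task,
    much as a conventional OS scheduler picks a CPU core for a thread."""
    candidates = [i for i in instances
                  if i.free_cores >= cores_needed and i.free_mem_mb >= mem_needed_mb]
    if not candidates:
        return None  # no capacity anywhere in the cloud right now
    return max(candidates, key=lambda i: (i.free_cores, i.free_mem_mb))

instances = [
    ExecuteInstance("host-a", free_cores=2, free_mem_mb=4096),
    ExecuteInstance("host-b", free_cores=8, free_mem_mb=16384),
]
print(pick_instance(instances, cores_needed=4, mem_needed_mb=2048).name)  # host-b
```

A real implementation would also weigh network locality and cost, while leaving SMP-level optimisation to the operating system on each host.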

The application developer has for many years been abstracted from the hardware with development environments such as C#, Java and even PHP. Operating systems have not adapted to the Cloud concept of providing compute resources beyond a single computer. 

The most comparable implementation is the route taken by application servers, with solutions such as Java EJB where lookups can occur to find providers. Automatic scalability is, however, limited with these solutions.

Hardware vendors are moving ahead by creating cloud optimised platforms. The concept is that many smaller platforms create optimal compute capacity. HP seem to be leading this sector with their Moonshot solution. The question however remains: How do you make many look like one?  

Enterprises have existing data centres where very little of the overall compute capacity is actually leveraged on an ongoing basis. When one system is busy, numerous others are idle. A cloud compute environment that can automatically scale across a collection of servers would provide real cost savings: compute capacity would be additive, using existing infrastructure for workloads based on available resources. According to the IDC report on worldwide server shipments, the server market is in excess of $12B per quarter. The major vendors are looking for ways to differentiate their solutions and provide optimal value to customers.

By combining hardware, virtualisation and a Cloud Operating System, organisations will benefit from a reduction in the cost of providing adequate compute capacity to serve business needs.

Gideon Serfontein is a co-founder of the Bongi Cloud Operating System research project. Additional information at


The Tyranny of the Clouds

Or “How I learned to start worrying and never trust the cloud.”

The Clouderati have been derping for some time now about how we’re all going towards the public cloud and “private cloud” will soon become a distant, painful memory, much like electric generators filled the gap before power grids became the norm. They seem far too glib about that prospect, and frankly, they should know better. When the Clouderati see the inevitability of the public cloud, their minds lead to unicorns and rainbows that are sure to follow. When I think of the inevitability of the public cloud, my mind strays to “The Empire Strikes Back” and who’s going to end up as Han Solo. When the Clouderati extol the virtues of public cloud providers, they prove to be very useful idiots advancing service providers’ aims, sort of the Lando Calrissians of the cloud wars. I, on the other hand, see an empire striking back at end users and developers, taking away our hard-fought gains made from the proliferation of free/open source software. That “the empire” is doing this *with* free/open source software just makes it all the more painful an irony to bear.

I wrote previously that It Was Never About Innovation, and that article was set up to lead to this one, which is all about the cloud. I can still recall talking to Nicholas Carr about his new book at the time, “The Big Switch”, all about how we were heading towards a future of utility computing, and what that would portend. Nicholas saw the same trends the Clouderati did, except a few years earlier, and came away with a much different impression. Where the Clouderati are bowled over by Technology! and Innovation!, Nicholas saw a harbinger of potential harm and warned of a potential economic calamity as a result. While I also see a potential calamity, it has less to do with economic stagnation and more to do with the loss of both freedom and equality.

The virtuous cycle I mentioned in the previous article does not exist when it comes to abstracting software over a network, into the cloud, and away from the end user and developer. In the world of cloud computing, there is no level playing field – at least, not at the moment. Customers are at the mercy of service providers and operators, and there are no “four freedoms” to fall back on.

When several of us co-founded the Open Cloud Initiative (OCI), it was with the intent, as Simon Phipps so eloquently put it, of projecting the four freedoms onto the cloud. There have been attempts to mandate additional terms in licensing that would force service providers to participate in a level playing field. See, for example, the great debates over “closing the web services loophole” as we called it then, during the process to create the successor to the GNU General Public License version 2. Unfortunately, while we didn’t yet realize it, we didn’t have the same leverage as we had when software was something that you installed and maintained on a local machine.

The Way to the Open Cloud

Many “open cloud” efforts have come and gone over the years, none of them leading to anything of substance or gaining traction where it matters. Bradley Kuhn helped drive the creation of the Affero GPL version 3, which set out to define what software distribution and conveyance mean in a web-driven world, but the rest of the world has been slow to adopt because, again, service providers have no economic incentive to do so. Where we find ourselves today is a world without a level playing field, which will, in my opinion, stifle creativity and, yes, innovation. It is this desire for “innovation” that drives the service providers to behave as they do, although as you might surmise, I do not think that word means what they think it means. As in many things, service providers want to be the arbiters of said innovation without letting those dreaded freeloaders have much of a say. Worse yet, they create services that push freeloaders into becoming part of the product – not a participant in the process that drives product direction. (I know, I know: yes, users can get together and complain or file bugs, but they cannot mandate anything over the providers)

Most surprising is that the closed cloud is aided and abetted by well-intentioned, but ultimately harmful actors. If you listen to the Clouderati, public cloud providers are the wonderful innovators in the space, along with heaping helpings of concern trolling over OpenStack’s future prospects. And when customers lose because a cloud company shuts its doors, the clouderati can’t be bothered to bring themselves to care: c’est la vie and let them eat cake. The problem is that too many of the clouderati think that Innovation! is a means to its own ends without thinking of ground rules or a “bill of rights” for the cloud. Innovation! and Technology! must rule all, and therefore the most innovative take all, and anything else is counter-productive or hindering the “free market”. This is what happens when the libertarian-minded carry prejudiced notions of what enabled open source success without understanding what made it possible: the establishment and codification of rights and freedoms. None of the Clouderati are evil, freedom-stealing, or greedy, per se, but their actions serve to enable those who are. Because they think solely in terms of Innovation! and Technology!, they set the stage for some companies to dominate the cloud space without any regard for establishing a level playing field.

Let us enumerate the essential items for open innovation:

  1. Set of ground rules by which everyone must abide, eg. the four freedoms
  2. Level playing field where every participant is a stakeholder in a collaborative effort
  3. Economic incentives for participation

These will be vigorously opposed by those who argue that establishing such a list is too restrictive for innovation to happen, because… free market! The irony is that establishing such rules enabled Open Source communities to become the engine that runs the world’s economy. Let us take each and discuss its role in creating the open cloud.

Ground Rules

We have already established the irony that the four freedoms led to the creation of software that was used as the infrastructure for creating proprietary cloud services. What if the four freedoms were tweaked for cloud services? As a reminder, here are the four freedoms:


  • The freedom to run the program, for any purpose (freedom 0).
  • The freedom to study how the program works, and change it so it does your computing as you wish (freedom 1).
  • The freedom to redistribute copies so you can help your neighbor (freedom 2).
  • The freedom to distribute copies of your modified versions to others (freedom 3).


If we rewrote this to apply to cloud services, how much would need to change? I made an attempt at this, and it turns out that only a couple of words need to change:


  • The freedom to run the program or service, for any purpose (freedom 0).
  • The freedom to study how the service works, and change it so it does your computing as you wish (freedom 1).
  • The freedom to implement and redistribute copies so you can help your neighbor (freedom 2).
  • The freedom to implement your modified versions for others (freedom 3).


Freedom 0 adds “or service” to denote that we’re not just talking about a single program, but a set of programs that act in concert to deliver a service.

Freedom 1 allows end users and developers to peek under the hood.

Freedom 2 adds “implement and” to remind us that the software alone is not much use – the data forms a crucial part of any service.

Freedom 3 also changes “distribute copies of” to “implement” because of the fundamental role that data plays in any service. Distributing copies of software in this case doesn’t help anyone without also adding the capability of implementing the modified service, data and all.

Establishing these rules will be met, of course, with howls of rancor from the established players in the market, as it should be.

Level Playing Field

With the establishment of the service-oriented freedoms, above, we have the foundation for a level playing field with actors from all sides having a stake in each other’s success. Each of the enumerated freedoms serves to establish a managed ecosystem, rather than a winner-take-all pillage and plunder system. This will be countered by the argument that if we hinder the development of innovative companies won’t we a.) hinder economic growth in general and b.) socialism!

In the first case, there is a very real threat from a winner-take-all system. In its formative stages, when everyone has the economic incentive to innovate (there’s that word again!), everyone wins. Companies create and disrupt each other, and everyone else wins by utilizing the creations of those companies. But there’s a well-known consequence of this activity: each actor will try to build in the ability to retain customers at all costs. We have seen this happen in many markets, such as the creation of proprietary, undocumented data formats in the office productivity market. And we have seen it in the cloud, with the creation of proprietary APIs that lock in customers to a particular service offering. This, too, chokes off economic development and, eventually, innovation.

At first, this lock-in happens via the creation of new products and services which usually offer new features that enable customers to be more productive and agile. Over time, however, once the lock-in is established, customers find that their long-term margins are not in their favor, and moving to another platform proves too costly and time-consuming. If all vendors are equal, this may not be so bad, because vendors have an incentive to lure customers away from their existing providers, and the market becomes populated by vendors competing for customers, acting in their interest. Allow one vendor to establish a larger share than others, and this model breaks down.

In a monopoly situation, the incumbent vendor has many levers to lock in their customers, making the transition cost too high to switch to another provider. In cloud computing, this winner-take-all effect is magnified by the massive economies of scale enjoyed by the incumbent providers. Thus, the customer is unable to be as innovative as they could be due to their vendor’s lock-in schemes. If you believe in unfettered Innovation! at all costs, then you must also understand the very real economic consequences of vendor lock-in.

By creating a level playing field through the establishment of ground rules that ensure freedom, a sustainable and innovative market is at least feasible. Without that, an unfettered winner-take-all approach will invariably result in the loss of freedom and, consequently, agility and innovation.

Economic Incentives

This is the hard one. We have already established that open source ecosystems work because all actors have an incentive to participate, but we have not established whether the same incentives apply here. In the open source software world, developers participate because they have to: the price of software is always dropping, and customers enjoy open source software too much to give it up for anything else. One thing that may be in our favor is the distinct lack of profits in the cloud computing space, although that changes once you include services built on cloud computing architectures.

If we focus on infrastructure as a service (IaaS) and platform as a service (PaaS), the primary gateways to creating cloud-based services, then the margins and profits are quite low. This market is, by its nature, open to competition because the race is on to lure as many developers and customers as possible to the respective platform offerings. The danger, however, is that one particular service provider is able to offer proprietary services that give it leverage over the others, establishing the lock-in levers needed to pound the competition into oblivion.

In contrast to basic infrastructure, the profit margins of proprietary products built on top of cloud infrastructure have been growing for some time, which incentivizes the IaaS and PaaS vendors to keep stacking proprietary services on top of their basic infrastructure. This results in a situation where increasing numbers of people and businesses have happily donated their most important business processes and workflows to these service providers. If any of them grow unhappy with the service, they cannot easily switch, because no competitor would have access to the same data or implementation of that service. In this case, not only is there a high cost associated with moving to another service, there is the distinct loss of utility (and revenue) that the customer would experience. There is a cost that comes from entrusting so much of your business to single points of failure with no known mechanism for migrating to a competitor.

In this model, there is no incentive for service providers to voluntarily open up their data or services to other service providers. There is, however, an incentive for competing service providers to be more open with their products. One possible solution could be to create an Open Cloud certification that would allow services that abide by the four freedoms in the cloud to differentiate themselves from the rest of the pack. If enough service providers signed on, it would lead to a network effect adding pressure to those providers who don’t abide by the four freedoms. This is similar to the model established by the Free Software Foundation and, although the GNU people would be loath to admit it, the Open Source Initiative. The OCI’s goal was to ultimately create this, but we have not yet been able to follow through on those efforts.


We have a pretty good idea why open source succeeded, but we don’t know if the open cloud will follow the same path. At the moment, end users and developers have little leverage in this game. One possibility would be if end users chose, at massive scale, to use services that adhered to open cloud principles, but we are a long way away from this reality. Ultimately, in order for the open cloud to succeed, there must be economic incentives for all parties involved. Perhaps pricing demands will drive some of the lower rung service providers to adopt more open policies. Perhaps end users will flock to those service providers, starting a new virtuous cycle. We don’t yet know. What we do know is that attempts to create Innovation! will undoubtedly lead to a stacked deck and a lack of leverage for those who rely on these services.

If we are to resolve this problem, it can’t be about innovation for innovation’s sake – it must be, once again, about freedom.

This article originally appeared on the Gluster Community blog.


On the use of low-thread, high-speed “gaming computers” to solve engineering simulations

   Many of us in the Linux community work only with software that is FOSS, which stands for free and open-source software. This is software that is not only open source but is available without licensing fees. There are many outstanding FOSS products on the market today, from the Firefox browser to most distributions of Linux to the OpenOffice productivity suite. However, there are times when FOSS is not an option; a good example is my line of work supporting engineering software, especially CAD tools and simulators. This software is not only costly, it is very restrictive: every aspect of the software is charged for separately. For example, many of the simulators can run multithreaded, with one piece of software running up to 16 threads for a single simulation. More threads require more tokens, and we pay per available token. This puts us in a situation where we want to maximize the amount we accomplish with as few threads as we can.

   If, for example, an engineer needs a simulation to finish to prove a design concept, and it will take 6 hours to simulate at 1 thread, he or she will want another token in order to use more threads. Using one extra token may buy a reduction of 3 hours in simulation time, but the cost is that the tokens used for that simulation cannot be used by another engineer. The simple solution would be to keep buying more tokens until every engineer has enough to run on the maximum number of threads at all times: if there are 5 engineers who each run simulations that can use 16 threads for the cost of 5 tokens, we would need 25 tokens. Of course the simple solution rarely works; the cost of 25 tokens is so high that it could easily bankrupt a medium-sized company.
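The token arithmetic above can be written out explicitly. The numbers are the ones from the example, not real licence prices:

```python
def tokens_needed(engineers, tokens_per_engineer):
    """Tokens required for every engineer to always run at full thread count."""
    return engineers * tokens_per_engineer

# 5 engineers, each needing 5 tokens for a full 16-thread simulation:
print(tokens_needed(5, 5))  # 25

# The trade-off per extra token in the example: a 6-hour single-thread run
# drops to 3 hours, i.e. 3 engineer-hours saved while the token is tied up.
hours_saved = 6 - 3
print(hours_saved)  # 3
```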

   Another solution would be to use fewer tokens but implement advanced queuing software. This has the advantage that engineers can submit tasks and the servers running the simulations will run at all times (we hope), using the tokens we do have to the utmost. This strategy works well when deadlines are far away, but as they get close the fight for slots grows.
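The queuing idea can be sketched with a token-limited job queue. The token counts are illustrative, and a real deployment would use a proper batch scheduler rather than this toy:

```python
import queue
import threading

TOKEN_POOL = 10       # tokens the company owns (illustrative number)
TOKENS_PER_JOB = 5    # tokens one full-speed simulation consumes (illustrative)

# At most TOKEN_POOL // TOKENS_PER_JOB simulations can hold tokens at once.
slots = threading.Semaphore(TOKEN_POOL // TOKENS_PER_JOB)
jobs = queue.Queue()
done = []

def runner():
    while True:
        job = jobs.get()
        if job is None:
            return
        with slots:            # block until enough tokens are free
            done.append(job)   # stand-in for actually running the simulation
        jobs.task_done()

workers = [threading.Thread(target=runner) for _ in range(4)]
for w in workers:
    w.start()
for name in ["sim-a", "sim-b", "sim-c", "sim-d", "sim-e"]:
    jobs.put(name)
jobs.join()
for _ in workers:
    jobs.put(None)  # sentinel: tell each worker to exit
for w in workers:
    w.join()
print(sorted(done))
```

The semaphore keeps the token pool fully used whenever work is waiting, which is exactly the behaviour the queuing strategy relies on.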

  Since the limiting resource here is the number of threads, we tried a different approach. As we are paying per thread we run, we should try to run each thread as fast as possible, increasing per-thread performance rather than total throughput. To further justify our reasoning we created benchmarks for our tools, comparing the time a simulation took to run against the number of threads we employed for it.

  The conclusion was: independent of the software and the type of simulation we ran, performance increased with thread count up to about 4.5 threads and then leveled off. A surprising result, given that the tools we used came out in different years and were produced by different vendors.

   Given this information we concluded that if we ran 4 threads 25% faster on machine A (by overclocking), we could achieve better results than on machine B running more threads at stock speed, despite the same architecture. This meant that for a near-trivial price (compared to the cost of a server or additional tokens), a modified desktop computer could outperform a server with the maximum number of tokens we could purchase.
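The leveling-off described above is consistent with Amdahl's law. The sketch below is not the benchmark data; the parallel fraction of 0.55 is purely illustrative, chosen so the curve flattens in the 4-5 thread region:

```python
def amdahl_speedup(threads, parallel_fraction):
    """Amdahl's law: overall speedup when only part of the work parallelizes."""
    p = parallel_fraction
    return 1.0 / ((1.0 - p) + p / threads)

P = 0.55  # illustrative parallel fraction, not a measured value
for n in (1, 2, 4, 8, 16):
    print(f"{n:2d} threads: {amdahl_speedup(n, P):.2f}x")

# Machine A: 4 threads overclocked 25% faster; machine B: 16 threads at stock.
machine_a = 1.25 * amdahl_speedup(4, P)
machine_b = amdahl_speedup(16, P)
print(machine_a > machine_b)  # True for this parallel fraction
```

With a serial fraction this large, a modest per-core clock advantage at 4 threads overtakes a 16-thread run, which matches the trade-off the benchmarks suggested.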

Our new system specifications (Newegg order):

  • Intel Core i7 (socket 1155)
  • Asus motherboard
  • Cooler Master power supply
  • G.Skill DDR3 RAM
  • All-in-one liquid CPU cooler
  • Cooler Master PC case
  • Ethernet server adapter (Amazon order)
   The total cost was approximately $1,200 per unit after rebates. Assembly took about 3 hours. Overclocking was stable at 4.7 GHz, with a maximum recorded temperature of 70 C. The operating system is CentOS with the full desktop installed. The NICs have two connections link-aggregated to our servers.

  To test the overclocking we wrote a simple infinite-loop floating-point operation in Perl and launched 8 instances of it while monitoring the results using a FOSS program called i7z. The hard drive exists only to provide a boot drive; all other functions are performed via ssh and NFS exports. The units sit headless in our server room. We estimate that we have reduced overall simulation time across the company by 50% with only two units.
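The original stress test was a Perl one-liner; a bounded Python equivalent of the same idea looks like this (the iteration count is illustrative, and i7z would still be run separately to watch clocks and temperatures):

```python
import multiprocessing
import time

def fpu_burn(iterations):
    """Tight floating-point busy loop to load one core.
    Bounded, unlike the original infinite Perl loop, so it terminates."""
    x = 1.0001
    for _ in range(iterations):
        x = (x * 3.141592653589793 + 1.0) % 1000003.0
    return x

if __name__ == "__main__":
    start = time.time()
    # 8 instances, matching the 8 hardware threads under test.
    with multiprocessing.Pool(processes=8) as pool:
        pool.map(fpu_burn, [500_000] * 8)
    print(f"8 workers finished in {time.time() - start:.1f}s")
```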

  The analogy we give is one of transportation. Our servers function like buses: they can move a great many people at a time, but buses are slow. The machines we constructed are high-speed sports cars: they can move only a few people at a time, but they move them much faster.

Isiah Schwartz

Teledyne Lecroy



Meet the HD Camera Cape for the BBB - a $49.99 low-cost camera cape

  • Adding a camera cape to the latest BeagleBone Black, RadiumBoards has increased its “Cape”-ability to provide a portable camera solution. Thousands of designers, makers, hobbyists and engineers are adopting the BeagleBone Black and becoming fans of its unique functionality as a pocket-sized, expandable Linux computer that can connect to the Internet. To support them in their work, we have designed this HD Camera Cape with the following features and benefits:

    • 720p HD video at 30fps
    • Superior low-light performance
    • Ultra-low-power
    • Progressive scan with Electronic Rolling Shutter
    • No software effort required for OEM
    • Proven, off-the-shelf driver available for quick and easy software integration
    • Minimize development time for system designers
    • Easily Customized
    • Simple Design with Low Cost
  • Priced at just $49.99, this game-changing cape for the latest credit-card-sized BBB can help developers differentiate their product and get to market faster.
  • To learn more about this new and exciting cape, check out

    Dick MacInnis, Creator of DreamStudio, Launches Celeum Embedded Linux Devices

    Celeum offers four unique embedded devices based on Linux:

    1. The CeleumPC, which dual boots Android and DreamStudio
    2. The CeleumTV, which runs Android with a custom XBMC setup
    3. The Celeum Cloud Server, which runs Ubuntu Server with ownCloud for personal cloud storage, and
    4. The Celeum Domain Server, a drop-in replacement for Windows Domain Controllers, powered by Ubuntu Server and a custom fork of Zentyal Small Business Server.

    The CeleumTV is currently available only in the Saskatoon, Canada area, while the other three devices are in the crowdfunding phase and can be preordered by making a donation to the Celeum Indiegogo campaign.


    Leveraging Open Source and Avoiding Risks in Small Tech Companies

    Today’s software development is geared more towards building upon previous work and less towards reinventing content from scratch. Resourceful software development organizations and developers use a combination of previously created code, commercial software and open source software (OSS), and their own creative content to produce the desired software product or functionality. Outsourced code can also be used, which in itself can contain any of the above combination of software.

    There are many good reasons for using off-the-shelf and especially open source software, the greatest being its ability to speed up development and drive down costs without sacrificing quality. Almost all software groups knowingly, and in many cases unknowingly, use open source software to their advantage. Code reuse is possibly the biggest accelerator of innovation, as long as OSS is adopted and managed in a controlled fashion.

    In today’s world of open-sourced, out-sourced, easily-searched and easily-copied software it is difficult for companies to know what is in their code. Anytime a product containing software changes hands there is a need to understand its composition, its pedigree, its ownership, and any open source licenses or obligations that restrict the rules around its use by new owners.

    Given developers’ focus on the technical aspects of their work and emphasis on innovation, obligations associated with use of third party components can be easily compromised. Ideally companies track open source and third party code throughout the development lifecycle. If that is not the case then, at the very least, they should know what is in their code before engaging in a transaction that includes a software component.

    Examples of transactions involving software are: a launch of a product into the market, mergers & acquisitions (M&A) of companies with software development operations, and technology transfer between organizations whether they are commercial, academic or public. Any company that produces software as part of a software supply chain must be aware of what is in their code base.


    Impact of Code Uncertainties

    Any uncertainty around software ownership or license compliance can deter downstream users, reduce the ability to create partnerships, and create litigation risk for the company and its customers. For smaller companies, intellectual property (IP) uncertainties can also delay or otherwise threaten the closure of funding deals, affect product and company value, and negatively impact M&A activities.

    IP uncertainties can affect the competitiveness of small technology companies due to indemnity demands from their clients. Therefore technology companies need to understand the obligations associated with the software that they are acquiring. Any uncertainties around third party content in code can also stretch sales cycles. Lack of internal resources allocated to identification, tracking and maintaining open source and other third party code in a project impacts smaller companies even more.

    Along with licensing issues and IP uncertainties, organizations that use open source also need to be aware of security vulnerabilities. A number of public databases, such as the US National Vulnerability Database (NVD) or Carnegie Mellon University's Computer Emergency Response Team (CERT) database, list known vulnerabilities associated with a large number of software packages. Without accurate knowledge of what exists in the code base it is not possible to consult these databases. Aspects such as known deficiencies, vulnerabilities, security risks, and code pedigree all assume the existence of a software Bill of Materials (BOM). In a number of jurisdictions, another important aspect to consider before a software transaction takes place is whether the code includes encryption content or other content subject to export control; this is important to companies that do business internationally.
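The dependency described above, that no vulnerability lookup is possible without a BOM, can be shown in miniature. The BOM entries here are hypothetical, though the two CVE IDs (Heartbleed and Shellshock) are real; an actual check would query feeds such as the NVD:

```python
# Hypothetical bill of materials for a product: (package, version) pairs.
bom = {
    ("openssl", "1.0.1f"),
    ("zlib", "1.2.11"),
}

# A tiny stand-in for a public vulnerability database.
known_vulnerabilities = {
    ("openssl", "1.0.1f"): ["CVE-2014-0160"],  # Heartbleed
    ("bash", "4.2"): ["CVE-2014-6271"],        # Shellshock
}

def vulnerable_components(bom, vuln_db):
    """The lookup is only possible because the BOM exists: no BOM, no check."""
    return {pkg: vuln_db[pkg] for pkg in bom if pkg in vuln_db}

print(vulnerable_components(bom, known_vulnerabilities))
```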


    The benefits of OSS usage can be realized and the risks can be managed at the same time. Ideally, a company using OSS should have a process in place to ensure that OSS is properly adopted and managed throughout the development cycle. Having such a process in place allows organizations to detect any licensing or IP uncertainties at the earliest possible stage during development, which reduces the time, effort, and cost associated with correcting the problem later down the road.

    If a managed OSS adoption process spanning all stages of a development life cycle is not in place, there are other options available to smaller companies. Organizations are encouraged to audit their code base, or software in specific projects, regularly. Some may decide to examine third party contents and the associated obligations just before a product is launched, or in anticipation of an M&A.


    Internal Audits

    The key here is having an accurate view of all third-party content, including OSS, within the company. One option is to carry out an internal audit of the company code base for the presence of outside content and its licensing and other obligations. Unfortunately, manually auditing a typical project of 1,000-5,000 files is a resource- and time-consuming process. Automated tools can speed up the discovery stage considerably. For organizations that do not have the time, resources or expertise to carry out an assessment on their own, an external audit would be the fastest, most accurate and most cost-effective option.
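A first-level automated discovery pass of the kind described might look like the minimal sketch below. The license markers are illustrative only; commercial audit tools use much richer signature databases and code fingerprinting, and expert review still follows:

```python
import os
import re

# Illustrative license markers keyed by a short license label.
LICENSE_MARKERS = {
    "GPL": re.compile(r"GNU General Public License", re.I),
    "Apache-2.0": re.compile(r"Apache License,? Version 2\.0", re.I),
    "MIT": re.compile(r"Permission is hereby granted, free of charge", re.I),
}

def scan_tree(root):
    """First-pass discovery: flag files whose headers mention a known license."""
    findings = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                with open(path, errors="ignore") as f:
                    head = f.read(4096)  # license headers sit near the top
            except OSError:
                continue
            hits = [lic for lic, rx in LICENSE_MARKERS.items() if rx.search(head)]
            if hits:
                findings[path] = hits
    return findings
```

Running `scan_tree` over a project root yields a file-to-license map that can seed the manual review stage.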


    External Audits

    External audit groups ideally deploy experts on open source and software licensing who use automated tools, resulting in accurate assessment and fast turnaround. A large audit project requires significant interaction between the audit agency and company personnel, typically representatives in the R&D group, the resident legal or licensing office, and product managers. It also requires an understanding of the company’s outsourcing and open source adoption history, knowledge of the code portfolio in order to break it down into meaningful smaller sub-projects, test runs, and consistent interaction between the audit team and the company representatives.

    Smaller audit projects, however, can be streamlined and a number of overhead activities eliminated, resulting in a time- and cost-efficient solution without compromising detail or accuracy. An example is a streamlined, machine-assisted software assessment service. The automated scanning operation, using automated open source management tools, can provide a first-level report in hours. Expert review and verification of the machine-generated reports, and final consolidation of the results into an executive report, can take another few days depending on the size of the project.

    The executive report delivered by an external audit agency is a high-level view of all third-party content, including OSS, and the attributes associated with it. The audit report describes the software code audit environment, the process used, and the major findings, drawing attention to specific software packages, or even software files and their associated copyrights and licenses. The audit report will highlight third-party code snippets that were “cut & pasted” into proprietary files and how that could affect the distribution or the commercial model. This is important for certain licenses, such as those in the GPL (GNU General Public License) family of OSS licenses, depending on how the licensed code or code snippet is utilized.

    The report significantly reduces the discovery and analysis effort required from the company being audited, allowing them to focus on making relevant decisions based on the knowledge of their code base.


    Third-party code, including open source and commercially available software packages, can accelerate development, reduce time to market and decrease development costs. These advantages can be obtained without compromising quality, security or IP ownership. Especially for small companies, any uncertainty around code content and the obligations associated with third-party code can impact the ability of an organization to attract customers. Ambiguity around third-party code within a product stretches sales cycles, reduces the value of products and impacts company valuations. For small organizations, an external audit of the code base can quickly, accurately and economically establish the composition of the software and its associated obligations.


    Corks? Or Screw Tops? Why the Experience Matters

    I've noticed a disturbing trend amongst a few of the high quality wineries in my state. They have abandoned the cork to close their high-end wine bottles and turned to screw caps. This is good news to people who struggle with how to get a cork out of a wine bottle. 


    Chatting with Peter Tait of Lucid Imagination

    Back in January I had the opportunity to test drive LucidWorks Enterprise, a search engine for internal networks. The cross-platform search engine was flexible, stable, easy to install and came backed by a friendly support staff. In short, it was a good experience which demonstrated how useful (and straight forward) running one's own search engine can be.


    HOWTO - Using rsync to move a mountain of data

    In this installment of my blog, I want to document the proper use of rsync for folks who are tasked with moving a large amount of data.  I'll even show you a few things you can do from the command line interface to extend the built-in capability of rsync using a little bash-scripting trickery.

    I use rsync to migrate Oracle databases between servers at least a few times per year.  In a snap, it's one of the easiest ways to clone a database from a Production server to a Pre-Production/Development server or even a Virtual Machine.  Thanks to rsync, you don't have to have a fancy Fibre-Channel or iSCSI storage array attached to both servers in order to do a data LUN clone.

    I hope you enjoy this in-depth article.  Please feel free to comment if you need clarification, find it useful, or spot something I wrote that is just plain wrong.


    Clone a Virtual Machine from the shell (The Script)

    After a few comments on my previous blog about how to manually clone a Virtual Machine from the shell, I've decided to write a simple script to do everything automatically. It may be useful for newbies, but basically it reproduces all the information reported in my latest blog.

    There's no rocket science here and I've tried to keep the script simple and hackable for anyone. It took me some time (less than an hour) due to my poor sed knowledge; I took it as an exercise to improve my sed skills.

    As with open source in general, feel free to improve or modify it as you wish, and send me an updated copy so I can publish your best version as well. Error checking is quite simple for now: you may input absolute or relative paths, but there are a few limitations.


    Basic Usage:

    VMCopy <old name> <new name>
    <old name> is the name of the directory with the original VMWare files
    <new name> is the name of the directory with the newly created VMWare files

    simple, isn't it ?


    Here's the script:

    #!/bin/bash
    # @name    VMCopy - Copy/Clone a VMWARE Virtual machine with a new name
    # @author  Andrea Benini (Ben)
    # @since   2011-02
    # @website
    # @email   andrea benini (at domain name) gmail [DoT] com
    # @package Use it to get a physical copy of an existing machine, no snapshots or
    #          VMWare tools involved in this operation, it's a plain text bash script
    # @require This tool should be portable to many UNIX platforms, it just requires:
    #          sed, dirname, basename, md5sum, $RANDOM (shell variable) and a few more
    #          shell builtin commands
    # @license GPL v2 AND The Beer-ware License
    #          "THE BEER-WARE LICENSE" (Revision 43)
    #          Andrea Benini wrote this file. As long as you retain this notice you
    #          can do whatever you want with this stuff. If you make modifications to
    #          the file please leave author notes on it; if you improve/alter/modify
    #          it please send me an updated copy by email. If we meet some day, and
    #          you think this stuff is worth it, you can buy me a beer in return.
    #          Andrea Benini

    if [[ $# -ne 2 ]]; then
        echo -e "$0 <old name> <new name>"
        echo -e "    Copies a VMWare virtual machine"
        echo    "    <old name> and <new name> are the directory names"
        echo    "    of the machine you'd like to copy and the new destination name"
        echo ""
        exit 1
    fi

    SOURCEPATH=$(dirname "$1")
    TARGETPATH=$(dirname "$2")
    SOURCEMACHINE=$(basename "$1")
    TARGETMACHINE=$(basename "$2")

    exec 2> /dev/null
    echo "VMCopy - VMWare Virtual Machines cloner"
    echo " - Copying source machine '$SOURCEMACHINE' with the new name '$TARGETMACHINE'..."
    cp -R "$SOURCEPATH/$SOURCEMACHINE" "$TARGETPATH/$TARGETMACHINE"

    echo " - Removing unnecessary files for '$TARGETMACHINE'"
    rm -f "$TARGETPATH/$TARGETMACHINE"/*.log

    echo " - Renaming files for '$TARGETMACHINE'"
    ls "$TARGETPATH/$TARGETMACHINE" | while read OLDNAME; do
        NEWNAME=${OLDNAME//$SOURCEMACHINE/$TARGETMACHINE}
        mv -f "$TARGETPATH/$TARGETMACHINE/$OLDNAME" "$TARGETPATH/$TARGETMACHINE/$NEWNAME"
    done

    echo " - Remapping Hard Disks for the new machine"
    ls "$TARGETPATH/$TARGETMACHINE"/*.vmdk | grep -v -e "-s....vmdk" | while read DISKNAME; do
        sed -i "s/$SOURCEMACHINE/$TARGETMACHINE/g" "$DISKNAME"
    done

    echo " - Changing resource files (if any)"
    [[ -f "$TARGETPATH/$TARGETMACHINE/$TARGETMACHINE.vmxf" ]] &&
        sed -i "s/$SOURCEMACHINE/$TARGETMACHINE/g" "$TARGETPATH/$TARGETMACHINE/$TARGETMACHINE.vmxf"

    echo " - Changing $TARGETMACHINE.vmx file"
    # Massive character substitutions
    sed -i "s/$SOURCEMACHINE/$TARGETMACHINE/g" "$TARGETPATH/$TARGETMACHINE/$TARGETMACHINE.vmx"
    # Change ethernet mac addresses
    MACADDRESSES=$(grep "generatedAddress =" "$TARGETPATH/$TARGETMACHINE/$TARGETMACHINE.vmx" | sed -e 's/.*= "//' -e 's/"//')
    for OLDMAC in $MACADDRESSES; do
        NEWMAC=$(echo "$RANDOM$RANDOM" | md5sum | sed -r 's/(..)/\1:/g; s/^(.{17}).*$/\1/')
        sed -i "s/$OLDMAC/$NEWMAC/" "$TARGETPATH/$TARGETMACHINE/$TARGETMACHINE.vmx"
    done

    echo -e " - Operation Complete, '$TARGETMACHINE' cloned successfully"



    Share your ideas

    If you find errors or you'd like to change some parts, let me know; share your ideas to improve the script and I'll always post the improved version here.



    Manually clone a VMWare Virtual machine from the shell


    Sometimes you have VMWare appliances and need a physical copy instantly, but you don't have VMWare Tools with you, or you're doing everything from the command line (on a remote console); sometimes you don't even have VMWare (ESX/GSX/VSphere/Player) installed, or you have just the Player (no cloning from there), but you still need a clone of a working machine. I usually create my own appliances with my own utilities, packages and tools installed; I store them as .TAR.GZ files and use them as a base for new machines. Here's what I do to get an exact copy of a machine. It's not a geek trick, just a plain basic task, and it always works, no matter which OS is inside your VM (Win/Linux/BSD/Plan9/BeOS/...).


    First of all, stop your source machine (in my example “Debian 6”) and locate its directory, then copy the whole source directory to a new path (in my example “new.machine”)

    $ cp -R "Debian 6" new.machine
    $ ls -la new.machine/
    total 534520
    drwxr-xr-x 2 ben ben 4096 2011-02-09 09:53 .
    drwxr-xr-x 12 ben ben 4096 2011-02-09 09:53 ..
    -rw------- 1 ben ben 8684 2011-02-09 09:53 Debian 6.nvram
    -rw------- 1 ben ben 211550208 2011-02-09 09:53 Debian 6-s001.vmdk
    -rw------- 1 ben ben 234356736 2011-02-09 09:53 Debian 6-s002.vmdk
    -rw------- 1 ben ben 107347968 2011-02-09 09:53 Debian 6-s003.vmdk
    -rw------- 1 ben ben 2621440 2011-02-09 09:53 Debian 6-s004.vmdk
    -rw------- 1 ben ben 65536 2011-02-09 09:53 Debian 6-s005.vmdk
    -rw------- 1 ben ben 639 2011-02-09 09:53 Debian 6.vmdk
    -rw-r--r-- 1 ben ben 0 2011-02-09 09:53 Debian 6.vmsd
    -rwxr-xr-x 1 ben ben 1652 2011-02-09 09:53 Debian 6.vmx
    -rw-r--r-- 1 ben ben 263 2011-02-09 09:53 Debian 6.vmxf
    -rw-r--r-- 1 ben ben 88558 2011-02-09 09:53 vmware-0.log
    -rw-r--r-- 1 ben ben 49667 2011-02-09 09:53 vmware-1.log
    -rw-r--r-- 1 ben ben 64331 2011-02-09 09:53 vmware-2.log
    -rw-r--r-- 1 ben ben 63492 2011-02-09 09:53 vmware.log

    Now delete unnecessary files like the logs (from here on, the commands assume you're working inside the new directory)

    $ cd new.machine
    $ rm *.log

    Do a massive rename: the source/previous virtual machine was named “Debian 6”, and you need to replace that with “new.machine” (our new name)

    $ mv "Debian 6.nvram" new.machine.nvram
    $ mv "Debian 6-s001.vmdk" new.machine-s001.vmdk
    $ mv "Debian 6-s002.vmdk" new.machine-s002.vmdk
    $ mv "Debian 6-s003.vmdk" new.machine-s003.vmdk
    $ mv "Debian 6-s004.vmdk" new.machine-s004.vmdk
    $ mv "Debian 6-s005.vmdk" new.machine-s005.vmdk
    $ mv "Debian 6.vmdk" new.machine.vmdk
    $ mv "Debian 6.vmsd" new.machine.vmsd
    $ mv "Debian 6.vmx" new.machine.vmx
    $ mv "Debian 6.vmxf" new.machine.vmxf
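The ten renames above all follow one pattern, so they can also be done in a single shell loop. This is a sketch meant to be run inside the new.machine directory; it uses shell prefix stripping instead of separate mv commands.

```shell
# Rename every "Debian 6*" file to "new.machine*" using prefix stripping.
OLD="Debian 6"; NEW="new.machine"
for f in "$OLD"*; do
    [ -e "$f" ] || continue    # skip when nothing matches the glob
    mv "$f" "$NEW${f#"$OLD"}"
done
```

The `${f#"$OLD"}` expansion removes the old prefix from each filename, so the suffix (`.vmx`, `-s001.vmdk`, etc.) is preserved automatically.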

    NOTE: the .vmxf file is present in newer releases of VMWare appliances; if you don't have it, just ignore it

    Now it's time to change the information inside your virtual machine; you just need your favorite text editor to change a few things. Keep these files as they are:


    The NVRAM file is your BIOS/nvram; it's a binary file and you don't need to change it. The *.vmdk files are your disks: you only need to change the information header of the disk (new.machine.vmdk) and leave the other VMDK files as they are (new.machine-s*.vmdk). The VMSD file is usually empty, so there's no need to change it.


    Modify your hard disks

    If you have more than one hard disk you'll have more than one .VMDK master file; you need to apply a few mods to each. Here's the content of the original file (it was “Debian 6.vmdk”, now “new.machine.vmdk”)

    # Disk DescriptorFile

    # Extent description
    RW 4192256 SPARSE "Debian 6-s001.vmdk"
    RW 4192256 SPARSE "Debian 6-s002.vmdk"
    RW 4192256 SPARSE "Debian 6-s003.vmdk"
    RW 4192256 SPARSE "Debian 6-s004.vmdk"
    RW 8192 SPARSE "Debian 6-s005.vmdk"

    # The Disk Data Base
    ddb.virtualHWVersion = "7"
    ddb.longContentID = "86aa7ebbb50ab88b973ea60271ad0a67"
    ddb.uuid = "60 00 C2 9f 9a e3 43 6a-ea 70 c7 fa 35 72 7c 04"
    ddb.geometry.cylinders = "1044"
    ddb.geometry.heads = "255"
    ddb.geometry.sectors = "63"
    ddb.adapterType = "lsilogic"

    Row order and content may vary; VMWare configuration files don't have a fixed order, so you may see a different row order, comments and other stuff inside it. Here's what you need to change:

    RW 4192256 SPARSE "new.machine-s001.vmdk"
    RW 4192256 SPARSE "new.machine-s002.vmdk"
    RW 4192256 SPARSE "new.machine-s003.vmdk"
    RW 4192256 SPARSE "new.machine-s004.vmdk"
    RW 8192 SPARSE "new.machine-s005.vmdk"

    So all you need to do is change the references to the physical hard disk files, nothing more: just change the lines above in your new.machine.vmdk file and nothing else.
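If you prefer not to open an editor, the same substitution can be done with sed (GNU sed's -i shown; on BSD/macOS it's `sed -i ''`). The printf line below just fabricates a one-line sample descriptor so the command can be tried safely outside a real VM directory.

```shell
# Fabricate a sample descriptor line, then rewrite the old disk name in place.
printf 'RW 4192256 SPARSE "Debian 6-s001.vmdk"\n' > new.machine.vmdk
sed -i 's/Debian 6/new.machine/g' new.machine.vmdk
cat new.machine.vmdk
```

On a real clone you would skip the printf and run the sed command directly against your copied new.machine.vmdk.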


    Other Descriptors

    It's time to change the VMXF file (extra configs from VMWare); if you don't have it, just skip this step. Your new.machine.vmxf file could look something like this:

    52 62 73 9d 7f 10 1b 58-8e 3c 8e 15 8e ef f4 a3 Debian 6.vmx

    It's actually an XML file (the tags are omitted in the listing above); content and VM IDs may vary a little, but that doesn't matter. All you need to do here is replace this string:

    Debian 6.vmx

    with this one:

    new.machine.vmx

    and nothing more. Here's the result:

    52 62 73 9d 7f 10 1b 58-8e 3c 8e 15 8e ef f4 a3 new.machine.vmx

    VMX Main configuration file

    new.machine.vmx is the machine's main configuration file; inside it you'll find the hardware description and file references. It may vary a lot according to the virtual hardware, player version and hardware machine version. Here's a copy of my new.machine.vmx (originally Debian 6.vmx)

    .encoding = "UTF-8"
    config.version = "8"
    virtualHW.version = "7"
    scsi0.present = "TRUE"
    scsi0.virtualDev = "lsilogic"
    memsize = "256"
    scsi0:0.present = "TRUE"
    scsi0:0.fileName = "Debian 6.vmdk"
    ethernet0.present = "TRUE"
    ethernet0.connectionType = "bridged"
    ethernet0.wakeOnPcktRcv = "FALSE"
    ethernet0.addressType = "generated"
    pciBridge0.present = "TRUE"
    pciBridge4.present = "TRUE"
    pciBridge4.virtualDev = "pcieRootPort"
    pciBridge4.functions = "8"
    pciBridge5.present = "TRUE"
    pciBridge5.virtualDev = "pcieRootPort"
    pciBridge5.functions = "8"
    pciBridge6.present = "TRUE"
    pciBridge6.virtualDev = "pcieRootPort"
    pciBridge6.functions = "8"
    pciBridge7.present = "TRUE"
    pciBridge7.virtualDev = "pcieRootPort"
    pciBridge7.functions = "8"
    vmci0.present = "TRUE"
    roamingVM.exitBehavior = "go"
    displayName = "Debian 6"
    guestOS = "other26xlinux"
    nvram = "Debian 6.nvram"
    virtualHW.productCompatibility = "hosted"
    gui.exitOnCLIHLT = "FALSE"
    extendedConfigFile = "Debian 6.vmxf"
    ethernet0.generatedAddress = "00:0c:29:b1:8b:e6"
    uuid.location = "56 4d 05 92 24 e8 b0 b3-f7 37 1f d9 51 b1 8b e6"
    uuid.bios = "56 4d 05 92 24 e8 b0 b3-f7 37 1f d9 51 b1 8b e6"
    cleanShutdown = "TRUE"
    replay.supported = "FALSE"
    replay.filename = ""
    scsi0:0.redo = ""
    pciBridge0.pciSlotNumber = "17"
    pciBridge4.pciSlotNumber = "21"
    pciBridge5.pciSlotNumber = "22"
    pciBridge6.pciSlotNumber = "23"
    pciBridge7.pciSlotNumber = "24"
    scsi0.pciSlotNumber = "16"
    ethernet0.pciSlotNumber = "32"
    vmci0.pciSlotNumber = "34"
    vmotion.checkpointFBSize = "16777216"
    ethernet0.generatedAddressOffset = "0"
    vmci0.id = "1370590182"
    vmi.present = "FALSE"
    ide1:0.present = "FALSE"
    floppy0.present = "FALSE"

    Now let's focus on the changes; they're basically straightforward, but a few deserve a mention:

    Change the previous VMDK reference to the newly created hard disk: replace “Debian 6.vmdk” with “new.machine.vmdk” everywhere in the file (just one occurrence)

    scsi0:0.fileName = "new.machine.vmdk"

    Now it's time to change the label shown for your new machine in the Server (ESX, GSX, VSphere) or Player to your favorite name (“My new Machine Name” in my case):

    displayName = "My new Machine Name"

    NVRam file with the new name:

    nvram = "new.machine.nvram"

    Extended configuration file (only if this is present) with the new one:

    extendedConfigFile = "new.machine.vmxf"

    Now change the Ethernet MAC address to a new one, or your machines won't be able to coexist on the same network with the same address (just as in the real world); respect the MAC address notation and change something random in it

    ethernet0.generatedAddress = "00:0c:29:b1:ab:ab"
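One way to generate a random but well-formed replacement, rather than typing hex by hand, is to hash something random and keep VMware's usual 00:0c:29 prefix. This is just a sketch (GNU sed's -r shown), not the method VMware itself uses:

```shell
# Randomize the last three octets from an md5 hash, keep the VMware OUI.
# The sed turns the hash into colon-separated pairs; cut keeps three of them.
SUFFIX=$(echo "$RANDOM$RANDOM" | md5sum | sed -r 's/(..)/\1:/g' | cut -c1-8)
NEWMAC="00:0c:29:$SUFFIX"
echo "$NEWMAC"
```

Paste the result into the ethernet0.generatedAddress line; any value with the right notation works as long as it's unique on your network.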

    You may change the UUIDs inside the file, but you don't need to bother with them. Save everything and import your newly created/cloned machine into your favorite player/server.


    Ready, Set, Go!

    Locate your new .VMX file and open it with your Server/Player; you'll see your new machine in the remote/local repository, ready to start.

    We didn't change the machine UUID because it's not necessary; VMWare will do it for us. When you run the machine for the first time you'll see a dialog asking whether you moved or copied it.

    Just select the “I copied it” button and VMWare will generate the UUID for the new machine. Now the machine runs an exact copy of your previous one, with the same operating system and configuration inside. Please read these hints to solve possible problems:

    • If you're using a static IP address you need to change it in your new machine to avoid conflicts with the previous one (obviously)

    • If you're using a MS Windows OS you need to change the machine name or you'll get a “name duplicated” error when you start the machine; just change the name and reboot

    • If you're serving clients with a basic service (DHCP, DNS, MS Domain Controller) you need to stop it, or you'll have network trouble (as with real servers) due to two instances of the service running on the same network (two DHCP servers on the same net are a bad thing...)

    • udev troubles and Linux networking: please read below if your Linux machine runs perfectly but has no networking capabilities


    No networking ? Please read

    Everything is fine with your new virtual machine but... you don't have a properly configured network card? Keep reading.

    If you're using udevd you may have a problem; it's just a minor trouble, as you'll see.

    udevd records plugged-in network cards in a configuration file, generally located in /etc/udev/rules.d; there's a file called “z25_persistent-net.rules” or “70-persistent-net.rules” (Debian or Gentoo respectively) or something like that. It's not hard to find (or let me know and I'll add your distribution's name here); generically it's called *persistent-net.rules. Let's look at it to understand how it works:

    ~$ cat /etc/udev/rules.d/70-persistent-net.rules
    # This file was automatically generated by the /lib64/udev/write_net_rules
    # program run by the persistent-net-generator.rules rules file.
    # You can modify it, as long as you keep each rule on a single line.

    # PCI device 0x10ec:0x8168 (r8169)
    SUBSYSTEM=="net", DRIVERS=="?*", ATTR{address}=="00:21:85:c1:79:37", KERNEL=="eth*", NAME="eth0"


    Here's the “old” network card (the one you cloned); it's called “eth0” and this card is not available any more (you've just changed the MAC address). You may:

    • Delete the line for the “eth0” device

    • Change the line with the proper mac address (the one you've changed in the VMX file)

    I usually prefer to delete the line, so a new one will be created for you on the next reboot.
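Deleting the stale rule can also be done non-interactively with sed. The demo below works on a local sample file with the MAC from the listing above; on a real system you would point the same command at /etc/udev/rules.d/70-persistent-net.rules (as root) and use your own old MAC.

```shell
# Create a sample rules file, then drop the stale eth0 rule by its old MAC
# (GNU sed -i). udev regenerates a fresh rule on the next boot.
printf 'SUBSYSTEM=="net", ATTR{address}=="00:21:85:c1:79:37", NAME="eth0"\n' > 70-persistent-net.rules
sed -i '/00:21:85:c1:79:37/d' 70-persistent-net.rules
wc -l < 70-persistent-net.rules
```

Matching on the MAC (rather than on "eth0") is deliberate: it only removes the rule tied to the card that no longer exists.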


    NOTE: If you have a line with an “eth1” device and you don't have two network cards, it means udevd has created a line for your new (as far as it knows) network card and left the previous one there, already configured; you may remove the eth0 line and rename eth1 to eth0, OR delete both lines. udevd will recreate what it needs on the next reboot. Don't change your network configuration (/etc/network and so on); just leave udevd with the proper card and you'll see it running fine from the next boot.

    NOTE SAMPLE: If you're following my example with a Debian 6 installation you don't need to worry about udev; previous versions (etch, lenny, …) are affected by this.

    UPDATE: After a few comments reported on this blog I've decided to write a new blog with an automated script; the script does everything reported here by itself. Check it out here.

    I hope this small guide will assist you if you decide to clone your VMWare machines on your own; the file formats have stayed basically the same for a long time, and this is what I do for basic sysadmin work when I don't have the hypervisor or proper tools with me.

    Share your comments

    Hope it helps


    Andrea (Ben) Benini


