
Using Tesseract on Ubuntu

In this tutorial, I will show you how to install and use Tesseract, Google’s open source OCR engine. First, let’s go step by step through installing Tesseract on Ubuntu.

1. Installation

1.1 Installing Dependencies

First of all, we need to install the dependencies that Tesseract requires. Please do not skip any command.

$ sudo apt-get install libpng-dev libjpeg-dev libtiff-dev zlib1g-dev
$ sudo apt-get install gcc g++
$ sudo apt-get install autoconf automake libtool checkinstall

We need the image processing toolkit Leptonica to build Tesseract. At the time of writing, the latest version of Leptonica was 1.73. Please check Leptonica’s official website for the current release, since installing the latest version is highly recommended.

$ cd ~
$ wget http://www.leptonica.org/source/leptonica-1.73.tar.gz
$ tar -zxvf leptonica-1.73.tar.gz
$ cd leptonica-1.73 
$ ./configure
$ make
$ sudo checkinstall
$ sudo ldconfig


1.2 Compiling Tesseract

First make sure that you have git installed on your machine. If not, you can install it by running:

$ sudo apt-get install git

Now clone Tesseract and build it.

$ cd ~ 
$ git clone https://github.com/tesseract-ocr/tesseract.git
$ cd tesseract
$ ./autogen.sh
$ ./configure
$ make
$ sudo make install 
$ sudo ldconfig


1.3 Installing Language Components

Now download the language data for the OCR engine. The tessdata repository holds the data files for the supported languages, including English:

$ cd ~
$ git clone https://github.com/tesseract-ocr/tessdata.git 
$ sudo mv ~/tessdata/* /usr/local/share/tessdata/
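Before moving on, it can help to confirm that the binary and the language data ended up where the steps above put them. The following is a minimal sketch, not part of the original tutorial. It assumes Python 3, the /usr/local/share/tessdata directory used above, and that English data is stored as eng.traineddata; the function names are hypothetical.

```python
import os
import shutil


def missing_language_files(tessdata_dir, languages=("eng",)):
    """Return the language codes whose .traineddata file is absent."""
    missing = []
    for lang in languages:
        path = os.path.join(tessdata_dir, lang + ".traineddata")
        if not os.path.isfile(path):
            missing.append(lang)
    return missing


def check_install(tessdata_dir="/usr/local/share/tessdata"):
    """Print a short report on the Tesseract binary and language data."""
    binary = shutil.which("tesseract")  # None when the binary is not on PATH
    print("tesseract binary:", binary or "not found on PATH")
    missing = missing_language_files(tessdata_dir)
    print("missing language data:", missing or "none")


if __name__ == "__main__":
    check_install()
```

If the report lists a missing language, re-check the mv command above before going further.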

Tesseract has finally been installed and configured! Now let’s run it.

2. Running It

Select an image that contains text, then run this command in the console (assuming img.png is the input filename):

$ tesseract img.png out

The recognized text will be saved to out.txt in the same folder.
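The same command can be driven from a script. The sketch below is an illustration under stated assumptions rather than part of the original tutorial: it assumes Python 3, that tesseract is on the PATH, and it uses the CLI’s -l option to choose the language. The helper names build_command and ocr_to_text are hypothetical.

```python
import subprocess


def build_command(image_path, output_base, lang="eng"):
    """Build the argument list for: tesseract <image> <output_base> -l <lang>."""
    return ["tesseract", image_path, output_base, "-l", lang]


def ocr_to_text(image_path, output_base="out", lang="eng"):
    """Run Tesseract, then return the contents of <output_base>.txt."""
    # check=True raises CalledProcessError if tesseract exits with an error.
    subprocess.run(build_command(image_path, output_base, lang), check=True)
    with open(output_base + ".txt", encoding="utf-8") as handle:
        return handle.read()


if __name__ == "__main__":
    # Only prints the command; call ocr_to_text("img.png") to actually run it.
    print(build_command("img.png", "out"))
```

Passing the arguments as a list avoids shell quoting problems when the image filename contains spaces.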

3. Using Python and Tesseract

Python-tesseract (pytesseract) is a Python wrapper for Google’s Tesseract-OCR. First, install pip:

$ sudo apt-get update
$ sudo apt-get -y install python-pip

Then install pytesseract:

$ sudo pip install pytesseract

Alternatively, if you want to download and install it from source:

$ git clone https://github.com/madmaze/pytesseract.git
$ cd pytesseract
$ sudo python setup.py install

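Now let’s write a simple Python program to read a simple captcha:

from PIL import Image  # provided by Pillow; a bare "import Image" fails there
from pytesseract import image_to_string

# Open the captcha and print the text Tesseract recognizes in it.
print(image_to_string(Image.open('captcha.png')))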

Assuming you saved the program as test.py, run it and the recognized text will be printed on your screen:

$ python test.py
be5f
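OCR output for a captcha may come back with surrounding whitespace or stray punctuation. As a hedged sketch that is not part of the original tutorial, the helper below keeps only letters and digits before the text is used. It assumes Python 3; clean_captcha_text and read_captcha are hypothetical names, and the filtering rule is an assumption that suits alphanumeric captchas only.

```python
def clean_captcha_text(raw):
    """Keep only letters and digits from raw OCR output."""
    return "".join(ch for ch in raw if ch.isalnum())


def read_captcha(image_path):
    """OCR an image with pytesseract and return the cleaned text."""
    # Imported lazily so clean_captcha_text works without the OCR stack.
    from PIL import Image
    from pytesseract import image_to_string
    return clean_captcha_text(image_to_string(Image.open(image_path)))


if __name__ == "__main__":
    # Demonstrates the cleaning step on a hand-written sample string.
    print(clean_captcha_text(" be5f\n"))
```

With the stack installed, read_captcha('captcha.png') would apply the same cleaning to the real OCR result.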


This article was contributed by a student at Holberton School

What is DevOps? Gene Kim Explains

Gene Kim is an author of the popular DevOps Novel, The Phoenix Project, and the upcoming DevOps Handbook, currently scheduled for release in October. He was formerly the founder and CTO of Tripwire, but these days you can find him writing books, organizing the DevOps Enterprise Summit, and working on research and other projects as a co-founder of IT Revolution.

Linux.com: Why are so many organizations embracing DevOps?

Gene Kim: I think the simplest answer is that the business value of adopting DevOps is even higher than we thought! From 2013 through 2016, as part of the Puppet “State Of DevOps Report” initiative, along with Jez Humble, Dr. Nicole Forsgren, Alanna Brown and Nigel Kersten, we’ve surveyed over 25,000 technology professionals with the goal of better understanding the health and habits of organizations at all stages of DevOps adoption.

The first surprise the data from the 2015 and 2016 reports revealed was how much the high-performing organizations using DevOps practices were outperforming their non-high-performing peers in the following areas:

  • Throughput metrics
    • More frequent code and change deployments (200x more frequent)
    • Faster code and change deployment lead time (255x faster)
  • Reliability metrics
    • More successful production deployments (3x lower change failure rate)
    • Faster mean time to restore service (24x faster MTTR)
  • Organizational performance metrics
    • Productivity, market share, and profitability goals (2 times more likely to exceed)
    • Market capitalization growth (50% higher over three years)

In other words, the high performers were both more agile and more reliable. Furthermore, high performers were twice as likely to exceed profitability, market share, and productivity goals, and for those organizations that provided a stock ticker symbol, we found that the high performers had 50% higher market capitalization growth over three years. They also had higher employee job satisfaction and lower rates of employee burnout.

All that makes for a compelling reason for doing things a better way than we’ve traditionally done for decades!

Linux.com: Why are individuals interested in participating?

Gene: That is a great question! I think there are two findings that came out of the 2015 and 2016 reports that may explain why so many of us are so passionate about DevOps. We found in 2015, high performers had significantly lower levels of burnout than lower performers. Burnout is an important issue in IT, with serious repercussions for the mental and physical health of practitioners. Research shows that stressful jobs can be as bad for physical health as smoking and obesity. Symptoms of burnout include feeling exhausted, cynical or ineffective; feeling little or no sense of accomplishment in your work; and feelings about your work negatively affecting the rest of your life. In extreme cases, burnout can lead to family issues, severe clinical depression and even suicide.

That’s on the negative side. On the positive side, we found in 2016 that employees in high performers were 2.2x more likely to recommend their organization to a friend as a great place to work. The specific measure is called the “employee Net Promoter Score,” which other studies have shown is correlated with better business outcomes.

I think this shows that work is simply more fun and satisfying when you can quickly see the outcomes of your work, whether you’re Dev, Test, Ops, or even Information Security!

Linux.com: What is the overwhelming hurdle?

Gene: This was a question that we wanted to understand, as well. For the last three years, I’ve been hosting the DevOps Enterprise Summit, which is focused on leaders who are driving DevOps transformations in large, complex organizations. Over the years, we’ve had over 150 speakers present their amazing experience reports, each given in a very specific form:

  • what business problem were they trying to solve
  • where did they start and why
  • what did they do
  • what were their outcomes and what did they learn
  • what do they still not know how to do and what are they looking for help with

The last question was there so that we could understand the largest obstacles facing the DevOps Enterprise community. By knowing this, we can create a research agenda of the problems that we need better answers on. Here are some of the top issues that have come up in the last three years:

  • How do we build automated testing for legacy applications?
  • What are modern architectural and technical practices that every technology leader needs to know about? What are the best ways to transition existing practices, including metrics, reskilling the workforce, and mitigating risks?
  • What do the organization charts look like for organizations successfully adopting DevOps? What are the respective roles and responsibilities, and how have they changed from more traditional IT organizations?
  • What are effective strategies and methods for leading change in large organizations?
  • What are concrete ways for DevOps to bridge the information security and compliance gap, to show auditors and regulators that effective controls exist to effectively prevent, detect and correct problems?

Over the years, we’ve assembled some of the best thinkers and doers in the community to generate written guidance on these topics, and released them at the DevOps Enterprise Summit. We have made their guidance available as a series of documents.

Linux.com: What advice would you give to people who want to get started in DevOps?

Gene: In my mind, there’s never been a more fun time to be in technology! So many of the things that prevented us from building cool things are so easy now, whether it’s development frameworks, continuous deployment pipeline tools, deployment automation, monitoring, etc. And especially in Operations, I think the best times are ahead of us, not behind us.

I think the fastest way to learn is conferences, because you’re surrounded by kindred spirits and fellow travelers. One of my favorite sayings is, “You’re only as smart as the top five people you hang out with,” and conferences are a great place to find these people. Among my favorites are the Velocity Conference, GOTOcon, and the DevOps Enterprise Summit.

Also, books are great. Some of my favorites include

  • Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation by Jez Humble and David Farley
  • Kanban: Successful Evolutionary Change for Your Technology Business by David J. Anderson
  • The Goal: A Process of Ongoing Improvement by Dr. Eliyahu Goldratt
  • Release It!: Design and Deploy Production-Ready Software by Michael Nygard
  • And of course, the upcoming DevOps Handbook: How to Create World-Class Agility, Reliability, and Security in Technology Organizations by Jez Humble, Patrick Debois, John Willis, and me.

Read more interviews with DevOps experts:

Bridget Kromhout is a global core organizer for DevOpsDays, is on the program committee for Velocity, and is a Principal Technologist for Cloud Foundry at Pivotal.

Mark Imbriaco has spent the past 20 years working at some of the most interesting and innovative companies in the industry, including 37Signals, GitHub, and DigitalOcean before moving on to become Co-Founder and CEO at Operable. 

 

IoT and Multi-Cloud Take Center Stage at Upcoming Cloud Foundry Summit

Cloud Foundry Foundation, a collaborative project of the Linux Foundation, is organizing the next Cloud Foundry Summit in Frankfurt, Germany, on September 26 – 28, 2016. We caught up with Sam Ramji, the CEO of Cloud Foundry Foundation, to learn more about the upcoming summit. Here is an edited version of that interview:

The Cloud Foundry European Summit is focused on IoT and multi-cloud – two major IT trends Cloud Foundry is at the heart of. For multi-cloud, it’s simple – Cloud Foundry is the easiest way for Global 2000 companies to be up and running workloads on whichever public or private clouds best fit what that business is trying to do. For IoT, there is a clear spike in growth of this industry across Europe. With companies like Bosch, Siemens, GE, and VW (to name a few) using Cloud Foundry as the platform for IoT, and Germany’s focus on Industry 4.0, it’s sure to be a big topic of conversation for our Frankfurt event. 

Read more at CIO

Getting Blockchain Technology Enterprise-Ready

Editor’s Note: This article is paid for by IBM as a Diamond-level sponsor of LinuxCon North America, held Aug. 22-24, 2016, and was written by Linux.com.

Blockchain technology first burst onto the scene as the underpinning of Bitcoin digital currency. Since then, open source distributed ledger technology has continued to evolve into an unparalleled asset tracker. It brings new efficiencies and much-needed transparency to online transactions in a world where assets move and change hands at Internet speeds.

Blockchain is poised to go mainstream and forever change how business is done. It now has a defined purpose in tracking asset exchanges and histories in nearly every industry from banking, to wholesale and retail, supply chains, manufacturing, health care, legal, the Internet of Things, government, and more.

In this interview Donna Dillenberger, an IBM Fellow and part of the Watson Research Center, discusses where blockchain technology stands today. IBM is also a member of The Linux Foundation’s Hyperledger project – the distributed blockchain technology – and we asked her about that project, too.

Linux.com: IBM contributed to The Linux Foundation’s Hyperledger project. Tell us about that and how that project is being received thus far.

Donna Dillenberger:  Hyperledger is one of the most popular open source projects in history. In the first week of its launch, hundreds of companies donated code. IBM alone donated 44,000 lines of code. It provides a vehicle for anyone to collaborate on features and the results are so much better than what any one vendor could do alone.

Linux.com: IBM has made several blockchain-related announcements recently with heavy focus on the cloud aspect and security. Security is often approached as a last minute paste-on in a number of technologies, but considering that blockchain’s applications concern high-value transactions, and security was its claim to fame, what was overlooked in terms of security or why does it need more security?

Dillenberger: Blockchain was originally designed for digital currency transactions, specifically Bitcoin. But now blockchain’s uses have expanded to include organizations, such as health care and government, that have additional and different security requirements and specific compliance needs. To ensure that blockchain is truly enterprise-ready for different verticals and different applications with different rules, more security and compliance issues must be firmly addressed.

For example, blockchain members may require their data to be encrypted and not visible to all public entities. Hyperledger blockchain provides a configurable option to encrypt data saved on the blockchain.

Blockchain participants may also require that all records be signed, so that anonymous users are not allowed. Hyperledger blockchain provides anonymity and, when configured, requires that records be digitally signed. Identity verification is also very important. For verification, users need to see that a blockchain transaction comes from you and not someone who’s impersonating you.

Blockchain participants may also want to specify which subset of blockchain users can see data on the blockchain. This is also a configuration option that allows Hyperledger blockchain members to set permission controls on the blockchain data.

Linux.com: Does blockchain have consumer applications as well? Or, if not yet, what might future consumer applications look like?

Dillenberger:  Enterprises service consumers, so they’re actually one and the same in that aspect. Ensuring that data is secure and accurate, products are authentic and up to par, services are delivered as promised, and that transactions are fast and secure, benefits both enterprises and consumers.

The interesting thing is that we consumers can own our own data again with blockchain. Consumers can put their medical records, car data, and other information in blocks and ensure it is all accurate and never tampered with, which they can then share with enterprises or health care professionals or other entities as they see fit and for a variety of purposes.

Blockchain can also help consumers in other ways. For example, users can verify where the ingredients in their food come from and they can see information about their food down to the farm level. They can see if pesticides or antibiotics were used, whether the food was raised in an environmentally friendly or humane way, and whether employees along the chain were treated fairly and working in good conditions.

The same is true of other goods. For example, consumers could identify which diamonds came from conflict free mines and where they were polished and traded and certified.  We have one IBM customer building a digital ledger for diamonds using blockchain.

Consumers could also see who owned an asset in the past that they are buying now – for example, a painting, house or a used car.

The records on blockchain are protected so you can look at reliable information and make purchasing decisions based on the things you care about.

Linux.com: Since blockchain is so useful for so many different use cases and industries, why isn’t it mainstream already?

Dillenberger:  It’s technically challenging to add the needed security and compliance layers, as well as new sophisticated features, to blockchain to optimize it for enterprise use. Here at IBM it took many months to do the work thus far. So adapting it to many different use cases and applications takes a while.

But organizations in many industries are testing blockchain technology now to see where and how it best fits for them. The interest and response has been immense. Just like we saw with the popularity of The Linux Foundation’s Hyperledger project from its start, you can expect blockchain to become increasingly prevalent soon.

 

This article is sponsored by IBM LinuxONE, an open-source Linux server. Learn more about IBM LinuxONE for Blockchain, and receive a complimentary report from Constellation Research on Blockchain Security. 

 

 

VMworld 2016: VMware Pushes Hybrid Cloud and SDDC with New Cross-Cloud Architecture

VMware is launching a new Cross-Cloud architecture geared toward improving software-defined data center (SDDC) and hybrid cloud deployments, the company announced on Monday at the 2016 VMworld conference in Las Vegas, NV.

According to VMware documents, the new architecture could extend the company’s current hybrid cloud strategy by more effectively allowing customers to run apps in multiple clouds, within a common operating environment. This includes apps running in public clouds like Amazon Web Services (AWS) and Microsoft Azure.

The architecture “enables consistent deployment models, security policies, visibility, and governance for all applications, running on-premises and off, regardless of the underlying cloud, hardware platform or hypervisor,” according to a VMware press release.

Read more at Tech Republic

Carriers Going All-In: How NFV And SDN Will Evolve The Telecom Industry

Service providers have historically relied on dedicated hardware to deliver their cloud-based functions. But software-defined networking (SDN) and network functions virtualization (NFV) are freeing up carriers to use virtualized appliances or less expensive hardware to deliver the same services. As such, most service providers — 100 percent, to be exact — say they have plans to inject NFV into their networks, if they haven’t already, according to a recent report from market research firm IHS Markit.

The opportunities around virtualization are far-reaching for service providers, and they are trickling down to their partners and end customers. Here are a few of the carriers that are aggressively investing in, and evolving their networks with the help of, SDN and NFV technology.

Read more at CRN

‘MegaMIMO 2.0’ Wireless Routers Work Together to Triple Bandwidth and Double Range

Wireless interference is one of those things that we tend to not think about, because, well, we can’t see it. But routers are all over the place, sometimes several in a room when you’re in an office, conference, or campus — and make no mistake, it’s an epic battle at the frequencies they share.

Some enterprising researchers have found a way to make those routers work together, though. Dina Katabi and her team at MIT’s Computer Science and Artificial Intelligence Laboratory call it MegaMIMO 2.0, and they claim some pretty serious improvements: three times better data transfer speeds and doubled range.

Read more at TechCrunch

On Complexity in Big Data

In February of this year, I published the article, Is my developer team ready for big data?, with a figure representing the subjective complexity of mobile, cloud, big data, and NoSQL technologies…

My editor for that article, Marie Beaugureau, pushed back on the figure—and rightly so. What’s the scale we’re using here? What makes big data and NoSQL more complex than cloud or mobile?

I attribute the complexity of big data to two primary reasons. The first is that you need to know 10 to 30 different technologies just to create a big data solution. The second is that distributed systems are just plain hard.

Read more at O’Reilly

Exascale Computing – What are the Goals and the Baseline?

In this video from the 4th Annual MVAPICH User Group, Thomas Schulthess from CSCS presents: Exascale Computing – What are the Goals and the Baseline?

“Implementation of exascale computing will be different in that application performance is supposed to play a central role in determining the system performance, rather than just considering floating point performance of the high-performance Linpack benchmark. This immediately raises the question as to what the yardstick will be, by which we measure progress towards exascale computing.”

Read more at insideHPC

8 Best and Most Popular Linux Desktop Environments of All Time

One exciting aspect of Linux, unlike Windows and Mac OS X, is its support for numerous desktop environments. This lets desktop users choose the desktop environment that best suits their computing needs.

A desktop environment is an implementation of the desktop metaphor, built as a collection of user and system programs that run on top of an operating system and share a common GUI (Graphical User Interface), also known as a graphical shell.


Read complete article