Community Blogs



Set Up Apache 2.4 and PHP-FPM with mod_proxy_fcgi on Ubuntu 13.10

The module mod_proxy_fcgi is new in Apache 2.4. It allows Apache to connect to, and forward requests to, an external FastCGI process manager such as PHP-FPM, which gives a complete separation between the running of PHP scripts and Apache. Earlier we had to use modules like mod_fcgid and mod_fastcgi, which all had some limitations: mod_fcgid, for example, did not properly utilise the process-management capability of php-cgi, whereas mod_fastcgi is a third-party module. With the arrival of mod_proxy_fcgi, Apache...
 

Three Best Network Programming Debugging Tools
==============================================

It is always time consuming to track down problems if we don't use the right network debugging tools when doing socket programming or trying to run a client-server program for the first time.

When we do network programming, we sometimes want to know why send() from the client or server is failing, why the server program cannot be restarted, or whether some other process is already using the port we are planning to use for the server.

There are many such tools available on Linux today, but this article covers the three most important ones.
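Throughout the netstat, tcpdump and lsof examples below, the program being debugged is a TCP server called servermine listening on port 6688. As a point of reference, here is a minimal, hypothetical Java sketch of such a server (the name and port simply match the sample outputs; the setReuseAddress() call relates to the "why can't I restart my server" question above, since a freshly restarted server can otherwise fail to bind while an old socket is still in TIME_WAIT).

// Minimal sketch of a TCP server like the hypothetical "servermine" program
// that appears in the netstat/lsof outputs below (port 6688 matches them).
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class ServerMine {
    public static void main(String[] args) throws Exception {
        ServerSocket server = new ServerSocket();
        // Without SO_REUSEADDR, restarting the server right after it exits can
        // fail with "Address already in use" while old sockets sit in TIME_WAIT.
        server.setReuseAddress(true);
        server.bind(new InetSocketAddress(6688));
        while (true) {
            try (Socket client = server.accept()) {
                OutputStream out = client.getOutputStream();
                out.write("hello from servermine\n".getBytes("UTF-8"));
            }
        }
    }
}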

I. netstat
=========

The netstat command displays various network-related information such as network connections, routing tables, interface statistics, masquerade connections, and multicast memberships.

1) Show the list of network interfaces

OpenSuse12.3#netstat -i
Kernel Interface table
Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
eth0 1500 0 0 0 0 0 0 0 0 0 BMU
lo 65536 0 45 0 0 0 45 0 0 0 LRU
wlan0 1500 0 25092 0 0 0 22958 0 0 0 BMRU

2) List all ports (both listening and non-listening: TCP, UDP, Unix)

OpenSuse12.3#netstat -a
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 localhost:smtp *:* LISTEN
tcp 0 0 192.168.1.5:6688 *:* LISTEN
tcp 0 0 192.168.1.5:49875 safebrowsing:www-http ESTABLISHED
tcp 0 1 192.168.1.5:60804 fls.doubleclic:www-http FIN_WAIT1
tcp 0 0 192.168.1.5:43589 safebrowsing.c:www-http ESTABLISHED
tcp 0 0 *:33532 *:* LISTEN
unix 2 [ ACC ] STREAM LISTENING 8645 /var/run/sdp
unix 2 [ ] DGRAM 12241

3) List only TCP ports

OpenSuse12.3#netstat -at
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 localhost:smtp *:* LISTEN
tcp 0 0 192.168.1.5:6688 *:* LISTEN
tcp 0 0 localhost:ipp *:* LISTEN
tcp 0 0 *:33532 *:* LISTEN

Similarly for UDP, "netstat -au"

4) List the sockets which are in the listening state

OpenSuse12.3#netstat -l
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 0 0 localhost:smtp *:* LISTEN
tcp 0 0 192.168.1.5:6688 *:* LISTEN
tcp 0 0 *:52980 *:* LISTEN
tcp 0 0 localhost:ipp *:* LISTEN
tcp 0 0 *:33532 *:* LISTEN
Active UNIX domain sockets (only servers)
Proto RefCnt Flags Type State I-Node Path
unix 2 [ ACC ] STREAM LISTENING 10227 private/scache
unix 2 [ ACC ] STREAM LISTENING 11714 @/tmp/dbus-NmyF9Qx2gH

List only listening TCP Ports using netstat -lt
List only listening UDP Ports using netstat -lu
List only the listening UNIX Ports using netstat -lx

5) Display PID and program names in netstat output using netstat -p

6) Print netstat information continuously
netstat -c

7) Find out on which port a program is running

OpenSuse12.3#netstat -ap | grep servermine
tcp 0 0 192.168.1.5:6688 *:* LISTEN 2135/servermine

II. tcpdump
===========
tcpdump allows us to capture all packets that are sent and received. This helps us see which TCP segments are exchanged (SYN, FIN, RST and so on) so that we can understand the root cause of an issue.

1) Capture packets from a particular Ethernet interface using tcpdump -i
Below is a tcpdump capture of a simple TCP client and server exchange, from SYN to FIN/ACK with one data packet in between.

OpenSuse12.3#tcpdump -i lo
11:05:27.026304 IP 192.168.1.5.34289 > 192.168.1.5.6688: Flags [S], seq 1990318384, win 43690, options [mss 65495,sackOK,TS val 6116309 ecr 0,nop,wscale 7], length 0
11:05:27.026331 IP 192.168.1.5.6688 > 192.168.1.5.34289: Flags [S.], seq 3856734826, ack 1990318385, win 43690, options [mss 65495,sackOK,TS val 6116309 ecr 6116309,nop,wscale 7], length 0
11:05:27.026357 IP 192.168.1.5.34289 > 192.168.1.5.6688: Flags [.], ack 1, win 342, options [nop,nop,TS val 6116309 ecr 6116309], length 0
11:05:27.026689 IP 192.168.1.5.6688 > 192.168.1.5.34289: Flags [P.], seq 1:27, ack 1, win 342, options [nop,nop,TS val 6116310 ecr 6116309], length 26
11:05:27.026703 IP 192.168.1.5.34289 > 192.168.1.5.6688: Flags [.], ack 27, win 342, options [nop,nop,TS val 6116310 ecr 6116310], length 0
11:05:27.026839 IP 192.168.1.5.34289 > 192.168.1.5.6688: Flags [F.], seq 1, ack 27, win 342, options [nop,nop,TS val 6116310 ecr 6116310], length 0
11:05:27.027445 IP 192.168.1.5.6688 > 192.168.1.5.34289: Flags [.], ack 2, win 342, options [nop,nop,TS val 6116311 ecr 6116310], length 0
11:05:32.026898 IP 192.168.1.5.6688 > 192.168.1.5.34289: Flags [F.], seq 27, ack 2, win 342, options [nop,nop,TS val 6121310 ecr 6116310], length 0
11:05:32.026920 IP 192.168.1.5.34289 > 192.168.1.5.6688: Flags [.], ack 28, win 342, options [nop,nop,TS val 6121310 ecr 6121310], len
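The capture above corresponds to one short client session: connect, read a single message from the server, then close (the client's FIN follows right after the data segment, with the server's FIN a few seconds later). A hedged Java sketch of such a client, using the 192.168.1.5:6688 endpoint shown in the capture:

// Hypothetical client whose single connect/read/close session produces a
// capture like the one above: SYN handshake, one data segment, then FIN/ACK.
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.Socket;

public class ClientMine {
    public static void main(String[] args) throws Exception {
        try (Socket socket = new Socket("192.168.1.5", 6688);
             BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream(), "UTF-8"))) {
            System.out.println("server says: " + in.readLine());
        } // closing the socket here sends the client-side FIN seen above
    }
}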

2) Capture only N packets using tcpdump -c
OpenSuse12.3#tcpdump -c 100 -i lo
This captures only 100 packets.

3) Capture the packets and write into a file using tcpdump -w
OpenSuse12.3# tcpdump -w myprogamdump.pcap -i lo
tcpdump: listening on lo, link-type EN10MB (Ethernet), capture size 65535 bytes
9 packets captured
18 packets received by filter
0 packets dropped by kernel

4) Reading/viewing the packets from a saved file using tcpdump -r
OpenSuse12.3#tcpdump -tttt -r myprogamdump.pcap
reading from file myprogamdump.pcap, link-type EN10MB (Ethernet)
2013-11-30 11:12:55.019872 IP 192.168.1.5.34290 > 192.168.1.5.6688: Flags [S], seq 2718665633, win 43690, options [mss 65495,sackOK,TS val 6564303 ecr 0,nop,wscale 7], length 0
2013-11-30 11:12:55.019899 IP 192.168.1.5.6688 > 192.168.1.5.34290: Flags [S.], seq 2448605009, ack 2718665634, win 43690, options [mss 65495,sackOK,TS val 6564303 ecr 6564303,nop,wscale 7], length 0
2013-11-30 11:12:55.019929 IP 192.168.1.5.34290 > 192.168.1.5.6688: Flags [.], ack 1, win 342, options [nop,nop,TS val 6564303 ecr 6564303], length 0
2013-11-30 11:12:55.020228 IP 192.168.1.5.6688 > 192.168.1.5.34290: Flags [P.], seq 1:27, ack 1, win 342, options [nop,nop,TS val 6564303 ecr 6564303], length 26
2013-11-30 11:12:55.020243 IP 192.168.1.5.34290 > 192.168.1.5.6688: Flags [.], ack 27, win 342, options [nop,nop,TS val 6564303 ecr 6564303], length 0
2013-11-30 11:12:55.020346 IP 192.168.1.5.34290 > 192.168.1.5.6688: Flags [F.], seq 1, ack 27, win 342, options [nop,nop,TS val 6564303 ecr 6564303], length 0
2013-11-30 11:12:55.020442 IP 192.168.1.5.6688 > 192.168.1.5.34290: Flags [.], ack 2, win 342, options [nop,nop,TS val 6564304 ecr 6564303], length 0
2013-11-30 11:13:00.020477 IP 192.168.1.5.6688 > 192.168.1.5.34290: Flags [F.], seq 27, ack 2, win 342, options [nop,nop,TS val 6569304 ecr 6564303], length 0
2013-11-30 11:13:00.020506 IP 192.168.1.5.34290 > 192.168.1.5.6688: Flags [.], ack 28, win 342, options [nop,nop,TS val 6569304 ecr 6569304], length 0

5) Capture only the packets of a specific protocol type, such as arp, tcp, udp or ip

OpenSuse12.3#tcpdump -i wlan0 ip
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on wlan0, link-type EN10MB (Ethernet), capture size 65535 bytes
11:18:04.193704 IP 132.213.238.6.http > 192.168.1.5.32991: Flags [.], seq 2723848246:2723849686, ack 3820601748, win 6432, options [nop,nop,TS val 786299612 ecr 6873162], length 1440
11:18:04.194241 IP 192.168.1.5.50414 > 192.168.1.1.domain: 36798+ PTR? 5.1.168.192.in-addr.arpa. (42)
11:18:04.196315 IP 132.213.238.6.http > 192.168.1.5.32991: Flags [P.], seq 1440:2880, ack 1, win 6432, options [nop,nop,TS val 786299612 ecr 6873162], length 1440

6) Capture packet flows on a particular port using tcpdump port
tcpdump -i eth0 port 4040

7) Capture packets for a particular destination IP and port
tcpdump -w mypackets.pcap -i eth0 dst 192.168.1.6 and port 22

III. lsof
=========
lsof, meaning 'LiSt Open Files', is used to find out which files are opened by which process. As we all know, Linux/Unix treats everything as a file (pipes, sockets, directories, devices and so on), so lsof can report information about any open file. Here we primarily look at the options related to network files.

1) List all network connections

OpenSuse12.3#lsof -i
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
systemd 1 root 32u IPv6 6955 0t0 TCP *:ipp (LISTEN)
avahi-dae 475 avahi 11u IPv4 9245 0t0 UDP *:mdns
avahi-dae 475 avahi 14u IPv6 9248 0t0 UDP *:46627
master 766 root 12u IPv4 10100 0t0 TCP localhost:smtp (LISTEN)

2) List processes which are listening on a particular port

OpenSuse12.3#lsof -i :6688
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
servermine 3127 prince 3u IPv4 1256979 0t0 TCP 192.168.1.5:6688 (LISTEN)

3) List all TCP or UDP connections

OpenSuse12.3#lsof -i tcp; lsof -i udp
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
systemd 1 root 32u IPv6 6955 0t0 TCP *:ipp (LISTEN)
master 766 root 12u IPv4 10100 0t0 TCP localhost:smtp (LISTEN)
master 766 root 13u IPv6 10102 0t0 TCP localhost:smtp (LISTEN)
gnome-ses 800 prince 13u IPv6 11789 0t0 TCP *:33532 (LISTEN)
gnome-ses 800 prince 14u IPv4 11790 0t0 TCP *:52980 (LISTEN)
cupsd 1029 root 4u IPv6 6955 0t0 TCP *:ipp (LISTEN)
cupsd 1029 root 10u IPv4 12739 0t0 TCP localhost:ipp (LISTEN)

4) List all IPv4 and IPv6 network files

OpenSuse12.3#lsof -i 4
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
systemd 1 root 33u IPv4 6956 0t0 UDP *:ipp
avahi-dae 475 avahi 11u IPv4 9245 0t0 UDP *:mdns
avahi-dae 475 avahi 13u IPv4 9247 0t0 UDP *:37715
master 766 root 12u IPv4 10100 0t0 TCP localhost:smtp (LISTEN)
gnome-ses 800 prince 14u IPv4 11790 0t0 TCP *:52980 (LISTEN)
dhclient 926 root 6u IPv4 12038 0t0 UDP *:bootpc

OpenSuse12.3#lsof -i 6
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
systemd 1 root 32u IPv6 6955 0t0 TCP *:ipp (LISTEN)
avahi-dae 475 avahi 12u IPv6 9246 0t0 UDP *:mdns
avahi-dae 475 avahi 14u IPv6 9248 0t0 UDP *:46627
master 766 root 13u IPv6 10102 0t0 TCP localhost:smtp (LISTEN)
gnome-ses 800 prince 13u IPv6 11789 0t0 TCP *:33532 (LISTEN)
dhclient 926 root 21u IPv6 12022 0t0 UDP *:55332
cupsd 1029 root 4u IPv6 6955 0t0 TCP *:ipp (LISTEN)

5) List the processes with open files on TCP ports in the range 1-1024

OpenSuse12.3#lsof -i TCP:1-1024
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
systemd 1 root 32u IPv6 6955 0t0 TCP *:ipp (LISTEN)
master 766 root 12u IPv4 10100 0t0 TCP localhost:smtp (LISTEN)
master 766 root 13u IPv6 10102 0t0 TCP localhost:smtp (LISTEN)
cupsd 1029 root 4u IPv6 6955 0t0 TCP *:ipp (LISTEN)
cupsd 1029 root 10u IPv4 12739 0t0 TCP localhost:ipp (LISTEN)

6) List all network files in use by a specific process
OpenSuse12.3#lsof -i -a -p 234

7) List all open files belonging to all active processes

OpenSuse12.3#lsof

COMMAND PID TID USER FD TYPE DEVICE SIZE/OFF NODE NAME
systemd 1 root cwd DIR 8,6 4096 2 /
systemd 1 root rtd DIR 8,6 4096 2 /
systemd 1 root mem REG 8,6 126480 131141 /lib64/libselinux.so.1
systemd 1 root mem REG 8,6 163493 131128 /lib64/ld-2.17.so
systemd 1 root 0u CHR 1,3 0t0 2595 /dev/null
systemd 1 root 6r DIR 0,18 0 3637 /sys/fs/cgroup/systemd/system
systemd 1 root 16u unix 0xffff88007c0ec100 0t0 3857 socket

 

How to set up the Remi repository on CentOS 5/6 and Fedora 18/19/20

The Remi (Les RPM de Remi) repository provides the latest versions of various software packages related to PHP and MySQL for Red Hat based Linux distros like CentOS, Fedora and RHEL. It provides PHP, MySQL, PECL and PEAR packages, along with many other open-source/free PHP applications, libraries and related packages. It is designed to assist in setting up Apache+PHP based web servers with various kinds of open source applications. The default CentOS/Fedora distros do not have...
 

Power of Linux Top Command with its Options

The Linux top command is one of the most powerful built-in tools that system administrators use to monitor system health every day. It is also really important to understand each parameter in the top command's output. There are a lot of options available that are really handy for understanding the system's behaviour. Refer to this nice article written by Raghu Sharma on the Linux top command and its options.

 

How to install FFmpeg on CentOS, RHEL and Ubuntu

FFmpeg is a cross-platform solution for streaming audio and video as well as recording and conversion. This article describes how to install FFmpeg on CentOS/RHEL 6/5 and Ubuntu 12.04/12.10 systems in easy steps, and also covers basic uses of ffmpeg.

 

Read the complete article at How to install FFmpeg on CentOS, RHEL and Ubuntu

 

Android AsyncTask Internals - Half Sync Half Async Design Pattern

The way AsyncTask has been implemented in Android is an apt example of the Half Sync-Half Async pattern described in Pattern-Oriented Software Architecture (POSA2). First of all, let us try to understand what we mean by the Half Sync-Half Async design pattern. Half Sync-Half Async is a specific way of structuring the threads in a multithreaded application. As the name suggests, in this pattern we divide the solution for managing multiple threads into two specific zones: one synchronous and the other asynchronous. The main thread communicates asynchronously with a queuing thread, which is responsible for queuing multiple tasks in FIFO order. That thread then pushes the tasks on to the synchronous layer, to different threads taken from a thread pool, and those threads execute the tasks synchronously in the background. The whole process can be depicted by the following diagram.



If you are new to the terms asynchronous and synchronous in the context of a multithreaded application, let me throw some light on them. They are best understood from a client-server architecture perspective. A synchronous function blocks its caller and returns only after it finishes its task. An asynchronous function, on the other hand, starts its task in the background but returns to the caller immediately; when the background task finishes, it notifies the caller asynchronously, and the caller then takes action.


The two scenarios have been depicted by the following two diagrams.
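In plain Java (outside Android, with hypothetical method and interface names), the two styles might look like this sketch:

// Illustrative sketch only: a blocking (synchronous) call versus a
// callback-based (asynchronous) call. All names here are hypothetical.
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SyncVsAsync {
    interface Callback { void onDone(String result); }

    // Synchronous: the caller is blocked until the result is ready.
    static String fetchSync() {
        return doSlowWork();
    }

    // Asynchronous: returns immediately; the callback is invoked later,
    // from a background thread, once the work finishes.
    static void fetchAsync(ExecutorService pool, Callback cb) {
        pool.submit(() -> cb.onDone(doSlowWork()));
    }

    static String doSlowWork() {
        try { Thread.sleep(500); } catch (InterruptedException ignored) { }
        return "result";
    }

    public static void main(String[] args) {
        System.out.println("sync: " + fetchSync());                 // blocks ~500 ms
        ExecutorService pool = Executors.newSingleThreadExecutor();
        fetchAsync(pool, r -> System.out.println("async: " + r));   // returns at once
        pool.shutdown();                                            // queued task still runs
    }
}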



Now let us dive deep into Android's AsyncTask.java to understand it from a designer's perspective and see how it implements the Half Sync-Half Async design pattern.


At the beginning of the class, the first few lines of code are as follows:


private static final ThreadFactory sThreadFactory = new ThreadFactory() {
    private final AtomicInteger mCount = new AtomicInteger(1);

    public Thread newThread(Runnable r) {
        return new Thread(r, "AsyncTask #" + mCount.getAndIncrement());
    }
};

private static final BlockingQueue<Runnable> sPoolWorkQueue =
        new LinkedBlockingQueue<Runnable>(10);

/**
 * An {@link Executor} that can be used to execute tasks in parallel.
 */
public static final Executor THREAD_POOL_EXECUTOR
        = new ThreadPoolExecutor(CORE_POOL_SIZE, MAXIMUM_POOL_SIZE, KEEP_ALIVE,
                TimeUnit.SECONDS, sPoolWorkQueue, sThreadFactory);

The first is a ThreadFactory, which is responsible for creating worker threads. Its member variable mCount holds the number of threads created so far; the moment it creates a worker thread, this number is increased by 1.


The next is the BlockingQueue. As you know from the Java BlockingQueue documentation, it provides a thread-safe, synchronized queue implementing FIFO logic.


The next is a thread pool executor, which is responsible for creating a pool of worker threads that can be taken as and when needed to execute different tasks.
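As a plain-Java illustration of the same three pieces working together, here is a small sketch; the pool sizes and names are made up for the demo and are not Android's actual constants.

// Plain-Java sketch of a factory + bounded queue + pool, built the same way
// AsyncTask builds THREAD_POOL_EXECUTOR (sizes here are hypothetical).
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class PoolDemo {
    public static void main(String[] args) {
        final AtomicInteger count = new AtomicInteger(1);
        ThreadFactory factory = r -> new Thread(r, "Worker #" + count.getAndIncrement());
        BlockingQueue<Runnable> queue = new LinkedBlockingQueue<Runnable>(10);
        ExecutorService pool = new ThreadPoolExecutor(
                2, 8, 1, TimeUnit.SECONDS, queue, factory);
        for (int i = 0; i < 4; i++) {
            final int n = i;
            pool.execute(() -> System.out.println(
                    Thread.currentThread().getName() + " ran task " + n));
        }
        pool.shutdown();  // previously submitted tasks still run to completion
    }
}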


If we look at the first few lines of the class, we can see that Android limits the maximum number of threads to 128 (as evident from private static final int MAXIMUM_POOL_SIZE = 128).


The next important class is SerialExecutor, which has been defined as follows:


private static class SerialExecutor implements Executor {
    final ArrayDeque<Runnable> mTasks = new ArrayDeque<Runnable>();
    Runnable mActive;

    public synchronized void execute(final Runnable r) {
        mTasks.offer(new Runnable() {
            public void run() {
                try {
                    r.run();
                } finally {
                    scheduleNext();
                }
            }
        });
        if (mActive == null) {
            scheduleNext();
        }
    }

    protected synchronized void scheduleNext() {
        if ((mActive = mTasks.poll()) != null) {
            THREAD_POOL_EXECUTOR.execute(mActive);
        }
    }
}

The next two important functions in AsyncTask are:

public final AsyncTask<Params, Progress, Result> execute(Params... params) {
    return executeOnExecutor(sDefaultExecutor, params);
}


and


public final AsyncTask<Params, Progress, Result> executeOnExecutor(Executor exec,
        Params... params) {
    if (mStatus != Status.PENDING) {
        switch (mStatus) {
            case RUNNING:
                throw new IllegalStateException("Cannot execute task:"
                        + " the task is already running.");
            case FINISHED:
                throw new IllegalStateException("Cannot execute task:"
                        + " the task has already been executed "
                        + "(a task can be executed only once)");
        }
    }

    mStatus = Status.RUNNING;

    onPreExecute();

    mWorker.mParams = params;
    exec.execute(mFuture);

    return this;
}


As becomes clear from the above code, execute() simply calls executeOnExecutor() and in that case passes a default executor. If we dig into the source code of AsyncTask, we will find that this default executor is nothing but the SerialExecutor, whose code is given above.


Now let's delve into the SerialExecutor class. In this class we have final ArrayDeque<Runnable> mTasks = new ArrayDeque<Runnable>();


This works as a serializer of the different requests across threads, and it is where the Half Sync-Half Async pattern shows up: a task is put at the end of the mTasks ArrayDeque and then pushed to a thread from the thread pool. Look at the code:


public void run() {
    try {
        r.run();
    } finally {
        scheduleNext();
    }
}

What it actually does is run the submitted task and then, in the finally block, call scheduleNext(), which takes the next queued task (if any) and runs it on a thread from the thread pool. It allows only one task to be executing at any time; hence with the default executor we cannot have multiple tasks executing in parallel.


Now let's examine how the SerialExecutor does this. Please have a look at the portion of the SerialExecutor code written as:



if (mActive == null) {
    scheduleNext();
}

So when execute() is first called on the AsyncTask, this code runs on the main thread (as mActive is initially null) and hence takes us into the scheduleNext() function.

The scheduleNext() function has been written as follows:

protected synchronized void scheduleNext() {
    if ((mActive = mTasks.poll()) != null) {
        THREAD_POOL_EXECUTOR.execute(mActive);
    }
}


In this function we set mActive to the Runnable polled from the ArrayDeque (the one that was just inserted into the queue), and that task is then executed in a thread taken from the thread pool. While that task is running, the finally portion of the run() wrapper becomes responsible for calling scheduleNext() again, which polls the queue (non-blocking) and, whenever there is another task, executes it in a thread taken from the thread pool.
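To see this serializing behaviour outside Android, here is a minimal standalone sketch modeled on the SerialExecutor code above (class and variable names are adapted for the demo):

// Standalone sketch of the SerialExecutor idea: tasks are queued and handed
// to the pool one at a time, so they finish strictly in submission order.
import java.util.ArrayDeque;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class SerialExecutorDemo {
    static final ExecutorService POOL = Executors.newFixedThreadPool(4);

    static class SerialExecutor implements Executor {
        final ArrayDeque<Runnable> tasks = new ArrayDeque<Runnable>();
        Runnable active;

        public synchronized void execute(final Runnable r) {
            // Wrap the task so that finishing it schedules the next one.
            tasks.offer(() -> {
                try { r.run(); } finally { scheduleNext(); }
            });
            if (active == null) {
                scheduleNext();
            }
        }

        synchronized void scheduleNext() {
            if ((active = tasks.poll()) != null) {
                POOL.execute(active);
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        SerialExecutor serial = new SerialExecutor();
        CountDownLatch done = new CountDownLatch(5);
        for (int i = 0; i < 5; i++) {
            final int n = i;
            // Even with 4 pool threads available, these print strictly 0..4.
            serial.execute(() -> {
                System.out.println("task " + n + " on " + Thread.currentThread().getName());
                done.countDown();
            });
        }
        done.await();
        POOL.shutdown();
    }
}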


To understand why the execute function cannot be called more than once on the same AsyncTask, please have a look at the code snippet below, taken from the executeOnExecutor() function of AsyncTask.java:


if (mStatus != Status.PENDING) {
    switch (mStatus) {
        case RUNNING:
            throw new IllegalStateException("Cannot execute task:"
                    + " the task is already running.");
        case FINISHED:
            throw new IllegalStateException("Cannot execute task:"
                    + " the task has already been executed "
                    + "(a task can be executed only once)");
    }
}

As the above code snippet makes clear, if we call the execute function a second time while a task is in the RUNNING state, it throws an IllegalStateException saying "Cannot execute task: the task is already running."


If we want multiple tasks to be executed in parallel, we need to call executeOnExecutor(), passing AsyncTask.THREAD_POOL_EXECUTOR (or perhaps a user-defined thread pool) as the exec parameter. And if we do that, the sPoolWorkQueue becomes the queue responsible for the Half Sync-Half Async pattern in AsyncTask.
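A minimal usage sketch makes the difference concrete (DownloadTask and the URLs are hypothetical; the two calls differ only in which executor receives the task):

// Hypothetical AsyncTask subclass illustrating serial vs. parallel execution.
import android.os.AsyncTask;
import android.util.Log;

public class DownloadTask extends AsyncTask<String, Void, String> {

    @Override
    protected String doInBackground(String... urls) {
        // Runs on a worker thread supplied by whichever executor was used.
        return "fetched " + urls[0];
    }

    @Override
    protected void onPostExecute(String result) {
        // Runs back on the main (UI) thread.
        Log.d("DownloadTask", result);
    }

    static void startBoth() {
        // Default: handed to the SerialExecutor, so tasks run one at a time.
        new DownloadTask().execute("http://example.com/a");

        // Explicit pool: tasks may run in parallel on THREAD_POOL_EXECUTOR.
        new DownloadTask().executeOnExecutor(
                AsyncTask.THREAD_POOL_EXECUTOR, "http://example.com/b");
    }
}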

 

Fedora Codecs: installation

Hi. This little guide explains how to install the most commonly used codecs on Fedora using the command line and the RPM Fusion repository.

1. We need the RPM Fusion repositories installed. If we don't have them, we can install the free and non-free repositories by typing:

su -c 'yum localinstall --nogpgcheck http://download1.rpmfusion.org/free/fedora/rpmfusion-free-release-$(rpm -E %fedora).noarch.rpm http://download1.rpmfusion.org/nonfree/fedora/rpmfusion-nonfree-release-$(rpm -E %fedora).noarch.rpm'

2. Now we can install the codecs.

For GTK: su -c "yum install gstreamer-plugins-bad gstreamer-plugins-ugly gstreamer-ffmpeg gstreamer-plugins-crystalhd gstreamer1-plugins-bad-freeworld gstreamer1-plugins-bad-free gstreamer1-plugins-good gstreamer1-libav gstreamer1-plugins-ugly"

For Qt: su -c "yum install xine-lib-extras xine-lib-extras-freeworld k3b-extras-freeworld"
 

How to set up the EPEL repository on CentOS 5/6

What is the EPEL repository? EPEL (Extra Packages for Enterprise Linux) is a project from the Fedora group that maintains a repository of software packages that are not already present in RHEL/CentOS. The repository is compatible with RHEL and all close derivatives like CentOS and Scientific Linux. By using EPEL we can easily install many packages (around 10,000) with the yum command that are not already present in the CentOS repositories. EPEL packages are usually based...
 

Aptitude : Package Management in Debian Based Operating Systems

Apt, whether it is apt-get or apt-cache, is normally a CLI-based utility. If you prefer a graphical environment (GUI), Aptitude is for you. The beauty of Aptitude is that it can be used in both CLI mode and GUI mode. If it is run without any parameter/argument, it operates in GUI mode; CLI-mode use of Aptitude is similar to that of the apt-get command. There is an alternative to Aptitude if you are more comfortable with GUI mode: it is known as Synaptic. We will limit our discussion to Aptitude, and this article will help you understand the basic use of Aptitude in GUI mode as well as CLI mode.

 

Read more on YourOwnLinux...

 

Installing Red5 Media Server on CentOS and RHEL

Red5 Media Server is a powerful media streaming server based on the RTMP protocol. Red5 is an open and extensible platform which can be used for video conferencing or network gaming.

Read this article to learn how to install Red5 Media Server on CentOS and RHEL systems.

 

Setting Up a Multi-Node Hadoop Cluster with Beagle Bone Black

Learning map/reduce frameworks like Hadoop is a useful skill to have, but not everyone has the resources to implement and test a full system. Thanks to cheap ARM-based boards, it is now more feasible for developers to set up a full Hadoop cluster. I coupled my existing knowledge of setting up and running single-node Hadoop installs with my BeagleBone cluster from my previous post to create my second project. This tutorial goes through the steps I took to set up and run Hadoop on my Ubuntu cluster. It may not be a practical application for everyone, but distributed map/reduce experience is currently a good skill to have. All the machines in my cluster already have Java and SSH set up from my first project, so you may need to install them if you don't have them.

Set Up

The first step, naturally, is to download Hadoop from Apache's site on each machine in the cluster and untar it. I used version 1.2.1, but version 2.0 and above is now available. I placed the resulting files in /usr/local and named the directory hadoop. With the files on the system, we can create a new user called hduser to actually run our jobs, and a group for it called hadoop:

sudo addgroup hadoop
sudo adduser --ingroup hadoop hduser

With the user created, we will make hduser the owner of the directory containing hadoop:

sudo chown -R hduser:hadoop /usr/local/hadoop

Then we create a temp directory to hold files and make the hduser the owner of it as well:

sudo mkdir -p /hadooptemp
sudo chown hduser:hadoop /hadooptemp

With the directories set up, log in as hduser. We will start by updating the .bashrc file in the home directory. We need to add two export lines at the top to point to our hadoop and java locations. My java installation was openjdk7 but yours may be different:

# Set Hadoop-related environment variables
export HADOOP_HOME=/usr/local/hadoop
# Set JAVA_HOME (we will also configure JAVA_HOME directly for Hadoop later on)
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-armhf
# Add Hadoop bin/ directory to PATH
export PATH=$PATH:$HADOOP_HOME/bin

Next we can navigate to our hadoop installation directory and locate the conf directory. Once there we need to edit the hadoop-env.sh file and uncomment the java line to point to the location of the java installation again:

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-armhf

Next we can update core-site.xml to point to the temp location we created above and specify the root of the file system. Note that the name in the default url is the name of the master node:


<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/hadooptemp</value>
    <description>Root temporary directory.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://beaglebone1:54310</value>
    <description>URI pointing to the default file system.</description>
  </property>
</configuration>

Next we can edit mapred-site.xml:


  

<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>beaglebone1:54311</value>
    <description>The host and port that the MapReduce job tracker runs at.</description>
  </property>
</configuration>

Finally we can edit hdfs-site.xml to list how many replication nodes we want. In this case I chose all 3:


<configuration>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
    <description>Default block replication.</description>
  </property>
</configuration>

Once this has been done on all machines we can edit some configuration files on the master to tell it which nodes are slaves. Start by editing the masters file to make sure it just contains our host name:

beaglebone1

Now edit the slaves file to list all the nodes in our cluster:

beaglebone1
beaglebone2
beaglebone3

Next we need to make sure that our master can communicate with its slaves. Make sure the hosts file on each node contains the names of all the nodes in your cluster. Now, on the master, create a key and append it to authorized_keys:

ssh-keygen -t rsa -P ""
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys

Then copy the key to other nodes:

ssh-copy-id -i $HOME/.ssh/id_rsa.pub hduser@beaglebone2

Finally we can test the connections from the master to itself and others using ssh.

Starting Hadoop

The first step is to format HDFS on the master. From the bin directory in hadoop run the following:

hadoop namenode -format

Once that is finished, we can start the NameNode and JobTracker daemons on the master. We simply need to execute these two commands from the bin directory:

start-dfs.sh
start-mapred.sh

To stop them later we can use:

stop-mapred.sh
stop-dfs.sh

Running an Example

With Hadoop running on the master and slaves, we can test out one of the examples. First we need to create some files on our system. Create some text files in a directory of your choosing with a few words in each. We can then copy the files to HDFS. From the main hadoop directory, execute the following:

bin/hadoop dfs -mkdir /user/hduser/test
bin/hadoop dfs -copyFromLocal /tmp/*.txt /user/hduser/test

The first line will call mkdir on HDFS to create a directory /user/hduser/test. The second line will copy files I created in /tmp to the new HDFS directory. Now we can run the wordcount sample against it:

bin/hadoop jar hadoop-examples-1.2.1.jar wordcount /user/hduser/test /user/hduser/testout

The jar file name will vary based on which version of Hadoop you downloaded. Once the job is finished, it will output the results in HDFS to /user/hduser/testout. To view the resulting files we can do this:

bin/hadoop dfs -ls /user/hduser/testout

We can then use the cat command to show the contents of the output:

bin/hadoop dfs -cat /user/hduser/testout/part-r-00000

This file will show us each word found and the number of times it was found. If we want to see proof that the job ran on all nodes, we can view the logs on the slaves from the hadoop/logs directory. For example, on the beaglebone2 node I can do this:

cat hadoop-hduser-datanode-beaglebone2.log

When I examined the file, I could see messages at the end showing the jobname and data received and sent, letting me know that all was well.
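For reference, a word-count job equivalent to the bundled example looks roughly like the following. This is a hedged sketch against the org.apache.hadoop.mapreduce API, not the exact source shipped in hadoop-examples-1.2.1.jar; packaged into a jar, it would be submitted with the same bin/hadoop jar invocation shown above.

// Hedged sketch of a word-count job similar to the bundled example.
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            // Emit (word, 1) for every token on the input line.
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            // Sum the counts emitted for each word.
            int sum = 0;
            for (IntWritable v : values) {
                sum += v.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = new Job(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}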

Conclusion

If you made it through all of this and it worked, congratulations on setting up a working cluster. Due to the slow performance of the BeagleBone's SD card, it is not the best device for getting actual work done. However, these steps are applicable to faster ARM devices as they come along. In the meantime, the BeagleBone Black is a great platform for practice and learning how to set up distributed systems.

 