August 30, 2007

Get a daily dose of comics

Author: Michael Crider

When I first started learning to read, my primary motivation was to gain the ability to read the comics in my local paper. I had no idea at that time that there were so many comics in the world. Now I know that there are comics all over the Web, but who has time to go to each site each day and read the latest strip? Thanks to the world of open source software, you can gather all your favorite comics on one page automatically, ready for you to read each morning.

Dosage is "an application designed to keep a local 'mirror' of specific Web comics, with a variety of options for naming schemes and updating options. It supports a recursive 'catch-up' method, where it traverses a comic by essentially 'visiting' previous comics and picking out the comics." I have combined Dosage with the CGI Calendar of Events application to create a customized calendar in which each day links to a page of that day's comics.

Setting up: the prerequisites

Dosage is a command-line program; there is no GUI. You will also need a local Web server to use the calendar. On Debian derivatives such as Ubuntu, you can get one by installing the apache2 package: sudo apt-get install apache2.

Download the most recent version of Calendar of Events. You also need the CPAN Perl module CGI::Minimal. Extract the archives of the two packages with the commands tar xzf calendar.tar.gz and tar xzf CGI-Minimal-1.27.tar.gz.

In the root directory for Web documents (typically /var/www), create a directory called calendar. Copy config.txt and template.tmpl from the calendar archive to this new directory. If you know HTML, you can edit template.tmpl to change the look of the calendar if you like. Make config.txt writable by the user who will be running the comic collection script with the command chown username config.txt. config.txt will hold the list of events to be shown on the calendar, or, in our case, links to pages of comics. It comes with a list of holidays and special events that you can remove if you wish. You need to change the line starting template= to point to the directory where you placed template.tmpl -- i.e. template=/var/www/calendar/template.tmpl.

From the CGI-Minimal archive, copy the CGI directory under lib to the system cgi-bin directory (in Debian/Ubuntu the default location is /usr/lib/cgi-bin, but it may not exist if you have not installed any packages that use it -- the apache2 package does not create it). Also copy calendar.cgi and Template.pm from the calendar archive to the system cgi-bin directory. In calendar.cgi, edit the line beginning my $default_file to point to the directory where you placed config.txt -- i.e. my $default_file = '/var/www/calendar/config.txt';
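
The two path edits described above (the template= line in config.txt and the $default_file line in calendar.cgi) can also be made from the command line. What follows is only a sketch: the helper names replace_line, set_template, and set_default_file are made up for illustration, and it assumes both lines start in the first column and that the paths contain no | or & characters.

```shell
# Rewrite file $1, replacing each line that matches the sed pattern $2
# with the text $3. The file is overwritten rather than replaced, so its
# owner and permissions stay as they were.
replace_line() {
    tmp=$(mktemp) || return 1
    sed "s|$2|$3|" "$1" > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return $status
}

# $1 = path to config.txt, $2 = path to template.tmpl
set_template() {
    replace_line "$1" '^template=.*' "template=$2"
}

# $1 = path to calendar.cgi, $2 = path to config.txt
set_default_file() {
    replace_line "$1" '^my [$]default_file.*' "my \$default_file = '$2';"
}
```

With the layout used in this article, the calls would be set_template /var/www/calendar/config.txt /var/www/calendar/template.tmpl and set_default_file /usr/lib/cgi-bin/calendar.cgi /var/www/calendar/config.txt, run as a user who can write to those files.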

Preparing your dosage

Download the most recent stable release of dosage (1.5.8 at the time of this writing). Expand this archive as you did the calendar. You can place the expanded directory in your home directory, /usr/local, or any other location. For this howto I will assume it is in your home directory. Make the script that runs everything executable: chmod +x ~/dosage-1.5.8/bin/mainline.

Now you can pick the comics you want to read each day. The command ~/dosage-1.5.8/bin/mainline -l will give you a list of all comics supported by dosage. As of 1.5.8 there are 1,933 comics listed. If your terminal will not let you scroll back through that many lines, you may want to redirect the output to a text file, then open it in your favorite text editor. Write down the names of all comics you want to download, then run the following command in your home directory: ~/dosage-1.5.8/bin/mainline -c space separated list of comics. The command will take some time to complete, as it downloads a complete archive of each available comic in your list.
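
Saving the list to a file and searching it could look something like the sketch below. The function names save_comic_list and find_comic are hypothetical, and the DOSAGE variable simply assumes the archive was unpacked in your home directory as described above.

```shell
# Path to the dosage launcher; adjust it if you unpacked dosage elsewhere.
DOSAGE=${DOSAGE:-$HOME/dosage-1.5.8/bin/mainline}

# Write the full list of supported comics to the file named in $1.
save_comic_list() {
    "$DOSAGE" -l > "$1"
}

# Print the lines of a saved list ($1) that contain the term $2,
# ignoring upper and lower case.
find_comic() {
    grep -i -- "$2" "$1"
}
```

For example, save_comic_list ~/comic-list.txt followed by find_comic ~/comic-list.txt dilbert shows whether a given comic appears in the list, without scrolling through the whole output.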

Next you want to make some links that will make these comics available to your Web server, and make the calendar event list available to the collection script you will build soon. In /var/www, run the command ln -s ~/Comics (you may need root privileges to write there). In ~/Comics, run ln -s /var/www/calendar/config.txt.
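
If you prefer not to change directories, the same two links can be created with explicit targets. This is a minimal sketch; link_comics is a made-up name, and the arguments are whatever Web root and Comics directory you actually use.

```shell
# Create both symbolic links.
# $1 = Web document root, $2 = directory where dosage stores the comics.
link_comics() {
    # Expose the comics to the Web server as <root>/Comics.
    ln -s "$2" "$1/Comics" &&
    # Expose the calendar event list inside the Comics directory.
    ln -s "$1/calendar/config.txt" "$2/config.txt"
}
```

With the paths from this article the call would be link_comics /var/www ~/Comics.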

Now you are ready to build the bash script that will run daily to collect your comics. Open a text editor and copy the following lines into a blank text file:

#!/bin/bash
cd ~
~/dosage-1.5.8/bin/mainline -ohtml @
echo -e "$(date +%Y)\t$(date +%-m)\t$(date +%-d)\t<a target=\"_blank\" href=\"/Comics/html/comics-$(date +%Y%m%d).html\">Comics</a>" >> ~/Comics/config.txt

Line two ensures that this script will run in your home directory. Line three calls dosage to download all newly available comics for each comic present in your ~/Comics directory, and write a Web page in ~/Comics/html which will link to each comic downloaded that day. Line four adds an entry to the Calendar of Events, linking to the new Web page.
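
To see what line four appends, it can help to build the same entry for a fixed date. The sketch below is separate from comics.sh and is only an illustration: calendar_entry is a hypothetical helper, and it assumes the event list expects the year, the month, the day, and the event text separated by tabs, which is what the echo command writes.

```shell
# Print one calendar entry: year, month, day, and a link, separated by tabs.
# $1 = year, $2 = month, $3 = day, all written without leading zeros.
calendar_entry() {
    printf '%d\t%d\t%d\t<a target="_blank" href="/Comics/html/comics-%04d%02d%02d.html">Comics</a>\n' \
        "$1" "$2" "$3" "$1" "$2" "$3"
}

# Entry for August 30, 2007; the link points to comics-20070830.html.
calendar_entry 2007 8 30
```

The first three fields use the unpadded month and day, matching %-m and %-d in the script, while the file name in the link uses the zero-padded form produced by %Y%m%d.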

Save this file in a location of your choice, with the name comics.sh. I collect scripts such as this inside my home directory, in a directory called bin. Make the script executable: chmod +x comics.sh.

Next, configure your system to run the script each day using cron. If you have never worked with cron, Wikipedia has a good introduction. crontab -e will present you with the crontab for your user. Add a line similar to the following at the bottom:

0 5 * * * ~/bin/comics.sh

This line instructs the system to download any newly available comics at 5 a.m. each day. You can adjust the time to whatever works best for you.
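
If you would rather not edit the crontab by hand, the entry can be appended with a small filter. Treat this as a sketch under assumptions: add_cron_entry is a made-up name, and the usage line below relies on crontab - reading a new crontab from standard input, as the cron implementations on common Linux distributions do.

```shell
# The schedule line from this article.
CRON_LINE='0 5 * * * ~/bin/comics.sh'

# Read an existing crontab on standard input and write it back out,
# adding CRON_LINE at the end only if an identical line is not present.
add_cron_entry() {
    awk -v entry="$CRON_LINE" '
        { print }
        $0 == entry { found = 1 }
        END { if (!found) print entry }
    '
}
```

A typical call would be crontab -l 2>/dev/null | add_cron_entry | crontab -. Running it a second time leaves the crontab unchanged, because the entry is already there.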

In the morning, open your Web browser and point it to http://localhost/cgi-bin/calendar.cgi, and click on the Comics link to pull up a list of your favorite comics. Click on each link to read the comic. If you miss a day, you can still go back and read them the next day, as long as your computer was on when cron was scheduled to run the collection.

Consolidating the comics

Let's add one final touch to make this system even easier to use. Instead of putting links to each comic in the Web page, let's put the comics on the page itself. Open your favorite text editor again, and copy the following lines to a blank text file:

s,file:///home/your user name/Comics/html/,,g
s,file:///home/your user name/Comics,..,
s,<li><a href,<li><img src,
s,">.*</a></li>,"></li>,

Save this file in the same directory as comics.sh, with the name comics.sed. Now open comics.sh, and add the following lines between lines three and four:

today=$(date +%Y%m%d)
sed -f ~/bin/comics.sed Comics/html/comics-$today.html > /tmp/comics.html
mv -f /tmp/comics.html Comics/html/comics-$today.html
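
To check what comics.sed does before the next scheduled run, you can feed it a single line. The sketch below applies the same four substitutions through a hypothetical inline_comics function. The sample line is an assumption about the markup dosage writes, inferred from the substitutions themselves, and alice is a placeholder user name.

```shell
# Apply the four substitutions from comics.sed to HTML read on standard input.
# $1 = the user name whose home directory holds Comics.
inline_comics() {
    sed -e "s,file:///home/$1/Comics/html/,,g" \
        -e "s,file:///home/$1/Comics,..," \
        -e 's,<li><a href,<li><img src,' \
        -e 's,">.*</a></li>,"></li>,'
}

# Assumed sample of one list item in the generated page.
sample='<li><a href="file:///home/alice/Comics/Dilbert/strip.gif">Dilbert</a></li>'

printf '%s\n' "$sample" | inline_comics alice
# prints: <li><img src="../Dilbert/strip.gif"></li>
```

The second substitution turns the absolute file:// path into a path relative to Comics/html, and the last two replace the link with an image tag, which is why the strips appear on the page itself.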

The next time your collection script runs, you will be able to open a comics page in your browser that rivals any you can find in your local newspaper, because it has all, and only, the comics you want to read.
