April 22, 2010

Reading through HTML tables with Perl

Yes, it sounds spooky, since most of you probably know that tables in the HTML sense shouldn't be used any more and are widely regarded as bad style if you build your website with them. But if you just want a simple table for data in an HTML page, they are quite convenient. They get much more convenient when you use them with Perl.

Let's say you have a website with a table in it. The table is really damn big and you don't want to read through all of it until you find what you want. Here is a way to find it with Perl:


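use warnings;
use strict;
use HTML::TableExtract;

my $content = "";
# Crack open the file and get what's in there.
open( my $fh, '<', "file.html" ) or die "Cannot open file.html: $!"; # we assume only one file at a time will be requested
while (<$fh>) { $content .= $_; }
close($fh);

# Column headers to look for. qw() does not interpolate variables, so list them directly.
my @headers = qw(user name password);
# Create a new TableExtract object that only picks up tables with these headers.
my $te = HTML::TableExtract->new( headers => \@headers );
$te->parse($content);  # parse the content for what we want
foreach my $ts ( $te->tables ) {
    foreach my $row ( $ts->rows ) {
        # Print the second column of every row where it matches the pattern.
        if ( defined $row->[1] && $row->[1] =~ m/expression/ ) {
            print $row->[1], "\n";
        }
    }
}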

This will help you a lot, for example if you have a big list of passwords or user machines and you are searching for just one machine or password.
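If you want to try the idea without a file lying around, here is a minimal sketch that runs the same search on an inline HTML string. The sample table, the column names (user, machine, password) and the helper find_rows are all made up for illustration; only HTML::TableExtract and its new, parse, tables and rows calls come from the script above.

```perl
use strict;
use warnings;
use HTML::TableExtract;

# Hypothetical sample data standing in for file.html.
my $html = <<'HTML';
<table>
  <tr><th>user</th><th>machine</th><th>password</th></tr>
  <tr><td>alice</td><td>web01</td><td>secret1</td></tr>
  <tr><td>bob</td><td>db01</td><td>secret2</td></tr>
  <tr><td>carol</td><td>web02</td><td>secret3</td></tr>
</table>
HTML

# Hypothetical helper: return every row whose column number $column
# (counted from 0, in the order of the requested headers) matches $pattern.
sub find_rows {
    my ( $html, $headers, $column, $pattern ) = @_;
    my $te = HTML::TableExtract->new( headers => $headers );
    $te->parse($html);
    my @hits;
    foreach my $ts ( $te->tables ) {
        foreach my $row ( $ts->rows ) {
            next unless defined $row->[$column];    # skip empty cells
            push @hits, $row if $row->[$column] =~ $pattern;
        }
    }
    return @hits;
}

# Look for all machines whose name starts with "web".
my @hits = find_rows( $html, [qw(user machine password)], 1, qr/^web/ );
print join( "\t", @$_ ), "\n" for @hits;
```

Putting the search into a small function means you can reuse it for any column and any pattern, instead of editing the regular expression in the loop every time.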


Andreas Marschke.
