cURL is sometimes mistaken for an updated wget, but the two are distinct tools that merely share some features and options. wget is for downloading files from the Web, and is best used to mirror entire sites or parts of sites, which is something that cURL alone can't do.
cURL's job is to copy data to or from a given set of URLs; along with HTTP it recognizes the FTP, TFTP, GOPHER, TELNET, DICT, LDAP, FILE, HTTPS, and FTPS protocols. Other features include support for proxies, forms, cookies, SSL, client-side certificates, URL globbing, and very large files. Along with the curl command-line tool is a counterpart library, libcurl, that you can use to get cURL's functionality from within your own programs.
You can do a lot of neat tricks with curl. Here's a look at how you can copy to and from URLs, and then use cURL's reporting facilities to get simple Web server metrics from your operations.
Copy URLs
The curl tool supports the GNU-style --version option, which shows not only the version but also the protocols recognized, as well as any extra features that are compiled in. If you get stuck, --help gives a summary of options, while --manual gives an ASCII rendition of the entire man page plus a usage guide, in 40-odd pages. (Pipe these to less!) Also useful is the standard "verbose" option, -v, which tells you what the program is doing, step by step.
When you give curl a URL, its contents are retrieved and sent to standard output. To save the URL's output to a local file with the same name as the remote file, give the -O option; to specify a different name, use the -o option instead and give the name to use as an argument.
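The three behaviors can be tried without touching a network. The following is a minimal sketch that assumes a curl build with FILE protocol support; the scratch directory and the file names page.txt and copy.txt are made up for illustration.

```shell
# Work in a scratch directory so nothing else is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir remote
printf 'hello from curl\n' > remote/page.txt

# No option: the contents go to standard output.
curl -s "file://$workdir/remote/page.txt"

# -O: save under the remote file's own name (page.txt) in the current directory.
curl -s -O "file://$workdir/remote/page.txt"

# -o: save under a name of your choosing.
curl -s -o copy.txt "file://$workdir/remote/page.txt"
```

The same options apply unchanged to http:// and ftp:// URLs; the local file only keeps the sketch self-contained.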
One of cURL's most useful features is a kind of URL "glob" support, which lets you specify a pattern as part of the URL to match multiple URLs. You can give a character range in brackets, such as [A-Z] or [0-9], and you can also give a list of alternatives in braces, such as {about,blog,news}. The only trick is that shells such as bash expand unquoted braces themselves, so curl receives several separate URLs, and if you're saving to files with the -O option, you have to give that option once for each URL. (Quote the URL to let curl do the expansion itself, in which case a single -O covers every match.) For example, suppose you want to grab all three versions of a manual. With the URL unquoted, you'd need a command like:
$ curl -O -O -O http://example.com/docs/manual.{html,pdf,tar.gz}
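When the pattern is quoted, curl performs the expansion itself, and the -o argument can refer to each match with #1 (the first glob in the URL). The sketch below assumes a curl build with FILE protocol support and uses made-up local files named page-1.txt through page-3.txt so that it runs offline.

```shell
workdir=$(mktemp -d)
for n in 1 2 3; do
    printf 'page %s\n' "$n" > "$workdir/page-$n.txt"
done

# Quote the URL so that curl, not the shell, expands [1-3].
# In the -o argument, #1 stands for the current value of the first glob.
curl -s -o "$workdir/copy-#1.txt" "file://$workdir/page-[1-3].txt"

ls "$workdir"
```

Against a Web server the pattern would be the same, for example a quoted URL ending in manual.{html,pdf,tar.gz} paired with -o 'manual.#1'.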
For HTTP requests, you can specify HTTP 1.1 byte ranges instead of entire files -- if the server has byte ranges enabled, this option returns only the specified bytes instead of the whole file. 0 represents the beginning of the file. For example, to grab the first 100 bytes:
$ curl -r 0-99 http://example.com/
Ranges don't have to begin with zero. To get bytes 100 through 200:
$ curl -r 100-200 http://example.com/
A negative value on its own counts back from the end of the document. To grab the last eight bytes:
$ curl -r -8 http://example.com/
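To see exactly which bytes each form of range selects, it helps to run them against a file whose contents are known. This is a sketch only: it assumes a curl build that honors -r for FILE URLs, and the file alphabet.txt is invented for the demonstration.

```shell
workdir=$(mktemp -d)
printf 'abcdefghijklmnopqrstuvwxyz' > "$workdir/alphabet.txt"
url="file://$workdir/alphabet.txt"

first=$(curl -s -r 0-3 "$url")     # bytes 0 through 3
middle=$(curl -s -r 10-12 "$url")  # bytes 10 through 12
last=$(curl -s -r -3 "$url")       # the final three bytes

printf '%s %s %s\n' "$first" "$middle" "$last"
```

Both ends of a range are inclusive, so 0-3 selects four bytes. An HTTP server that does not support byte ranges may simply return the whole document instead.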
The -i option prepends the server's response headers to the output for a given URL. Alternatively, -I outputs only the headers, which is useful for seeing the OS and Web server software that a specified site is running. It also shows the date and time of the request, and the content length and type of the given URL. When the -I option is used on a FILE or FTP URL, you'll get the file size and modification time.
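The FILE case is easy to check locally. A small sketch, assuming a curl build with FILE protocol support and a made-up 26-byte file:

```shell
workdir=$(mktemp -d)
printf 'abcdefghijklmnopqrstuvwxyz' > "$workdir/alphabet.txt"

# -I asks for the headers only; for a FILE URL curl reports what it can
# learn from the filesystem, such as the size and modification time.
headers=$(curl -s -I "file://$workdir/alphabet.txt")
printf '%s\n' "$headers"
```

The exact set of lines printed may vary between curl versions, so treat the output as informational rather than as a stable format.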
You can upload files by specifying them as arguments to the -T option. It supports the same kind of globbing as the URL argument:
$ curl -T 'index-[01-99].html' ftp://ftp.example.com/pub/incoming/
By default, file uploads are given the same name as the source files, but you can specify a new name by including it in the target URL:
$ curl -T index-mine.html ftp://ftp.example.com/pub/incoming/index-yours.html
If you need to specify a username and password, give them as the argument to the -u option, separated by a colon. To upload standard input, give a single hyphen (-) as the argument to -T:
$ some-long-pipeline | curl -u bob:secret -T - ftp://ftp.example.com/pub/bob/results.txt
Get server metrics
cURL supports built-in runtime variables that you can use to perform ad hoc diagnostics and benchmarking, or to gather statistics about the accessibility of a given URL, site, or server (all times are given in seconds, and all sizes are in bytes):
content_type    Content-Type value of the file
Output any of these variables with the -w option ("write-out"), giving the variables in the format %{name} as part of a quoted string. You can include any other text as part of that string, and do simple formatting by using \n for a newline or \t for a tab. For example:
$ curl -w '\nLookup time:\t%{time_namelookup}\nConnect time:\t%{time_connect}\nPreXfer time:\t%{time_pretransfer}\nStartXfer time:\t%{time_starttransfer}\n\nTotal time:\t%{time_total}\n' -o /dev/null -s http://linux.com/
Lookup time: 0.038
Connect time: 0.038
PreXfer time: 0.039
StartXfer time: 0.039
Total time: 0.039
$
To get the amount of time between when a connection is established and when the data actually begins to be transferred, subtract the value of time_pretransfer from time_starttransfer. You can automate this by sending the output to bc with echo:
$ echo "`curl -s -o /dev/null -w '%{time_starttransfer}-%{time_pretransfer}' http://linux.com/`"|bc
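The same subtraction can be wrapped in a small function so that it is reusable across URLs. This is one possible sketch, using awk in place of bc; the function names subtract_fields and server_wait are invented here and are not part of curl.

```shell
# Subtract the second field from the first on each input line.
subtract_fields() {
    awk '{ printf "%.3f\n", $1 - $2 }'
}

# Print time_starttransfer minus time_pretransfer, in seconds, for one URL.
server_wait() {
    curl -s -o /dev/null -w '%{time_starttransfer} %{time_pretransfer}\n' "$1" |
        subtract_fields
}

# Example call (needs network access):
# server_wait http://linux.com/
```

Keeping the arithmetic in its own function makes it possible to check it with fixed numbers, for instance by piping the line "0.039 0.038" into subtract_fields.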
cURL offers other important options you'll want to use to check for timeouts or to control the transfer speed -- it has more than 100 options in total. By specifying huge URL ranges or calling curl from a loop, you can use the commands to do simple server load testing, or check for various failures by reading the variable output -- and since curl handles forms, you can even use it to test Web application speed.
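As a rough illustration of calling curl from a loop, the sketch below fetches one URL several times and averages time_total. The function names mean and time_requests are hypothetical, and the ten-second --max-time cap is an arbitrary choice made so that a hung server cannot stall the loop.

```shell
# Print the arithmetic mean of the numbers on standard input, one per line.
mean() {
    awk '{ sum += $1 } END { if (NR) printf "%.3f\n", sum / NR }'
}

# Fetch URL ($1) COUNT ($2) times and print the mean total time in seconds.
time_requests() {
    url=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        curl -s -o /dev/null --max-time 10 -w '%{time_total}\n' "$url"
        i=$((i + 1))
    done | mean
}

# Example call (needs network access):
# time_requests http://example.com/ 5
```

Because the requests run one after another, this measures response time rather than behavior under concurrent load; point it only at servers you are permitted to test.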