Firefox, Chrome, and other browsers do an acceptable job of downloading a single file of reasonable size. But I don't like to trust a browser to grab ISO images and other files that are hundreds of megabytes or larger. For that I prefer to turn to wget. You'll find that using wget provides some significant advantages over grabbing files with your browser.
First of all, there's the obvious — if your browser crashes or you need to restart for some reason, you don't lose the download. Firefox and Chrome have been fairly stable for me lately, but it's not unheard of for them to crash. That's a bit of a bummer if they're 75% of the way (or 98%) through downloading a 3.6GB ISO for the latest Fedora or openSUSE DVD.
It's also inconvenient when I want to download a file to a server. For example, if I'm setting up WordPress on a remote system, I need to get the tarball with the latest release onto the server. It seems silly to copy it to my desktop and then use scp to upload it to the server. That's twice the time (at least). Instead, I use wget to grab the tarball while I'm SSHed into the server and save myself a few minutes.
Finally, wget is scriptable. If you want to scrape a Web site or download a file every day at a certain time, you can use wget as part of a script that you call from a cron job. Hard to do that with Firefox or Chrome.
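As a sketch of that kind of scripting, here's a small download script a cron job could call. The URL and destination path are placeholders, not a real mirror:

```shell
#!/bin/sh
# Hypothetical nightly fetch for cron; URL and paths are placeholders.
URL="http://mirrorsite.net/nightly/latest.iso"
# Date-stamp the destination so each night's file is kept separately.
DEST="/var/tmp/nightly-$(date +%Y-%m-%d).iso"
# -q keeps cron mail quiet; -O saves under the date-stamped name.
wget -q -O "$DEST" "$URL" || echo "wget failed; is the URL reachable?" >&2
```

Saved as, say, /usr/local/bin/nightly-iso.sh, a crontab entry like 0 3 * * * /usr/local/bin/nightly-iso.sh would run it at 3 a.m. every day.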
Get Started with wget

Most Linux distributions should have wget installed, but if not, just search for the wget package. Several other packages use or reference wget, so you'll probably get several results — including a few front-ends for wget.
Let's start with something simple. You can download files over HTTP, HTTPS, and FTP with wget, so let's say you want to get the hot new Linux Mint Fluxbox edition. Just copy the URL to the ISO image and pass it to wget like so:

wget http://mirrorsite.net/pub/linuxmint/linuxmint-fluxbox.iso

Obviously, you'd replace "mirrorsite" with a legitimate site name, and the path to the ISO image with the correct path.
What about multiple files? Here's where wget really starts showing its advantages. Create a text file with the URLs to the files, one per line. For instance, if I wanted to copy the CD ISO images for the Fedora 14 alpha, I'd copy the URLs for each install ISO to a text file like this:

http://mirrorsite.net/fedora/14-Alpha/iso/Fedora-14-Alpha-i386-disc1.iso
http://mirrorsite.net/fedora/14-Alpha/iso/Fedora-14-Alpha-i386-disc2.iso

You get the idea. Save the file as fedoraisos.txt or similar and then tell wget to download all of the ISO images:
wget -i fedoraisos.txt
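When the ISO file names follow a numbered pattern like this, you don't even have to type the list by hand; a short loop can generate it (the mirror path below is a placeholder):

```shell
#!/bin/sh
# Generate a URL list for `wget -i`; the mirror path is a placeholder.
BASE="http://mirrorsite.net/fedora/14-Alpha/iso"
: > fedoraisos.txt          # create (or truncate) the list file
for n in 1 2 3 4 5; do
  echo "$BASE/Fedora-14-Alpha-i386-disc$n.iso" >> fedoraisos.txt
done
```

Then wget -i fedoraisos.txt works exactly as above.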
wget will start grabbing the ISOs in order of appearance in the text file. That might take a while, depending on the speed of your Net connection, so what happens if the transfer is interrupted? No sweat. If wget is running but the network goes down, it will continue trying to fetch the file and resume where it left off.
But what if the computer crashes, or you need to stop wget for some other reason? The wget utility has a "continue" option (-c) that can be used to resume a download that's been interrupted. Just start the download using the -c option before the argument with the file name(s), like so:
wget -c ftp://mirrorsite.net/filename.iso
If you try to resume a download after wget has been stopped without using -c, it will start from scratch and save to a new file with a .1 appended to the main filename. This is wget trying to protect you from "clobbering" a previous file.
Mirroring and More
You can also use wget to mirror a site. Using the --mirror option, wget will actually try to suck down the entire site, and will follow links recursively to grab everything it thinks is necessary for the site:

wget --mirror http://mirrorsite.net/
Unless you own a site and are trying to make a backup, the --mirror option might be a bit aggressive. If you're trying to download a page for archival purposes, the -p option (short for --page-requisites) might be better. When wget is finished, it will create a directory with the site name (so if you tried Linux.com, it'd be linux.com) and all of the requisite files underneath. Odds are when you open the site in a browser it won't look quite right, but it's a good way to get the content of a site.
Password-protected sites are not a problem, as wget supports several options for passing the username and password to a site. Just use the --user and --password options, like so:

wget --user=username --password=password ftp://mirrornet.net/filename.file

where the username and password are replaced with your credentials. You might want to specify this from a script if you're on a shared system, lest other users see the username and password via ps or similar.
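One way to keep the credentials off the command line entirely is to put them in ~/.netrc, which wget reads automatically. The host and credentials below are placeholders:

```shell
#!/bin/sh
# Store credentials in ~/.netrc instead of on the command line,
# so they never show up in `ps` output. All values are placeholders.
cat > "$HOME/.netrc" <<'EOF'
machine mirrornet.net
login username
password password
EOF
# The file must be private, or tools will refuse to use it.
chmod 600 "$HOME/.netrc"
# Now a plain `wget ftp://mirrornet.net/filename.file` picks up the login.
```

For interactive use, wget also has an --ask-password option that prompts for the password instead of taking it as an argument.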
Sometimes a site will deny access to non-browser user agents. If this is a problem, wget can fake the user agent string with the --user-agent (-U) option, like so:

wget --user-agent="Mozilla/5.0 (X11; Linux x86_64)" http://mirrorsite.net/filename.iso
If you don't have the fastest connection in the world, you might want to throttle wget a bit so it doesn't consume your available bandwidth, or so it doesn't hammer a remote site if you are on a fast connection. To do that, you can use the --limit-rate option, like this:

wget --limit-rate=2m http://filesite.net/filename.iso

That will tell wget to cap its download speed at 2 megabytes per second, though you can also use k to specify kilobytes.
If you're grabbing a bunch of files, the -w (wait) option can pause wget between the files. So wget --wait=1m will pause wget one minute between downloads.
There's a lot more to wget, so be sure to check the man page to see all the options. In a future tutorial, we'll cover using wget for more complex tasks and examining HTTP responses from Apache.