Sunday, 17 April 2016

How to use the wget command to download a web page?


GNU wget is a free utility for non-interactive download of files from the Web.
It supports HTTP, HTTPS, and FTP protocols, as well as retrieval through HTTP proxies.

wget is non-interactive, meaning that it can work in the background while the user is not logged on.
This allows you to start a retrieval and disconnect from the system, letting wget finish the work.
By contrast, most web browsers require the user's constant presence, which can be a great hindrance when transferring a lot of data.
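The "start a retrieval and disconnect" workflow described above can be sketched with ordinary shell features (nohup and the trailing & are shell constructs, not wget options; the redirect to wget.log is just an illustrative choice of filename):

```shell
# Start a download that survives logging out:
# nohup detaches the command from the terminal's hangup signal,
# the trailing & runs it in the background,
# and all output is redirected into wget.log.
nohup wget linux.about.com > wget.log 2>&1 &

# You can now log off; later, check progress with:
# tail wget.log
```

Without nohup, a backgrounded job may still be killed when you log out, so it is the piece that makes the disconnect safe.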

Some examples
In order to download the web page linux.about.com, you would type
wget linux.about.com
and the result will be saved in a file called "index.html" in the current directory.
You can open and view the file with a web browser.

1. If there are connection problems, wget will try up to 20 times to reconnect. You can use the -t option to change the maximum number of attempts. For example, with
  wget -t 10 linux.about.com
   it will try only up to 10 times.

2. Instead of having the progress messages displayed on the standard output, you can save them to a log file with the -o option:
  wget -o logfile linux.about.com

3. To run it in the background you would put an ampersand at the end as usual:
    wget -o logfile linux.about.com &

4. To download a copy of a complete web site, up to five levels deep (wget's default recursion depth), you use the -r option (for recursive):
  wget -r linux.about.com -o logfile
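The options above can also be combined in a single command. A hypothetical run (recurse, retry up to 10 times, log to a file, and run in the background; "logfile" is just an example name) might look like:

```shell
# Combine -r (recursive), -t (maximum retries) and -o (log file),
# then background the job with &:
wget -r -t 10 -o logfile linux.about.com &

# Wait for the background job to finish and report its exit status
# (0 means success; see the wget man page for other codes):
wait $!
echo "wget exited with status $?"
```

Because the messages go to logfile rather than the terminal, this form is convenient for long transfers you want to inspect afterwards.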
