
Downloading an entire website or part of it

wget \
     --recursive \
     --no-clobber \
     --page-requisites \
     --html-extension \
     --convert-links \
     --restrict-file-names=windows \
     --domains example.com \
     --no-parent \
         http://www.example.com/examples/

This wget command downloads the website www.example.com/examples/ in its entirety, following hyperlinks recursively.

What it all means:

  • --recursive
    • follow links and download pages recursively (wget stops at a depth of 5 levels by default; use --level to change it).
  • --domains example.com
    • don't follow links outside example.com.
  • --no-parent
    • don't follow links above or outside the directory examples/.
  • --page-requisites
    • get all the elements that compose the page (images, CSS and so on).
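  • --html-extension
    • append the .html extension to downloaded HTML files whose names lack it (the option is named --adjust-extension in wget 1.12 and later).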
  • --convert-links
    • convert links so that they work locally, off-line.
  • --restrict-file-names=windows
    • modify filenames so that they will work in Windows as well.
  • --no-clobber
    • don't overwrite any existing files (used in case the download is interrupted and resumed).
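
The flags above can be bundled into a small shell function so the same download can be repeated for other sites. This is only a sketch: the function name mirror_site and the DRY_RUN variable are hypothetical and not part of wget or of the original command. With DRY_RUN=1 it prints the wget command instead of running it.

```shell
# Hypothetical helper (not part of wget): wraps the command shown above.
# Usage: mirror_site URL DOMAIN
# Set DRY_RUN=1 to print the command instead of running it.
mirror_site() {
    url=$1
    domain=$2
    if [ -z "$url" ] || [ -z "$domain" ]; then
        echo "usage: mirror_site URL DOMAIN" >&2
        return 2
    fi
    # Rebuild the positional parameters as the full wget command line.
    set -- wget \
        --recursive \
        --no-clobber \
        --page-requisites \
        --html-extension \
        --convert-links \
        --restrict-file-names=windows \
        --domains "$domain" \
        --no-parent \
        "$url"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$@"
    else
        "$@"
    fi
}
```

For example, DRY_RUN=1 mirror_site http://www.example.com/examples/ example.com prints the command from the top of this post without touching the network.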

Wget is available for most operating systems, comes pre-installed on most Linux distributions, and can be downloaded as a binary for Windows too.
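
A quick way to check whether wget is already installed, sketched with a hypothetical have_cmd helper built on the standard command -v shell builtin:

```shell
# Hypothetical helper: succeeds if the named command is found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd wget; then
    # Print the first line of the version banner.
    wget --version | head -n 1
else
    echo "wget not found; install it before running the command above" >&2
fi
```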



Copyright © 2007 - 2012 Noxidsoft. All Rights Reserved.