The stem of the names of the output files is normally derived from a component of the url. If the url contains a path name, the stem is the final component of that path, less any dot-separated suffix and prefix. For example, given a url whose path ends in a file name such as index.html, the stem would be index. If there is no path name, but the url contains a domain name, the stem is the penultimate component of the domain name (eg, excluding a trailing .com and an initial www). For example, given a url in a domain such as www.vitanuova.com, the stem would be vitanuova. If all else fails, webgrab uses the stem webgrab.
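The rule above can be sketched in Python. This is only an illustration of the description, not webgrab's own code: the function name derive_stem and the sample urls are hypothetical, the sketch assumes the suffix is everything from the first dot of the final path component, and it does not attempt the prefix handling mentioned above.

```python
from urllib.parse import urlsplit


def derive_stem(url: str) -> str:
    """Approximate the stem rule described in the text (illustrative only)."""
    # Let urlsplit recognise a bare host name such as "www.vitanuova.com".
    parts = urlsplit(url if "://" in url else "//" + url)

    # Rule 1: final path component, less its dot-separated suffix
    # (assumed here to mean everything from the first dot onwards).
    name = parts.path.rstrip("/").rsplit("/", 1)[-1]
    stem = name.split(".", 1)[0]
    if stem:
        return stem

    # Rule 2: penultimate component of the domain name.
    labels = [label for label in (parts.hostname or "").split(".") if label]
    if len(labels) >= 2:
        return labels[-2]

    # Rule 3: fall back to the fixed stem.
    return "webgrab"


if __name__ == "__main__":
    for sample in ("http://www.example.com/docs/index.html",
                   "www.vitanuova.com",
                   ""):
        print(repr(sample), "->", derive_stem(sample))
```

A single-label host with no path falls through to the fixed stem in this sketch; the text does not say how webgrab itself treats that case.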
Given a stem, the initial page is stored in stem.suffix, where suffix is the suffix (eg, .html) of the name of the original page. Subordinate pages are saved in a similar way in files named stem_1.suffix1, stem_2.suffix2, and so on.
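As a rough illustration of the naming scheme, the following Python sketch builds the list of output names from a stem and the suffixes of the fetched pages. The function name output_names is hypothetical, and the sketch assumes each suffix is passed with its leading dot, as in the .html example.

```python
from typing import List, Sequence


def output_names(stem: str, suffixes: Sequence[str]) -> List[str]:
    """Build output file names; suffixes[0] belongs to the initial page."""
    names = []
    for i, suffix in enumerate(suffixes):
        # The initial page uses the bare stem; subordinate pages are
        # numbered from 1 with an underscore separator.
        base = stem if i == 0 else f"{stem}_{i}"
        names.append(base + suffix)  # suffix assumed to include its dot
    return names


if __name__ == "__main__":
    print(output_names("index", [".html", ".gif", ".css"]))
```

With the stem index and the suffixes .html, .gif and .css, this yields index.html, index_1.gif and index_2.css.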
The options are:
Webgrab reads the configuration file /services/webget/config (if it exists) to look for the address of an optional HTTP proxy (in the httpproxy entry) and a list of domains for which a proxy should not be used (in the noproxy or noproxydoms entry). If symbolic network and service names might be involved, the connection server lib/cs must already be running.
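The effect of the two entries can be sketched as a small decision function in Python. This is an assumption-laden illustration rather than webgrab's actual logic: the function name proxy_for and the sample addresses are hypothetical, the sketch takes already-parsed values instead of reading the configuration file (whose syntax is not described here), and it assumes a noproxy entry matches both the named domain and any of its subdomains.

```python
from typing import Optional, Sequence


def proxy_for(host: str,
              httpproxy: Optional[str],
              noproxy: Sequence[str]) -> Optional[str]:
    """Return the proxy to use for host, or None for a direct connection."""
    if not httpproxy:
        return None  # no httpproxy entry: always connect directly
    h = host.lower().rstrip(".")
    for dom in noproxy:
        d = dom.lower().strip(".")
        # Assumed matching rule: exact name or any subdomain of the entry.
        if d and (h == d or h.endswith("." + d)):
            return None
    return httpproxy


if __name__ == "__main__":
    proxy = "http://proxy.example.net:8080"  # hypothetical proxy address
    print(proxy_for("www.example.com", proxy, ["example.com"]))
    print(proxy_for("www.example.org", proxy, ["example.com"]))
```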
WEBGRAB(1)    Rev: Mon Mar 12 21:23:11 GMT 2007