One of the most commonly used utilities by sysadmins is wget. It can be very helpful during web-related troubleshooting.
What is the wget command?
wget is a popular Unix/Linux command-line utility for retrieving content from the web. It is free to use and provides a non-interactive way to download files from the Internet. The wget command supports the standard HTTPS, HTTP, and FTP protocols, and it also lets you use HTTP proxies.
How does wget help you with troubleshooting?
There are many ways.
As a system administrator, you usually work on a terminal, and when troubleshooting web applications you may not want to check the whole page, just the connectivity. Or you may want to verify intranet websites, or download a particular page to verify its content.
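For a connectivity-only check, wget's --spider option makes it behave like a web crawler that requests the page without saving it; the exit status tells you whether the page is reachable. A minimal sketch (github.com is just an example target):

```shell
# --spider: check that the page exists without downloading it.
# The exit code is 0 on success, non-zero on failure.
wget --spider https://github.com
```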
wget is non-interactive, which means you can run it in the background even after you log out. There may be many cases where you have to disconnect from the system while still retrieving files from the web; wget will keep running in the background and complete its assigned task.
It can also be used to get an entire website onto your local machine. It can follow links in XHTML and HTML pages to create a local version; to do this, the pages must be downloaded recursively. This is very useful, as you can download important pages or sites for offline viewing.
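A recursive download for offline viewing might look like the following sketch; the flags are standard wget options, and the URL is only a placeholder:

```shell
# Download a site recursively for offline viewing:
#   --recursive        follow links and fetch linked pages
#   --convert-links    rewrite links in the pages so they work locally
#   --page-requisites  also fetch CSS, images, and other assets
wget --recursive --convert-links --page-requisites https://example.com/
```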
Let's see them in action. The syntax of wget is as below.
wget [option] [URL]
Download a web page
Let's try to download a page. For example: github.com
If the connectivity is good, the homepage will be downloaded and the output will look like the following.
root@trends:~# wget github.com
URL transformed to HTTPS due to an HSTS policy
--2020-02-23 10:45:52--  https://github.com/
Resolving github.com (github.com)... 220.127.116.11
Connecting to github.com (github.com)|18.104.22.168|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘index.html’

index.html                      [ <=>                  ] 131.96K  --.-KB/s    in 0.04s

2020-02-23 10:45:52 (2.89 MB/s) - ‘index.html’ saved

root@trends:~#
Download multiple files
This is useful when you need to download multiple files at once. It can also give you an idea of how to automate file downloads through scripts.
Let's try downloading the Python 3.8.1 and 3.5.1 files.
wget https://www.python.org/ftp/python/3.8.1/Python-3.8.1.tgz https://www.python.org/ftp/python/3.5.1/Python-3.5.1.tgz
So, as you can guess, the syntax is as below.
wget URL1 URL2 URL3
You just need to make sure there is a space between the URLs.
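Once the list of URLs grows, wget can also read them from a file with the -i option, which is handy for the scripted downloads mentioned above. A sketch (the file name urls.txt is just an assumption):

```shell
# Put one URL per line in a file...
printf '%s\n' \
  'https://www.python.org/ftp/python/3.8.1/Python-3.8.1.tgz' \
  'https://www.python.org/ftp/python/3.5.1/Python-3.5.1.tgz' > urls.txt

# ...and let wget download everything listed in it.
wget -i urls.txt
```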
Limit the download speed
This would be useful if you want to check how long it takes to download your file at different bandwidths.
Using the --limit-rate option, you can limit the download speed.
Here is the output of downloading the Node.js file.
root@trends:~# wget https://nodejs.org/dist/v12.16.1/node-v12.16.1-linux-x64.tar.xz
--2020-02-23 10:59:58--  https://nodejs.org/dist/v12.16.1/node-v12.16.1-linux-x64.tar.xz
Resolving nodejs.org (nodejs.org)... 22.214.171.124, 126.96.36.199, 2606:4700:10::6814:162e, ...
Connecting to nodejs.org (nodejs.org)|188.8.131.52|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 14591852 (14M) [application/x-xz]
Saving to: ‘node-v12.16.1-linux-x64.tar.xz’

node-v12.16.1-linux-x64.tar.xz  100%[===================================>]  13.92M  --.-KB/s    in 0.05s

2020-02-23 10:59:58 (272 MB/s) - ‘node-v12.16.1-linux-x64.tar.xz’ saved [14591852/14591852]
Downloading the 13.92 MB file took 0.05 seconds. Now, let's try limiting the speed to 500K.
root@trends:~# wget --limit-rate=500k https://nodejs.org/dist/v12.16.1/node-v12.16.1-linux-x64.tar.xz
--2020-02-23 11:00:18--  https://nodejs.org/dist/v12.16.1/node-v12.16.1-linux-x64.tar.xz
Resolving nodejs.org (nodejs.org)... 184.108.40.206, 220.127.116.11, 2606:4700:10::6814:162e, ...
Connecting to nodejs.org (nodejs.org)|18.104.22.168|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 14591852 (14M) [application/x-xz]
Saving to: ‘node-v12.16.1-linux-x64.tar.xz.1’

node-v12.16.1-linux-x64.tar.xz.1  100%[=================================>]  13.92M   501KB/s    in 28s

2020-02-23 11:00:46 (500 KB/s) - ‘node-v12.16.1-linux-x64.tar.xz.1’ saved [14591852/14591852]
Reducing the bandwidth made the download take longer: 28 seconds. Imagine your users are complaining about slow downloads, and you know their network bandwidth is low. You can try --limit-rate to simulate the problem.
Download in the background
Large files may take a while to download, as in the example above where you also set a speed limit. That is to be expected, but what if you don't want to stare at your terminal?
Well, you can use the -b argument to start wget in the background.
root@trends:~# wget -b https://slack.com
Continuing in background, pid 25430.
Output will be written to ‘wget-log.1’.
root@trends:~#
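Since the background run writes its progress to the log file named in that message, you can check on the download later with tail (the file name wget-log.1 comes from the output above; on a fresh system it is usually plain wget-log):

```shell
# Follow the background download's progress; press Ctrl-C to stop watching.
tail -f wget-log.1
```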
Ignore certificate errors
This is useful when you need to check intranet web applications that don't have a proper certificate. By default, wget raises an error if a certificate is not valid.
root@trends:~# wget https://expired.badssl.com/
--2020-02-23 11:24:59--  https://expired.badssl.com/
Resolving expired.badssl.com (expired.badssl.com)... 22.214.171.124
Connecting to expired.badssl.com (expired.badssl.com)|126.96.36.199|:443... connected.
ERROR: cannot verify expired.badssl.com's certificate, issued by ‘CN=COMODO RSA Domain Validation Secure Server CA,O=COMODO CA Limited,L=Salford,ST=Greater Manchester,C=GB’:
  Issued certificate has expired.
To connect to expired.badssl.com insecurely, use `--no-check-certificate'.
The example above is for a URL whose certificate has expired. As you can see, wget suggests using --no-check-certificate, which skips all certificate validation.
root@trends:~# wget https://untrusted-root.badssl.com/ --no-check-certificate
--2020-02-23 11:33:45--  https://untrusted-root.badssl.com/
Resolving untrusted-root.badssl.com (untrusted-root.badssl.com)... 188.8.131.52
Connecting to untrusted-root.badssl.com (untrusted-root.badssl.com)|184.108.40.206|:443... connected.
WARNING: cannot verify untrusted-root.badssl.com's certificate, issued by ‘CN=BadSSL Untrusted Root Certificate Authority,O=BadSSL,L=San Francisco,ST=California,C=US’:
  Self-signed certificate encountered.
HTTP request sent, awaiting response... 200 OK
Length: 600 [text/html]
Saving to: ‘index.html.6’

index.html.6      100%[=========================>]     600  --.-KB/s    in 0s

2020-02-23 11:33:45 (122 MB/s) - ‘index.html.6’ saved [600/600]

root@trends:~#
HTTP response header
View the HTTP response header of a given site right on the terminal. -S will print the header, as you can see below for Coursera.
root@trends:~# wget https://www.coursera.org -S
--2020-02-23 11:47:01--  https://www.coursera.org/
Resolving www.coursera.org (www.coursera.org)... 220.127.116.11, 18.104.22.168, 22.214.171.124, ...
Connecting to www.coursera.org (www.coursera.org)|126.96.36.199|:443... connected.
HTTP request sent, awaiting response...
  HTTP/1.1 200 OK
  Content-Type: text/html
  Content-Length: 511551
  Connection: keep-alive
  Cache-Control: private, no-cache, no-store, must-revalidate, max-age=0
  Date: Sun, 23 Feb 2020 11:47:01 GMT
  etag: W/"7156d-WcZHnHFl4b4aDOL4ZSrXP0iBX3o"
  Server: envoy
  Set-Cookie: CSRF3-Token=1583322421.s1b4QL6OXSUGHnRI; Max-Age=864000; Expires=Wed, 04 Mar 2020 11:47:02 GMT; Path=/; Domain=.coursera.org
  Set-Cookie: __204u=9205355775-1582458421174; Max-Age=31536000; Expires=Mon, 22 Feb 2021 11:47:02 GMT; Path=/; Domain=.coursera.org
  Strict-Transport-Security: max-age=31536000; includeSubDomains; preload
  X-Content-Type-Options: nosniff
  x-coursera-render-mode: html
  x-coursera-render-version: v2
  X-Coursera-Request-Id: NCnPPlYyEeqfcxIHPk5Gqw
  X-Coursera-Trace-Id-Hex: a5ef7028d77ae8f8
  x-envoy-upstream-service-time: 1090
  X-Frame-Options: SAMEORIGIN
  x-powered-by: Express
  X-XSS-Protection: 1; mode=block
  X-Cache: Miss from cloudfront
  Via: 1.1 884d101a3faeefd4fb32a5d2a8a076b7.cloudfront.net (CloudFront)
  X-Amz-Cf-Pop: LHR62-C3
  X-Amz-Cf-Id: vqvX6ZUQgtZAde62t7qjafIAqHXQ8BLAv8UhkPHwyTMpvH617yeIbQ==
Length: 511551 (500K) [text/html]
Manipulate the user agent
There may be a scenario where you want to connect to a site using a custom user agent, or a specific browser's user agent. This is possible with --user-agent. The example below sets the user agent to MyCustomUserAgent.
root@trends:~# wget https://gf.dev --user-agent="MyCustomUserAgent"
Host header

When an application is still in development, you may not have a proper URL to test it. Or, you may want to test an individual HTTP instance using its IP address, but you need to supply the Host header for the application to work correctly. In this situation, --header would be useful.
Let's take an example of testing http://10.10.10.1 with the Host header application.com.
wget --header="Host: application.com" http://10.10.10.1
And not just the Host header: you can inject any header you want.
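To send several headers at once, repeat the flag; wget accepts multiple --header options. A sketch (the X-Debug header and the values here are made-up examples):

```shell
# Each --header option adds one header to the request.
wget --header="Host: application.com" \
     --header="X-Debug: 1" \
     http://10.10.10.1
```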
Connect through a proxy
If you are working in a DMZ environment, you may not be able to access Internet sites directly. But you can take advantage of a proxy to connect.
wget -e use_proxy=yes -e http_proxy=$PROXYHOST:PORT http://externalsite.com
Don't forget to replace the $PROXYHOST:PORT variable with your actual proxy host and port.
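Alternatively, wget honors the standard proxy environment variables, so you can export them once for the whole session. A sketch, where proxy.example.com:3128 is a placeholder for your proxy:

```shell
# wget (like many command-line tools) respects these variables.
export http_proxy="http://proxy.example.com:3128"
export https_proxy="http://proxy.example.com:3128"
wget http://externalsite.com
```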
Connect using a specific TLS protocol
Usually, I would recommend using OpenSSL to test the TLS protocol, but you can also use wget.
wget --secure-protocol=TLSv1_2 https://example.com
The above will force wget to connect over TLS 1.2.
Knowing the necessary commands can help you at work. I hope the above gives you an idea of what you can do with wget.