How can I cleanly "pause" download & continue later?

Postby n8marti » 11.12.2017, 08:20

I'm on a very slow connection where I pay by the MB. I get better rates overnight and on the weekends, so I run wsusoffline (download-updates.bash) at those times, download what I can, then continue again the next night/weekend. I noticed in the log that an update did not continue where it left off the previous time, but instead started over again. I know this is expected with wsusscn2.cab and mpam..., but the update I'm referring to now is /client/w100-x64/glb/windows10.0-kb4048952-x64_e7918b1dff5622b1a03c1e599c6251d3b11f8f33.cab (904M). Here are excerpts from the log:

--2017-12-03 14:33:34-- http://download.windowsupdate.com/d/msd ... 1f8f33.cab
Connecting to download.windowsupdate.com (download.windowsupdate.com)|88.221.134.41|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 948001705 (904M) [application/vnd.ms-cab-compressed]
Saving to: '../client/w100-x64/glb/windows10.0-kb4048952-x64_e7918b1dff5622b1a03c1e599c6251d3b11f8f33.cab'

0K ........ ........ ........ ........ ........ ........ 0% 63.6K 4h1m
3072K ........ ........ ........ ........ ........ ........ 0% 73.7K 3h44m
[...]
64512K ........ ........ ........ ........ ........ ........ 7% 59.5K 4h31m
67584K ........ ........ ........ .......
--------------------------------------------------------------------------------
[### I believe I stopped this with Ctrl+C ###]
[...]
--2017-12-09 18:51:27-- http://download.windowsupdate.com/d/msd ... 1f8f33.cab
Connecting to download.windowsupdate.com (download.windowsupdate.com)|23.62.2.88|:80... connected.
HTTP request sent, awaiting response... 200 OK <================================================= I expected "206 Partial Content" here
Length: 948001705 (904M) [application/vnd.ms-cab-compressed]
Saving to: '../client/w100-x64/glb/windows10.0-kb4048952-x64_e7918b1dff5622b1a03c1e599c6251d3b11f8f33.cab'

0K ........ ........ ........ ........ ........ ........ 0% 89.2K 2h52m
3072K ........ ........ ........ ........ ........ ........ 0% 75.0K 3h8m
6144K ........ ........ ........ ........ ........ ........ 0% 77.7K 3h10m
[...]
294912K ........ ........ ........ ........ ........ ........ 32% 241K 5h21m
297984K ........ ........ ........ ........ ........ ........ 32% 9.98K 5h27m
301056K ........ . 32% 6.77K=2h39m

2017-12-09 21:31:48 (31.6 KB/s) - Read error at byte 308912317/948001705 (Success). Retrying.

--2017-12-09 21:31:49-- (try: 2) http://download.windowsupdate.com/d/msd ... 1f8f33.cab
Connecting to download.windowsupdate.com (download.windowsupdate.com)|23.62.2.88|:80... connected.
HTTP request sent, awaiting response... 206 Partial Content
Length: 948001705 (904M), 639089388 (609M) remaining [application/vnd.ms-cab-compressed]
Saving to: '../client/w100-x64/glb/windows10.0-kb4048952-x64_e7918b1dff5622b1a03c1e599c6251d3b11f8f33.cab'

[ skipping 301056K ]
301056K ,,,,,,,, ,....... ........ ........ ........ ........ 32% 194K 66m43s
304128K ........ ........ ........ ........ ........ ........ 33% 180K 61m22s
307200K ........ ........ ........ ........ ........ ........ 33% 115K 71m1s
[...]


Between the 1st and 2nd attempts I believe I stopped it with Ctrl+C. When it tried again several days later it did not find the partially downloaded file and so restarted the download. Between the 2nd and 3rd attempts it did find the partial download and continued as expected. Since the cost of my downloads changes depending on the day of the week and time of day, I need to be able to "pause" the download arbitrarily to wait for a better time. Is there another option besides Ctrl+C? I guess I could just turn off my networking and the script would ultimately give up due to lack of a connection, but that isn't my preferred solution.
n8marti
 

Re: How can I cleanly "pause" download & continue later?

Postby hbuhrmester » 12.12.2017, 00:33

To have wget continue a download across different runs, you need to add the option --continue to the wget command line. From the wget manual page:

Code: Select all
-c
--continue
    Continue getting a partially-downloaded file.  This is useful when
    you want to finish up a download started by a previous instance of
    Wget, or by another program.  For instance:

           wget -c ftp://sunsite.doc.ic.ac.uk/ls-lR.Z

    If there is a file named ls-lR.Z in the current directory, Wget
    will assume that it is the first portion of the remote file, and
    will ask the server to continue the retrieval from an offset equal
    to the length of the local file.

    Note that you don't need to specify this option if you just want
    the current invocation of Wget to retry downloading a file should
    the connection be lost midway through.  This is the default
    behavior.  -c only affects resumption of downloads started prior to
    this invocation of Wget, and whose local files are still sitting
    around.


In the Linux scripts, this could be added to the variable $wget_optimized_options in the file 40-configure-downloaders.bash:

Code: Select all
readonly wget_optimized_options="--tries=10 --waitretry=10 --continue"



To pause a running command or script, you can use the job-control features of bash:
https://www.gnu.org/software/bash/manua ... ob-Control

If you start a script in the background, the shell prints the job number of the new job; the first job can then be referenced as %1.

You can then temporarily stop the script with:

Code: Select all
kill -s SIGSTOP %1


and resume it with:

Code: Select all
kill -s SIGCONT %1



For example, to start the script download-updates.bash in the background, without any output for clarity:

Code: Select all
$ ./download-updates.bash w60 deu,enu -includesp >/dev/null 2>&1 &
[1] 1184
$ kill -s SIGSTOP %1
$ jobs
[1]+  Stopped                 ./download-updates.bash w60 deu,enu -includesp > /dev/null 2>&1
$ kill -s SIGCONT %1
$ jobs
[1]+  Running                 ./download-updates.bash w60 deu,enu -includesp > /dev/null 2>&1 &


This also stops and resumes other processes started from the script, such as hashdeep and wget. But if the connection is lost in the meantime, wget may not be able to simply resume the download.
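Note that the %1 job specs above rely on job control at an interactive prompt; in a non-interactive script you can target the process ID instead. A minimal sketch of the stop/resume cycle, using sleep as a stand-in for download-updates.bash:

```shell
# Minimal sketch: pausing and resuming a background process with
# SIGSTOP/SIGCONT. 'sleep' stands in for download-updates.bash; in a
# script, $! gives the PID of the last background job.
sleep 60 &
pid=$!

kill -s STOP "$pid"
sleep 1                                  # give the signal time to land
state_stopped=$(ps -o stat= -p "$pid")   # contains "T" while stopped

kill -s CONT "$pid"
sleep 1
state_running=$(ps -o stat= -p "$pid")   # back to "S" (sleeping)

kill "$pid"                              # clean up the stand-in process
echo "stopped=$state_stopped running=$state_running"
```

A stopped process consumes no CPU but keeps its memory and open connections, which is why a long pause can still lead to the server dropping the connection.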


The file integrity check may also interfere here and remove partial files between different runs. Note that the SHA-1 hash is embedded in the filename of most security updates as a 40-digit hexadecimal number; in your example, the SHA-1 hash is "e7918b1dff5622b1a03c1e599c6251d3b11f8f33". After downloading, hashdeep recalculates all hashes and saves them to the file ../client/md/hashes-w100-x64-glb.txt. The expected values are then compared to the calculated values, and files that do not validate are moved to the trash (or deleted directly). You can selectively disable this test by commenting out the line:

Code: Select all
verify_embedded_checksums "$hashed_dir" "$hashes_file"


in the file 60-main-updates.bash.
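The embedded-checksum test can be reproduced by hand with plain sha1sum. A rough sketch (wsusoffline itself uses hashdeep, and the file here is a throwaway stand-in created on the spot, not a real update):

```shell
# Sketch: verifying the 40-digit SHA-1 hash embedded in an update's
# filename. A throwaway file is created and named after its own hash,
# mimicking the naming scheme of the security updates.
workdir=$(mktemp -d)
printf 'demo payload' > "$workdir/payload.bin"
hash=$(sha1sum "$workdir/payload.bin" | cut -d' ' -f1)
file="$workdir/demo-update_${hash}.cab"
mv "$workdir/payload.bin" "$file"

# Extract the hash between the last '_' and the extension, then compare.
embedded=${file##*_}
embedded=${embedded%.cab}
actual=$(sha1sum "$file" | cut -d' ' -f1)
if [ "$embedded" = "$actual" ]; then
    result="hash OK"
else
    result="hash MISMATCH"
fi
echo "$result"
rm -r "$workdir"
```

A partially downloaded file will of course fail this comparison, which is why the check removes partial files between runs unless it is disabled.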


I'm not sure whether the other downloader, aria2c, might help. It is often used to speed up a slow download by using multiple connections. In a content delivery network this sometimes results in "502: Bad Gateway" errors, but only if multiple connections are used to download the same file. You could still try it and modify the aria2c options to:

Code: Select all
readonly aria2c_optimized_options="--log-level=notice --max-connection-per-server=5"


Aria2 also creates a saved-state file for partial downloads and can resume the download exactly where it left off.

The preferred method to enable aria2c is to rename the file preferences-template.bash to preferences.bash, and then to change the option "$supported_downloaders" to:

Code: Select all
supported_downloaders="aria2c wget"



Regards
hbuhrmester
 
