It appears that ResponseHeaderTimeout needs to be set explicitly: a server that never sends response headers doesn't seem to be covered by any other timeout.
This should fix the issue of downloads getting stuck.
For more context, see the stack trace obtained from a system where a download got stuck: https://pastebin.canonical.com/p/nZkRMTBbv3/
(the affected goroutine seems to be sitting inside the for-select loop at /usr/lib/go-1.10/src/net/http/transport.go:2033)
(to reproduce, comment out ResponseHeaderTimeout: ... from transport.go and change the test timeout from 5 * time.Second to a huge value; the test should then get stuck)
* Set ResponseHeaderTimeout on the default transport.
* Use 250ms for mocked ResponseHeaderTimeout.
* Bump ResponseHeaderTimeout to 15s.
* Bump test timeout to avoid potential issues with LP builds.
The detection of timeouts coupled with context.DeadlineExceeded errors in
http.Client.do() is racy, and boils down to `time.Now().After(deadline)`, where
deadline is based on the timeout setting of the client.
This has been partially fixed in Go 1.14 in 7fc2625ef1, but it
feels more reliable if returned via err when reading the response body.
Make the tests expect either the proper `Client.Timeout exceeded while awaiting
headers` error or the `context deadline exceeded` error.
Signed-off-by: Maciej Borzecki <maciej.zenon.borzecki@canonical.com>
On the (very slow) armhf builder one unit test failed with:
```
FAIL: retry_test.go:359: retrySuite.TestRetryRequestTimeoutHandling
retry_test.go:411:
// check that we exhausted all retries (as defined by mocked retry strategy)
c.Assert(permanentlyBrokenSrvCalls.Count(), Equals, 5)
... obtained int = 4
... expected int = 5
```
This is caused by the retry strategy limit of 1s. The timeout of
the retry was increased to 100ms recently, and it looks like this
can now sometimes cause the test to go over the 1s retry strategy
limit. This commit increases this timeout to make this much
less likely.
The test uses a short 25ms connect timeout. Slow (overcommitted)
systems may fail because they are unable to reply in this time.
This commit changes the timeout to 100ms which makes it at least
4x less likely to hit this condition. It also makes the test
slower unfortunately (by ~400ms).
* travis.yml: run unit tests with go/master as well
The unit tests of snapd are currently broken with golang-1.14. This
was observed on debian-sid. We did not catch this. To ensure we
get an early warning about failures with the latest Go, this PR
adds "master" to the Go versions to run the unit tests against.
This commit also fixes the broken tests with 1.14.
introduce a new package snapdenv to present the common env options
for snapd components
start with snapdenv.Testing exposing SNAPPY_TESTING
it doesn't make sense for *util packages to use snapdenv directly,
as a consequence move (Set)UserAgent from httputil to snapdenv