Replace ioutil.WriteFile with os.WriteFile since the former has been
deprecated since go1.16 and simply calls the latter.
Signed-off-by: Miguel Pires <miguel.pires@canonical.com>
Add a SNAP_CLIENT_DEBUG_HTTP bitfield environment variable that allows
logging requests, responses and bodies that the client receives from
the REST API. Bodies are never logged for local installs because they
contain packaged snaps, whose size would make the log unreadable.
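One way such a bitfield could be decoded is sketched below. The bit assignments and the names debugRequest, debugResponse, debugBody and parseDebugFlags are assumptions made for illustration; this message does not state the actual values used by the client:

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// Hypothetical bit assignments; the real ones are not given in this message.
const (
	debugRequest  = 1 << iota // log outgoing requests
	debugResponse             // log responses
	debugBody                 // log request and response bodies
)

// parseDebugFlags turns the raw environment value into a bitfield.
// An unset, negative or non-numeric value disables all logging.
func parseDebugFlags(raw string) int {
	n, err := strconv.Atoi(raw)
	if err != nil || n < 0 {
		return 0
	}
	return n
}

func main() {
	flags := parseDebugFlags(os.Getenv("SNAP_CLIENT_DEBUG_HTTP"))
	fmt.Println("log requests: ", flags&debugRequest != 0)
	fmt.Println("log responses:", flags&debugResponse != 0)
	fmt.Println("log bodies:   ", flags&debugBody != 0)
}
```

Under these assumed values, setting the variable to 3 would log requests and responses but not bodies.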
Signed-off-by: Miguel Pires <miguel.pires@canonical.com>
This commit adds code to deal with the issue that the time may
be wildly off when snapd tries to register the serial. For devices
without an RTC the date may be so far in the past that the SSL
certificates are not valid yet. To fix this the following changes are
made:
* httputil: add `CertExpiredOrNotValidYet()` helper
This helper can be used to check if the error is that the
certificate is expired or not yet valid. This is useful
to detect situations like when the time has not yet been
synchronized from an NTP source.
* devicestate: retry serial acquire on time based certificate errors
When the serial assertion cannot be acquired because the certificate
of the remote system is expired or not yet valid then the most
likely reason for this is that the system clock is off. This case
is now treated similarly to no-network errors, i.e. snapd
will retry acquiring the serial and will only go into a slow backoff
mode. This helps with the issue that on systems without an RTC,
when the device comes up and the NTP sync is slow, the serial
is (re)tried 3 times and then it goes into a very long backoff
(as defined in DeviceManager.ensureOperationalShouldBackoff()).
A gradual backoff is still used to not overwhelm the servers and
it is only tried for a bit more than 2048s because that is the maximum
time it takes for timesyncd to wait before trying an NTP sync.
When using SNAPD_DEBUG_HTTP, it was also necessary to set SNAPD_DEBUG
for the HTTP traffic to be logged. So it was impossible to log the HTTP
traffic without logging everything. This commit changes that so that
setting the former logs the HTTP traffic and setting SNAPD_DEBUG
enables the debug logs in the rest of snapd.
Signed-off-by: Miguel Pires <miguel.pires@canonical.com>
It appears that ResponseHeaderTimeout needs to be set explicitly: a server that never sends response headers doesn't seem to be covered by any other timeout.
This should fix the download getting stuck issue.
For more context, see the stacktrace obtained from a system where download got stuck: https://pastebin.canonical.com/p/nZkRMTBbv3/
(the affected goroutine seems to be sitting inside the for-select loop of /usr/lib/go-1.10/src/net/http/transport.go:2033)
(to play with it, comment out ResponseHeaderTimeout: ... from transport.go and change the test timeout from 5 * time.Second to a huge value; the test should then get stuck)
* Set ResponseHeaderTimeout on the default transport.
* Use 250ms for mocked ResponseHeaderTimeout.
* Bump ResponseHeaderTimeout to 15s.
* Bump test timeout to avoid potential issues with LP builds.
The detection of timeout coupled with context.DeadlineExceeded errors in
http.Client.do() is racy, and boils down to `time.Now().After(deadline)` where
deadline is based on the timeout setting of the client.
This has been partially fixed in 1.14 in
7fc2625ef1 but
feels more reliable if returned via err when reading the response body.
Make the tests expect either the proper `Client.Timeout exceeded while awaiting
headers` or context deadline exceeded error.
Signed-off-by: Maciej Borzecki <maciej.zenon.borzecki@canonical.com>
On the (very slow) armhf builder one unit test failed with:
```
FAIL: retry_test.go:359: retrySuite.TestRetryRequestTimeoutHandling
retry_test.go:411:
// check that we exhausted all retries (as defined by mocked retry strategy)
c.Assert(permanentlyBrokenSrvCalls.Count(), Equals, 5)
... obtained int = 4
... expected int = 5
```
This is caused by the retry strategy limit of 1s. The timeout of
the retry was increased to 100ms recently and it looks like this
can sometimes cause the test to go over the 1s retry strategy
limit now. This commit increases this timeout to make this much
less likely.
The test uses a short 25ms connect timeout. Slow (overcommitted)
systems may fail because they are unable to reply in this time.
This commit changes the timeout to 100ms which makes it at least
4x less likely to hit this condition. It also makes the test
slower unfortunately (by ~400ms).
* travis.yml: run unit tests with go/master as well
The unit tests of snapd are broken currently for golang-1.14. This
was observed on debian-sid. We did not catch this. To ensure we
get an early warning about failures with the latest go this PR
adds "master" to the go versions to run the unit tests against.
This commit also fixes the tests that are broken with Go 1.14.
introduce a new package snapdenv to present the common env options
for snapd components
start with snapdenv.Testing exposing SNAPPY_TESTING
it doesn't make sense for *util packages to use snapdenv directly,
as a consequence move (Set)UserAgent from httputil to snapdenv