108 Commits

Author SHA1 Message Date
Julian Andres Klode 994515e689 https: Quote path in URL before passing it to curl
Curl requires URLs to be urlencoded. We are however giving it
unencoded URLs. This causes it to go completely nuts if there is
a space in the URI, producing requests like:

    GET /a file HTTP/1.1

which the servers then interpret as a GET request for "/a" with
HTTP version "file" or some other nonsense.

This works around the issue by encoding the path component of
the URL. I'm not sure if we should encode other parts of the URL
as well; this one seems to do the trick for the actual issue at
hand.

A more correct fix is to avoid the dequoting and (re-)quoting
of URLs when a redirect occurs / a new request is sent. That's
been on the radar for probably a year or two now, but nobody
bothered implementing that yet.

LP: #1651923
2017-01-17 01:59:15 +01:00
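
To illustrate the kind of path quoting the commit above describes, here is a minimal sketch, assuming that every byte outside the RFC 3986 "unreserved" set (other than the '/' separator) gets percent-encoded. The function name QuotePath is hypothetical and is not taken from apt's source.

```cpp
#include <string>

// Hypothetical helper (not apt's actual function): percent-encode every
// byte of a URL path that is neither an RFC 3986 "unreserved" character
// nor the '/' path separator.
std::string QuotePath(std::string const &path)
{
   static char const hex[] = "0123456789ABCDEF";
   std::string out;
   for (unsigned char const c : path)
   {
      bool const unreserved = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                              (c >= '0' && c <= '9') ||
                              c == '-' || c == '.' || c == '_' || c == '~';
      if (unreserved || c == '/')
         out += static_cast<char>(c);
      else
      {
         // emit %XX with the two hex digits of the byte
         out += '%';
         out += hex[c >> 4];
         out += hex[c & 0x0F];
      }
   }
   return out;
}
```

With this sketch, a path such as "/a file" becomes "/a%20file". Note that an input which is already encoded would be encoded a second time, since '%' is not in the unreserved set.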
David Kalnischkies d8617331af rename ServerMethod to BaseHttpMethod
This 'method' is the abstract base for http and https and should as such
be called out like this rather than using an easily confused name.

Gbp-Dch: Ignore
2016-12-31 02:29:21 +01:00
David Kalnischkies 13a9f08de1 separating state variables regarding server/request
Having a Reset(bool) method to partially reset certain variables like
the download size was always strange, so this commit splits a
RequestState off from the ServerState, living on the stack for as
long as we deal with this request, which causes an automatic "reset".

There is much to do still to make this code look better, but this is a
good first step which compiles cleanly and passes all tests, so keeping
it as history might be beneficial. Because it avoids explicit memory
allocations, it ends up fixing a small memory leak in https, too.

Closes: #440057
2016-12-31 02:29:21 +01:00
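
The split described above can be sketched as follows. This is only an illustration of the idea of a stack-scoped per-request state; the member names and the HandleRequest function are assumptions, not the actual apt definitions.

```cpp
#include <string>

// Illustrative stand-ins; the real structs have different members.
struct ServerState
{
   std::string Host;
   bool Pipeline = true;                 // persists across requests
};

struct RequestState
{
   ServerState &Server;                  // long-lived state is only referenced
   unsigned long long DownloadSize = 0;  // per-file value, fresh every time
   explicit RequestState(ServerState &server) : Server(server) {}
};

// Hypothetical request handler: a new RequestState is constructed on the
// stack for each call, so per-file values are "reset" by going out of scope.
unsigned long long HandleRequest(ServerState &server, unsigned long long const bytes)
{
   RequestState req(server);
   req.DownloadSize += bytes;
   return req.DownloadSize;
}
```

No explicit Reset call and no heap allocation is needed in this arrangement, which is the property the commit message attributes the leak fix to.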
Lukasz Kawczynski 49b91f6903 Honour Acquire::ForceIPv4/6 in the https transport 2016-12-08 13:48:12 +00:00
David Kalnischkies d94b1d80d8 don't send Range requests if we know they are not accepted
If the server told us in a previous request that it doesn't support
byte ranges (via an Accept-Ranges header which doesn't list bytes), we
don't try to formulate requests using Ranges.
2016-08-16 18:49:37 +02:00
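
A minimal sketch of the bookkeeping described above, assuming the server state remembers one flag that starts out optimistic and is updated whenever an Accept-Ranges header is seen. The names AcceptsByteRanges and ServerRangeInfo are hypothetical.

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>

// Returns true if an Accept-Ranges header value lists the "bytes" unit.
bool AcceptsByteRanges(std::string const &headerValue)
{
   std::istringstream units(headerValue);
   std::string unit;
   while (std::getline(units, unit, ','))
   {
      // drop blanks and compare case-insensitively
      unit.erase(std::remove_if(unit.begin(), unit.end(),
                                [](unsigned char c) { return c == ' ' || c == '\t'; }),
                 unit.end());
      std::transform(unit.begin(), unit.end(), unit.begin(),
                     [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
      if (unit == "bytes")
         return true;
   }
   return false;
}

// Hypothetical per-server record: Range requests are attempted until a
// response tells us that bytes are not an accepted unit.
struct ServerRangeInfo
{
   bool RangesAllowed = true;
   void NoteAcceptRanges(std::string const &value)
   {
      RangesAllowed = AcceptsByteRanges(value);
   }
};
```

A caller would consult RangesAllowed before adding a Range header to the next request for a partially downloaded file.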
David Kalnischkies ebdb6f1810 reorganize server-states resetting in http/https
We keep various bits of information about the server around, some only
affecting the currently handled file (like sizes) while others
should be persistent (like pipeline detection). http used to reset all
file-related ones manually, which is a bit silly if we already have a
Reset() method (which resets everything, though), so this extends it with
a parameter for reuse and calls it from https too (which previously
reset by just creating a new state struct, as it uses no value of the
persistent state-keeping yet since it supports no pipelining).

Gbp-Dch: Ignore
2016-08-16 18:49:37 +02:00
David Kalnischkies 0568d325ad http: auto-configure for local Tor proxy if called as 'tor'
With apt's http transport supporting socks5h proxies and all the work
done on configuring methods based on the name they are called with,
it becomes surprisingly easy to implement Tor support equalling (and
perhaps even slightly exceeding) what is currently available in
apt-transport-tor.

How this will turn out to be handled packaging wise we will see in
https://lists.debian.org/deity/2016/08/msg00012.html , but until this is
resolved we can add the needed support without actively enabling it for
now, so that this can be tested better.
2016-08-11 01:34:39 +02:00
David Kalnischkies 3006044202 implement generic config fallback for methods
The https method has for a long while implemented a hardcoded fallback
to the same options in http, which, while it works, is rather inflexible
if we want to allow the methods to use another name to change their
behavior slightly, like apt-transport-tor does to https – most of the
diff being s#https#tor#g – which then fails to do the full-circle
fallthrough tor -> https -> http for https sources. With this config
infrastructure this could be implemented now.
2016-08-10 23:19:44 +02:00
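
The fallthrough tor -> https -> http mentioned above can be sketched as a lookup that tries each method name in turn. This is an assumption-laden illustration: the Config alias, the FindWithFallback function, and the flat std::map are stand-ins and say nothing about how apt's configuration classes actually work.

```cpp
#include <map>
#include <string>
#include <vector>

// Stand-in for a configuration store; keys look like Acquire::<method>::<option>.
using Config = std::map<std::string, std::string>;

// Hypothetical lookup: try each method name in order and return the first
// value that is set, or the given default if none is.
std::string FindWithFallback(Config const &conf,
                             std::vector<std::string> const &methods,
                             std::string const &option,
                             std::string const &fallback = "")
{
   for (auto const &method : methods)
   {
      auto const hit = conf.find("Acquire::" + method + "::" + option);
      if (hit != conf.end())
         return hit->second;
   }
   return fallback;
}
```

Calling it with the list {"tor", "https", "http"} gives the full chain, while a plain https source would pass {"https", "http"}.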
David Kalnischkies 4bba5a88d0 use the same redirection handling for http and https
cURL which backs our https implementation can handle redirects on its
own, but by dealing with them on our own we gain finer control over which
redirections will be performed (we don't like https → http) and by whom
so that redirections to other hosts correctly spawn a new https method
dealing with these instead of letting the current one deal with it.
2016-08-10 23:19:44 +02:00
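
The redirect policy described above can be sketched as a small classifier, assuming three possible outcomes. The enum, the SplitUrl parser, and ClassifyRedirect are hypothetical names; the parsing is deliberately naive and treats any port or user information as part of the host.

```cpp
#include <string>

enum class RedirectAction { Refuse, SameMethod, NewMethod };

struct UrlParts
{
   std::string scheme;
   std::string host;
};

// Naive split of scheme://host[/...]; enough for this illustration only.
UrlParts SplitUrl(std::string const &url)
{
   UrlParts parts;
   auto const sep = url.find("://");
   if (sep == std::string::npos)
      return parts;
   parts.scheme = url.substr(0, sep);
   auto const hostStart = sep + 3;
   auto const hostEnd = url.find('/', hostStart);
   parts.host = url.substr(hostStart, hostEnd == std::string::npos
                                         ? std::string::npos
                                         : hostEnd - hostStart);
   return parts;
}

// Refuse downgrades from https to http, keep same-host redirects in the
// current method, and hand everything else to a new method instance.
RedirectAction ClassifyRedirect(std::string const &from, std::string const &to)
{
   UrlParts const a = SplitUrl(from);
   UrlParts const b = SplitUrl(to);
   if (b.scheme.empty() || b.host.empty())
      return RedirectAction::Refuse;
   if (a.scheme == "https" && b.scheme == "http")
      return RedirectAction::Refuse;
   if (a.scheme == b.scheme && a.host == b.host)
      return RedirectAction::SameMethod;
   return RedirectAction::NewMethod;
}
```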
David Kalnischkies ece81b7517 fail on unsupported http/https proxy settings
Closes: #623443
2016-08-10 23:19:44 +02:00
David Kalnischkies d415fc795a support all socks-proxy known to curl in https method 2016-08-10 23:19:44 +02:00
David Kalnischkies b50dfa6b2d report all instead of first error up the acquire chain
If we don't give a specific error to report up, it is likely that all
errors currently in the error stack are equally important, so reporting
just one could turn out to be confusing, e.g. if name resolution failed
in a SRV record list.
2016-07-06 15:53:59 +02:00
David Kalnischkies 0b45b6e5de use +0000 instead of UTC by default as timezone in output
All apt versions support numeric as well as 3-character timezones just
fine and it's actually hard to write code which doesn't "accidentally"
accept it. So why change? Documenting the Date/Valid-Until fields in
the Release file is easy to do in terms of referencing the
datetime format used e.g. in the Debian changelogs (policy §4.4). This
format specifies only the numeric timezones though, not the nowadays
obsolete 3-character ones, so in the interest of least surprise we should
use the same format even though it carries a small risk of regression
in other clients (which encounter repositories created with
apt-ftparchive).

In case it is really regressing in practice, the hidden option
  -o APT::FTPArchive::Release::NumericTimezone=0
can be used to go back to good old UTC as timezone.

The EDSP and EIPP protocols use this 'new' format; the text interface
used to communicate with the acquire methods does not, for compatibility
reasons, even if none of our methods would be affected and I doubt any
other would (in these instances the timezone is 'GMT' as that is what
HTTP/1.1 requires). Note that this is only true for apt talking to
methods; (libapt-based) methods talking to apt will respond with the
'new' format.  It is therefore strongly advised to support both also in
method input.
2016-07-02 12:01:17 +02:00
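
The two output styles discussed above can be sketched with strftime. This is a minimal illustration, assuming the default "C" locale for day and month names; FormatReleaseDate and its numericTimezone parameter are hypothetical names, not apt's API.

```cpp
#include <ctime>
#include <string>

// Hypothetical formatter: renders a timestamp in UTC with either the
// numeric "+0000" suffix or the older 3-character "UTC" suffix.
// Day and month names depend on LC_TIME; the "C" locale is assumed.
std::string FormatReleaseDate(std::time_t const when, bool const numericTimezone = true)
{
   std::tm const * const utc = std::gmtime(&when);
   if (utc == nullptr)
      return "";
   char const * const format = numericTimezone
                                  ? "%a, %d %b %Y %H:%M:%S +0000"
                                  : "%a, %d %b %Y %H:%M:%S UTC";
   char buffer[64];
   if (std::strftime(buffer, sizeof(buffer), format, utc) == 0)
      return "";
   return buffer;
}
```

A reader that wants to stay compatible with both old and new producers would accept either suffix on input, as the commit message advises.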
David Kalnischkies 8b79c94af7 use std::locale::global instead of setlocale
We use a wild mixture of C and C++ ways of generating output, so having
a consistent world-view in both styles sounds like a good idea and
should help in preventing regressions.
2016-05-28 18:12:02 +02:00
Patrick Cable 8707edd9e4 refactored no_proxy code to work regardless of where https proxy is set
when using the https transport mechanism, $no_proxy is ignored if apt is
getting its proxy information from $https_proxy (as opposed to
Acquire::https::Proxy somewhere in apt config). if the source of proxy
information is Acquire::https::Proxy set in apt.conf (or apt.conf.d),
then $no_proxy is honored.
2016-04-27 16:55:55 -04:00
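
The intended behavior after this refactoring can be sketched as: check $no_proxy first, then pick the proxy from whichever source is set. This is a simplified illustration; the matching rules (exact host, domain suffix, or "*"), the precedence of the config value over the environment, and the names HostInNoProxy and ChooseProxy are all assumptions.

```cpp
#include <sstream>
#include <string>

// Simplified matcher: true if host equals an entry of the comma-separated
// list, is a subdomain of one, or the list contains "*".
bool HostInNoProxy(std::string const &host, std::string const &noProxy)
{
   std::istringstream entries(noProxy);
   std::string entry;
   while (std::getline(entries, entry, ','))
   {
      auto const first = entry.find_first_not_of(" \t");
      if (first == std::string::npos)
         continue;
      auto const last = entry.find_last_not_of(" \t");
      entry = entry.substr(first, last - first + 1);
      if (entry == "*")
         return true;
      if (entry[0] == '.')
         entry.erase(0, 1);
      if (entry.empty())
         continue;
      if (host == entry)
         return true;
      // suffix match on a label boundary: deb.example.org matches example.org
      if (host.size() > entry.size() &&
          host.compare(host.size() - entry.size(), std::string::npos, entry) == 0 &&
          host[host.size() - entry.size() - 1] == '.')
         return true;
   }
   return false;
}

// The no_proxy check applies no matter where the proxy value came from.
// An empty return value means "connect directly".
std::string ChooseProxy(std::string const &host, std::string const &configProxy,
                        std::string const &envProxy, std::string const &noProxy)
{
   if (HostInNoProxy(host, noProxy))
      return "";
   return configProxy.empty() ? envProxy : configProxy;
}
```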
Julian Andres Klode 74dedb4ae2 Convert most callers of isspace() to isspace_ascii()
This converts all callers that read machine-generated data;
callers that might work with user input are not converted.
2015-12-27 01:20:41 +01:00
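
A locale-independent whitespace test of the kind this commit switches to could look like the sketch below. It is written from the general idea only and is named IsSpaceAscii to make clear that it is not a copy of apt's isspace_ascii().

```cpp
#include <cstddef>
#include <string>

// Treats exactly the six ASCII whitespace characters as space, regardless
// of the current locale: ' ', '\t', '\n', '\v', '\f' and '\r'.
bool IsSpaceAscii(int const c)
{
   return c == ' ' || (c >= '\t' && c <= '\r');
}

// Example use on machine-generated data: skip leading whitespace.
std::size_t SkipLeadingSpace(std::string const &line)
{
   std::size_t pos = 0;
   while (pos < line.size() && IsSpaceAscii(static_cast<unsigned char>(line[pos])))
      ++pos;
   return pos;
}
```

Unlike isspace(), the result for bytes above 127 never changes with the locale, which is what makes it suitable for parsing file formats.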
David Kalnischkies 258b9e512c apply various suggestions made by cppcheck
Reported-By: cppcheck
Git-Dch: Ignore
2015-11-05 12:21:33 +01:00
David Kalnischkies ce1f3a2c61 wrap every unlink call to check for != /dev/null
Unlinking /dev/null is bad, we shouldn't do that. Also, we should print
at least a warning if we tried to unlink a file but didn't manage to
pull it off (ignoring the case where the file is /dev/null or doesn't
exist in the first place).

This got triggered by a relatively unlikely-to-cause-problems issue in
pkgAcquire::Worker::PrepareFiles, which would, while handling temporarily
uncompressed files (which are set to keep compressed), figure out that two
files are the same and prepare for sharing by deleting them. Bad move.
That also shows why not printing a warning is a bad idea, as this hides
the error in non-root test runs.

Git-Dch: Ignore
2015-11-04 18:42:28 +01:00
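
The wrapper described above can be sketched as follows, assuming a POSIX system. RemoveFileChecked is a hypothetical name and the warning format is made up for the illustration.

```cpp
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <string>
#include <unistd.h>

// Hypothetical wrapper around unlink(): never touches /dev/null, treats a
// missing file as success, and warns about every other failure.
bool RemoveFileChecked(std::string const &caller, std::string const &path)
{
   if (path == "/dev/null")
      return true;   // pretend success, but leave the null device alone
   errno = 0;
   if (unlink(path.c_str()) == 0)
      return true;
   if (errno == ENOENT)
      return true;   // nothing to remove
   std::fprintf(stderr, "W: %s: could not remove %s: %s\n",
                caller.c_str(), path.c_str(), std::strerror(errno));
   return false;
}
```

Passing the caller's name makes the warning traceable to the call site, which helps when the failure only shows up in some environments.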
David Kalnischkies bce8e59b81 set failreasons similar to connect.cc based on curl errors
Detecting network errors has some benefits in the acquire system: if
we can't connect to a host, trying it for a million files is pointless.
http and co, which use connect.cc, deal with this, but https, which uses
curl, treated connection failures as "normal" errors which could potentially
be worked around (like trying Release instead of the failed InRelease).

Git-Dch: Ignore
2015-11-04 18:04:01 +01:00
David Kalnischkies 830a1b8c9e fix two memory leaks reported by gcc
Reported-By: gcc -fsanitize=address -fno-sanitize=vptr
Git-Dch: Ignore
2015-09-14 15:22:18 +02:00
Michael Vogt 4fc6b7570c Merge branch 'debian/sid' into debian/experimental
Conflicts:
	apt-pkg/pkgcache.h
	debian/changelog
	methods/https.cc
	methods/server.cc
	test/integration/test-apt-download-progress
2015-05-22 17:01:03 +02:00
Michael Vogt 65759e00ef Update methods/https.cc now that ServerState::Size is renamed
Git-Dch: ignore
2015-05-22 16:30:29 +02:00
Michael Vogt 0f3150e704 Merge remote-tracking branch 'upstream/debian/jessie' into debian/sid
Conflicts:
	apt-pkg/deb/dpkgpm.cc
2015-05-22 16:17:08 +02:00
Michael Vogt 6291f60e86 Rename "Size" in ServerState to TotalFileSize
The variable "Size" was misleading and caused bug #1445239. To
avoid similar issues in the future, rename it to make the meaning
more obvious.

git-dch: ignore
2015-05-22 15:40:18 +02:00
David Kalnischkies 8eafc75954 detect Releasefile IMS hits even if the server doesn't
Not all servers we are talking to support If-Modified-Since and some are
not even sending Last-Modified for us, so in an effort to detect such
hits we run a hashsum check on the 'old' compared to the 'new' file. We
got the hashes for the 'new' file already for "free" from the methods
anyway and hence just need to calculate the old ones.

This allows us to detect hits even with unsupported servers, which in
turn means we benefit from all the new hit behavior also here.
2015-05-13 16:09:12 +02:00
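
The detection described above boils down to one comparison, sketched here under the assumption that both hashes are available as strings of the same hash type. FetchResult and IsImsHit are hypothetical names; how the hashes are computed is left out on purpose.

```cpp
#include <string>

// Hypothetical summary of a finished download.
struct FetchResult
{
   bool ServerReportedHit;   // the server answered "not modified" itself
   std::string NewHash;      // hash of the downloaded file, from the method
};

// Treat the download as an If-Modified-Since hit if the server said so, or
// if the freshly downloaded file hashes to the same value as the old one.
bool IsImsHit(FetchResult const &result, std::string const &oldHash)
{
   if (result.ServerReportedHit)
      return true;
   return !oldHash.empty() && oldHash == result.NewHash;
}
```

An empty old hash (no previous file) is never treated as a hit in this sketch.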