Commit Graph

417 Commits

Author SHA1 Message Date
Ken Jin
a2f0654b0a bpo-42967: Fix urllib.parse docs and make logic clearer (GH-24536) 2021-02-15 09:00:20 -08:00
Adam Goldschmidt
fcbe0cb04d bpo-42967: only use '&' as a query string separator (#24297)
bpo-42967: [security] Address a web cache-poisoning issue reported in urllib.parse.parse_qsl().

urllib.parse will only us "&" as query string separator by default instead of both ";" and "&" as allowed in earlier versions. An optional argument seperator with default value "&" is added to specify the separator.


Co-authored-by: Éric Araujo <merwok@netwok.org>
Co-authored-by: blurb-it[bot] <43283697+blurb-it[bot]@users.noreply.github.com>
Co-authored-by: Ken Jin <28750310+Fidget-Spinner@users.noreply.github.com>
Co-authored-by: Éric Araujo <merwok@netwok.org>
2021-02-14 14:41:57 -08:00
Senthil Kumaran
030a713183 Allow / character in username,password fields in _PROXY envvars. (#23973) 2020-12-29 04:18:42 -08:00
Christian Heimes
f97406be4c bpo-40968: Send http/1.1 ALPN extension (#20959)
Signed-off-by: Christian Heimes <christian@python.org>
2020-11-13 16:37:52 +01:00
Ronald Oussoren
93a1ccabde bpo-41471: Ignore invalid prefix lengths in system proxy settings on macOS (GH-22762) 2020-10-19 20:16:21 +02:00
Batuhan Taşkaya
0361556537 bpo-39481: PEP 585 for a variety of modules (GH-19423)
- concurrent.futures
- ctypes
- http.cookies
- multiprocessing
- queue
- tempfile
- unittest.case
- urllib.parse
2020-04-10 07:46:36 -07:00
Victor Stinner
0b297d4ff1 bpo-39503: CVE-2020-8492: Fix AbstractBasicAuthHandler (GH-18284)
The AbstractBasicAuthHandler class of the urllib.request module uses
an inefficient regular expression which can be exploited by an
attacker to cause a denial of service. Fix the regex to prevent the
catastrophic backtracking. Vulnerability reported by Ben Caller
and Matt Schwager.

AbstractBasicAuthHandler of urllib.request now parses all
WWW-Authenticate HTTP headers and accepts multiple challenges per
header: use the realm of the first Basic challenge.

Co-Authored-By: Serhiy Storchaka <storchaka@gmail.com>
2020-04-02 02:52:20 +02:00
Stephen Balousek
5e260e0fde bpo-39548: Fix handling of 'WWW-Authenticate' header for Digest Auth (GH-18338)
* bpo-39548: Fix handling of 'WWW-Authenticate' header for Digest authentication

 - The 'qop' value in the 'WWW-Authenticate' header is optional. The
   presence of 'qop' in the header should be checked before its value
   is parsed with 'split'.

Signed-off-by: Stephen Balousek <stephen@balousek.net>

* bpo-39548: Fix handling of 'WWW-Authenticate' header for Digest authentication

 - Add NEWS item

Signed-off-by: Stephen Balousek <stephen@balousek.net>

* Update Misc/NEWS.d/next/Library/2020-02-06-05-33-52.bpo-39548.DF4FFe.rst

Co-Authored-By: Brandt Bucher <brandtbucher@gmail.com>

Co-authored-by: Brandt Bucher <brandtbucher@gmail.com>
2020-02-29 12:31:58 -08:00
idomic
c33bdbb20c bpo-37970: update and improve urlparse and urlsplit doc-strings (GH-16458) 2020-02-16 21:17:58 +02:00
Serhiy Storchaka
6a265f0d0c bpo-39057: Fix urllib.request.proxy_bypass_environment(). (GH-17619)
Ignore leading dots and no longer ignore a trailing newline.
2020-01-05 14:14:31 +02:00
PypeBros
14a89c4798 bpo-38686: fix HTTP Digest handling in request.py (#17045)
* fix HTTP Digest handling in request.py

There is a bug triggered when server replies to a request with `WWW-Authenticate: Digest` where `qop="auth,auth-int"` rather than mere `qop="auth"`. Having both `auth` and `auth-int` is legitimate according to the `qop-options` rule in §3.2.1 of [[https://www.ietf.org/rfc/rfc2617.txt|RFC 2617]]:
>      qop-options       = "qop" "=" <"> 1#qop-value <">
>      qop-value         = "auth" | "auth-int" | token
> **qop-options**: [...] If present, it is a quoted string **of one or more** tokens indicating the "quality of protection" values supported by the server.  The value `"auth"` indicates authentication; the value `"auth-int"` indicates authentication with integrity protection

This is description confirmed by the definition of the [//n//]`#`[//m//]//rule// extended-BNF pattern defined in §2.1 of [[https://www.ietf.org/rfc/rfc2616.txt|RFC 2616]] as 'a comma-separated list of //rule// with at least //n// and at most //m// items'.

When this reply is parsed by `get_authorization`, request.py only tests for identity with `'auth'`, failing to recognize it as one of the supported modes the server announced, and claims that `"qop 'auth,auth-int' is not supported"`.

* 📜🤖 Added by blurb_it.

* bpo-38686 review fix: remember why.

* fix trailing space in Lib/urllib/request.py

Co-Authored-By: Brandt Bucher <brandtbucher@gmail.com>
2019-11-22 15:19:08 -08:00
Pablo Galindo
293dd23477 Remove binding of captured exceptions when not used to reduce the chances of creating cycles (GH-17246)
Capturing exceptions into names can lead to reference cycles though the __traceback__ attribute of the exceptions in some obscure cases that have been reported previously and fixed individually. As these variables are not used anyway, we can remove the binding to reduce the chances of creating reference cycles.

See for example GH-13135
2019-11-19 21:34:03 +00:00
Tim Graham
5a88d50ff0 bpo-27657: Fix urlparse() with numeric paths (#661)
* bpo-27657: Fix urlparse() with numeric paths

Revert parsing decision from bpo-754016 in favor of the documented
consensus in bpo-16932 of how to treat strings without a // to
designate the netloc.

* bpo-22891: Remove urlsplit() optimization for 'http' prefixed inputs.
2019-10-18 06:07:20 -07:00
Stein Karlsen
aad2ee0156 bpo-32498: urllib.parse.unquote also accepts bytes (GH-7768) 2019-10-14 13:36:29 +03:00
Zackery Spytz
b761e3aed1 bpo-25068: urllib.request.ProxyHandler now lowercases the dict keys (GH-13489) 2019-09-13 15:07:07 +01:00
Ashwin Ramaswami
ff2e182865 bpo-12707: deprecate info(), geturl(), getcode() methods in favor of headers, url, and status properties for HTTPResponse and addinfourl (GH-11447)
Co-Authored-By: epicfaace <aramaswamis@gmail.com>
2019-09-13 12:40:07 +01:00
Rémi Lapeyre
8047e0e1c6 bpo-35922: Fix RobotFileParser when robots.txt has no relevant crawl delay or request rate (GH-11791)
Co-Authored-By: Tal Einat <taleinat+github@gmail.com>
2019-06-16 09:48:57 +03:00
Steve Dower
8d0ef0b5ed bpo-36742: Corrects fix to handle decomposition in usernames (#13812) 2019-06-04 17:55:29 +02:00
Rémi Lapeyre
674ee12600 bpo-35397: Remove deprecation and document urllib.parse.unwrap (GH-11481) 2019-05-27 09:43:45 -04:00
Steve Dower
b82e17e626 bpo-36842: Implement PEP 578 (GH-12613)
Adds sys.audit, sys.addaudithook, io.open_code, and associated C APIs.
2019-05-23 08:45:22 -07:00
Victor Stinner
0c2b6a3943 bpo-35907, CVE-2019-9948: urllib rejects local_file:// scheme (GH-13474)
CVE-2019-9948: Avoid file reading as disallowing the unnecessary URL
scheme in URLopener().open() and URLopener().retrieve()
of urllib.request.

Co-Authored-By: SH <push0ebp@gmail.com>
2019-05-22 22:15:01 +02:00
Xtreak
c661b30f89 bpo-36948: Fix NameError in urllib.request.URLopener.retrieve (GH-13389) 2019-05-19 16:40:05 +03:00
Steve Dower
d537ab0ff9 bpo-36742: Fixes handling of pre-normalization characters in urlsplit() (GH-13017) 2019-04-30 12:03:02 +00:00
Jörn Hees
750d74fac5 bpo-12910: update and correct quote docstring (#2568)
Fixes some mistakes and misleadings in the quote function docstring:
- reserved chars are never actually used by quote code, unreserved chars are
- reserved chars were wrong and incomplete
- mentioned that use-case is not minimal quoting wrt. RFC, but cautious quoting
2019-04-09 17:31:18 -07:00
Serhiy Storchaka
da0847048a bpo-36431: Use PEP 448 dict unpacking for merging two dicts. (GH-12553) 2019-03-27 08:02:28 +02:00