97 Commits

Author SHA1 Message Date
Terry Jan Reedy
f106f8f29c whitespace 2014-02-23 23:39:57 -05:00
Terry Jan Reedy
9dc3a36c84 Issue #9974: When untokenizing, use row info to insert backslash+newline.
Original patches by A. Kuchling and G. Rees (#12691).
2014-02-23 23:33:08 -05:00
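
A minimal sketch of the round trip this change concerns, assuming full 5-tuples from tokenize.generate_tokens(); with row/column info available, untokenize() can reproduce the original line layout:

    import io
    import tokenize

    source = "x = 1\ny = 2\n"
    # Full TokenInfo tuples carry start/end positions, which untokenize()
    # uses to restore the original line breaks.
    tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))
    assert tokenize.untokenize(tokens) == source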
Terry Jan Reedy
5b8d2c3af7 Issue #8478: Untokenizer.compat now processes first token from iterator input.
Patch based on lines from Georg Brandl, Eric Snow, and Gareth Rees.
2014-02-17 23:12:16 -05:00
Terry Jan Reedy
5e6db31368 Untokenize: A logically incorrect assert tested user input validity.
Replace it with correct logic that raises ValueError for bad input.
Issues #8478 and #12691 reported the incorrect logic.
Add an Untokenize test case and an initial test method.
2014-02-17 16:45:48 -05:00
Serhiy Storchaka
768c16ce02 Issue #18960: Fix bugs with Python source code encoding in the second line.
* The first line of a Python script could be executed twice when the source
encoding (not equal to 'utf-8') was specified on the second line.

* Now the source encoding declaration on the second line isn't effective if
the first line contains anything except a comment.

* As a consequence, 'python -x' now works again with files that have the source
encoding declaration specified on the second line, and can again be used
to make Python batch files on Windows.

* The tokenize module now ignores the source encoding declaration on the second
line if the first line contains anything except a comment.

* IDLE now ignores the source encoding declaration on the second line if the
first line contains anything except a comment.

* 2to3 and the findnocoding.py script now ignore the source encoding
declaration on the second line if the first line contains anything except
a comment.
2014-01-09 18:36:09 +02:00
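
A minimal sketch of the tokenize side of this change, assuming current detect_encoding() behaviour: the coding cookie on line two only counts when line one is blank or nothing but a comment.

    import io
    import tokenize

    honored = b"# a comment\n# -*- coding: iso-8859-1 -*-\npass\n"
    ignored = b"import os\n# -*- coding: iso-8859-1 -*-\npass\n"

    print(tokenize.detect_encoding(io.BytesIO(honored).readline)[0])  # iso-8859-1
    print(tokenize.detect_encoding(io.BytesIO(ignored).readline)[0])  # utf-8 (cookie on line 2 ignored)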
Ezio Melotti
4bcc796acc #19620: Fix typo in docstring (noticed by Christopher Welborn). 2013-11-25 05:14:51 +02:00
Serhiy Storchaka
dafea85190 Issue #18873: The tokenize module, IDLE, 2to3, and the findnocoding.py script
now detect Python source code encoding only in comment lines.
2013-09-16 23:51:56 +03:00
Ezio Melotti
fafa8b7797 #16152: merge with 3.2. 2012-11-03 17:46:51 +02:00
Ezio Melotti
2cc3b4ba9f #16152: fix tokenize to ignore whitespace at the end of the code when no newline is found. Patch by Ned Batchelder. 2012-11-03 17:38:43 +02:00
Florent Xicluna
fed2c51eea Merge branch 2012-07-07 12:26:56 +02:00
Florent Xicluna
11f0b41e9d Issue #14990: tokenize: correctly fail with SyntaxError on invalid encoding declaration. 2012-07-07 12:13:35 +02:00
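A minimal sketch of the failure mode, assuming an unknown codec name in the coding cookie; detect_encoding() rejects it with SyntaxError:

    import io
    import tokenize

    bogus = b"# -*- coding: no-such-codec -*-\nprint('hi')\n"
    try:
        tokenize.detect_encoding(io.BytesIO(bogus).readline)
    except SyntaxError as exc:
        print("rejected:", exc)   # e.g. "unknown encoding: no-such-codec"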
Christian Heimes
0b3847de6d Issue #15096: Drop support for the ur string prefix 2012-06-20 11:17:58 +02:00
Meador Inge
8d5c0b8c19 Issue #15054: Fix incorrect tokenization of 'b' string literals.
Patch by Serhiy Storchaka.
2012-06-16 21:49:08 -05:00
Brett Cannon
c33f3f2339 Issue #14629: Mention the filename in SyntaxError exceptions from
tokenize.detect_encoding() (when available).
2012-04-20 13:23:54 -04:00
Martin v. Löwis
63c39fe38e merge 3.2: issue 14629 2012-04-20 14:37:17 +02:00
Martin v. Löwis
63674f4b52 Issue #14629: Raise SyntaxError in tokenize.detect_encoding
if the first two lines have non-UTF-8 characters without an encoding declaration.
2012-04-20 14:36:47 +02:00
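
A minimal sketch of the case this commit covers: undeclared non-UTF-8 bytes near the top of the file now surface as SyntaxError from detect_encoding():

    import io
    import tokenize

    raw = b"print('caf\xe9')\n"   # a latin-1 byte, no coding cookie
    try:
        tokenize.detect_encoding(io.BytesIO(raw).readline)
    except SyntaxError as exc:
        print(exc)   # e.g. "invalid or missing encoding declaration"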
Armin Ronacher
c0eaecafe9 Updated tokenize to support the inverse byte literals new in 3.3 2012-03-04 13:07:57 +00:00
Armin Ronacher
6ecf77b3f8 Basic support for PEP 414 without docs or tests. 2012-03-04 12:04:06 +00:00
Meador Inge
00c7f85298 Issue #2134: Add support for tokenize.TokenInfo.exact_type. 2012-01-19 00:44:45 -06:00
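A minimal sketch of exact_type in use: .type reports OP for every operator, while .exact_type names the specific one.

    import io
    import token
    import tokenize

    code = b"a, b = 1 + 2, 3\n"
    for tok in tokenize.tokenize(io.BytesIO(code).readline):
        if tok.type == token.OP:
            print(tok.string, token.tok_name[tok.exact_type])
    # ',' COMMA, '=' EQUAL, '+' PLUS, ...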
Antoine Pitrou
10a99b024d Issue #13150: The tokenize module doesn't compile large regular expressions at startup anymore.
Instead, the re module's standard caching does its work.
2011-10-11 15:45:56 +02:00
Meador Inge
14c0f03b58 Issue #12943: python -m tokenize support has been added to tokenize. 2011-10-07 08:53:38 -05:00
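A minimal sketch of driving the new command-line interface, shown from Python for consistency with the other examples; 'example.py' is a hypothetical file name:

    import subprocess
    import sys

    # Equivalent to running: python -m tokenize example.py
    subprocess.run([sys.executable, "-m", "tokenize", "example.py"], check=True)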
Brett Cannon
45b96d373e Merged revisions 88498 via svnmerge from
svn+ssh://pythondev@svn.python.org/python/branches/py3k

........
  r88498 | brett.cannon | 2011-02-21 19:25:12 -0800 (Mon, 21 Feb 2011) | 8 lines

  Issue #11074: Make 'tokenize' so it can be reloaded.

  The module stored away the 'open' object as found in the global namespace
  (which fell through to the built-in namespace) since it defined its own 'open'.
  The problem is that if you reloaded the module, it then grabbed the 'open' defined
  in the previous load, leading to code that recursed infinitely. Switched to
  simply calling builtins.open directly.
........
2011-02-22 03:35:18 +00:00
Brett Cannon
f3042782af Issue #11074: Make 'tokenize' so it can be reloaded.
The module stored away the 'open' object as found in the global namespace
(which fell through to the built-in namespace) since it defined its own 'open'.
The problem is that if you reloaded the module, it then grabbed the 'open' defined
in the previous load, leading to code that recursed infinitely. Switched to
simply calling builtins.open directly.
2011-02-22 03:25:12 +00:00
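
A minimal sketch of the pattern the fix switches to, not the actual module code: a module that shadows open() reaches the built-in one explicitly through the builtins module, so a reload cannot pick up the shadowing definition left over from the previous load.

    import builtins

    def open(filename):
        # Always the real built-in open, even though this module shadows the name.
        return builtins.open(filename, "rb")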
Alexander Belopolsky
b9d10d08c4 Issue #10386: Added __all__ to the token module; this simplifies importing
in the tokenize module and prevents leaking of private names through
import *.
2010-11-11 14:07:41 +00:00
Victor Stinner
58c0752a33 Issue #10335: Add tokenize.open(), which detects the file encoding using
tokenize.detect_encoding() and opens the file in read-only mode.
2010-11-09 01:08:59 +00:00