Commit Graph

710 Commits

Author SHA1 Message Date
Joel Brobecker
76bd184b31 README.md: Document the Python version compatibility
Now that the hooks can be used with Python 3.x, time to enhance
the README to indicate which versions are supported.

TN: U530-006
Change-Id: I49c3fc05a484dfbecd6e10aea4d9e9221a75daa2
2021-10-06 11:27:20 -07:00
Joel Brobecker
23cc50ec72 syshooks/syslog.py: Decode the output returned by the logger
This is another preparation patch for the transition to python 3.x.
With Python 3.x, the output we get from the logger is going to be
a byte string, which we need to then decode into a string.

Change-Id: I2307b90f2e2cccaed8a93fa589f82fba0064c28b
TN: U530-006
2021-10-06 11:27:20 -07:00
Joel Brobecker
23c0aea671 updates/sendmail.py: encode input and decode output when calling sendmail
This is another preparation patch for the transition to Python 3.x.
With Python 3.x, we need to make sure that the input used when
calling sendmail is converted to a byte string. We also then need
to make sure that the script's output is decoded into a string
when printing it.

Change-Id: I1b792638fb77c8d1b4ee2197b29b63922e0fe211
TN: U530-006
2021-10-06 11:27:20 -07:00
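
The encode-on-input, decode-on-output pattern described in this commit can be sketched as follows. This is a minimal illustration, not the code from updates/sendmail.py: the helper name `run_with_text_io`, the UTF-8 default and the echo command standing in for sendmail are all assumptions.

```python
import subprocess
import sys


def run_with_text_io(cmd, text_input, encoding="utf-8"):
    """Run cmd, feed it text_input, and return (exit status, output text).

    Hypothetical helper: under Python 3.x pipes carry bytes, so the input
    is encoded before it is sent and the output is decoded after it is read.
    """
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
    )
    out, _ = proc.communicate(text_input.encode(encoding))
    return proc.returncode, out.decode(encoding)


# Stand-in for sendmail (assumption): a child Python process that copies
# its standard input to its standard output, byte for byte.
ECHO_CMD = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())",
]

status, output = run_with_text_io(ECHO_CMD, "Subject: caf\u00e9\n\nBody\n")
```

Passing a plain string to `communicate` on a pipe opened in binary mode raises a `TypeError` under Python 3.x, which is why the explicit encoding step is needed.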
Joel Brobecker
f457d10a92 updates/emails.py: encode input and decode output when calling filer
This is another preparation patch for the transition to Python 3.x.
With Python 3.x, we need to make sure that the input used when
calling the filer cmd is converted to a byte string. We also
then need to make sure that the script's output is decoded into
a string.

Change-Id: I324410dd5c9b1e811252803b854d0f06ca65435d
TN: U530-006
2021-10-06 11:27:20 -07:00
Joel Brobecker
62fa030351 ThirdPartyHook.call: encode/decode the hook's input/output
This commit is part of the prep work for the transition to Python 3.x,
where the input of the hooks needs to be encoded before it is sent,
and where the output needs to be decoded.

Change-Id: I68fa9de5b4c8b932f931725174a6424716855c2a
TN: U530-006
2021-10-06 11:27:20 -07:00
Joel Brobecker
2178791fa7 encode input in call to git.check_attr
This is another preparation patch for the transition to Python 3.x,
where the input first needs to be encoded before it is passed to
the git command to be executed.

Change-Id: I2d8aac02a17b5d5765ab5e2c357bc15ead4f2c64
TN: U530-006
2021-10-06 11:27:20 -07:00
Joel Brobecker
010c33e913 encode input and decode output when calling hooks.mailinglist script
This is another preparation patch for the transition to Python 3.x.

The script's input needs to be encoded when called, and its output
needs to be decoded into a string for us to process it.

For input encoding, the same approach as for decoding is taken:
In order to make progress towards Python 3.x support while at
the same time preserving support for Python 2.x, we introduce
a new function "encode_utf8" which only performs the encoding
on Python 3.x. With Python 2.x, the function just returns the string
unmodified.

Change-Id: Ieb47d32c756405cdd0d300254e8cd7c8c3db50b5
TN: U530-006
2021-10-06 11:27:20 -07:00
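
A version-conditional helper of the kind this commit describes might look like the sketch below. Only the name "encode_utf8" comes from the commit message; the body is an assumption about how such a function could be written.

```python
import sys


def encode_utf8(text):
    """Return text encoded as UTF-8 under Python 3.x.

    Under Python 2.x the string is returned unmodified, which keeps the
    existing behavior intact. (Sketch only; the real body may differ.)
    """
    if sys.version_info[0] < 3:
        return text
    return text.encode("utf-8")
```

Keeping the version check inside one helper means call sites stay identical for both interpreters, and the Python 2.x branch can later be deleted in a single place.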
Joel Brobecker
9c82498e7b Introduce (the concept of) git command output decoding
This commit is preparation work for the transition to Python 3.x,
where the output obtained by running Git commands will become
bytes as opposed to a string. In the vast majority of cases,
we'll want to decode that output into a string. Ideally, we would
want to do this in a way that is both compatible with Python 2.x
and Python 3.x, but we have found that this requires a lot of
work with many changes spread all over the code. So, instead,
what this commit does is introduce the concept of decoding
the output, but with the decoding only occurring when running
under Python 3.x.

That way, we can make progress towards Python 3.x while preserving
the behavior under Python 2.x intact.

Change-Id: I189577798ee96cba1fa55c7356babf102575642f
TN: U530-006
2021-10-06 11:27:20 -07:00
Joel Brobecker
d87407ca43 ensure_iso_8859_15_only: Enhance for Python 3.x (unicode revlog lines)
This is another commit to prepare for the transition to Python 3.x,
where the revlog of the given Commit object will be (unicode) strings.
In this situation decoding before checking the string against
the ISO-8859-15 is not necessary.

This commit therefore enhances the function to handle the case
where the revlog is already unicode by checking whether it offers
the "decode" method or not. The code is written in such a way that
the Python 3.x case doesn't use a temporary variable. That way, when
Python 2.x support is removed, we can simply remove the corresponding
half of the if-else block, and just keep the other half untouched.

TN: U530-006
Change-Id: I674c639eb7af656cdc084cc510b9eca90751f18f
2021-10-04 08:16:01 -07:00
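
The "decode" method check described in this commit can be illustrated with the sketch below. The function name `is_iso_8859_15_only` is hypothetical, and treating byte strings as UTF-8 is an assumption; the commit message does not say which encoding the Python 2.x path decodes from.

```python
def is_iso_8859_15_only(line):
    """Return True if line can be represented in ISO-8859-15.

    Hypothetical sketch: byte strings (Python 2.x style) are assumed to
    hold UTF-8 and are decoded first; unicode strings are checked directly.
    """
    try:
        if hasattr(line, "decode"):
            # Byte string: decode first, then check against ISO-8859-15.
            line.decode("utf-8").encode("iso-8859-15")
        else:
            # Already a unicode string: no decoding needed.
            line.encode("iso-8859-15")
    except UnicodeError:
        return False
    return True
```

As in the commit, the unicode branch stands alone and uses no temporary variable, so dropping Python 2.x support later amounts to deleting the other branch.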
Joel Brobecker
6d9cc85f1d commit_filer_non_ascii_diff: Remove unused import (Header)
Found while working on U530-006 (transition to Python 3.x).

Change-Id: I1e8acfed478e67b94fafdac990c98398421981cb
2021-10-04 08:16:01 -07:00
Joel Brobecker
40bf19a92e commit_filer_non_ascii_body: Remove unused import (Header)
Found while working on U530-006 (transition to Python 3.x).

Change-Id: I6ff544fa0cf424b34714d3d0622beccc4c4105b8
2021-10-04 08:16:01 -07:00
Joel Brobecker
5c28a09670 commit_email_formatter_non_ascii_diff: Remove unused import (Header)
Found while working on U530-006 (transition to Python 3.x).

Change-Id: I6897a72d9801c44e8d2c21e9f80f110aedbeb209
2021-10-04 08:16:01 -07:00
Joel Brobecker
b5c1f88317 commit_email_formatter_non_ascii_body: Remove unused import (Header)
Found while working on U530-006 (transition to Python 3.x).

Change-Id: Ia9a67ad907566c4cecc6e90a427cba57ee9349fd
2021-10-04 08:16:01 -07:00
Joel Brobecker
9a3a3e8b47 guess_encoding (Python 3.x): drop iso-8859-15 guess (in favor of UTF-8)
This commit changes the guess_encoding function, in the Python 3.x
case, to return UTF-8 instead of iso-8859-15 for strings with content
which is compatible with iso-8859-15. The reason for this change is
to standardize a little more towards UTF-8 as our encoding of choice
when generating textual data.

Using iso-8859-15 was not wrong as far as I can tell, but I believe
the majority of users and applications have switched to UTF-8 now,
so this commit simply follows that trend.

This commit won't have any effect until we switch the testsuite over
to Python 3.x, where some email header fields will end up being
encoded using UTF-8 instead of iso-8859-15. For the moment, no visible
change within the current testing, as it only supports being run
with Python 2.x.

TN: U530-006
Change-Id: I6b008cd5c2e12a4dbb97a567fb35a76f40e9782a
2021-10-04 08:16:01 -07:00
Joel Brobecker
1b26ad75f0 testsuite/bin/stdout-sendmail: Parse email before printing
This script, which is used as a replacement for sendmail when
running the testsuite, has two modes. In the default mode,
it currently does nothing more than dumping to standard output
the email being asked to send.

This commit changes the script to parse the email instead, and
dump its contents once parsed, rather than dumping the email
as is. The dump is written so as to keep the output the same.

The goal is to help print the email body in a human-readable way
regardless of its content transfer encoding. This will become
particularly useful when we start using base64 content transfer
encodings.

This is tied to the transition to Python 3.x, where the email
support classes automatically choose base64 transfer encoding
in some cases. More generally speaking, we have plans to use
base64 more often, as this format ensures that we do not exceed
some limitations in email transport (maximum line length, for
instance).

TN: U530-006
Change-Id: I645126470e8a3b944ff668aba6d30381d418c6b1
2021-10-04 08:14:13 -07:00
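
Parsing an email before printing it, so that the body is readable whatever its content transfer encoding, can be sketched with the standard `email` package. The function name `dump_email`, the sample message and the output layout are assumptions, and the sketch only handles single-part messages.

```python
import email


def dump_email(raw_email):
    """Return a readable dump of raw_email: headers, blank line, body.

    Hypothetical sketch for single-part messages only.
    """
    msg = email.message_from_string(raw_email)
    lines = ["{}: {}".format(name, value) for name, value in msg.items()]
    # decode=True undoes the Content-Transfer-Encoding (base64, for
    # instance) and returns the body as bytes.
    payload = msg.get_payload(decode=True)
    charset = msg.get_content_charset() or "utf-8"
    lines.append("")
    lines.append(payload.decode(charset, errors="replace"))
    return "\n".join(lines)


# Sample message (assumption) whose body is base64-encoded.
RAW = (
    "From: hooks@example.com\n"
    'Content-Type: text/plain; charset="utf-8"\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "SGVsbG8sIHdvcmxkIQo=\n"
)

dump = dump_email(RAW)
```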
Joel Brobecker
f18c10b78c stop bypassing the updates.sendmail module during testsuite runs
The goal of this commit is to include the updates.sendmail module
in our testing strategy, in order to make sure that the hooks are
passing email data down to the sendmail program without issues.
This will become particularly important when we switch over to
using Python 3.x, because of the strong distinction between bytes
and strings with newer versions of Python which can cause a lot
of problems. Hence the need to use this code during our testing.

The main strategy introduced by this commit to achieve this is
fairly simple: The testsuite framework introduces a new minimal
script to be called in place of the standard sendmail. A new
environment variable called GIT_HOOKS_SENDMAIL is introduced
allowing the testsuite to tell the hooks to use its own (fake)
sendmail instead of the system one. With that in place,
the old code bypassing the use of updates.sendmail can be removed,
thus allowing the testsuite to include it as part of the testing.
The testsuite's (fake) sendmail script was written in a way to
mimic the old bypassing code, so there is no change in output.

Parallel to that, the hooks are enhanced to check that we can
indeed find sendmail, and otherwise return immediately with
an error if not. This way, we avoid emails silently being
dropped due to the missing sendmail.

A couple of testcases are also added to double-check some
specific error situations.

Note that I tried to think of ways to split this patch into
smaller individual parts, but couldn't really find a way to
do so in a meaningful way, while at the same time producing
a commit where the coverage report stays clean (0 lines missed).

TN: U530-006 (transition to Python 3.x)
TN: U924-032 (test for sendmail not found)
TN: U924-034 (test for sendmail override when in testsuite mode)
Change-Id: I74b993592ec6d701347bbca5283a42e037411f1c
2021-09-24 17:41:10 -07:00
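
The sendmail lookup described in this commit could be sketched as follows. Only the GIT_HOOKS_SENDMAIL variable name comes from the commit message; the function name `find_sendmail`, the default path and the choice of raising an exception are assumptions.

```python
import os
import shutil


def find_sendmail(default="/usr/sbin/sendmail"):
    """Return the sendmail program to use, or raise if it cannot be found.

    Hypothetical sketch: GIT_HOOKS_SENDMAIL, when set, overrides the
    default location (the default path itself is an assumption).
    """
    candidate = os.environ.get("GIT_HOOKS_SENDMAIL", default)
    # shutil.which returns None unless candidate is an executable file.
    path = shutil.which(candidate)
    if path is None:
        raise FileNotFoundError("Cannot find sendmail at: " + candidate)
    return path
```

Failing early here is what prevents emails from being dropped silently when sendmail is missing.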
Joel Brobecker
26ee444039 sendmail.py: Remove fallback on smtplib
The implementation of this module was originally inherited from
gnatpython, where it was trying first to call sendmail, and if
not available, then fallback on using Python's smtplib instead.

This commit removes support for using smtplib, and instead assumes
that sendmail is always available.

The reasons for this change are two-fold:

  - For all the users of these scripts I know of, sendmail is always
    available, so we haven't really used the smtplib fallback.

  - While this code is currently excluded during testing (to avoid
    sending emails while running the testsuite), I'd like to enhance
    our testing strategy to start including this code as part of
    the testing. In particular, one thing we can do is for the testsuite
    to eventually provide its own version of a sendmail program that
    would dump the traces to stdout rather than actually send an email.

    On the other hand, if we were to keep smtplib support as a fallback,
    I do not see how we could test that part without actually having it
    send email, something we absolutely do not want.

    This is related to the effort of moving to Python 3.x, where Python
    now makes a strong distinction between bytes and strings when
    passing data between processes. With Python 3.x, it's much more
    important to always test that data is passed correctly.

TN: U530-006
Change-Id: Ic2153be62a80906dce709fb3d622e1194ca7c869
2021-09-24 09:09:37 -07:00
Joel Brobecker
cbbf70fd11 Add coverage pragma for Python2-only block of code in updates/emails.py
This pragma allows us to exclude this block when doing coverage analysis
when testing the git-hooks using a Python 3.x interpreter.

Change-Id: Id2f61c2a1cbf965c93693771b6dcb9d55d6a2708
TN: U530-006
2021-08-22 07:26:54 -07:00
Joel Brobecker
fd889438a9 Add support for unicode strings to the "guess_encoding" function
This is another commit to prepare for the transition to Python 3.x,
where text will be converted early to unicode strings, instead of
being kept as byte strings. When passed to "guess_encoding" in
Python 3.x, unicode strings don't have a "decode" method, as
the strings are already decoded. As a result, the current
implementation always returns None (no encoding found), because
we get an exception calling the non-existent method, promptly
trapped and wrongly interpreted as being a decoding error.

To prepare the transition to Python 3.x, this commit adds
a check to see if we have a byte-string. If we do, then do
the same as before. Otherwise, we must have a unicode string,
and so check the encodings by trying to encode rather than
decode the string.

TN: U530-006
Change-Id: I50cf689fec8c205a6e48b42fac3a95a6bb9886b4
2021-08-20 15:32:22 -07:00
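
The byte-string check described in this commit can be sketched as below. The list of candidate encodings and the exact signature are assumptions; only the function name and the decode-versus-encode distinction come from the commit message.

```python
def guess_encoding(text, candidates=("ascii", "iso-8859-15", "utf-8")):
    """Return the first candidate encoding compatible with text, or None.

    Sketch only: the candidate list is an assumption.
    """
    for encoding in candidates:
        try:
            if isinstance(text, bytes):
                # Byte string: same as before, try to decode it.
                text.decode(encoding)
            else:
                # Unicode string: there is no "decode" method, so check
                # compatibility by trying to encode instead.
                text.encode(encoding)
        except UnicodeError:
            continue
        return encoding
    return None
```

Catching only `UnicodeError` matters here: a broader `except` would again hide an `AttributeError` from a missing method and misreport it as a decoding failure.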
Joel Brobecker
61a8e83e52 emails.py: Mark a couple of code blocks as being Python-2.x only
This commit adds a couple of "# pragma: py2-only" comments to
a couple of code blocks which are only expected to be run when
the hooks are tested with Python 2.x (these two code blocks are
conditioned on the version of Python being less than 3).

This will help us manage the transition to Python 3.x until we are
able to drop support for Python 2.x. That way, we can run coverage
analysis with both Python 2.x and Python 3.x, and get the Python 3.x
coverage analyzer to ignore those blocks we know we cannot cover
with Python 3.x.

Once the transition to Python 3.x is over, we will remove those code
blocks.

TN: U530-006
Change-Id: I44f1cd883c3fdf4e487e1e553158517e721416df
2021-07-17 17:03:22 -07:00
Joel Brobecker
19bafbec4e Make some testcase's output more predictable to avoid spurious errors
These testcases are testing the git-hooks behavior when using
a commit-extra-checker hook. This hook works by receiving info
about the commit to check via stdin, as a dictionary in JSON format.

To help the testcase verify the data being passed via its stdin,
the hook currently prints the contents of stdin verbatim.
Unfortunately, this makes an unwarranted assumption about
the order of the elements in that string representation,
causing spurious differences when trying to run this testcase
with Python 3.x.

This commit enhances the scripts used by these testcases to
output the contents of the data in a way which is always
the same, regardless of the order it was passed in.

TN: U530-006
Change-Id: Ia8b16611bb29b7601f14949ee5554c906b947853
2021-07-17 15:30:57 -07:00
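
One way to print JSON data identically regardless of the order in which it was passed is to sort the keys when dumping it. The sketch below assumes this approach; the function name `dump_checker_input` and the field names in the usage example are hypothetical, and the commit message does not say which technique the test scripts actually use.

```python
import json


def dump_checker_input(stdin_text):
    """Return the JSON dictionary in stdin_text with its keys sorted.

    Hypothetical sketch of an order-independent dump.
    """
    data = json.loads(stdin_text)
    return json.dumps(data, sort_keys=True, indent=2)


# Two inputs with the same content in a different order (field names
# are made up for the example) produce the same dump.
first = dump_checker_input('{"rev": "abc123", "ref_name": "refs/heads/master"}')
second = dump_checker_input('{"ref_name": "refs/heads/master", "rev": "abc123"}')
```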
Joel Brobecker
179346fafb cannot_find_style_checker/run_test.py: Use "format" method instead of "%"
This commit slightly rewrites the way we format the expected output
of our tests, by using the "format" method instead of using
the "%" operator. It doesn't make much of a difference, at the moment,
but this will become handy when we transition to Python 3.x, where
the "No such file or directory" error now includes the path of
the file/directory that's missing.

TN: U530-006
Change-Id: I748ca76d411cfdf79dd33402d1297e7372decc38
2021-07-11 17:51:32 -07:00
Joel Brobecker
473a7d3fe6 git_show_ref: Avoid changing dict size while iterating over it
This is another Python 3.x preparation patch, which fixes a situation
where we change a dictionary size while iterating over it, something
that became apparent while exercising the hooks with Python 3.x.

This commit avoids this by rewriting this part of the function
in a way that avoids the dictionary iteration entirely.

TN: U530-006
Change-Id: Ia22b28919ac2dabc3e22ed1b3652e14d799e566b
2021-07-11 17:31:17 -07:00
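
For context, deleting entries from a dictionary while looping over it raises a RuntimeError under Python 3.x. The commit avoids the iteration altogether; the sketch below shows a different, common workaround (iterating over a snapshot of the keys) purely as an illustration. The function name and the "^{}" filter are hypothetical.

```python
def remove_peeled_refs(refs):
    """Delete entries whose name ends with "^{}" from refs, in place.

    Hypothetical sketch, not the approach taken by git_show_ref.
    """
    # list(refs) takes a snapshot of the keys, so deleting entries
    # inside the loop does not disturb the iteration.
    for name in list(refs):
        if name.endswith("^{}"):
            del refs[name]
    return refs


refs = remove_peeled_refs({"refs/tags/v1^{}": "2222", "refs/heads/master": "1111"})
```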
Joel Brobecker
dc09109dca pre_receive.py, update, post_receive.py: Turn buffering off
While testing the behavior of the git-hooks under Python 3.x,
I noticed that the order of some output was not the same as
when running it with Python 2.x. When investigating further,
I found that the order with Python 2.x made better sense, and
that the different order was caused by stdout/stderr buffering.

This commit turns buffering off entirely in an effort to make
sure that output sent to stdout & stderr gets seen in the order
that it was sent.

As it happens, while this commit was aimed at Python 3.x,
running the testsuite showed that we had one testcase where
the order when using Python 2.x was also incorrect, and
therefore misleading, for a couple of tests. The tests'
expected output was double-checked, and adjusted accordingly,
with additional comments explaining what each part was about.

TN: U530-006
Change-Id: Iaf2c5266e13a645dab006e1f7f4cb553cbd5704f
2021-07-11 14:56:38 -07:00
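
The commit message does not say how buffering is turned off. One possible approach under Python 3.x, shown here only as a sketch, is to wrap a binary stream in a text layer that passes every write straight through; the helper name `unbuffered_text_stream` is hypothetical.

```python
import io


def unbuffered_text_stream(binary_stream, encoding="utf-8"):
    """Wrap binary_stream in a text layer that does not hold data back.

    Hypothetical sketch: with write_through=True, each write() is handed
    to the underlying binary stream immediately.
    """
    return io.TextIOWrapper(binary_stream, encoding=encoding, write_through=True)


# Demonstration on an in-memory stream.
raw = io.BytesIO()
stream = unbuffered_text_stream(raw)
stream.write("first")
seen = raw.getvalue()
```

For a real file descriptor, the binary layer would also need to be unbuffered, for example one opened with `open(fd, "wb", buffering=0)`; running the interpreter with `python -u` is another option.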
Joel Brobecker
62d7872f6f mailinglist_script: Remove use of "basestring" (transition to Python 3.x)
Type "basestring" does not exist with Python 3.x, so this commit
removes it from this testcase.

TN: U530-006
Change-Id: Ic53b269c29c90f686a2fb7b834215b22454cd6c0
2021-07-10 15:24:55 -07:00