This is another preparation patch for the transition to Python 3.x.
With Python 3.x, we need to make sure that the input used when
calling sendmail is converted to a byte string. We also then need
to make sure that the script's output is decoded into a string
when printing it.
Change-Id: I1b792638fb77c8d1b4ee2197b29b63922e0fe211
TN: U530-006
This is another preparation patch for the transition to Python 3.x.
With Python 3.x, we need to make sure that the input used when
calling the filer cmd is converted to a byte string. We also
then need to make sure that the script's output is decoded into
a string.
Change-Id: I324410dd5c9b1e811252803b854d0f06ca65435d
TN: U530-006
This is another preparation patch for the transition to Python 3.x.
The script's input needs to be encoded when called, and its output
needs to be decoded into a string for us to process it.
For input encoding, the same approach as for decoding is taken:
In order to make progress towards Python 3.x support while at
the same time preserving support for Python 2.x, we introduce
a new function "encode_utf8" which only performs the encoding
on Python 3.x. With Python 2.x, the function just returns the string
unmodified.
Change-Id: Ieb47d32c756405cdd0d300254e8cd7c8c3db50b5
TN: U530-006
This commit is preparation work for the transition to Python 3.x,
where the output obtained by running Git commands will become
bytes as opposed to a string. In the vast majority of cases,
we'll want to decode that output into a string. Ideally, we would
want to do this in a way that is both compatible with Python 2.x
and Python 3.x, but we have found that this requires a lot of
work with many changes spread all over the code. So, instead,
what this commit does is introduce the concept of decoding
the output, but with the decoding only occurring when running
under Python 3.x.
That way, we can make progress towards Python 3.x while preserving
the behavior under Python 2.x intact.
Change-Id: I189577798ee96cba1fa55c7356babf102575642f
TN: U530-006
This commit changes the guess_encoding function, in the Python 3.x
case, to return UTF-8 instead of iso-8859-15 for strings with content
which is compatible with iso-8859-15. The reason for this change is
to standardize a little more towards UTF-8 as our encoding of choice
when generating textual data.
Using iso-8859-15 was not wrong as far as I can tell, but I believe
the majority of users and applications have switched to UTF-8 now,
so this commit simply follows that trend.
This commit won't have any effect until we switch the testsuite over
to Python 3.x, where some email header fields will end up being
encoded using UTF-8 instead of iso-8859-15. For the moment, no visible
change within the current testing, as it only supports being run
with Python 2.x.
TN: U530-006
Change-Id: I6b008cd5c2e12a4dbb97a567fb35a76f40e9782a
The goal of this commit is to include the updates.sendmail module
in our testing strategy, in order to make sure that the hooks are
passing email data down to the sendmail program without issues.
This will become particularly important when we switch over to
using Python 3.x, because of the strong distinction between bytes
and strings with newer versions of Python which can cause a lot
problems. Hence the need to use this code during our testing.
The main strategy introduced by this commit to achieve this is
fairly simple: The testsuite framework introduces a new minimal
script to be called in place of the standard sendmail. A new
environment variable called GIT_HOOKS_SENDMAIL is introduced
allowing the testsuite to tell the hooks to use its own (fake)
sendmail instead of the system one. With that in place,
the old code bypassing the use of updates.sendmail can be removed,
thus allowing the testsuite to include it as part of the testing.
The testsuite's (fake) sendmail script was written in a way to
mimick the old bypassing code, so there is no change in output.
Parallel to that, the hooks are enhanced to check that we can
indeed find sendmail, and otherwise return immediately with
an error if not. This way, we avoid emails silently being
dropped due to the missing sendmail.
A couple of testcases are also added to double-check some
specific error situations.
Note that I tried to think of ways to split this patch into
smaller individual parts, but couldn't really find a way to
do so in a meaningful way, while at the same time producing
a commit where the coverage report stays clean (0 lines missed).
TN: U530-006 (transition to Python 3.x)
TN: U924-032 (test for sendmail not found)
TN: U924-034 (test for sendmail override when in testsuite mode)
Change-Id: I74b993592ec6d701347bbca5283a42e037411f1c
The implementation of this module was originally inherited from
gnatpython, where it was trying first to call sendmail, and if
not available, then fallback on using Python's smtplib instead.
This commit removes support for using smtplib, and instead assumes
that sendmail is always available.
The reasons for this change are two-fold:
- For all the users of these scripts I know of, sendmail is always
available, so we haven't really used the smtplib fallback.
- While this code is currently excluded during testing (to avoid
sending emails while running the testsuite), I'd like to enhance
our testing strategy to start including this code as part of
the testing. In particular, one thing we can do is for the testuite
to eventually provide its own version of a sendmail program that
would dump the traces to stdout rather than actually send an email.
On the other hand, if we were to keep smtplib support as a fallback,
I do not see how we could test that part without actually having it
send email, something we absolutely do not want.
This is related to the effort of moving to Python 3.x, where Python
now makes a strong distinction between bytes and strings when
passing data between processes. With Python 3.x, it's much more
important to always test that data is passed correctly.
TN: U530-006
Change-Id: Ic2153be62a80906dce709fb3d622e1194ca7c869
This pragma allows us to exclude this block when doing coverage analysis
when testing the git-hooks using a Python 3.x interpreter.
Change-Id: Id2f61c2a1cbf965c93693771b6dcb9d55d6a2708
TN: U530-006
This is another commit to prepare for the transition to Python 3.x,
where text will be converted early to unicode strings, instead of
being kept as byte strings. When passed to "guess_encoding" in
Python 3.x, unicode strings don't have a "decode" method, as
the strings are already decoded. As a result, the current
implementation always returns None (no encoding found), because
we get an exception calling the non-existent method, promptly
trapped and wrongly interpreted as being a decoding error.
To prepare the transition to Python 3.x, this commit adds
a check to see if we have a byte-string. If we do, then do
the same as before. Otherwise, we must have a unicode string,
and so check the encodings by trying to encoding rather than
decode the string.
TN: U530-006
Change-Id: I50cf689fec8c205a6e48b42fac3a95a6bb9886b4
This commit adds a couple of "# pragma: py2-only" comments to
a couple of code blocks which are only expected to be run when
the hooks are tested with Python 2.x (these two code blocks are
conditioned on the version of Python being less than 3).
This will help us manage the transition to Python 3.x until we are
able to drop support for Python 2.x. That way, we can run coverage
analysis with both Python 2.x and Python 3.x, and get the Python 3.x
coverage analyzer to ignore those blocks we know we cannot cover
with Python 3.x.
Once the transition to Python 3.x is over, we will remove those code
blocks.
TN: U530-006
Change-Id: I44f1cd883c3fdf4e487e1e553158517e721416df
Support for this specific file is purely AdaCore-specific and
historical. Since then, we have introduced (much!) better ways
to support users who want to suppress checks for a given commit.
We know this feature hasn't been used for many years, so it is
time to remove support for it. Note that it wasn't even documented.
TN: U627-004
Change-Id: Ie67532158abc1d302a3d98e57835f15dadfe0817
This commit updates the pre-commit hook to black version 21.5b1.
The hooks where then re-run on all files to update their formatting
to this new version of black.
Change-Id: Ib0866745ef8432cf93380a4d83fa23a479eb4a49
This commit is inspired by the fact that I couldn't understand
why I was skipping the first character from the output of a Git
command whe computing a commit's "diff", like so):
diff = git.show(commit.rev, [...], pretty="format:|")[1:]
(emphasis on the "[1:]" at the end).
To understand, I remove the subscripting and reran the testsuite
without it to see what failures I would get. This gave me
the answer, which is we were intentionally starting the "format:"
string with a "|", and so we needed to strip that extra character.
That's when I found a comment I wrote; I didn't see it at first
because it was placed further up:
# For the diff, there is one subtlelty:
# Git commands calls strip on the output, which is usually
# a good thing, but not in the case of the diff output.
# Prevent this from happening by putting an artificial
# character at the start of the format string, and then
# by stripping it from the output.
This may have made sense back when I wrote that comment, but
we no longer strip the start of the output anymore (see commit
af06d5ea54). So I decided to
simplify the code by removing the extraneous character in
the "format:" string.
As it happens, this revealed that git behaves slightly differently
when given an empty "format:" string. Before:
| $ git show -p -M --stat --pretty="format:|" HEAD
| |---
| hooks/git.py | 4 +---
| 1 file changed, 1 insertion(+), 3 deletions(-)
|
| diff --git a/hooks/git.py b/hooks/git.py
| index fe2b36b..0669111 100644
| [snip]
After (removing the "|" in the "format:" string):
| $ git show -p -M --stat --pretty="format:" HEAD
| hooks/git.py | 4 +---
| 1 file changed, 1 insertion(+), 3 deletions(-)
|
| diff --git a/hooks/git.py b/hooks/git.py
| [snip]
What we can see is that the "---" separate line is no longer shown
in the second command.
Rather than forcing Git to print it, or rather than staying with
the existing code, this commit simply hardcodes the separator line.
One minor bonus of doing it this way is that, if Git decides to
change that separator, this won't affect us, and thus we won't
have to change hundreds of tests accordingly.
And by doing so, this revealed that there was actually an inconsistency
in the formatting produced by Git: In some cases (e.g. merge commits),
it became apparent that Git was omitting this "---" separator line,
even when the "format:" string was empty. The corresponding testcases
where the inconsistency showed up were adjusted to match the new
behavior, which is consisdered (slightly) better, because more
consistent.
Found while working on U530-006 (transition to Python 3.x).
Change-Id: Ifc473fa471ba618e11c3c4bcc6d83cc6f82fc6bf
When looking at the 8 code points that are different between
iso-8859-1 and iso-8859-15, it seems like the iso-8859-15 has
some characters might be more widely used than the ones in
iso-8859-1 (some standalone accents, some fractions, a generic
"currency sign"). This commit therefore changes the hooks to
use iso-8859-15 instead of iso-8859-1.
Note that this doesn't change the fact that UTF-8 is still
the first-choice encoding we try.
Change-Id: I36092552dc647935269b1f0f6b401d198e1a7bd6
TN: U528-040
This commit simplifies the choice of the charset being used to send
our emails to just using UTF-8. This makes all emails consistently using
that charset, and in particular follows something we were already doing
when the message body was found to be in unicode format (this happens
when the message body comes from calling one of the project's hooks).
The expectation is that this preliminary change will facilitate
the transition to Python 3, where strings are unicode.
Change-Id: I0e44baf460dd99a2505d94671ac6042304addfd2
TN: TB22-002
With python 3.x, the string module no longer provides this function.
So this commit reworks the one use of this function to avoid it.
Change-Id: I0b7a771533cbee7d7eed5dcdc1b531eb538be92d
... as recommended by flake8 under B011:
B011 Do not call assert False since python -O removes these calls.
Instead callers should raise AssertionError().
Change-Id: Ib0860cc31d2fbbae1beafdb0b3ed99df7852b433
This commit fixes an unexpected UnicodeEncodeError error which
happens when a user pushes a change to a repository which uses
both a file-commit-cmd hook as well as a commit-email-formatter
hook which customizes commit-email bodies.
While this doesn't happen when the commit-email-formatter only
changes the email subject or the "diff", tests are also added
to cover these cases, for completeness reason.
Change-Id: I360fcaf25db6cd4a930302e2d6f34e8609e2b364
TN: U110-001
In preparation for the transition to Python 3, convert to using the
print function. The special import
from __future__ import print_function
in Python 2.7 makes "print" become the function and not the statement.
In Python 3, it has no effect. Once the transition is done and we
exclusively use Python 3, we can simply remove the imports.
The testsuite shows no regressions.
This commit fixes https://github.com/AdaCore/git-hooks/issues/13,
and introduces multiple testcases for similar scenarios.
The "commit_email_formatter" demonstrates a small change in
the charset being used, going from "ascii" to "utf-8". Since
the contents is all ASCII characters, this makes no difference
in the actual encoding of the body.
Change-Id: Id3b8e29e5e8eff843e325528c7b6710ff0dd2614
TN: TB15-003
This commit eliminates one instance of a print statement used
without parenthesizing its parameter, which is not valid Python 3.x
syntax.
Found while working on TB15-003.
Change-Id: I6f9b2cb3675e762e79b6f1927a05402d4662c1fa
This commit fixes a crash which occurs during the validation phase
(when the "update" hook is called). This only happens when the repository
is configured with the combined-style-checking config option set.
So far, we found a couple of scenarios where this could happen:
- The user pushes a single commit, which happens to be a "revert"
commit (i.e. a commit which is created via the "git revert"
command); those commits are purposefully excluded from the list
of commits we have to check;
- The user pushes a tag, where the tag points to a commit which
already exists in the repository (something very much typical
of people pushing tags).
In both cases, we land in the AbstractUpdate.__do_style_checks method,
doing the following:
| added = self.commits_to_check
| if git_config('hooks.combined-style-checking'):
| # This project prefers to perform the style check on
| # the cumulated diff, rather than commit-per-commit.
| # Behave as if the update only added one commit (new_rev),
| # with a single parent being old_rev. If old_rev is nul
| # (branch creation), then use the first parent of the oldest
| # added commit.
| debug('(combined style checking)')
!!! -> | if not added[-1].pre_existing_p:
| [...]
The problem in the scenarios mentioned above is that, because there are
no commits to check, added is the empty list, and so trying to get
the last element of that list does not work.
This commit fixes the problem by checking the list of added commits
early, and returning right away if there are no commits to check.
As a result of this early return, a couple of tests saw their output
change, because a debug log is no longer printed. I verified that
these were for two tests where the push did not send any commits
that are new to the repository. So the change of behavior is expected.
Change-Id: I28e5ffe61ec19b8acf46b6da5c0bfe5ed0139d2c
TN: TA21-016
This change is mostly a cleanup which simplifies a bit the iteration
over the list of commits for which to perform the various checks,
by having that information provided via an attribute whose name
is very specific about the semantics of what to use that list for.
This was motivated by an issue we discovered recently where we forgot
to exclude some commits when calling the commit-extra-checker hook.
This happened because, no matter how simple the algorithm for
computing the list of commits to check was, that simple algorithm
was duplicated in many places. This avoids that, in the hopes that
this will prevent this same kind of issues again in the future.
Another way to view the benefits of this cleanup is that this will
help ensure consistency in terms of the list of commits to which
the various checks are applied.
Note that we could have logically split this patch into two steps,
with the first step simply renaming the "added_commits" attribute
into "new_commits_for_ref", and then a second step introducing
the "commits_to_check" attribute. In the end, both are combined
in a single commit because it seems easier to review every location
where the "added_commits" attribute was used, and make sure from
the context that the correct list of commits was chosen in each
instance the "added_commits" list used to be referenced.
Otherwise, it's harder to review the second commit adding
the "commits_to_check" attribute, because the reviewer then has to
audit the entire set of sources himself in order to make sure
no spot was missed.
No actual behavior change should result from this change, and
therefore no test is being added.
Change-Id: I93c206968800dc738d3ebe4f5424f9201875383b
TN: T929-030