Commit Graph

191 Commits

Author SHA1 Message Date
Simon Marchi
a2909b2c8d Convert print statements to print function
In preparation for the transition to Python 3, convert to using the
print function.  The special import

    from __future__ import print_function

in Python 2.7 makes "print" become the function and not the statement.
In Python 3, it has no effect.  Once the transition is done and we
exclusively use Python 3, we can simply remove the imports.

The testsuite shows no regressions.
2020-11-22 19:29:38 -05:00
Joel Brobecker
ac8f4d6a54 update.py crash when not commits to style-check and combined-style-checking
This commit fixes a crash which occurs during the validation phase
(when the "update" hook is called). This only happens when the repository
is configured with the combined-style-checking config option set.
So far, we found a couple of scenarios where this could happen:

  - The user pushes a single commit, which happens to be a "revert"
    commit (i.e. a commit which is created via the "git revert"
    command); those commits are purposefully excluded from the list
    of commits we have to check;

  - The user pushes a tag, where the tag points to a commit which
    already exists in the repository (something very much typical
    of people pushing tags).

In both cases, we land in the AbstractUpdate.__do_style_checks method,
doing the following:

        | added = self.commits_to_check
        | if git_config('hooks.combined-style-checking'):
        |     # This project prefers to perform the style check on
        |     # the cumulated diff, rather than commit-per-commit.
        |     # Behave as if the update only added one commit (new_rev),
        |     # with a single parent being old_rev.  If old_rev is nul
        |     # (branch creation), then use the first parent of the oldest
        |     # added commit.
        |     debug('(combined style checking)')
 !!! -> |     if not added[-1].pre_existing_p:
        |         [...]

The problem in the scenarios mentioned above is that, because there are
no commits to check, added is the empty list, and so trying to get
the last element of that list does not work.

This commit fixes the problem by checking the list of added commits
early, and returning right away if there are no commits to check.

As a result of this early return, a couple of tests saw their output
change, because a debug log is no longer printed. I verified that
these were for two tests where the push did not send any commits
that are new to the repository. So the change of behavior is expected.

Change-Id: I28e5ffe61ec19b8acf46b6da5c0bfe5ed0139d2c
TN: TA21-016
2020-10-22 07:09:08 +04:00
Joel Brobecker
e536bf865a centralize the computation of the list of commits to apply checks on
This change is mostly a cleanup which simplifies a bit the iteration
over the list of commits for which to perform the various checks,
by having that information provided via an attribute whose name
is very specific about the semantics of what to use that list for.

This was motivated by an issue we discovered recently where we forgot
to exclude some commits when calling the commit-extra-checker hook.
This happened because, no matter how simple the algorithm for
computing the list of commits to check was, that simple algorithm
was duplicated in many places. This avoids that, in the hopes that
this will prevent this same kind of issues again in the future.

Another way to view the benefits of this cleanup is that this will
help ensure consistency in terms of the list of commits to which
the various checks are applied.

Note that we could have logically split this patch into two steps,
with the first step simply renaming the "added_commits" attribute
into "new_commits_for_ref", and then a second step introducing
the "commits_to_check" attribute. In the end, both are combined
in a single commit because it seems easier to review every location
where the "added_commits" attribute was used, and make sure from
the context that the correct list of commits was chosen in each
instance the "added_commits" list used to be referenced.
Otherwise, it's harder to review the second commit adding
the "commits_to_check" attribute, because the reviewer then has to
audit the entire set of sources himself in order to make sure
no spot was missed.

No actual behavior change should result from this change, and
therefore no test is being added.

Change-Id: I93c206968800dc738d3ebe4f5424f9201875383b
TN: T929-030
2020-10-02 11:09:37 -07:00
Joel Brobecker
8b48cfec4f catch merge commits masquerading as revert commits
The git-hooks currently incorrectly disable the check against
merge commits for commits that looks like (from the revision log)
merge commits. This commit fixes this problem and adds a new
testcase to verify this is correctly handled.

Change-Id: Ie2d12de818c43cc11971734ea12ae3dc3fc79039
TN: TA02-034.
2020-10-02 10:29:00 -07:00
Joel Brobecker
35192940c7 AbstractUpdate.pre_commit_checks: Remove unnecessary early return
This method computes a list of commits that need to be checked,
and then makes the assumption that all the checks should be applied
to the same list. This is mostly correct, except in the case of
merge commits: The list of commits to check excludes commits
which have a revision log suggesting that they are revert commits.
In the interest of keeping a given branch free of merge commits,
we still want to reject the revert commits that are merges. So
the list of commits to check for merge-commit violations is
different from the list used by most checks.

In practice, this early return doesn't contribute anything tangible
in terms of performance, and doesn't allow each check to decide
what list of commits it should operate on.

So this commit removes it. There should be no user-visible effect
coming from this change.

Seen while working on TA02-034.

Change-Id: I3a7e826ed4dba36f75da2fc088360356e5a8fa70
2020-10-02 10:28:29 -07:00
Joel Brobecker
f1414aed4a Do not call the hooks.commit-extra-checker for pre-existing commits
A new testcase for this change is not necessary, because an existing
testcase was actually already exercising this scenario (and is therefore
adjusted to expect the correct behavior). An extra test within the same
testcase is also added to exercise the scenario where all commits being
pushed are new to the repository. To make the branch names clearer
in terms of which has all-new commits, and which has some preexisting
ones, the multiple-commits-accept and future-multiple-commits-accept
have been renamed to add a "-some-prexisting" suffix to them. And
in the new test, we use the "-all-new" suffix.

Change-Id: I784a5f2fb3a4619c94d667f301d5a948cf843ac0
TN: T929-030
2020-10-02 09:06:46 -07:00
Joel Brobecker
fe8c5b5009 Add support for new config option: hooks.max-filepath-length
Note that, as part of the implementation, this commit introduces
a couple of calls to Git for each commit meant to retrieve the entire
list of files in a given commit. Luckily, one of these two calls
was already part of another check (filename collisions), and
the result of that call is cached in a CommitInfo object, so
the delta is only one extra Git call. Some performance measurements
seem to indicate that this is a fast-enough operation that users
should not notice the difference in performance, even for repositories
with a very large number of files [1]. If real-life usage proves
this assessment wrong, however, we'll revisit the documented behavior
so as to allow the use of a cheaper approach (e.g perform the check
on all modified files, rather than just the new ones).

[1]: Tested for instance on the GCC repository, which has more than
     87,000 files in total; the first call took roughly 100ms, and
     the next ones took about 60-70ms.

Change-Id: If0eea2c4990a945a3006cdd74d7a9aca2fc770d9
TN: T811-015
2020-09-11 10:47:51 -07:00
Joel Brobecker
3df0ef1658 CommitInfo.all_files: New method
This commit enhances the CommitInfo class by adding a new method that
returns the list of all files in the repository for the given commit.
This factorizes some code which already exists, and allows for the
result to be cached in this object for reuse, something we intend
to do on a followup commit. In the meantime, this commit should be
a nop in practice.

Change-Id: I3f9328dd9e89f8d82ad2ad6fdac69b5e2fbc0992
TN: T811-015
2020-09-02 17:28:28 -07:00
Joel Brobecker
a62e4bcad0 Add support for new config option hooks.email-new-commits-only
This enhancement should answer issue #9 on GitHub:
https://github.com/AdaCore/git-hooks/issues/9

Change-Id: I4b45f8fc62cb858eefd8b8b417bbdbaa7e80bf91
TN: T815-009
2020-08-15 14:20:30 -07:00
Joel Brobecker
da02784ca7 Move check for hooks.no-emails config to __set_send_email_p_attr
Currently, all users of AbstractUpdate.__set_send_email_p_attr first
verify that the reference being updated isn't on the hooks.no-emails
list, so __set_send_email_p_attr could arguably assume that this is
the case, and add an assertion. However, from a functional perspective,
this method is perfectly capable of doing its jobs in the case of
hooks.no-emails references as well, and doing so avoid a self-induced
potential crash, which means better resilience.

Therefore, this commit moves the check in __check_max_commit_emails
down to __set_send_email_p_attr.

This change makes the hooks slightly less effective in terms of
performance in the case of references with the hooks.no-emails and
the hooks.no-precommit-checks config options set (as demonstrated
by the slight change of behavior in one of our testcases showing that
we end up making one additional to Git): However, the impact is very
minimal, and only concerns updates for which we know the hooks have
very little to do, and therefore where performance is not key.

In the case of references that only have the hooks.no-emails config
option set, the overhead consists in setting one attribute for all
commits in a list and then counting the number of commits in that list.
This is offset to some degree by the fact that we save ourselves the
need to add an assertion check which would check the hooks.no-email
config option (which consists of regular expressions). Either way,
it should be negligible.

The upside is that future users of the __set_send_email_p_attr method
can save themselves a check of the hooks.no-emails config option.

Change-Id: Ib2216c639967fe6abf2dde57db7b6f0c7e5f554e
TN: T815-009
2020-08-15 14:20:30 -07:00
Joel Brobecker
632ea61e68 Update incorrect comment in AbstractUpdate.__check_max_commit_emails
This commit updates a comment which hasn't been true anymore for
quite a while. It is written in such a way that it is neutral about
how the policy used to determine which commits trigger an email
and which don't, so as to avoid being wrong again the next time
we introduce new policies.

Change-Id: Icbbf7d5451e7ce2bf6b3fd0e4e10cdca902b38b4
TN: T815-009
2020-08-15 14:20:30 -07:00
Joel Brobecker
2ebb722e13 Allow projects to customzie the contents of commit emails
This commit introduces a new config option hooks.commit-email-formatter
which allows projects to customize the emails being sent for new commits.

Change-Id: Ibd6b4991f15e204e0a076bb705fc8106f08ff8b4
TN: T209-006
2020-08-10 06:35:24 -07:00
Joel Brobecker
4c982600a3 AbstractUpdate.commit_data_for_hook: New method to factorize code
This commit factorizes some code which is currently defined locally
when handling the hooks.commit-extra-checker config option. The intent
is to build on it when we add support for a new option allowing users
to customize the contents of the commit emails being sent (the info
we will want to pass to the new hook to customer emails will be
a superset of the information passed here).

Change-Id: I3cc9659b80b0b03752d227966776c804d3c5203a
TN: T209-006
2020-08-07 18:30:02 -07:00
Joel Brobecker
ff8859e9ac Incorrect Content-Type charset for emails with non-ascii chars in email body
Prior to this commit, emails are always sent with the default charset.
This is a problem when the email body contains some characters which
are not ASCII (e.g. accented letters).

This commit fixes the issue by checking the contents of the email body,
makes a best-guess as to the encoding being used for that body, and
then sets the email's charset accordingly.

A new testcase is not added, because an existing testcase already
demonstrates the issue and thus needed to be adjusted to verify
that this change has the desired effect: email_header_encoding.
This testcase verifies the behavior when sending emails for two
commits, and those two commits do indeed use different encodings
for the commit author (the first one is UTF-8, and the second is
indeed iso8859-1).

Change-Id: I20e4659ac416845f5970448d5cf97d96e061fc5d
TN: S507-007
2020-07-26 15:37:52 -07:00
Joel Brobecker
3e7cb792fe move code guessing encoding into own new function (guess_encoding)
This is preparation work towards using that function in the context
of improving the handling of non-ascii characters in the email body

Change-Id: Ic64ae4ad5dedeb531704245243708c2c7a96e7f9
TN: S507-007
2020-07-26 15:37:52 -07:00
Joel Brobecker
1de3eb14c7 Add support for new config option hooks.commit-extra-check
This option allows projects to ask the git-hooks to perform custom
checks of any kind to validate every new commit being pushed.

This commit also contains a change in hooks/updates/emails.py which
enhances the email-sending debug tracing mechanism used by the testsuite,
adding support for a tracing mode where the traces are just one line
per email. Technically, this change is independent from the rest,
but it is merged with this commit nonetheless because these changes
would show up as uncovered by the testsuite without the testcases
we're adding here. That way, we maintain the principle that we always
have 100% testing coverage for any commit.

Change-Id: I0e446a95d05f5578f977af3eaa547144187eb86b
TN: T209-005
2020-07-19 19:38:58 -07:00
Joel Brobecker
9db584955e pass the RefKind and object_type when creating Update objects
As part of determining which Update class it needs to instantiate
for the reference udpate being evaluated, the udpate factory
computes the following information:
  - The kind of reference being updated (branch? notes? tag?)
  - What operation on the branch is being performed (create? delete?
    update?)
  - The type of object the commit targeted by the reference
    (commit? tag?)

In the context of enhancing the git-hooks to allow projects to provide
custom-checks on commits, the first and third items seem like these
could be useful information to pass to those hooks. In order to allow
this, without recomputing that information, this commit enhances
the AbstractUpdate class __init__ method to add those as additional
required paramaters, and then stores them as two new attributes.

As a bonus side-effect of this change, the new_rev_type attribute
is no longer necessary, which allows us to save one external call
to Git.

In the meantime, having these new attributes means that we can use
those to cross-check, within each AbstractUpdate child class'
self_sanity_check method, that the ref_kind and object_type values
correspond to each class' expectation (in other words, we cross-check
that the factory instantiated the correct class).

One other side-effect of that change is that we are no longer calling
the get_object_type function with a null SHA1 anymore. We could
modify the function's implementation to only accept non-null SHA1
revisions, but this would denature the function, in my opinion.
There is a genuine chance that perhaps, one day, we'll need that
again. So, instead, we cover that function by adding a new unit test
instead.

Change-Id: I8fd1ce180a6e17c0401b4dee07b9bc07d2abfdda
TN: T209-005
2020-07-19 19:30:55 -07:00
Joel Brobecker
564edb15d3 rename CommitInfo's author property to full_author_email (clearer)
Initially, this property was born an attribute. It was then converted
to a property when the attribute's value got split into two attributes.
The original name of the attribute was used for this property so as
to facilitate the transition, as well as review of that change.

Now that the transition is done, this commit renames the property
into something which is more meaningful and obvious to the reader,
and then updates all the users.

Change-Id: I29f759dfae2a71a1c9b1a65fd93fe7604b6d952a
TN: T209-005
2020-07-12 17:30:51 -07:00
Joel Brobecker
f825154a6b CommitInfo: Keep the author's name and email address separate
Currently, the CommitInfo object records information about the author
of a commit as a single element, which is a string that contains
both the author's name and his email address, in RFC 822 format.
As of now, this is all the git-hooks need; however, in the context
of adding a hook that projects could use implement project-specific
validation rules, such hook might prefer to have name and email
addresses be passed as separate entities. This change makes this
possible without requiring an additional call to Git.

Note that this commit introduces a new property in class CommitInfo
with the same name ("author") as the attribute which has now been
split into two new attributes. This change highlights the fact that
the name we previously chose is very compact ("author"), but not
overly expressive. This commit keeps things as they are now to keep
the focus on the splitting. A followup patch will focus on the renaming.

Change-Id: I2b3dd4863f37c4364d4c477bcfc3952cedd2944a
TN: T209-005
2020-07-12 17:26:55 -07:00
Joel Brobecker
ff8d81cb4a compute each commit's revlog once and then cache it
This is preparatory work for being to pass a certain amount of data
to user-defined hooks. The intent of this change is to limit the number
of times we compute that information to at most once.

Generally speaking, the main part of this change consists in adding
the following new methods to class CommitInfo:
  - raw_revlog;
  - raw_revlog_lines; and

The rest of this change is mostly adjustments to the code that needs
to access commits' rev logs to get them from a shared CommitInfo object
rather than from a play revision (SHA1).

Additionally, the function is_revert_commit in git.py, which took
a commit revision as a paramenter and needed a call to "git log"
to get the commit's body, has been replaced by a new method in
class CommitInfo. An alternative approach might have been to
keep the function, and change its parameter to be a CommitInfo object.
But it seemed more natural to make this a method of the CommitInfo
class instead, so this is what this change does.

Change-Id: Ia4bf23f24226d1e9eddafc61afd37db37f0f5287
TN: T209-005
2020-07-12 17:24:39 -07:00
Joel Brobecker
4106237e97 Add reference name as argument in call to hooks.mailinglist script
When a project's hooks.mailinglist configuration is the name of
a script, the call to that script now includes the name of the reference
being updated as the first (and so far only) command-line argument.
This allows these projects to have different mailing list addresses
depending on the name of the reference being updated.

Change-Id: Idc920357023c2b5ec7b3b4d755e298a0f0b83bf0
TN: T209-008
2020-06-29 11:19:27 -07:00
Joel Brobecker
453f2581f1 Add support for hooks.rejected-branch-deletion-tip config option
This is a followup on the previous commit, which introduced a feature
allowing projects to control which branches can be deleted.

This feature is added separately, as it is only a minor enhancement,
and this allows to keep the previous commit more focused.

Change-Id: I90d630e0a5b37d092db408379193e7a863201f8b
TN: T209-004
2020-06-29 11:14:14 -07:00
Joel Brobecker
c601254862 add options to allow repositories to control branch deletion
Note that tag deletion privileges are controlled via a different
and already existing option (hooks.allow-delete-tag). Also, notes
are not covered as notes deletion is currently not allowed at all.

Change-Id: I1ecfd83cf27861885f98c31bdc416f67d2dddb5e
TN: T209-004
2020-06-29 11:10:57 -07:00
Joel Brobecker
05211f43ef fix reference name matching to perform full reference name matching
In other words, if you have a repository is configured with:

    [hooks]
        no-emails = refs/heads/master

The "hooks.no-emails" configuration should only apply to the master
branch, not any branch whose name happens to start with "master".

Incidentally, fixing this problem uncovered an error in the configuration
in the repository used in one of the testcases. This commit fixes that
error.

Change-Id: Ic3cdc56e4122fae65bbec792e12a002f2042b7e3
TN: T622-001
2020-06-21 18:00:38 -07:00
Joel Brobecker
4c7efb128d fix AbstractUpdate.search_config_option_list method description
Change-Id: I16eba460e243c5bb93cf433eabc906c1059dc71e
2020-06-21 17:41:56 -07:00