-
Notifications
You must be signed in to change notification settings - Fork 115
[CI] Track whether a commit is a revert #601
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
780079c
to
3a3311a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Plumbing looks good, but I have some questions about data fidelity.
|
||
# Check which pull request or commit is being reverted (if any) | ||
pull_request_match = re.search( | ||
r"Reverts? (?:llvm\/llvm-project)?#(\d+)", commit.message |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this only works for PR reverts created through the UI.
Eg 4f33d7b7a9f39d733b7572f9afbf178bca8da127 does not have this despite reverting a PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For examples like 4f33d7b7a9f39d733b7572f9afbf178bca8da127, I think it's fine. It would still capture the sha of the reverted commit from the message.
We could work backwards with some SQL and determine which PR was reverted from the data we're already maintaining since we map commit_sha -> PR. Although I could add that logic right into this script somewhere to make that clear.
The only issue I've (infrequently) encountered so far is commits with custom messages that don't match any of these patterns, since this pattern matching is relatively rigid. So these matches are "best effort" for the time being.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if the developer did not capitalize "Revert"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if the developer did not capitalize "Revert"?
It won't get captured. But the standard ways of doing reverts (through the cli with git revert
and through the Github UI) both capitalize it. If we can properly capture both of those cases, I think we should be fine.
We could work backwards with some SQL and determine which PR was reverted from the data we're already maintaining since we map commit_sha -> PR. Although I could add that logic right into this script somewhere to make that clear.
Doing it later in SQL sounds good to me. Can you update the descriptions of the fields in the schema JSON to say that you can't rely on the presence of these fields to determine if it was a PR or commit that got reverted?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
What happens if the developer did not capitalize "Revert"?
I've updated the pattern matching to be case-insensitive just to get ahead of such commit messages.
pull_request_match = re.search( | ||
r"Reverts? (?:llvm\/llvm-project)?#(\d+)", commit.message | ||
) | ||
commit_match = re.search(r"This reverts commit (\w+)", commit.message) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this intended to match the commit SHA from reverts for both PRs and direct commits or just the latter? It matches the former too if someone manually runs git revert
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is intended to capture reverts from both PRs and direct commits
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is getting to the point where we should really have some unit testing with reasonable mocks.
A lot of the discussion here would be best encoded in a handful of unit tests and that will ensure future functionality doesn't regress. Something like https://github.com/llvm/llvm-project/blob/main/.ci/metrics/metrics_test.py I think should be a good baseline. We don't have anything yet to run them in CI for zorg, but that's not difficult to get running.
I can follow up with another PR setting up some unit tests. |
This changes adds additional fields for tracking whether a commit to llvm-project is a revert and what commit/pull request it is reverting. Such commits are detecting by pattern matching against the commit message.
This changes adds additional fields for tracking whether a commit to llvm-project is a revert and what commit/pull request it is reverting.
Such commits are detecting by pattern matching against the commit message.