-
Notifications
You must be signed in to change notification settings - Fork 15.1k
[llvm][utils] Add a script to use PRs over pushing to main #166473
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
As discussed on discourse https://discourse.llvm.org/t/rfc-require-pull-requests-for-all-llvm-project-commits/88164 This is an attempt at a script to automatically make and land PRs. It creates a branch (or a series of branches for stacks), and lands them one by one, without waiting for review or checks to pass. It supports --auto-merge, for single PRs. For stacks, use Graphite or some other tool. It can work with GitHub tokens for https or ssh keys. Example: ```console GITHUB_TOKEN=$(gh auth token) python3 llvm_push_pr.py --upstream-remote origin ```
|
|
||
| printer = Printer() | ||
| token = os.getenv("GITHUB_TOKEN") | ||
| default_prefix = "dev/" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| default_prefix = "dev/" | |
| default_prefix = "users/" |
seems like a better default to have it work seamlessly with upstream branch restriction
| self.printer.print(f"On branch: {self.original_branch}") | ||
|
|
||
| try: | ||
| self._rebase_current_branch() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this part of the flow?
| "For stacks, the script must merge sequentially.", | ||
| file=sys.stderr, | ||
| ) | ||
| sys.exit(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is that? When pushing branches to upstream, you can cascade pull-requests to create your stack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm assuming you've tested this somewhere?
| import subprocess | ||
| import sys | ||
| import time | ||
| from typing import List, Optional |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we add a TODO that we should remove this once we move to Python 3.10 and can move to the native type annotations?
| import time | ||
| from typing import List, Optional | ||
|
|
||
| import requests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to prefer requests (third party library) over the stdlib urllib.request?
| if e.stderr: | ||
| self.print(f"--- stderr ---\n{e.stderr}", file=sys.stderr) | ||
| raise e | ||
| return e |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're returning an exception here when the type is specified as subprocess.CompletedProcess?
| def __init__( | ||
| self, dry_run: bool = False, verbose: bool = False, quiet: bool = False | ||
| ): | ||
| """Initializes the Printer with dry_run, verbose, and quiet settings.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This docstring seems pretty redundant. They're also not required by most style guides (including the Google one) for short functions.
| self.quiet = quiet | ||
|
|
||
| def print(self, message: str, file=sys.stdout): | ||
| """Prints a message to the specified file, respecting quiet mode.""" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here.
|
|
||
| printer = Printer() | ||
| token = os.getenv("GITHUB_TOKEN") | ||
| default_prefix = "dev/" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
users/ given the monorepo will reject branches without a users/ prefix?
| """Wrapper for run_command that passes the dry_run flag.""" | ||
| return self.printer.run_command(command, read_only=read_only, **kwargs) | ||
|
|
||
| def _get_repo_slug(self) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This duplicates functionality below and should also be constant given we only need to support the monorepo.
| self.printer.print(f"On branch: {self.original_branch}") | ||
|
|
||
| try: | ||
| self._rebase_current_branch() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to rebase on top of main. Github handles the actual merge for us, and I think not rebasing is one of the selling points of using this script.
| def _get_commit_stack(self) -> List[str]: | ||
| """Gets the stack of commits between the current branch's HEAD and its merge base with upstream.""" | ||
| target = f"{self.args.upstream_remote}/{self.args.base}" | ||
| merge_base_result = self._run_cmd( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need to find a merge base here? Why can't we just do a three dot diff between main and the user's branch?
I don't think this handles causes like someone rebasing a bunch of merged commits any more nicely.
| self.printer.print("Cleaning up temporary local branches...") | ||
| self._run_cmd(["git", "branch", "-D"] + self.created_branches) | ||
| self.printer.print("Cleaning up temporary remote branches...") | ||
| self._run_cmd( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this delete branches that have automerge enabled?
|
|
||
| self.printer.print(f"Found {len(initial_commits)} commit(s) to process.") | ||
| branch_base_name = self.original_branch | ||
| if self.original_branch in ["main", "master"]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite get why this check exists?
masterisn't a thing in LLVM- I almost never push from main, not sure why the current branch name matters?
| def _get_commit_details(self, commit_hash: str) -> tuple[str, str]: | ||
| """Gets the title and body of a commit.""" | ||
| result = self._run_cmd( | ||
| ["git", "show", "-s", "--format=%s%n%n%b", commit_hash], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| ["git", "show", "-s", "--format=%s%n%n%b", commit_hash], | |
| ["git", "show", "-s", "--format=%B", commit_hash], |
Why not just %B?
| ) | ||
| parts = result.stdout.strip().split("\n\n", 1) | ||
| title = parts[0] | ||
| body = parts[1] if len(parts) > 1 else "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the body contains multiple paragraphs?
| self._run_cmd(["git", "branch", "-f", branch_name, commit_hash]) | ||
| push_command = ["git", "push", self.args.remote, branch_name] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we creating local branches? We can just push a sha directly:
| self._run_cmd(["git", "branch", "-f", branch_name, commit_hash]) | |
| push_command = ["git", "push", self.args.remote, branch_name] | |
| push_command = ["git", "push", self.args.remote, f"{commit_hash}:refs/head/{branch_name}"] |
| """Checks if git is installed and if inside a git repository.""" | ||
| printer.print("Checking prerequisites...") | ||
| printer.run_command(["git", "--version"], capture_output=True, read_only=True) | ||
| if not os.getenv("GITHUB_TOKEN"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we name it GITHUB_LLVM_TOKEN?
In general folks may want to segregate their tokens to be finer grain, and so having a name that is more specific allows them to put it in a .bashrc or similar without colliding with other token needs.
| parser = argparse.ArgumentParser( | ||
| description="Create and land a stack of Pull Requests." | ||
| ) | ||
| GITHUB_REMOTE_NAME = "origin" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why "origin" is the best name, instead of "fork" for example?
As discussed on discourse
https://discourse.llvm.org/t/rfc-require-pull-requests-for-all-llvm-project-commits/88164
This is an attempt at a script to automatically make and land PRs.
It creates a branch (or a series of branches for stacks), and lands them one by one, without waiting for review or checks to pass.
It supports --auto-merge, for single PRs. For stacks, use Graphite or some other tool.
It can work with GitHub tokens for https or ssh keys.
Example:
GITHUB_TOKEN=$(gh auth token) python3 llvm_push_pr.py --upstream-remote origin