Skip to content

Conversation

@ilovepi
Copy link
Contributor

@ilovepi ilovepi commented Nov 5, 2025

As discussed on discourse
https://discourse.llvm.org/t/rfc-require-pull-requests-for-all-llvm-project-commits/88164

This is an attempt at a script to automatically make and land PRs.

It creates a branch (or a series of branches for stacks), and lands them one by one, without waiting for review or checks to pass.

It supports --auto-merge, for single PRs. For stacks, use Graphite or some other tool.

It can work with GitHub tokens for https or ssh keys.

Example:

GITHUB_TOKEN=$(gh auth token) python3 llvm_push_pr.py --upstream-remote origin

As discussed on discourse
https://discourse.llvm.org/t/rfc-require-pull-requests-for-all-llvm-project-commits/88164

This is an attempt at a script to automatically make and land PRs.

It creates a branch (or a series of branches for stacks), and lands them
one by one, without waiting for review or checks to pass.

It supports --auto-merge, for single PRs. For stacks, use Graphite or
some other tool.

It can work with GitHub tokens for https or ssh keys.

Example:
```console
GITHUB_TOKEN=$(gh auth token) python3 llvm_push_pr.py --upstream-remote origin
```

printer = Printer()
token = os.getenv("GITHUB_TOKEN")
default_prefix = "dev/"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
default_prefix = "dev/"
default_prefix = "users/"

seems like a better default to have it work seamlessly with upstream branch restriction

self.printer.print(f"On branch: {self.original_branch}")

try:
self._rebase_current_branch()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this part of the flow?

"For stacks, the script must merge sequentially.",
file=sys.stderr,
)
sys.exit(1)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is that? When pushing branches to upstream, you can cascade pull-requests to create your stack.

Copy link
Contributor

@boomanaiden154 boomanaiden154 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm assuming you've tested this somewhere?

import subprocess
import sys
import time
from typing import List, Optional
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a TODO that we should remove this once we move to Python 3.10 and can move to the native type annotations?

import time
from typing import List, Optional

import requests
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason to prefer requests (third party library) over the stdlib urllib.request?

if e.stderr:
self.print(f"--- stderr ---\n{e.stderr}", file=sys.stderr)
raise e
return e
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're returning an exception here when the type is specified as subprocess.CompletedProcess?

def __init__(
self, dry_run: bool = False, verbose: bool = False, quiet: bool = False
):
"""Initializes the Printer with dry_run, verbose, and quiet settings."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This docstring seems pretty redundant. They're also not required by most style guides (including the Google one) for short functions.

self.quiet = quiet

def print(self, message: str, file=sys.stdout):
"""Prints a message to the specified file, respecting quiet mode."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.


printer = Printer()
token = os.getenv("GITHUB_TOKEN")
default_prefix = "dev/"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

users/ given the monorepo will reject branches without a users/ prefix?

"""Wrapper for run_command that passes the dry_run flag."""
return self.printer.run_command(command, read_only=read_only, **kwargs)

def _get_repo_slug(self) -> str:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This duplicates functionality below and should also be constant given we only need to support the monorepo.

self.printer.print(f"On branch: {self.original_branch}")

try:
self._rebase_current_branch()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need to rebase on top of main. Github handles the actual merge for us, and I think not rebasing is one of the selling points of using this script.

def _get_commit_stack(self) -> List[str]:
"""Gets the stack of commits between the current branch's HEAD and its merge base with upstream."""
target = f"{self.args.upstream_remote}/{self.args.base}"
merge_base_result = self._run_cmd(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to find a merge base here? Why can't we just do a three dot diff between main and the user's branch?

I don't think this handles causes like someone rebasing a bunch of merged commits any more nicely.

self.printer.print("Cleaning up temporary local branches...")
self._run_cmd(["git", "branch", "-D"] + self.created_branches)
self.printer.print("Cleaning up temporary remote branches...")
self._run_cmd(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this delete branches that have automerge enabled?


self.printer.print(f"Found {len(initial_commits)} commit(s) to process.")
branch_base_name = self.original_branch
if self.original_branch in ["main", "master"]:
Copy link
Collaborator

@joker-eph joker-eph Nov 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite get why this check exists?

  1. master isn't a thing in LLVM
  2. I almost never push from main, not sure why the current branch name matters?

def _get_commit_details(self, commit_hash: str) -> tuple[str, str]:
"""Gets the title and body of a commit."""
result = self._run_cmd(
["git", "show", "-s", "--format=%s%n%n%b", commit_hash],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
["git", "show", "-s", "--format=%s%n%n%b", commit_hash],
["git", "show", "-s", "--format=%B", commit_hash],

Why not just %B?

)
parts = result.stdout.strip().split("\n\n", 1)
title = parts[0]
body = parts[1] if len(parts) > 1 else ""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the body contains multiple paragraphs?

Comment on lines +432 to +433
self._run_cmd(["git", "branch", "-f", branch_name, commit_hash])
push_command = ["git", "push", self.args.remote, branch_name]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we creating local branches? We can just push a sha directly:

Suggested change
self._run_cmd(["git", "branch", "-f", branch_name, commit_hash])
push_command = ["git", "push", self.args.remote, branch_name]
push_command = ["git", "push", self.args.remote, f"{commit_hash}:refs/head/{branch_name}"]

"""Checks if git is installed and if inside a git repository."""
printer.print("Checking prerequisites...")
printer.run_command(["git", "--version"], capture_output=True, read_only=True)
if not os.getenv("GITHUB_TOKEN"):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we name it GITHUB_LLVM_TOKEN?
In general folks may want to segregate their tokens to be finer grain, and so having a name that is more specific allows them to put it in a .bashrc or similar without colliding with other token needs.

parser = argparse.ArgumentParser(
description="Create and land a stack of Pull Requests."
)
GITHUB_REMOTE_NAME = "origin"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why "origin" is the best name, instead of "fork" for example?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants