Skip to content

Conversation

@Uspectacle
Copy link

👋 Hello 😄
This is my first time adding full mypy typing to a library, so I’m excited to share this and would really appreciate your feedback.

This PR introduces comprehensive type annotations, detailed docstrings, and general code quality improvements across the MAIA2 codebase.

The main goals are:

  • Type safety — explicit, accurate type hints throughout the code.
  • Readability — improved clarity through standardized documentation.
  • Maintainability — clearer interfaces, better IDE and static analysis support.

All changes are fully backward compatible and carefully verified for correctness and consistency.

While I used an LLM to assist with the initial documentation and typing, I have personally reviewed, corrected, and validated every annotation and edit to ensure top-quality results.

If direct typing updates are not preferred, I’ve also prepared an alternative approach with stubs and .pyi files on another branch:
🔗 https://github.com/Uspectacle/maia2/tree/stubs

I personally believe that typing the codebase directly provides stronger long-term benefits, but I’m happy to adapt to the project’s preferred approach.

In the meantime, I’ve published the stubs separately here for convenience:
🔗 https://github.com/Uspectacle/maia2-stubs


General Improvements (all files)

  • Added module-level, class, and function docstrings.
  • Imported typing and used annotations for all functions and classes.
  • Removed unused imports.
  • Renamed unused variables with _.
  • Added comments to improve readability.
  • Applied ruff formatting and fixed all pylint and mypy warnings.

Breaking potential:

  • Minimal; almost only docstring and type annotation changes.

dataset.py

  • Ignored gdown import (not typed).
  • Moved constants to top: DEFAULT_SAVE_ROOT, TEST_DATASET_URL, TRAIN_DATASET_URL.
  • Simplified existence checks: if not os.path.exists(save_root).
  • Renamed evolving variable datafiltered_data in load_example_test_dataset.

Breaking potential:

  • Renaming datafiltered_data.
  • Constant names changed; users relying on inline URLs must update.

inference.py

  • Imported packages individually instead of bulk import.

  • preprocessing(): legal_moves cast to torch.float32.

  • get_preds(): used enumerate for loops.

  • inference_batch():

    • highest_prob_move now unpacks (key, value) from max().
    • Renamed accaccuracy.
  • inference_each(): renamed elo_selfelo_self_tensor and elo_oppoelo_oppo_tensor.

Breaking potential:

  • Variable renames (elo_selfelo_self_tensor, accaccuracy) could affect external code referencing old names.
  • Users depending on tuple unpacking from prepare() may not notice change; tuple return recommended in future.

main.py

  • Disabled pylint:too-many-lines.
  • Imported packages individually.
  • process_per_game(): checked clock_info is not None before threshold comparison.
  • game_filter(): renamed white_elowhite_elo_str, black_eloblack_elo_str; returns None when no result.
  • process_per_chunk(): simplified if len(ret_per_game) > 0.
  • MAIA2Dataset.__getitem__(): renamed board_inputboard_input_tensor.
  • Cast tqdm in evaluation functions (evaluate_MAIA1_data, train_chunks).
  • Cast loss_maia to loss in train_chunks().

Breaking potential:

  • Renamed variables may break user code depending on previous names.
  • None returns in game_filter() may require additional handling.

model.py

  • Ignored gdown import.

  • Moved constants to top: DEFAULT_SAVE_ROOT, CONFIG_URL, MODEL_URLS.

  • from_pretrained():

    • Renamed illegal Python variable typemodel_type.
    • Simplified os.path.exists checks.
    • Renamed evolving variable modelmaia2_model and model_module.

Breaking potential:

  • model renaming could break code using direct access to the old variable.

train.py

  • Renamed modelmaia2_model.
  • Renamed N_paramsn_params for PEP8 compliance.

Breaking potential:

  • Minor; variable renames may require user updates.

utils.py

  • Moved constants to top: ELO_INTERVAL, ELO_START, ELO_END, PIECE_TYPES.
  • Config default values provided (arbitrary choices).
  • parse_args(): added encoding="utf-8" for file reading.
  • get_side_info(): added assert moving_piece is not None to avoid NoneType errors.
  • Renamed variables to avoid shadowing outer scope (chunkschunk_list, all_movesall_moves_uci).

Breaking potential:

  • Renamed variables could break user code that relies on old names.

Other Notes

  • All type annotations checked with mypy.
  • All formatting issues resolved with ruff.
  • prepare() is marked for future improvement: ideally returning a Tuple instead of a List for proper type safety.
  • Minor runtime behavior changes: mostly variable renames and type casts.

This PR introduces extensive type annotations, docstrings, and code improvements throughout the MAIA2 codebase. All changes have been made with minimal impact on runtime behavior and backward compatibility, though some evolving-type variable renames and functional clarifications may affect code that relies on previous names.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant