@alpineQ alpineQ commented Oct 28, 2025

Description

This PR adds support for Linux Pressure Stall Information (PSI) metrics to the instrumentation/host package. PSI, available on Linux kernel 4.20 and later, reports how much time tasks spend stalled waiting on CPU, memory, and I/O resources.

The implementation adds 20 new metric instruments:

  • CPU: system.psi.cpu.some.* (avg10, avg60, avg300, total)
  • Memory: system.psi.memory.some.* and system.psi.memory.full.* (avg10, avg60, avg300, total for each)
  • I/O: system.psi.io.some.* and system.psi.io.full.* (avg10, avg60, avg300, total for each)

Here, "some" means that at least one task is stalled on the resource, and "full" means that all non-idle tasks are stalled at the same time. The avg* metrics report the percentage of time spent stalled over 10-, 60-, and 300-second windows, while the total metrics track cumulative stall time in microseconds.
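
For reviewers unfamiliar with the kernel interface: each file under /proc/pressure contains lines of the form `some avg10=0.00 avg60=0.00 avg300=0.00 total=0`. The following is a minimal sketch of how one such line could be parsed. The `PSILine` type and the `parsePSILine` function are hypothetical names used for illustration and are not necessarily what this PR implements.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// PSILine is a hypothetical container for one line of a
// /proc/pressure/<resource> file.
type PSILine struct {
	Kind   string  // "some" or "full"
	Avg10  float64 // percent of time stalled over the last 10 seconds
	Avg60  float64 // percent of time stalled over the last 60 seconds
	Avg300 float64 // percent of time stalled over the last 300 seconds
	Total  uint64  // cumulative stall time in microseconds
}

// parsePSILine parses a line such as
// "some avg10=0.12 avg60=0.05 avg300=0.01 total=123456".
func parsePSILine(line string) (PSILine, error) {
	fields := strings.Fields(line)
	if len(fields) != 5 {
		return PSILine{}, fmt.Errorf("unexpected field count %d in %q", len(fields), line)
	}
	out := PSILine{Kind: fields[0]}
	if out.Kind != "some" && out.Kind != "full" {
		return PSILine{}, fmt.Errorf("unknown kind %q", out.Kind)
	}
	for _, f := range fields[1:] {
		key, val, ok := strings.Cut(f, "=")
		if !ok {
			return PSILine{}, fmt.Errorf("malformed field %q", f)
		}
		var err error
		switch key {
		case "avg10":
			out.Avg10, err = strconv.ParseFloat(val, 64)
		case "avg60":
			out.Avg60, err = strconv.ParseFloat(val, 64)
		case "avg300":
			out.Avg300, err = strconv.ParseFloat(val, 64)
		case "total":
			out.Total, err = strconv.ParseUint(val, 10, 64)
		default:
			err = fmt.Errorf("unknown key %q", key)
		}
		if err != nil {
			return PSILine{}, fmt.Errorf("parsing %q: %w", f, err)
		}
	}
	return out, nil
}

func main() {
	p, err := parsePSILine("some avg10=1.50 avg60=0.75 avg300=0.10 total=123456")
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", p)
}
```

The averages map naturally onto gauge instruments and the total onto a monotonic counter, though that mapping is an assumption here rather than a statement about the instruments this PR registers.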

PSI metrics are collected automatically alongside the existing host metrics when running on Linux systems with PSI support. On non-Linux systems, or when PSI is unavailable, PSI collection is skipped and the existing host metrics are unaffected.
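
One way to get this behavior is to treat a missing /proc/pressure file as "PSI unavailable" rather than as an error. The sketch below illustrates that idea only; `readPSI` is a hypothetical helper, and it assumes that a missing file is the sole signal of unavailability (a kernel with PSI disabled at boot may fail differently).

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
)

// readPSI is a hypothetical helper. It returns the file contents and
// ok=true when the PSI file can be read, ok=false with a nil error when
// the file does not exist (non-Linux host or kernel without PSI), and a
// non-nil error for any other failure.
func readPSI(path string) (data []byte, ok bool, err error) {
	data, err = os.ReadFile(path)
	switch {
	case err == nil:
		return data, true, nil
	case errors.Is(err, fs.ErrNotExist):
		return nil, false, nil
	default:
		return nil, false, err
	}
}

func main() {
	data, ok, err := readPSI("/proc/pressure/cpu")
	switch {
	case err != nil:
		fmt.Println("PSI read failed:", err)
	case !ok:
		fmt.Println("PSI unavailable, skipping PSI metrics")
	default:
		fmt.Print(string(data))
	}
}
```

Separating "unavailable" from "failed" lets a caller skip PSI instruments quietly while still surfacing genuine read errors.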

Link to tracking issue

Closes #8082

@alpineQ alpineQ requested review from a team and dmathieu as code owners October 28, 2025 14:24

alpineQ commented Oct 28, 2025

I understand that metric definitions should ideally be managed through the weaver tool and its YAML schema files. However, I noticed that the weaver configuration files are maintained in a separate repository, and I wasn't able to find documentation on the coordinated update process for both the schema definitions and the Go instrumentation code.

I'm happy to submit corresponding changes to the weaver repository if needed. If this is the preferred approach, I would appreciate guidance on:

  • The specific repository and file locations for PSI metric definitions
  • The recommended workflow for coordinating changes across both repositories


Development

Successfully merging this pull request may close these issues.

[Feature]: Pressure Stall Information (PSI) from linux hosts