-
Notifications
You must be signed in to change notification settings - Fork 203
feat: Add instrumentation for MetricKit. #963
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
I don't know why the macOS tests are failing. |
I downloaded all macOS SDKs from version API_AVAILABLE(ios(13.0), macos(10.15)) API_UNAVAILABLE(tvos, watchos)
@interface MXMetricPayload : NSObject <NSSecureCoding>is in macOS After that, in all the others, it looks like this: API_AVAILABLE(ios(13.0)) API_UNAVAILABLE(macos, tvos, watchos)
@interface MXMetricPayload : NSObject <NSSecureCoding>Maybe it's related to this thing mentioned in the MXMetricManager documentation: So, seems that unless you compile it using the latest versions of Xcode (26.0 and up), this problem will happen Edit: I'd suggest removing macOS support until we fully migrate to Xcode 26.0 in CI/CD |
|
|
||
| let namespacedAttrs: [String: AttributeValueConvertable] = [ | ||
| "hang_duration": $0.hangDuration, | ||
| "exception.stacktrace_json": stacktraceJson |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Why do we have exception.stacktrace and exception.stacktrace_tree? From the semconv, seems that different formats are already implied, and we can just add the metric kit json format to the list https://opentelemetry.io/docs/specs/semconv/exceptions/exceptions-spans/#stacktrace-representation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- I removed
exception.stacktrace_json. It was redundant. - I added some code to simplify the stacktrace, mostly flattening it, based on feedback from the SIG.
- I added a doc to define the new format and the differences from Apple's JSON format.
The use case is so specific, I'm not convinced we can reuse the format for other kinds of crashes, but I'm open to the discussion.
|
|
||
| @available(iOS 13.0, macOS 12.0, macCatalyst 13.1, visionOS 1.0, *) | ||
| public func reportMetrics(payload: MXMetricPayload) { | ||
| let span = getMetricKitTracer().spanBuilder(spanName: "MXMetricPayload") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there was some debate about capturing MXMetric as a metric, span, or log. Could you document your decision to record this as a span?
With the way its setup now, seems that we can also extend this to support metrics in the future
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We discussed this in the SIG, and I think we're on the same page. This is documented in the README.
| decoding: callStackTree.jsonRepresentation(), | ||
| as: UTF8.self | ||
| ) | ||
| namespacedAttrs["exception.stacktrace_json"] = stacktraceJson |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question here about stacktrace_json
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addresses above.
|
|
||
| // Standard OTel exception attributes - will be populated below | ||
| var otelType: String? | ||
| var otelMessage: String? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How good are signalDescription and machExceptionDescription at describing crashes, and allowing them to be grouped together by common root cause?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It varies, but probably not sufficient. It just maps the human-readable names for the signals from Apple's documentation. For example, SIGSEGV will have the description "segmentation violation".
- I looked into what KSCrash does with its "Crash Doctor" code to make better message names. I don't think it's feasible to adapt that code for this PR. There's a lot of complexity to it, and it depends heavily on the fields KSCrash populates. I think it would be a reasonable future change if we can get it working, though.
- Even with KSCrash's better descriptions, I doubt it would be sufficient for crash deduplication. I dunno.
My recommendation would be for us to go with this, and see if we can improve the descriptions in the future, based on usage.
The Crash Reporter instrumentation you're working on may make users less reliant on this particular diagnostic anyway.
|
Left a bunch of questions. Thanks Bee! |
| @@ -0,0 +1,244 @@ | |||
| # MetricKit Instrumentation | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doc is really great
|
I removed support for macOS, as recommended. |

This is proposed instrumentation for Apple's MetricKit. The technical details for the instrumentation are thoroughly documented in the README file.
I am sure some details of the proposal will be controversial. I'd encourage you to particularly pay attention to:
MXMetricPayloadwith attributes for all reported valuesMXDiagnosticPayloadMXMetricPayloadspan is a 24-hour span over the time the data was aggregatedMXHistograms are converted to approximate averagesOf course any other feedback would also be appreciated.