Add iOS/iPadOS app monitoring via OpenTelemetry Swift SDK (SWIP-11)#13828
SWIP-11 iOS-specific implementation:
- IOS(47, true) layer in Layer.java
- ios-analyzer module with two SpanListeners:
  - IOSLayerSpanListener: detect os.name=iOS, register service with Layer.IOS
  - IOSMetricKitSpanListener: extract MetricKit daily stats to MAL, skip trace
- MAL rules (otel-rules/ios/ios-metrickit.yaml): 16 metrics with device/OS labels
- LAL rule (lal/ios-metrickit.yaml): layer:auto crash/hang diagnostic logs

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Introduces IOS layer with two SpanListeners:
- IOSHTTPSpanListener: outbound HTTP (NSURLSession) client metrics via OAL (service_cpm, endpoint_cpm, etc.). Supports OTel Swift .old/.stable/.httpDup semconv modes via stable-then-legacy attribute fallback (server.address -> net.peer.name, etc.).
- IOSMetricKitSpanListener: daily MetricKit stats (app launch time, hang time, CPU/GPU/memory, network transfer, exit counts split by foreground/background, OOM kills). Histogram percentiles use a finite 30s overflow ceiling and emit per-bucket counts with the MILLISECONDS unit override so MAL percentile math stays accurate.

Ships LAL rule for MetricKit crash/hang diagnostic logs, MAL rules for service- and instance-level aggregation, iOS Root/Service/Instance/Endpoint dashboards, Mobile menu entry, user-setup docs, SWIP-11 design doc, and an e2e test.

Bug fixes wrapped in:
- LAL 'layer: auto' mode dropped logs after the extractor decided the layer; codegen now propagates 'layer "..."' assignments to LogMetadata.layer so FilterSpec.doSink sees the script-decided value.
- Stable-semconv fallback keeps IOSHTTPSpanListener working across all three OTel Swift URLSessionInstrumentation modes.

UI submodule bump (apache/skywalking-booster-ui@a6be0e0):
- Mobile menu icon + i18n labels (en/es/zh) for the iOS layer.
- Fix metric label rendering in multi-expression dashboard widgets.
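The stable-then-legacy attribute fallback described above can be sketched roughly as follows. This is a minimal illustration, not the actual `IOSHTTPSpanListener` code: the class `SemconvFallback` and its method names are hypothetical, and span attributes are assumed to be available as a plain `Map<String, String>`. Only the attribute keys come from the PR text.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper for illustration; not the real listener implementation.
class SemconvFallback {

    // Returns the first non-blank attribute value among the keys, checked in order.
    static Optional<String> firstPresent(Map<String, String> attrs, String... keys) {
        for (String key : keys) {
            String value = attrs.get(key);
            if (value != null && !value.trim().isEmpty()) {
                return Optional.of(value);
            }
        }
        return Optional.empty();
    }

    // Stable semconv key first, legacy key second.
    static String peerHost(Map<String, String> attrs) {
        return firstPresent(attrs, "server.address", "net.peer.name").orElse("");
    }

    static String method(Map<String, String> attrs) {
        return firstPresent(attrs, "http.request.method", "http.method").orElse("");
    }

    // Returns 0 when no status attribute is present or it is not a number.
    static int statusCode(Map<String, String> attrs) {
        Optional<String> raw =
            firstPresent(attrs, "http.response.status_code", "http.status_code");
        if (!raw.isPresent()) {
            return 0;
        }
        try {
            return Integer.parseInt(raw.get().trim());
        } catch (NumberFormatException e) {
            return 0;
        }
    }

    public static void main(String[] args) {
        // A span emitted in legacy (.old) mode carries only the legacy keys.
        Map<String, String> legacy = new HashMap<>();
        legacy.put("net.peer.name", "api.example.com");
        legacy.put("http.method", "GET");
        legacy.put("http.status_code", "200");
        // prints: api.example.com GET 200
        System.out.println(peerHost(legacy) + " " + method(legacy) + " " + statusCode(legacy));
    }
}
```

Because the stable key is checked first, a span emitted in `.httpDup` mode (both key sets present) resolves to the stable values, while a `.old` span still resolves through the legacy keys.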
Pull request overview
This PR adds end-to-end iOS/iPadOS application monitoring support in SkyWalking OAP via the OpenTelemetry Swift SDK, introducing a new IOS layer, span listeners for outbound HTTP and MetricKit metrics/logs, plus corresponding MAL/LAL rules, UI dashboards/menu, documentation, and CI e2e coverage.
Changes:
- Add `IOS` layer and `ios-analyzer` module with `IOSHTTPSpanListener` (OAL traffic metrics) and `IOSMetricKitSpanListener` (MetricKit → MAL via OTLP receiver's MAL pipeline).
- Add iOS MAL/LAL rules and UI initialized templates (root/service/instance/endpoint dashboards + Mobile menu).
- Add iOS e2e test case and update docs/SWIP/CHANGES/security guidance.
Reviewed changes
Copilot reviewed 38 out of 38 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| test/e2e-v2/cases/ios/expected/service.yml | Expected service layer registration output for iOS e2e. |
| test/e2e-v2/cases/ios/expected/metrics-has-value.yml | Generic assertion for metrics returning non-empty values. |
| test/e2e-v2/cases/ios/expected/metrics-has-value-label.yml | Generic assertion for metrics returning non-empty labeled values. |
| test/e2e-v2/cases/ios/expected/logs.yml | Expected persisted MetricKit diagnostic log output for e2e. |
| test/e2e-v2/cases/ios/expected/endpoint.yml | Expected remote-domain endpoint output for e2e. |
| test/e2e-v2/cases/ios/e2e.yaml | New iOS e2e: sends OTLP traces/logs, verifies service/metrics/endpoints/logs. |
| test/e2e-v2/cases/ios/docker-compose.yml | e2e compose enabling OTLP traces/logs, iOS MAL rules, and iOS LAL. |
| oap-server/server-starter/src/main/resources/ui-initialized-templates/menu.yaml | Adds “Mobile → iOS” menu entry pointing to iOS docs. |
| oap-server/server-starter/src/main/resources/ui-initialized-templates/ios/ios-service.json | iOS Service dashboard template (HTTP + MetricKit + logs). |
| oap-server/server-starter/src/main/resources/ui-initialized-templates/ios/ios-root.json | iOS root dashboard template (service list + intro text). |
| oap-server/server-starter/src/main/resources/ui-initialized-templates/ios/ios-instance.json | iOS instance (app version) dashboard template. |
| oap-server/server-starter/src/main/resources/ui-initialized-templates/ios/ios-endpoint.json | iOS endpoint (remote domain) dashboard template. |
| oap-server/server-starter/src/main/resources/otel-rules/ios/ios-metrickit.yaml | Service-level MetricKit MAL rules for iOS. |
| oap-server/server-starter/src/main/resources/otel-rules/ios/ios-metrickit-instance.yaml | Instance-level (app version) MetricKit MAL rules for iOS. |
| oap-server/server-starter/src/main/resources/lal/ios-metrickit.yaml | LAL rule to persist MetricKit diagnostic logs with layer:auto. |
| oap-server/server-starter/pom.xml | Pull ios-analyzer into server-starter distribution. |
| oap-server/server-receiver-plugin/otel-receiver-plugin/src/main/java/org/apache/skywalking/oap/server/receiver/otel/otlp/OpenTelemetryMetricRequestProcessor.java | Adds toMeter(...) to feed pre-built SampleFamily into MAL pipeline. |
| oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/management/ui/template/UITemplateInitializer.java | Includes IOS layer folder for UI template initialization. |
| oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/analysis/Layer.java | Adds Layer.IOS. |
| oap-server/analyzer/pom.xml | Adds ios-analyzer module to build. |
| oap-server/analyzer/log-analyzer/src/test/resources/scripts/lal/test-lal/oap-cases/ios-metrickit.yaml | Adds LAL unit test script for iOS MetricKit diagnostics. |
| oap-server/analyzer/log-analyzer/src/test/resources/scripts/lal/test-lal/oap-cases/ios-metrickit.data.yaml | Adds LAL unit test data covering iOS/iPadOS/non-iOS cases. |
| oap-server/analyzer/log-analyzer/src/main/java/org/apache/skywalking/oap/log/analyzer/v2/provider/log/listener/RecordSinkListener.java | Refines comments and removes unused getBuilder() accessor. |
| oap-server/analyzer/log-analyzer/src/main/java/org/apache/skywalking/oap/log/analyzer/v2/compiler/LALBlockCodegen.java | Fix: propagate extractor-decided layer to LogMetadata.layer for layer:auto. |
| oap-server/analyzer/ios-analyzer/src/main/resources/META-INF/services/org.apache.skywalking.oap.server.core.trace.SpanListener | Registers iOS span listeners via SPI. |
| oap-server/analyzer/ios-analyzer/src/main/java/org/apache/skywalking/oap/analyzer/ios/listener/IOSMetricKitSpanListener.java | New listener converting MetricKit spans into MAL samples (no trace persistence). |
| oap-server/analyzer/ios-analyzer/src/main/java/org/apache/skywalking/oap/analyzer/ios/listener/IOSHTTPSpanListener.java | New listener emitting OAL sources for outbound HTTP URLSession spans. |
| oap-server/analyzer/ios-analyzer/pom.xml | New analyzer module definition for iOS listeners. |
| docs/menu.yml | Adds “Mobile Monitoring → iOS” doc navigation entry. |
| docs/en/swip/SWIP-11.md | Updates SWIP-11 to reflect final design/implementation details. |
| docs/en/setup/backend/backend-ios-monitoring.md | Adds iOS monitoring setup guide. |
| docs/en/security/README.md | Adds client-side monitoring security guidance for public-ingress telemetry. |
| docs/en/changes/changes.md | Adds changelog entry for iOS monitoring + related fixes. |
| CLAUDE.md | Adds guidance about full rebuild on cross-module changes. |
| .github/workflows/skywalking.yaml | Registers iOS monitoring e2e case in CI. |
| .claude/skills/run-e2e/SKILL.md | Adds e2e triage and UI-template DB-reset guidance. |
| .claude/skills/package/SKILL.md | Adds packaging skill doc to avoid stale dist/image artifacts. |
…ame in iOS listeners
- IOSHTTPSpanListener: pipe service.version through NamingControl.formatInstanceName() so the instance name respects length/format limits.
- IOSMetricKitSpanListener: add CoreModule dependency + NamingControl service; format service.name / service.version into MAL labels; early-return CONTINUE when service.name is missing/blank to avoid emitting empty-named service entities; fall back to 'unknown' instance id when service.version is missing.
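The guard and fallback behaviour described in this commit can be illustrated with a small sketch. It deliberately does not call SkyWalking's `NamingControl`; the class `IosEntityNames`, the `format` helper, and the 70-character limit are all assumptions made for illustration, and resource attributes are assumed to arrive as a `Map<String, String>`.

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the naming step; the real NamingControl is not used here.
class IosEntityNames {

    // Assumed length limit, for illustration only.
    static final int MAX_NAME_LENGTH = 70;

    // Trims the raw value and truncates it to the assumed limit.
    static String format(String raw) {
        String trimmed = raw.trim();
        return trimmed.length() <= MAX_NAME_LENGTH
            ? trimmed
            : trimmed.substring(0, MAX_NAME_LENGTH);
    }

    // An empty result tells the caller to skip the span,
    // so that no empty-named service entity is emitted.
    static Optional<String> serviceName(Map<String, String> resourceAttrs) {
        String name = resourceAttrs.get("service.name");
        if (name == null || name.trim().isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(format(name));
    }

    // Falls back to "unknown" when service.version is missing or blank.
    static String instanceName(Map<String, String> resourceAttrs) {
        String version = resourceAttrs.get("service.version");
        if (version == null || version.trim().isEmpty()) {
            return "unknown";
        }
        return format(version);
    }
}
```

A caller would check `serviceName(...)` first and return early when it is empty, then use `instanceName(...)` for the instance label.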
The setup-phase curl loop used 'curl && break || sleep 5' which silently succeeded even when every curl attempt got connection-refused (sleep returns 0). In CI the OAP container wasn't ready at the 25-second mark, so all 5 attempts failed and the log was never ingested — causing the 'logs list' verify case to fail after 20 retries with an empty result. Replace the loop with curl --retry-connrefused --retry-all-errors -f, and 'set -e' so the step surfaces the failure instead of swallowing it.
Pull request overview
Adds first-class iOS/iPadOS application monitoring support to SkyWalking OAP using telemetry produced by the OpenTelemetry Swift SDK, including new IOS-layer dashboards, MetricKit metrics/log processing, and CI e2e coverage. Also fixes a LAL layer:auto behavior where extractor-assigned layers weren’t propagated into LogMetadata, causing logs to be dropped.
Changes:
- Introduce `Layer.IOS` plus iOS SpanListeners for URLSession outbound HTTP OAL metrics and MetricKit → MAL metrics conversion.
- Add iOS MetricKit MAL rules, MetricKit diagnostic LAL rule, and UI initialized templates + menu entry for Mobile/iOS.
- Add new iOS e2e case and LAL unit test coverage; update docs, security notice, and changelog.
Reviewed changes
Copilot reviewed 39 out of 39 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| test/e2e-v2/cases/ios/expected/service.yml | Expected service inventory assertions for IOS layer e2e. |
| test/e2e-v2/cases/ios/expected/metrics-has-value.yml | Generic expected shape for metrics queries (non-empty values). |
| test/e2e-v2/cases/ios/expected/metrics-has-value-label.yml | Generic expected shape for metrics queries with labels. |
| test/e2e-v2/cases/ios/expected/logs.yml | Expected persisted MetricKit diagnostic logs assertions. |
| test/e2e-v2/cases/ios/expected/endpoint.yml | Expected endpoint list assertions for domain-based endpoints. |
| test/e2e-v2/cases/ios/e2e.yaml | New e2e scenario: send OTLP traces/logs and verify iOS metrics/log persistence. |
| test/e2e-v2/cases/ios/docker-compose.yml | e2e compose config enabling OTLP handlers, iOS MAL rules, and iOS LAL file. |
| oap-server/server-starter/src/main/resources/ui-initialized-templates/menu.yaml | Add “Mobile → iOS” menu entry in initialized UI templates. |
| oap-server/server-starter/src/main/resources/ui-initialized-templates/ios/ios-service.json | New iOS service dashboard template. |
| oap-server/server-starter/src/main/resources/ui-initialized-templates/ios/ios-root.json | New iOS root dashboard template. |
| oap-server/server-starter/src/main/resources/ui-initialized-templates/ios/ios-instance.json | New iOS instance (app version) dashboard template. |
| oap-server/server-starter/src/main/resources/ui-initialized-templates/ios/ios-endpoint.json | New iOS endpoint (remote domain) dashboard template. |
| oap-server/server-starter/src/main/resources/otel-rules/ios/ios-metrickit.yaml | Service-level MAL rules for MetricKit-derived metrics. |
| oap-server/server-starter/src/main/resources/otel-rules/ios/ios-metrickit-instance.yaml | Instance-level (service.version) MAL rules for MetricKit-derived metrics. |
| oap-server/server-starter/src/main/resources/lal/ios-metrickit.yaml | LAL rule to persist MetricKit diagnostic logs with IOS layer auto-detection. |
| oap-server/server-starter/pom.xml | Include ios-analyzer so iOS SpanListeners ship in the starter distribution. |
| oap-server/server-receiver-plugin/otel-receiver-plugin/src/main/java/org/apache/skywalking/oap/server/receiver/otel/otlp/OpenTelemetryMetricRequestProcessor.java | Add toMeter(...) entry point to push prebuilt SampleFamily into MAL converters. |
| oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/management/ui/template/UITemplateInitializer.java | Register IOS layer for UI template initialization. |
| oap-server/server-core/src/main/java/org/apache/skywalking/oap/server/core/analysis/Layer.java | Add IOS enum value. |
| oap-server/analyzer/pom.xml | Add new ios-analyzer module to build. |
| oap-server/analyzer/log-analyzer/src/test/resources/scripts/lal/test-lal/oap-cases/ios-metrickit.yaml | New LAL script test case for iOS MetricKit diagnostics. |
| oap-server/analyzer/log-analyzer/src/test/resources/scripts/lal/test-lal/oap-cases/ios-metrickit.data.yaml | Test data + expectations for iOS MetricKit diagnostics LAL rule. |
| oap-server/analyzer/log-analyzer/src/main/java/org/apache/skywalking/oap/log/analyzer/v2/provider/log/listener/RecordSinkListener.java | Clarify builder dispatch semantics; remove unused accessor. |
| oap-server/analyzer/log-analyzer/src/main/java/org/apache/skywalking/oap/log/analyzer/v2/compiler/LALBlockCodegen.java | Fix: propagate extractor-assigned layer into LogMetadata for layer:auto filtering/routing. |
| oap-server/analyzer/ios-analyzer/src/main/resources/META-INF/services/org.apache.skywalking.oap.server.core.trace.SpanListener | Register iOS SpanListeners via SPI. |
| oap-server/analyzer/ios-analyzer/src/main/java/org/apache/skywalking/oap/analyzer/ios/listener/IOSMetricKitSpanListener.java | Convert MetricKit spans into MAL SampleFamily and bypass trace persistence. |
| oap-server/analyzer/ios-analyzer/src/main/java/org/apache/skywalking/oap/analyzer/ios/listener/IOSHTTPSpanListener.java | Emit OAL sources for outbound URLSession client spans (service/instance/endpoint traffic). |
| oap-server/analyzer/ios-analyzer/pom.xml | New analyzer module definition + dependencies. |
| docs/menu.yml | Add “Mobile Monitoring → iOS” doc navigation entry. |
| docs/en/swip/readme.md | Move SWIP-11 from Proposed to Accepted list. |
| docs/en/swip/SWIP-11.md | Update SWIP-11 design doc to match implementation details. |
| docs/en/setup/backend/backend-ios-monitoring.md | New setup guide for iOS monitoring using OTel Swift SDK. |
| docs/en/security/README.md | Expand security guidance for client-side monitoring/ingestion endpoints. |
| docs/en/changes/changes.md | Changelog entry for iOS monitoring + related fixes. |
| CLAUDE.md | Add build guidance for cross-module changes. |
| .github/workflows/skywalking.yaml | Register new iOS e2e case in CI workflow. |
| .claude/skills/run-e2e/SKILL.md | Add debugging guidance for sequential e2e verify failures + UI template refresh notes. |
| .claude/skills/package/SKILL.md | New packaging skill doc to avoid stale-jar docker image builds. |
…vadoc
- application.yml: include ios/* in the default enabledOtelMetricsRules so MetricKit MXMetricPayload spans are actually converted to MAL metrics by default. Previously, with the starter's default rule list, the IOSMetricKitSpanListener's toMeter() was a silent no-op (converters was empty) and the span was neither a metric nor a persisted trace.
- OpenTelemetryMetricRequestProcessor: initialize converters to an empty list so processMetricsRequest()'s unconditional converters.forEach(...) no longer NPEs when no rules are enabled / rule loading produced none. toMeter() loses the now-redundant null guard.
- IOSHTTPSpanListener: treat missing HTTP status (0) as FAILURE. Client-side errors (DNS, TLS, timeout, connection-refused) produce no response code, so the previous status = (code == 0 || code < 400) inflated *_sla metrics on exactly the errors users care about. Also clamp latency to >= 0 to guard against device clock skew, and cap at Integer.MAX_VALUE.
- OTLPSpanReader: update the spanKind() javadoc to match the actual contract (OTLP proto enum name like 'SPAN_KIND_CLIENT', not 'CLIENT').
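The status and latency handling described for IOSHTTPSpanListener can be sketched as below. This is an illustration under assumptions rather than the listener's actual code: the class `HttpSpanMath` and its method names are hypothetical, timestamps are assumed to be epoch milliseconds, and the `< 400` threshold is taken from the previous expression quoted in the commit message.

```java
// Hypothetical helper for illustration; not the real IOSHTTPSpanListener code.
class HttpSpanMath {

    // A missing status code (0) means the request never got a response
    // (DNS, TLS, timeout, connection refused), so it counts as a failure.
    static boolean isSuccess(int statusCode) {
        return statusCode > 0 && statusCode < 400;
    }

    // Clamps negative durations (device clock skew) to 0
    // and caps very large ones at Integer.MAX_VALUE.
    static int latencyMillis(long startMillis, long endMillis) {
        long delta = endMillis - startMillis;
        if (delta < 0) {
            return 0;
        }
        return (int) Math.min(delta, (long) Integer.MAX_VALUE);
    }

    public static void main(String[] args) {
        System.out.println(isSuccess(0));             // prints: false
        System.out.println(isSuccess(200));           // prints: true
        System.out.println(latencyMillis(1000, 900)); // prints: 0
    }
}
```

Counting status 0 as a failure keeps success-rate style metrics from being inflated by requests that failed before any response arrived.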
Regenerated via license-eye dependency resolve after the skywalking-ui submodule bump to a6be0e0 pulled lodash / lodash-es 4.17.23 -> 4.18.1.
Missed updating this expected file after adding ios/* to the default enabledOtelMetricsRules in application.yml; the storage e2e compares /debugging/config/dump output against a static expected file, so every change to the default rule list must be mirrored here.
spring-ai-examples forwards requests to provider:9090 (mock OpenAI), but only depended on oap. The trigger began before provider's Tomcat was listening, producing ConnectException and a flaky 500.
Add iOS/iPadOS app monitoring via OpenTelemetry Swift SDK (SWIP-11)
- `docs/en/setup/backend/backend-ios-monitoring.md`; SWIP-11 design doc revised.
- `oap-cases/ios-metrickit` LAL unit tests and `test/e2e-v2/cases/ios/` e2e covering HTTP OAL, MetricKit MAL, and LAL log persistence.

Introduces the `IOS` layer with two `SpanListener`s:
- `IOSHTTPSpanListener`: outbound HTTP (NSURLSession) client metrics via OAL (`service_cpm`, `endpoint_cpm`, etc.). Supports all three OTel Swift URLSessionInstrumentation semconv modes (`.old`/`.stable`/`.httpDup`) via stable-then-legacy attribute fallback (`server.address` → `net.peer.name`, `http.request.method` → `http.method`, `http.response.status_code` → `http.status_code`).
- `IOSMetricKitSpanListener`: daily MetricKit stats (app launch time, hang time, CPU/GPU/memory, network transfer, exit counts split by foreground/background, OOM kills). Histogram percentiles use a finite 30 s overflow ceiling and emit per-bucket counts with `defaultHistogramBucketUnit(MILLISECONDS)` so MAL's percentile math produces correct values.

Ships LAL rule for MetricKit crash/hang diagnostic logs, MAL rules for service- and instance-level aggregation, iOS Root/Service/Instance/Endpoint dashboards, a Mobile menu entry, user-setup docs, SWIP-11 design doc, and an e2e test registered in the CI workflow.

Bug fixes wrapped in:
- LAL `layer: auto` mode dropped logs after the extractor set the layer; codegen now propagates `layer "..."` assignments to `LogMetadata.layer` so `FilterSpec.doSink()` sees the script-decided value.

UI submodule bump: `apache/skywalking-booster-ui@a6be0e0`.

Screenshots
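iOS Service dashboard, Overview tab (MetricKit launch/hang percentile charts, Hang Time Sum, Abnormal Exit counts split by foreground/background, OOM kills, outbound HTTP triplet, Peak Memory, Avg Network Transfer, Scroll Hitch Ratio)
![overview](https://private-user-images.githubusercontent.com/5441976/581239959-b84ce5f6-48f3-43c5-a496-fcba7ab1df51.png?jwt=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NzY4NzU1ODYsIm5iZiI6MTc3Njg3NTI4NiwicGF0aCI6Ii81NDQxOTc2LzU4MTIzOTk1OS1iODRjZTVmNi00OGYzLTQzYzUtYTQ5Ni1mY2JhN2FiMWRmNTEucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI2MDQyMiUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNjA0MjJUMTYyODA2WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9MDkzM2UxOGZjYzAzNTU4NDdiNWI2ZmJmYjJiOWI3OGZiYmI0YTZmYzc1MDQ2ZDhiZTYyN2Q1ZTU4ZGY5NWUxMSZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QmcmVzcG9uc2UtY29udGVudC10eXBlPWltYWdlJTJGcG5nIn0.BTVzcUV7I6ycFZQL4hUyILY3oDs4pm4o1PATHJsf3tY)
iOS Service dashboard, Instance tab (App Version Breakdown table: Launch Time P50 / Hang Time Sum / Crashes / Outbound Load per `service.version`)
![instance](https://private-user-images.githubusercontent.com/5441976/581239972-1d0d9ee7-5f35-4c44-931d-4e1e5bc75d72.png?jwt=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NzY4NzU1ODYsIm5iZiI6MTc3Njg3NTI4NiwicGF0aCI6Ii81NDQxOTc2LzU4MTIzOTk3Mi0xZDBkOWVlNy01ZjM1LTRjNDQtOTMxZC00ZTFlNWJjNzVkNzIucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI2MDQyMiUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNjA0MjJUMTYyODA2WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9ZDdjOWMxYzUzNGZlYzE2YTZmZjVmMDUxMTgwOTE4MjhkNDJjZjkxYmM4YzIyNzA3N2ExNjJmZDk0OWI1YmQ2ZCZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QmcmVzcG9uc2UtY29udGVudC10eXBlPWltYWdlJTJGcG5nIn0.dWuxtjvxyohPiBi3ZCXHAsNxRSvgVBVLXCZhdgjHaQo)
iOS Service dashboard, Log tab (LAL-persisted MetricKit diagnostic logs: crash / hang / CPU exception records from the `ios-metrickit` rule)
![logs](https://private-user-images.githubusercontent.com/5441976/581239976-ad32fb64-de51-4a43-932c-d80bb3feffff.png?jwt=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NzY4NzU1ODYsIm5iZiI6MTc3Njg3NTI4NiwicGF0aCI6Ii81NDQxOTc2LzU4MTIzOTk3Ni1hZDMyZmI2NC1kZTUxLTRhNDMtOTMyYy1kODBiYjNmZWZmZmYucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI2MDQyMiUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNjA0MjJUMTYyODA2WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9YWNiNGI2M2M4NDZmOGVhMzVjYzRjMDQxNDc2ZjRkNTQzOTMwOTAyOWMyODE5YjkwYjJhMzg5ZmU5ZTQ0YjZlYiZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QmcmVzcG9uc2UtY29udGVudC10eXBlPWltYWdlJTJGcG5nIn0.5jpEK6d4jROE8p1SEgUqhLaxsddldr3_3pb_vi_CjO0)
iOS Endpoint dashboard (remote domain: Outbound Load, Avg Response Time, Success Rate, Response Time Percentile)
![endpoint](https://private-user-images.githubusercontent.com/5441976/581239981-bcadf42a-5f2a-4883-9e0e-6cc97c20ba7a.png?jwt=eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3NzY4NzU1ODYsIm5iZiI6MTc3Njg3NTI4NiwicGF0aCI6Ii81NDQxOTc2LzU4MTIzOTk4MS1iY2FkZjQyYS01ZjJhLTQ4ODMtOWUwZS02Y2M5N2MyMGJhN2EucG5nP1gtQW16LUFsZ29yaXRobT1BV1M0LUhNQUMtU0hBMjU2JlgtQW16LUNyZWRlbnRpYWw9QUtJQVZDT0RZTFNBNTNQUUs0WkElMkYyMDI2MDQyMiUyRnVzLWVhc3QtMSUyRnMzJTJGYXdzNF9yZXF1ZXN0JlgtQW16LURhdGU9MjAyNjA0MjJUMTYyODA2WiZYLUFtei1FeHBpcmVzPTMwMCZYLUFtei1TaWduYXR1cmU9Njk2YWQ1MDg0Zjc2NzFmZDUwNGUwZmVlZWJmNzdkYTYzZjQ2MzVmNWU0ZDlkNjZiMDA5MTJhMTEzOTQxNGUwYSZYLUFtei1TaWduZWRIZWFkZXJzPWhvc3QmcmVzcG9uc2UtY29udGVudC10eXBlPWltYWdlJTJGcG5nIn0.oZ_S0OUmzCJCV-GYKNHMoKtmcqNLkZ8yFiHg3YXbydU)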
- `CHANGES` log.