diff --git a/docs/images/synthetics/cert-authority.png b/docs/images/synthetics/cert-authority.png index 770996fe4..e9abb1c91 100644 Binary files a/docs/images/synthetics/cert-authority.png and b/docs/images/synthetics/cert-authority.png differ diff --git a/docs/images/synthetics/cert-expiry2.png b/docs/images/synthetics/cert-expiry2.png index 6b850c848..41c1a7134 100644 Binary files a/docs/images/synthetics/cert-expiry2.png and b/docs/images/synthetics/cert-expiry2.png differ diff --git a/docs/images/synthetics/http-add-new-monitor.png b/docs/images/synthetics/http-add-new-monitor.png index ab6ba6e60..254071d95 100644 Binary files a/docs/images/synthetics/http-add-new-monitor.png and b/docs/images/synthetics/http-add-new-monitor.png differ diff --git a/docs/images/synthetics/root-cause-discovery/logs-trace-match.png b/docs/images/synthetics/root-cause-discovery/logs-trace-match.png new file mode 100644 index 000000000..72057df5e Binary files /dev/null and b/docs/images/synthetics/root-cause-discovery/logs-trace-match.png differ diff --git a/docs/images/synthetics/root-cause-discovery/logs-url-match.png b/docs/images/synthetics/root-cause-discovery/logs-url-match.png new file mode 100644 index 000000000..345414b02 Binary files /dev/null and b/docs/images/synthetics/root-cause-discovery/logs-url-match.png differ diff --git a/docs/images/synthetics/root-cause-discovery/metrics-tab.png b/docs/images/synthetics/root-cause-discovery/metrics-tab.png new file mode 100644 index 000000000..db381e12f Binary files /dev/null and b/docs/images/synthetics/root-cause-discovery/metrics-tab.png differ diff --git a/docs/images/synthetics/root-cause-discovery/trace-id-response-header.png b/docs/images/synthetics/root-cause-discovery/trace-id-response-header.png new file mode 100644 index 000000000..f4179be67 Binary files /dev/null and b/docs/images/synthetics/root-cause-discovery/trace-id-response-header.png differ diff --git a/docs/images/synthetics/root-cause-discovery/traces-trace-match.png b/docs/images/synthetics/root-cause-discovery/traces-trace-match.png new file mode 100644 index 000000000..a3db93be9 Binary files /dev/null and b/docs/images/synthetics/root-cause-discovery/traces-trace-match.png differ diff --git a/docs/images/synthetics/root-cause-discovery/traces-url-match.png b/docs/images/synthetics/root-cause-discovery/traces-url-match.png new file mode 100644 index 000000000..eb65ab8e9 Binary files /dev/null and b/docs/images/synthetics/root-cause-discovery/traces-url-match.png differ diff --git a/docs/images/synthetics/ssl-check-type.png b/docs/images/synthetics/ssl-check-type.png index 158b5032d..02eddbf1d 100644 Binary files a/docs/images/synthetics/ssl-check-type.png and b/docs/images/synthetics/ssl-check-type.png differ diff --git a/docs/synthetics/getting-started.md b/docs/synthetics/getting-started.md index 7a4da3e1f..a3a327782 100644 --- a/docs/synthetics/getting-started.md +++ b/docs/synthetics/getting-started.md @@ -6,24 +6,22 @@ To start monitoring your website or API, create a Sematext Cloud account in eith * [Click here](https://apps.sematext.com/ui/synthetics-create/app/Synthetics) to create the App in the US data center; or * [Click here](https://apps.eu.sematext.com/ui/synthetics-create/app/Synthetics) to create the App in the EU data center -You should see the form below: +## Create a Synthetics App + +- Enter a name for your Synthetics App. Using the domain or API endpoint you want to monitor works well here. 
+- After naming your App, choose the type of monitor you want to create. Create Synthetics App Form +### HTTP Monitor - * Fill in the App Name, usually the domain or API Endpoint will work great for this purpose - * Enter the emails of your team members so that they're invited to the App if they don't have access to your account +The [HTTP monitor](/docs/synthetics/http-monitor/) sends a single HTTP request to the specified URL and records the response — status code, headers, body, and timing metrics. Use it to monitor APIs, endpoints, or any URL where you want to verify availability and response correctness. -After clicking Continue you will be taken to the next screen where you can create your first Synthetic monitor by clicking the **New Monitor** button in the top right corner of the page. You can then choose between creating an HTTP monitor and a Browser monitor, as shown in the screenshot below. +### Browser Monitor -Create First Monitor \ No newline at end of file +The [Browser monitor](/docs/synthetics/browser-monitor/) loads a URL or executes a script in a real Chrome browser. It records performance metrics, screenshots, and Web Vitals. Use it to monitor web pages or simulate user journeys across multiple pages. diff --git a/docs/synthetics/http-monitor.md b/docs/synthetics/http-monitor.md index 53625af57..85436b759 100644 --- a/docs/synthetics/http-monitor.md +++ b/docs/synthetics/http-monitor.md @@ -15,7 +15,9 @@ The HTTP monitor sends a single HTTP request with its configured request settings * **Locations** - List of locations to run the monitor. * **[Scheduled Monitor Pauses](/docs/synthetics/scheduled-pauses/)** - Specify one or more time periods a monitor should be paused -### Request Settings +### Advanced Settings + +#### Request Settings * **Authentication** - Fetch token for each run and pass it in your API requests, or pass username and password to connect to your protected APIs. * **Headers** - List of HTTP headers to be sent. @@ -35,7 +37,7 @@ By default, the HTTP monitor adds the headers below for all requests sent from t | `x-sematext-synthetics-id` | `` | Uniquely identifies this request. Can be used for tracing and correlation in the back end. | -### Response Settings +#### Response Settings * **Save Response Body** - Disable this option to prevent response body from being saved at runtime. This can be helpful to ensure no sensitive data gets featured in your test results. @@ -62,7 +64,7 @@ To monitor pages protected by some form of authentication each monitor can be co You can **dynamically fetch a token** for each run with token support and pass that token in your API requests. -When creating an HTTP monitor, navigate to the Authentication tab and select **Bearer/Access Token** authentication option. +When creating an HTTP monitor, expand the **Show Advanced Settings** section, navigate to the Authentication tab, and select the **Bearer/Access Token** authentication option.
![Access Token Authentication](/docs/images/synthetics/authentication-token.png) diff --git a/docs/synthetics/root-cause-discovery/adding-trace-id-to-response-headers.md b/docs/synthetics/root-cause-discovery/adding-trace-id-to-response-headers.md new file mode 100644 index 000000000..b6b689c5f --- /dev/null +++ b/docs/synthetics/root-cause-discovery/adding-trace-id-to-response-headers.md @@ -0,0 +1,109 @@ +title: Expose Trace ID in Response Headers +description: Expose your active trace ID in HTTP response headers so Sematext Synthetics can correlate failed monitor runs with the exact matching logs and traces. + +When Sematext Synthetics runs a monitor and the request fails, it checks the response headers for a trace ID. If one is present, it uses it to filter logs and traces to that exact request. Without it, correlation falls back to URL and time window matching, which may include data from unrelated requests. + +Adding a trace ID to your response headers is a small, one-time change to your application. The trace ID comes from your existing OpenTelemetry instrumentation — no additional setup is required. + +## Prerequisites + +Your service must already be instrumented with OpenTelemetry and shipping traces to a Sematext Tracing App. If you haven't set that up yet, see [Getting Started with Tracing](/docs/tracing/getting-started/) and the [OpenTelemetry SDKs](/docs/tracing/sdks/) documentation. + +For complete working examples, see the [sematext-opentelemetry-examples](https://github.com/sematext/sematext-opentelemetry-examples) repository. + +## Java / Spring Boot + +Add a servlet filter that reads the active span context from the OTel agent and writes the trace ID to the response headers before the request is processed. + +```java +import io.opentelemetry.api.trace.Span; +import io.opentelemetry.api.trace.SpanContext; +import jakarta.servlet.FilterChain; +import jakarta.servlet.ServletException; +import jakarta.servlet.http.HttpServletRequest; +import jakarta.servlet.http.HttpServletResponse; +import org.springframework.core.Ordered; +import org.springframework.core.annotation.Order; +import org.springframework.stereotype.Component; +import org.springframework.web.filter.OncePerRequestFilter; + +import java.io.IOException; + +@Component +@Order(Ordered.LOWEST_PRECEDENCE - 10) +public class TraceIdFilter extends OncePerRequestFilter { + + @Override + protected boolean shouldNotFilter(HttpServletRequest request) { + return "/error".equals(request.getRequestURI()); + } + + @Override + protected void doFilterInternal(HttpServletRequest request, HttpServletResponse response, FilterChain chain) + throws ServletException, IOException { + + SpanContext ctx = Span.current().getSpanContext(); + if (ctx.isValid()) { + response.setHeader("X-Trace-Id", ctx.getTraceId()); + } + + chain.doFilter(request, response); + } +} +``` + +The OTel Java agent creates the span at the Tomcat connector level, above the filter chain, so `Span.current()` is already valid when the filter runs. Headers must be set before `chain.doFilter()` is called because the response is committed after that point. No additional dependency is needed beyond the OTel API, which is already on the classpath when using the Java agent. + +## Node.js / Express + +Add a middleware that runs on every request and sets the trace ID header using the OTel API. 
+ +```js +const { trace } = require('@opentelemetry/api'); + +app.use((req, res, next) => { + const span = trace.getActiveSpan(); + if (span) { + res.setHeader('X-Trace-Id', span.spanContext().traceId); + } + next(); +}); +``` + +Register this middleware early in your Express app, before your route handlers. + +## Python / Flask + +Use Flask's `after_request` hook to add the trace ID to every response. + +```python +from opentelemetry import trace + +@app.after_request +def add_trace_id(response): + span = trace.get_current_span() + ctx = span.get_span_context() + if ctx.is_valid: + response.headers['X-Trace-Id'] = format(ctx.trace_id, '032x') + return response +``` + +## Verifying It Works + +How you verify the header depends on the Synthetics Monitor type. + +**HTTP Monitor** — open the monitor's run details and check the **Request** tab. You should see the `X-Trace-Id` header in the response headers section. + +**Browser Monitor** — open the monitor's run details and go to the **Waterfall** view. Click on the URL you instrumented, then open its **Request** tab to inspect the response headers for that specific request. + +![Trace ID in Response Headers](/docs/images/synthetics/root-cause-discovery/trace-id-response-header.png) + +Once Sematext detects the trace ID, the Logs and Traces tabs in the Troubleshoot section will let you drill into the exact logs and the individual trace associated with the failed monitor run. Without trace ID, logs and traces will be filtered by the URL and time window, which could include logs and traces from other requests as well. + +## Further Reading + +- [Logs Correlation](/docs/synthetics/root-cause-discovery/logs-correlation/) +- [Traces Correlation](/docs/synthetics/root-cause-discovery/traces-correlation/) +- [OpenTelemetry SDKs](/docs/tracing/sdks/) +- [Sematext OpenTelemetry Examples](https://github.com/sematext/sematext-opentelemetry-examples) +- [Getting Started with Tracing](/docs/tracing/getting-started/) diff --git a/docs/synthetics/root-cause-discovery/logs-correlation.md b/docs/synthetics/root-cause-discovery/logs-correlation.md new file mode 100644 index 000000000..c01155f4c --- /dev/null +++ b/docs/synthetics/root-cause-discovery/logs-correlation.md @@ -0,0 +1,40 @@ +title: Logs Correlation +description: Correlate failed Synthetics monitor runs with application and service logs. + +The Logs tab in the failed run flyout finds logs from [connected](/docs/guide/connected-apps/) Logs Apps that match the failed request. Depending on your setup, logs are matched by exact trace ID or by URL and time window. + +## Connecting a Logs App + +If you don't have a Logs App, the Logs section in the Troubleshoot tab will prompt you to create one and connect it to your Synthetics App automatically. If you already have Logs Apps in your account, you can select and connect the relevant one directly from the same tab. + +We recommend using the [OpenTelemetry Logs integration](/docs/integration/opentelemetry-logs/) for two reasons: if you expose your trace ID in response headers, Sematext can filter logs to the exact failing request; and even without a trace ID, OTel logs are structured and enriched, making it easier to spot errors and correlate across services. The [Sematext OpenTelemetry Examples](https://github.com/sematext/sematext-opentelemetry-examples) repo includes log shipping examples alongside the tracing and metrics setup. 
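+Even if some of your services log outside the OpenTelemetry log pipeline, it can help to write the active trace ID into each log line yourself: when automatic matching falls back to URL and time window, you can still search those logs for the exact trace ID of the failed request. The snippet below is a minimal sketch for plain Python `logging`; the `trace_id` field name is an illustrative choice, and the snippet relies only on the same OpenTelemetry trace API used in the [Expose Trace ID in Response Headers](/docs/synthetics/root-cause-discovery/adding-trace-id-to-response-headers/) examples. When logs are shipped through the OpenTelemetry SDK or Collector, the trace context is attached automatically and no extra code is needed.
+
+```python
+import logging
+
+from opentelemetry import trace
+
+
+class TraceIdFilter(logging.Filter):
+    """Attach the active OpenTelemetry trace ID to every log record."""
+
+    def filter(self, record):
+        ctx = trace.get_current_span().get_span_context()
+        # Same 32-character hex format as in the response header examples
+        record.trace_id = format(ctx.trace_id, "032x") if ctx.is_valid else ""
+        return True
+
+
+handler = logging.StreamHandler()
+handler.addFilter(TraceIdFilter())
+handler.setFormatter(
+    logging.Formatter("%(asctime)s %(levelname)s trace_id=%(trace_id)s %(message)s")
+)
+logging.getLogger().addHandler(handler)
+```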
+ +If Sematext detects a known service on the monitored host, such as [Nginx](/docs/integration/nginx-integration/) or [Apache](/docs/integration/apache-integration/), it will also suggest creating a service-specific Logs App. These come with out-of-the-box dashboards and alerts tailored to that service. Connecting both gives you application-level events from OTel logs and infrastructure-level events from the service logs in one place. + +## How Logs Are Matched + +### With a Trace ID + +This applies to **OpenTelemetry Logs Apps** and is the recommended way to correlate logs. If your backend includes the active trace ID in its HTTP response headers, Sematext reads it from the monitor run result and uses it to filter logs to that exact request. See [Expose Trace ID in Response Headers](/docs/synthetics/root-cause-discovery/adding-trace-id-to-response-headers/) to set this up. + +![Logs Tab - Matched by Trace ID](/docs/images/synthetics/root-cause-discovery/logs-trace-match.png) + +This eliminates noise from unrelated requests that happened around the same time and takes you directly to the logs for the specific request that triggered the monitor failure. + +### Without a Trace ID + +If your backend does not include a trace ID in its response headers, logs are matched by the monitored URL and the time window around the failure. The tab shows the number of matching logs — open them in a new tab by clicking the Logs App link. + +![Logs Tab - Matched by URL and Time Window](/docs/images/synthetics/root-cause-discovery/logs-url-match.png) + +## Exploring Logs + +Once you open the Logs App from this tab, filters are applied automatically — the URL and time range, or the trace ID if available. + +## Further Reading + +- [Expose Trace ID in Response Headers](/docs/synthetics/root-cause-discovery/adding-trace-id-to-response-headers/) +- [OpenTelemetry Logs integration](/docs/integration/opentelemetry-logs/) +- [OpenTelemetry SDKs](/docs/tracing/sdks/) +- [Logs Discovery](/docs/logs/discovery/intro/) +- [Context View](/docs/logs/context-view/) diff --git a/docs/synthetics/root-cause-discovery/metrics-correlation.md b/docs/synthetics/root-cause-discovery/metrics-correlation.md new file mode 100644 index 000000000..ab6b6be44 --- /dev/null +++ b/docs/synthetics/root-cause-discovery/metrics-correlation.md @@ -0,0 +1,33 @@ +title: Metrics Correlation +description: Correlate failed Synthetics monitor runs with infrastructure and service metrics. + +The Metrics tab in the failed run flyout lists all Monitoring Apps [connected](/docs/guide/connected-apps/) to your Synthetics App. It gives you a direct path to the metrics of the services hosting your monitored endpoint, scoped to the time of the failure. + +## Connecting a Monitoring App + +If you don't have a Monitoring App yet, the Metrics section in the Troubleshoot tab will prompt you to create one and connect it to your Synthetics App automatically. If you already have Monitoring Apps in your account, you can select and connect the relevant one directly from the same tab. + +We recommend using the [OpenTelemetry Metrics integration](/docs/integration/opentelemetry-monitoring/) — it produces structured, enriched telemetry that correlates well across signals and gives you more context when investigating failures. + +If Sematext detects a known service on the monitored host, such as [Nginx](/docs/integration/nginx-integration/) or [Apache](/docs/integration/apache-integration/), it will also suggest creating a service-specific Monitoring App. 
These come with out-of-the-box dashboards and alerts tailored to that service, so you can start monitoring what matters immediately after installing the [Sematext Agent](/docs/agents/sematext-agent/) on your host. + +## Using the Metrics Tab + +Once a Monitoring App is connected, it appears in the Metrics tab when a monitor run fails. + +![Metrics Tab - Connected Monitoring Apps](/docs/images/synthetics/root-cause-discovery/metrics-tab.png) + +Click the App name to open it in a new tab. The Monitoring App opens with the time range pre-set to around the time of the failure. From there, check for anomalies that coincide with the failure: + +- CPU and memory spikes +- Elevated error rates or dropped request counts +- Database connection pool exhaustion or high query latency +- Network throughput drops + +## Further Reading + +- [Monitoring overview](/docs/monitoring/) +- [OpenTelemetry Metrics integration](/docs/integration/opentelemetry-monitoring/) +- [Nginx integration](/docs/integration/nginx-integration/) +- [Apache integration](/docs/integration/apache-integration/) +- [Available integrations](/docs/integration/) diff --git a/docs/synthetics/root-cause-discovery/overview.md b/docs/synthetics/root-cause-discovery/overview.md index 134d29b05..f2da89148 100644 --- a/docs/synthetics/root-cause-discovery/overview.md +++ b/docs/synthetics/root-cause-discovery/overview.md @@ -1,105 +1,41 @@ -title: Overview -description: Troubleshoot endpoint issues by correlating with metrics & logs around the time your monitor failed. +title: Root Cause Discovery +description: Correlate failed Synthetics monitor runs with metrics, logs, and traces to find the backend cause of failures. -While Synthetics monitors detect website and API performance and availability issues, pinpointing backend issues that caused or are related to a failed monitor can be challenging. By leveraging Sematext all-in-one-platform capabilities you can identify the root cause of backend issues that caused your Synthetic monitor to fail. To do that you’ll want to connect your Monitoring and Logs Apps with your Synthetics Apps, as described below. +When a Synthetics monitor run fails, the run details flyout shows a **Troubleshoot** tab with **Metrics**, **Logs**, and **Traces** sections. These sections will prompt you to create or connect Tracing, Logs, and Monitoring Apps. Once connected, each section surfaces the backend data around the time of the failure that explains *why* the monitor failed: with metrics you can check CPU usage, memory, and request rates; with logs you can see error messages and warnings from your application; with traces you can follow the full request journey and identify which service or database call caused the problem. There are multiple ways to set this up and correlate data, all explained in detail in the sections below. -When a Synthetics monitor run fails, a Troubleshoot tab, shown below, is introduced under the failed runs flyout. This tab provides options to troubleshoot backend issues by correlating with metrics and logs. +## Traces -![Troubleshoot Tab](/docs/images/synthetics/troubleshoot/troubleshoot-tab.png) +Connect a [Tracing App](/docs/tracing/). Instrument your services with [OpenTelemetry](/docs/tracing/getting-started/) and traces will appear automatically when a monitor hits an instrumented endpoint. With distributed traces you can see exactly where time was spent across your services, identify slow database queries, service timeouts, or errors propagating across microservices. 
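+To give a rough idea of what an instrumented service looks like, here is a minimal sketch of a Python Flask service set up with the OpenTelemetry SDK. The service name, route, and OTLP endpoint below are placeholders; the exporter configuration for your Tracing App is covered in the tracing docs linked above.
+
+```python
+# Minimal sketch; the service name and OTLP endpoint below are placeholders.
+from flask import Flask
+from opentelemetry import trace
+from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter
+from opentelemetry.instrumentation.flask import FlaskInstrumentor
+from opentelemetry.sdk.resources import Resource
+from opentelemetry.sdk.trace import TracerProvider
+from opentelemetry.sdk.trace.export import BatchSpanProcessor
+
+provider = TracerProvider(resource=Resource.create({"service.name": "checkout-api"}))
+provider.add_span_processor(
+    BatchSpanProcessor(OTLPSpanExporter(endpoint="https://<your-otlp-endpoint>/v1/traces"))
+)
+trace.set_tracer_provider(provider)
+
+app = Flask(__name__)
+FlaskInstrumentor().instrument_app(app)  # each incoming request becomes a server span
+
+@app.route("/orders")
+def orders():
+    # The span for this request is created and exported automatically
+    return {"status": "ok"}
+```
+
+Once such traces reach the connected Tracing App, the Traces tab in the failed run flyout can match them to the monitor run that failed.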
-### Correlating with Metrics +[Traces correlation →](/docs/synthetics/root-cause-discovery/traces-correlation/) -Whenever a Synthetic monitor checks an endpoint, it triggers the execution of something that handles that endpoint. This might be an application, a serverless function, or something else. Digging into their performance metrics around the time when a monitor failed can provide clues about the failure. Perhaps the application or the underlying infrastructure were overloaded at the time. Maybe some process was using 100% of the CPU. Maybe the underlying database had too many open connections. These are the sorts of things that you will be able to correlate with the failed monitor run when you make use of this new functionality. +## Logs -### Correlating with Logs -Similarly, examining error messages or warnings in logs related to the endpoint around the time of failure can offer valuable insights. Logs may reveal issues such as misconfigurations, server resource limitations, application errors, connectivity problems, etc. +Connect a [Logs App](/docs/logs/). We recommend shipping logs via [OpenTelemetry](/docs/integration/opentelemetry-logs/) for the same reasons — structured, enriched telemetry that correlates well across signals and gives you more context when investigating failures. In addition, if we detect a known service on the monitored host, such as [Nginx](/docs/integration/nginx-integration/) or [Apache](/docs/integration/apache-integration/), we will suggest creating a dedicated Logs App for that service. These come with out-of-the-box dashboards and alerts tailored to each service, so you can start collecting the logs that matter immediately after installing the [Sematext Agent](/docs/agents/sematext-agent/) on your host. -## Setting up Troubleshooting +[Logs correlation →](/docs/synthetics/root-cause-discovery/logs-correlation/) -During monitor runs, we attempt to detect the type of service that is hosting the endpoint being monitored, such as Nginx or Apache. +## Metrics -**If the Service is Known by Sematext:** +Connect a [Monitoring App](/docs/monitoring/). We recommend shipping metrics via [OpenTelemetry](/docs/integration/opentelemetry-monitoring/) — it produces structured, enriched telemetry that correlates well across signals and gives you more context when investigating failures. Similar to logs, if we detect a known service running on the monitored host, we will suggest creating a dedicated Monitoring App for that service. -Sematext integrates with popular web servers such as Nginx, Apache, and more, providing out-of-the-box dashboards and alert rules tailored to each service type. If you don’t have any Monitoring or Logs Apps in your account for the [supported service](/docs/integration/#monitoring-logs), we will recommend creating one in the troubleshoot tab, as shown below. This will lead you to the [Monitoring](/docs/monitoring/) or [Logs](/docs/logs/) App creation. There you can install the [Sematext Agent](/docs/agents/sematext-agent/) based on the environment you choose and start shipping metrics and logs from that service. +[Metrics correlation →](/docs/synthetics/root-cause-discovery/metrics-correlation/) -![Apache Troubleshoot Tab](/docs/images/synthetics/troubleshoot/apache-troubleshoot-tab.png) +## Adding a Trace ID to Response Headers -Let’s imagine you’ve created a Monitoring App for Apache, installed [Sematext Agent](/docs/agents/sematext-agent/), and started monitoring Apache metrics. 
Now the Troubleshoot tab will change to Metrics and Logs tabs as shown in the screenshot below. The next time you go into the troubleshooting tab, if you want to add Logs on top of it, you don’t need to install anything additionally. +Once connected, a failed monitor run gives you metrics from the host, logs from the service, and the full distributed trace for the request, all filtered to the moment of failure. For your **OpenTelemetry Logs and Tracing Apps**, you can take correlation a step further by adding a trace ID to your HTTP response headers: -![Logs Tab](/docs/images/synthetics/troubleshoot/logs-tab.png) +- **Without trace ID** — results are filtered by URL and time window, which may include logs or traces from unrelated requests happening at the same time +- **With trace ID** — Sematext matches the trace ID from the response header directly against your OpenTelemetry logs and traces, filtering to the exact request that failed with no noise -We will just direct you to the [Discovery](/docs/fleet/discovery/) page, where you can easily set up log shipping from the service type with a couple of clicks without needing any additional installation. +The trace ID comes from your existing OpenTelemetry instrumentation. The change is small: read the active span's trace ID in your request handler and write it to a response header. See [Adding a Trace ID to Response Headers](/docs/synthetics/root-cause-discovery/adding-trace-id-to-response-headers/). -![Apache Logs Discovery](/docs/images/synthetics/troubleshoot/apache-logs-discovery.png) +## Getting the Most Out of Root Cause Discovery -Upon creating the Apps or connecting existing ones, you will start seeing metrics and logs around the time your monitor failed within the Troubleshoot tab, allowing you to drill down to the root cause from a single page. For more details, refer to the [Troubleshooting](#troubleshooting) section. +Connecting Monitoring, Logs, and Tracing Apps to your Synthetics App gives you the full picture when a monitor fails: -**If the Service is Unknown by Sematext:** - -In cases where the service type is unknown, all existing Monitoring and Logs Apps associated with your account are listed. You can choose to connect relevant Apps directly from the Troubleshoot tab, as seen here - -![List of Logs Apps](/docs/images/synthetics/troubleshoot/list-of-logs-apps.png) - -If no Apps exist, you can create them from the tab, and they will be automatically connected with your [Synthetics App](/docs/synthetics/), where you can view metrics and logs the next time your monitor fails. For more details, refer to the [Troubleshooting](#troubleshooting) section. - -![Create Apps](/docs/images/synthetics/troubleshoot/create-apps.png) - -## Trace Request - -Trace request tab lets you look at logs related to a specific failed synthetic monitor run by adding the Synthetics request IDs to your applications and services logs. - -For more information, including how to set that up, refer to [How to troubleshoot with Synthetics Request ID](/docs/synthetics/root-cause-discovery/root-cause-discovery-with-request-id/). - -## Troubleshooting - -Once you have configured metrics and log shipping, whether for a known or unknown service type, all Monitoring and Logs Apps that are connected to your [Synthetics App](/docs/synthetics/) will appear under the Troubleshoot tab. 
- -### Metrics - -Under the Metrics tab, you'll find a list of Monitoring Apps connected with your [Synthetics App](/docs/synthetics/): - -![List Of Monitoring Apps](/docs/images/synthetics/troubleshoot/list-of-monitoring-apps.png) - -To analyze metrics around the time your monitor failed, you can open the relevant Monitoring Apps either in a Split Screen or in a new tab. Look for any sudden spikes or anomalies in metrics that are critical for your endpoint. Use these metrics to identify potential root causes of the failure. - -In the image below, our sematext.com endpoint has failed, which is hosted by an Apache server. We are shipping metrics from that server into an Apache Monitoring App called EU.Frontend, which is connected to our [Synthetics App](/docs/synthetics/). - -We navigate to the Troubleshoot tab and then open the Apache Monitoring App in a new tab by clicking on the App name or within the same screen using the Split Screen button. - -![Monitoring App Actions](/docs/images/synthetics/troubleshoot/monitoring-app-actions.png) - -An automatic filter is applied to show the metrics around the time our monitor failed. Sematext -From there, we would want to check basics first, such as CPU and Memory charts around the time the monitor failed. - -![Apache CPU Memory](/docs/images/synthetics/troubleshoot/apache-cpu-memory.png) - -Then, we can check charts for request rate and traffic rate to observe trends, along with CPU/Memory utilization charts. - -![Apache Request Rate](/docs/images/synthetics/troubleshoot/apache-request-rate.png) - -If you don't find anomalies in these charts, we move on to the logs! - -### Logs - -Any connected Logs Apps will appear under the Logs tab. Each Logs App will display the number of error and warning logs detected around the time your monitor failed. If you have multiple hosts shipping logs to the same App, you will see errors and warnings specific to each host. - -![Logs and Hosts](/docs/images/synthetics/troubleshoot/logs-and-hosts.png) - -You can analyze logs around the time your monitor failed directly within the same page using Split Screen. Alternatively, you can open the logs in a new tab to further drill down into the root cause of your endpoint failures. - -In the image below, our monitor has failed. The monitored endpoint is a service that runs in a Kubernetes cluster. We are shipping logs from that cluster into a logs App called EU.Logs.k8s, which is connected to our Synthetics App. We navigate to the Troubleshoot tab, where we can see the number of error and warning logs from the Kubernetes cluster from 5 minutes before and after the time our monitor failed. - -![Logs Errors and Warnings](/docs/images/synthetics/troubleshoot/logs-errors-and-warnings2.png) - -Then, we open the Logs App either in a new tab by clicking on the App name or within the same screen using the Split Screen button. - -![Logs App Actions](/docs/images/synthetics/troubleshoot/logs-app-actions.png) - -Automatic filters are applied to display only warning and error logs around the time our monitor failed. In this particular example We can add a filter to view logs exclusively from the pods that run our service endpoint. - -![K8s Warnings and Errors](/docs/images/synthetics/troubleshoot/k8s-warnings-and-errors.png) - -From there we can look at log events and potentially find the backend issue that is a likely cause of the monitor failure. - -You can also use the [Context View](/docs/logs/context-view/) when troubleshooting your failed monitor runs' application logs. 
The Context View lets you see logs from before and after an individual log, which helps you understand the sequence of events leading up to and following the failed endpoint request. +- **Metrics** show whether the failure coincided with resource exhaustion, a traffic spike, or a drop in service health +- **Logs** show what your application was reporting at the time, including errors, warnings, and application-level events +- **Traces** show the full request path through your backend services, pinpointing which service call was slow or failed +Used together, these three data types significantly reduce the time it takes to go from a failed monitor run to the underlying cause. diff --git a/docs/synthetics/root-cause-discovery/root-cause-discovery-with-request-id.md b/docs/synthetics/root-cause-discovery/root-cause-discovery-with-request-id.md deleted file mode 100644 index 2fc7ad784..000000000 --- a/docs/synthetics/root-cause-discovery/root-cause-discovery-with-request-id.md +++ /dev/null @@ -1,99 +0,0 @@ -title: Using Synthetics Request ID for Troubleshooting -description: Find the needle in the haystack. Eliminate noise by pinpointing logs specific to a single run using unique request IDs, ensuring focus on crucial information. - -Synthetics monitors are used for monitoring and detecting failures and performance issues of websites and APIs, alerting us when things go wrong. Sometimes failures are caused by frontend issues, sometimes by network glitches or slowness, and sometimes failures are actually caused by something in your backend - applications, services, databases, etc. Sometimes, a failure can be correlated with backend performance or availability issues. When monitor runs fail because of something in your backend a good place to find the root cause is in your logs. - -However, finding only those logs that are relevant to the synthetic monitor failure can be tricky. Typically, there are many logs in the backend, logs from different applications and services are interleaved, etc. To troubleshoot effectively you need to look at logs related to a specific failed synthetic monitor run. But how do you do that if you are presented with hundreds or thousands of log events? Synthetic monitor ID to the rescue. - -## Synthetic Monitor Request ID - -Each Synthetics monitor run has a unique request ID. Use this ID and include it in your backend logs. It's like tagging each log with a specific monitor run. This makes it super easy to find just the right logs when something goes wrong. It allows us to filter out all the noise and focus only on the logs that matter for that specific monitor run. - -Let’s see how to set up and use this integration. - -## The X-Sematext-Synthetics-id HTTP Request Header - -The Synthetic monitoring run ID is included in each request the monitor makes to your API, URL, or whatever you are monitoring. It is included as an HTTP request header called X-Sematext-Synthetics-id. 
The request ID for each monitor run can be found in the run result flyout under the request tab: - -![Synthetics Request ID](/docs/images/synthetics/troubleshoot/synthetics-request-id.png) - -## Including Request ID to Application Logs - -In the application code that’s executed when the endpoint specified in your synthetic monitor is called you need to do two things: - -- Get the Synthetics request ID included in the request header -- Include this ID in your application and service logs - -### Get the Request ID - -The code for getting the HTTP request header will vary from one programming language and framework to another. For example, if your application is written in Node.js that code might look something like this: - -`const headerValue = req.headers['X-Sematext-Synthetics-id'];` - -### Add the Request ID to Application Logs - -Next, you want to add the request ID information to your existing log messages. If your logging framework let’s you add a specific log event field, that’s ideal. Just make sure the field name is XXXX. This is important. If you cannot add a field to your log event, simply append it to your log messages. - -For example, you might be logging something like this: - -`logger.log(error', 'Data could not be stored. Connection to the database failed.');` - -Simply append the request ID to it: - -`logger.log(error', 'Data could not be stored. Connection to the database failed. X-Sematext-Synthetics-Id: ${headerValue}');` - -Include the request ID in all relevant logs. Again, the code for logging will vary from one programming language and framework to another. - -### Adding the Request ID to Service Logs - -In addition to capturing the individual request IDs in your own application code, you can configure the service hosting your endpoint by accessing its logging configuration to include HTTP request IDs. For example, endpoints that Synthetic monitors hit are likely hosted by Nginx, Apache, or a similar sort of service. Their documentation will show you how to log HTTP request header values and include them in their log files. - -## Shipping Logs - -To be able to dig into your logs from a failed Synthetic monitor run your logs should be shipped to one or more Logs Apps in Sematext. If your logs are not already in Sematext, the easiest way to ship logs is via [Logs Discovery](/docs/logs/discovery/intro/) or one of the [alternative log shipping methods](/docs/logs/sending-log-events/). - -## Connecting Synthetics with Logs App - -After forwarding your logs to your Logs App, [connect](/docs/guide/connected-apps/) your Synthetics App with your Logs App. This will let you dig into your logs from the failed runs screen . To connect Apps, open the connect Apps modal from one of your Apps and then choose the App to connect. - -![Connect Apps](/docs/images/synthetics/troubleshoot/connect-apps.gif) - -## Setting the Correct Field - -Keep in mind that correlating individual monitor runs with logs requires a specific field name called request.id. So when you are shipping logs to your logs App, make sure you send the Synthetics request ID that you’ve captured under the field called request.id, as shown earlier. - -If you are not able to do this in your application code, you can use [Logs Pipelines](/docs/logs/pipelines/) to extract this information into the request.id field. For instance, perhaps you added the request ID to your log messages as in our earlier example: - -`logger.log(error', 'Data could not be stored. Connection to the database failed. 
X-Sematext-Synthetics-Id: ${headerValue}');` - -In this case, this whole log message will likely end up in a field called `message` in your Logs App. You’ll want to extract this request ID value and store it in a new field called request.id. Why? Because this is the log event field that is used for filtering your logs for a given monitor run. - -Below is an example of shipped logs that have request ID within the message field. By using the [Field Extractor processor](/docs/logs/field-extractor-processor/), you can extract the request ID from the text message and assign it to a new field called request.id. - -![Request ID GROK](/docs/images/synthetics/troubleshoot/request-id-grok.png) - -## Dig Into Logs to Troubleshoot - -Once both Apps are connected and the request.id is captured and shipped to the Logs App, whenever your monitor run fails you will now be able to dig into the relevant logs to look for the root cause: - -- Navigate to the troubleshoot tab from the failed run flyout. -- Click on the "Trace Request" option. -- You can see how many logs are associated with that specific monitor run within this tab. - -![Request Tab](/docs/images/synthetics/troubleshoot/trace-request-tab.png) - -To see the logs, you can open the Logs App in a new tab by clicking on the new tab icon next to the App name. Or you can click to see them in [Split Screen](/docs/guide/split-screen/) to correlate logs within a single page while viewing the failed run details. - -![Request ID Logs](/docs/images/synthetics/troubleshoot/request-id-logs.png) - -The request ID is filtered automatically, and the time range for logs is set to the time your monitor has failed. You will only see the logs that are directly associated with that monitor run. This gives you the ability to explore what is going on in your application that might have caused the failed run and get to the bottom of the root cause of the problem. - -## Context View for Logs - -The [Context View](/docs/logs/context-view/) might come in handy when troubleshooting your failed monitor runs' application logs. When you are navigated to the Logs App associate with your Synthetics App, the request ID for that individual monitor run is already set as a filter automatically in your logs. - -![Request ID Logs 2](/docs/images/synthetics/troubleshoot/request-id-logs2.png) - -To dig deeper into the root cause, you might want to see what happened in your application before and after this request was made and captured. The Context View lets you see the logs coming before and after an individual log, which helps you analyze logs to understand the sequence of events leading up to and following the failed endpoint request. - -![Context View](/docs/images/synthetics/troubleshoot/context-view.gif) diff --git a/docs/synthetics/root-cause-discovery/traces-correlation.md b/docs/synthetics/root-cause-discovery/traces-correlation.md new file mode 100644 index 000000000..e17a5c9ef --- /dev/null +++ b/docs/synthetics/root-cause-discovery/traces-correlation.md @@ -0,0 +1,49 @@ +title: Traces Correlation +description: Correlate failed Synthetics monitor runs with distributed traces to see the full backend request journey. + +The Traces tab in the failed run flyout finds distributed traces from [connected](/docs/guide/connected-apps/) Tracing Apps that match the failed request. Depending on your setup, traces are matched by exact trace ID or by URL and time window. 
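+As a quick local check of what Sematext will be able to read, you can request the monitored URL yourself and look for the trace ID in the response headers. The sketch below uses Python's `requests`; the URL and the `X-Trace-Id` header name are placeholders that depend on your own endpoint and on how you exposed the trace ID (see the setup page linked below).
+
+```python
+import requests
+
+# Placeholder URL and header name; use your monitored endpoint and the header
+# you chose when exposing the trace ID.
+response = requests.get("https://example.com/api/orders")
+print("status:", response.status_code)
+print("trace id:", response.headers.get("X-Trace-Id"))
+```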
+ +## Connecting a Tracing App + +If you don't have a Tracing App yet, the Traces section in the Troubleshoot tab will prompt you to create one and connect it to your Synthetics App automatically. If you already have Tracing Apps in your account, you can select and connect the relevant one directly from the same tab. + +To learn how to create a Tracing App and instrument your services to start shipping traces, see [Getting Started with Tracing](/docs/tracing/getting-started/) and the [OpenTelemetry SDKs](/docs/tracing/sdks/) documentation. The [Sematext OpenTelemetry Examples](https://github.com/sematext/sematext-opentelemetry-examples) repo shows end-to-end how to instrument and ship traces, which is exactly what someone setting up a Tracing App needs. + +## How Traces Are Matched + +### With a Trace ID + +This is the recommended way to correlate traces. If your backend includes the active trace ID in its HTTP response headers, Sematext reads it from the monitor run result and uses it to find the exact trace for that request. See [Expose Trace ID in Response Headers](/docs/synthetics/root-cause-discovery/adding-trace-id-to-response-headers/) to set this up. + +![Traces Tab - Matched by Trace ID](/docs/images/synthetics/root-cause-discovery/traces-trace-match.png) + +This takes you directly to the trace for the specific request that failed, showing the full journey through your backend services. From the trace view you can identify: + +- Slow database queries +- Service timeouts +- Error propagation across microservices +- Third-party API failures + +### Without a Trace ID + +If your backend does not include a trace ID in its response headers, traces are matched by the monitored URL and the time window around the failure. There may be multiple matching traces — you can open them in a new tab by clicking the Tracing App link. + +![Traces Tab - Matched by URL and Time Window](/docs/images/synthetics/root-cause-discovery/traces-url-match.png) + +### No Matching Traces Found + +If no traces are found for the URL and time window, the likely causes are: + +- The monitored endpoint is not instrumented with OpenTelemetry +- Traces are being sent to a different Tracing App than the one connected +- The service name in the agent configuration does not match what's expected + +Check your [OpenTelemetry SDK setup](/docs/tracing/sdks/) and verify that traces from the monitored endpoint appear in the connected Tracing App. + +## Further Reading + +- [Expose Trace ID in Response Headers](/docs/synthetics/root-cause-discovery/adding-trace-id-to-response-headers/) +- [Getting Started with Tracing](/docs/tracing/getting-started/) +- [Creating a Tracing App](/docs/tracing/create-tracing-app/) +- [OpenTelemetry SDKs](/docs/tracing/sdks/) +- [Traces Explorer](/docs/tracing/reports/explorer/) diff --git a/docs/synthetics/ssl-certificate-monitoring.md b/docs/synthetics/ssl-certificate-monitoring.md index bccd2ce0d..d236f9815 100644 --- a/docs/synthetics/ssl-certificate-monitoring.md +++ b/docs/synthetics/ssl-certificate-monitoring.md @@ -14,7 +14,7 @@ Both the HTTP and Browser monitors perform these checks, though certificate chan ## Certificate Validation -Synthetics performs a set of validity checks on SSL certificates sent by the server for every monitor run. Synthetics will mark the monitor status as failing if any of these checks fail. Synthetics will raise a run failure alert with the details of the failure. 
Self-signed certificates are not supported and any websites or APIs using self-signed certificates will fail unless you select the **Relaxed** type of SSL certificate check under **Configure Alerts** -> **SSL Monitoring**. +Synthetics performs a set of validity checks on SSL certificates sent by the server for every monitor run. Synthetics will mark the monitor status as failing if any of these checks fail. Synthetics will raise a run failure alert with the details of the failure. Self-signed certificates are not supported and any websites or APIs using self-signed certificates will fail unless you select the **Relaxed** type of SSL certificate check under **Advanced Settings** -> **Show Alerts** -> **SSL Monitoring**. ![Relaxed SSL certificate check](/docs/images/synthetics/ssl-check-type.png) @@ -40,11 +40,11 @@ The Browser monitor loads the website in a real Google Chrome browser and perfor Sematext Synthetics checks the certificate expiry every day and alerts you via the monitor's configured [alert notification hooks](/docs/alerts/alert-notifications) multiple times before it expires. We make sure you're reminded about the expiry multiple times. -You can set the certificate expiry alert condition within the Conditions tab for both HTTP and Browser monitors +You can set the certificate expiry alert condition within the **Advanced Settings** -> **Show Alerts** -> **Conditions** tab for both HTTP and Browser monitors ![Certificate expiry BM](/docs/images/synthetics/cert-expiry2.png) -Or from SSL Monitoring tab for HTTP Monitors +Or from **Advanced Settings** -> **Show Alerts** -> **SSL Monitoring** tab for HTTP Monitors ![Relaxed SSL certificate check](/docs/images/synthetics/ssl-check-type.png) @@ -56,7 +56,7 @@ The monitor performs the expiry check for all the certificates in the chains - l Sematext Synthetics checks the certificate authority for both Root or Intermediate level and alerts you via the monitor's configured [alert notification hooks](/docs/alerts/alert-notifications) if the conditions are not met. -The alert condition is supported for both HTTP and Browser monitors and can be set from the Conditions tab +The alert condition is supported for both HTTP and Browser monitors and can be set from the **Advanced Settings** -> **Configure Alerts** -> **Conditions** tab ![CA check](/docs/images/synthetics/cert-authority.png) diff --git a/mkdocs.yml b/mkdocs.yml index 92111f639..5f070657f 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -199,7 +199,10 @@ nav: - Correlating: synthetics/correlation.md - Root Cause Discovery: - Overview: synthetics/root-cause-discovery/overview.md - - Request ID for Troubleshooting: synthetics/root-cause-discovery/root-cause-discovery-with-request-id.md + - Traces Correlation: synthetics/root-cause-discovery/traces-correlation.md + - Logs Correlation: synthetics/root-cause-discovery/logs-correlation.md + - Metrics Correlation: synthetics/root-cause-discovery/metrics-correlation.md + - Expose Trace ID in Response Headers: synthetics/root-cause-discovery/adding-trace-id-to-response-headers.md - API: - Overview: synthetics/using-the-api.md - Monitor Overview API: synthetics/monitor-overview-api.md