diff --git a/.gitignore b/.gitignore
index 6426496..980084e 100644
--- a/.gitignore
+++ b/.gitignore
@@ -44,3 +44,4 @@
 ci/
 .idea
 .eslintcache
+.DS_Store
\ No newline at end of file
diff --git a/product-knowledge/app-states.mdc b/product-knowledge/app-states.mdc
new file mode 100644
index 0000000..5d864ca
--- /dev/null
+++ b/product-knowledge/app-states.mdc
@@ -0,0 +1,669 @@
+---
+description: Grafana & Grafana Cloud Application States to URL Mappings
+alwaysApply: false
+---
+# Grafana Cloud & Application States to URL Mappings
+
+This document gives a semi-comprehensive mapping between reachable URLs in Grafana Cloud and application areas, along
+with notes on what those application areas do. Finally, it gives the best documentation page
+for learning how to get started with that application state, focused on practical how-to steps
+wherever possible.
+
+## Warning
+
+There are two _different kinds of URLs_ referred to in this document. URLs of the form
+
+`URL: /path/to/resource` refer to Grafana OSS / Grafana Cloud application states.
+
+`Best doc: ` URLs refer to documentation and learning materials on the Grafana website.
+
+## General Notes on URL Schemes
+
+Every organization in Grafana gets its own base URL, which is generally `orgName.grafana.net`. All URLs in this guide are
+partial fragment matches relative to that base URL. There are infinitely many possible URLs, so subsequent fragments and
+URL parameters are omitted in this guide. As a concrete example, when a URL in this document is listed as `/path/to/resource`,
+this URL is intended to match all of the following examples:
+
+* `https://john.grafana.net/path/to/resource`
+* `https://john.grafana.net/path/to/resource/subitem`
+* `https://john.grafana.net/path/to/resource/?x=1&y=2#anchor`
+
+And so on.
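
The matching rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `matches_app_state` is a hypothetical helper (not part of Grafana), it assumes a match happens on whole path segments, and it ignores the query string and anchor of the visited URL.

```python
from urllib.parse import urlsplit


def matches_app_state(url: str, state_path: str) -> bool:
    """Return True if the path of `url` equals `state_path` or is nested under it.

    Hypothetical helper: assumes segment-boundary prefix matching and
    ignores query parameters and anchors in `url`.
    """
    # urlsplit separates the path from the query string and the anchor.
    path = urlsplit(url).path.rstrip("/")
    prefix = state_path.rstrip("/")
    # Match the exact path, or anything below it on a "/" boundary.
    return path == prefix or path.startswith(prefix + "/")


examples = [
    "https://john.grafana.net/path/to/resource",
    "https://john.grafana.net/path/to/resource/subitem",
    "https://john.grafana.net/path/to/resource/?x=1&y=2#anchor",
]
for example in examples:
    print(example, matches_app_state(example, "/path/to/resource"))
```

Matching on a `/` boundary keeps a listed path such as `/path/to/resource` from also matching an unrelated sibling like `/path/to/resource-other`. Entries in this guide that include a query string (for example `/dashboards?starred`) would need extra handling that this sketch does not attempt.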
+
+# Navigation Hierarchy
+
+## Home
+
+URL: /a/cloud-home-app
+
+The Cloud Home app is the landing screen for every stack: it surfaces high-level usage stats, quick links to recent dashboards, and “next-step” cards that steer new users toward key observability features. A team might open Home each morning to verify that data sources are still healthy and see at-a-glance whether metrics or log volumes have spiked overnight.
+
+Best doc: https://grafana.com/docs/grafana-cloud/home/
+
+### Getting Started Guide
+
+URL: /a/grafana-setupguide-app/getting-started
+
+This guided workflow installs demo data sources and sample dashboards so newcomers can poke around Grafana Cloud without first instrumenting their own systems. It walks you through signing up, importing demo metrics, and exploring pre-built SRE and weather dashboards—perfect for onboarding a new hire who needs a safe sandbox.
+
+Best doc: https://grafana.com/docs/grafana-cloud/getting-started/
+
+## Bookmarks
+
+URL: /bookmarks
+
+Bookmarks let you pin any page—dashboards, Explore, admin views—to a “Bookmarks” section that sits at the top of the left-hand nav, solving the “where did that page go?” problem as Grafana grows. A site-reliability engineer could bookmark their favorite latency dashboard and the OnCall schedule, so both are one click away during an incident.
+
+Best doc: https://grafana.com/docs/grafana/latest/dashboards/build-dashboards/manage-dashboard-links/
+
+## Starred
+
+URL: /dashboards?starred
+
+Starring puts a ⭐ on dashboards (or individual visualizations in Explore) and makes them show up under the “Starred” filter, plus in widgets like the Dashboard List panel. It’s a lightweight way to curate a personal “mission-control” set—say, CPU, memory, and error-rate boards your team checks in every post-mortem.
+
+Best doc: https://grafana.com/docs/grafana/latest/dashboards/search-dashboards/
+
+## Dashboards
+
+URL: /dashboards
+
+The Dashboards area is where you view, create, and organize Grafana’s famous boards—mixing graphs, logs, traces, and text in one canvas. Teams typically build service-level dashboards (e.g., “payments-api overview”) that link to deeper drill-downs or alerts, giving everyone a shared source of truth.
+
+Best doc: https://grafana.com/docs/grafana/latest/dashboards/
+
+### Playlists
+
+URL: /playlists
+
+Playlists run a set of dashboards in a timed loop—perfect for “NOC wall” monitors or demo kiosks where screens must cycle automatically. You control order, delay, and whether to loop forever or stop after one pass; Grafana scales each board to any resolution, so a single playlist can serve laptops and 4K TVs alike.
+
+Best doc: https://grafana.com/docs/grafana/latest/dashboards/playlist/
+
+### Snapshots
+
+URL: /dashboard/snapshots
+
+A dashboard snapshot freezes the current visual state, strips sensitive queries, and stores the metric data inside the snapshot payload so anyone with the link can explore it—no data-source access required. Teams use snapshots to share incident evidence with partners or embed a “moment-in-time” view in tickets without granting Grafana logins.
+
+Best doc: https://grafana.com/docs/grafana/latest/dashboards/share-dashboard-snapshot/
+
+### Library Panels
+
+URL: /library-panels
+
+Library panels are single-source-of-truth visualizations that you save once and reuse across many dashboards; edit the library panel and every instance updates instantly. They cut copy-paste drift—think of a standard latency heat-map that must look identical on all 20 service dashboards.
+
+Best doc: https://grafana.com/docs/grafana/latest/panels/library-panels/
+
+### Shared Dashboards
+
+URL: /dashboard/public
+
+Externally shared (formerly “public”) dashboards publish a read-only version to a hard-to-guess URL so stakeholders without Grafana accounts can view live data. You can pause or revoke access anytime, and an admin page lists every active share to keep sprawl in check.
+
+Best doc: https://grafana.com/docs/grafana/latest/dashboards/public-dashboards/
+
+### Reporting
+
+URL: /reports
+
+Reporting schedules automated emails that turn any dashboard into a PDF, CSV, or embedded image, complete with variable values and the chosen time range. Typical flows include sending a Monday-morning uptime report to execs or a nightly capacity PDF to the infra mailing list—no manual exports needed.
+
+Best doc: https://grafana.com/docs/grafana/latest/reporting/
+
+## Explore
+
+URL: /explore
+
+Explore is a query workbench for interactive, ad-hoc digging into metrics, logs, or traces without building a full dashboard first. An engineer might paste a PromQL query to graph a single pod’s CPU, then pivot instantly to related logs or traces to chase down a spike.
+
+Best doc: https://grafana.com/docs/grafana/latest/explore/
+
+## Drilldown
+
+URL: /drilldown
+
+Metrics and Logs Drilldown pages take a selected series and open a context-rich view—showing labels, histograms, or correlated traces—so you can peel back layers without leaving Grafana. For instance, clicking a high-latency series in a dashboard can jump straight into Logs Drilldown filtered to that pod.
+
+Best doc: https://grafana.com/docs/grafana/latest/explore/simplified-exploration/
+
+### Metrics
+
+URL: /a/grafana-metricsdrilldown-app
+
+This app gives you a fully query-less Prometheus explorer: pick a data source, click through metric names and label values, and Grafana auto-draws graphs, histograms, or heat maps as you drill deeper. It’s ideal when you spot a CPU spike on a dashboard—open Metrics Drilldown, filter to job=node and instance=web-03, and instantly compare related series without writing a single PromQL line.
+
+Best doc: https://grafana.com/docs/grafana/latest/explore/simplified-exploration/metrics/
+
+### Logs
+
+URL: /a/grafana-lokiexplore-app
+
+Logs Drilldown opens a Loki (or other log source) workspace pre-grouped by labels and automatic “log patterns,” so noisy repetitions collapse and the anomalies pop out. From the same spike above you can jump here, keep the time range, and inspect only the trace_id=abc123 lines—no regex gymnastics required.
+
+Best doc: https://grafana.com/docs/grafana/latest/explore/simplified-exploration/logs/
+
+### Traces
+
+URL: /a/grafana-exploretraces-app
+
+This view centres on distributed-trace analysis: a RED-metrics sidebar shows error and latency outliers, while the main panel lists sample traces you can expand into waterfall or flame-chart mode. It’s a two-click path from “API latency high” to the exact downstream call that added 400 ms.
+
+Best doc: https://grafana.com/docs/grafana/latest/explore/simplified-exploration/traces/
+
+### Profiles
+
+URL: /a/grafana-pyroscope-app/explore
+
+Powered by Pyroscope, Profiles Drilldown visualizes CPU, memory, or goroutine profiles as flamegraphs and diff views, letting you spot hot paths or regressions over time. After fixing that 400 ms trace, open Profiles Drilldown to confirm the new build cut heap allocations in half.
+
+Best doc: https://grafana.com/docs/grafana/latest/explore/simplified-exploration/profiles/
+
+## Alerts & IRM
+
+URL: /alerts-and-incidents
+
+The Alerts & Incident Response Management (IRM) hub unifies alert rules, OnCall schedules, and incident timelines in a single app. It streamlines the firefighting loop: alerts page the right rotation via OnCall, responders declare an incident, and post-incident reports are all tracked in one view.
+
+Best doc: https://grafana.com/docs/grafana/latest/alerting/
+
+### Service Center
+
+URL: /a/grafana-slo-app/services
+
+A service-oriented landing page that aggregates everything about each microservice—active alerts, open incidents, recent SLO burn, ownership metadata—so responders have one “command deck” per service. An SRE can pull up Payments-API during an outage and instantly see its 500-error alert firing, last week’s incident timeline, and the SLO budget left for the day.
+
+Best doc: https://grafana.com/docs/grafana-cloud/alerting-and-irm/slo/service-center/
+
+### Alerting
+
+URL: /alerting
+
+Grafana-managed Alerting lets you build cross-datasource rules, route notifications, and track alert life-cycles—all in one pane instead of juggling Prometheus + Alertmanager silos. Teams wire Slack or PagerDuty once, then let policies decide who gets pinged and when.
+
+Best doc: https://grafana.com/docs/grafana/latest/alerting/
+
+#### Alert Rules
+
+URL: /alerting/list
+
+Define queries + thresholds that evaluate on a schedule and change state (Normal → Alerting).
+
+Best doc: https://grafana.com/docs/grafana/latest/alerting/manage-your-alert-rules/
+
+#### Contact Points
+
+URL: /alerting/notifications?search=
+
+Configure where notifications go—Slack, email, Opsgenie, webhooks.
+
+Best doc: https://grafana.com/docs/grafana/latest/alerting/configure-notifications/manage-contact-points/
+
+#### Notification Policies
+
+URL: /alerting/routes
+
+Label-based routing tree that decides which contact point handles each alert.
+
+Best doc: https://grafana.com/docs/grafana/latest/alerting/configure-notifications/create-notification-policy/
+
+#### Silences
+
+URL: /alerting/silences
+
+Temporary label-matched mutes so maintenance windows don’t page you.
+
+Best doc: https://grafana.com/docs/grafana/latest/alerting/configure-notifications/create-silence/
+
+#### Active Notifications
+
+URL: /alerting/groups
+
+Real-time list of alert groups currently firing or pending.
+
+Best doc: https://grafana.com/docs/grafana/latest/alerting/monitor-status/view-active-notifications/
+
+#### History
+
+URL: /alerting/history
+
+Audit log of every state change; great for spotting flappy rules.
+
+Best doc: https://grafana.com/docs/grafana/latest/alerting/monitor-status/view-alert-state-history/
+
+#### Recently Deleted
+
+URL: /alerting/recently-deleted
+
+30-day recycle bin where admins can restore or purge alert rules.
+
+Best doc: https://grafana.com/docs/grafana/latest/alerting/monitor-status/view-alert-rules/#permanently-delete-or-restore-deleted-alert-rules
+
+#### Settings
+
+URL: /alerting/admin
+
+Cluster-wide knobs: RBAC, state-history retention, default contact point, etc.
+
+Best doc: https://grafana.com/docs/grafana/latest/alerting/set-up/configure-alertmanager/
+
+### IRM
+
+URL: /a/grafana-irm-app
+
+Grafana IRM (Incident Response Management) merges on-call scheduling, alert routing, and incident coordination. An alert flows into an Alert Group, pages the schedule defined in Escalation Chains, and responders spin up an Incident room with tasks and post-mortem templates—all without leaving Grafana.
+
+Best doc: https://grafana.com/docs/grafana-cloud/alerting-and-irm/irm/get-started/
+
+#### Alert Groups
+
+URL: /a/grafana-irm-app/alert-groups
+
+Buckets multiple alert instances into one actionable thread.
+
+Best doc: https://grafana.com/docs/grafana-cloud/alerting-and-irm/irm/use/respond-to-alerts/
+
+#### Incidents
+
+URL: /a/grafana-irm-app/incidents
+
+Lifecycle timeline, chat links, severity, custom fields.
+
+Best doc: https://grafana.com/docs/grafana-cloud/alerting-and-irm/irm/use/incident-management/
+
+#### Tasks
+
+URL: /a/grafana-irm-app/tasks
+
+Checklist of action items inside an incident.
+
+Best doc: https://grafana.com/docs/grafana-cloud/alerting-and-irm/irm/use/incident-management/manage-tasks/
+
+#### Schedules
+
+URL: /a/grafana-irm-app/schedules
+
+Drag-and-drop on-call rotations, overrides, follow-the-sun support.
+
+Best doc: https://grafana.com/docs/grafana-cloud/alerting-and-irm/irm/manage/on-call-schedules/
+
+#### Escalation Chains
+
+URL: /a/grafana-irm-app/escalations/
+
+Ordered steps (wait x min → notify team → page manager).
+
+Best doc: https://grafana.com/docs/grafana-cloud/alerting-and-irm/irm/configure/escalation-routing/escalation-chains/
+
+#### Integrations
+
+URL: /a/grafana-irm-app/integrations/monitoring-systems
+
+Webhook & native hooks (Prometheus, Loki, Datadog, etc.).
+
+Best doc: https://grafana.com/docs/grafana-cloud/alerting-and-irm/irm/configure/integrations/
+
+#### Users
+
+URL: /a/grafana-irm-app/users
+
+Map Grafana teams to IRM on-call roles & notification rules.
+
+Best doc: https://grafana.com/docs/grafana-cloud/alerting-and-irm/irm/manage/users-and-teams/
+
+#### Insights
+
+URL: /a/grafana-irm-app/insights/alerts
+
+Built-in dashboards for MTTA/MTTR, paging volume, flakiness.
+
+Best doc: https://grafana.com/docs/grafana-cloud/alerting-and-irm/irm/manage/insights-and-reporting/alert-insights/
+
+#### Settings
+
+URL: /a/grafana-irm-app/settings
+
+Org-wide IRM defaults: severities, custom fields, mobile tokens.
+
+Best doc: https://grafana.com/docs/grafana-cloud/alerting-and-irm/irm/configure/
+
+### SLO
+
+URL: /a/grafana-slo-app/home
+
+Grafana SLO lets you define SLIs, error-budget-based objectives, and burn-rate alerts while giving execs high-level performance reports. Think of it as reliability KPIs baked into Grafana, complete with Terraform/API support for GitOps.
+
+Best doc: https://grafana.com/docs/grafana-cloud/alerting-and-irm/slo/introduction/
+
+#### Manage SLOs
+
+URL: /a/grafana-slo-app/manage-slos
+
+CRUD list of all SLOs with filters, tagging, enable/disable.
+
+Best doc: https://grafana.com/docs/grafana-cloud/alerting-and-irm/slo/manage/
+
+#### SLO Performance
+
+URL: /a/grafana-slo-app/slo-performance
+
+Tag-driven dashboard showing error-budget burn by team/service.
+
+Best doc: https://grafana.com/docs/grafana-cloud/alerting-and-irm/slo/overviewdashboards/
+
+#### Reports
+
+URL: /a/grafana-slo-app/reports
+
+Auto-generated weekly/monthly PDFs summarizing multiple SLOs.
+
+Best doc: https://grafana.com/docs/grafana-cloud/alerting-and-irm/slo/reports/
+
+## AI & Machine Learning
+
+URL: /a/grafana-ml-app/home
+
+Best doc: https://grafana.com/docs/grafana-cloud/alerting-and-irm/machine-learning/
+
+### Metrics Forecast
+
+URL: /a/grafana-ml-app/metric-forecast
+
+Forecast learns seasonality in a time-series (like QPS) and projects it forward, generating dynamic alert thresholds that adjust to traffic patterns—handy for predicting when you’ll breach 75% CPU next week.
+
+Best doc: https://grafana.com/docs/grafana-cloud/alerting-and-irm/machine-learning/dynamic-alerting/forecasting/
+
+### Outlier Detection
+
+URL: /a/grafana-ml-app/outlier-detector
+
+Outlier Detection watches a group of series (Kubernetes pods, EC2 instances) and flags any member that deviates from its peers, so you can catch a noisy-neighbor pod burning twice the CPU before customers notice.
+
+Best doc: https://grafana.com/docs/grafana-cloud/alerting-and-irm/machine-learning/dynamic-alerting/outlier-detection/
+
+### Sift Investigations
+
+URL: /a/grafana-ml-app/investigations
+
+Sift is an ML-powered diagnostic assistant: click “Run investigation” during an incident and it auto-executes checks across metrics, logs, and deployments, surfacing suspicious spikes or recent rollouts in a tidy report—cutting minutes off mean-time-to-root-cause.
+
+Best doc: https://grafana.com/docs/grafana-cloud/alerting-and-irm/machine-learning/sift/
+
+## Testing & Synthetics
+
+URL: /testing-and-synthetics
+
+This top-level menu groups active checks that simulate user behavior. You can script browser journeys or simple HTTP pings to verify uptime from multiple regions, ensuring “login flow is < 2 s” long before real users complain.
+
+Best doc: https://grafana.com/docs/grafana-cloud/testing/
+
+## Performance Testing
+
+URL: /a/k6-app
+
+The k6 app runs load tests straight from the cloud, scaling to a million virtual users and correlating the results with Grafana dashboards—ideal for hammering a new API release and watching latency vs. VU in real time.
+
+Best doc: https://grafana.com/docs/grafana-cloud/testing/k6/
+
+### Projects
+
+URL: /a/k6-app/projects
+
+Projects are folders that collect related tests and enforce RBAC—ideal for separating “Checkout” vs. “Search” suites or giving a contractor access to only one project. Usage reports and quota limits roll up at the project level, which keeps large orgs from blowing past VU budgets.
+
+Best doc: https://grafana.com/docs/grafana-cloud/testing/k6/projects-and-users/projects/
+
+### Settings
+
+URL: /a/k6-app/settings/api-token
+
+The Settings screen is where you mint and revoke API tokens, configure static IPs, and toggle private-network test runners. Tokens are required for CLI-driven k6 cloud jobs and for CI pipelines that push results back to the app.
+
+Best doc: https://grafana.com/docs/grafana-cloud/testing/k6/author-run/tokens-and-cli-authentication/
+
+### Learn
+
+URL: /a/k6-app/learn
+
+Learn is a curated library of k6 tutorials, sample scripts, and “test-design” playbooks that open in an in-app viewer—great for onboarding teammates who’ve never written a load test before. Topics range from basic HTTP tests to TypeScript scripting and GraphQL load patterns.
+
+Best doc: https://grafana.com/docs/k6/latest/examples/tutorials/
+
+### Synthetic Monitoring
+
+URL: /a/grafana-synthetic-monitoring-app/
+
+Synthetic Monitoring is Grafana Cloud’s black-box uptime suite: you define browser journeys, HTTP pings, or gRPC calls, run them from public or private probes, and chart latency and availability like any other metrics in Grafana. Pre-built alert rules fire when SLAs slip, and every check stores logs and metrics for root-cause digs.
+
+Best doc: https://grafana.com/docs/grafana-cloud/testing/synthetic-monitoring/
+
+### Checks
+
+URL: /a/grafana-synthetic-monitoring-app/checks
+
+A “check” is the executable test object—HTTP, DNS, TCP, browser, or scripted k6—that runs on a schedule and exports Prometheus metrics plus Loki logs. You pick locations, thresholds, and probe types, then watch pass/fail counts accumulate in real time.
+
+Best doc: https://grafana.com/docs/grafana-cloud/testing/synthetic-monitoring/create-checks/checks/
+
+### Probes
+
+URL: /a/grafana-synthetic-monitoring-app/probes
+
+Probes are the agents that execute checks. Grafana supplies >20 managed public probes worldwide, or you can deploy private probes inside a VPC to monitor internal services; both stream results back over encrypted gRPC.
+
+Best doc: https://grafana.com/docs/grafana-cloud/testing/synthetic-monitoring/create-checks/public-probes/
+
+### Alerts
+
+URL: /a/grafana-synthetic-monitoring-app/alerts
+
+This tab maps check results to Grafana Alerting: toggle built-in latency/availability rules or craft custom policies that route failures to Slack, PagerDuty, or OnCall. Alert states sync back to each check card so you can see which monitors are paging right now.
+
+Best doc: https://grafana.com/docs/grafana-cloud/testing/synthetic-monitoring/configure-alerts/
+
+### Config
+
+URL: /a/grafana-synthetic-monitoring-app/config
+
+The Config page houses global settings such as private-probe tokens, default alert labels, and RBAC for who can create or edit checks. It’s also where you register new private probes—copy the token, deploy the Docker container, and the probe shows up ready for scheduling.
+
+Best doc: https://grafana.com/docs/grafana-cloud/testing/synthetic-monitoring/set-up/
+
+## Observability
+
+URL: /observability
+
+The Observability menu gathers turnkey solutions—Application, Frontend, Kubernetes, Infrastructure monitoring—all powered by the LGTM+ stack (Mimir, Loki, Tempo, Pyroscope). It gives SREs a one-stop shop for correlated metrics, logs, traces, and profiles.
+
+Best doc: https://grafana.com/docs/grafana-cloud/monitor-applications/
+
+### Asserts
+
+URL: /a/grafana-asserts-app/get-started
+
+Grafana Asserts auto-analyzes Prometheus metrics to surface anomalies and dependency issues, then maps them onto a live topology so engineers can zero in on failing components without combing through dashboards. Typical flow: after a new release, open Asserts to see red “SAAFE” signals (Spike, Abnormal rate, etc.) on the checkout service and drill straight into the suspect pod’s logs.
+
+Best doc: https://grafana.com/docs/grafana-cloud/monitor-applications/asserts/get-started/
+
+### Application
+
+URL: /a/grafana-app-observability-app
+
+An OpenTelemetry-based APM suite that ingests traces, metrics, and logs, then renders RED-metric dashboards, service maps, and trace waterfalls for every microservice. With a few lines of OTEL SDK plus Grafana Alloy, teams can watch p95 latency for “orders-api”, follow a slow trace into downstream DB calls, and raise error-budget alerts—all inside Grafana Cloud.
+
+Best doc: https://grafana.com/docs/grafana-cloud/monitor-applications/application-observability/
+
+### Cloud Provider
+
+URL: /a/grafana-csp-app
+
+This app connects to AWS, Azure, and GCP with minimal credentials, auto-discovers resources, and ships cloud-native metrics to Grafana Cloud; it then overlays cost, health, and usage views across multiple accounts. An SRE might filter to “us-east-1 RDS” to spot a CPU surge and jump directly into the instance’s logs or traces.
+
+Best doc: https://grafana.com/docs/grafana-cloud/monitor-infrastructure/monitor-cloud-provider/
+
+### Kubernetes
+
+URL: /a/grafana-k8s-app/
+
+A turnkey Helm or Alloy-based installer that collects cluster metrics, events, and traces, bundling rich dashboards for nodes, workloads, and golden signals. Clicking a high-restart Deployment opens pod-level logs and events, while built-in advisors flag cost-saving opportunities like under-utilized nodes.
+
+Best doc: https://grafana.com/docs/grafana-cloud/monitor-infrastructure/kubernetes-monitoring/
+
+### Frontend
+
+URL: /a/grafana-kowalski-app
+
+Powered by the Faro SDK, Frontend Observability captures real-user monitoring (RUM) data—page loads, Core Web Vitals, JS errors—and correlates it with backend traces for full-stack visibility. Product teams instrument a React or Next.js app, then watch a UX scorecard and session replays to chase down a slow checkout flow or a spike in 404s.
+
+Best doc: https://grafana.com/docs/grafana-cloud/monitor-applications/frontend-observability/
+
+## Connections
+
+URL: /connections
+
+Connections is the catalog of one-click integrations and Private Data-Source Connect (PDC). Pick “Postgres” (or one of many other types), follow the wizard, and Grafana Cloud ships an agent, scrapes metrics, and installs dashboards.
+
+Best doc: https://grafana.com/docs/grafana-cloud/connect-externally-hosted/data-sources/
+
+### Add a New Connection
+
+URL: /connections/add-new-connection
+
+The “Add new connection” wizard is the front door to everything Grafana Cloud can ingest: type “Postgres,” “Loki,” or any of 40-plus tiles, and the flow spins up the right data-source object, ships (or re-uses) a Grafana Alloy collector, and auto-installs starter dashboards and alerts. It means a junior SRE can wire a new database or log source in under two minutes, no YAML or CLI required.
+
+Best doc: https://grafana.com/docs/grafana/latest/datasources/#add-a-data-source
+
+### Collector
+
+URL: /a/grafana-collector-app
+
+The Collector app (powered by Grafana Alloy/OpenTelemetry Collector) is fleet management for your ingest agents: dashboards show health, version, and throughput, while one-click actions restart or upgrade out-of-date nodes. Ops teams use it to spot an unhealthy edge collector before it drops metrics or to roll out a new scrape job to every host from one screen.
+
+Best doc: https://grafana.com/docs/grafana-cloud/send-data/fleet-management/manage-fleet/collectors/
+
+### Data Sources
+
+URL: /connections/datasources
+
+“Data sources” lists every configured backend—Prometheus, CloudWatch, BigQuery, and more—plus connection status, quotas, and edit links. From here you can tweak auth keys, set custom headers, or turn on traces for a single source without touching other connections.
+
+Best doc: https://grafana.com/docs/grafana-cloud/connect-externally-hosted/data-sources/
+
+### Integrations
+
+URL: /connections/infrastructure
+
+The Integrations catalog bundles exporters, dashboards, and alerts for popular services (Linux node, NGINX, Redis, etc.) behind point-and-click cards. Choose an integration and Grafana Cloud renders install commands, ships dashboards, and wires default alert rules—giving you observability best practices in about five minutes.
+
+Best doc: https://grafana.com/docs/grafana-cloud/monitor-infrastructure/integrations/
+
+### Private Datasource Connect
+
+URL: /connections/private-data-source-connections
+
+PDC creates an outbound-only, encrypted tunnel from your private network to Grafana Cloud, so on-prem SQL servers or self-hosted Prometheus can be queried without opening inbound firewall holes. You manage multiple tunnels per stack, assign them to specific data sources, and monitor connection health in real time.
+
+Best doc: https://grafana.com/docs/grafana-cloud/connect-externally-hosted/private-data-source-connect/
+
+## More Apps
+
+URL: /apps
+
+Any installed app plugin that doesn’t fit the core navigation lands under “More Apps.” Admins can later reposition pages, but by default this keeps niche tools (e.g., a custom billing app) tidy yet discoverable.
+
+Best doc: https://grafana.com/docs/grafana/latest/administration/plugin-management/
+
+### Demo Data Dashboards
+
+URL: /a/grafana-demodashboards-app
+
+Best doc: https://grafana.com/docs/grafana-cloud/get-started/#install-demo-data-sources-and-dashboards
+
+## Administration
+
+URL: /admin
+
+The Administration section handles org-level housekeeping: user and team management, roles & permissions, billing, plugin toggles, and feature flags. An org-admin might head here to grant an intern “Viewer” rights or enable a new plugin across the stack.
+
+Best doc: https://grafana.com/docs/grafana/latest/administration/
+
+### General
+
+URL: /admin/general
+
+This page centralizes stack-wide preferences such as home dashboard, time zone, default theme, and the brand assets (logo + favicon) that appear in the UI. Changing a value here rewrites the org’s config file on the fly, so an admin could, for example, switch the home screen to a “Friday-on-call” dashboard without restarting Grafana.
+
+Best doc: https://grafana.com/docs/grafana/latest/administration/organization-management/
+
+### Plugins and Data
+
+URL: /admin/plugins
+
+The Plugins panel lists every data source, panel, and app extension that’s installed (or available in the catalog) and lets admins enable, disable, update, or pin versions. Typical use: a team trials the Kubernetes-Monitoring app in a dev stack and, once satisfied, promotes the plugin to prod with one toggle.
+
+Best doc: https://grafana.com/docs/grafana/latest/administration/plugin-management/
+
+### Users and Access
+
+URL: /admin/access
+
+Here you add users, place them into teams, and fine-tune permissions down to folder or dashboard level. A site-reliability lead might create a “DB-Ops” team with Editor rights on the “Database” folder while leaving broader org access at Viewer.
+
+Best doc: https://grafana.com/docs/grafana/latest/administration/user-management/
+
+### Authentication
+
+URL: /admin/authentication
+
+This tab wires Grafana to auth providers—LDAP, SAML, OAuth (GitHub, Google, Okta, etc.)—and exposes toggles for sign-up modes, auto-invite, and session hardening. For example, you could flip on GitHub OAuth and require membership in the “infra-team” org before anyone can log in.
+
+Best doc: https://grafana.com/docs/grafana/latest/setup-grafana/configure-security/configure-authentication/
+
+### Advisor
+
+URL: /a/grafana-advisor-app
+
+Advisor is a health-check dashboard that scans your instance for red flags—outdated plugins, high-cardinality metrics, or missing TLS—and scores overall hygiene. Running the check after an upgrade quickly shows whether new settings broke alerting or data-source auth.
+
+Best doc: https://grafana.com/docs/grafana/latest/administration/grafana-advisor/
+
+### Cost Management
+
+URL: /a/grafana-costmanagementui-app/overview
+
+The hub aggregates spend across metrics, logs, traces, and profiles, pairing usage dashboards with optimization tools like Adaptive Metrics and Log Volume Explorer. Admins drop in here to set usage alerts, allocate costs to teams, or preview next month’s bill curve.
+
+Best doc: https://grafana.com/docs/grafana-cloud/cost-management-and-billing/
+
+#### Metrics
+
+URL: /a/grafana-costmanagementui-app/metrics
+
+Shows active-series counts, DPM trends, and the Adaptive Metrics recommender that can auto-aggregate low-value labels, slashing ingestion by up to 40%. A performance engineer might run the recommender, accept the plan, and immediately see forecast savings in the usage panel.
+
+Best doc: https://grafana.com/docs/grafana-cloud/cost-management-and-billing/analyze-costs/metrics-costs/
+
+#### Logs
+
+URL: /a/grafana-costmanagementui-app/logs
+
+Pivots to Log Volume Explorer where you slice ingestion by label to find the namespace or app exploding your bill. Teams often start here to craft drop rules or retention tiers after spotting a chatty sidecar that doubles log volume overnight.
+
+Best doc: https://grafana.com/docs/grafana-cloud/cost-management-and-billing/analyze-costs/logs-costs/
+
+#### Traces
+
+URL: /a/grafana-costmanagementui-app/traces
+
+Highlights GB ingested vs retained and surfaces high-fan-out span attributes; guidance panels link to sampling knobs that trim unnecessary spans before export. Observability owners use it to adjust instrumentation and keep trace costs flat as traffic grows.
+
+Best doc: https://grafana.com/docs/grafana-cloud/cost-management-and-billing/reduce-costs/traces-costs/
+
+#### Profiles
+
+URL: /a/grafana-costmanagementui-app/profiles
+
+Best doc: https://grafana.com/docs/grafana-cloud/cost-management-and-billing/understand-your-invoice/profiles-invoice/
diff --git a/product-knowledge/data-sources.mdc b/product-knowledge/data-sources.mdc
new file mode 100644
index 0000000..588f047
--- /dev/null
+++ b/product-knowledge/data-sources.mdc
@@ -0,0 +1,228 @@
+---
+description:
+globs:
+alwaysApply: false
+---
+# Grafana Data Source Research Report
+
+This report details a selection of significant, signed Grafana data source plugins. The focus is on plugins available in Grafana Cloud and popular, well-documented open-source offerings. Each section outlines the plugin's purpose, environment, use cases, and technical requirements.
+
+## Prometheus
+
+Prometheus is a leading open-source monitoring and alerting toolkit, and its data source is a cornerstone of modern observability stacks. It is designed for reliability and scalability, pulling time-series data via HTTP endpoints on monitored targets.
+
+* **Key Use Cases**: Infrastructure monitoring (Kubernetes, servers), application performance monitoring (APM), service monitoring, and alerting on multi-dimensional time-series data.
+
+#### Plugin Environment
+
+* **Availability**: Built into Grafana OSS and Grafana Cloud.
+* **Publisher**: Grafana Labs.
+* **Version**: Core plugin, version-tied to the Grafana instance.
+* **Documentation URL**: `https://grafana.com/docs/grafana/latest/datasources/prometheus/` + +#### Use Cases & Application Scenarios + +* **Kubernetes Monitoring**: The primary use case is visualizing metrics scraped from a Kubernetes cluster, often in tandem with the `kube-state-metrics` and `node-exporter` services. +* **Service-Level Objective (SLO) Tracking**: Using PromQL (Prometheus Query Language) to define and visualize SLOs for service availability and performance. +* **Application Metrics**: Instrumenting custom applications (e.g., in Go, Java, Python) with Prometheus client libraries to expose metrics like request latency, error rates, and queue depths. + +#### Technical Requirements + +* A running Prometheus server instance or a compatible endpoint (e.g., Grafana Mimir, Cortex, Thanos). +* Network accessibility from the Grafana instance/agent to the Prometheus server's HTTP API (default port 9090). + +#### Security Considerations + +* Communication is typically over HTTP. For production, it is strongly recommended to place Prometheus behind a reverse proxy that provides TLS/SSL encryption and authentication. +* If using Grafana Agent to ship data to Grafana Cloud Metrics (which uses a Prometheus-compatible backend), authentication is handled via API keys. + +#### Grafana Ecosystem Integration + +* **Grafana Mimir**: The Prometheus data source is the primary way to query Grafana's horizontally scalable, multi-tenant TSDB. +* **Grafana Alerting**: The standard data source for defining metric-based alert rules. +* **Correlations**: Tightly integrated with Loki and Tempo to automatically link metrics to relevant logs and traces. + +## Loki + +Loki is Grafana's log aggregation system, inspired by Prometheus. It is designed to be highly cost-effective and easy to operate by indexing only metadata (labels) for logs, not the full text content. 
+ +* **Key Use Cases**: Centralized log aggregation, live log streaming, infrastructure and application log analysis, and correlating logs with metrics and traces. + +#### Plugin Environment + +* **Availability**: Built-in to Grafana OSS and Grafana Cloud. +* **Publisher**: Grafana Labs. +* **Version**: Core plugin, version-tied to the Grafana instance. +* **Documentation URL**: `https://grafana.com/docs/grafana/latest/datasources/loki/` + +#### Use Cases & Application Scenarios + +* **Log Exploration**: Using LogQL (Loki Query Language) to interactively query and filter logs from multiple sources in the Explore view. +* **Kubernetes Pod Logs**: Aggregating logs from all pods in a Kubernetes cluster, with automatic metadata like `pod`, `namespace`, and `container`. +* **Application Debugging**: Finding specific error messages or log patterns across distributed services without the overhead of full-text indexing. + +#### Technical Requirements + +* A running Loki instance or a Grafana Cloud Logs subscription. +* Log collection agents (e.g., Grafana Agent, Promtail, Fluentd) configured to push logs to the Loki endpoint. + +#### Security Considerations + +* Authentication to a self-hosted Loki instance is typically handled via a reverse proxy. +* When pushing logs to Grafana Cloud, authentication is managed via the User ID and an API Key. + +#### Grafana Ecosystem Integration + +* **Grafana Logs**: The Loki data source is the native backend for Grafana's log visualization panels and the Explore view. +* **Correlations**: Automatically links logs to related traces (via TraceID) from Tempo and metrics from Prometheus by matching labels (e.g., `pod`, `service`). This is a key feature of the Grafana observability stack. +* **Grafana Alerting**: Supports defining log-based alert rules. + +## AWS CloudWatch + +The AWS CloudWatch data source allows Grafana to query metrics and logs directly from Amazon's native monitoring service for AWS cloud resources and applications. 
+ +* **Key Use Cases**: Monitoring the performance and health of AWS services like EC2, RDS, Lambda, and S3; analyzing application logs stored in CloudWatch Logs. + +#### Plugin Environment + +* **Availability**: Core plugin for Grafana OSS and Grafana Cloud. +* **Publisher**: Grafana Labs. +* **Version**: Core plugin, version-tied to the Grafana instance. +* **Documentation URL**: `https://grafana.com/docs/grafana/latest/datasources/aws-cloudwatch/` + +#### Use Cases & Application Scenarios + +* **Infrastructure Dashboards**: Visualizing core metrics like EC2 CPU Utilization, RDS Database Connections, or ELB Request Counts. +* **Log Analysis**: Querying and visualizing logs from services, Lambda functions, and other resources that write to CloudWatch Logs. +* **Cost Management**: Visualizing AWS billing and cost data, which can be published as a metric to CloudWatch. + +#### Technical Requirements + +* An active AWS account. +* Configured AWS credentials with appropriate IAM permissions (e.g., `cloudwatch:GetMetricData`, `cloudwatch:ListMetrics`, `logs:StartQuery`, `logs:GetQueryResults`). + +#### Security Considerations + +* The recommended authentication method is to use IAM roles (e.g., an IAM role for an EC2 instance running Grafana, or IRSA for EKS). +* Alternatively, temporary security credentials or long-lived access key/secret pairs can be used, but this is less secure. Grafana's data source provisioning can use AWS Secrets Manager for secure key storage. + +#### Grafana Ecosystem Integration + +* **Grafana Alerting**: Can be used to create alerts based on CloudWatch metrics. +* **Cross-Service Dashboards**: Enables building unified dashboards that combine AWS metrics with data from other sources like Prometheus or on-premise databases. + +## PostgreSQL + +This data source allows Grafana to connect directly to PostgreSQL databases to visualize data from relational tables as time-series or tables. 
+ +* **Key Use Cases**: Visualizing business intelligence (BI) data, monitoring application-specific data, and tracking business KPIs stored in a relational database. + +#### Plugin Environment + +* **Availability**: Core plugin for Grafana OSS and Grafana Cloud. +* **Publisher**: Grafana Labs. +* **Version**: Core plugin, version-tied to the Grafana instance. +* **Documentation URL**: `https://grafana.com/docs/grafana/latest/datasources/postgres/` + +#### Use Cases & Application Scenarios + +* **Business Dashboards**: Plotting user signups, sales trends, or inventory levels over time by querying application database tables. +* **Time-Series Analysis**: Leveraging PostgreSQL time functions to format data for Grafana's time-series panels. The plugin includes macros like `$__timeFilter()` that restrict a query to the dashboard's selected time range. +* **Table Visualization**: Displaying the raw results of a SQL query in a table format, useful for reports and non-time-series data. + +#### Technical Requirements + +* A running PostgreSQL server (or a compatible database such as TimescaleDB or CockroachDB). +* Network accessibility from the Grafana instance to the PostgreSQL server. For Grafana Cloud, reaching a database on a private network typically requires Private data source connect (PDC) for secure tunneling. +* A database user with `CONNECT` and `SELECT` privileges on the required schemas and tables. + +#### Security Considerations + +* Connection requires a database user and password, which should be stored securely. +* TLS/SSL encryption for the connection is highly recommended and configurable in the data source settings. +* It is best practice to create a read-only database user for Grafana with access to only the necessary tables. + +#### Grafana Ecosystem Integration + +* **Business Observability**: Complements traditional observability by bringing business metrics into the same dashboards as operational metrics.
+* **Transformations**: Data returned from SQL queries can be further manipulated using Grafana's built-in transformations. + +## Infinity + +The Infinity data source is a generic plugin that can visualize data from any backend that returns JSON, CSV, XML, or HTML, including REST APIs. It acts as a powerful bridge to data sources that do not have a dedicated Grafana plugin. + +* **Key Use Cases**: Connecting to internal or public REST APIs, visualizing data from static files (CSV/JSON), and creating dashboards from unsupported data sources. + +#### Plugin Environment + +* **Availability**: Signed plugin, installable in Grafana OSS and Grafana Cloud. +* **Publisher**: Grafana Labs. +* **Version**: 1.2.1 (as of research date). +* **Documentation URL**: `https://grafana.com/grafana/plugins/yesoreyeram-infinity-datasource/` + +#### Use Cases & Application Scenarios + +* **Public API Dashboards**: Visualizing data from public APIs like GitHub (e.g., number of stars on a repo) or weather services. +* **Internal Microservice Data**: Querying custom JSON endpoints on internal microservices to display application-specific status or data. +* **Static Data Prototyping**: Using inline JSON/CSV data to quickly prototype a dashboard without needing a live backend. + +#### Technical Requirements + +* Depends entirely on the target API or file. It may require an API endpoint URL, a file path, or inline data. +* Network access from the Grafana instance to the target API endpoint. + +#### Security Considerations + +* Supports API Key, Basic, and Bearer Token authentication, which can be configured in the data source settings. +* Secrets (like API keys) should be handled using Grafana's secure provisioning mechanisms. +* Care should be taken when connecting to public, unauthenticated APIs. + +#### Grafana Ecosystem Integration + +* **Flexibility**: Fills gaps where a dedicated data source does not exist, allowing almost any data to be brought into a Grafana dashboard. 
+* **Grafana Scenes**: The flexibility of this plugin makes it a powerful tool when building interactive web applications with Grafana Scenes. + +## Splunk + +The Splunk Enterprise data source plugin allows Grafana to run Splunk Search Processing Language (SPL) queries and visualize the results. This enables organizations with existing Splunk investments to leverage Grafana's visualization capabilities. + +* **Key Use Cases**: Migrating or augmenting Splunk dashboards with Grafana, creating unified dashboards with data from Splunk and other sources, and using Grafana's alerting engine with Splunk data. + +#### Plugin Environment + +* **Availability**: Enterprise plugin, requires a Grafana Enterprise license. Available by default in Grafana Cloud (with a Pro/Advanced plan). +* **Publisher**: Grafana Labs. +* **Version**: Core Enterprise plugin, version-tied to the Grafana instance. +* **Documentation URL**: `https://grafana.com/docs/grafana/latest/datasources/splunk/` + +#### Use Cases & Application Scenarios + +* **Unified Observability**: Combining security and event data from Splunk with operational metrics from Prometheus in a single dashboard. +* **Data Visualization**: Using Splunk's powerful search capabilities as a backend and Grafana's superior and more flexible visualization engine as the frontend. +* **Alerting**: Leveraging Grafana Alerting on top of Splunk search results, which can be more intuitive and better integrated with notification channels like Grafana OnCall. + +#### Technical Requirements + +* A running Splunk Enterprise instance (version 6.3 or higher). +* The Splunk HTTP Event Collector (HEC) must be enabled. +* A Splunk user account with appropriate permissions to run searches. + +#### Security Considerations + +* Authentication is handled via Splunk credentials (username/password or token). +* Communication with the Splunk API should be secured with TLS/SSL. 
+ +#### Grafana Ecosystem Integration + +* **Grafana Enterprise**: This is a key value proposition for Grafana Enterprise, appealing to customers with existing Splunk deployments. +* **Grafana Alerting/OnCall**: Provides a modern alerting and incident response workflow for events and data residing in Splunk. + +*** + +## Notes & Indications + +* **Plugin Versioning**: For core/built-in plugins (Prometheus, Loki, etc.), the version is directly tied to the main Grafana release. For installable plugins like Infinity, the version is independent and updated separately. The versions listed are a snapshot in time and may change. +* **Core vs. Installable Plugins**: A key distinction is between data sources that are built-in or available by default (like Prometheus) and those that must be explicitly installed from the Grafana plugin catalog (like Infinity), even if they are published by Grafana Labs. +* **Enterprise vs. OSS**: The availability of a plugin can be a major deciding factor for users. Some high-value plugins connecting to commercial products (like Splunk, ServiceNow, Dynatrace) are reserved for Grafana Enterprise customers, forming a significant part of the enterprise value proposition. +* **The Rise of Generic Plugins**: The Infinity data source represents an important trend. Its flexibility allows users to connect to a near-limitless number of sources, significantly expanding Grafana's reach beyond its dedicated plugins. +* **Security is Paramount**: Across all data sources, a consistent theme is the need for secure credential management and encrypted communication (TLS/SSL). Best practices favor temporary credentials or role-based access control over static username/password or token configurations. 
\ No newline at end of file diff --git a/product-knowledge/docs-structure-mapping.mdc b/product-knowledge/docs-structure-mapping.mdc new file mode 100644 index 0000000..3785920 --- /dev/null +++ b/product-knowledge/docs-structure-mapping.mdc @@ -0,0 +1,319 @@ +--- +alwaysApply: false +--- +# Docs Structure Mapping + +This file describes the connection between documentation URLs and broad product areas described +in @grafana-products.mdc, as well as features described in @app-states.mdc. + +Below is a set of URL prefixes and their associated products and/or features. + +The prefixes refer to locations in Grafana docs, so it is assumed that they correspond +to URLs that start with https://grafana.com. + +When this file states that something is the "Child of" something else, this does **not** mean +that it is a child in the navigational sense, but in the product sense. It means roughly +"this is a sub-feature or a sub-product of the parent", even though their documentation may be +in different places.
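One way a consumer of this file might resolve a documentation URL is sketched below. It is a minimal illustration, not part of the file format: it assumes that the longest matching prefix wins (this file does not state a tie-breaking rule), and the `PREFIXES` table, `match_prefix`, and `lineage` are hypothetical names that copy only a handful of entries from this file.

```python
from urllib.parse import urlparse

# A few entries copied from this file: prefix -> (name, parent).
# The tuple layout is an illustrative assumption, not a defined schema.
PREFIXES = {
    "/docs/grafana/": ("Grafana OSS", None),
    "/docs/grafana-cloud/": ("Grafana Cloud", "Grafana OSS"),
    "/docs/grafana-cloud/send-data/": ("Instrument and Send Data", "Grafana Cloud"),
    "/docs/grafana-cloud/send-data/logs/": ("Send Logs", "Instrument and Send Data"),
    "/docs/loki/": ("Loki", None),
}


def match_prefix(url):
    """Return the name mapped to the longest prefix matching the URL path, or None."""
    path = urlparse(url).path
    candidates = [prefix for prefix in PREFIXES if path.startswith(prefix)]
    if not candidates:
        return None
    # Assumption: the most specific (longest) prefix takes precedence.
    return PREFIXES[max(candidates, key=len)][0]


def lineage(name):
    """Follow "Child Of" links from a product or feature up to its root."""
    parents = {entry_name: parent for entry_name, parent in PREFIXES.values()}
    chain = []
    while name is not None:
        chain.append(name)
        name = parents.get(name)
    return chain


if __name__ == "__main__":
    url = "https://grafana.com/docs/grafana-cloud/send-data/logs/collect/"
    feature = match_prefix(url)
    print(feature, "->", " > ".join(lineage(feature)))
```

Because the "Child Of" relation is a product relation rather than a navigational one, the lineage is walked by name and not by trimming path segments from the URL.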
+ +# Prefix: /docs/grafana/ + +Product: Grafana OSS +Child Of: N/A + +# Prefix: /docs/grafana-cloud/ + +Product: Grafana Cloud +Child Of: Grafana OSS + +# Prefix: /docs/grafana-cloud/security-and-account-management/ + +Feature: Security & Account Management +Child Of: Grafana Cloud + +# Prefix: /docs/grafana-cloud/send-data/ + +Feature: Instrument and Send Data +Child Of: Grafana Cloud + +# Prefix: /docs/grafana-cloud/send-data/metrics/ + +Feature: Send Metrics +Child Of: Instrument and Send Data + +# Prefix: /docs/grafana-cloud/send-data/logs/ + +Feature: Send Logs +Child Of: Instrument and Send Data + +# Prefix: /docs/grafana-cloud/send-data/traces/ + +Feature: Send Traces +Child Of: Instrument and Send Data + +# Prefix: /docs/grafana-cloud/send-data/aws-privatelink/ + +Feature: AWS PrivateLink +Child Of: Instrument and Send Data + +# Prefix: /docs/grafana-cloud/send-data/azure-privatelink/ + +Feature: Azure PrivateLink +Child Of: Instrument and Send Data + +# Prefix: /docs/grafana-cloud/send-data/gcp-psc/ + +Feature: GCP Private Service Connect +Child Of: Instrument and Send Data + +# Prefix: /docs/grafana-cloud/send-data/otlp/ + +Feature: OpenTelemetry +Child Of: Instrument and Send Data + +# Prefix: /docs/k6/ + +Product: K6 +Child Of: N/A + +# Prefix: /docs/grafana/latest/introduction/grafana-enterprise/ + +Product: Grafana Enterprise +Child Of: Grafana OSS + +# Prefix: /docs/helm-charts + +Product: Grafana Helm Charts +Child Of: N/A + +# Prefix: /docs/alloy/ + +Product: Grafana Alloy +Child Of: N/A + +# Prefix: /docs/beyla/ + +Product: Grafana Beyla +Child Of: N/A + +# Prefix: /docs/grafana/latest/explore/ + +Product: Explore +Child Of: Grafana OSS + +# Prefix: /docs/grafana/latest/explore/simplified-exploration/ + +Product: Drilldown +Child Of: Explore + +# Prefix: /docs/grafana/latest/observability-as-code/ + +Feature: As-Code +Child Of: Grafana OSS + +# Prefix: /docs/grafana/latest/observability-as-code/grafana-cli/ + +Feature: Grafana CLI +Child Of:
As-Code + +# Prefix: /docs/grafana/latest/observability-as-code/foundation-sdk/ + +Feature: Foundation SDK +Child Of: As-Code + +# Prefix: /docs/grafana/latest/troubleshooting/ + +Feature: Troubleshooting +Child Of: Grafana OSS + +# Prefix: /docs/grafana/latest/getting-started/ + +Feature: Getting started with OSS +Child Of: Grafana OSS +OSS Only: true + +# Prefix: /docs/k6-studio/ + +Product: K6 Studio +Child Of: K6 + +# Prefix: /docs/loki/ + +Product: Loki +Child Of: N/A + +# Prefix: /docs/loki/latest/setup/ + +Feature: Loki Setup +Child Of: Loki +OSS Only: true + +# Prefix: /docs/loki/latest/configure/ + +Feature: Loki Configure +Child Of: Loki +OSS Only: true + +# Prefix: /docs/loki/latest/send-data/ + +Feature: Loki Send Data +Child Of: Loki + +# Prefix: /docs/loki/latest/query/ + +Feature: Loki Query +Child Of: Loki + +# Prefix: /docs/loki/latest/operations/ + +Feature: Loki Manage Operations +Child Of: Loki +OSS Only: true + +# Prefix: /docs/loki/latest/reference/ + +Feature: Loki Reference +Child Of: Loki + +# Prefix: /docs/mimir + +Product: Mimir +Child Of: N/A + +# Prefix: /docs/pyroscope/ + +Product: Pyroscope +Child Of: N/A + +# Prefix: /docs/oncall/ + +Product: Grafana OnCall +Child Of: Grafana Cloud + +# Prefix: /docs/tempo + +Product: Grafana Tempo +Child Of: N/A + +# Prefix: /docs/grafana-cloud/monitor-applications/ + +Feature: Monitor Applications +Child Of: Grafana Cloud + +# Prefix: /docs/grafana-cloud/monitor-applications/frontend-observability/ + +Product: Grafana Faro +Child Of: Monitor Applications + +# Prefix: /docs/grafana-cloud/monitor-applications/asserts/ + +Product: Grafana Asserts +Child Of: Monitor Applications + +# Prefix: /docs/grafana-cloud/monitor-applications/ai-observability/ + +Product: Grafana AI Observability +Child Of: Monitor Applications + +# Prefix: /docs/grafana-cloud/cost-management-and-billing/ + +Feature: Cost Management & Billing +Child Of: Grafana Cloud + +# Prefix: /docs/grafana-cloud/visualizations/ + +Feature: Visualize Data +Child
Of: Grafana Cloud + +# Prefix: /docs/grafana-cloud/send-data/fleet-management/ + +Product: Fleet Management +Child Of: Grafana Cloud + +# Prefix: /docs/grafana-cloud/monitor-infrastructure/monitor-cloud-provider/ + +Product: Cloud Provider Observability +Child Of: Grafana Cloud + +# Prefix: /docs/grafana-cloud/alerting-and-irm/irm/ + +Product: IRM +Child Of: Grafana Cloud + +# Prefix: /docs/grafana-cloud/alerting-and-irm/alerting/ + +Feature: Alerting +Child Of: IRM + +# Prefix: /docs/grafana/latest/alerting/ + +Feature: Alerting +Child Of: Grafana OSS + +# Prefix: /docs/grafana/latest/dashboards/ + +Feature: Dashboards +Child Of: Grafana OSS + +# Prefix: /docs/grafana/latest/datasources/ + +Feature: Data Sources +Child Of: Grafana OSS + +# Prefix: /docs/grafana/latest/administration/ + +Feature: Grafana Administration +Child Of: Grafana OSS + +# Prefix: /docs/grafana/latest/upgrade-guide/ + +Feature: Upgrading +Child Of: Grafana OSS +OSS Only: True + +# Prefix: /docs/grafana/latest/setup-grafana/ + +Feature: Setup +Child Of: Grafana OSS +OSS Only: True + +# Prefix: /docs/grafana/latest/panels-visualizations/ + +Feature: Panels & Visualizations +Child Of: Grafana OSS + +# Prefix: /docs/grafana/latest/panels-visualizations/query-transform-data/ + +Feature: Query & Transform Data +Child Of: Grafana OSS + +# Prefix: /docs/grafana/latest/search/ + +Feature: Search +Child Of: Grafana OSS + +# Prefix: /docs/grafana-cloud/alerting-and-irm/slo/ + +Product: SLOs +Child Of: Grafana Cloud + +# Prefix: /docs/grafana-cloud/testing/synthetic-monitoring/ + +Product: Synthetic Monitoring +Child Of: Grafana Cloud + +# Prefix: /docs/grafana-cloud/alerting-and-irm/machine-learning/ + +Product: Machine Learning +Child Of: Grafana Cloud + +# Prefix: /docs/grafana-cloud/monitor-applications/application-observability/ + +Product: Application Observability +Child Of: Grafana Cloud + +# Prefix: /docs/grafana-cloud/monitor-infrastructure/kubernetes-monitoring/ + +Product: Kubernetes 
Monitoring +Child Of: Grafana Cloud + diff --git a/product-knowledge/grafana-products.mdc b/product-knowledge/grafana-products.mdc new file mode 100644 index 0000000..fe4cff0 --- /dev/null +++ b/product-knowledge/grafana-products.mdc @@ -0,0 +1,424 @@ +--- +alwaysApply: false +--- + +Here is a list of Grafana Labs products with their descriptions and common use cases. + +### Grafana +Grafana is an open-source platform for monitoring and observability. It allows you to query, visualize, alert on, and understand your metrics no matter where they are stored. + +**Common Use Cases:** +* Creating and sharing dynamic and interactive dashboards. +* Visualizing data from a wide variety of data sources. +* Setting up alerts based on metrics and logs. +* Fostering a data-driven culture by enabling teams to explore and share data. + +### Grafana Loki +Grafana Loki is a horizontally scalable, multi-tenant log aggregation system inspired by Prometheus. It is designed to be very cost-effective and easy to operate. + +**Common Use Cases:** +* Aggregating logs from all your applications and infrastructure. +* Querying and filtering logs using LogQL, a query language similar to PromQL. +* Correlating logs with metrics and traces to get a complete picture of your system. + +### Grafana Tempo +Grafana Tempo is a high-scale, minimal-dependency distributed tracing backend. It is designed to be a simple and cost-effective way to store and query traces. + +**Common Use Cases:** +* Storing and retrieving traces from applications instrumented with OpenTelemetry, Jaeger, or Zipkin. +* Debugging and understanding the flow of requests through a distributed system. +* Integrating with Grafana for seamless correlation between metrics, logs, and traces. + +### Grafana Mimir +Grafana Mimir is a scalable, long-term storage solution for Prometheus metrics. It allows you to aggregate metrics from multiple Prometheus instances into a single global view that can be queried from one place.
+ +**Common Use Cases:** +* Centralized, long-term storage for Prometheus metrics. +* High-performance querying of large volumes of time-series data. +* Creating a single-pane-of-glass view of metrics from across your entire infrastructure. + +### Grafana Pyroscope +Grafana Pyroscope is a continuous profiling platform that helps you find performance issues in your code. It allows you to understand which parts of your code are consuming the most resources. + +**Common Use Cases:** +* Identifying and debugging performance bottlenecks in applications. +* Optimizing resource utilization and reducing infrastructure costs. +* Analyzing CPU and memory usage over time to improve application performance. + +### Grafana K6 +Grafana k6 is a modern, open-source load testing tool for engineering teams. It allows you to write tests in JavaScript and run them from the command line or the cloud. + +**Common Use Cases:** +* Performance and load testing of APIs, microservices, and websites. +* Ensuring the reliability and scalability of your systems. +* Integrating performance testing into your CI/CD pipelines. + +### Grafana IRM (Incident Response Management) +Grafana IRM is a suite of tools designed to help you manage incidents from detection to resolution. It includes Grafana OnCall and Grafana Incident. + +**Common Use Cases:** +* **Grafana OnCall:** Managing on-call schedules, escalations, and notifications to ensure that the right people are alerted when an incident occurs. +* **Grafana Incident:** Automating incident response workflows, from creating a dedicated Slack channel to tracking action items and generating post-mortems. + +### Grafana Cloud +Grafana Cloud is a fully managed, composable observability platform that brings together all of Grafana's open-source and enterprise products into a single, easy-to-use solution. + +**Common Use Cases:** +* Organizations that want to use the Grafana stack without the overhead of managing the infrastructure themselves. 
+* Getting started quickly with a complete observability solution. +* Scaling observability as your organization grows. + +### Grafana Enterprise +Grafana Enterprise is a self-managed, commercial edition of Grafana that includes additional features and support for enterprise use cases. + +**Common Use Cases:** +* Organizations with strict security and compliance requirements that need to run Grafana on their own infrastructure. +* Accessing enterprise-grade features such as enhanced security, reporting, and data source integrations. +* Receiving dedicated support from Grafana Labs. + +### Grafana Beyla +Grafana Beyla is an eBPF-based application auto-instrumentation tool. It allows you to capture telemetry data from your applications without modifying their code. + +**Common Use Cases:** +* Gaining visibility into the performance of applications without manual instrumentation. +* Reducing the time and effort required to instrument a large number of services. +* Monitoring applications written in languages that are difficult to instrument. + +### Grafana Faro +Grafana Faro is a frontend application observability SDK. It allows you to collect real-user monitoring (RUM) data from your web applications. + +**Common Use Cases:** +* Understanding the end-user experience of your web applications. +* Identifying and debugging frontend performance issues. +* Collecting data on page load times, JavaScript errors, and other frontend events. + +### Grafana Alloy +Grafana Alloy is a distribution of the OpenTelemetry Collector with added support for Prometheus pipelines. + +**Common Use Cases:** +* Collecting and processing telemetry data from a variety of sources. +* Sending data to multiple observability backends. +* Building custom telemetry pipelines. + +### Grafana Helm Charts + +Grafana Helm Charts are official Kubernetes deployment packages that simplify the installation, configuration, and management of Grafana on Kubernetes clusters. 
These charts provide a standardized way to deploy both Grafana OSS and Enterprise editions with customizable configurations, persistent storage, and integrated security features. + +**Common Use Cases:** +* Deploying Grafana on Kubernetes clusters with consistent, repeatable configurations. +* Managing Grafana installations across multiple environments (development, staging, production). +* Automating Grafana deployments as part of CI/CD pipelines and GitOps workflows. +* Simplifying upgrades, rollbacks, and configuration changes for Grafana instances. +* Integrating Grafana with other Kubernetes-native tools and monitoring stacks. + +### Grafana Asserts + +Grafana Asserts is an intelligent observability platform that provides automated root cause analysis and correlation intelligence for distributed applications. It leverages telemetry data to create visual representations of application and infrastructure relationships, automatically correlates issues using the SAAFE model (Saturations, Amends, Anomalies, Failures, Errors), and eliminates the need for manual dashboard maintenance through intelligent workbench capabilities. + +**Common Use Cases:** +* Performing automated root cause analysis for complex distributed system issues. +* Tracking SLO violations and understanding why error budgets are depleting. +* Correlating metrics, logs, and traces to identify the true source of problems. +* Reducing observability costs through intelligent data distillation and retention. +* Eliminating the need to maintain hundreds of manually created dashboards. + +### Grafana AI Observability + +Grafana AI Observability is a comprehensive solution designed to monitor and optimize generative AI applications. It captures observability signals and visualizes real-time performance of your GenAI stack from LLMs to vector databases, correlating usage data for full-stack AI observability. 
+ +**Common Use Cases:** +* Monitoring Large Language Model (LLM) performance, token usage, and costs in production environments. +* Tracking user interactions with AI applications to understand usage patterns and model behavior. +* Analyzing vector database query response times and throughput for efficient data retrieval. + +### Machine Learning + +Grafana Machine Learning is an AI/ML toolkit integrated into Grafana Cloud that provides intelligent anomaly detection, forecasting, and automated insights for observability data. It includes dynamic alerting with metric forecasting, outlier detection for groups of similar resources, and AI-powered root cause analysis capabilities through Sift investigations. + +**Common Use Cases:** +* Detecting anomalous behavior in groups of similar Kubernetes pods or services using outlier detection algorithms. +* Creating predictive alerts for capacity planning and resource management through metric forecasting and anomaly detection. +* Automating root cause analysis during incidents with AI-powered investigations and contextual insights. + +### Fleet Management + +Grafana Fleet Management is a centralized service for managing fleets of telemetry collectors and configuration pipelines across distributed infrastructure. It provides remote configuration management, pipeline creation, and monitoring capabilities for Grafana Alloy collectors, enabling streamlined observability data collection at scale. + +**Common Use Cases:** +* Centrally managing and configuring multiple Grafana Alloy collectors deployed across different environments and locations. +* Creating and deploying standardized telemetry collection pipelines to ensure consistent data gathering across infrastructure. +* Monitoring collector health and performance while troubleshooting data collection issues from a unified interface. 
+ +### Cloud Provider Observability + +Cloud Provider Observability is a comprehensive multi-cloud monitoring solution that provides unified observability across AWS, Azure, and Google Cloud Platform infrastructure. It offers out-of-the-box integration with cloud services like CloudWatch, Azure Monitor, and Google Cloud Monitoring, enabling teams to visualize and alert on cloud resources without deploying local agents or exporters. + +**Common Use Cases:** +* Monitoring AWS services like EC2, RDS, Lambda, and S3 with pre-built dashboards and cost tracking across multiple accounts and regions. +* Centralizing multi-cloud observability by correlating metrics and logs from AWS, Azure, and GCP in a single Grafana Cloud instance. +* Implementing cloud cost management and optimization by tracking resource utilization, billing metrics, and identifying underused infrastructure across cloud providers. + +### SLO + +Grafana SLO is a comprehensive Service Level Objective management solution that enables organizations to define, track, and manage reliability targets for their services. It provides automated SLO calculation, error budget tracking, and burn rate alerting to help teams maintain service reliability and make data-driven decisions about service improvements. + +**Common Use Cases:** +* Defining and monitoring service reliability targets with automated error budget tracking and burn rate calculations. +* Creating SLO-based alerting policies that notify teams when error budgets are being consumed too quickly or reliability targets are at risk. +* Establishing data-driven reliability practices by correlating SLO performance with business metrics and customer satisfaction. + +### Synthetic Monitoring + +Grafana Synthetic Monitoring is a proactive monitoring solution that tests applications and services by creating checks that continuously run against remote targets from global probe locations. 
It leverages k6 technology to provide comprehensive monitoring capabilities including HTTP/API endpoint testing, DNS resolution validation, browser-based user flow testing, and multi-step transaction monitoring. + +**Common Use Cases:** +* Monitoring website and API availability from multiple global locations to detect outages before users do. +* Testing critical user journeys and workflows with browser-based checks to ensure core functionality works correctly. +* Validating DNS resolution, SSL certificate health, and network connectivity for proactive infrastructure monitoring. + +### Application Observability + +Grafana Application Observability is a comprehensive solution for monitoring application performance using OpenTelemetry-based instrumentation. It provides automatic discovery and correlation of metrics, logs, and traces to deliver service-level insights including service maps, service inventory, and automated root cause analysis capabilities. + +**Common Use Cases:** +* Monitoring distributed application performance with automatic service discovery and RED (Rate, Errors, Duration) metrics visualization. +* Performing root cause analysis during incidents by correlating metrics, logs, and traces across service boundaries and dependencies. +* Tracking application health and SLI/SLO compliance with pre-built dashboards and alerting for service-level objectives. + +### Kubernetes Monitoring + +Grafana Kubernetes Monitoring provides comprehensive observability for Kubernetes clusters and workloads, offering both reactive troubleshooting and proactive management capabilities. It includes pre-built dashboards, alerting rules, cost monitoring, and resource efficiency tracking with easy deployment via Helm chart and correlation between metrics and logs for faster issue resolution. 
+ +**Common Use Cases:** +* Monitoring cluster health and resource utilization with drill-down capabilities from cluster-level overview to individual pod and container performance metrics. +* Tracking infrastructure costs and resource efficiency to optimize spending and identify underutilized or over-provisioned resources across nodes and workloads. +* Proactively detecting performance issues and bottlenecks using built-in alerting rules, machine learning-powered resource forecasting, and real-time notifications for critical events. + +### Alerting + +Grafana Alerting is a unified alerting system that allows you to define alert rules across multiple data sources and manage notifications with flexible routing via contact points and notification policies. Built on the Prometheus alerting model, it supports both Grafana-managed and data source-managed alert rules with templating, grouping, and scheduling capabilities for comprehensive alerting workflows. + +**Common Use Cases:** +* Creating multi-dimensional alert rules that can query data from multiple data sources simultaneously and generate separate alert instances for each series or dimension. +* Managing complex notification routing using notification policies that route alerts to different contact points (email, Slack, PagerDuty, etc.) based on label matching and timing configurations. +* Building sophisticated alert conditions using expressions and transformations to reduce, resample, or apply mathematical operations on query results before triggering alerts. + +### Drilldown + +Grafana Drilldown refers to a collection of simplified exploration applications that provide queryless interfaces for navigating observability data including logs, traces, profiles, and metrics. These apps enable users to explore and analyze data through interactive visualizations and filtering without requiring knowledge of query languages like LogQL, TraceQL, or PromQL. 
+ +**Common Use Cases:** +* Exploring log data from Loki services by drilling down through volumes, patterns, labels, and fields without writing LogQL queries. +* Navigating distributed tracing data to investigate performance bottlenecks and error patterns across service dependencies. +* Analyzing profiling data to identify CPU and memory usage patterns by drilling down from high-level metrics to line-specific details. + +Here is a list of Grafana Labs products with their descriptions and common use cases. + +### Grafana +Grafana is an open-source platform for monitoring and observability. It allows you to query, visualize, alert on, and understand your metrics no matter where they are stored. + +**Common Use Cases:** +* Creating and sharing dynamic and interactive dashboards. +* Visualizing data from a wide variety of data sources. +* Setting up alerts based on metrics and logs. +* Fostering a data-driven culture by enabling teams to explore and share data. + +### Grafana Loki +Grafana Loki is a horizontally scalable, multi-tenant log aggregation system inspired by Prometheus. It is designed to be very cost-effective and easy to operate. + +**Common Use Cases:** +* Aggregating logs from all your applications and infrastructure. +* Querying and filtering logs using LogQL, a query language similar to PromQL. +* Correlating logs with metrics and traces to get a complete picture of your system. + +### Grafana Tempo +Grafana Tempo is a high-scale, minimal-dependency distributed tracing backend. It is designed to be a simple and cost-effective way to store and query traces. + +**Common Use Cases:** +* Storing and retrieving traces from applications instrumented with OpenTelemetry, Jaeger, or Zipkin. +* Debugging and understanding the flow of requests through a distributed system. +* Integrating with Grafana for seamless correlation between metrics, logs, and traces. + +### Grafana Mimir +Grafana Mimir is a scalable, long-term storage solution for Prometheus metrics. 
It allows you to aggregate metrics from multiple Prometheus instances into a single global view that you can query and visualize in Grafana. + +**Common Use Cases:** +* Centralized, long-term storage for Prometheus metrics. +* High-performance querying of large volumes of time-series data. +* Creating a single-pane-of-glass view of metrics from across your entire infrastructure. + +### Grafana Pyroscope +Grafana Pyroscope is a continuous profiling platform that helps you find performance issues in your code. It allows you to understand which parts of your code are consuming the most resources. + +**Common Use Cases:** +* Identifying and debugging performance bottlenecks in applications. +* Optimizing resource utilization and reducing infrastructure costs. +* Analyzing CPU and memory usage over time to improve application performance. + +### Grafana k6 +Grafana k6 is a modern, open-source load testing tool for engineering teams. It allows you to write tests in JavaScript and run them from the command line or the cloud. + +**Common Use Cases:** +* Performance and load testing of APIs, microservices, and websites. +* Ensuring the reliability and scalability of your systems. +* Integrating performance testing into your CI/CD pipelines. + +### Grafana IRM (Incident Response & Management) +Grafana IRM is a suite of tools designed to help you manage incidents from detection to resolution. It includes Grafana OnCall and Grafana Incident. + +**Common Use Cases:** +* **Grafana OnCall:** Managing on-call schedules, escalations, and notifications to ensure that the right people are alerted when an incident occurs. +* **Grafana Incident:** Automating incident response workflows, from creating a dedicated Slack channel to tracking action items and generating post-mortems. + +### Grafana Cloud +Grafana Cloud is a fully managed, composable observability platform that brings together all of Grafana's open-source and enterprise products into a single, easy-to-use solution.
+ +**Common Use Cases:** +* Organizations that want to use the Grafana stack without the overhead of managing the infrastructure themselves. +* Getting started quickly with a complete observability solution. +* Scaling observability as your organization grows. + +### Grafana Enterprise +Grafana Enterprise is a self-managed, commercial edition of Grafana that includes additional features and support for enterprise use cases. + +**Common Use Cases:** +* Organizations with strict security and compliance requirements that need to run Grafana on their own infrastructure. +* Accessing enterprise-grade features such as enhanced security, reporting, and data source integrations. +* Receiving dedicated support from Grafana Labs. + +### Grafana Beyla +Grafana Beyla is an eBPF-based application auto-instrumentation tool. It allows you to capture telemetry data from your applications without modifying their code. + +**Common Use Cases:** +* Gaining visibility into the performance of applications without manual instrumentation. +* Reducing the time and effort required to instrument a large number of services. +* Monitoring applications written in languages that are difficult to instrument. + +### Grafana Faro +Grafana Faro is a frontend application observability SDK. It allows you to collect real-user monitoring (RUM) data from your web applications. + +**Common Use Cases:** +* Understanding the end-user experience of your web applications. +* Identifying and debugging frontend performance issues. +* Collecting data on page load times, JavaScript errors, and other frontend events. + +### Grafana Alloy +Grafana Alloy is a distribution of the OpenTelemetry Collector with added support for Prometheus pipelines. + +**Common Use Cases:** +* Collecting and processing telemetry data from a variety of sources. +* Sending data to multiple observability backends. +* Building custom telemetry pipelines. 
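As a rough illustration of the Alloy use cases above (collecting telemetry and sending it to more than one backend), a minimal configuration sketch might look like the following. The component labels (`node`, `primary`, `secondary`), the scrape target address, and both remote-write URLs are hypothetical placeholders, not values taken from this document; check the Alloy component reference before relying on any attribute shown here.

```alloy
// Scrape Prometheus metrics from one target (the address is a placeholder).
prometheus.scrape "node" {
  targets = [{"__address__" = "localhost:9100"}]

  // Fan the scraped samples out to two backends.
  forward_to = [
    prometheus.remote_write.primary.receiver,
    prometheus.remote_write.secondary.receiver,
  ]
}

// First backend (hypothetical URL).
prometheus.remote_write "primary" {
  endpoint {
    url = "https://prometheus-primary.example.com/api/v1/write"
  }
}

// Second backend (hypothetical URL).
prometheus.remote_write "secondary" {
  endpoint {
    url = "https://prometheus-secondary.example.com/api/v1/write"
  }
}
```

Each component exports values that other components reference, so a pipeline is built by pointing `forward_to` at one or more receivers; adding a backend means adding another `prometheus.remote_write` block and listing its receiver.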
+ +### Grafana Helm Charts + +Grafana Helm Charts are official Kubernetes deployment packages that simplify the installation, configuration, and management of Grafana on Kubernetes clusters. These charts provide a standardized way to deploy both Grafana OSS and Enterprise editions with customizable configurations, persistent storage, and integrated security features. + +**Common Use Cases:** +* Deploying Grafana on Kubernetes clusters with consistent, repeatable configurations. +* Managing Grafana installations across multiple environments (development, staging, production). +* Automating Grafana deployments as part of CI/CD pipelines and GitOps workflows. +* Simplifying upgrades, rollbacks, and configuration changes for Grafana instances. +* Integrating Grafana with other Kubernetes-native tools and monitoring stacks. + +### Grafana Asserts + +Grafana Asserts is an intelligent observability platform that provides automated root cause analysis and correlation intelligence for distributed applications. It leverages telemetry data to create visual representations of application and infrastructure relationships, automatically correlates issues using the SAAFE model (Saturations, Amends, Anomalies, Failures, Errors), and eliminates the need for manual dashboard maintenance through intelligent workbench capabilities. + +**Common Use Cases:** +* Performing automated root cause analysis for complex distributed system issues. +* Tracking SLO violations and understanding why error budgets are depleting. +* Correlating metrics, logs, and traces to identify the true source of problems. +* Reducing observability costs through intelligent data distillation and retention. +* Eliminating the need to maintain hundreds of manually created dashboards. + +### Grafana AI Observability + +Grafana AI Observability is a comprehensive solution designed to monitor and optimize generative AI applications. 
It captures observability signals and visualizes real-time performance of your GenAI stack from LLMs to vector databases, correlating usage data for full-stack AI observability. + +**Common Use Cases:** +* Monitoring Large Language Model (LLM) performance, token usage, and costs in production environments. +* Tracking user interactions with AI applications to understand usage patterns and model behavior. +* Analyzing vector database query response times and throughput for efficient data retrieval. + +### Machine Learning + +Grafana Machine Learning is an AI/ML toolkit integrated into Grafana Cloud that provides intelligent anomaly detection, forecasting, and automated insights for observability data. It includes dynamic alerting with metric forecasting, outlier detection for groups of similar resources, and AI-powered root cause analysis capabilities through Sift investigations. + +**Common Use Cases:** +* Detecting anomalous behavior in groups of similar Kubernetes pods or services using outlier detection algorithms. +* Creating predictive alerts for capacity planning and resource management through metric forecasting and anomaly detection. +* Automating root cause analysis during incidents with AI-powered investigations and contextual insights. + +### Fleet Management + +Grafana Fleet Management is a centralized service for managing fleets of telemetry collectors and configuration pipelines across distributed infrastructure. It provides remote configuration management, pipeline creation, and monitoring capabilities for Grafana Alloy collectors, enabling streamlined observability data collection at scale. + +**Common Use Cases:** +* Centrally managing and configuring multiple Grafana Alloy collectors deployed across different environments and locations. +* Creating and deploying standardized telemetry collection pipelines to ensure consistent data gathering across infrastructure. 
+* Monitoring collector health and performance while troubleshooting data collection issues from a unified interface. +