HDDS-15396. Grafana dashboard for disk balancer #10609

Open
navinko wants to merge 2 commits into apache:master from navinko:HDDS-15396

navinko (Contributor) commented Jun 25, 2026

What changes were proposed in this pull request?

Created a new Grafana dashboard for the disk balancer.
It uses the Prometheus metrics that are already exposed for the disk balancer.

Please describe your PR in detail:

The dashboard tracks volume_info_metrics plus the metrics shown in the image below.

[image: additional metrics used by the dashboard; not captured in this text]

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-15396

How was this patch tested?

Simulated a skewed disk on a datanode on my Mac, triggered the disk balancer through the CLI using docker-compose, then captured dashboard screenshots and validated the panels.
The JSON used for that test, "Ozone - Disk Balancer.json", is the same file included in this PR.

Screenshots for reference:

[Screenshots: 2026-06-25 at 8:52:53 PM, 8:53:28 PM, 8:54:25 PM, 8:55:05 PM, and 9:01:48 PM, plus one untitled image]

jojochuang (Contributor) commented:

The metric names need to be updated, because the volume info metric class produces metric names like this:

volume_info_metrics_data_12_hadoop_ozone_datanode_data_vadiraja_reserved{context="ozone",storagetype="DISK",datanodeuuid="8f6f944b-77b3-4a95-91da-c9d9b51d773b",volumetype="DATA_VOLUME",storagedirectory="/data/12/hadoop-ozone/datanode/data/vadiraja/hdds",volumestate="NORMAL",hostname="drc6-oz-worker07.sjc.cloudera.com"}

where the volume root directory path becomes part of the metric name.
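
As a rough illustration of the naming shown above, the following Python sketch rebuilds the sample metric name from a volume root path. The helper `sketch_metric_name`, the choice of volume root (the sample's `storagedirectory` without its trailing `hdds` component), and the sanitizing rule (collapse every run of non-alphanumeric characters into one underscore) are all assumptions inferred from this single sample. None of it is taken from the Ozone source.

```python
import re


def sketch_metric_name(volume_root: str, suffix: str) -> str:
    """Hypothetical helper: imitate the naming visible in the sample metric."""
    # Assumed rule: each run of non-alphanumeric characters becomes one
    # underscore, and leading/trailing underscores are dropped.
    sanitized = re.sub(r"[^0-9A-Za-z]+", "_", volume_root).strip("_")
    return f"volume_info_metrics_{sanitized}_{suffix}"


# Volume root assumed from the sample's storagedirectory label, minus "/hdds".
name = sketch_metric_name("/data/12/hadoop-ozone/datanode/data/vadiraja", "reserved")
print(name)
```

Because the path segment differs per cluster and per volume, any dashboard query that hard-codes that segment is tied to one directory layout.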

"to": "now"
},
"timezone": "browser",
"title": "Ozone Disk Balancer Operations Matrix",

Contributor

"Ozone Disk Balancer Operations"

Contributor Author

updated

"datasource": { "type": "prometheus", "uid": "${datasource}" },
"targets": [
{
"expr": "{__name__=~\"volume_info_metrics_data_disk[0-9]+_ozone_used\", instance=~\"$datanode.*\"}",

Contributor

Suggested change
- "expr": "{__name__=~\"volume_info_metrics_data_disk[0-9]+_ozone_used\", instance=~\"$datanode.*\"}",
+ "expr": "{__name__=~\"volume_info_metrics_.*_ozone_used\"}",

Contributor Author

Thanks @jojochuang for reviewing.
Applied the suggestions to the panels with id 2 and 3.
[Images: two untitled images and a screenshot from 2026-06-28 at 10:26:44 AM]
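
A quick way to sanity-check the pattern change discussed in this thread is to run both regular expressions against candidate metric names. The Python sketch below is only an approximation: Prometheus anchors regex matchers at both ends, which `re.fullmatch` imitates for simple patterns like these. Both candidate names are hypothetical. The first follows the `data_disk<N>` layout that the original pattern expects, and the second combines the path segment from the reviewer's sample with an assumed `_ozone_used` suffix. The sketch checks the `__name__` pattern only and ignores the `instance` label matcher.

```python
import re

# Patterns copied from the review thread.
ORIGINAL = r"volume_info_metrics_data_disk[0-9]+_ozone_used"
SUGGESTED = r"volume_info_metrics_.*_ozone_used"

# Hypothetical metric names, used only for illustration.
CANDIDATES = [
    "volume_info_metrics_data_disk1_ozone_used",
    "volume_info_metrics_data_12_hadoop_ozone_datanode_data_vadiraja_ozone_used",
]


def selects(pattern: str, metric_name: str) -> bool:
    """Return True if the fully anchored pattern matches the metric name."""
    return re.fullmatch(pattern, metric_name) is not None


# Map each name to (matched by original pattern, matched by suggested pattern).
results = {
    name: (selects(ORIGINAL, name), selects(SUGGESTED, name))
    for name in CANDIDATES
}

for name, (by_original, by_suggested) in results.items():
    print(f"{name}: original={by_original} suggested={by_suggested}")
```

Under these assumptions, the original pattern selects only the `data_disk<N>` name, while the suggested pattern selects both.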

navinko requested a review from jojochuang on June 28, 2026 05:16