Disk IOPS Monitor

Alert when a drive's read and write operations per second exceed a threshold for a sustained period. Configure drive scope, IOPS threshold, breach duration, remediations, and notifications.

Introduction

Alert when a drive's input/output operations per second climb above a threshold and stay there. The Disk IOPS monitor catches sustained disk thrashing, the kind that makes a machine feel slow even when CPU and memory look fine.

It's one of five Disk IO monitors, alongside throughput, latency, active time, and queue length. Each measures a different dimension of disk activity. IOPS counts how many read and write operations the drive is handling per second, regardless of how large each operation is.
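To make the difference between operation count and data volume concrete, here is a small arithmetic sketch. The workloads and operation sizes are illustrative assumptions, not measurements from Level, and `throughput_mb_per_sec` is a hypothetical helper.

```python
def throughput_mb_per_sec(iops: float, op_size_kb: float) -> float:
    """Approximate throughput as operations per second times average operation size."""
    return iops * op_size_kb / 1024


# Hypothetical workloads, chosen only to illustrate the relationship.
small_random_reads = throughput_mb_per_sec(iops=8000, op_size_kb=4)   # many tiny operations
large_file_copy = throughput_mb_per_sec(iops=200, op_size_kb=1024)    # few large operations

print(f"8,000 IOPS at 4 KB each: {small_random_reads:.2f} MB/s")
print(f"200 IOPS at 1 MB each:   {large_file_copy:.2f} MB/s")
```

The first workload handles 40 times as many operations per second yet moves less than a sixth of the data, which is why IOPS and throughput are separate monitors.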


How the Disk IOPS Monitor Works

Level measures the combined read and write operations per second on each selected drive. When a drive's IOPS exceeds your threshold and stays above it for the full breach duration, Level creates an alert.

The breach duration filters out normal bursts. A drive spiking during a file copy or an update isn't a problem. A drive pinned at high IOPS for 10 minutes usually means something is hammering the disk: a backup job, an AV scan, a runaway database query, or a misbehaving process.
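The filtering logic can be sketched as follows. This is not Level's implementation: it assumes one IOPS sample per minute and assumes that any dip to or below the threshold resets the timer. `Sample` and `first_breach_minute` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Sample:
    minute: int   # minutes since monitoring started
    iops: float   # combined read and write operations per second


def first_breach_minute(samples: Iterable[Sample], threshold: float,
                        breach_minutes: int) -> Optional[int]:
    """Return the minute at which IOPS has stayed above the threshold for the
    full breach duration, or None if that never happens."""
    breach_started: Optional[int] = None
    for s in samples:
        if s.iops > threshold:
            if breach_started is None:
                breach_started = s.minute
            if s.minute - breach_started >= breach_minutes:
                return s.minute
        else:
            breach_started = None   # assumption: any dip resets the timer
    return None


# A three-minute burst (for example, a file copy) versus a sustained load.
burst = [Sample(m, 9000 if 3 <= m <= 5 else 100) for m in range(15)]
sustained = [Sample(m, 9000 if m >= 4 else 100) for m in range(20)]

print(first_breach_minute(burst, threshold=5000, breach_minutes=10))
print(first_breach_minute(sustained, threshold=5000, breach_minutes=10))
```

With a 10-minute breach duration, the burst never produces an alert, while the sustained load does once it has held for the full 10 minutes.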

ℹ️ NOTE: The monitor evaluates on the device itself, not on Level's backend. The device needs to be online for the alert to fire. New or edited monitors reach online devices near-instantly; offline devices pick up the change when they reconnect.


Configuring the Disk IOPS Monitor

Open the target monitor policy and click Add monitor. The Add new monitor dialog opens.

Name and Type

  1. Enter a name in the Name field. "Servers - Disk Thrashing" or "SQL Hosts - High IOPS" reads better in an alert list than "Disk IOPS."

  2. Set Type to Disk IOPS. The dialog shows the monitor's description: alerts when total input/output operations per second on the selected drives breaches the threshold for the configured duration.

Severity

Set Severity to match how urgent sustained high IOPS is in this context:

  • Information

  • Warning

  • Critical

  • Emergency

💡 TIP: Warning is a sensible default. High IOPS is often a symptom worth investigating rather than an outage. Reserve Critical for hosts where disk contention directly degrades a production service, like database or file servers.

Drives

Drives controls which drives the monitor evaluates:

  • Any drive: monitor every drive on the device

  • System disk: monitor only the device's primary system drive

💡 TIP: System disk is useful when secondary drives are expected to run hot, like backup targets or scratch volumes. You only get alerted when the OS drive itself is saturated.
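As the FAQ notes, each drive is evaluated against the threshold independently. The sketch below shows how the two scopes might differ in practice. The scope identifiers `"any_drive"` and `"system_disk"`, the drive letters, and the function name are hypothetical and only illustrate the idea.

```python
from typing import Dict, List


def drives_over_threshold(iops_by_drive: Dict[str, float], threshold: float,
                          scope: str, system_drive: str) -> List[str]:
    """Return the in-scope drives whose own IOPS exceeds the threshold."""
    if scope == "system_disk":
        in_scope = {d: v for d, v in iops_by_drive.items() if d == system_drive}
    elif scope == "any_drive":
        in_scope = dict(iops_by_drive)
    else:
        raise ValueError(f"unknown scope: {scope!r}")
    # Each drive is compared on its own; values are never summed across drives.
    return sorted(d for d, v in in_scope.items() if v > threshold)


# Hypothetical readings: a quiet OS drive and a busy backup target.
readings = {"C:": 1200.0, "D:": 6400.0}

print(drives_over_threshold(readings, 5000, "any_drive", system_drive="C:"))
print(drives_over_threshold(readings, 5000, "system_disk", system_drive="C:"))
```

With Any drive, the busy backup target alone is enough to count as a breach. With System disk, the same readings produce nothing because the OS drive is well under the threshold.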

Threshold

Threshold sets the operations per second value that must be exceeded to trigger the monitor. Adjust using the up/down arrows or type a value directly. The unit is ops/sec. Enter a value that fits the hardware you're monitoring.

💡 TIP: Starting points that work in practice:

  • 150 ops/sec for devices on spinning hard drives. HDDs physically top out around 100 to 200 IOPS, so sustained activity at that level means the drive is saturated.

  • 5,000 ops/sec for SSD-backed workstations. Sustained IOPS that high on a workstation usually means a runaway process, AV scan, or sync client gone wrong.

  • 20,000 ops/sec for SSD or NVMe servers running databases or file shares.

Then tune: if the monitor fires during normal load, raise the threshold; if a known thrashing event doesn't trip it, lower it.

ℹ️ NOTE: A threshold tuned for an HDD file server will fire constantly on a busy NVMe workstation, and one tuned for NVMe will never fire on an HDD. Split mixed hardware across separate policies (or separate monitors with different thresholds and severities) rather than hunting for one number that covers both.
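If you have baseline IOPS readings for a device, one way to turn "comfortably above normal" into a number is to take a high percentile of the baseline and add headroom. The sketch below assumes you collected the samples yourself. The 95th percentile, the 1.5x headroom factor, and the `suggest_threshold` helper are illustrative assumptions, not Level recommendations.

```python
import math
from typing import Sequence


def suggest_threshold(baseline_iops: Sequence[float], percentile: float = 95.0,
                      headroom: float = 1.5) -> int:
    """Suggest a threshold: a high percentile of observed IOPS times a headroom factor."""
    if not baseline_iops:
        raise ValueError("need at least one baseline sample")
    ordered = sorted(baseline_iops)
    # Nearest-rank percentile: the smallest sample with at least
    # `percentile` percent of all samples at or below it.
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return math.ceil(ordered[rank - 1] * headroom)


# Hypothetical baseline readings (ops/sec) from typical working hours.
baseline = [120, 150, 180, 200, 220, 250, 300, 400, 900, 2500]

print(suggest_threshold(baseline))
```

Treat the result as a first guess and tune it the same way as any other starting point: raise it if it fires during normal load, lower it if a known thrashing event slips through.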

Breach Duration

Breach duration sets how long IOPS must stay above the threshold before an alert fires. Adjust using the slider or up/down arrows. Range is 1 to 120 minutes.

💡 TIP: 5 to 10 minutes filters out routine bursts like file transfers and update installs while still catching sustained contention. Shorter durations make sense on latency-sensitive hosts where even a few minutes of saturation hurts.


Remediation

Attach an automation to run when this alert fires: restart a service, capture diagnostics, or notify your team.

  1. Click the Remediation field and select an automation. This is optional.

  2. Use the link icon to open the selected automation, the eye icon to preview it, and the × to remove it.

Once attached, open the automation to assign the monitor's payload to an automation variable if you want to pass alert context into the automation's logic.

Notify Recipients

Notify recipients sends emails to the policy's recipients when the selected events occur:

  • On alert creation

  • On alert resolution

Auto-Resolve

Auto-resolve alert when conditions clear closes the alert automatically once IOPS drops back below the threshold. Leave it off if you want alerts to persist for manual review.

ℹ️ NOTE: If you manually resolve an alert while the device is still over the threshold, Level doesn't recreate it. The alert fires again only after IOPS drops below the threshold and then breaches it again.
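The resolve and re-fire behavior can be modeled as a small state machine. This is a sketch under assumptions, not Level's implementation: it evaluates once per minute, and it assumes no second alert is created while one is still open. `AlertTracker` and its fields are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AlertTracker:
    breach_minutes: int
    auto_resolve: bool = True
    minutes_above: int = 0
    alert_open: bool = False
    armed: bool = True   # False after an alert fires, until IOPS drops below the threshold
    events: List[str] = field(default_factory=list)

    def observe(self, above_threshold: bool) -> None:
        """Record one per-minute evaluation."""
        if above_threshold:
            self.minutes_above += 1
            sustained = self.minutes_above >= self.breach_minutes
            # Assumption: never create a duplicate while an alert is open.
            if self.armed and not self.alert_open and sustained:
                self.alert_open = True
                self.armed = False
                self.events.append("created")
        else:
            self.minutes_above = 0
            self.armed = True   # dropping below the threshold re-arms the monitor
            if self.alert_open and self.auto_resolve:
                self.alert_open = False
                self.events.append("auto-resolved")

    def resolve_manually(self) -> None:
        if self.alert_open:
            self.alert_open = False
            self.events.append("manually resolved")


tracker = AlertTracker(breach_minutes=3)
for _ in range(3):
    tracker.observe(True)    # sustained breach creates the alert
tracker.resolve_manually()   # resolved while still over the threshold
for _ in range(5):
    tracker.observe(True)    # still over the threshold: nothing is recreated
tracker.observe(False)       # IOPS drops, monitor re-arms
for _ in range(3):
    tracker.observe(True)    # a fresh sustained breach creates a new alert
tracker.observe(False)       # conditions clear, auto-resolve closes it

print(tracker.events)
```

The event log shows the alert being created, manually resolved, created again only after the drop and a fresh breach, and finally auto-resolved.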


FAQ

  • What's a reasonable IOPS threshold to start with? Depends on the drive. Start at 150 ops/sec for HDDs, 5,000 for SSD workstations, and 20,000 for SSD/NVMe servers, then tune from there. If you're unsure what's normal for a device, watch its baseline IOPS during typical hours first and set the threshold comfortably above it.

  • What's the difference between IOPS and throughput? IOPS counts operations per second regardless of size. Throughput measures the volume of data moved. A drive doing thousands of tiny random reads can have high IOPS with low throughput, and a single large file copy can do the opposite. Monitor the one that matches the failure mode you care about, or use both.

  • My IOPS alert fires every night at the same time. What's going on? That's almost certainly a scheduled job: backups, AV scans, or indexing. You can raise the threshold, extend the breach duration past the job's runtime (up to the 120-minute maximum), or accept the alert as confirmation the job ran. If the device is in maintenance mode during the window, monitor alerts are suppressed entirely.

  • With Any drive selected, does the threshold apply per drive or to all drives combined? Per drive. Each drive is evaluated against the threshold independently, and any one of them breaching for the full duration creates the alert.

  • Can I set different thresholds for my database servers and workstations? Yes. Create separate monitor policies targeting different tags and configure thresholds independently. You can also add multiple Disk IOPS monitors to one policy with different thresholds and severities.

  • What happens to open alerts if I delete the monitor? Existing alerts remain. Deleting a monitor doesn't close alerts it already created. Resolve those manually.
