Disk SMART Status Monitor

Detect failing drives before they die. The SMART Status monitor watches every drive's SMART self-assessment on Windows, macOS, and Linux and alerts the moment any drive reports a failing status.

Introduction

A drive that's about to fail usually knows it first. SMART (Self-Monitoring, Analysis, and Reporting Technology) is built into modern hard drives and SSDs to predict imminent failure, and the SMART Status monitor surfaces that prediction in Level.

The monitor checks every drive on a device and creates an alert the moment any drive reports a failing SMART self-assessment. It works on Windows (including Windows Server), macOS, and Linux, and there's nothing to configure: no thresholds, no drive selection, no tuning.

This is the monitor that turns "the server won't boot" into "swap this drive during Tuesday's maintenance window."


How It Works

SMART Status is a state monitor. On each check, the agent asks every drive on the device for its overall SMART self-assessment: healthy or failing. If one or more drives report failing, Level creates an alert.

The self-assessment is the drive's own pass/fail verdict, calculated by the manufacturer's firmware against its internal thresholds. The monitor reports that verdict directly. It doesn't track individual SMART attributes like reallocated sectors or temperature.

ℹ️ NOTE: A drive can be degrading and still report healthy right up until it crosses the manufacturer's failure threshold. That's how SMART works by design. Treat a failing status as urgent, but don't treat a healthy status as a guarantee.

Drives that don't support SMART, can't be read, or return an error are recorded as Unknown. Unknown never triggers an alert.
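The evaluation rule described above can be sketched in a few lines of Python. This is an illustration only, not Level's agent code: the `SmartState` enum, the function names, and the device paths are hypothetical, chosen to mirror the three states the article describes.

```python
from enum import Enum


class SmartState(Enum):
    """The three states described in this article (names are illustrative)."""
    HEALTHY = "healthy"
    FAILING = "failing"
    UNKNOWN = "unknown"


def failing_drives(drive_states):
    """Return the device paths whose self-assessment is failing."""
    return [path for path, state in drive_states.items()
            if state is SmartState.FAILING]


def uncovered_drives(drive_states):
    """Return the device paths recorded as Unknown (no SMART coverage)."""
    return [path for path, state in drive_states.items()
            if state is SmartState.UNKNOWN]


def should_alert(drive_states):
    """Alert when one or more drives report failing; Unknown never triggers."""
    return len(failing_drives(drive_states)) > 0


# Hypothetical device with one failing drive and one unreadable drive.
example = {
    "/dev/sda": SmartState.HEALTHY,
    "/dev/sdb": SmartState.FAILING,
    "/dev/sdc": SmartState.UNKNOWN,
}
```

With this example, `should_alert(example)` is true because of `/dev/sdb`, while `/dev/sdc` contributes nothing either way. A device whose drives are all Unknown would never alert, which is why the Unknown list is worth reviewing separately.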

⚠️ WARNING: Unknown drives are silent. The absence of a SMART alert doesn't prove every disk on the device is healthy. USB enclosures and drives behind some RAID controllers commonly report as Unknown or don't appear at all.


Adding the Monitor

  1. Open your monitor policy and click + Add new monitor.

  2. Enter a Name. Something like "Disk SMART Status" works; include the device class if you run separate policies (e.g., "Servers - SMART Status").

  3. Set Type to SMART status.

  4. Set Severity: Information, Warning, Critical, or Emergency.

💡 TIP: A failing SMART status means a drive is predicting its own death. For servers, Critical is usually the right call. For workstations where the data is backed up, Warning may be enough.

That's the whole configuration. The panel confirms it: the monitor watches all drives reporting SMART data, and there are no thresholds or drive selections to set.


Remediation

Run an automation when this alert fires: create a ticket, page the on-call tech, or kick off a backup of the affected device while the drive still works.

  1. Click Select an automation in the Remediation field and choose one.

  2. To remove the automation later, click the × button to clear the selection.


Notify Recipients

Two checkboxes control email to the policy's recipients:

  • On alert creation: sends an email when the alert fires

  • On alert resolution: sends an email when the alert resolves

Both are off by default. Recipients are managed at the monitor policy level, in the Recipients section. If the policy has no recipients, no email is sent regardless of these checkboxes.


Auto-Resolve

The "Auto-resolve alert when conditions clear" option closes the alert automatically if the device stops reporting a failing drive.

A failing drive rarely "gets better"; the condition usually clears because the drive was replaced or stopped responding entirely. Leaving auto-resolve off keeps the alert open until a technician confirms what happened.


What the Alert Includes

The alert payload lists each failing drive's device path, plus its model and serial number when available. That's usually enough to identify the physical drive without touching the machine.

💡 TIP: Pass the payload into your remediation automation as an automation variable to include the drive model and serial in the ticket or notification it creates.


Platform Behavior

🖥️ PLATFORM NOTE:

  • Windows: Reads drive health through the Windows Storage Management provider. Works on Windows 8 and later, and Windows Server 2012 and later. A drive reporting a Warning or Unhealthy health status counts as failing.

  • macOS: Reads the SMART status reported by macOS for SATA and NVMe drives. A drive counts as failing when macOS reports its SMART status as "Failing" rather than "Verified."

  • Linux: Queries drives directly, with no dependency on smartctl or other tools. ATA drives are checked via their SMART return status; NVMe drives via the controller's health log (a non-zero critical warning counts as failing). Virtual and removable devices (loop, ram, dm-, md, sr, vd) are skipped.
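The per-platform rules above can be summarized as a small Python sketch. It is not the agent's implementation: the function names are hypothetical, the Linux skip list is assumed to match on device-name prefixes, and treating any macOS value other than "Verified" or "Failing" as unknown is also an assumption.

```python
# Device-name prefixes the article lists as skipped on Linux.
# Assumption: the agent matches these as name prefixes.
SKIPPED_LINUX_PREFIXES = ("loop", "ram", "dm-", "md", "sr", "vd")


def is_skipped_linux_device(name):
    """True for virtual or removable device names such as loop0 or sr0."""
    return name.startswith(SKIPPED_LINUX_PREFIXES)


def nvme_is_failing(critical_warning):
    """NVMe: a non-zero critical warning value counts as failing."""
    return critical_warning != 0


def windows_is_failing(health_status):
    """Windows: a Warning or Unhealthy health status counts as failing."""
    return health_status in ("Warning", "Unhealthy")


def macos_state(smart_status):
    """macOS: map the reported SMART status string to a monitor state."""
    if smart_status == "Failing":
        return "failing"
    if smart_status == "Verified":
        return "healthy"
    return "unknown"  # assumption: any other value is treated as Unknown
```

For example, `is_skipped_linux_device("loop0")` is true and `is_skipped_linux_device("nvme0n1")` is false, so a physical NVMe drive is checked while a loop device is ignored.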

Like other monitors, an individual SMART Status monitor can be scoped to a single OS, even though the monitor type supports all three platforms.

ℹ️ NOTE: On some Apple Silicon Macs, SMART reporting for internal storage is limited and may show as Unknown. Verify coverage on your M-series devices before relying on this monitor for them.


FAQ

  • My drive died but the monitor never fired. Why didn't it warn me? SMART is a prediction, not a promise. Some failures (controller death, sudden electronic failure) happen without the drive ever crossing its self-assessment threshold. SMART catches gradual mechanical and media degradation best.

  • Can I see SMART attributes like reallocated sectors or temperature? Not with this monitor. It reports the drive's overall pass/fail self-assessment only. For attribute-level detail, use a Run Script monitor with a tool like smartctl.

  • Why doesn't my external USB drive show up? Many USB enclosures don't pass SMART data through to the host. Those drives report as Unknown, which never triggers an alert. The same applies to drives behind some hardware RAID controllers.

  • A drive on one of my devices shows Unknown. Is that a problem? Not necessarily. Unknown means the drive doesn't support SMART, sits behind hardware that blocks SMART queries, or returned an error. It does mean that drive has no SMART coverage, so plan monitoring for it another way if it matters.

  • The alert fired. How urgent is this really? Urgent. A failing self-assessment means the drive's own firmware predicts imminent failure. Back up the data and schedule a replacement; don't wait for a second opinion from the drive.

  • Does this work on Windows Server? Yes. Detection works the same on Windows Server 2012 and later as it does on desktop Windows 8 and later.
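For the attribute-level question above, here is a minimal Python sketch of what a smartctl-based script might look like. It assumes smartmontools 7.0 or later is installed (for JSON output), that the script runs with enough privileges to query the drive, and that the drive is an ATA device. The device path `/dev/sda` and the function names are placeholders, and how a Run Script monitor should interpret the output or exit code is not covered here.

```python
import json
import subprocess


def read_report(device):
    """Run smartctl and return its JSON report as a dict."""
    # smartctl's exit status is a bitmask, so a non-zero code is not
    # necessarily an execution error; do not raise on it.
    result = subprocess.run(
        ["smartctl", "--json", "-A", device],
        capture_output=True, text=True, check=False,
    )
    return json.loads(result.stdout)


def attribute_raw_values(report):
    """Map ATA attribute names to raw values; empty if the table is absent."""
    table = report.get("ata_smart_attributes", {}).get("table", [])
    return {entry["name"]: entry["raw"]["value"] for entry in table}


def current_temperature(report):
    """Return the reported temperature in Celsius, or None if missing."""
    return report.get("temperature", {}).get("current")


if __name__ == "__main__":
    report = read_report("/dev/sda")  # placeholder device path
    attributes = attribute_raw_values(report)
    print("Reallocated sectors:", attributes.get("Reallocated_Sector_Ct"))
    print("Temperature (C):", current_temperature(report))
```

Attribute names vary by drive vendor, so a script like this should treat a missing attribute as "not reported" rather than as zero.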
