Monitoring & Alerting

Level offers a range of predefined monitors for system resources and services, as well as script-based monitors for advanced customization.

Updated over 4 months ago

Set policies to monitor and alert you about specific endpoint issues

Monitor policies allow you and your team to keep an eye on the health and well-being of your devices. Create a monitor policy and target it at tags assigned to your devices. Once a monitor policy has been created and assigned to one or more tags, monitoring a device is as simple as adding one of those tags to it.




Monitor policies

In order to start receiving alerts regarding the state of your devices, a monitor policy must first be created. Monitor policies can be found under Policies > Monitor. This page will show the list of policies. From here policies can be edited or created.

The monitor policy list contains total monitor counts and targeted devices

The monitor policy page gives you the ability to see all of the policies you currently have as well as the devices they currently monitor through their assigned tags. See tags for more details on Level's dynamic tagging system.

A policy contains many monitors to keep an eye on device health

A policy contains a few modifiable parameters:

  1. Recipients: Email addresses for alert notifications.

  2. Targets: Assign one or more tags here and all the devices assigned to the tags will receive the Monitor Policy.

  3. Monitors: One or more device attributes to be watched. When a threshold is exceeded, an alert is triggered.

A monitor policy can have many monitors attached, even multiple of the same type. This gives you complete granular control over thresholds, severity, and which monitors you and your team actually need to be alerted on.


Monitors

Monitors allow you to fine-tune what is watched on a device and how alerts behave.

Monitors can be fine-tuned to meet your needs

A monitor gives you control over several parameters:

  1. Monitor Name: A descriptive name for the monitor, e.g., "Windows Reboot Required."

  2. Monitor Type: Defines the type of monitoring performed. In the example, "Run Script" is selected, but there are other types as well.

  3. Severity: Set the importance level for the alert, e.g., Informational, Warning, Critical, or Emergency.

  4. Type-Specific Parameters: These will change depending on the selected Monitor Type. For example, when "Run Script" is chosen, you can specify the script to run, check the script output, and set the trigger conditions.

  5. Value: Set the value that the monitor checks to trigger an alert. In the example, the script checks if the output contains "ALERT."

  6. Auto-Resolve Alert: Toggle whether alerts should automatically resolve once the threshold is no longer exceeded.

  7. Remediation Automation (optional): Assign an automation for remediation, e.g., "Ask User To Reboot." Automations will soon replace remediation scripts.

  8. Run Script (optional): Select a script to run automatically when the monitor is triggered. (Scripts will be deprecated in an upcoming release.)

  9. Send Notification on Alert: Enable sending notifications when the threshold is breached.

  10. Send Notification on Resolution: Enable sending notifications when the alert is resolved.

Note: Remediation scripts will be deprecated in an upcoming release. Use remediation automations instead.


Monitor types

The following monitor types are available:

  1. CPU: Monitor CPU usage.

    • Measured in percent used

  2. Connection: Monitor when an agent is unable to check in with Level servers. When Level sees no check-in, the device is considered offline. This can be caused by an internet/network outage, a system problem, a power event, a reboot, etc.

    • Measured in minutes offline

  3. Disk Usage: Monitor disk usage. You can choose to monitor only the system drive or all drives.

    • Measured in either GB free or percent free

  4. Memory Usage: Monitor memory/RAM usage.

    • Measured in percent used

  5. Process: Monitor if a process is running or not running.

    • The exact process name is needed

  6. Service: Monitor if a service is running or stopped.

    • The service name (not Display Name) is needed

  7. Run Script: Run a script and evaluate the returned output.

    • Learn more about script-based monitors below.


Best practices and recommendations

  1. While a monitor policy can have many monitors, do not attempt to cram all monitors into a single policy.

  2. Split out application monitoring from resource monitoring. For example, it's typically best to monitor CPU, hard drives and memory in one policy and services and processes in a different policy.

  3. Use monitor policies modularly based on roles. For example, create monitor policies for domain controllers, file servers, Exchange servers, etc. In those policies, only monitor the processes and services specific to that role.

  4. Assign tags to devices that are also specific to roles, and when possible, use the same name for monitor policies and their tags. For example, if you have a tag called Domain Controllers that you have assigned to all domain controllers, then a Monitor Policy called Domain Controllers (which only monitors domain controller functions) is an obvious pairing.

  5. Only leave auto-resolve unchecked if you want a technician to investigate the root cause of an alert. If auto-resolve is unchecked, a tech must manually clear the alert, which may create unnecessary administrative overhead.


Script-Based Monitors

Monitor anything with a script

Level has powerful built-in monitor types; however, there are hundreds, even thousands, of device characteristics that an IT team might want to monitor. To cover these, we have created a custom monitor type called "Run Script". When this type is selected, you choose a script for Level to run on the devices, and Level evaluates whether a metric is outside a healthy threshold.

A script can be a one-liner or a complex series of health checks. Regardless of its size, a script-based monitor simply reads the console output of the script and evaluates whether the device is in an error state. The error state is determined by the match condition set on the monitor, which can be a numerical comparison or a check for whether a string is present.
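As a minimal sketch of the one-liner case, assuming a Linux or macOS device with a POSIX `df`, the Bash script below prints a single number that a monitor could compare against a threshold. The helper name `percent_used` and the choice of the root filesystem are illustrative assumptions, not taken from Level's documentation.

```shell
#!/usr/bin/env bash
# Print the percentage of space used on a filesystem as a bare integer,
# so a script-based monitor can compare it numerically against a threshold.

# percent_used (hypothetical helper) reads `df -P` style output on stdin and
# prints the Capacity column of the first data row without the trailing "%".
percent_used() {
  awk 'NR == 2 { gsub(/%/, "", $5); print $5 }'
}

# Illustrative target: the root filesystem.
df -P / | percent_used
```

Because the script prints only the number, the decision about what counts as unhealthy stays in the monitor's comparison rather than in the script.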



First Create a Script

Before creating a script-based monitor, first create the script that will query for the information in question. Any of the supported scripting languages can be used (PowerShell, Bash, Zsh, Osquery, etc.).

Pro Tip: Osquery works on Windows, Mac, and Linux, so a single script might be able to cover all operating systems! Check if there is an Osquery table that contains what you're looking for.

Ideally, the script will query for a specific value or state that Level can use to determine if the monitor item is in a healthy state or not. For example, if we want to query the state of the Windows firewall, we can run the following PowerShell command.

PS C:\> get-netfirewallprofile | select Enabled

Enabled
-------
True
True
True

In this case we see that each of the 3 Windows Firewall profiles (Domain, Private, and Public) is enabled and displays "True". If any value were "False", this would indicate that the firewall is disabled on a profile. Now we can paste this command into a Level script.

Here is the command in a script


Create the Monitor

Either create a monitor policy or edit an existing monitor policy. Add a new monitor to the policy and give it a useful name (this name will appear in the alert). In our firewall example we will use "Windows Firewall is disabled". Choose the type "Run Script" and set the desired severity for your alerts. Select the script that was created, in our example, it's called "Check Windows Firewall".

In the "Script output" dropdown there are several options that can be used for matching the script output.

In our case we want to match if the script output contains "False", so we choose "Contains" from the dropdown and enter "False" as the value. Greater than and less than are useful for comparing a numeric value to the threshold value.
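The matching described above can be pictured with a small Bash model. This is a hypothetical sketch, not Level's implementation: the function names `output_matches` and `is_integer` and the operator labels are made up for illustration, and it assumes numeric comparisons are made on a bare whole number.

```shell
#!/usr/bin/env bash
# Hypothetical model of the "Script output" match; not Level's implementation.

# is_integer succeeds when its argument is a non-empty string of digits.
is_integer() {
  case $1 in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Usage: output_matches <operator> <script output> <monitor value>
# Returns 0 when the output matches (alert condition), 1 when it does not.
output_matches() {
  local operator="$1" output="$2" value="$3"
  case $operator in
    contains)
      # Substring test against the entire captured output.
      case $output in
        *"$value"*) return 0 ;;
        *) return 1 ;;
      esac
      ;;
    greater_than)
      is_integer "$output" && is_integer "$value" && [ "$output" -gt "$value" ]
      ;;
    less_than)
      is_integer "$output" && is_integer "$value" && [ "$output" -lt "$value" ]
      ;;
    *)
      echo "unknown operator: $operator" >&2
      return 2
      ;;
  esac
}

# Example: one firewall profile reports False, so "Contains False" matches.
firewall_output=$(printf 'True\nFalse\nTrue\n')
if output_matches contains "$firewall_output" "False"; then
  echo "alert condition met"
fi
```

In this model a single "False" anywhere in the output is enough for "Contains" to match, which is why the firewall script only needs to print the three profile states.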

The "Run frequency" will determine how frequently the script is run (in minutes or hours). We caution that if a script is run too frequently it could cause performance issues on the device.

The trigger count determines how many consecutive runs must match the value before an alert is triggered.
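One way to picture the trigger count is the Bash model below. It is a hypothetical sketch rather than Level's implementation: the function name `should_alert` is invented, and it assumes that a run which does not match resets the streak of consecutive matches.

```shell
#!/usr/bin/env bash
# Hypothetical model of the trigger count; not Level's implementation.
# Usage: should_alert <trigger_count> <run result>...
# Each run result is "match" or "ok", oldest first.
# Prints "alert" once <trigger_count> consecutive matches have been seen,
# otherwise "no alert". Assumes a non-matching run resets the streak.
should_alert() {
  local trigger_count="$1" streak=0 result
  shift
  for result in "$@"; do
    if [ "$result" = match ]; then
      streak=$((streak + 1))
    else
      streak=0
    fi
    if [ "$streak" -ge "$trigger_count" ]; then
      echo "alert"
      return 0
    fi
  done
  echo "no alert"
}

should_alert 3 match match ok match   # prints "no alert"
should_alert 3 ok match match match   # prints "alert"
```

Under this assumption, a higher trigger count filters out brief spikes at the cost of a slower alert, since the delay is roughly the run frequency multiplied by the trigger count.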

All remaining settings are the same as what can be found in the standard monitor types.


The Results

Now that the monitor is in place, the script will begin running at the set frequency on all targeted devices. If the value is ever matched on a device, then an alert will be generated. If you would like to test the monitor on the targeted devices in order to see the returned output of the script, then open the monitor and select "Test Script". This will open a new window with a new job that already has the script and devices selected. Simply press "Execute Script" and it will run. Press the expand icon at the far right of the device list to see the live output.

An alert with the warning threshold


Best practices and recommendations

  1. Where possible, keep the logic that determines whether a device is in a healthy or unhealthy state in the monitor. For example, query for a numerical value or a state (true/false, yes/no, enabled/disabled) and then choose what to match in the monitor.

  2. In complex monitors where many things need to be checked, the logic can reside in the script. Consider printing a message such as "Alert" to the console and then exiting the script with a failure (exit 1). In the monitor, simply choose Contains "Alert" to trigger the alert.

  3. A Script-based monitor evaluates all the output of a script, not just the first or last line. Keep this in mind when considering what to output to the console.
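The second and third recommendations can be sketched as a Bash script along these lines. The two checks, the names `run_checks` and `report`, and the arguments `sh` and `/tmp` are hypothetical placeholders; real checks would depend on the role being monitored.

```shell
#!/usr/bin/env bash
# Sketch of a multi-check script that keeps the health logic in the script.
# The checks and their arguments are hypothetical placeholders.

# run_checks prints one line per failed check and nothing when all pass.
run_checks() {
  local required_command="$1" writable_dir="$2"
  # Discard the lookup's own output so only our messages reach the console.
  if ! command -v "$required_command" >/dev/null 2>&1; then
    echo "missing command: $required_command"
  fi
  if [ ! -w "$writable_dir" ]; then
    echo "directory not writable: $writable_dir"
  fi
}

# report prints the keyword the monitor matches on, followed by the details,
# and returns a failing status when any check failed.
report() {
  local failures="$1"
  if [ -n "$failures" ]; then
    echo "Alert"
    echo "$failures"
    return 1
  fi
  echo "Healthy"
}

# Exit with a failure (exit 1) when any check failed.
report "$(run_checks sh /tmp)" || exit 1
```

Because the monitor evaluates all of the output, the healthy path deliberately avoids printing the word "Alert", and output from the commands being checked is discarded rather than echoed.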
