Monitoring S.M.A.R.T. Metrics with Prometheus and PMM

In his excellent blog post, Pavel Trukhanov showed the value of collecting S.M.A.R.T. metrics, so I wondered how hard it would be to enable their collection in Percona Monitoring and Management (PMM).

A quick search led me to the textfile_collector plugin SmartMon, which can be easily integrated with any Prometheus installation.

For PMM, Vadim Yalovets recently showed how to do custom integrations based on textfile_collector.

Let’s put those together:

  1. Ensure you have the smartctl tool installed. It is available in the repositories of most Linux distributions.
  2. Get the SmartMon script and place it in /usr/local/bin or another location
  3. Install the cron job
  4. Enable textfile_collector as described in this blog post
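
To make the steps above more concrete, here is a minimal Python sketch of what a textfile collector script does in principle: it parses attribute rows in the layout printed by `smartctl -A` and renders them in the Prometheus text exposition format. This is not the SmartMon script itself. The sample values, the function names, and the exact metric and label names (`smartmon_<attribute>_value`, `disk`, `smart_id`) are assumptions for illustration.

```python
import re

# Sample rows in the layout printed by `smartctl -A` (values are made up).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       1234
194 Temperature_Celsius     0x0022   065   050   000    Old_age   Always       -       35
233 Media_Wearout_Indicator 0x0032   098   098   000    Old_age   Always       -       0
"""

# id, name, flag, value, worst, thresh, type, updated, when_failed, raw
ROW = re.compile(
    r"^\s*(\d+)\s+(\S+)\s+0x[0-9a-fA-F]+\s+(\d+)\s+(\d+)\s+(\d+)"
    r"\s+\S+\s+\S+\s+\S+\s+(\d+)"
)


def parse_attributes(text):
    """Return one dict per attribute row; header and other lines are skipped."""
    attrs = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            smart_id, name, value, _worst, _thresh, raw = match.groups()
            attrs.append({
                "id": int(smart_id),
                "name": name.lower(),
                "value": int(value),
                "raw": int(raw),
            })
    return attrs


def render_metrics(disk, attrs):
    """Render attributes as Prometheus text-format lines (hypothetical names)."""
    lines = []
    for attr in attrs:
        labels = 'disk="{}",smart_id="{}"'.format(disk, attr["id"])
        lines.append("smartmon_{}_value{{{}}} {}".format(attr["name"], labels, attr["value"]))
        lines.append("smartmon_{}_raw_value{{{}}} {}".format(attr["name"], labels, attr["raw"]))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_metrics("/dev/sda", parse_attributes(SAMPLE)), end="")
```

In a real setup, the cron job would write this kind of output to a `.prom` file in the directory that the textfile collector watches, ideally via a temporary file that is then renamed, so a half-written file is never scraped.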

That’s it! You should get your data flowing. Now you can use Prometheus to query device information:

[Screenshot: using Prometheus to query device information]

Or, if you want to get a specific S.M.A.R.T. value, such as the media wearout indicator:

[Screenshot: querying a specific S.M.A.R.T. value, the media wearout indicator]
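
The same kind of query can also be issued from a script through the Prometheus HTTP API (`/api/v1/query`). The sketch below is a minimal Python example. The base URL and the metric names (`smartmon_device_info`, `smartmon_media_wearout_indicator_raw_value`) are assumptions, so check what your exporter actually exposes, and note that a PMM server may serve Prometheus under a different path and require authentication.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(base_url, promql):
    """Build the instant-query URL for a PromQL expression."""
    query_string = urlencode({"query": promql})
    return "{}/api/v1/query?{}".format(base_url.rstrip("/"), query_string)


def instant_query(base_url, promql, timeout=10):
    """Run an instant query and return the list of result series."""
    with urlopen(build_query_url(base_url, promql), timeout=timeout) as response:
        payload = json.load(response)
    if payload.get("status") != "success":
        raise RuntimeError("query failed: {}".format(payload))
    return payload["data"]["result"]


if __name__ == "__main__":
    # Assumed address of a local Prometheus; adjust for your environment.
    base = "http://localhost:9090"
    for series in instant_query(base, "smartmon_media_wearout_indicator_raw_value"):
        # Each series carries its labels in "metric" and [timestamp, value] in "value".
        print(series["metric"].get("disk"), series["value"][1])
```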

If you would like to see a nicer visualization in Grafana, you can install the appropriate dashboard from the Grafana web site.

[Screenshot: S.M.A.R.T. metrics visualized in Grafana]

The number and kind of metrics you get depend on the storage device vendor and model. Here is an example list from one of my test systems: