Evolution is one of the inherent traits of modern software. Many people reach out to product teams daily, asking to add more functionality to the software products they use and love. This is understandable: there will always be ways to make a product better by adding more features to the users’ delight so they can solve their problems in less time and with less effort.

We are always hard at work trying to keep Percona Monitoring and Management (PMM) as competitive as possible. While PMM does a good job today of monitoring users’ infrastructure, it does not offer logging functionality yet. However, logging is one of the three pillars of observability, and we are seriously considering making it part of our product at some point. To that end, we have recently had a number of discussions to get the ball rolling and figure out the optimal architecture for logging so we can keep PMM robust, performant, and capable of handling substantial workloads.

You may say, “That sounds good, but it’ll take you some time to deliver. Can I add logging to PMM myself?”

By all means! In this post, I will explore one of the possible ways to do it.

What’s the plan?

There’s really not much to it:

  • Take the latest version of PMM v3
  • Add a logging stack to it
  • Configure the installed tools
  • Start PMM and enjoy logging!

OK, let’s see how we can get there by following the plan above.

Which PMM distribution should I take and why?

We ship PMM in three flavors: container image, AMI, and OVF. Currently, PMM’s most popular distribution is the container image. You can deploy it with Docker, Podman, or any other OCI-compatible container orchestration system, including Kubernetes. One of the greatest features of containers is multi-stage builds, which allow you to grab an arbitrary image and install additional software on top of it to augment its capabilities. In fact, many companies around the world, including Percona, leverage multi-stage builds to produce their software, be it for public or private distribution. This makes the container image a good fit for our purpose.

What logging tool should I choose?

There are plenty of awesome open source logging tools out there. One important thing to keep in mind is their compatibility with PMM. So far, we have tested only two products:

  • VictoriaLogs
  • Grafana Loki

I’ll choose the former for two main reasons:

  • Better overall performance, including disk and CPU usage
  • A smaller increase in the container image size

You will probably notice that VictoriaLogs is authored by VictoriaMetrics, whose product we already use in PMM as a Prometheus-compatible metrics engine. I agree; that may sound biased. So if Loki looks like a better choice to you, it will also do the job.

The tech stack

When it comes to logging, it’s important to visualize the whole process from start to finish.

[Figure: Logging technology stack in PMM]

One of the challenges of any logging solution is that log files may come in a variety of different formats. Therefore, for VictoriaLogs to be able to understand the format and efficiently store the logs, they need to be parsed first. Parsing is shown as the first phase in the diagram. The parser needs to know where to locate the log file, which format the logs are written in, and where to send the parsed logs. To keep things simple, we will use a well-known and battle-tested parser called Promtail.

The second phase is normally referred to as log ingestion: VictoriaLogs receives the logs from the parser, then compresses and stores the parsed log lines for later retrieval.

Phase three is known as log querying or, otherwise, log retrieval. Normally, any logging solution offers a log query language; VictoriaLogs’ is called LogsQL. It features a lot of language constructs and parameters, most of which you don’t need to learn from the start: the majority of simple queries and use cases, like retrieving log errors or finding a specific attribute value, are abstracted away by the data source’s user interface. This last phase is called log visualization or log analysis. When I explored VictoriaLogs for the first time, I was able to get up to speed in record time. Certainly, more complex tasks require some familiarity with the query language. Luckily, the documentation abounds with examples, which can be super helpful.

Choosing the automation tool

I like to use automation whenever I can, simply because it allows me to create reproducible scenarios. Whether you need to make a change, upgrade dependencies, or simply experiment with your code — automation is your friend. You write it once, but you can run it as many times as you need.

You can obviously choose any automation tool you like, but I’ll use the one that ships with PMM: Ansible. We use Ansible playbooks mainly to build PMM, but Ansible also kicks in when the users migrate from an older version to the most recent one.

Preparing an Ansible playbook

First, let’s see what our directory structure will ultimately look like:
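Here’s the layout we’ll end up with (the file names match the ones we create below):

```
.
├── Dockerfile
├── victorialogs.yml
├── tasks
│   └── main.yml
└── files
    ├── victorialogs.ini
    ├── promtail.yml
    ├── victorialogs.yml
    ├── vlogs_location.conf
    └── nginx_logfmt.conf
```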

The playbook file victorialogs.yml is stored in the root directory along with the Dockerfile. Following Ansible’s conventional directory structure, the tasks are stored in the tasks directory, and the rest of the auxiliary files are stored in the files directory.

We’ll create our playbook victorialogs.yml, which is very simple:
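A minimal sketch follows; the version numbers are examples only, so check the release pages referenced in the comments:

```yaml
---
- name: Add VictoriaLogs and Promtail to PMM
  hosts: localhost
  connection: local
  gather_facts: false
  vars:
    # https://github.com/VictoriaMetrics/VictoriaMetrics/releases
    victorialogs_version: "1.0.0"
    # https://github.com/grafana/loki/releases
    promtail_version: "3.0.0"
  tasks:
    - name: Install and configure the logging stack
      ansible.builtin.import_tasks: tasks/main.yml
```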

For your convenience, I’m providing the software versions as variables in this playbook, along with their respective URLs, which you can visit to find the latest releases. These products are well maintained and receive quite frequent updates, so by the time you read this article, the versions above will likely be outdated, and you will most probably want to install the most recent ones.

The tasks file contains a set of Ansible tasks that will be executed to:

  • Download the tools
  • Install them
  • Configure them
  • Clean up cache and tmp files

Let’s create the main.yml file with the following content and save it in the tasks directory:
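Here’s a sketch of what it might contain. The download URLs follow the upstream release naming conventions, while the destination paths and the Grafana plugin ID are assumptions based on a typical PMM v3 layout, so verify them against your image:

```yaml
---
# Download and unpack the victoria-logs binary (the tarball ships a
# single executable named victoria-logs-prod).
- name: Install victoria-logs
  ansible.builtin.unarchive:
    src: "https://github.com/VictoriaMetrics/VictoriaMetrics/releases/download/v{{ victorialogs_version }}-victorialogs/victoria-logs-linux-amd64-v{{ victorialogs_version }}-victorialogs.tar.gz"
    dest: /usr/local/bin
    remote_src: true

# Promtail is released as a zipped binary under the Loki repository.
- name: Install promtail
  ansible.builtin.unarchive:
    src: "https://github.com/grafana/loki/releases/download/v{{ promtail_version }}/promtail-linux-amd64.zip"
    dest: /usr/local/bin
    remote_src: true

- name: Rename the promtail binary
  ansible.builtin.command:
    cmd: mv /usr/local/bin/promtail-linux-amd64 /usr/local/bin/promtail
    creates: /usr/local/bin/promtail

# Install the Grafana data source plugin for VictoriaLogs.
- name: Install the VictoriaLogs data source plugin
  ansible.builtin.command:
    cmd: grafana-cli --pluginsDir /srv/grafana/plugins plugins install victoriametrics-logs-datasource

# Copy the configuration files prepared in the files/ directory.
- name: Copy configuration files
  ansible.builtin.copy:
    src: "{{ item.src }}"
    dest: "{{ item.dest }}"
  loop:
    - { src: victorialogs.ini, dest: /etc/supervisord.d/victorialogs.ini }
    - { src: promtail.yml, dest: /etc/promtail.yml }
    - { src: victorialogs.yml, dest: /usr/share/grafana/conf/provisioning/datasources/victorialogs.yml }
    - { src: vlogs_location.conf, dest: /etc/nginx/conf.d/vlogs_location.conf }
    - { src: nginx_logfmt.conf, dest: /etc/nginx/conf.d/nginx_logfmt.conf }

# Clean up caches and temporary files to keep the image lean.
- name: Clean up Ansible temporary files
  ansible.builtin.file:
    path: /root/.ansible/tmp
    state: absent
```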

Configuring the tools

To configure the tools, we will need to create five more files, all of which will be placed in the files directory.

Let’s proceed with creating the victorialogs.ini file. This file configures the runtime parameters for Promtail and VictoriaLogs and delegates process management to supervisord.
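Here’s a sketch of what it might look like; the flags shown are illustrative, not exhaustive:

```ini
; files/victorialogs.ini - process definitions for supervisord.
[program:victorialogs]
priority = 10
; -envflag.enable allows overriding flags via VL_-prefixed env vars.
command = /usr/local/bin/victoria-logs-prod -storageDataPath=/srv/victorialogs -envflag.enable -envflag.prefix=VL_
autorestart = true
stdout_logfile = /srv/logs/victorialogs.log
redirect_stderr = true

[program:promtail]
priority = 11
command = /usr/local/bin/promtail -config.file=/etc/promtail.yml
autorestart = true
stdout_logfile = /srv/logs/promtail.log
redirect_stderr = true
```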

I have to mention that the victoria-logs executable can be further configured using environment variables, which you must prefix with VL_ and pass to the PMM container as usual. For example, let’s assume you want to set the retention period for your logs to eight weeks instead of 30 days:
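Assuming env flag support is enabled for victoria-logs, as in the supervisord sketch above, the override could look like this:

```shell
# A sketch: VL_-prefixed variables map to victoria-logs flags when
# -envflag.enable and -envflag.prefix=VL_ are set (see the supervisord
# config above); flags fixed on the command line take precedence.
docker run -d -p 443:8443 -e VL_retentionPeriod=8w local/pmm-server:victorialogs
```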

What logs to collect?

In production environments, you’ll want to collect logs from various types of infrastructure, such as database nodes. However, that kind of setup is quite complex and is, therefore, out of the scope of this post. So, we’ll target only the logs that are generated by PMM components within the container. Let’s go ahead and create the promtail.yml file, which provides a so-called “scrape configuration” for each log file, including Promtail’s own:
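Below is a sketch; the nine log file paths are assumptions based on a typical PMM v3 container, so adjust them to the files you actually find under /srv/logs:

```yaml
server:
  disable: true

positions:
  filename: /tmp/positions.yaml

clients:
  # VictoriaLogs exposes a Loki-compatible push endpoint.
  - url: http://127.0.0.1:9428/insert/loki/api/v1/push

scrape_configs:
  - job_name: pmm
    static_configs:
      - targets: [localhost]
        labels: { source: nginx, __path__: /srv/logs/nginx.log }
      - targets: [localhost]
        labels: { source: grafana, __path__: /srv/logs/grafana.log }
      - targets: [localhost]
        labels: { source: pmm-managed, __path__: /srv/logs/pmm-managed.log }
      - targets: [localhost]
        labels: { source: pmm-agent, __path__: /srv/logs/pmm-agent.log }
      - targets: [localhost]
        labels: { source: postgresql, __path__: /srv/logs/postgresql14.log }
      - targets: [localhost]
        labels: { source: clickhouse, __path__: /srv/logs/clickhouse-server.log }
      - targets: [localhost]
        labels: { source: victoriametrics, __path__: /srv/logs/victoriametrics.log }
      - targets: [localhost]
        labels: { source: qan-api2, __path__: /srv/logs/qan-api2.log }
      - targets: [localhost]
        labels: { source: promtail, __path__: /srv/logs/promtail.log }
```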

Nice job! This configuration allows the ingestion of logs from nine different sources.

You will note that the path to every log file begins with /srv/logs – this is the root directory where PMM stores the logs of all the components. You can use a special labels: section to assign custom labels to every log line being parsed. That is necessary so you can tell where the logs originate from when making queries.

I will not expand on the log ingestion formats but only mention that VictoriaLogs can ingest logs in many different formats, including the Loki push format, which Promtail uses to send the logs (see the clients section above).

Next, we need to configure the data source, which is essentially a Grafana plugin. It allows users to query and visualize the logs in the PMM user interface. Let’s create one more file — victorialogs.yml — with the following content:
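Here’s a sketch of the provisioning file; the type field must match the installed plugin’s ID, which is assumed here to be victoriametrics-logs-datasource:

```yaml
apiVersion: 1
datasources:
  - name: VictoriaLogs
    type: victoriametrics-logs-datasource
    access: proxy
    # The container-internal address victoria-logs listens on.
    url: http://127.0.0.1:9428
    editable: false
```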

We will also create vlogs_location.conf, which will be used to patch the configuration of our Nginx proxy. This will expose two more URLs within PMM. The first URL lets the user navigate to the VictoriaLogs web UI, while the second proxies the log query requests to the VictoriaLogs backend.
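A sketch follows; the upstream port is the victoria-logs default (9428), while the location paths are assumptions you can adapt:

```nginx
# Expose the VictoriaLogs web UI.
location /victorialogs/ {
    proxy_pass http://127.0.0.1:9428/;
}

# Proxy LogsQL query requests to the VictoriaLogs backend.
location /select/ {
    proxy_pass http://127.0.0.1:9428/select/;
}
```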

One final patch to the Nginx proxy configuration is needed to make sure the Nginx logs are formatted in “logfmt”, similar to other PMM logs. Create a file named nginx_logfmt.conf as follows:
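Here’s a sketch of a logfmt-style access log format; the field list is illustrative and can be extended with any Nginx variable:

```nginx
log_format logfmt 'time="$time_iso8601" remote_addr="$remote_addr" '
                  'host="$host" method="$request_method" path="$request_uri" '
                  'status=$status body_bytes_sent=$body_bytes_sent '
                  'referer="$http_referer" user_agent="$http_user_agent"';

access_log /srv/logs/nginx.log logfmt;
```

Note that log_format must live at the http level of the Nginx configuration, which is where files from conf.d are included.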

Building the image

Now that we have set the stage, let’s see how we can actually build the container. Our Dockerfile is very simple:
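Here’s a sketch; the USER switches are an assumption based on PMM v3 images running as a non-root user by default:

```dockerfile
FROM percona/pmm-server:3

# Installing binaries and configs requires root (assumption: the PMM v3
# image runs as a non-root user by default).
USER root

WORKDIR /tmp/victorialogs
COPY victorialogs.yml .
COPY tasks/ tasks/
COPY files/ files/

# The PMM image ships with Ansible, so the playbook can run in place.
RUN ansible-playbook victorialogs.yml

# Drop back to the image's default user (assumed to be pmm).
USER pmm

# Optional: expose the victoria-logs port for external log ingestion.
EXPOSE 9428
```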

With FROM percona/pmm-server:3 we are making sure to pull the latest version of PMM available at the time of building. Then, we just copy over the playbook files we created earlier and run the playbook. Once the playbook succeeds, we can optionally expose port 9428, which VictoriaLogs will be listening on, to make it possible to ingest logs from outside of PMM.

Finally, you can proceed with building the container by running the following command, which should give you a fresh container image named local/pmm-server:victorialogs:
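Run it from the directory that contains the Dockerfile:

```shell
docker build -t local/pmm-server:victorialogs .
```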

When launching the container, do not forget to use the new image name:
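For example, based on the standard PMM v3 run command (adjust ports and volume names to your environment):

```shell
docker run -d \
  --name pmm-server \
  -p 443:8443 \
  -p 9428:9428 \
  -v pmm-data:/srv \
  local/pmm-server:victorialogs
```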

Exploring the user interface

Once you get your PMM instance up and running, you can let it run for a while to generate some logs. Then click on the “Explore” menu item, choose the “VictoriaLogs” data source that we provisioned at build time, and switch the editor mode (in the upper right) to “Code.”

Now, you can begin querying the logs. Let’s choose the source of logs. In my case, Nginx logs seem to be a good starting point (check the syntax below).
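For example, a stream filter on the source label we assigned in promtail.yml:

```
{source="nginx"}
```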

However, as we mentioned before, for VictoriaLogs to query the logs, we need to instruct it on their format. This can be achieved by passing our logs through the unpack_logfmt pipe (see below).
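After unpacking, every logfmt field becomes queryable:

```
{source="nginx"} | unpack_logfmt
```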

VictoriaLogs features quite a number of pipes for different use cases, including unpack_json, which you’ll want to use if the logs are produced in JSON format. I encourage you to get familiar with the pipes since they provide a way to extract virtually anything you like from the logs.

To query and visualize the logs, adjust the time range (in my case, it’s “Last 1 hour”) and press Run query. You should see a list of log lines ordered by time in descending order, i.e., “Newest first.”

All the lines are collapsed by default, so if a certain line is of interest, you can always expand it by clicking the little “chevron” icon that can be found at the beginning of each line.

Now, let’s imagine you want to know which Nginx requests end up with a 404 error, which stands for “Not found.” Again, that can be achieved using another very powerful filter pipe (see below).
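One way to express that with the filter pipe, using the status field unpacked from the logfmt line:

```
{source="nginx"} | unpack_logfmt | filter status:404
```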

You can certainly use the filter pipe with any other log attribute, be it remote_addr, host, body_bytes_sent, etc. You can also use boolean operators to combine several attributes.

Just keep in mind that every log file will have its own set of log attributes to filter on.

Conclusion

PMM is an extremely flexible product thanks to its architecture. In this post, we’ve only scratched the surface of what’s possible when augmenting its functionality. We hope it inspires you to experiment with PMM in any way you like 🔬

Hold on, did I mention that Victorialogs integrates well with PMM alerting? Well, it does! But I prefer to save this topic for another blog post! 😊

We can’t wait to hear how far you can go with your logging tool added to PMM and what your experience is. It will help us better understand your needs and use cases.


Percona Monitoring and Management is a best-of-breed open source database monitoring solution for MySQL, PostgreSQL, MongoDB, and the servers on which they run. It allows you to monitor, manage, and improve the performance of your databases, no matter where they are located or deployed.

 

