I was once helping one of our customers load MySQL audit logs into a MySQL database for analysis, and I immediately thought: “Hey, this is not the most efficient solution! MySQL, or a typical RDBMS in general, was never really meant to store logs.”
So, I decided to explore an alternative that seemed more sensible to me: using MongoDB as the storage for the logs.
Just to mention, audit logging is available in MySQL Enterprise Edition, and a similar, yet free, solution is available in Percona Server for MySQL. In both cases, it works by installing the audit log plugin.
The simplest scenario is to just set the audit log format to JSON:
```
audit_log_format = JSON
```
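With this setting, every entry in the audit log file is a self-contained JSON document, one per line. A minimal Python sketch of reading one such entry (the field names follow the sample record shown later in this post; the values are illustrative):

```python
import json

# One line from a JSON-format audit log produced by the audit_log plugin.
line = ('{"audit_record":{"name":"Connect","timestamp":"2020-12-30T20:55:29Z",'
        '"connection_id":"9","status":0,"user":"root","host":"localhost","db":""}}')

doc = json.loads(line)
# Nested fields are directly addressable once parsed.
print(doc["audit_record"]["name"], doc["audit_record"]["status"])  # Connect 0
```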
Then, as soon as it collects some data, import the log file into a MongoDB collection via the mongoimport command, like this:
```
# mongoimport --username percona --password P3rc0n4 --host 10.95.83.225 --port 27017 --db auditlogs --collection audit1 --file /var/lib/mysql/audit.log
2020-12-31T16:24:43.782+0000 connected to: 10.95.83.225:27017
2020-12-31T16:24:44.316+0000 imported 25462 documents

mongo > db.audit1.countDocuments({})
25462
```
Of course, this works, but I prefer an automated solution, so I looked at available options for live-streaming the logs.
The first thing that looked useful is the ability to send the audit log directly to syslog instead of a file. Knowing that both rsyslog and syslog-ng have MongoDB output modules, it felt like a very easy approach. So I installed the rsyslog-mongodb module package on my test Ubuntu VM running Percona Server for MySQL, and configured the audit log with:
```
[mysqld]
audit_log_handler = syslog
audit_log_format = JSON
```
And an example rsyslog (version 8.2) configuration:
```
# cat /etc/rsyslog.d/49-ship-syslog.conf
action(type="ommongodb"
uristr="mongodb://percona:P3rc0n4@10.95.83.225:27017/?authSource=auditlogs"
db="auditlogs" collection="mysql_node1_log")
```
This worked; however, the inserted documents looked like this:
```
mongo > db.mysql_node1_log.findOne().pretty()
{
	"_id" : ObjectId("5fece941f17f487c7d1d158b"),
	"msg" : " {\"audit_record\":{\"name\":\"Connect\",\"record\":\"7_1970-01-01T00:00:00\",\"timestamp\":\"2020-12-30T20:55:29Z\",\"connection_id\":\"9\",\"status\":0,\"user\":\"root\",\"priv_user\":\"root\",\"os_login\":\"root\",\"proxy_user\":\"\",\"host\":\"localhost\",\"ip\":\"\",\"db\":\"\"}}"
}
```
Basically, because syslog escapes the double quote symbols, the whole audit record appears as a single string inside the MongoDB collection instead of as a JSON object. No matter what I tried, including custom templates and property values in rsyslog, I could not disable the escaping. Therefore, although feeding MongoDB with audit logs works this way, it is pretty useless when it comes to analyzing the logs later. The same issue applies to syslog-ng with the syslog-ng-mod-mongodb module. And since MongoDB does not offer before-insert triggers, I could not easily “fix” the inserted data on the fly.
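To illustrate the problem: the record does survive as valid JSON text inside the string, so it could be repaired in a batch job after insertion, just not on the fly at insert time. A hypothetical cleanup sketch, operating on the `msg` value as stored:

```python
import json

# The value rsyslog stored in the "msg" field: the whole audit record as one
# string (note the leading space), rather than a nested JSON object.
msg = (' {"audit_record":{"name":"Connect","connection_id":"9",'
       '"status":0,"user":"root","host":"localhost"}}')

# The text is still valid JSON, so it can be recovered after the fact;
# in a real cleanup job you would write this back over the document.
record = json.loads(msg.strip())
print(record["audit_record"]["user"])  # root
```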
This forced me to look for alternative solutions. One of them would be using a FIFO file: continuously tail the audit log into it, then read from it and insert the logs into MongoDB. I wanted a more robust way, though, and decided to try Fluentd instead. It was created as a versatile log collector, is highly flexible, works with many different applications out of the box, and, most importantly, it is an open source project and speaks JSON natively. Making it do the job I wanted turned out to be easier than I expected.
Here is what I did:
First, install the Fluentd (td-agent) MongoDB output plugin:

```
td-agent-gem install fluent-plugin-mongo
```
Then, configure td-agent to tail the audit log, parse each line as JSON, and ship it to a capped MongoDB collection:

```
# cat /etc/td-agent/td-agent.conf
####
...
<source>
  @type tail
  path /var/lib/mysql/audit.log
  pos_file /var/log/td-agent/audit.access_log.pos
  <parse>
    @type json
  </parse>
  tag mongo.audit.log
</source>
<match mongo.audit.log>
  @type mongo
  database auditlogs #(required)
  collection audit_log #(optional; default="untagged")
  capped
  capped_size 100m
  host 10.95.83.225 #(optional; default="localhost")
  port 27017 #(optional; default=27017)
  user percona
  password P3rc0n4
  <buffer>
    flush_interval 1s
  </buffer>
</match>
```
The td-agent user needs read access to the audit log file, so add it to the mysql group:

```
# id td-agent
uid=114(td-agent) gid=121(td-agent) groups=121(td-agent)
# usermod -a -G mysql td-agent
# id td-agent
uid=114(td-agent) gid=121(td-agent) groups=121(td-agent),120(mysql)
```
On the Percona Server for MySQL side, configure the audit log plugin to write JSON to a file, with rotation:

```
[mysqld]
audit_log_handler = file
audit_log_format = JSON
audit_log_file = audit.log
audit_log_rotate_on_size = 10M
audit_log_rotations = 3
```
Finally, restart both services:

```
# systemctl restart mysql
# systemctl restart td-agent
```
The td-agent log confirms that it follows the audit log, and even handles its rotation:

```
# tail -f /var/log/td-agent/td-agent.log
2020-12-31 02:41:39 +0000 [info]: adding match pattern="mongo.audit.log" type="mongo"
...
2020-12-31 02:41:40 +0000 [info]: #0 following tail of /var/lib/mysql/audit.log
...
2020-12-31 02:52:14 +0000 [info]: #0 detected rotation of /var/lib/mysql/audit.log; waiting 5 seconds
2020-12-31 02:52:14 +0000 [info]: #0 following tail of /var/lib/mysql/audit.log
```
And the documents keep arriving in the MongoDB collection:

```
mongo > db.audit_log.countDocuments({})
281245

mongo > db.audit_log.stats()
{
	"ns" : "auditlogs.audit_log",
	"size" : 104857293,
	"count" : 281245,
	"avgObjSize" : 372,
	"storageSize" : 26357760,
	"capped" : true,
	"max" : -1,
	"maxSize" : 104857600,
(...)
```
Yay, it works like a charm! Not only are the audit logs rotated automatically on the Percona Server for MySQL side, but the destination collection's size cap on MongoDB works as well, so I am safe when it comes to disk space on both hosts!
Here, there is a little caveat: if for some reason you drop the destination collection manually on MongoDB, incoming inserts will re-create it without the capped setting! Therefore, either let the collection be created by Fluentd on its service startup, or create it manually with the capped setting, and don't drop it later.
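If you do need to re-create the collection manually, pass the capped options explicitly (with pymongo that would be something like `db.create_collection("audit_log", capped=True, size=100*1024*1024)`, matching the 100m `capped_size` above). To make the eviction semantics concrete, here is a toy model of a capped collection; the byte sizes are made up:

```python
from collections import deque

class CappedCollection:
    """Toy model of MongoDB's capped-collection behavior: inserts beyond
    the size cap silently evict the oldest documents (FIFO order)."""
    def __init__(self, cap_bytes):
        self.cap = cap_bytes
        self.docs = deque()
        self.size = 0

    def insert(self, doc, doc_size):
        self.docs.append((doc, doc_size))
        self.size += doc_size
        while self.size > self.cap:       # evict oldest until under the cap
            _, old_size = self.docs.popleft()
            self.size -= old_size

coll = CappedCollection(cap_bytes=100)
for i in range(10):
    coll.insert({"n": i}, doc_size=30)    # ten 30-"byte" docs into a 100 cap
print(len(coll.docs), coll.size)  # 3 90
```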
Now, we can try some example aggregations to get some useful audit stats:
Events grouped by type and command class:

```
mongo > db.audit_log.aggregate([ { $group: { _id: {name: "$audit_record.name", command: "$audit_record.command_class"}, count: {$sum:1}}}, { $sort: {count:-1}} ])
{ "_id" : { "name" : "Execute", "command" : "error" }, "count" : 267086 }
{ "_id" : { "name" : "Query", "command" : "begin" }, "count" : 14054 }
{ "_id" : { "name" : "Close stmt", "command" : "error" }, "count" : 76 }
{ "_id" : { "name" : "Query", "command" : "show_variables" }, "count" : 7 }
{ "_id" : { "name" : "Query", "command" : "select" }, "count" : 6 }
{ "_id" : { "name" : "Quit" }, "count" : 5 }
{ "_id" : { "name" : "Query", "command" : "show_tables" }, "count" : 4 }
{ "_id" : { "name" : "Init DB", "command" : "error" }, "count" : 2 }
{ "_id" : { "name" : "Field List", "command" : "show_fields" }, "count" : 2 }
{ "_id" : { "name" : "Query", "command" : "show_databases" }, "count" : 2 }
{ "_id" : { "name" : "Connect" }, "count" : 1 }
```
And failed statements (non-zero status) grouped by command class and error code:

```
mongo > db.audit_log.aggregate([ { $match: { "audit_record.status": {$gt: 0} } }, { $group: { _id: {command_class: "$audit_record.command_class", status: "$audit_record.status"}, count: {$sum:1}}}, { $sort: {count:-1}} ])
{ "_id" : { "command_class" : "error", "status" : 1049 }, "count" : 2 }
{ "_id" : { "command_class" : "show_tables", "status" : 1046 }, "count" : 2 }
{ "_id" : { "command_class" : "create_table", "status" : 1050 }, "count" : 2 }
{ "_id" : { "command_class" : "drop_table", "status" : 1051 }, "count" : 2 }
{ "_id" : { "command_class" : "drop_table", "status" : 1046 }, "count" : 2 }
{ "_id" : { "command_class" : "create_table", "status" : 1046 }, "count" : 1 }
{ "_id" : { "command_class" : "create_table", "status" : 1113 }, "count" : 1 }
```
Resources

https://dev.mysql.com/doc/refman/8.0/en/audit-log.html
https://www.rsyslog.com/doc/v8-stable/configuration/modules/ommongodb.html
https://docs.fluentd.org/output/mongo