Skip to main content

Logging with MySQL

# Be aware that this log type is a performance killer.

I read this line from configuration file of MySQL, when trying to enable disabled-by-default general logging option for MySQL Server, in Linux. I decided to dig in deep and find out why? And what are the alternatives?

Performance loss:

If you enable MySQL logging, you'll notice instant drop in performance if you are dealing with large data, being manipulated by over a dozen connections. The primary reason for this is the IO operation after every database event; if the database repository and log files are on the same storage device, the performance reduces even more due to the simultaneous write operations (not to forget that write is more expensive than read). Secondarily, the size of the log file is from 4 to 20 times than actual data, depending on the architecture of data feeding application (you try this yourself). This is because the file contains not only the actual values present in the data, but also the queries it used. So a table with columns, say: ID, Name and Gender may contain:
1 ALI M
2 OWAIS M
3 FATIMA F
But the general log file is going to look like:
------------------------------------------------------------------------------------------------------------
                   95 Connect   root@localhost on
                   95 Query     set autocommit=1
                   95 Query     SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ
                   95 Query     SHOW SESSION VARIABLES LIKE 'lower_case_table_names'
                   95 Query     SET CHARACTER SET utf8
                   95 Query     SET NAMES utf8
                   95 Query     SET SQL_SAFE_UPDATES=1
                   95 Query     SELECT CONNECTION_ID()
                   95 Query     SELECT current_user()
                   95 Query     USE `test`
                   95 Query     set autocommit=1
                   95 Query     SHOW SESSION VARIABLES LIKE 'sql_mode'
                   95 Query     SELECT current_user()
121218 18:25:49    94 Query     SHOW TRIGGERS FROM `test` LIKE 'test'
                   94 Query     SHOW INDEXES FROM `test`.`test`
                   94 Query     SHOW CREATE TABLE `test`.`test`
                   94 Query     SHOW FULL COLUMNS FROM `test`.`test`
121218 18:26:03    95 Query     insert into test.test (id, name, gender) values (1, 'ALI', 'M')
                   95 Query     SHOW WARNINGS
121218 18:26:04    95 Query     insert into test.test (id, name, gender) values (2, 'OWAIS', 'M')
                   95 Query     SHOW WARNINGS
121218 18:26:05    95 Query     insert into test.test (id, name, gender) values (3, 'FATIMA', 'F')
                   95 Query     SHOW WARNINGS
------------------------------------------------------------------------------------------------------------
Now think of reproducing the example for a web application, making a HTTP call that might be calling 5 queries.
Another addition to this is the SELECT queries, that have nothing to do with data modification but are still logged. The use of such logging is minimal, as compared to the DML queries. Not only this, some applications use standards and frameworks, which generate fully-qualified queries; "Hibernate" framework for Object-Relation Mapping is an example. The cost of logging fully-qualified queries is high, take an example:

Query with aliases:
select t.id, t.name, t.gender from test as t

Fully-qualified query:
select `test`.`id`, `test`.`name`, `test`.`gender` from `mydb`.`test`


Logging Types:

MySQL has multiple ways to handle logs, depending on the query types and responses:
  1. Error Log: contains critical errors that occur during starting and stopping mysqld, the MySQL Server's executable.
  2. General Query Log: contains activities performed during the execution of mysqld; in other words, it is use used to keep track of MySQL client activities. These activities are logged as they are received.
  3. Binary Log: a binary file - unlike General Query Log file, which is text - containing activities that change database and its schema. These activities are logged after their execution.
  4. Slow-query Log: consists of queries that take time longer than specified (set in variable long_query_time in configuration file.

Solutions:

Using Existing Application/Web Server Logs

Every appropriately designed application logs its activities to some extent; also, if your application is hosted on a web server, like Tomact or IIS, it too will have its own logging mechanism. You can use these logs to track the database activities and keep the database logs disabled.

Note that this will work only when logs are not read on regular basis because then, you'll have to write a parser to extract useful information from the log for quick readability; the cost of the extraction process will be another addition, so choose this option carefully. Also, if more than one projects are being hosted on the same web server, then this solution may not suit because a single log file will contain a mix of activities from all the applications unless you configure your web server to log every application on a separate file.

Logging into MySQL Tables

MySQL Server allows to log the activities into database instead of a file. You can configure this option in configuration file by setting variable log-output = TABLE (or adding this line if it doesn't exist, it should be set to FILE by default).

Doing so will record the logs in separate tables for each type of log in mysql schema.

Using stored procedures

MySQL Server only logs the direct statements executed. If you have queries that execute in batches, it is better for logging efficiency to put them in stored procedures.

Log maintenance using scheduled jobs

By using MySQL Events or OS tasks like cronjob, you can archive old logs as well as remove unwanted queries. Here is an example using MySQL events:
First, make sure Event Scheduler is enabled by adding event-scheduler variable (if it doesn't exist). This requires restarting the MySQL Server:
event-scheduler = ON
Now create event:



------------------------------------------------------------------------------------------------------------
DELIMITER $$
CREATE EVENT `log_maintenance` ON SCHEDULE EVERY 1 WEEK 
DO BEGIN
   CREATE TABLE IF NOT EXISTS mysql.general_log_archive LIKE mysql.general_log;
   INSERT INTO mysql.general_log_archive
   SELECT * FROM mysql.general_log;
   TRUNCATE mysql.general_log;
   CREATE TABLE IF NOT EXISTS mysql.slow_log_archive LIKE mysql.slow_log;
   INSERT INTO mysql.slow_log_archive
   SELECT * FROM mysql.slow_log;
   TRUNCATE mysql.slow_log;
END 
$$ DELIMITER ;


------------------------------------------------------------------------------------------------------------


The event will fire like a Trigger (in fact, events are temporal triggers) on its start time.

Tuning up more

- Limiting Databases: setting binlog-do-db variable in config file will limit logging to only the databases listed. You can define this like:
binlog-do-db = test
binlog-do-db = mysql
binlog-do-db = my_db

- Indexing: certain columns can be indexed to improve querying, like:
ALTER TABLE mysql.general_log ADD INDEX (event_time);

ALTER TABLE mysql.slow_log ADD INDEX (db);

- Convert log table(s) to MyISAM: since MyISAM storage engine is a better option for huge data, you may convert the engine of log tables to MyISAM. Example:
ALTER TABLE mysql.slow_log ENGINE = MyISAM;

Query Results:

I performed some experiments using 30,000 INSERTUPDATE and DELETE queries on random data to illustrate the difference in performance with and without logging:

- Logging disabled: 138 sec
General Log enabled on File: 139 sec
General Log enabled on Table: 141 sec
- Binary Logging enabled on File165 sec
- Binary Logging enabled on Table143 sec
- All Logs (General, Binary, Error, Slow-query) enabled on File: 174 sec
- All Logs (General, Binary, Error, Slow-query) enabled on Table: 152 sec

Below are the results of a larger number of queries on a real data:

- Logging disabled: 335 sec
All Logs (General, Binary, Error, Slow-query) enabled on File432 sec
All Logs (General, Binary, Error, Slow-query) enabled on Table: 422 sec


--
Was the article helpful for you? Please provide feedback. Corrections are more than welcome.

Owais Ahmed

Comments

Popular posts from this blog

Playing in Amazon's Clouds - Introduction to Elastic Computing Cloud - Part 1

A really brief Intro.. Researcher, Trying to execute an extremely computationally resource hungry experiment? App developer, unsure of how much data you'll be collecting from the users? Student, tasked to build your FYP (final year project) on distributed computing environment? Just an ordinary techie trying to catch up with the world? If you're any of these, you cannot escape the fact that Cloud computing is storming in and you have to engage yourself actively in it. Adopt it, or perish. I'm a newbie (better say wannabe) in this massive web of computing, and here just to share some experiences I'm having - successes and failures. First of all, Cloud computing is nothing new, it has been there for over 3 decades and was referred with names like Grid computing  and Distributed computing . It was business people that came up with a catchy name to attract business. The idea behind distributed computing is simple. We create a network of computers t...

How to detach from Facebook... properly

Yesterday, I deactivated my Facebook account after using it for 10 years. Of course there had to be a very solid reason; there was, indeed... their privacy policy . If you go through this page, you might consider pulling off as well. Anyways, that's not what this blog post is about. What I learned from yesterday is that the so-called "deactivate" option on Facebook is nothing more than logging out. You can log in again without any additional step and resume from where you last left. Since I really wanted to remove myself from Facebook as much as I can, I investigated ways to actually delete a Facebook account. There's a plethora of blogs on the internet, which will tell you how you can simply remove Facebook account. But almost all of them will either tell you to use "deactivate" and "request delete" options. The problem with that is that Facebook still has a last reusable copy of your data. If you really want to be as safe from its s...

Yet another Blog on Query Optimization for MySQL Server

If you have been into MIS development for some time, then you may have realized that buying latest, multi-thousand-dollar Machine, stuffed with a top notch processor and an army of memory chips is not sufficient to your needs when it comes to processing large data, especially when your DBMS is MySQL Server. In this article, I have tried to input  the tips and techniques to-be-followed - some in general and some specific to MySQL Server; but I would, as every blogger, repeat the same common phrase that " in the end   it all depends on your scenario ". The results you are going to see will mostly be in milliseconds so before thinking "is it worth the effort if the result is in a few milliseconds?", do know that these results are derived using a very very simple database with not more than 100000 records in a table.  With complex databases and records in millions, the effort will pay you back. Coming straight to topic, here are some points you should not ign...