Thursday, December 27, 2012

Logging with MySQL

# Be aware that this log type is a performance killer.

I read this line from configuration file of MySQL, when trying to enable disabled-by-default general logging option for MySQL Server, in Linux. I decided to dig in deep and find out why? And what are the alternatives?

Performance loss:

If you enable MySQL logging, you'll notice instant drop in performance if you are dealing with large data, being manipulated by over a dozen connections. The primary reason for this is the IO operation after every database event; if the database repository and log files are on the same storage device, the performance reduces even more due to the simultaneous write operations (not to forget that write is more expensive than read). Secondarily, the size of the log file is from 4 to 20 times than actual data, depending on the architecture of data feeding application (you try this yourself). This is because the file contains not only the actual values present in the data, but also the queries it used. So a table with columns, say: ID, Name and Gender may contain:
1 ALI M
2 OWAIS M
3 FATIMA F
But the general log file is going to look like:
------------------------------------------------------------------------------------------------------------
                   95 Connect   root@localhost on
                   95 Query     set autocommit=1
                   95 Query     SET SESSION TRANSACTION ISOLATION LEVEL REPEATABLE READ
                   95 Query     SHOW SESSION VARIABLES LIKE 'lower_case_table_names'
                   95 Query     SET CHARACTER SET utf8
                   95 Query     SET NAMES utf8
                   95 Query     SET SQL_SAFE_UPDATES=1
                   95 Query     SELECT CONNECTION_ID()
                   95 Query     SELECT current_user()
                   95 Query     USE `test`
                   95 Query     set autocommit=1
                   95 Query     SHOW SESSION VARIABLES LIKE 'sql_mode'
                   95 Query     SELECT current_user()
121218 18:25:49    94 Query     SHOW TRIGGERS FROM `test` LIKE 'test'
                   94 Query     SHOW INDEXES FROM `test`.`test`
                   94 Query     SHOW CREATE TABLE `test`.`test`
                   94 Query     SHOW FULL COLUMNS FROM `test`.`test`
121218 18:26:03    95 Query     insert into test.test (id, name, gender) values (1, 'ALI', 'M')
                   95 Query     SHOW WARNINGS
121218 18:26:04    95 Query     insert into test.test (id, name, gender) values (2, 'OWAIS', 'M')
                   95 Query     SHOW WARNINGS
121218 18:26:05    95 Query     insert into test.test (id, name, gender) values (3, 'FATIMA', 'F')
                   95 Query     SHOW WARNINGS
------------------------------------------------------------------------------------------------------------
Now think of reproducing the example for a web application, making a HTTP call that might be calling 5 queries.
Another addition to this is the SELECT queries, that have nothing to do with data modification but are still logged. The use of such logging is minimal, as compared to the DML queries. Not only this, some applications use standards and frameworks, which generate fully-qualified queries; "Hibernate" framework for Object-Relation Mapping is an example. The cost of logging fully-qualified queries is high, take an example:

Query with aliases:
select t.id, t.name, t.gender from test as t

Fully-qualified query:
select `test`.`id`, `test`.`name`, `test`.`gender` from `mydb`.`test`


Logging Types:

MySQL has multiple ways to handle logs, depending on the query types and responses:
  1. Error Log: contains critical errors that occur during starting and stopping mysqld, the MySQL Server's executable.
  2. General Query Log: contains activities performed during the execution of mysqld; in other words, it is use used to keep track of MySQL client activities. These activities are logged as they are received.
  3. Binary Log: a binary file - unlike General Query Log file, which is text - containing activities that change database and its schema. These activities are logged after their execution.
  4. Slow-query Log: consists of queries that take time longer than specified (set in variable long_query_time in configuration file.

Solutions:

Using Existing Application/Web Server Logs

Every appropriately designed application logs its activities to some extent; also, if your application is hosted on a web server, like Tomact or IIS, it too will have its own logging mechanism. You can use these logs to track the database activities and keep the database logs disabled.

Note that this will work only when logs are not read on regular basis because then, you'll have to write a parser to extract useful information from the log for quick readability; the cost of the extraction process will be another addition, so choose this option carefully. Also, if more than one projects are being hosted on the same web server, then this solution may not suit because a single log file will contain a mix of activities from all the applications unless you configure your web server to log every application on a separate file.

Logging into MySQL Tables

MySQL Server allows to log the activities into database instead of a file. You can configure this option in configuration file by setting variable log-output = TABLE (or adding this line if it doesn't exist, it should be set to FILE by default).

Doing so will record the logs in separate tables for each type of log in mysql schema.

Using stored procedures

MySQL Server only logs the direct statements executed. If you have queries that execute in batches, it is better for logging efficiency to put them in stored procedures.

Log maintenance using scheduled jobs

By using MySQL Events or OS tasks like cronjob, you can archive old logs as well as remove unwanted queries. Here is an example using MySQL events:
First, make sure Event Scheduler is enabled by adding event-scheduler variable (if it doesn't exist). This requires restarting the MySQL Server:
event-scheduler = ON
Now create event:



------------------------------------------------------------------------------------------------------------
DELIMITER $$
CREATE EVENT `log_maintenance` ON SCHEDULE EVERY 1 WEEK 
DO BEGIN
   CREATE TABLE IF NOT EXISTS mysql.general_log_archive LIKE mysql.general_log;
   INSERT INTO mysql.general_log_archive
   SELECT * FROM mysql.general_log;
   TRUNCATE mysql.general_log;
   CREATE TABLE IF NOT EXISTS mysql.slow_log_archive LIKE mysql.slow_log;
   INSERT INTO mysql.slow_log_archive
   SELECT * FROM mysql.slow_log;
   TRUNCATE mysql.slow_log;
END 
$$ DELIMITER ;


------------------------------------------------------------------------------------------------------------


The event will fire like a Trigger (in fact, events are temporal triggers) on its start time.

Tuning up more

- Limiting Databases: setting binlog-do-db variable in config file will limit logging to only the databases listed. You can define this like:
binlog-do-db = test
binlog-do-db = mysql
binlog-do-db = my_db

- Indexing: certain columns can be indexed to improve querying, like:
ALTER TABLE mysql.general_log ADD INDEX (event_time);

ALTER TABLE mysql.slow_log ADD INDEX (db);

- Convert log table(s) to MyISAM: since MyISAM storage engine is a better option for huge data, you may convert the engine of log tables to MyISAM. Example:
ALTER TABLE mysql.slow_log ENGINE = MyISAM;

Query Results:

I performed some experiments using 30,000 INSERTUPDATE and DELETE queries on random data to illustrate the difference in performance with and without logging:

- Logging disabled: 138 sec
General Log enabled on File: 139 sec
General Log enabled on Table: 141 sec
- Binary Logging enabled on File165 sec
- Binary Logging enabled on Table143 sec
- All Logs (General, Binary, Error, Slow-query) enabled on File: 174 sec
- All Logs (General, Binary, Error, Slow-query) enabled on Table: 152 sec

Below are the results of a larger number of queries on a real data:

- Logging disabled: 335 sec
All Logs (General, Binary, Error, Slow-query) enabled on File432 sec
All Logs (General, Binary, Error, Slow-query) enabled on Table: 422 sec


--
Was the article helpful for you? Please provide feedback. Corrections are more than welcome.

Owais Ahmed

Tuesday, December 18, 2012

Few tips for speedier browsing.

How often have you lost your temper when you are searching a forum's thread for a "how-to-fix" the broken piece of electronics your best mate gifted you on your birthday? Or trying to find a nice greeting card on Google Image search, clicking each thumbnail and exhausting yourself in back and forth browsing exercise? Quite often, right?

If you browse the internet, you have experienced the annoyance that multi-page galleries, reviews, and sometimes even Google search creates. By multi-page content, I mean a single piece of information, which is broken into multiple pages, mostly for advertisements that the viewer hates.

This small article should help you speed up your browsing. But your browser should be Google Chrome or Mozilla Firefox (honestly, I haven't tried other browsers).

The Solution


  1. Fastest Chrome/Fastest Fox: automatically loads next pages whenever possible, when you scroll to the bottom of the web-page. This saves you from trying to locate the next button or counter and clicking it. Not only that, you can have quick Wikipedia definitions on simply selecting a word or phrase without having to open a new tab.


  2. Hover Zoom/Thumbnail Zoom: hover your mouse over any image on the webpage and it will most likely magnify, i.e. if the image is actually a low-resolution or thumbnail for preview. This again, saves you from opening the link of the original image.

I have been using these couple of extensions for years now and enjoying an incredibly fast browsing experience.



Please leave a comment if you like or dislike. Corrections are very much appreciated...

-
Owais