October 30, 2014

Data mart or data warehouse?

This is part two in my six part series on business intelligence, with a focus on OLAP analysis. Part 1 – Intro to OLAP Identifying the differences between a data warehouse and a data mart. (this post) Introduction to MDX and the kind of SQL which a ROLAP tool must generate to answer those queries. […]

Checking for a live database connection considered harmful

It is very common for me to look at a customer’s database and notice a lot of overhead from checking whether a database connection is active before sending a query to it. This comes from the following design pattern, written in pseudo-code:

Many of the popular development platforms do something similar to this. Two […]

Limiting InnoDB Data Dictionary

One of InnoDB’s features is that memory allocated for internal tables definitions is not limited and may grow indefinitely. You may not notice it if you have an usual application with say 100-1000 tables. But for hosting providers and for user oriented applications ( each user has dedicated database / table) it is disaster. For […]

Beware of MySQL Data Truncation

Here is nice gotcha which I’ve seen many times and which can cause just a minefield for many reasons. Lets say you had a system storing articles and you use article_id as unsigned int. As the time goes and you see you may get over 4 billions of articles you change the type for article_id […]

How to load large files safely into InnoDB with LOAD DATA INFILE

Recently I had a customer ask me about loading two huge files into InnoDB with LOAD DATA INFILE. The goal was to load this data on many servers without putting it into the binary log. While this is generally a fast way to load data (especially if you disable unique key checks and foreign key […]

How fast can MySQL Process Data

Reading Barons post about Kickfire Appliance and of course talking to them directly I learned a lot in their product is about beating data processing limitations of current systems. This raises valid question how fast can MySQL process (filter) data using it current architecture ? I decided to test the most simple case – what […]

MySQL: Data Storage or Data Processing

I was thinking today of how people tend to use MySQL in modern applications and it stroke me in many cases MySQL is not used to process the data, at least not on the large scale – instead it is used for data storage and light duty data retrieval. Even in this case however the […]

Working with large data sets in MySQL

What does working with large data sets in mySQL teach you ? Of course you have to learn a lot about query optimization, art of building summary tables and tricks of executing queries exactly as you want. I already wrote about development and configuration side of the problem so I will not go to details […]

Handling big result sets

Sometime it is needed to handle a lot of rows on client side. Usual way is send query via mysql_query and than handle the result in loop mysql_fetch_array (here I use PHP functions but they are common or similar for all APIs, including C). Consider table:

Operation Systems do not provide good IO interface for Database Servers

Thinking more about the problems I wrote about yesterday I had a question why so ugly workaround and guesses or manual configuration is needed ? The answer seems to be Operation Interfaces just do not provide IO interface which is good enough. The big missing piece is priority. There are process and threads priorities in […]