amitshah.in Saturday, Jan 10 2009 

New site has been launched amitshah.in

This site has extremely good collections of Articles from various stream like PHP, MySQL, Health, Jainism etc. I have also added Latest News Feed from the various News Portals which helps you to stay connect with current affairs. Photo gallery is one of good section and its come from Picasa. You can also send any feedback from guest book and share you link with link section.

Goal driven performance optimization Saturday, Jan 10 2009 

When your goal is to optimize application performance it is very important to understand what goal do you really have. If you do not have a good understanding of the goal your performance optimization effort may well still bring its results but you may waste a lot of time before you reach same results as you would reach much sooner with focused approach.

The time is critical for many performance optimization tasks not only because of labor associated expenses but also because of the suffering – slow web site means your marketing budget is wasted, customer not completing purchases, users are leaving to competitors, all of this making the time truly critical matter.

So what can be the goal ? Generally I see there are 2 types of goals seen in practice. One is capacity goal this is when the system is generally overloaded so everything is slow, when you’re just looking to see how you can get most out of your existing system, looking for consolidation or saving on infrastructure cost. If this is the goal you can perform general system performance evaluation and just fix the stuff which causes the most load on the system. MySQL Log analyzes with Mk-Log-Parser is a very good start for a ways to generally optimize MySQL load on the system.

Latency Goal is another breed. The system may not look loaded but some pages still may want to be loading much slower than you like. These goals are not system wise but they are much more specific to the different user interactions or even types of users. For example you may define goal also “Search pages have to have response time below 1 second in 95% cases and below 3 seconds in 99% cases”. Note We’re specific to the user interaction – people are used to Search taking longer time than other interactions for many applications, and also we speak about percentile response time rather than “all queries”. It is surely good all search queries complete in one seconds but it is too not practical. The goal description may be more specific too – for example you may have different response time guidelines for pages which are requested for real humans vs search engine bots (which are often quite different in their access pattern) or you may define “large users” as users having more than 100.000 images uploaded and measure the response time for them specifically because this group has its own performance challenges.

Looking at Latency it is also much more practical to look from the top of the stack. If you look at MySQL log you may find some queries which are slow but it is hard to go back from them to what is really important for the user and so the business – the page response times. Furthermore. It is not enough in many cases to focus only on Server Side optimization – the Client Side Optimization is also quite important in particular for aggressive performance goals and fast back-end. This is why we added this service to Percona offerings.

If Server side or Client Side performance optimization is going to be more important for your application depends on the application performance a lot. The better your application is the more Client Side optimization you will need. For example if it takes you 30 seconds to generate the search results and 3 more seconds to load all style sheets images and render the page server side optimization is more important. If you have optimized things and now HTML takes 0.5 seconds to generates an extra 3 seconds become the main response time contributer which has the highest performance optimization potential.

But let us get back to the Server Side Optimization. Lets assume our performance goal applies to the HTML generation rather than full page load on the client. So meet our goal we should look at the pages which do not meet our goal, which is pages which take more than 1 second to generate in given example.

For goal driven performance optimization it is important there is enough instrumentation and production performance logging in place so you really can focus on hard data in your work. For small and medium size applications you can log all requests to MySQL table for larger ones you can log only small portion of them. I usually keep one table per day so it is easy to copy the data to a different box for data crunching and remove the old ones.

The log table should contain URL, IP and all the data you need to be able to repeat request if you need to. It may include cookie data, post data, logged in user information etc. But the real thing is number of times which are stored for request. wall clock time – is the real time it took to generate the page by server backend. CPU Time This is the CPU time needed to generate request (you can split it to user and system if you want) and when there come various wait times – mysql, memcache, sphinx, web services etc.

For web applications doing processing in a single thread the following simple formula applies wall_time=cpu_time+sum(wait_time)+lost_time The lost time is the time which was lost for some reason – some waits we did not profile or waits we do not have control of, for example when processing had to wait for CPU available to do processing. For multi-thread application it is a bit more complicated but you still can analyze critical path.

If you have such profiling in place all you have to do is to run the query to see what are contributing factors to the response time of the problematic pages:

SQL:

  1. mysql> SELECT count(*),avg(wtime),avg(utime/wtime) cpu_ratio, avg(mysql_time/wtime) mysql_ratio ,avg(sphinx_time/wtime) sphinx_ratio, avg((wtime-mysql_time-sphinx_time-utime)/wtime) lost_ratio FROM performance_log_081221 WHERE page_type=’search’ AND wtime>1;
  2. +———-+—————–+——————+——————+——————+——————+
  3. | count(*) | avg(wtime) | cpu_ratio | mysql_ratio | sphinx_ratio | lost_ratio |
  4. +———-+—————–+——————+——————+——————+——————+
  5. | 112376 | 6.0645327150223 | 0.11126040714778 | 0.17609498370795 | 0.54612972549309 | 0.16651488365119 |
  6. +———-+—————–+——————+——————+——————+——————+
  7. 1 row IN SET (2.29 sec)

Why looking only at such pages is important ? This is because if you look at all pages rather than problematic subset it may lead you away from your goal. For example it is very possible among all pages we would see CPU usage as the main factor because sphinx and MySQL respond from cache.

We however see for pages which have the problem it is Sphinx which accounts for most of the time.

Looking at the data such way we have two great benefits. First we really understand what is the bottleneck. Second we know what performance gain potential is. For example in this case we could spend a lot of time optimizing PHP code but because it takes only 10% of response time in average even speeding it up 10 times we would not get more than 10% response time reduction. At the same time if we find a way to speed up Sphinx we can reduce response time to its half.

Note in this case there is some 16% of response time which is not accounted for. Large portion probably comes from memcache accesses which are not instrumented for this application. In this case this portion is not the biggest part yet but if we’d speed up Sphinx and MySQL dramatically we would have to go and look into better instrumentation so we can look inside this black box.

Once we know it is Sphinx which causes the problem we have to go and find what queries exactly are causing it – this can be done by adding request ID as comment to Sphinx log so you can profile it carefully or you can add tracing functionality to the application. All the same. Once you found the queries causing the problem you see the ones which cause the most impact and focus on optimizing them.

There are multiple ways to optimize something, my checklist is usually get rid of it, cache it, tune it, get more hardware in this order. It is often it is possible to get rid of some queries, cache them, tune them so they are faster (often at the same time changing semantics a bit) and if nothing helps or can be done quickly we can buy more hardware, assuming application can use it.

Once you’ve performed optimizations you can repeat analyzes again to see if performance goals are met and where is the bottleneck this time.

As a side note I should mention looking at performance statistics for the day overall is often not enough. Application performs as good as it performs during its worst times so it is very good to plot some graph over time. Sometimes an hour base may be enough but for large scale application I’d recommend to looking down to 5 minutes or even 1 minute intervals and making sure there are no hiccups.

Check the stats from the application above for example:

SQL:

  1. mysql> SELECT date_format(logged,‘%H’) h,count(*),avg(wtime),avg(sphinx_time/wtime) sphinx_ratio FROM performance_log_081221 WHERE page_type=’search’ AND wtime>1 GROUP BY h;
  2. +——+———-+—————–+——————+
  3. | h | count(*) | avg(wtime) | sphinx_ratio |
  4. +——+———-+—————–+——————+
  5. | 00 | 5851 | 3.0608555987602 | 0.49142908242509 |
  6. | 01 | 6639 | 2.9099249532198 | 0.48133478800683 |
  7. | 02 | 5406 | 3.3770073273647 | 0.49140835595675 |
  8. | 03 | 5397 | 2.9834221059666 | 0.53178056214228 |
  9. | 04 | 4820 | 3.8182240369409 | 0.53530183347988 |
  10. | 05 | 3720 | 13.025273085185 | 0.61126549080115 |
  11. | 06 | 1606 | 60.624889697559 | 0.89123114911947 |
  12. | 07 | 2699 | 38.821067012253 | 0.90885394709571 |
  13. | 08 | 2419 | 45.388828675971 | 0.9226436892381 |
  14. | 09 | 4810 | 6.330725168364 | 0.60329631087965 |
  15. | 10 | 5445 | 3.8355732669953 | 0.53918653169648 |
  16. | 11 | 5283 | 3.0498331333457 | 0.5512679788082 |
  17. | 12 | 4147 | 2.9050685487542 | 0.52802563348716 |
  18. | 13 | 2313 | 3.1297905412629 | 0.47887915792732 |
  19. | 14 | 4155 | 2.9788750504185 | 0.53700871350403 |
  20. | 15 | 4081 | 4.4940078389087 | 0.67605124513469 |
  21. | 16 | 3720 | 3.1698921914062 | 0.54566719123393 |
  22. | 17 | 4210 | 2.7616731525034 | 0.47537024159769 |
  23. | 18 | 6735 | 2.639767089152 | 0.5204920072653 |
  24. | 19 | 5581 | 2.6058266677645 | 0.42959908812738 |
  25. | 20 | 4990 | 2.4441354725308 | 0.44270882435635 |
  26. | 21 | 6305 | 2.6316682707403 | 0.5236776389174 |
  27. | 22 | 6774 | 2.4394227009732 | 0.53342757714496 |
  28. | 23 | 5270 | 2.3949674527604 | 0.51381316608346 |
  29. +——+———-+—————–+——————+
  30. 24 rows IN SET (2.37 sec)

As you can see in this case during certain hours the average type of bad queries skyrockets and it becomes 90% or so driven by Sphinx. This tells us there is some irregular activity (cron jobs?) is happening and it affects Sphinx layer significantly.

Such goal based from top to bottom approach is especially helpful for complex applications using mutliple components (like sphinx and MySQL) or multiple MySQL Servers because in these cases you often can’t easily guess the component which needs attention. Though even for less complicated single MySQL server application there is often the question if it is MySQL server causing the problem or if application code needs to be optimized.

MySQL 5.1 partitions in practice Thursday, Dec 18 2008 

This article explains how to test the performance of a large database with MySQL 5.1, showing the advantages of using partitions.

The test database uses data published by the US Bureau of Transportation Statistics. Currently, the data consists of ~ 113 million records (7.5 GB data + 5.2 GB index).

Getting and loading the data

The data for this exercise comes from the Bureau of Transportation Statistics. The data is provided as CSV files, and they are available from July 1987 to July 2007 (at the time of writing).
The details of getting the data, setting the data structure and loading the data are explained on MySQL Forge.

Problem specification

The test affects a database that is bigger than the amount of RAM in the server, and also the indexes are bigger than the RAM.
The server used for this exercise has 4 GB of RAM, and the size of indexes is over 5 GB.

The reasoning behind this specification is that data warehouses have data collections that are far beyond any reasonable amount of RAM that you can possibly install, occupying several terabytes of storage.

In normal database storage, especially for OLTP, the indexes are cached in RAM, to allow for fast retrieval of records. When the data reaches sizes that can’t be contained in the available RAM, we need to use a different approach.

One of MySQL 5.1 main features is partitioning, a technique that divides a table into logical portions to speed-up retrievals.

Using MySQL 5.1 partitions looks simple in principle, but there are some tricky points to be aware of while setting the data for maximum performance.
This article will examine the risks and offer some practical advice to achieve the best performance.

Partitioning overview

The current implementation of partitioning in MySQL 5.1 is quite simple. You can partition your data by

  • range
  • list
  • hash
  • key

Depending on your needs, you may choose different partitioning types. In this article we concentrate on range partitioning, which is perhaps the most interesting for data warehousing.

MySQL partitioning has some constraints that you must be aware of if you want to use this feature effectively.
* the partitioning value must be an integer;
* if the table has a unique/primary key, the partitioning column must be part of that key.

The first limitation is the one that has the biggest impact on your design decisions. If the column that you need to use for partitioning is not an integer, you need to use a function to transform it. Some additional constraints apply to partitions, as described in the manual, but we are not concerned about it here.

Partitioning gotchas

Using date columns

What is relevant in this context is the usage of date columns for partitioning. Since the native data type is not supported, we must convert the date into an integer. In addition to the list of allowed functions, we must take into account the fact that only two date functions can trigger the partition pruning. Thus, if we have to deal with a date column, we need to use one of them (YEAR or TO_DAYS).

When using the YEAR() function, partitioning is easy, readable, and straightforward.

CREATE TABLE by_year (
   d DATE
)
PARTITION BY RANGE (YEAR(d))
(
PARTITION P1 VALUES LESS THAN (2001),
PARTITION P2 VALUES LESS THAN (2002),
PARTITION P3 VALUES LESS THAN (2003),
PARTITION P4 VALUES LESS THAN (MAXVALUE)
)

Partitioning by month is trickier. You can’t use the MONTH() for two reasons:
* you would be limited to 12 partitions, because MONTH does not include the year;
* the MONTH function is not optimized for partition pruning, thus performances would be horrible.

Thus, you need to use the other function that is optimized for partition pruning, TO_DAYS.

CREATE TABLE by_month (
   d DATE
)
PARTITION BY RANGE (TO_DAYS(d))
(
PARTITION P1 VALUES LESS THAN (to_days('2001-02-01')), -- January
PARTITION P2 VALUES LESS THAN (to_days('2001-03-01')), -- February
PARTITION P3 VALUES LESS THAN (to_days('2001-04-01')), -- March
PARTITION P4 VALUES LESS THAN (MAXVALUE)
)

That’s already less clear to read than the one partitioned by year. What’s worse is that the server won’t retain your values, but it will only save the corresponding integers. This is what you get for the above table:

show create table by_month\G
*************************** 1. row ***************************
       Table: by_month
Create Table: CREATE TABLE `by_month`
  `d` date DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1 /*!50100
PARTITION BY RANGE (TO_DAYS(d))
(
PARTITION P1 VALUES LESS THAN (730882) ENGINE = MyISAM,
PARTITION P2 VALUES LESS THAN (730910) ENGINE = MyISAM,
PARTITION P3 VALUES LESS THAN (730941) ENGINE = MyISAM,
PARTITION P4 VALUES LESS THAN MAXVALUE ENGINE = MyISAM) */

It is advisable to save a copy of the script used to create tables partitioned by month, if you want to have a readable reference of what each partition means.

partitioning by function, searching by column

One common mistake that is made when using tables partitioned using a date column is to query by the same function used for partitioning.

For example if your table was created with the clause

partition by range( YEAR(d) )

and you are told that YEAR and TO_DAYS are optimized for partition pruning, it seems logical to use a query like

SELECT count(*) FROM by_year
WHERE YEAR(d) = 2000; # <-- ERROR !!

The partition pruning does not kick, as you can see from EXPLAIN

explain partitions select count(*) from by_year where year(d) = 2001\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: by_year
   partitions: P1,P2,P3,P4
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 30123
        Extra: Using where

The query is doing a full table scan. The meaning of “optimized for partition pruning” is that the search will use partitions when that column is used in the WHERE clause.

The right way of querying is

SELECT count(*) FROM by_year
WHERE d BETWEEN '2001-01-01' and '2001-12-31';

And now the partition pruning is being used:

explain partitions select count(*)
from by_year
where d between '2001-01-01' and '2001-12-31'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: by_year
   partitions: P2
         type: system
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 30123
        Extra:

This shows that the query does not cause a full table scan, but it will use only one partition.

Using primary keys and indexes

One of the problems I had while testing partitioned tables with a large dataset is that I could not get the performance improvement that I was expecting.
In Robin Schumacher’s first article on partitioning the examples compare tables without primary keys. The partitioned table wins the contest hands down.
In my current tests, I started by using a table with a primary key, on the ground that this is what normally would happen.
So when I first compared data holding 113 million rows, I used tables with primary keys.

That was a mistake.

A primary key on a table so large that its indexes can’t fit in memory is not efficient. To get records from such a table means accessing the disk frequently. Your performance depends completely on the speed of your disk and your processor.

Looking at what others do in data warehousing design, I found out that it’s common practice to design large DW sets without using indexes.
In this article we’ll see also a performance comparison between partitioned tables with and without primary key.

Testing method

In this test, I wanted to compare performance of a large dataset using MyISAM, InnoDB, and Archive storage engines.
For each engine, I created one unpartitioned table, with primary key (except for archive) and two partitioned tables, one by month and one by year.

Each topology is tested on a dedicated instance of MySQL server, containing only one database with one table.
For each engine, I start the server, run the set of queries and record their results, and then shut down the server.

The server instances were created using MySQL Sandbox.

ID storage partitioned records size notes loading time (*)
1 MyISAM none 113 Mil 13 GB with PK 37 min
2 MyISAM by month 113 Mil 8 GB without PK 19 min
3 MyISAM by year 113 Mil 8 GB without PK 18 min
4 InnoDB none 113 Mil 16 GB with PK 63 min
5 InnoDB by month 113 Mil 10 GB without PK 59 min
6 InnoDB by year 113 Mil 10 GB without PK 57 min
7 Archive none 113 Mil 1.8 GB no keys 20 min
8 Archive by month 113 Mil 1.8 GB no keys 21 min
9 Archive by year 113 Mil 1.8 GB no keys 20 min
  • loading times on a dual-Xeon server.

To compare the effects of partitions on large and small datasets, I created 9 more instances, each containing slightly less than 2 GB of data, to see if the results were different.

The results were recorded in yet another database instance, for better comparison.
Thus, at any given time during the tests, there were two instances running. One containing the results and the one being tested.

The queries used for this test are of two types

  • aggregate queries.
 SELECT COUNT(*)
 FROM table_name
 WHERE date_column BETWEEN start_date and end_date
  • specific record fetch
 SELECT column_list
 FROM table_name
 WHERE column1 = x and  column2 = y and column3 = z

For each query type, I generated queries for different date ranges. For each range, I generated a set of additional queries on adjacent dates. The first query for each range is cold, meaning that such range was hit for the first time. Subsequent queries on the same range are warm, meaning that the range was already at least partially cached.

The list of queries used for the test is on the Forge

Results

partitioned tables with primary key

Let’s start with the wrong approach first.

My first batch of tests used partitioned tables with a composite primary key, the same used in the original table. The total size of the PK was 5.5 GB. Rather than improving performance, PK slowed down the operations.

Queries with partitioned tables, burdened with the index search through a PK that can’t fit in RAM (4 GB) performed poorly. This is a lesson to be remembered. Partitions are useful, but they must be used in the right way.

+--------+-----------------+-----------------+-----------------+
| status | myisam unpart   |    myisam month |    myisam year  |
+--------+-----------------+-----------------+-----------------+
| cold   | 2.6574570285714 |       2.9169642 | 3.0373419714286 |
| warm   | 2.5720722571429 | 3.1249698285714 | 3.1294000571429 |
+--------+-----------------+-----------------+-----------------+

The solution was in front of my eyes, but I refused to see it at first. Look at the results for the same set of queries using the ARCHIVE storage engine.

+--------+----------------+-----------------+-----------------+
| status | archive unpart |   archive month |   archive year  |
+--------+----------------+-----------------+-----------------+
| cold   |     249.849563 | 1.2436211111111 | 12.632532527778 |
| warm   |     235.814442 | 1.0889786388889 | 12.600520777778 |
+--------+----------------+-----------------+-----------------+

The results for the table partitioned by month are better than the ones I got on the corresponding MyISAM table. More comments about this fact later.

partitioned tables without primary key

Since the performance on partitioned tables using primary keys was so bad, I decided to jump the gun, and get rid of the primary key.
The key factor is that the primary key for this table is larger than the available key buffer (and larger than the total available RAM in this case). Therefore, any search by key will use the disk.

The new approach was successful. Using partitions only, without primary keys, I got what I wanted. A significant improvement in performance. Tables partitioned by month get a 70 to 90% performance gain.

+--------+------------------+------------------+------------------+
| status |  myisam unpart   |     myisam month |     myisam year  |
+--------+------------------+------------------+------------------+
| cold   |  2.6864490285714 | 0.64206445714286 |  2.6343286285714 |
| warm   |  2.8157905714286 | 0.18774977142857 |  2.2084743714286 |
+--------+------------------+------------------+------------------+

To make the difference more visible, I tested with two massive queries that should take advantage of the partition pruning mechanism.

# query 1 -- aggregate by year
SELECT year(FlightDate) as y, count(*)
FROM flightstats
WHERE FlightDate BETWEEN  "2001-01-01" and "2003-12-31"
GROUP BY y
# query 2 -- aggregate by month
SELECT date_format(FlightDate,"%Y-%m") as m, count(*)
FROM flightstats
WHERE FlightDate BETWEEN "2001-01-01" and "2003-12-31"
GROUP BY m

The results show a performance gain of 30 to 60% for the table partitioned by month, and a gain of 15 to 30% for the table partitioned by year.

+----------+-----------+-----------+-----------+
| query_id | m_u       | m_m       | m_y       |
+----------+-----------+-----------+-----------+
|        1 | 97.779958 | 36.296519 | 82.327554 |
|        2 |  69.61055 | 47.644986 |  47.60223 |
+----------+-----------+-----------+-----------+

The processor factor

I had some trouble when I moved the test to a different server. The above tests were taken on my home desktop, which uses an Intel Dual Core 2.3 MHz CPU. The new server is much faster, with a dual Xeon 2.66 MHz.
Repeating the above tests, I got this surprise:

+--------+-------------------+-------------+-----------------+
| status |  myisam unpart    |myisam month |    myisam year  |
+--------+-------------------+-------------+-----------------+
| cold   | 0.051063428571429 |   0.6577062 | 1.6663527428571 |
| warm   | 0.063645485714286 |   0.1093724 | 1.2369152285714 |
+--------+-------------------+-------------+-----------------+

The original table, with primary key, is faster than the partitioned ones. The times for partitioned tables are the same I got on the slower server, but the performance of the original table has improved, making partitions unnecessary.

What to do? Since this server seems to take advantage of indexes so well, I added an index on the partitioning column to the partitioned tables.

# original table
create table flightstats (
AirlineID int not null,
UniqueCarrier char(3) not null,
Carrier char(3) not null,
FlightDate date not null,
FlightNum char(5) not null,
TailNum char(8) not null,
ArrDelay double not null,
ArrTime datetime not null,
DepDelay double not null,
DepTime datetime not null,
Origin char(3) not null,
Dest char(3) not null,
Distance int not null,
Cancelled char(1) default 'n',
primary key (FlightDate, AirlineID, Carrier, UniqueCarrier, FlightNum, Origin, DepTime, Dest)
)
# partitioned tables
create table flightstats (
AirlineID int not null,
UniqueCarrier char(3) not null,
Carrier char(3) not null,
FlightDate date not null,
FlightNum char(5) not null,
TailNum char(8) not null,
ArrDelay double not null,
ArrTime datetime not null,
DepDelay double not null,
DepTime datetime not null,
Origin char(3) not null,
Dest char(3) not null,
Distance int not null,
Cancelled char(1) default 'n',
KEY (FlightDate)
)
PARTITION BY RANGE ...

The results were then much more satisfactory, with a 35% performance gain.

+--------+-------------------+-------------------+-------------------+
| status | myisam unp        |     myisam month  |      myisam year  |
+--------+-------------------+-------------------+-------------------+
| cold   | 0.075289714285714 | 0.025491685714286 | 0.072398542857143 |
| warm   | 0.064401257142857 | 0.031563085714286 | 0.056638085714286 |
+--------+-------------------+-------------------+-------------------+

Lessons learned

Testing partitions has been a tiring experience. Gaining performance improvements is not a straightforward operation. What I assumed was a painless change turned out to be a long trial-and-error process.

There is no silver bullet

Applying a partition change to a table is no guarantee for performance improvement. The gain depends on several factors:

  • the column used for partitioning;
  • the function used for partitioning, where the native column type is not an integer;
  • the server speed;
  • the amount of RAM.

Nothing should be taken for granted.

Run benchmarks before applying changes to a production system

Depending on the usage of your database, you may get a huge performance gain, or nothing at all. You can also get a performance decrease, if you are not careful.
Consider this: a table partitioned by date with month intervals can get you dreamy performance gains if your queries always include a date range. It can be a full table scan if your queries don’t include a date range.

Archive tables can be an excellent compromise

Archive tables achieve a huge performance gain when partitioned. Again, it depends on your usage. Without partitioning, any query to an Archive table is a full table scan. If you have historical data that does not change and you need to perform statistical queries by range, the Archive engine is an excellent choice. It uses 10 to 20% of the original storage, and it can perform better than the original MyISAM or InnoDB table for aggregate queries.
Again, it is all a matter of benchmarks. A well tuned partitioned MyISAM table performs far better than a corresponding Archive table, but it needs ten times more storage.

Summing up

Partitioning is the key for performance gain with large databases. The definition of large depends on the hardware at your disposal.

Applying partitions blindly is no guarantee of achieving performance improvements, but with the aid of some preliminary benchmarks it can become your perfect solution.
Reference: amitshah.in)

Sun and MySQL: How It Stacks Up for Developers Saturday, Nov 1 2008 

Most business analysts have applauded the announced acquisition by Sun Microsystems of open-source database provider MySQL AB. Although the business case for bringing MySQL under the Sun umbrella is clear, the acquisition leaves many developers wondering just what is in store for them.

Developers want products that are easy to use during development, that are easy to scale to enterprise levels, and that will last long enough to repay the time investment required to master the technologies. Some MySQL community members wonder if they will continue to enjoy superior open-source products from the combined companies, and if the community will remain strong in the future.

MySQL, the Leading Open-Source Option

MySQL leads the open-source database market, and by every indication, the future of MySQL at Sun is with the open-source community. According to Mårten Mickos, CEO of MySQL, it was Sun’s commitment to free and open-source software that helped seal the deal. During a four-year courtship, which Jonathan Schwartz chronicles in his blog, it became apparent to the MySQL leadership that Sun’s historic commitment to open systems had extended to open-source software, and that Sun would be a home in which the MySQL community could thrive. Both Java technology and MySQL are licensed under GPL Version 2 with an eye toward GPL Version 3 as that license matures.

For MySQL developers, little will change. But the MySQL Enterprise customer can expect an even greater level of service and support, backed by a Fortune 500 company. And as MySQL is adopted more broadly, all users can expect faster product innovation, more rigorous testing, and a better MySQL product overall.

MySQL and the Sun Software Stack

Skeptics of Sun’s commitment to open source might fear being coerced, however subtly, into adopting other Sun products — in particular the Solaris OS, which is itself undergoing an open-source renaissance. Events at Sun, including partnerships with Intel, Microsoft, Oracle, and other competitors, indicate that Sun will adapt to the marketplace rather than try to enforce any type of software regime on developers.

The acronym LAMP (Linux operating system, Apache web server, MySQL database server, PHP/Perl/Python languages) describes the software stack favored by open-source developers. Each element in the stack has alternatives, both open source and proprietary, making LAMP not very useful as a descriptive acronym. For example, of the 50,000 daily downloads of MySQL, the majority of binaries are for Windows operating systems — for WAMP developers, perhaps? Similar numbers apply to other open-source tools and products sponsored by Sun, notably the GlassFish application server. While determined to compete aggressively for market share among LAMP developers, Sun is porting its tools and software to every popular platform.

Developers who are committed to Sun software may likewise be wondering if MySQL will supplant database alternatives such as PostgreSQL and JavaDB in its software stack. Again, Sun has voiced its intention of supporting any popular management software that developers want to insert into the database slot, including Apache Derby. In the words of Josh Berkus of the PostgreSQL core team, “Sun is not dropping PostgreSQL support. It’s Sun’s goal to be the number one data center vendor, not the number one MySQL vendor…. In the world of databases, there is no ‘one size fits all.’ “

If you’re already using MySQL in your web application software stack, you can expect MySQL to be the same free, community-based database you’ve been using, regardless of operating system, web server, or language. When you want to scale out for enterprise-level performance, Sun will provide the support and services you need. If you’re committed to another database, expect it to cooperate well with Sun products.

MySQL Details

If you haven’t tried MySQL or other Sun offerings in developing web applications, it may be time to reevaluate.

Although there is no such thing as a one-size-fits-all database, the popularity of MySQL shows just how close it has come to that ideal. Its adaptability and ease of use is largely due to its architecture and ability to scale. This architecture helps explain why MySQL is the choice for more than 11 million active installations worldwide with more than 100 million copies downloaded or distributed to date. At this writing, a new copy of MySQL is being downloaded every two seconds.

Scale-out Capabilities

The strategy that many MySQL users employ for success is known as scale-out. The term refers to an architectural design that separates and spreads a database’s workload across commodity hardware by replicating data across different physical servers. This strategy is in contrast to adding more CPU power, storage, and memory to a smaller number of monolithic servers — a strategy sometimes called scale-up. The scale-out strategy typically provides better fault tolerance and overall performance for the money, as evidenced by the many heavily trafficked MySQL web sites that have achieved high levels of uptime with this approach.

Figure 1 illustrates the scale-out architecture.

Scale-out Architecture

Figure 1. Scale-out Architecture
Click here for a larger image.

In a typical situation, users initiate sessions on customer-facing web or application servers. These servers send requests as required into the MySQL database servers through load balancers or other application logic that directs the requests. When a request causes a server to write into the database, the server replicates its data to other MySQL servers to keep them up-to-date for subsequent queries.

In high-availability, high-volume settings, a failure in any MySQL server is typically corrected by taking the failed server offline and swapping it for another server. In a few minutes, the database can be replicated on the new server, and query service can proceed as before. The low cost of commodity servers makes this swap-out strategy cost-effective.

Storage Engine Architecture

The storage engine architecture of MySQL makes it unique in the database world. Of special interest is the pluggable storage engines layer, as shown in Figure 2.

Storage Engine Architecture

Figure 2. Storage Engine Architecture
Click here for a larger image.

Above the storage engine layer, MySQL presents itself as any other database management system would. A group of connectors is available through which to query the database. A connection pool in the server provides authentication and manages threads, connections, memory, and caches. Modules in the server parse queries, optimize access paths, and so on.

Beneath this layer, multiple storage engines can be plugged into the architecture. The storage engines are the components of the database server that perform actions on the underlying data being maintained at the physical server level. The pluggable storage engine architecture enables a standard set of management and support services to apply to all underlying storage engines. The result is like having many databases under the umbrella of a single manager.

Some storage engines, such as those used for archiving, are nontransactional in nature. These storage engines can insert and read data very efficiently. Other storage engines are tuned for efficiency when transactional operations are required, and still others provide high availability through clustering. These storage engines avoid the natural overhead of proprietary systems, which typically must live with the transactional overhead of their table architectures even when it isn’t needed.

For example, a site that primarily needs to look up data can avoid all of the transactional overhead by using the MyISAM storage engine, which is the default engine for MySQL. The MyISAM engine offers high-speed query and insert capability, is nontransactional, provides table-level locking, and supports indexes. It is a good engine for traditional data warehouses.

The MySQL database server comes supplied with a group of internal storage engines. External storage engines are tuned to optimize performance for specific products and situations, and are supplied both by independent software vendors and by the MySQL community.

Having multiple storage engines provides you with the capability of multiple databases. If you migrate to MySQL from another database, you don’t need to change the way your applications interact with the database server — the SQL code is the same. But internal to the MySQL server, you can pick and choose storage engines that meet your application’s needs.

For example, if you discover that you need a transactional database, you can use a single statement to transform your database from a nontransactional system to a transactional one. If you then decide that you need a data warehouse, a single command can put you in possession of a powerful column-oriented database tuned for data retrieval. If your main activity is to capture and store data, the Archive storage engine is designed to efficiently handle large volumes of inserts and compress data to a small footprint.

High Availability

ISPs and mainstream businesses may find that the MySQL server’s replication feature is sufficient for their high availability needs.

To increase availability, customers can add storage area network (SAN) technology or Distributed Replicated Block Device (DRBD), a Linux kernel module from LINBIT that distributes block device replication across the network. The highest availability — “five nines,” or 99.999 percent — is provided by MySQL Cluster, which performs synchronous replication between data nodes, automatic failover, and replication between clusters.

The ability of MySQL to provide the highest levels of performance for the most heavily trafficked sites, combined with its ease of use in product development, helps account for its broad popularity.

If It’s Free, Why Pay for It?

Twenty-five percent of MySQL users are developers. In many small companies, especially startups, the developer is also the database administrator. As the company matures and volume and profits grow, key individuals discover that their time is better spent in activities other than database administration. High reliability and staffing expenses become increasingly important. At this point, purchasing MySQL Enterprise makes economic sense because of the monitoring and support capabilities it supplies.

Note that the community version of MySQL has features identical to the MySQL Enterprise version — although the Enterprise version undergoes more frequent updates. These features include the GUI management tools, the migration toolkits for migrating data from other servers, connectors for JDBC and ODBC, and so on. Access to documentation and user forums is also free to the community. You can develop for free with MySQL, and there will be no surprises when you begin to scale out your deployment and move to a paid subscription version of MySQL.

Conclusion

Embracing an open-source model for Java technology, the Solaris Operating System, and other properties has made Sun Microsystems the world’s largest corporate contributor to the open-source community. MySQL, already the world’s most popular open-source database, fills an important niche in Sun’s software stack. Backed by Sun’s reach and resources, MySQL is poised for even wider adoption. The result should benefit Sun, MySQL, the developer community, and enterprise customers both big and small.

(Reference: amitshah.in)