What’s New in MySQL 5.1

Partitioning and the Event Scheduler are two big features introduced in this version, and a significant amount of SQL optimization work has been done as well. In benchmark tests, MySQL 5.1 performs much better than 5.0: it can handle more transactions per second and, unlike 5.0, it does not deadlock with 256 threads.

The following features have been added to MySQL 5.1.

* Partitioning. This capability enables distributing portions of individual tables across a file system, according to rules which can be set when the table is created. In effect, different portions of a table are stored as separate tables in different locations, but from the user point of view, the partitioned table is still a single table. Syntactically, this implements a number of new extensions to the CREATE TABLE, ALTER TABLE, and EXPLAIN … SELECT statements. As of MySQL 5.1.6, queries against partitioned tables can take advantage of partition pruning. In some cases, this can result in query execution that is an order of magnitude faster than the same query against a non-partitioned version of the same table. See Chapter 18, Partitioning, for further information on this functionality. (Author: Mikael Ronström)

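As a rough sketch of the new syntax (the table, column, and partition names here are invented for illustration), a table partitioned by year might be created like this:

CREATE TABLE orders (
  order_id   INT NOT NULL,
  order_date DATE NOT NULL
)
PARTITION BY RANGE (YEAR(order_date)) (
  PARTITION p2006 VALUES LESS THAN (2007),
  PARTITION p2007 VALUES LESS THAN (2008),
  PARTITION pmax  VALUES LESS THAN MAXVALUE
);

-- with partition pruning (5.1.6+), a query such as
-- SELECT * FROM orders WHERE order_date BETWEEN '2007-01-01' AND '2007-12-31'
-- only needs to read partition p2007
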
* Row-based replication. Replication capabilities in MySQL originally were based on propagation of SQL statements from master to slave. This is called statement-based replication. As of MySQL 5.1.5, another basis for replication is available. This is called row-based replication. Instead of sending SQL statements to the slave, the master writes events to its binary log that indicate how individual table rows are affected. As of MySQL 5.1.8, a third option is available: mixed. This will use statement-based replication by default, and only switch to row-based replication in particular cases. See Section 16.1.2, “Replication Formats”. (Authors: Lars Thalmann, Guilhem Bichot, Mats Kindahl)

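The format is controlled by the binlog_format option/system variable; for example, a server could be switched to row-based logging at runtime like this (assuming sufficient privileges):

SET GLOBAL binlog_format = 'ROW';   -- other values: 'STATEMENT', 'MIXED'
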
* Plugin API. MySQL 5.1 adds support for a very flexible plugin API that enables loading and unloading of various components at runtime, without restarting the server. Although the work on this is not finished yet, plugin full-text parsers are a first step in this direction. This allows users to implement their own input filter on the indexed text, enabling full-text search capability on arbitrary data such as PDF files or other document formats. A pre-parser full-text plugin performs the actual parsing and extraction of the text and hands it over to the built-in MySQL full-text search. See Section 22.2, “The MySQL Plugin Interface”. (Author: Sergey Vojtovich)

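A plugin is typically loaded with INSTALL PLUGIN and then referenced by name; the plugin, library, and table names below are hypothetical placeholders, not a real shipped plugin:

-- load a full-text parser plugin from a shared library in the server's plugin directory
INSTALL PLUGIN my_parser SONAME 'my_parser.so';

-- use the parser when defining a FULLTEXT index (MyISAM is required for full-text search in 5.1)
CREATE TABLE documents (
  doc_id INT NOT NULL,
  body   TEXT,
  FULLTEXT INDEX ft_body (body) WITH PARSER my_parser
) ENGINE=MyISAM;
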
* Event scheduler. MySQL Events are tasks that run according to a schedule. When you create an event, you are creating a named database object containing one or more SQL statements to be executed at one or more regular intervals, beginning and ending at a specific date and time. Conceptually, this is similar to the idea of the Unix crontab (also known as a “cron job”) or the Windows Task Scheduler. See Section 19.4, “Using the Event Scheduler”. (Author: Andrey Hristov)

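A minimal sketch (the event and table names are invented for illustration):

-- purge stale rows once a day
CREATE EVENT purge_old_sessions
  ON SCHEDULE EVERY 1 DAY
  DO
    DELETE FROM sessions WHERE last_seen < NOW() - INTERVAL 30 DAY;

-- the scheduler itself must be switched on for events to run
SET GLOBAL event_scheduler = ON;
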
* Server log tables. Before MySQL 5.1, the server wrote general query log and slow query log entries to log files. As of MySQL 5.1, the server’s logging capabilities for these logs are more flexible. Log entries can be written to log files (as before) or to the general_log and slow_log tables in the mysql database. If logging is enabled, either or both destinations can be selected. The --log-output option controls the destination or destinations of log output. See Section 5.2.1, “Selecting General Query and Slow Query Log Output Destinations”. (Author: Petr Chardin)

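For example, the general query log could be sent to the mysql.general_log table at runtime (a sketch; the log itself must also be enabled):

SET GLOBAL log_output  = 'TABLE';   -- write to log tables instead of files
SET GLOBAL general_log = 'ON';      -- turn the general query log on
SELECT event_time, argument FROM mysql.general_log ORDER BY event_time DESC LIMIT 5;
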
* Upgrade program. The mysql_upgrade program (available as of MySQL 5.1.7) checks all existing tables for incompatibilities with the current version of MySQL Server and repairs them if necessary. This program should be run for each MySQL upgrade. See Section 4.4.8, “mysql_upgrade — Check Tables for MySQL Upgrade”. (Authors: Alexey Botchkov, Mikael Widenius)

* MySQL Cluster. MySQL Cluster is now released as a separate product, based on MySQL 5.1 but with the addition of the NDBCLUSTER storage engine. Clustering support is no longer available in mainline MySQL 5.1 releases. MySQL Cluster releases are identified by a 3-part NDB version number; currently, the MySQL Cluster NDB 6.2 and MySQL Cluster NDB 6.3 release series are available for production use.

Some of the changes in MySQL Cluster since MySQL 5.0 are listed here:
o MySQL Cluster replication. Replication between MySQL Clusters is now supported. It is now also possible to replicate between a MySQL Cluster and a non-cluster database. See Section 17.10, “MySQL Cluster Replication”.

o MySQL Cluster disk data storage. Formerly, the NDBCLUSTER storage engine was strictly in-memory; now, it is possible to store Cluster data (but not indexes) on disk. This allows MySQL Cluster to scale upward with lower hardware (RAM) requirements than previously. In addition, the Disk Data implementation includes a new “no-steal” restoration algorithm for fast node restarts when storing very large amounts of data (terabyte range). See Section 17.11, “MySQL Cluster Disk Data Tables”, for more information.

o Improved backups for MySQL Cluster. A fault arising in a single data node during a Cluster backup no longer causes the entire backup to be aborted, as occurred in previous versions of MySQL Cluster.

Many other new features and improvements have been made to the NDBCLUSTER storage engine in MySQL Cluster NDB 6.2 and MySQL Cluster NDB 6.3; for more information about these, see Section 17.14, “MySQL Cluster Development Roadmap”.
* Backup of tablespaces. The mysqldump utility now supports an option for dumping tablespaces. Use -Y or --all-tablespaces to enable this functionality.

* Improvements to INFORMATION_SCHEMA. MySQL 5.1 provides much more information in its metadata database than was available in MySQL 5.0. New tables in the INFORMATION_SCHEMA database include FILES, EVENTS, PARTITIONS, PROCESSLIST, ENGINES, and PLUGINS.

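For example, partition metadata can be queried directly (a sketch, reusing the hypothetical orders table from the partitioning example above):

SELECT PARTITION_NAME, TABLE_ROWS
FROM INFORMATION_SCHEMA.PARTITIONS
WHERE TABLE_SCHEMA = DATABASE() AND TABLE_NAME = 'orders';
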
* XML functions with XPath support. ExtractValue() returns the content of a fragment of XML matching a given XPath expression. UpdateXML() replaces the element selected from a fragment of XML by an XPath expression supplied by the user with a second XML fragment (also user-supplied), and returns the modified XML. See Section 11.10, “XML Functions”. (Author: Alexander Barkov)

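A quick illustration of both functions:

SELECT ExtractValue('<a><b>ccc</b></a>', '/a/b');                 -- returns 'ccc'
SELECT UpdateXML('<a><b>ccc</b></a>', '/a/b', '<b>ddd</b>');      -- returns '<a><b>ddd</b></a>'
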
* Load emulator. The mysqlslap program is designed to emulate client load for a MySQL server and report the timing of each stage. It works as if multiple clients were accessing the server. See Section 4.5.7, “mysqlslap — Load Emulation Client”. (Authors: Patrick Galbraith, Brian Aker)

 

SQL Index Types

Index Types

In addition to an index being clustered or nonclustered, it can be configured in other ways:

  • Composite index: An index that contains more than one column. In both SQL Server 2005 and 2008, you can include up to 16 columns in an index, as long as the index doesn’t exceed the 900-byte limit. Both clustered and nonclustered indexes can be composite indexes.
  • Unique index: An index that ensures the uniqueness of each value in the indexed column. If the index is a composite, the uniqueness is enforced across the columns as a whole, not on the individual columns. For example, if you were to create an index on the FirstName and LastName columns in a table, the names together must be unique, but the individual names can be duplicated.

A unique index is automatically created when you define a primary key or unique constraint:


    • Primary key: When you define a primary key constraint on one or more columns, SQL Server automatically creates a unique, clustered index if a clustered index does not already exist on the table or view. However, you can override the default behavior and define a unique, nonclustered index on the primary key.
    • Unique: When you define a unique constraint, SQL Server automatically creates a unique, nonclustered index. You can specify that a unique clustered index be created if a clustered index does not already exist on the table.
  • Covering index: A type of index that includes all the columns that are needed to process a particular query. For example, your query might retrieve the FirstName and LastName columns from a table, based on a value in the ContactID column. You can create a covering index that includes all three columns (see the sketch just below).
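
A minimal sketch of such a covering index (the dbo.Contacts table and the index name are hypothetical):

-- key on the search column, with the two retrieved columns also part of the index
CREATE NONCLUSTERED INDEX IX_Contacts_ContactID_Name
ON dbo.Contacts (ContactID, FirstName, LastName);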

Index Design

As beneficial as indexes can be, they must be designed carefully. Because they can take up significant disk space, you don’t want to implement more indexes than necessary. In addition, indexes are automatically updated when the data rows themselves are updated, which can lead to additional overhead and can affect performance. As a result, index design should take into account a number of considerations.

Database

As mentioned above, indexes can enhance performance because they can provide a quick way for the query engine to find data. However, you must also take into account whether and how much you’re going to be inserting, updating, and deleting data. When you modify data, the indexes must also be modified to reflect the changed data, which can significantly affect performance. You should consider the following guidelines when planning your indexing strategy:

  • For tables that are heavily updated, use as few columns as possible in the index, and don’t over-index the tables.
  • If a table contains a lot of data but data modifications are low, use as many indexes as necessary to improve query performance. However, use indexes judiciously on small tables because the query engine might take longer to navigate the index than to perform a table scan.
  • For clustered indexes, try to keep the length of the indexed columns as short as possible. Ideally, try to implement your clustered indexes on unique columns that do not permit null values. This is why the primary key is often used for the table’s clustered index, although query considerations should also be taken into account when determining which columns should participate in the clustered index.
  • The uniqueness of values in a column affects index performance. In general, the more duplicate values you have in a column, the more poorly the index performs. On the other hand, the more unique each value, the better the performance. When possible, implement unique indexes.
  • For composite indexes, take into consideration the order of the columns in the index definition. Columns that will be used in comparison expressions in the WHERE clause (such as WHERE FirstName = 'Charlie') should be listed first. Subsequent columns should be listed based on the uniqueness of their values, with the most unique listed first (see the sketch after this list).
  • You can also index computed columns if they meet certain requirements. For example, the expression used to generate the values must be deterministic (which means it always returns the same result for a specified set of inputs). For more details about indexing computed columns, see the topic “Creating Indexes on Computed Columns” in SQL Server Books Online.
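
For instance, following the column-order guideline above, an index supporting WHERE FirstName = 'Charlie' might be declared like this (the table and index names are hypothetical):

-- FirstName carries the equality predicate, so it leads;
-- LastName follows as the more selective of the remaining columns
CREATE NONCLUSTERED INDEX IX_Contacts_FirstName_LastName
ON dbo.Contacts (FirstName, LastName);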

Queries

Another consideration when setting up indexes is how the database will be queried. As mentioned above, you must take into account the frequency of data modifications. In addition, you should consider the following guidelines:

  • Try to insert or modify as many rows as possible in a single statement, rather than using multiple queries.
  • Create nonclustered indexes on columns used frequently in your statement’s predicates and join conditions.
  • Consider indexing columns used in exact-match queries.

 

How to Create an Index in a Database

One of the most important routes to high performance in a SQL Server database is the index. Indexes speed up the querying process by providing swift access to rows in the data tables, similarly to the way a book’s index helps you find information quickly within that book. In this article, I provide an overview of SQL Server indexes and explain how they’re defined within a database and how they can make the querying process faster. Most of this information applies to indexes in both SQL Server 2005 and 2008; the basic structure has changed little from one version to the next. In fact, much of the information also applies to SQL Server 2000. This does not mean there haven’t been changes. New functionality has been added with each successive version; however, the underlying structures have remained relatively the same. So for the sake of brevity, I stick with 2005 and 2008 and point out where there are differences in those two versions.

Index Structures

Indexes are created on columns in tables or views. The index provides a fast way to look up data based on the values within those columns. For example, if you create an index on the primary key and then search for a row of data based on one of the primary key values, SQL Server first finds that value in the index, and then uses the index to quickly locate the entire row of data. Without the index, a table scan would have to be performed in order to locate the row, which can have a significant effect on performance.

You can create indexes on most columns in a table or a view. The exceptions are primarily those columns configured with large object (LOB) data types, such as image, text, and varchar(max). You can also create indexes on XML columns, but those indexes are slightly different from the basic index and are beyond the scope of this article. Instead, I’ll focus on those indexes that are implemented most commonly in a SQL Server database.

An index is made up of a set of pages (index nodes) that are organized in a B-tree structure. This structure is hierarchical in nature, with the root node at the top of the hierarchy and the leaf nodes at the bottom, as shown in Figure 1.

Figure 1: B-tree structure of a SQL Server index (image: http://www.simple-talk.com/iwritefor/articlefiles/610-image002.jpg)

When a query is issued against an indexed column, the query engine starts at the root node and navigates down through the intermediate nodes, with each layer of the intermediate level more granular than the one above. The query engine continues down through the index nodes until it reaches the leaf node. For example, if you’re searching for the value 123 in an indexed column, the query engine would first look in the root level to determine which page to reference in the top intermediate level. In this example, the first page points to the values 1-100, and the second page to the values 101-200, so the query engine would go to the second page on that level. The query engine would then determine that it must go to the third page at the next intermediate level. From there, the query engine would navigate to the leaf node for value 123. The leaf node will contain either the entire row of data or a pointer to that row, depending on whether the index is clustered or nonclustered.

Clustered Indexes

A clustered index stores the actual data rows at the leaf level of the index. Returning to the example above, that would mean that the entire row of data associated with the primary key value of 123 would be stored in that leaf node. An important characteristic of the clustered index is that the indexed values are sorted in either ascending or descending order. As a result, there can be only one clustered index on a table or view. In addition, data in a table is sorted only if a clustered index has been defined on a table.

Note: A table that has a clustered index is referred to as a clustered table. A table that has no clustered index is referred to as a heap.
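
Creating a clustered index explicitly is straightforward (a sketch; the table, column, and index names are hypothetical):

-- only one clustered index is allowed per table, so choose its key carefully
CREATE CLUSTERED INDEX IX_Contacts_ContactID
ON dbo.Contacts (ContactID);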

Nonclustered Indexes

Unlike a clustered index, the leaf nodes of a nonclustered index contain only the values from the indexed columns and row locators that point to the actual data rows, rather than containing the data rows themselves. This means that the query engine must take an additional step in order to locate the actual data.

A row locator’s structure depends on whether it points to a clustered table or to a heap. If referencing a clustered table, the row locator points to the clustered index, using the value from the clustered index to navigate to the correct data row. If referencing a heap, the row locator points to the actual data row.

Nonclustered indexes cannot be sorted like clustered indexes; however, you can create more than one nonclustered index per table or view. SQL Server 2005 supports up to 249 nonclustered indexes, and SQL Server 2008 supports up to 999. This certainly doesn’t mean you should create that many indexes. Indexes can both help and hinder performance, as I explain later in the article.

In addition to being able to create multiple nonclustered indexes on a table or view, you can also add included columns to your index. This means that you can store at the leaf level not only the values from the indexed column, but also the values from non-indexed columns. This strategy allows you to get around some of the limitations on indexes. For example, you can include non-indexed columns in order to exceed the size limit of indexed columns (900 bytes in most cases).
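
A sketch of an index with included columns (names again hypothetical): the key stays narrow, while the wider columns are stored only at the leaf level.

CREATE NONCLUSTERED INDEX IX_Contacts_ContactID_Incl
ON dbo.Contacts (ContactID)
INCLUDE (FirstName, LastName);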

SQL JOINS

 

The SQL JOIN clause is used whenever we have to select data from 2 or more tables.

To be able to use SQL JOIN clause to extract data from 2 (or more) tables, we need a relationship between certain columns in these tables.

We are going to illustrate our SQL JOIN example with the following 2 tables:

Customers:

CustomerID | FirstName | LastName | Email                 | DOB        | Phone
1          | John      | Smith    | John.Smith@yahoo.com  | 2/4/1968   | 626 222-2222
2          | Steven    | Goldfish | goldfish@fishhere.net | 4/4/1974   | 323 455-4545
3          | Paula     | Brown    | pb@herowndomain.org   | 5/24/1978  | 416 323-3232
4          | James     | Smith    | jim@supergig.co.uk    | 20/10/1980 | 416 323-8888

Sales:

CustomerID | Date      | SaleAmount
2          | 5/6/2004  | $100.22
1          | 5/7/2004  | $99.95
3          | 5/7/2004  | $122.95
3          | 5/13/2004 | $100.00
4          | 5/22/2004 | $555.55

As you can see, those two tables have a common field called CustomerID, and thanks to that we can extract information from both tables by matching their CustomerID columns.

Consider the following SQL statement:

SELECT Customers.FirstName, Customers.LastName, SUM(Sales.SaleAmount) AS SalesPerCustomer
FROM Customers, Sales
WHERE Customers.CustomerID = Sales.CustomerID
GROUP BY Customers.FirstName, Customers.LastName

The SQL expression above will select all distinct customers (their first and last names) and the total respective amount of dollars they have spent.
The SQL JOIN condition has been specified after the SQL WHERE clause and says that the 2 tables have to be matched by their respective CustomerID columns.

Here is the result of this SQL statement:

FirstName | LastName | SalesPerCustomer
John      | Smith    | $99.95
Steven    | Goldfish | $100.22
Paula     | Brown    | $222.95
James     | Smith    | $555.55

The SQL statement above can be re-written using the SQL JOIN clause like this:

SELECT Customers.FirstName, Customers.LastName, SUM(Sales.SaleAmount) AS SalesPerCustomer
FROM Customers JOIN Sales
ON Customers.CustomerID = Sales.CustomerID
GROUP BY Customers.FirstName, Customers.LastName

There are 2 types of SQL JOINS – INNER JOINS and OUTER JOINS. If you don’t put INNER or OUTER keywords in front of the SQL JOIN keyword, then INNER JOIN is used. In short “INNER JOIN” = “JOIN” (note that different databases have different syntax for their JOIN clauses).

The INNER JOIN will select all rows from both tables as long as there is a match between the columns we are matching on. In case we have a customer in the Customers table, which still hasn’t made any orders (there are no entries for this customer in the Sales table), this customer will not be listed in the result of our SQL query above.

If the Sales table has the following rows:

CustomerID | Date     | SaleAmount
2          | 5/6/2004 | $100.22
1          | 5/6/2004 | $99.95

And we use the same SQL JOIN statement from above:

SELECT Customers.FirstName, Customers.LastName, SUM(Sales.SaleAmount) AS SalesPerCustomer
FROM Customers JOIN Sales
ON Customers.CustomerID = Sales.CustomerID
GROUP BY Customers.FirstName, Customers.LastName

We’ll get the following result:

FirstName | LastName | SalesPerCustomer
John      | Smith    | $99.95
Steven    | Goldfish | $100.22

Even though Paula and James are listed as customers in the Customers table, they won’t be displayed because they haven’t purchased anything yet.

But what if you want to display all the customers and their sales, no matter if they have ordered something or not? We’ll do that with the help of SQL OUTER JOIN clause.

The second type of SQL JOIN is called SQL OUTER JOIN and it has 2 sub-types called LEFT OUTER JOIN and RIGHT OUTER JOIN.

The LEFT OUTER JOIN or simply LEFT JOIN (you can omit the OUTER keyword in most databases), selects all the rows from the first table listed after the FROM clause, no matter if they have matches in the second table.

If we slightly modify our last SQL statement to:

SELECT Customers.FirstName, Customers.LastName, SUM(Sales.SaleAmount) AS SalesPerCustomer
FROM Customers LEFT JOIN Sales
ON Customers.CustomerID = Sales.CustomerID
GROUP BY Customers.FirstName, Customers.LastName

and the Sales table still has the following rows:

CustomerID | Date     | SaleAmount
2          | 5/6/2004 | $100.22
1          | 5/6/2004 | $99.95

The result will be the following:

FirstName | LastName | SalesPerCustomer
John      | Smith    | $99.95
Steven    | Goldfish | $100.22
Paula     | Brown    | NULL
James     | Smith    | NULL

As you can see, we have selected everything from Customers (the first table). For all rows from Customers which don’t have a match in Sales (the second table), the SalesPerCustomer column contains NULL (NULL means the column holds no value).

The RIGHT OUTER JOIN or just RIGHT JOIN behaves exactly as SQL LEFT JOIN, except that it returns all rows from the second table (the right table in our SQL JOIN statement).
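
For completeness, here is the same query written with a RIGHT JOIN:

SELECT Customers.FirstName, Customers.LastName, SUM(Sales.SaleAmount) AS SalesPerCustomer
FROM Customers RIGHT JOIN Sales
ON Customers.CustomerID = Sales.CustomerID
GROUP BY Customers.FirstName, Customers.LastName

With the two-row Sales table above, every sale has a matching customer, so this particular query returns the same rows as the INNER JOIN; the difference would only show up if Sales contained a CustomerID with no match in Customers.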

For more on SQL joins, see http://www.devshed.com/c/a/MySQL/Understanding-SQL-Joins/

 

 

 

Get Row Count in MySQL

 

FOUND_ROWS()

A SELECT statement may include a LIMIT clause to restrict the number of rows the server returns to the client. In some cases, it is desirable to know how many rows the statement would have returned without the LIMIT, but without running the statement again. To obtain this row count, include a SQL_CALC_FOUND_ROWS option in the SELECT statement, and then invoke FOUND_ROWS() afterward:

mysql> SELECT SQL_CALC_FOUND_ROWS * FROM tbl_name

    -> WHERE id > 100 LIMIT 10;

mysql> SELECT FOUND_ROWS();

The second SELECT returns a number indicating how many rows the first SELECT would have returned had it been written without the LIMIT clause.

In the absence of the SQL_CALC_FOUND_ROWS option in the most recent successful SELECT statement, FOUND_ROWS() returns the number of rows in the result set returned by that statement. If the statement includes a LIMIT clause, FOUND_ROWS() returns the number of rows up to the limit. For example, FOUND_ROWS() returns 10 or 60, respectively, if the statement includes LIMIT 10 or LIMIT 50, 10.

The row count available through FOUND_ROWS() is transient and not intended to be available past the statement following the SELECT SQL_CALC_FOUND_ROWS statement. If you need to refer to the value later, save it:

mysql> SELECT SQL_CALC_FOUND_ROWS * FROM … ;

mysql> SET @rows = FOUND_ROWS();

If you are using SELECT SQL_CALC_FOUND_ROWS, MySQL must calculate how many rows are in the full result set. However, this is faster than running the query again without LIMIT, because the result set need not be sent to the client.

SQL_CALC_FOUND_ROWS and FOUND_ROWS() can be useful in situations when you want to restrict the number of rows that a query returns, but also determine the number of rows in the full result set without running the query again. An example is a Web script that presents a paged display containing links to the pages that show other sections of a search result. Using FOUND_ROWS() allows you to determine how many other pages are needed for the rest of the result.
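
For example, reusing the earlier query, the total number of pages at ten rows per page could be computed like this (a sketch):

mysql> SELECT SQL_CALC_FOUND_ROWS * FROM tbl_name WHERE id > 100 LIMIT 0, 10;
mysql> SET @total = FOUND_ROWS();
mysql> SELECT CEIL(@total / 10) AS total_pages;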

The use of SQL_CALC_FOUND_ROWS and FOUND_ROWS() is more complex for UNION statements than for simple SELECT statements, because LIMIT may occur at multiple places in a UNION. It may be applied to individual SELECT statements in the UNION, or global to the UNION result as a whole.

The intent of SQL_CALC_FOUND_ROWS for UNION is that it should return the row count that would be returned without a global LIMIT. The conditions for use of SQL_CALC_FOUND_ROWS with UNION are:

  • The SQL_CALC_FOUND_ROWS keyword must appear in the first SELECT of the UNION.
  • The value of FOUND_ROWS() is exact only if UNION ALL is used. If UNION without ALL is used, duplicate removal occurs and the value of FOUND_ROWS() is only approximate.
  • If no LIMIT is present in the UNION, SQL_CALC_FOUND_ROWS is ignored and FOUND_ROWS() returns the number of rows in the temporary table that is created to process the UNION.
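
A sketch of the UNION case (t1 and t2 are hypothetical tables); because UNION ALL is used, the count is exact:

mysql> SELECT SQL_CALC_FOUND_ROWS id FROM t1
    -> UNION ALL
    -> SELECT id FROM t2
    -> LIMIT 5;
mysql> SELECT FOUND_ROWS();   -- rows the UNION would have returned without the trailing LIMIT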

Beyond the cases described here, the behavior of FOUND_ROWS() is undefined (for example, its value following a SELECT statement that fails with an error).

Important

FOUND_ROWS() is not replicated reliably, and should not be used with databases that are to be replicated.

 

Temporary Tables With MySQL

If you have had problems with MySQL not supporting functionality like subselects, this short article might be of interest to you. This article will explain how to use temporary tables with MySQL.

The juice

The SQL code below shows you how you can create a temporary table.

CREATE TEMPORARY TABLE TempTable ( ID int, Name char(100) ) TYPE=HEAP;
INSERT INTO TempTable VALUES( 1, 'Foo bar' );
SELECT * FROM TempTable;
DROP TABLE TempTable;

If you’re using a MySQL version older than 3.23, you should use the code below.

CREATE TABLE TempTable ( ID int, Name char(100) ) TYPE=HEAP;
LOCK TABLE TempTable WRITE;
INSERT INTO TempTable VALUES( 1, 'Foo bar' );
SELECT * FROM TempTable;
DROP TABLE TempTable;
UNLOCK TABLES;

Temporary tables

Temporary tables are per connection, so heavily loaded sites will not have a problem with using the same table name, since it is unique per connection. If a non-temporary table with the same name already exists, that table is hidden until the temporary table is deleted. Temporary tables were added in MySQL version 3.23.

If you use an older version of MySQL than 3.23 you can’t use temporary tables, but you can use heap tables.

If the connection to the database is lost, the temporary table is automatically deleted.
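
As a sketch of the kind of problem mentioned in the introduction, a temporary table can stand in for a subselect (the Sales and Customers tables here are hypothetical):

-- customers who have spent more than 100 in total, without using a subselect
CREATE TEMPORARY TABLE BigSpenders ( CustomerID int, Total decimal(10,2) ) TYPE=HEAP;
INSERT INTO BigSpenders
  SELECT CustomerID, SUM(SaleAmount) FROM Sales
  GROUP BY CustomerID HAVING SUM(SaleAmount) > 100;
SELECT Customers.FirstName, Customers.LastName
FROM Customers, BigSpenders
WHERE Customers.CustomerID = BigSpenders.CustomerID;
DROP TABLE BigSpenders;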

Heap tables

Heap tables are shared between all connections. Therefore you must lock the heap table you are using as a temporary table, to ensure that no other connection corrupts the data in it.

Since we need speed and the tables are temporary we create them as heap tables. This means that the tables are stored in memory and not on disk. Of course the data will be lost if the power is lost, but the kind of information stored in a heap table should be temporary data used for further queries.

To free memory, always remember to drop heap tables when you don’t need them anymore, or at least delete the rows that are no longer of interest. Heap tables can take up a lot of memory if you don’t clean up after you’re done.

Heap tables are shared between all clients (just like any other table), so use them with care.

Using the LOCK TABLE mechanism

Since table locks work across connections, they can be used to mimic the behaviour of temporary tables. Although this is not very efficient on servers with many connections, it works.

The lock table mechanism ensures that no other connection can write to the table, so you don’t get corrupted data in the table while you are using it.

If the connection to the database is lost, the locks are automatically released.

 

Get Age Using MySQL Functions

Two ways to compute an age from a birth date (here '2004-01-02'):

 

-- Option 1: subtract one year when the birthday has not yet occurred this year
SELECT (YEAR(CURRENT_DATE()) - YEAR('2004-01-02') -
        (UNIX_TIMESTAMP(CURRENT_DATE()) < UNIX_TIMESTAMP(
          CONCAT(YEAR(CURRENT_DATE()), '-', MONTH('2004-01-02'), '-', DAYOFMONTH('2004-01-02'))
        ))) AS `age`;

 

-- Option 2: shift NOW() back by the birth date's TO_DAYS() value; the year of the resulting date gives the age
SELECT YEAR(DATE_SUB(NOW(), INTERVAL TO_DAYS('2004-01-02') DAY)) AS age;
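
On MySQL 5.0 and later, the age can also be computed more directly with TIMESTAMPDIFF():

SELECT TIMESTAMPDIFF(YEAR, '2004-01-02', CURDATE()) AS age;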

MySQL 5.x: passing limits as stored procedure / function parameters

 

MySQL 5.x (at least up to 5.0.15) does not allow using variables or procedure formal parameters with LIMIT. Here is a workaround:

CREATE PROCEDURE sp (
  IN LimitStart_ INT,
  IN LimitCnt_ INT
)
BEGIN
  -- build the LIMIT clause as a string and run the query as a prepared statement,
  -- since LIMIT itself cannot take routine parameters here
  SET @lim = CONCAT(' LIMIT ', LimitStart_, ',', LimitCnt_);
  SET @q = 'SELECT mycol FROM mytable';

  SET @q = CONCAT(@q, @lim);
  PREPARE st FROM @q;
  EXECUTE st;
  DEALLOCATE PREPARE st;
END;
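
If the procedure is created from the mysql command-line client, the statement delimiter has to be changed around the CREATE PROCEDURE body (for example with DELIMITER //). Calling it might then look like this (offsets chosen arbitrarily):

CALL sp(10, 10);   -- runs SELECT mycol FROM mytable LIMIT 10,10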