12 Dec 2009

Oracle Performance Tuning Part 1: Using Full Table Scans

It is a common misconception that all SQL queries on all tables in Oracle databases should be index-driven. In fact, full table scans can improve performance in two scenarios: when querying very small tables and when querying very large tables.

The Effect of Full Table Scans When Querying Small Tables

Let’s suppose you’re using your Oracle database to run an HR application designed and built in-house. Consider a reference table such as a list of department ids and the associated department names. Even a large company is likely to have only a few departments (HR, Sales, Marketing, Finance, IT), so the table is going to be quite small.

Now let’s suppose the table has just 2 columns, department id and department name, with an index on the department id. To find the department name for a given id via the index, we would have to read the index and then read the table. But because the table is so small, and because Oracle reads multiple database blocks in one read operation during a full table scan, the whole table can be scanned in just one read. However efficient the index, using it means performing unnecessary I/O.
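The comparison above can be sketched as a count of read operations. This is a simplified model rather than Oracle's actual cost calculation; the function names, the 4-block table size, the 8-block multi-block read and the single-level index are all illustrative assumptions.

```python
import math


def full_scan_reads(table_blocks: int, multiblock_read_count: int) -> int:
    """Read operations for a full scan when each read fetches several blocks."""
    return math.ceil(table_blocks / multiblock_read_count)


def index_lookup_reads(index_height: int) -> int:
    """Read operations for one indexed lookup: one per index level, plus the table block."""
    return index_height + 1


# Hypothetical figures: a 4-block departments table, 8 blocks fetched per read,
# and an index small enough to have a single level.
print(full_scan_reads(4, 8))   # 1
print(index_lookup_reads(1))   # 2
```

Under these assumptions the full scan needs one read while the indexed lookup needs two, which is the unnecessary I/O described above.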

In this case, therefore, a full table scan is faster than an index scan and table lookup. The exception, of course, is when the table has been created as an index-organised table (also called an index-only table, available since Oracle 8), which means that the whole table is stored in a B-tree structure (although you may have pointers to overflow areas).

The Effect of Full Table Scans When Querying Very Large Tables

Let’s look at using this technique for querying very large tables in your Oracle database. Surely they should use an index? Otherwise you might have to read thousands of blocks. It is correct to say that a full table scan of a very large table could read many thousands of data blocks, but as we shall see it may be better to do this than to perform an index scan and table lookup.

A full table scan is very likely to perform better than an index scan and table lookup when you are retrieving 10% or more of the data in the table, and it may perform better even when you are retrieving as little as 1% of the table data. Of course, if you only want to retrieve one row from the table, then you would want to use an index.

Index Scan And Table Lookup Vs. Full Table Scan For Very Large Tables in Oracle Databases

Let’s look at the 2 scenarios then – retrieving 10% of the table data by index scan and table lookup vs. full table scan.

To make the maths easy, assume our table has 10,000,000 rows with 10 data rows per block and 100 index entries per block. 10% of the table is then 1,000,000 rows. Therefore, to read 10% of the table via an index scan and table lookup, we would have to read 10,000 (1,000,000/100) index blocks plus 100,000 (1,000,000/10) data blocks. That’s 110,000 blocks in total.

However, this assumes that the data is stored in index order, which means that we only retrieve the blocks of data that we want. If the data is not sorted, then the worst case is that we would have to read 1 block for each row of data, i.e. 1,000,000 blocks, which would give us a worst-case total of 1,010,000 blocks.

For a full table scan the maths is easy: (10,000,000 rows)/(10 rows per block) = 1,000,000 blocks. This is less than the worst case scenario for an indexed read, but more than the best case scenario for an indexed read. This would seem to suggest that if you sort your data before loading, an indexed read would be faster than a full-table scan.
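The arithmetic for the three cases can be reproduced with a short sketch. It uses only the figures from the example above; the function and parameter names are illustrative, and the worst case follows the article's model of one block read per row retrieved.

```python
ROWS = 10_000_000
ROWS_PER_BLOCK = 10
INDEX_ENTRIES_PER_BLOCK = 100
PERCENT_WANTED = 10


def indexed_read_blocks(sorted_data: bool) -> int:
    """Blocks read by an index scan plus table lookup for PERCENT_WANTED of the rows."""
    rows_wanted = ROWS * PERCENT_WANTED // 100
    index_blocks = rows_wanted // INDEX_ENTRIES_PER_BLOCK
    if sorted_data:
        # Best case: wanted rows are packed together, 10 to a block.
        data_blocks = rows_wanted // ROWS_PER_BLOCK
    else:
        # Worst case in this model: every wanted row costs its own block read.
        data_blocks = rows_wanted
    return index_blocks + data_blocks


def full_scan_blocks() -> int:
    """Blocks read by a full table scan: every block in the table."""
    return ROWS // ROWS_PER_BLOCK


print(indexed_read_blocks(sorted_data=True))    # 110000
print(indexed_read_blocks(sorted_data=False))   # 1010000
print(full_scan_blocks())                       # 1000000
```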

Whilst it is true that pre-sorting the data of a very large table will improve performance, it is not necessarily correct that the read via the index will be better overall. We also need to take into account what happens to the blocks stored in the buffer cache of the Oracle SGA and the impact this will have on other users of the database.

Let’s look at the effect on the buffer cache of reading many data blocks via an index. When we do an indexed read, Oracle keeps the data and index blocks in the buffer cache for reuse by other queries by placing them at the most recently used end of the LRU (least recently used) list. However, data blocks read by a full table scan of a large table are placed at the least recently used end of the list, so they are quickly aged out of the buffer cache.

What this means is that the large number of index and table blocks (up to 1,010,000) read via the indexed read of our table will be kept in the buffer cache, flushing practically all other data from it, which will obviously have an effect on other users.

Conversely, when a full table scan is performed only the last blocks read are held in the SGA (the actual number is determined by the multi-block read count) so the impact on other users of the database would be minimal.
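The difference in cache impact can be illustrated with a toy simulation. This is not Oracle's actual buffer cache algorithm, which is considerably more sophisticated; the class, the cache size of 100 blocks and the 1,000-block query are all assumptions chosen only to show the effect of inserting blocks at opposite ends of an LRU list.

```python
from collections import deque


class ToyBufferCache:
    """Toy cache: left end of the deque is the LRU end, right end is the MRU end."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = deque()

    def _make_room(self):
        # Evict from the LRU end when the cache is full.
        if len(self.blocks) >= self.capacity:
            self.blocks.popleft()

    def read_via_index(self, block):
        # Indexed reads put the block at the MRU end, so it stays longest.
        if block in self.blocks:
            self.blocks.remove(block)
        else:
            self._make_room()
        self.blocks.append(block)

    def read_via_full_scan(self, block):
        # Full-scan reads put the block at the LRU end, so it is evicted first.
        if block not in self.blocks:
            self._make_room()
            self.blocks.appendleft(block)


def hot_blocks_left(use_full_scan, capacity=100, big_table_blocks=1000):
    """Count other users' blocks still cached after our large query runs."""
    cache = ToyBufferCache(capacity)
    for i in range(capacity):            # other users' frequently used blocks
        cache.read_via_index(("hot", i))
    for i in range(big_table_blocks):    # our large query
        if use_full_scan:
            cache.read_via_full_scan(("big", i))
        else:
            cache.read_via_index(("big", i))
    return sum(1 for kind, _ in cache.blocks if kind == "hot")


print(hot_blocks_left(use_full_scan=False))  # 0
print(hot_blocks_left(use_full_scan=True))   # 99
```

In this toy model the indexed read pushes every one of the other users' blocks out of the cache, while the full scan only ever occupies a single slot at the LRU end.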

Summary

To decide whether or not a full table scan would be better than an indexed read for a large table, you need to consider what proportion of the table the query will retrieve from your Oracle database and the likely effect of that on other users. The denser the data, the more efficient a full table scan is for very large tables, but generally if you’re reading more than 1-10% of a very large table, a full table scan would be more efficient than an index scan and table lookup.
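As a rough illustration of how data density moves the break-even point, the sketch below uses the same simplified worst-case model as the 10,000,000-row example (one data block read per row retrieved, 100 index entries per block). The function name and the densities tried are assumptions, and the model ignores multi-block reads and caching, so treat the figures as indicative only.

```python
def break_even_percent(rows_per_block: int, index_entries_per_block: int = 100) -> float:
    """Percentage of rows at which a worst-case indexed read costs as many
    block reads as a full table scan.

    Indexed read of fraction f of N rows: f*N/e index blocks + f*N data blocks.
    Full table scan: N/r blocks.
    These are equal when f = 1 / (r * (1 + 1/e)).
    """
    fraction = 1 / (rows_per_block * (1 + 1 / index_entries_per_block))
    return 100 * fraction


for rows_per_block in (10, 50, 100):
    print(rows_per_block, round(break_even_percent(rows_per_block), 2))
# 10 rows per block  -> 9.9
# 50 rows per block  -> 1.98
# 100 rows per block -> 0.99
```

Under these assumptions the break-even point falls from roughly 10% to roughly 1% as the number of rows per block rises from 10 to 100, which is consistent with the 1-10% range quoted above.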

For small tables, you will get much better performance from your Oracle database by caching the table in its entirety (so that it is always in memory), or by using an index-organised table, than you will by relying on indexes. Having said that, every table should have a primary key index to guarantee uniqueness, but you don’t have to use it.

25 Jun 2009

Reasons to Migrate from Microsoft Access to MySQL

Use of MySQL as a storage manager for Access offers several benefits. One is that you can use your information in additional ways when it’s not locked into Access. Other differences pertain more specifically to the case where you intend to continue using Access as the user interface to your information.

Deployment of information. When your information resides in MySQL, you’re free to continue using it from Access if you wish, but a number of other possibilities open up as well. Any kind of MySQL client can use the information, not just Access. This allows your data to be exploited more fully in more contexts, and by more people. For example, other people can use the data through the standard MySQL client programs or from GUI-based applications. Your database also becomes more accessible over the Web. Access now provides some capabilities for making a database available on the Web, but if MySQL manages the database, you have a wider range of options. MySQL integrates easily with Web servers like Apache through any of a number of languages, such as Perl, PHP, Python, Java, and Ruby. This allows you to provide a Web interface to your database with the language of your choice. In addition, the interface can be accessed by browsers on many types of machines, providing a platform-independent entryway to your information. All of these components can be obtained for free–MySQL, Apache, and the languages just mentioned have been released as Open Source. You can also obtain them in packages that include support.

Multiple-user access. Although Access provides some data sharing capabilities, that is not really its strength. It has the feel of a single-user data manager designed for local use. MySQL, on the other hand, easily handles many simultaneous users. It was designed from the ground up to run in a networked environment and to be a multiple-user system that is capable of servicing large numbers of clients.

Management of large databases. MySQL can manage hundreds of megabytes of data, and more. Care to try that with Access?

Security. When Access tables are stored locally, anyone can walk up to your Windows machine, launch Access, and gain access to your tables. It’s possible to assign a database a password, but many people routinely neglect to do so. When your tables are stored in MySQL, the MySQL server manages security. Anyone attempting to access your data must know the proper user name and password for connecting to MySQL.

Backup management. If you work in an organization that supports many Access users, migrating data to MySQL provides a benefit for backups and data integrity. With Access databases centralized in MySQL, they’re all backed up using the regular MySQL backup procedures that already exist at your site. If individual Access users each store their data locally, backup can be more complicated: 50 users means 50 database backups. While some sites address this problem through the use of network backups, others deal with it by making backups the responsibility of individual machine owners–which unfortunately sometimes means no backups at all.

Local disk storage requirements. Local Access database files become smaller, because the contents of tables are not stored internally; instead, the tables are stored as links to the MySQL server where they reside. This results in reduced local disk usage. And, should you wish to distribute a database, less information need be copied. (Of course, anyone you distribute the database to also must have access to the MySQL server.)

Cost. MySQL can be obtained for free. Access cannot. Providing other means of using your database (such as through a Web interface) can reduce your dependence on proprietary software and lower your software acquisition and licensing costs.

Hardware choices. MySQL runs on several platforms; Access is a single-platform application. If you want to use Access, your choice of hardware is determined for you.