2011-06: Unused index analysis

 

Building on what we have read over the last months, there is now a methodology available for checking if all of your indexes are actually being used as you think they should be.

 

Check the LASTUSED field in the RTS tables

First, you can simply check the LASTUSED field in the RTS tables, which gives you a good pointer to “used” or “not used”. But beware if the index is Unique and is there purely to stop duplicate inserts, then this style of usage is *not* recorded in the RTS!

You should have full EXPLAIN data for production Static SQL and you should  also trap and run snapshot EXPLAINs of all the Dynamic SQL. Over time, you  can easily build up a usage table to show which indices are being actively used by the SQL running on your machine.

Now all you have to do is work out which indices are:

  1. Marked as not used in the RTS
  2. Not listed in any EXPLAIN table (for dynamic and static)

 

Analyze candidate indexes in depth

This list of candidate indexes then needs to be analyzed in depth. The way I do it, is as follows:

– Pick an index (start with the largest, longest nastiest index(s), where the payback is good or where you have lots of unused indexes on a give table, where the INSERT and DELETE ratio is relatively high).

– Then find the table that it is indexing. Use this as a “driver” for the following EXPLAINS. Use EXPLAIN to get all the access paths that use this table, (this captures non-use of existing indexes), and store into a temporary work table. Use DSN_VIRTUAL_INDEXES to mark our candidate index(s) as DROPPED, and then re-run the EXPLAINS into a new table.
Compare the access paths.

If there is no difference in access path, then you can be 95% sure that the index can indeed be dropped. Now all you must do, is make sure it is either non-unique, or that it is not being used as a duplication stopper!

If all is ok, drop the index and monitor the table usage for a week or two.

 

Repeat the entire procedure every six months or so…

 

What is important to stress here, is that some indexes – even though “not  used” – are actually used by the DB2 optimizer to decide access path. This  is because the SYSCOLDIST is based on the table and not the index, so any  access path choice can use the filter factors and cardinality data in the table. Virtual Indexes does not stop this usage as the data has not been deleted from SYSCOLDIST. This means, that to get that last 5% certainty, you must do the EXPLAINS and the compare *again* after physically dropping the index.

I hope that this little bit of data was of interest and/or of use and, as always, comments or questions are more than welcome!
TTFN,
Roy Boxwell
Senior Architect