Every now and again, the DBAs of the world meet with the developers of the world and try and bang some heads together…I already wrote a newsletter all about existence checks which has changed and evolved over the years and now I wish to investigate another age old problem “Is Db2 there?”
Newsletter 2012-10: Existence check SQL – Through the ages
Reader test results: see the end of this newsletter for SELECT queries tested by one of our readers on Db2 10 and DB2 11
Why?
First up, is “Why?” – Why would you want to know that? Well think about our wonderful agile world these days. The JAVA developers do not even know what Db2 *is* and so when they write/script/generate/get generated for them SQL (Delete whatever is not appropriate), they have no idea *where* the code will run. To be fair, they should not really need to…the idea these days is that your back end can be anywhere and on any hardware platform.
So, wouldn’t it be cool to be able to tell the application that the required database is “not currently available” – This is when the “Is Db2 there?” SQL question, and all of its problems, was born.
Problems
How to do this ?
How often to do this ?
Do you need to do this ?
Haven’t you got better things to do than this?
1 – How to do this?
The first problem was solved when a developer wrote this SQL:
SELECT * FROM SYSIBM.SYSDUMMY1
PREPAREd, OPENed, FETCHed it and CLOSEd it. Now remember, with dynamic SQL, everything must be in cursors so you really did have to define a cursor, open the cursor, fetch the cursor and, if you are being a good coder, close the cursor. If all of this was successful, then you “knew” that Db2 on z/OS was up and running and available for all your further SQL requests.
This was ok when there was not so much dynamic SQL going on (Before Db2 V8), but naturally it started getting a bit out of hand very quickly.
The SQL then changed a little bit to be:
SELECT COUNT(*) FROM SYSIBM.SYSDUMMY1
The logic here was it could only return one row/value back and so made the network traffic less. Naturally Db2 went off and counted the single row in that table using a tablespace scan *every* time…
This slowly became a small, but very noticeable, overhead with 1,000,000’s of getpages against the smallest Db2 Catalog table ever invented…
Now Terry Purcell and the optimizer crew came up with the PRUNED access plan in Db2 9:
PRUNED
Db2 does not generate an access path for the query because the query is guaranteed to qualify zero rows, such as the case of an always-false WHERE clause.
For example:
WHERE 0 = 1
This happens when a “leg” of an SQL query cannot ever be true. In this case, the complete leg of the SQL query is thrown away. With this advance in optimizer logic the predicate:
WHERE 1 = 0
Actually became really useful as it meant that the *entire* SQL was only run through the optimizer and the SQLCODE was set to +100 if doing a FETCH with the * style SQL or the value of your host variable was set to zero if using the COUNT(*) style. This meant that the getpages for SYSIBM.SYSDUMMY1 dropped down to zero – which is a very very good thing
IBM have been improving what PRUNED can do ever since, and Terry Purcell wrote :
IBM have been improving what PRUNED can do ever since, and Terry Purcell wrote :
“Another benefit of column pruning is that it can provide extra table pruning opportunities. In Db2 10, an enhancement was delivered to prune outer join tables if no columns were required from that table and it was guaranteed that the table would not introduce duplicates. In Db2 12, this pruning enhancement is extended to outer joins to views/table expressions that are guaranteed not to return duplicates and no columns are required from the view/table expression.”
2 – How often to do this?
How long is a piece of string? The problem is that when you do the “Is Db2 there?” test at the start of a long conversational transaction, it could well be that Db2 has disappeared at the end…
3 – Do you need to do this?
Again, an age-old question – Most JAVA frameworks do this (Like they do embedded COMMIT and ROLLBACK!) and so the simple answer is yes!
4 – Haven’t you got better things to do than this?
Of course you do! We all do, but these little changes can actually have a major impact on the overall machine throughput!
Measured Response
Here are some real word results of tests with dynamic SQL with/without the 1 = 0 and running a million times each. I have tried doing a variety of flavors, like Heinz…
First up are the SELECT * style selects:
SEL-867 SELECT * FROM SYSIBM.SYSDUMMY1 SEL-868 SELECT * FROM SYSIBM.SYSDUMMY1 WHERE 1 = 0 SEL-869 SELECT * FROM SYSIBM.SYSDUMMY1 WHERE 1 = 0 WITH UR SEL-870 SELECT * FROM SYSIBM.SYSDUMMY1 WHERE 1 = 0 WITH UR FOR FETCH ONLY SEL-871 SELECT * FROM SYSIBM.SYSDUMMY1 WHERE 1 = 0 WITH CS SEL-872 SELECT * FROM SYSIBM.SYSDUMMY1 WHERE 1 = 0 WITH CS FOR FETCH ONLY SEL-873 SELECT * FROM SYSIBM.SYSDUMMY1 WHERE 1 = 0 WITH RR SEL-874 SELECT * FROM SYSIBM.SYSDUMMY1 WHERE 1 = 0 WITH RR FOR FETCH ONLY
Now the SELECT COUNT(*) style selects:
SEL-875 SELECT COUNT(*) FROM SYSIBM.SYSDUMMY1 SEL-876 SELECT COUNT(*) FROM SYSIBM.SYSDUMMY1 WHERE 1 = 0 SEL-877 SELECT COUNT(*) FROM SYSIBM.SYSDUMMY1 WHERE 1 = 0 WITH UR SEL-878 SELECT COUNT(*) FROM SYSIBM.SYSDUMMY1 WHERE 1 = 0 WITH UR FOR FETCH ONLY SEL-879 SELECT COUNT(*) FROM SYSIBM.SYSDUMMY1 WHERE 1 = 0 WITH CS SEL-880 SELECT COUNT(*) FROM SYSIBM.SYSDUMMY1 WHERE 1 = 0 WITH CS FOR FETCH ONLY SEL-881 SELECT COUNT(*) FROM SYSIBM.SYSDUMMY1 WHERE 1 = 0 WITH RR SEL-882 SELECT COUNT(*) FROM SYSIBM.SYSDUMMY1 WHERE 1 = 0 WITH RR FOR FETCH ONLY
The results were:
Rows Rows Index Tblsp StmtID Execs Getpg exam proc Scans Scans ---------- ----- ----- ----- ----- ----- ----- 867 1000K 2M 1000K 1000K 0 1000K 868 1000K 0 0 0 0 0 869 1000K 0 0 0 0 0 870 1000K 0 0 0 0 0 871 1000K 0 0 0 0 0 872 1000K 0 0 0 0 0 873 1000K 0 0 0 0 0 874 1000K 0 0 0 0 0 875 1000K 2M 1000K 1000K 0 1000K 876 1000K 0 0 1000K 0 0 877 1000K 0 0 1000K 0 0 878 1000K 0 0 1000K 0 0 879 1000K 0 0 1000K 0 0 880 1000K 0 0 1000K 0 0 881 1000K 0 0 1000K 0 0 882 1000K 0 0 1000K 0 0
StmtID Tot. CPU Avg. CPU Tot. Elap Avg. Elap ---------- SS.mmmmmm SS.mmmmmm SS.mmmmmm SS.mmmmmm 867 2.481068 0.000002 16.869174 0.000017 868 1.291817 0.000001 8.119036 0.000008 869 1.331707 0.000001 6.220308 0.000006 870 1.330709 0.000001 9.460538 0.000009 871 1.306633 0.000001 8.984930 0.000009 872 1.326517 0.000001 9.181043 0.000009 873 1.324213 0.000001 7.392330 0.000007 874 1.322115 0.000001 7.694403 0.000008 875 3.066573 0.000003 18.824193 0.000019 876 1.467454 0.000001 6.168161 0.000006 877 1.478887 0.000001 7.714925 0.000008 878 1.480241 0.000001 9.903416 0.000010 879 1.477947 0.000001 9.965071 0.000010 880 1.469971 0.000001 9.195251 0.000009 881 1.467287 0.000001 8.687336 0.000009 882 1.487021 0.000001 11.742945 0.000012
The SELECT * with WHERE 1 = 0 WITH UR is actually the winner!
Remember – You never stop learning!
As always, I would be pleased to hear from you!
TTFN,
Roy Boxwell
Senior Architect
Reader test results from Isaac Yassin:
Measured for dynamic SQL (SPUFI in UR mode – DSNESPUR), using Omegamon application trace. Same machine, same LPAR, same subsystem, measuring the 3rd consecutive execution of each SQL. Both do a “prepare/open/fetch/close” according to trace.
–
select * from sysibm.sysdummy1 vs. select * from sysibm.sysdummy1 where 0=1
The differences are:
For Db2 10
Without “where 0=1”
PREPARE – InDB2 Time = 0.00048 InDB2 CPU = 0.00047
FETCH - 2 fetches, InDB2 time = 0.00131 InDB2 CPU= 0.00116 (avg. 0.00058)
LOCKs - Type = MDEL , Level = S , DB=DSNDB06 , PS=SYSEBCDC , Count = 1
Sequential scan of data page = 2 pages scanned, 1 row qualified
With “where 0=1”
PREPARE - InDB2 Time = 0.00077 InDB2 CPU = 0.00049
FETCH - 1 fetch, InDB2 time = 0.00008 , InDB2 CPU = 0.00008
Locks - none
NO sequential scan
For Db2 11
Without “where 0=1”
PREPARE - InDB2 Time = 0.00045 InDB2 CPU = 0.00045
FETCH - 2 fetches, InDB2 time = 0.00121 InDB2 CPU= 0.00120 (avg. 0.00060)
LOCKs - Type = MDEL , Level = S , DB=DSNDB06 , PS=SYSEBCDC , Count = 1
Sequential scan of data page = 2 pages scanned, 1 row qualified
With “where 0=1”
PREPARE - InDB2 Time = 0.00044 InDB2 CPU = 0.00044
FETCH - 1 fetch, InDB2 time = 0.00007 , InDB2 CPU = 0.00007
Locks - none
NO sequential scan
select 1 from sysibm.sysdummy1 vs. select 1 from sysibm.sysdummy1 where 0=1
The differences are:
For Db2 10
Without “where 0=1”
PREPARE - InDB2 Time = 0.00057 InDB2 CPU = 0.00054 FETCH - 2 fetches, InDB2 time = 0.00015 InDB2 CPU= 0.00015 (avg. 0.00007) LOCKs - None No Sequential scan of data page
With “where 0=1”
PREPARE - InDB2 Time = 0.00060 InDB2 CPU = 0.00034 FETCH - 1 fetch, InDB2 time = 0.00006 , InDB2 CPU = 0.00006 Locks - none NO sequential scan
For Db2 11
Without “where 0=1”
PREPARE - InDB2 Time = 0.00047 InDB2 CPU = 0.00047
FETCH - 2 fetches, InDB2 time = 0.00087 InDB2 CPU= 0.00082 (avg. 0.00041)
LOCKs - none
NO Sequential scan of data pages
With “where 0=1”
PREPARE - InDB2 Time = 0.00044 InDB2 CPU = 0.00044
FETCH - 1 fetch, InDB2 time = 0.00007 , InDB2 CPU = 0.00007
Locks - none
NO sequential scan of data pages
Bottom line
“select 1” works better than “select *” as you don’t really scan the data and not taking locks, using “where 0=1” eliminates 1 fetch and the locks & scan as well.
Isaac Yassin