My Experiences with Life: An Extract from AskTom

Thursday, April 2, 2009
An Extract from AskTom

Efficient SQL

This was probably the hardest part of the book to write - this chapter.  That is not 
because the material is all that complex, rather because I know what people want - and I 
know what can be delivered.  What people want:  The 10 step process by which you can tune 
any query.  What can be delivered:  Knowledge about how queries are processed, knowledge 
you can use and apply day to day as you develop them.

Think about it for a moment.  If there were a 10 step or even 1,000,000 step process by 
which any query can be tuned (or even X% of queries for that matter), we would write a 
program to do it.  Oh don't get me wrong, there are many programs that actually try to do 
this - Oracle Enterprise Manager with its tuning pack, SQL Navigator and others.  What 
they do is primarily recommend indexing schemes to tune a query, suggest materialized 
views, offer to add hints to the query to try other access plans.  They show you 
different query plans for the same statement and allow you to pick one.  They offer 
"rules of thumb" (what I generally call ROT since the acronym and the word is maps to are 
so appropriate for each other) SQL optimizations - which if they were universally 
applicable - the optimizer would do it as a matter of fact.  In fact, the cost based 
optimizer does that already - it rewrites our queries all of the time.  These tuning 
tools use a very limited set of rules that sometimes can suggest that index or set of 
indexes you really should have thought of during your design.  

I'll close this idea out with this thought - if there were an N step process to tuning a 
query, to writing efficient SQL - the optimizer would incorporate it all and we would not 
be having a discussion about this topic at all.  It is like the search for the holy grail 
- maybe someday the software will be sophisticated enough to be perfect in this regards, 
it will be able to take our SQL, understand the question being asked and process the 
question - rather then syntax.  

To me - writing efficient SQL requires a couple of things:

o    Knowledge of the physical organization of what I'm asked to query against.  That is 
- the schema.  Knowledge that the physical organization was actually designed in order to 
help me answer my frequently asked questions (refer back to the chapter on designing an 
efficient schema for advice in that arena)

o    Knowledge of what the database is capable of doing.  If I did not know about "skip 
scan indexes" and what they did (we'll cover them below) - I might look at a schema and 
say "ah hah, we are missing an index" when in fact we are not.

o    Knowledge of all of the intricacies of SQL - from the lowly "WHERE" clause on up to 
analytics and psuedo columns.  Knowledge of what using a particular construct will do to 
my runtime processing.

o    And most importantly of all - a solid understanding of the goal, of what the 
question is.  Tuning a query or process is really hard (impossible I would say) - unless 
you understand the question in the first place.  I cannot tell you how many times I've 
not been able to tune a query until I had the question in hand.  Certainly you can derive 
a question from a query - however, many times that derived question is much more 
confining then the real question being asked.  For example, many people use outer joins 
in all queries - they are "afraid" of losing a row (perhaps they got "burned" in some 
past experience and now use outer joins everywhere).  If the objects are related in a one 
to one mandatory fashion - we don't need an outer join at all.  The question derived from 
the query is much more confining then reality.

That last topic or point is so important, I'll close out this section with it. In this 
chapter we'll cover the topics of what the database is capable of doing in general - 
looking at many of the access paths and join operations available to us.  We'll look at 
what SQL is capable of doing - not by discussing the entire language, that in itself is a 
book.  Rather, we'll look at a couple of things that will whet you appetite - show you 
how powerful this language can be, how much more than just "SELECT" "FROM" "WHERE" and  
"ORDER BY" there is.  Then we'll close up with a look at that most important topic - why 
understanding the question is more important then having a query at hand to tune.

So, this section will not provide you with the N steps you need to follow in order to 
tune a query or write the best queries in the world.  For every rule of thumb out there 
anyone has ever shown me regarding writing "efficient SQL", I've been able to come up 
with a slew of common (not esoteric) counter cases to prove that rule of thumb is wrong 
in as many cases as it is right.  I've talked to people who swear "NOT IN" is fatal, 
never use it - always use NOT EXISTS.  Then I show them NOT IN running a query 10 times 
faster then NOT EXISTS.   I talk with people who feel NOT EXISTS is the worst construct 
on the planet - you must use IN.  Then I do the same - showing them how NOT EXISTS can 
run many times faster then IN.
My Experiences with Life

Thursday, April 2, 2009

An Extract from AskTom

No comments:

Post a Comment