Thursday, April 2, 2009

An Extract from AskTom

Efficient SQL

This was probably the hardest part of the book to write - this chapter. That is not
because the material is all that complex, rather because I know what people want - and I
know what can be delivered. What people want: The 10 step process by which you can tune
any query. What can be delivered: Knowledge about how queries are processed, knowledge
you can use and apply day to day as you develop them.

Think about it for a moment. If there were a 10 step or even 1,000,000 step process by
which any query can be tuned (or even X% of queries for that matter), we would write a
program to do it. Oh don't get me wrong, there are many programs that actually try to do
this - Oracle Enterprise Manager with its tuning pack, SQL Navigator and others. What
they do is primarily recommend indexing schemes to tune a query, suggest materialized
views, offer to add hints to the query to try other access plans. They show you
different query plans for the same statement and allow you to pick one. They offer
"rules of thumb" (what I generally call ROT since the acronym and the word is maps to are
so appropriate for each other) SQL optimizations - which if they were universally
applicable - the optimizer would do it as a matter of fact. In fact, the cost based
optimizer does that already - it rewrites our queries all of the time. These tuning
tools use a very limited set of rules that sometimes can suggest that index or set of
indexes you really should have thought of during your design.

I'll close this idea out with this thought - if there were an N step process to tuning a
query, to writing efficient SQL - the optimizer would incorporate it all and we would not
be having a discussion about this topic at all. It is like the search for the holy grail
- maybe someday the software will be sophisticated enough to be perfect in this regards,
it will be able to take our SQL, understand the question being asked and process the
question - rather then syntax.

To me - writing efficient SQL requires a couple of things:

o Knowledge of the physical organization of what I'm asked to query against. That is
- the schema. Knowledge that the physical organization was actually designed in order to
help me answer my frequently asked questions (refer back to the chapter on designing an
efficient schema for advice in that arena)

o Knowledge of what the database is capable of doing. If I did not know about "skip
scan indexes" and what they did (we'll cover them below) - I might look at a schema and
say "ah hah, we are missing an index" when in fact we are not.

o Knowledge of all of the intricacies of SQL - from the lowly "WHERE" clause on up to
analytics and psuedo columns. Knowledge of what using a particular construct will do to
my runtime processing.

o And most importantly of all - a solid understanding of the goal, of what the
question is. Tuning a query or process is really hard (impossible I would say) - unless
you understand the question in the first place. I cannot tell you how many times I've
not been able to tune a query until I had the question in hand. Certainly you can derive
a question from a query - however, many times that derived question is much more
confining then the real question being asked. For example, many people use outer joins
in all queries - they are "afraid" of losing a row (perhaps they got "burned" in some
past experience and now use outer joins everywhere). If the objects are related in a one
to one mandatory fashion - we don't need an outer join at all. The question derived from
the query is much more confining then reality.

That last topic or point is so important, I'll close out this section with it. In this
chapter we'll cover the topics of what the database is capable of doing in general -
looking at many of the access paths and join operations available to us. We'll look at
what SQL is capable of doing - not by discussing the entire language, that in itself is a
book. Rather, we'll look at a couple of things that will whet you appetite - show you
how powerful this language can be, how much more than just "SELECT" "FROM" "WHERE" and
"ORDER BY" there is. Then we'll close up with a look at that most important topic - why
understanding the question is more important then having a query at hand to tune.

So, this section will not provide you with the N steps you need to follow in order to
tune a query or write the best queries in the world. For every rule of thumb out there
anyone has ever shown me regarding writing "efficient SQL", I've been able to come up
with a slew of common (not esoteric) counter cases to prove that rule of thumb is wrong
in as many cases as it is right. I've talked to people who swear "NOT IN" is fatal,
never use it - always use NOT EXISTS. Then I show them NOT IN running a query 10 times
faster then NOT EXISTS. I talk with people who feel NOT EXISTS is the worst construct
on the planet - you must use IN. Then I do the same - showing them how NOT EXISTS can
run many times faster then IN.

No comments:

Post a Comment