Categories: MSDN / DotNet / Java / Scripts / Linux / PHP Ask - La ask - La Answer

SQL Performance

Group:

I'm working on some SQL and was trying to squeeze the best performance from
the SQL and wondered what is the "Best method"

The Stored Procedure I've written is as follows:
the variables are passed in when the procedure is called.

SELECT TE.EmpID, Sum(Hours) as SumHours, TSC.AllocationID, TypeID, Billable,
TD.DepartmentID as ProjectDPT, TTT.DeptID as TimeTypeDPT, Te.Department as
HomeDPT, TS.SundayDT
FROM tblTimesheet ts
INNER JOIN tblTimeSheetCell tsc on ts.timesheetID = tsc.timesheetid

AND SundayDT = @SunDate

INNER JOIN tblEmployee te on te.EmpID = Ts.EmpID

AND Te.empID=@EmpID

LEFT JOIN tblAllocation ta on TA.allocationID = tsc.AllocationID
LEFT JOIN tblTimeTypes TTT on TTT.TimeTypeID = tsc.TypeID
LEFT JOIN tblProject TP on TP.ProjectID = TA.ProjectID
LEFT JOIN tblDepartments td on td.DepartmentID = TP.DepartmentID
GROUP BY TE.EmpID, TS.SundayDT, TSC.AllocationID, TSC.TypeID, TSC.Billable,
TD.DepartmentID, TTT.DeptID, ts.sundaydt, TE.Department
ORDER BY TE.EmpID, ts.sundaydt

My question is this:
Am I better off putting the AND statements that I currently have located
on the inner joins in a where clause or do I get better performance limiting
the join criteria. I'm not exactly sure where or what the performance gains
would be.

The only other question I have is how would I tell the differences in performance?
From a time execution stand point the query takes fractions of a second
to execute on hundreds of thousands of records. What is a good methodology
to find out if a change improves performance?

Q*bert
[1720 byte] By [Q*bert] at [2007-11-9 21:10:30]
# 1 Re: SQL Performance
Hello,

I can't say if using something in a join versus a where clause gives better
performance. I think you need to test it for yourself. I usually use query
analyzer. You can assemble a query two different ways and run both versions
in the same batch with 'Show Execution Plan' on. At a high level, the execution
plan will show you the cost of each towards the batch. This makes it easy
to tell which is faster. You can drill down further into the execution plans
and see what areas are slow. You might find something else that improves
the performance significantly.

Kevin

"Q*bert" <luke_davis_76@hotmail.com> wrote:
>
>Group:
>
>I'm working on some SQL and was trying to squeeze the best performance from
>the SQL and wondered what is the "Best method"
>
>The Stored Procedure I've written is as follows:
>the variables are passed in when the procedure is called.
>
>SELECT TE.EmpID, Sum(Hours) as SumHours, TSC.AllocationID, TypeID, Billable,
>TD.DepartmentID as ProjectDPT, TTT.DeptID as TimeTypeDPT, Te.Department
as
>HomeDPT, TS.SundayDT
>FROM tblTimesheet ts
>INNER JOIN tblTimeSheetCell tsc on ts.timesheetID = tsc.timesheetid
>
>AND SundayDT = @SunDate
>
>INNER JOIN tblEmployee te on te.EmpID = Ts.EmpID
>
>AND Te.empID=@EmpID
>
>LEFT JOIN tblAllocation ta on TA.allocationID = tsc.AllocationID
>LEFT JOIN tblTimeTypes TTT on TTT.TimeTypeID = tsc.TypeID
>LEFT JOIN tblProject TP on TP.ProjectID = TA.ProjectID
>LEFT JOIN tblDepartments td on td.DepartmentID = TP.DepartmentID
>GROUP BY TE.EmpID, TS.SundayDT, TSC.AllocationID, TSC.TypeID, TSC.Billable,
>TD.DepartmentID, TTT.DeptID, ts.sundaydt, TE.Department
>ORDER BY TE.EmpID, ts.sundaydt
>
>My question is this:
>Am I better off putting the AND statements that I currently have located
>on the inner joins in a where clause or do I get better performance limiting
>the join criteria. I'm not exactly sure where or what the performance gains
>would be.
>
>The only other question I have is how would I tell the differences in performance?
> From a time execution stand point the query takes fractions of a second
>to execute on hundreds of thousands of records. What is a good methodology
>to find out if a change improves performance?
>
>Q*bert
>
Kevin at 2007-11-11 23:52:36 >
# 2 Re: SQL Performance
Oracle and SQL server both have query analyzers to help you tweak queries
so that you can determine performance bottlenecks...even down to queries
that (in your case) look the same from a time execution standpoint, the analyzers
can give you relative cost estimates that can be more useful than actual
clock time.

Typically, joins (on indexed fields) will run much faster than 'where in
<subquery>' type SQL statements...but that's not always a hard and fast rule...as
you probably already know...

The other possibility is load a ridiculously large amount of data in your
tables...then do the clock time test of your queries...see what those results
are...

Chris

"Q*bert" <luke_davis_76@hotmail.com> wrote:
>
>Group:
>
>I'm working on some SQL and was trying to squeeze the best performance from
>the SQL and wondered what is the "Best method"
>
>The Stored Procedure I've written is as follows:
>the variables are passed in when the procedure is called.
>
>SELECT TE.EmpID, Sum(Hours) as SumHours, TSC.AllocationID, TypeID, Billable,
>TD.DepartmentID as ProjectDPT, TTT.DeptID as TimeTypeDPT, Te.Department
as
>HomeDPT, TS.SundayDT
>FROM tblTimesheet ts
>INNER JOIN tblTimeSheetCell tsc on ts.timesheetID = tsc.timesheetid
>
>AND SundayDT = @SunDate
>
>INNER JOIN tblEmployee te on te.EmpID = Ts.EmpID
>
>AND Te.empID=@EmpID
>
>LEFT JOIN tblAllocation ta on TA.allocationID = tsc.AllocationID
>LEFT JOIN tblTimeTypes TTT on TTT.TimeTypeID = tsc.TypeID
>LEFT JOIN tblProject TP on TP.ProjectID = TA.ProjectID
>LEFT JOIN tblDepartments td on td.DepartmentID = TP.DepartmentID
>GROUP BY TE.EmpID, TS.SundayDT, TSC.AllocationID, TSC.TypeID, TSC.Billable,
>TD.DepartmentID, TTT.DeptID, ts.sundaydt, TE.Department
>ORDER BY TE.EmpID, ts.sundaydt
>
>My question is this:
>Am I better off putting the AND statements that I currently have located
>on the inner joins in a where clause or do I get better performance limiting
>the join criteria. I'm not exactly sure where or what the performance gains
>would be.
>
>The only other question I have is how would I tell the differences in performance?
> From a time execution stand point the query takes fractions of a second
>to execute on hundreds of thousands of records. What is a good methodology
>to find out if a change improves performance?
>
>Q*bert
>
Chris Hylton at 2007-11-11 23:53:47 >
# 3 Re: SQL Performance
"Q*bert" <luke_davis_76@hotmail.com> wrote:
>SELECT TE.EmpID, Sum(Hours) as SumHours, TSC.AllocationID, TypeID, Billable,
>TD.DepartmentID as ProjectDPT, TTT.DeptID as TimeTypeDPT, Te.Department
as
>HomeDPT, TS.SundayDT
>FROM tblTimesheet ts
>INNER JOIN tblTimeSheetCell tsc on ts.timesheetID = tsc.timesheetid
>
>AND SundayDT = @SunDate
>
>INNER JOIN tblEmployee te on te.EmpID = Ts.EmpID
>
>AND Te.empID=@EmpID
>
>LEFT JOIN tblAllocation ta on TA.allocationID = tsc.AllocationID
>LEFT JOIN tblTimeTypes TTT on TTT.TimeTypeID = tsc.TypeID
>LEFT JOIN tblProject TP on TP.ProjectID = TA.ProjectID
>LEFT JOIN tblDepartments td on td.DepartmentID = TP.DepartmentID
>GROUP BY TE.EmpID, TS.SundayDT, TSC.AllocationID, TSC.TypeID, TSC.Billable,
>TD.DepartmentID, TTT.DeptID, ts.sundaydt, TE.Department
>ORDER BY TE.EmpID, ts.sundaydt
>

You'll also want to try adding "AND Ts.empID=@EmpID".
Even though you'd think that the optimizer should be able to figure it out
from the join criteria, I've reduced query execution time from 1 minute to
a few seconds just by adding redundant clauses - e.g., having both "Te.empID=@EmpID"
and "Ts.empID=@EmpID".
Dave at 2007-11-11 23:54:40 >
# 4 Re: SQL Performance
Q*bert,

It's worth noting that on SQL Server, if you put the criteria in the JOIN,
it will be applied BEFORE the join is performed. If you put it in the WHERE
clause, it will be applied AFTER the join. As a general rule, put as much
criteria in the join as possible. This is particulary tricky with LEFT
JOINS - you must be careful, because the two strategies are not
interchangeable, as you might think, especially if you do IS NOT NULL or IS
NULL on one of the columns from the outer table. You can get two totally
different sets from apparently identical queries, because of the concept
explained above.

Also, for performance testing of queries like this, you must do them against
large volumes of data, with distribution of values that will approximate
your production system. Bear in mind the fact that once you run a given
query, the data for that query now resides in 'warm cache', and the query
will run faster in subsequent executions.

For SQL Server (only on a development box), you can get around this with:

DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE

James Hokes

"Q*bert" <luke_davis_76@hotmail.com> wrote in message
news:3da2e82a$1@10.1.10.29...
>
> Group:
>
> I'm working on some SQL and was trying to squeeze the best performance
from
> the SQL and wondered what is the "Best method"
>
> The Stored Procedure I've written is as follows:
> the variables are passed in when the procedure is called.
>
> SELECT TE.EmpID, Sum(Hours) as SumHours, TSC.AllocationID, TypeID,
Billable,
> TD.DepartmentID as ProjectDPT, TTT.DeptID as TimeTypeDPT, Te.Department as
> HomeDPT, TS.SundayDT
> FROM tblTimesheet ts
> INNER JOIN tblTimeSheetCell tsc on ts.timesheetID = tsc.timesheetid
>
> AND SundayDT = @SunDate
>
> INNER JOIN tblEmployee te on te.EmpID = Ts.EmpID
>
> AND Te.empID=@EmpID
>
> LEFT JOIN tblAllocation ta on TA.allocationID = tsc.AllocationID
> LEFT JOIN tblTimeTypes TTT on TTT.TimeTypeID = tsc.TypeID
> LEFT JOIN tblProject TP on TP.ProjectID = TA.ProjectID
> LEFT JOIN tblDepartments td on td.DepartmentID = TP.DepartmentID
> GROUP BY TE.EmpID, TS.SundayDT, TSC.AllocationID, TSC.TypeID,
TSC.Billable,
> TD.DepartmentID, TTT.DeptID, ts.sundaydt, TE.Department
> ORDER BY TE.EmpID, ts.sundaydt
>
> My question is this:
> Am I better off putting the AND statements that I currently have located
> on the inner joins in a where clause or do I get better performance
limiting
> the join criteria. I'm not exactly sure where or what the performance
gains
> would be.
>
> The only other question I have is how would I tell the differences in
performance?
> From a time execution stand point the query takes fractions of a second
> to execute on hundreds of thousands of records. What is a good
methodology
> to find out if a change improves performance?
>
> Q*bert
>
James Hokes at 2007-11-11 23:55:39 >
# 5 Re: SQL Performance
Thanks James. I was pretty certain that placing the criteria on the join
imporved performance. Just for my sanity, where did you uncover this. I've
spent lots of time on MS site and google and haven't found what you stated.
It seems like no one wanted to talk about it when it came to the performance.

Thanks again,
Q*bert

"James Hokes" <noemail@noway.com> wrote:
>Q*bert,
>
>It's worth noting that on SQL Server, if you put the criteria in the JOIN,
>it will be applied BEFORE the join is performed. If you put it in the WHERE
>clause, it will be applied AFTER the join. As a general rule, put as much
>criteria in the join as possible. This is particulary tricky with LEFT
>JOINS - you must be careful, because the two strategies are not
>interchangeable, as you might think, especially if you do IS NOT NULL or
IS
>NULL on one of the columns from the outer table. You can get two totally
>different sets from apparently identical queries, because of the concept
>explained above.
>
>Also, for performance testing of queries like this, you must do them against
>large volumes of data, with distribution of values that will approximate
>your production system. Bear in mind the fact that once you run a given
>query, the data for that query now resides in 'warm cache', and the query
>will run faster in subsequent executions.
>
>For SQL Server (only on a development box), you can get around this with:
>
>DBCC DROPCLEANBUFFERS
>DBCC FREEPROCCACHE
>
>James Hokes
>
>"Q*bert" <luke_davis_76@hotmail.com> wrote in message
>news:3da2e82a$1@10.1.10.29...
>>
>> Group:
>>
>> I'm working on some SQL and was trying to squeeze the best performance
>from
>> the SQL and wondered what is the "Best method"
>>
>> The Stored Procedure I've written is as follows:
>> the variables are passed in when the procedure is called.
>>
>> SELECT TE.EmpID, Sum(Hours) as SumHours, TSC.AllocationID, TypeID,
>Billable,
>> TD.DepartmentID as ProjectDPT, TTT.DeptID as TimeTypeDPT, Te.Department
as
>> HomeDPT, TS.SundayDT
>> FROM tblTimesheet ts
>> INNER JOIN tblTimeSheetCell tsc on ts.timesheetID = tsc.timesheetid
>>
>> AND SundayDT = @SunDate
>>
>> INNER JOIN tblEmployee te on te.EmpID = Ts.EmpID
>>
>> AND Te.empID=@EmpID
>>
>> LEFT JOIN tblAllocation ta on TA.allocationID = tsc.AllocationID
>> LEFT JOIN tblTimeTypes TTT on TTT.TimeTypeID = tsc.TypeID
>> LEFT JOIN tblProject TP on TP.ProjectID = TA.ProjectID
>> LEFT JOIN tblDepartments td on td.DepartmentID = TP.DepartmentID
>> GROUP BY TE.EmpID, TS.SundayDT, TSC.AllocationID, TSC.TypeID,
>TSC.Billable,
>> TD.DepartmentID, TTT.DeptID, ts.sundaydt, TE.Department
>> ORDER BY TE.EmpID, ts.sundaydt
>>
>> My question is this:
>> Am I better off putting the AND statements that I currently have located
>> on the inner joins in a where clause or do I get better performance
>limiting
>> the join criteria. I'm not exactly sure where or what the performance
>gains
>> would be.
>>
>> The only other question I have is how would I tell the differences in
>performance?
>> From a time execution stand point the query takes fractions of a second
>> to execute on hundreds of thousands of records. What is a good
>methodology
>> to find out if a change improves performance?
>>
>> Q*bert
>>
>
>
Q*bert at 2007-11-11 23:56:46 >
# 6 Re: SQL Performance
Q*bert,

The short answer is that I've been working with SQL Server since the early
days of the product, and after a while, you get to where you've just figured
these things out on your own (or sometimes, you get a nugget when you're on
a call with MS support :-). Try hanging on the ms public groups - sometimes
you run into authors and such, and I've picked up a good deal of info that
way, too. Joe Celko and Kalen Delaney come to mind as two personalities from
whom you can learn a great deal.

James Hokes

"Q*bert" <luke_davis_76@hotmail.com> wrote in message
news:3db45340$1@tnews.web.dev-archive.com...
>
> Thanks James. I was pretty certain that placing the criteria on the join
> imporved performance. Just for my sanity, where did you uncover this.
I've
> spent lots of time on MS site and google and haven't found what you
stated.
> It seems like no one wanted to talk about it when it came to the
performance.
>
> Thanks again,
> Q*bert
>
> "James Hokes" <noemail@noway.com> wrote:
> >Q*bert,
> >
> >It's worth noting that on SQL Server, if you put the criteria in the
JOIN,
> >it will be applied BEFORE the join is performed. If you put it in the
WHERE
> >clause, it will be applied AFTER the join. As a general rule, put as much
> >criteria in the join as possible. This is particulary tricky with LEFT
> >JOINS - you must be careful, because the two strategies are not
> >interchangeable, as you might think, especially if you do IS NOT NULL or
> IS
> >NULL on one of the columns from the outer table. You can get two totally
> >different sets from apparently identical queries, because of the concept
> >explained above.
> >
> >Also, for performance testing of queries like this, you must do them
against
> >large volumes of data, with distribution of values that will approximate
> >your production system. Bear in mind the fact that once you run a given
> >query, the data for that query now resides in 'warm cache', and the query
> >will run faster in subsequent executions.
> >
> >For SQL Server (only on a development box), you can get around this with:
> >
> >DBCC DROPCLEANBUFFERS
> >DBCC FREEPROCCACHE
> >
> >James Hokes
> >
> >"Q*bert" <luke_davis_76@hotmail.com> wrote in message
> >news:3da2e82a$1@10.1.10.29...
> >>
> >> Group:
> >>
> >> I'm working on some SQL and was trying to squeeze the best performance
> >from
> >> the SQL and wondered what is the "Best method"
> >>
> >> The Stored Procedure I've written is as follows:
> >> the variables are passed in when the procedure is called.
> >>
> >> SELECT TE.EmpID, Sum(Hours) as SumHours, TSC.AllocationID, TypeID,
> >Billable,
> >> TD.DepartmentID as ProjectDPT, TTT.DeptID as TimeTypeDPT, Te.Department
> as
> >> HomeDPT, TS.SundayDT
> >> FROM tblTimesheet ts
> >> INNER JOIN tblTimeSheetCell tsc on ts.timesheetID = tsc.timesheetid
> >>
> >> AND SundayDT = @SunDate
> >>
> >> INNER JOIN tblEmployee te on te.EmpID = Ts.EmpID
> >>
> >> AND Te.empID=@EmpID
> >>
> >> LEFT JOIN tblAllocation ta on TA.allocationID = tsc.AllocationID
> >> LEFT JOIN tblTimeTypes TTT on TTT.TimeTypeID = tsc.TypeID
> >> LEFT JOIN tblProject TP on TP.ProjectID = TA.ProjectID
> >> LEFT JOIN tblDepartments td on td.DepartmentID = TP.DepartmentID
> >> GROUP BY TE.EmpID, TS.SundayDT, TSC.AllocationID, TSC.TypeID,
> >TSC.Billable,
> >> TD.DepartmentID, TTT.DeptID, ts.sundaydt, TE.Department
> >> ORDER BY TE.EmpID, ts.sundaydt
> >>
> >> My question is this:
> >> Am I better off putting the AND statements that I currently have
located
> >> on the inner joins in a where clause or do I get better performance
> >limiting
> >> the join criteria. I'm not exactly sure where or what the performance
> >gains
> >> would be.
> >>
> >> The only other question I have is how would I tell the differences in
> >performance?
> >> From a time execution stand point the query takes fractions of a
second
> >> to execute on hundreds of thousands of records. What is a good
> >methodology
> >> to find out if a change improves performance?
> >>
> >> Q*bert
> >>
> >
> >
>
James Hokes at 2007-11-11 23:57:50 >
# 7 Re: SQL Performance
One thing you can do is plug it into the Query Analyzer and turn on the graphical
query analyzer. This will show you a lot about the costs of your query and
the modifcations you make on it. Look up Showplan in BOL.

"James Hokes" <noemail@noway.com> wrote:
>Q*bert,
>
>The short answer is that I've been working with SQL Server since the early
>days of the product, and after a while, you get to where you've just figured
>these things out on your own (or sometimes, you get a nugget when you're
on
>a call with MS support :-). Try hanging on the ms public groups - sometimes
>you run into authors and such, and I've picked up a good deal of info that
>way, too. Joe Celko and Kalen Delaney come to mind as two personalities
from
>whom you can learn a great deal.
>
>James Hokes
>
>
>"Q*bert" <luke_davis_76@hotmail.com> wrote in message
>news:3db45340$1@tnews.web.dev-archive.com...
>>
>> Thanks James. I was pretty certain that placing the criteria on the join
>> imporved performance. Just for my sanity, where did you uncover this.
>I've
>> spent lots of time on MS site and google and haven't found what you
>stated.
>> It seems like no one wanted to talk about it when it came to the
>performance.
>>
>> Thanks again,
>> Q*bert
>>
>> "James Hokes" <noemail@noway.com> wrote:
>> >Q*bert,
>> >
>> >It's worth noting that on SQL Server, if you put the criteria in the
>JOIN,
>> >it will be applied BEFORE the join is performed. If you put it in the
>WHERE
>> >clause, it will be applied AFTER the join. As a general rule, put as
much
>> >criteria in the join as possible. This is particulary tricky with LEFT
>> >JOINS - you must be careful, because the two strategies are not
>> >interchangeable, as you might think, especially if you do IS NOT NULL
or
>> IS
>> >NULL on one of the columns from the outer table. You can get two totally
>> >different sets from apparently identical queries, because of the concept
>> >explained above.
>> >
>> >Also, for performance testing of queries like this, you must do them
>against
>> >large volumes of data, with distribution of values that will approximate
>> >your production system. Bear in mind the fact that once you run a given
>> >query, the data for that query now resides in 'warm cache', and the query
>> >will run faster in subsequent executions.
>> >
>> >For SQL Server (only on a development box), you can get around this with:
>> >
>> >DBCC DROPCLEANBUFFERS
>> >DBCC FREEPROCCACHE
>> >
>> >James Hokes
>> >
>> >"Q*bert" <luke_davis_76@hotmail.com> wrote in message
>> >news:3da2e82a$1@10.1.10.29...
>> >>
>> >> Group:
>> >>
>> >> I'm working on some SQL and was trying to squeeze the best performance
>> >from
>> >> the SQL and wondered what is the "Best method"
>> >>
>> >> The Stored Procedure I've written is as follows:
>> >> the variables are passed in when the procedure is called.
>> >>
>> >> SELECT TE.EmpID, Sum(Hours) as SumHours, TSC.AllocationID, TypeID,
>> >Billable,
>> >> TD.DepartmentID as ProjectDPT, TTT.DeptID as TimeTypeDPT, Te.Department
>> as
>> >> HomeDPT, TS.SundayDT
>> >> FROM tblTimesheet ts
>> >> INNER JOIN tblTimeSheetCell tsc on ts.timesheetID = tsc.timesheetid
>> >>
>> >> AND SundayDT = @SunDate
>> >>
>> >> INNER JOIN tblEmployee te on te.EmpID = Ts.EmpID
>> >>
>> >> AND Te.empID=@EmpID
>> >>
>> >> LEFT JOIN tblAllocation ta on TA.allocationID = tsc.AllocationID
>> >> LEFT JOIN tblTimeTypes TTT on TTT.TimeTypeID = tsc.TypeID
>> >> LEFT JOIN tblProject TP on TP.ProjectID = TA.ProjectID
>> >> LEFT JOIN tblDepartments td on td.DepartmentID = TP.DepartmentID
>> >> GROUP BY TE.EmpID, TS.SundayDT, TSC.AllocationID, TSC.TypeID,
>> >TSC.Billable,
>> >> TD.DepartmentID, TTT.DeptID, ts.sundaydt, TE.Department
>> >> ORDER BY TE.EmpID, ts.sundaydt
>> >>
>> >> My question is this:
>> >> Am I better off putting the AND statements that I currently have
>located
>> >> on the inner joins in a where clause or do I get better performance
>> >limiting
>> >> the join criteria. I'm not exactly sure where or what the performance
>> >gains
>> >> would be.
>> >>
>> >> The only other question I have is how would I tell the differences
in
>> >performance?
>> >> From a time execution stand point the query takes fractions of a
>second
>> >> to execute on hundreds of thousands of records. What is a good
>> >methodology
>> >> to find out if a change improves performance?
>> >>
>> >> Q*bert
>> >>
>> >
>> >
>>
>
>
C. E. Buttles at 2007-11-11 23:58:43 >