sql server 2008 - How to most efficiently join two tables?
I have two tables that store amounts and adjustments for the line item types of a specific reporting period. I am looking for an efficient way to query the amount and the adjustment for each reportingperiod/lineitemtype combination that exists across the two tables.
The schemas are presented below:
@reportingperiodcomposition (1030 rows - table variable)
src int, groupreportingperiodid int, reportingperiodid int, clientid int, perioddate date, ... primary key clustered (src, reportingperiodid)
amount (~30,000,000 rows)
reportingperiodid int, lineitemtypeid smallint, amount decimal, primary key clustered (reportingperiodid, lineitemtypeid)
adjustment (~180,000 rows)
reportingperiodid int, lineitemtypeid smallint, amount decimal, comment nvarchar(2500), ... adjustmentid int, primary key nonclustered (adjustmentid), unique key clustered (reportingperiodid, lineitemtypeid)
I would like to select the amounts and adjustments for each unique reportingperiodid/lineitemtypeid, yielding the following result set:
| reportingperiodid | lineitemtypeid | amount | adjustment |
I am currently using the following query, but I am curious to see whether anyone has thoughts on how this can be done more efficiently. Any suggestions are welcome!
select
 rpc.reportingperiodid,
 coalesce(a.lineitemtypeid, adj.lineitemtypeid) lineitemtypeid,
 a.amount,
 adj.amount adjustment
from @reportingperiodcomposition rpc
left join watchlist.risk.amount a
 on rpc.reportingperiodid = a.reportingperiodid
left join watchlist.risk.adjustment adj
 on rpc.reportingperiodid = adj.reportingperiodid
 and (a.reportingperiodid is null or a.lineitemtypeid = adj.lineitemtypeid)
where
 src = @src
 and (a.lineitemtypeid is not null or adj.lineitemtypeid is not null)
Note that the @src variable is necessary to determine which source values I need to pull from the @reportingperiodcomposition table variable. The query results in ~138,000 rows:
1 row has both an amount and an adjustment, although this number may vary depending on the reportingperiodcomposition
0 rows have only an adjustment, although this is not guaranteed
Execution plan XML:
<?xml version="1.0" encoding="utf-16"?> <showplanxml xmlns:xsi="http://www.w3.org/2001/xmlschema-instance" xmlns:xsd="http://www.w3.org/2001/xmlschema" version="1.1" build="10.0.4064.0" xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan"> <batchsequence> <batch> <statements> <stmtsimple statementcompid="9" statementestrows="104.769" statementid="5" statementoptmlevel="full" statementoptmearlyabortreason="goodenoughplanfound" statementsubtreecost="0.343989" statementtext="select
 rpc.reportingperiodid,
 coalesce(a.lineitemtypeid, adj.lineitemtypeid) lineitemtypeid,
 a.amount,
 adj.amount adjustment
from @reportingperiodcomposition rpc
left join rating.risk.amount a
 on rpc.reportingperiodid = a.reportingperiodid
left join rating.risk.adjustment adj
 on rpc.reportingperiodid = adj.reportingperiodid
 and (a.reportingperiodid is null or a.lineitemtypeid = adj.lineitemtypeid)
where
 src = @src
 and (a.lineitemtypeid is not null or adj.lineitemtypeid is not null)" statementtype="select" queryhash="0x425781a4c1d20919" queryplanhash="0xf3e9dd0adad04044"> <statementsetoptions ansi_nulls="true" ansi_padding="true" ansi_warnings="true" arithabort="true" concat_null_yields_null="true" numeric_roundabort="false" quoted_identifier="true" /> <queryplan degreeofparallelism="1" cachedplansize="24" compiletime="5" compilecpu="5" compilememory="424"> <relop avgrowsize="31" estimatecpu="1.04769e-05" estimateio="0" estimaterebinds="0" estimaterewinds="0" estimaterows="104.769" logicalop="compute scalar" nodeid="0" parallel="false" physicalop="compute scalar" estimatedtotalsubtreecost="0.343989"> <outputlist> <columnreference table="@reportingperiodcomposition" alias="[rpc]" column="reportingperiodid" /> <columnreference database="[rating]" schema="[risk]" table="[amount]" alias="[a]" column="amount" /> <columnreference database="[rating]" schema="[risk]" table="[adjustment]" alias="[adj]" column="amount" /> <columnreference column="expr1006" /> </outputlist> <computescalar> <definedvalues> <definedvalue> <columnreference column="expr1006" /> <scalaroperator scalarstring="case when [rating].[risk].[amount].[lineitemtypeid] [a].[lineitemtypeid] not null [rating].[risk].[amount].[lineitemtypeid] [a].[lineitemtypeid] else [rating].[risk].[adjustment].[lineitemtypeid] [adj].[lineitemtypeid] end"> <if> <condition> <scalaroperator> <compare compareop="is not"> <scalaroperator> <identifier> <columnreference database="[rating]" schema="[risk]" table="[amount]" alias="[a]" column="lineitemtypeid" /> </identifier> </scalaroperator> <scalaroperator> <const constvalue="null" /> </scalaroperator> </compare> </scalaroperator> </condition> <then> <scalaroperator> <identifier> <columnreference database="[rating]" schema="[risk]" table="[amount]" alias="[a]" column="lineitemtypeid" /> </identifier> </scalaroperator> </then> <else> <scalaroperator> <identifier> <columnreference 
database="[rating]" schema="[risk]" table="[adjustment]" alias="[adj]" column="lineitemtypeid" /> </identifier> </scalaroperator> </else> </if> </scalaroperator> </definedvalue> </definedvalues> <relop avgrowsize="33" estimatecpu="9.21971e-05" estimateio="0" estimaterebinds="0" estimaterewinds="0" estimaterows="104.769" logicalop="filter" nodeid="1" parallel="false" physicalop="filter" estimatedtotalsubtreecost="0.343979"> <outputlist> <columnreference table="@reportingperiodcomposition" alias="[rpc]" column="reportingperiodid" /> <columnreference database="[rating]" schema="[risk]" table="[amount]" alias="[a]" column="lineitemtypeid" /> <columnreference database="[rating]" schema="[risk]" table="[amount]" alias="[a]" column="amount" /> <columnreference database="[rating]" schema="[risk]" table="[adjustment]" alias="[adj]" column="lineitemtypeid" /> <columnreference database="[rating]" schema="[risk]" table="[adjustment]" alias="[adj]" column="amount" /> </outputlist> <runtimeinformation> <runtimecountersperthread thread="0" actualrows="137631" actualendofscans="1" actualexecutions="1" /> </runtimeinformation> <filter startupexpression="false"> <relop avgrowsize="33" estimatecpu="0.000437936" estimateio="0" estimaterebinds="0" estimaterewinds="0" estimaterows="104.769" logicalop="left outer join" nodeid="2" parallel="false" physicalop="nested loops" estimatedtotalsubtreecost="0.343886"> <outputlist> <columnreference table="@reportingperiodcomposition" alias="[rpc]" column="reportingperiodid" /> <columnreference database="[rating]" schema="[risk]" table="[amount]" alias="[a]" column="lineitemtypeid" /> <columnreference database="[rating]" schema="[risk]" table="[amount]" alias="[a]" column="amount" /> <columnreference database="[rating]" schema="[risk]" table="[adjustment]" alias="[adj]" column="lineitemtypeid" /> <columnreference database="[rating]" schema="[risk]" table="[adjustment]" alias="[adj]" column="amount" /> </outputlist> <runtimeinformation> 
<runtimecountersperthread thread="0" actualrows="137647" actualendofscans="1" actualexecutions="1" /> </runtimeinformation> <nestedloops optimized="false" withunorderedprefetch="true"> <outerreferences> <columnreference table="@reportingperiodcomposition" alias="[rpc]" column="reportingperiodid" /> <columnreference database="[rating]" schema="[risk]" table="[amount]" alias="[a]" column="reportingperiodid" /> <columnreference database="[rating]" schema="[risk]" table="[amount]" alias="[a]" column="lineitemtypeid" /> <columnreference column="expr1009" /> </outerreferences> <relop avgrowsize="26" estimatecpu="0.000437936" estimateio="0" estimaterebinds="0" estimaterewinds="0" estimaterows="104.769" logicalop="left outer join" nodeid="4" parallel="false" physicalop="nested loops" estimatedtotalsubtreecost="0.00711828"> <outputlist> <columnreference table="@reportingperiodcomposition" alias="[rpc]" column="reportingperiodid" /> <columnreference database="[rating]" schema="[risk]" table="[amount]" alias="[a]" column="reportingperiodid" /> <columnreference database="[rating]" schema="[risk]" table="[amount]" alias="[a]" column="lineitemtypeid" /> <columnreference database="[rating]" schema="[risk]" table="[amount]" alias="[a]" column="amount" /> </outputlist> <runtimeinformation> <runtimecountersperthread thread="0" actualrows="137647" actualendofscans="1" actualexecutions="1" /> </runtimeinformation> <nestedloops optimized="false"> <outerreferences> <columnreference table="@reportingperiodcomposition" alias="[rpc]" column="reportingperiodid" /> </outerreferences> <relop avgrowsize="11" estimatecpu="0.0001581" estimateio="0.003125" estimaterebinds="0" estimaterewinds="0" estimaterows="1" logicalop="clustered index seek" nodeid="5" parallel="false" physicalop="clustered index seek" estimatedtotalsubtreecost="0.0032831" tablecardinality="0"> <outputlist> <columnreference table="@reportingperiodcomposition" alias="[rpc]" column="reportingperiodid" /> </outputlist> 
<runtimeinformation> <runtimecountersperthread thread="0" actualrows="1030" actualendofscans="1" actualexecutions="1" /> </runtimeinformation> <indexscan ordered="true" scandirection="forward" forcedindex="false" forceseek="false" noexpandhint="false"> <definedvalues> <definedvalue> <columnreference table="@reportingperiodcomposition" alias="[rpc]" column="reportingperiodid" /> </definedvalue> </definedvalues> <object table="[@reportingperiodcomposition]" index="[pk__#6fdf7df__f9abee3f71c7c670]" alias="[rpc]" /> <seekpredicates> <seekpredicatenew> <seekkeys> <prefix scantype="eq"> <rangecolumns> <columnreference table="@reportingperiodcomposition" alias="[rpc]" column="src" /> </rangecolumns> <rangeexpressions> <scalaroperator scalarstring="[@src]"> <identifier> <columnreference column="@src" /> </identifier> </scalaroperator> </rangeexpressions> </prefix> </seekkeys> </seekpredicatenew> </seekpredicates> </indexscan> </relop> <relop avgrowsize="22" estimatecpu="0.000272246" estimateio="0.003125" estimaterebinds="0" estimaterewinds="0" estimaterows="104.769" logicalop="clustered index seek" nodeid="6" parallel="false" physicalop="clustered index seek" estimatedtotalsubtreecost="0.00339725" tablecardinality="29974300"> <outputlist> <columnreference database="[rating]" schema="[risk]" table="[amount]" alias="[a]" column="reportingperiodid" /> <columnreference database="[rating]" schema="[risk]" table="[amount]" alias="[a]" column="lineitemtypeid" /> <columnreference database="[rating]" schema="[risk]" table="[amount]" alias="[a]" column="amount" /> </outputlist> <runtimeinformation> <runtimecountersperthread thread="0" actualrows="137631" actualendofscans="1030" actualexecutions="1030" /> </runtimeinformation> <indexscan ordered="true" scandirection="forward" forcedindex="false" forceseek="false" noexpandhint="false"> <definedvalues> <definedvalue> <columnreference database="[rating]" schema="[risk]" table="[amount]" alias="[a]" column="reportingperiodid" /> 
</definedvalue> <definedvalue> <columnreference database="[rating]" schema="[risk]" table="[amount]" alias="[a]" column="lineitemtypeid" /> </definedvalue> <definedvalue> <columnreference database="[rating]" schema="[risk]" table="[amount]" alias="[a]" column="amount" /> </definedvalue> </definedvalues> <object database="[rating]" schema="[risk]" table="[amount]" index="[pk_amount]" alias="[a]" indexkind="clustered" /> <seekpredicates> <seekpredicatenew> <seekkeys> <prefix scantype="eq"> <rangecolumns> <columnreference database="[rating]" schema="[risk]" table="[amount]" alias="[a]" column="reportingperiodid" /> </rangecolumns> <rangeexpressions> <scalaroperator scalarstring="@reportingperiodcomposition.[reportingperiodid] [rpc].[reportingperiodid]"> <identifier> <columnreference table="@reportingperiodcomposition" alias="[rpc]" column="reportingperiodid" /> </identifier> </scalaroperator> </rangeexpressions> </prefix> </seekkeys> </seekpredicatenew> </seekpredicates> </indexscan> </relop> </nestedloops> </relop> <relop avgrowsize="18" estimatecpu="0.000165111" estimateio="0.003125" estimaterebinds="103.769" estimaterewinds="0" estimaterows="1" logicalop="clustered index seek" nodeid="7" parallel="false" physicalop="clustered index seek" estimatedtotalsubtreecost="0.33565" tablecardinality="178911"> <outputlist> <columnreference database="[rating]" schema="[risk]" table="[adjustment]" alias="[adj]" column="lineitemtypeid" /> <columnreference database="[rating]" schema="[risk]" table="[adjustment]" alias="[adj]" column="amount" /> </outputlist> <runtimeinformation> <runtimecountersperthread thread="0" actualrows="1" actualendofscans="137647" actualexecutions="137647" /> </runtimeinformation> <indexscan ordered="true" scandirection="forward" forcedindex="false" forceseek="false" noexpandhint="false"> <definedvalues> <definedvalue> <columnreference database="[rating]" schema="[risk]" table="[adjustment]" alias="[adj]" column="lineitemtypeid" /> </definedvalue> 
<definedvalue> <columnreference database="[rating]" schema="[risk]" table="[adjustment]" alias="[adj]" column="amount" /> </definedvalue> </definedvalues> <object database="[rating]" schema="[risk]" table="[adjustment]" index="[ix_adjustment_reportingperiodid_lineitemtypeid]" alias="[adj]" indexkind="clustered" /> <seekpredicates> <seekpredicatenew> <seekkeys> <prefix scantype="eq"> <rangecolumns> <columnreference database="[rating]" schema="[risk]" table="[adjustment]" alias="[adj]" column="reportingperiodid" /> </rangecolumns> <rangeexpressions> <scalaroperator scalarstring="@reportingperiodcomposition.[reportingperiodid] [rpc].[reportingperiodid]"> <identifier> <columnreference table="@reportingperiodcomposition" alias="[rpc]" column="reportingperiodid" /> </identifier> </scalaroperator> </rangeexpressions> </prefix> </seekkeys> </seekpredicatenew> </seekpredicates> <predicate> <scalaroperator scalarstring="[rating].[risk].[amount].[reportingperiodid] [a].[reportingperiodid] null or [rating].[risk].[amount].[lineitemtypeid] [a].[lineitemtypeid]=[rating].[risk].[adjustment].[lineitemtypeid] [adj].[lineitemtypeid]"> <logical operation="or"> <scalaroperator> <compare compareop="is"> <scalaroperator> <identifier> <columnreference database="[rating]" schema="[risk]" table="[amount]" alias="[a]" column="reportingperiodid" /> </identifier> </scalaroperator> <scalaroperator> <const constvalue="null" /> </scalaroperator> </compare> </scalaroperator> <scalaroperator> <compare compareop="eq"> <scalaroperator> <identifier> <columnreference database="[rating]" schema="[risk]" table="[amount]" alias="[a]" column="lineitemtypeid" /> </identifier> </scalaroperator> <scalaroperator> <identifier> <columnreference database="[rating]" schema="[risk]" table="[adjustment]" alias="[adj]" column="lineitemtypeid" /> </identifier> </scalaroperator> </compare> </scalaroperator> </logical> </scalaroperator> </predicate> </indexscan> </relop> </nestedloops> </relop> <predicate> 
<scalaroperator scalarstring="[rating].[risk].[amount].[lineitemtypeid] [a].[lineitemtypeid] not null or [rating].[risk].[adjustment].[lineitemtypeid] [adj].[lineitemtypeid] not null"> <logical operation="or"> <scalaroperator> <compare compareop="is not"> <scalaroperator> <identifier> <columnreference database="[rating]" schema="[risk]" table="[amount]" alias="[a]" column="lineitemtypeid" /> </identifier> </scalaroperator> <scalaroperator> <const constvalue="null" /> </scalaroperator> </compare> </scalaroperator> <scalaroperator> <compare compareop="is not"> <scalaroperator> <identifier> <columnreference database="[rating]" schema="[risk]" table="[adjustment]" alias="[adj]" column="lineitemtypeid" /> </identifier> </scalaroperator> <scalaroperator> <const constvalue="null" /> </scalaroperator> </compare> </scalaroperator> </logical> </scalaroperator> </predicate> </filter> </relop> </computescalar> </relop> <parameterlist> <columnreference column="@src" parameterruntimevalue="(2)" /> </parameterlist> </queryplan> </stmtsimple> </statements> </batch> </batchsequence> </showplanxml>
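There is nothing particularly bad in the query plan you have posted, as far as I can see; I suspect SQL Server is making the right choices. The one thing that looks dodgy is that the estimated and actual row counts are quite far apart (for example, 104.769 estimated rows against 137,631 actual rows on the filter), which indicates that the statistics are not entirely up to date. Forcibly update the statistics and see whether it continues to use the same query plan.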
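A minimal sketch of that statistics refresh, assuming the database and table names shown in the posted plan (`rating`, `risk.amount`, `risk.adjustment`) and that a full scan is affordable on your system. Keep in mind that a table variable carries no statistics of its own, so this only helps the estimates for the two base tables; the estimate of 1 row against 1,030 actual rows on the seek into @reportingperiodcomposition may well stay as it is.

```sql
-- Sketch only: "rating" is the database name that appears in the posted plan; adjust to yours.
use rating;

-- FULLSCAN reads every row, which can take a while on the ~30,000,000-row amount table.
update statistics risk.amount with fullscan;
update statistics risk.adjustment with fullscan;

-- Confirm when each statistics object on the two tables was last refreshed.
select
    object_name(s.object_id) as table_name,
    s.name as stats_name,
    stats_date(s.object_id, s.stats_id) as last_updated
from sys.stats s
where s.object_id in (object_id('risk.amount'), object_id('risk.adjustment'));
```

After the refresh, rerun the query with the actual execution plan enabled and compare the estimated and actual row counts again.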
If you are having an issue with inconsistent performance, then on a dev box clear the query plan cache and generate the query plan with an @src value that will produce few rows, then clear the plan cache again and generate the query plan with an @src value that will produce a large number of rows. If the query plans are the same, you are OK; if they are different, you may need to use the OPTIMIZE FOR hint. What happens with parameterized queries is that the first run determines the plan that sits in the cache, and until that plan ages out, subsequent runs of the query use the same plan.
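The test and the hint could look roughly like the sketch below. It assumes a development server, the object names from the posted plan, and an illustrative @src value of 2 (the runtime value recorded in that plan); the table variable declares only the columns named in the question and must be populated the way your real batch does it. Whether the hint actually changes the plan in this case is something to check rather than assume.

```sql
-- Sketch only. DBCC FREEPROCCACHE drops every cached plan on the instance,
-- so run this on a development server, never in production.
dbcc freeproccache;

declare @src int = 2;  -- illustrative; repeat the test with a value that matches few rows

-- Only the columns named in the question are declared; the elided ones are left out.
declare @reportingperiodcomposition table (
    src int not null,
    groupreportingperiodid int,
    reportingperiodid int not null,
    clientid int,
    perioddate date,
    primary key clustered (src, reportingperiodid)
);

-- (populate @reportingperiodcomposition here as your real batch does)

select
    rpc.reportingperiodid,
    coalesce(a.lineitemtypeid, adj.lineitemtypeid) as lineitemtypeid,
    a.amount,
    adj.amount as adjustment
from @reportingperiodcomposition rpc
left join rating.risk.amount a
    on rpc.reportingperiodid = a.reportingperiodid
left join rating.risk.adjustment adj
    on rpc.reportingperiodid = adj.reportingperiodid
    and (a.reportingperiodid is null or a.lineitemtypeid = adj.lineitemtypeid)
where
    rpc.src = @src
    and (a.lineitemtypeid is not null or adj.lineitemtypeid is not null)
-- Compile the plan as if @src were 2, whatever value is actually supplied.
option (optimize for (@src = 2));
```

Save the actual plan from each run and compare them. If no single value is representative, SQL Server 2008 also accepts `option (optimize for unknown)`, which asks the optimizer to use average statistical data instead of a specific value.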
Can you provide more information on the specific problem you are encountering or looking to solve by having this reviewed?
sql-server-2008 tsql query-optimization