patternsqlMinor
Bad row estimate following Compute Scalar operator in plan
Viewed 0 times
estimateoperatorcomputescalarbadfollowingplanrow
Problem
I'm struggling to understand where a row estimate is coming from in an execution plan.
Paste the plan link
```
declare
@BatchKey INT = 1, @ParentBatchKey INT = 1,
@QuoteRef varchar(50) = 'Q00018249',
@MpanRef varchar(50) = '1425431100004'
SELECT DISTINCT
ISNULL(c.ContractReference,-1) AS [ContractReference] ,
ISNULL(d_cd.ContractDetailsKey,-1) AS [ContractDetailsKey] ,
-1 AccountManagerKey,
-1 SegmentationKey,
ISNULL(d_tpi.TpiKey,-1) AS [TpiKey] ,
ISNULL(d_cu.CustomerKey,-1) AS [CustomerKey] ,
ISNULL(d_p.ProductKey,-1) AS [ProductKey] ,
-1 as PayPointKey,
-1 AS [GspBandingKey], --Not used in Junifer ESOB
ISNULL(d_pps.[ProductPricingStructureKey],-1) AS [ProductPricingStructureKey],
ISNULL(d_tou.TouBandingKey,-1) AS [PricingStructureBandingKey],
-1 AS [VolumePointCategoryKey],
ISNULL(d_ppc.PowerPeriodCategoryKey,-1) AS [PowerPeriodCategoryKey],
ISNULL(d_pcat.[PriceComponentAggregationTypeKey],-1) AS [PriceComponentAggregationTypeKey],
-1 AS [MarginRateBandingKey], --Not used in Junifer ESOB
-1 AS [DuosUrcBandingKey], --Not used in Junifer ESOB
-1 AS [ConsumptionToleranceKey],
ISNULL(d_mp.MeterPointKey,-1) AS [MeterPointKey] ,
ISNULL(d.DateKey,-1) AS [ForecastDateKey] ,
-1 AS [ForecastEFADateKey],
ISNULL(d_cw.DateKey,-1) AS [ContractWonDateKey] ,
ISNULL(f.SiteVolumeKwh,0) AS [SiteVolume] ,
ISNULL(f.GspVolumeKwh,0) AS [GspVolume] ,
ISNULL(f.NbpVolumeKwh,0) AS [NbpVolume],
@BatchKey,
@ParentBatchKey,
CAST(f.ForecastKey as NVARCHAR(100)) AS [SourceId]
FROM
[Electricity].[Forecast] f
INNER JOIN Electricity.ContractMeterPoint cmp ON cmp.MeterPointKey = f.MeterPointKey and cmp.ContractKey = f.ContractKey
INNER JOIN Electricity.Contract c on c.ContractKey = cmp.ContractKey
INNER JOIN Electricity.MeterPoint mp O
Paste the plan link
```
declare
@BatchKey INT = 1, @ParentBatchKey INT = 1,
@QuoteRef varchar(50) = 'Q00018249',
@MpanRef varchar(50) = '1425431100004'
SELECT DISTINCT
ISNULL(c.ContractReference,-1) AS [ContractReference] ,
ISNULL(d_cd.ContractDetailsKey,-1) AS [ContractDetailsKey] ,
-1 AccountManagerKey,
-1 SegmentationKey,
ISNULL(d_tpi.TpiKey,-1) AS [TpiKey] ,
ISNULL(d_cu.CustomerKey,-1) AS [CustomerKey] ,
ISNULL(d_p.ProductKey,-1) AS [ProductKey] ,
-1 as PayPointKey,
-1 AS [GspBandingKey], --Not used in Junifer ESOB
ISNULL(d_pps.[ProductPricingStructureKey],-1) AS [ProductPricingStructureKey],
ISNULL(d_tou.TouBandingKey,-1) AS [PricingStructureBandingKey],
-1 AS [VolumePointCategoryKey],
ISNULL(d_ppc.PowerPeriodCategoryKey,-1) AS [PowerPeriodCategoryKey],
ISNULL(d_pcat.[PriceComponentAggregationTypeKey],-1) AS [PriceComponentAggregationTypeKey],
-1 AS [MarginRateBandingKey], --Not used in Junifer ESOB
-1 AS [DuosUrcBandingKey], --Not used in Junifer ESOB
-1 AS [ConsumptionToleranceKey],
ISNULL(d_mp.MeterPointKey,-1) AS [MeterPointKey] ,
ISNULL(d.DateKey,-1) AS [ForecastDateKey] ,
-1 AS [ForecastEFADateKey],
ISNULL(d_cw.DateKey,-1) AS [ContractWonDateKey] ,
ISNULL(f.SiteVolumeKwh,0) AS [SiteVolume] ,
ISNULL(f.GspVolumeKwh,0) AS [GspVolume] ,
ISNULL(f.NbpVolumeKwh,0) AS [NbpVolume],
@BatchKey,
@ParentBatchKey,
CAST(f.ForecastKey as NVARCHAR(100)) AS [SourceId]
FROM
[Electricity].[Forecast] f
INNER JOIN Electricity.ContractMeterPoint cmp ON cmp.MeterPointKey = f.MeterPointKey and cmp.ContractKey = f.ContractKey
INNER JOIN Electricity.Contract c on c.ContractKey = cmp.ContractKey
INNER JOIN Electricity.MeterPoint mp O
Solution
Why would the scalar operator be fed an estimate of ~14000 rows, then estimate an output of 5? Is this a problem or a red herring?
This is counter-intuitive, but a natural consequence of the way the query optimizer explores the plan space. As it generates new, logically-equivalent, alternatives for a particular plan operator or subtree, it may need to derive a new cardinality estimate.
Since estimation is a statistical process, there is no guarantee that estimates derived on logically-equivalent (but physically different) trees will produce the same number, in fact in the majority of cases, they won't. There is normally no obvious way to prefer one estimate over another.
When optimization reaches its end point, the best physical alternatives found are 'stitched together' to form the final plan. This plan can have 'inconsistencies' as a result, simply because estimates were computed on different logic structures at different times. For example, a Compute Scalar might have started out as a logical aggregate, which was later simplified.
I wrote more about this in my article Indexed Views and Statistics.
If you suspect the cardinality mis-estimate is affecting plan choice (in an important way), you may choose to split the query up manually or use hints. Materializing the small intermediate set at or around node 27 into a temporary table may well improve plan quality, since the optimizer can see accurate cardinality at that point and create automatic statistics. The query writer can also choose to add indexing to the temporary table.
Is it anything to do with the conversions it is performing? I can understand that affecting a join, but why would it affect the output of the conversion?
Not usually, no, though it is best to avoid conversions wherever possible. Certainly conversions can affect cardinality estimation, but there is little indication it is the cause here.
This is counter-intuitive, but a natural consequence of the way the query optimizer explores the plan space. As it generates new, logically-equivalent, alternatives for a particular plan operator or subtree, it may need to derive a new cardinality estimate.
Since estimation is a statistical process, there is no guarantee that estimates derived on logically-equivalent (but physically different) trees will produce the same number, in fact in the majority of cases, they won't. There is normally no obvious way to prefer one estimate over another.
When optimization reaches its end point, the best physical alternatives found are 'stitched together' to form the final plan. This plan can have 'inconsistencies' as a result, simply because estimates were computed on different logic structures at different times. For example, a Compute Scalar might have started out as a logical aggregate, which was later simplified.
I wrote more about this in my article Indexed Views and Statistics.
If you suspect the cardinality mis-estimate is affecting plan choice (in an important way), you may choose to split the query up manually or use hints. Materializing the small intermediate set at or around node 27 into a temporary table may well improve plan quality, since the optimizer can see accurate cardinality at that point and create automatic statistics. The query writer can also choose to add indexing to the temporary table.
Is it anything to do with the conversions it is performing? I can understand that affecting a join, but why would it affect the output of the conversion?
Not usually, no, though it is best to avoid conversions wherever possible. Certainly conversions can affect cardinality estimation, but there is little indication it is the cause here.
Context
StackExchange Database Administrators Q#215450, answer score: 8
Revisions (0)
No revisions yet.