patternsqlMinor
Update Null Values With Other Values
Viewed 0 times
updatewithnullvaluesother
Problem
I am new to the SQL Server world. In my example below, I have two companies:
Only one entry for each company has a value for the columns
I need to update the table so that each company has a value for all columns. In other words, I need to update all blank (null) values.
Example with made-up data:
Question
What would the syntax be to write an update statement to update the null values for columns
- Building Blocks
- Red Cement
Only one entry for each company has a value for the columns
power1, power2, power3.I need to update the table so that each company has a value for all columns. In other words, I need to update all blank (null) values.
Example with made-up data:
Create Table #Rentarious
(
c1 varchar(200)
,yryryryr datetime
,power1 varchar(200)
,power2 varchar(200)
,power3 varchar(200)
)
Insert Into #Rentarious VALUES
('Building Blocks','2016','Red','Blue','Green')
,('Red Cement', '2012', 'Pink','Purple','Orange')
Insert Into #Rentarious(c1, yryryryr) VALUES
('Building Blocks', '2012')
,('Building Blocks', '2013')
,('Building Blocks', '2014')
,('Red Cement', '2016')
,('Red Cement', '2011')Question
What would the syntax be to write an update statement to update the null values for columns
power1, power2, power3 to the values already listed in the table?Solution
What would the syntax be to write an update statement to update the null values for fields
I take this to mean that each company's nulls should be populated based on the values taken from the one populated entry that, as you say elsewhere in your post, each company has. So, in pseudo-code, your UPDATE statement would need to look something like this:
One way to implement the above pattern could be to use correlated subqueries:
While that would work, such an update might not be very efficient, because the same table would be touched three extra times to obtain the source value. Since you know that the non-null values are stored in one row for each company, you could obtain them all in just one extra pass over the table by using a derived table and the proprietary "update with a join" syntax:
You can also rewrite it to use the explicit JOIN syntax:
As you can see, there is only one subquery in both variations and it provides all three values to populate the other rows.
Note, though, that it is also possible to solve this problem without any extra scans. First of all, if this were a SELECT statement, you could return the populated row's values in every row by using a window aggregate function, like this:
The MAX function works in this situation because it returns the maximum value only across the non-null values in the specified set. In your case, there would be only one non-null value per partition of
So the only thing remaining would be just to
And it is indeed possible to do so in SQL Server, because the result of the above SELECT query can be used as the target of your UPDATE statement. You can use it as a derived table:
```
UPDATE
tgt
SET
power1 = populatedPower1,
power2 = populatedPower2,
power3 = populatedPower3
FROM
(
SELECT
c1,
yryryryr,
power1,
power2,
power3,
populatedPower1 = MAX(power1) OVER (PARTITION BY c1),
populatedPower2 = MAX(power2) OVER (PARTITION BY c1),
populatedPower3 = MAX(power3) OVER (PART
power1, power2, power3 with the values already listed in the table?I take this to mean that each company's nulls should be populated based on the values taken from the one populated entry that, as you say elsewhere in your post, each company has. So, in pseudo-code, your UPDATE statement would need to look something like this:
UPDATE
#Rentarious
SET
power1 = power1 from the same company’s populated row,
power2 = power2 from the same company’s populated row,
power3 = power3 from the same company’s populated row
WHERE power1 IS NULL
AND power2 IS NULL
AND power3 IS NULL
;One way to implement the above pattern could be to use correlated subqueries:
UPDATE
#Rentarious
SET
power1 = (SELECT power1 FROM #Rentarious AS src WHERE src.c1 = #Rentarious.c1 AND src.power1 IS NOT NULL),
power2 = (SELECT power2 FROM #Rentarious AS src WHERE src.c1 = #Rentarious.c1 AND src.power2 IS NOT NULL),
power3 = (SELECT power3 FROM #Rentarious AS src WHERE src.c1 = #Rentarious.c1 AND src.power3 IS NOT NULL)
WHERE power1 IS NULL
AND power2 IS NULL
AND power3 IS NULL
;While that would work, such an update might not be very efficient, because the same table would be touched three extra times to obtain the source value. Since you know that the non-null values are stored in one row for each company, you could obtain them all in just one extra pass over the table by using a derived table and the proprietary "update with a join" syntax:
UPDATE
#Rentarious
SET
power1 = sub.power1,
power2 = sub.power2,
power3 = sub.power3
FROM
(
SELECT
c1,
power1,
power2,
power3
FROM
#Rentarious
WHERE power1 IS NOT NULL
AND power2 IS NOT NULL
AND power3 IS NOT NULL
) AS sub
WHERE #Rentarious.c1 = sub.c1
AND #Rentarious.power1 IS NULL
AND #Rentarious.power2 IS NULL
AND #Rentarious.power3 IS NULL
;You can also rewrite it to use the explicit JOIN syntax:
UPDATE
tgt
SET
power1 = src.power1,
power2 = src.power2,
power3 = src.power3
FROM
#Rentarious AS tgt
INNER JOIN
(
SELECT
c1,
power1,
power2,
power3
FROM
#Rentarious
WHERE power1 IS NOT NULL
AND power2 IS NOT NULL
AND power3 IS NOT NULL
) AS sub
ON tgt.c1 = sub.c1
WHERE tgt.power1 IS NULL
AND tgt.power2 IS NULL
AND tgt.power3 IS NULL
;As you can see, there is only one subquery in both variations and it provides all three values to populate the other rows.
Note, though, that it is also possible to solve this problem without any extra scans. First of all, if this were a SELECT statement, you could return the populated row's values in every row by using a window aggregate function, like this:
SELECT
c1,
yryryryr,
power1,
power2,
power3,
populatedPower1 = MAX(power1) OVER (PARTITION BY c1),
populatedPower2 = MAX(power2) OVER (PARTITION BY c1),
populatedPower3 = MAX(power3) OVER (PARTITION BY c1)
FROM
#Rentarious
;The MAX function works in this situation because it returns the maximum value only across the non-null values in the specified set. In your case, there would be only one non-null value per partition of
c1 in each of the three cases, so the function would return that one value. This would be the result of the query for the example in your question:c1 yryryryr power1 power2 power3 populatedPower1 populatedPower2 populatedPower3
--------------- ---------- ------ ------ ------ --------------- --------------- ---------------
Building Blocks 2012-01-01 NULL NULL NULL Red Blue Green
Building Blocks 2013-01-01 NULL NULL NULL Red Blue Green
Building Blocks 2014-01-01 NULL NULL NULL Red Blue Green
Building Blocks 2016-01-01 Red Blue Green Red Blue Green
Red Cement 2012-01-01 Pink Purple Orange Pink Purple Orange
Red Cement 2016-01-01 NULL NULL NULL Pink Purple Orange
Red Cement 2011-01-01 NULL NULL NULL Pink Purple OrangeSo the only thing remaining would be just to
SET
power1 = populatedPower1,
power2 = populatedPower2,
power3 = populatedPower3And it is indeed possible to do so in SQL Server, because the result of the above SELECT query can be used as the target of your UPDATE statement. You can use it as a derived table:
```
UPDATE
tgt
SET
power1 = populatedPower1,
power2 = populatedPower2,
power3 = populatedPower3
FROM
(
SELECT
c1,
yryryryr,
power1,
power2,
power3,
populatedPower1 = MAX(power1) OVER (PARTITION BY c1),
populatedPower2 = MAX(power2) OVER (PARTITION BY c1),
populatedPower3 = MAX(power3) OVER (PART
Code Snippets
UPDATE
#Rentarious
SET
power1 = power1 from the same company’s populated row,
power2 = power2 from the same company’s populated row,
power3 = power3 from the same company’s populated row
WHERE power1 IS NULL
AND power2 IS NULL
AND power3 IS NULL
;UPDATE
#Rentarious
SET
power1 = (SELECT power1 FROM #Rentarious AS src WHERE src.c1 = #Rentarious.c1 AND src.power1 IS NOT NULL),
power2 = (SELECT power2 FROM #Rentarious AS src WHERE src.c1 = #Rentarious.c1 AND src.power2 IS NOT NULL),
power3 = (SELECT power3 FROM #Rentarious AS src WHERE src.c1 = #Rentarious.c1 AND src.power3 IS NOT NULL)
WHERE power1 IS NULL
AND power2 IS NULL
AND power3 IS NULL
;UPDATE
#Rentarious
SET
power1 = sub.power1,
power2 = sub.power2,
power3 = sub.power3
FROM
(
SELECT
c1,
power1,
power2,
power3
FROM
#Rentarious
WHERE power1 IS NOT NULL
AND power2 IS NOT NULL
AND power3 IS NOT NULL
) AS sub
WHERE #Rentarious.c1 = sub.c1
AND #Rentarious.power1 IS NULL
AND #Rentarious.power2 IS NULL
AND #Rentarious.power3 IS NULL
;UPDATE
tgt
SET
power1 = src.power1,
power2 = src.power2,
power3 = src.power3
FROM
#Rentarious AS tgt
INNER JOIN
(
SELECT
c1,
power1,
power2,
power3
FROM
#Rentarious
WHERE power1 IS NOT NULL
AND power2 IS NOT NULL
AND power3 IS NOT NULL
) AS sub
ON tgt.c1 = sub.c1
WHERE tgt.power1 IS NULL
AND tgt.power2 IS NULL
AND tgt.power3 IS NULL
;SELECT
c1,
yryryryr,
power1,
power2,
power3,
populatedPower1 = MAX(power1) OVER (PARTITION BY c1),
populatedPower2 = MAX(power2) OVER (PARTITION BY c1),
populatedPower3 = MAX(power3) OVER (PARTITION BY c1)
FROM
#Rentarious
;Context
StackExchange Database Administrators Q#140613, answer score: 6
Revisions (0)
No revisions yet.