Wednesday, October 10, 2007

MERGE statement in SQL Server 2008

The MERGE statement in SQL Server 2008 is a new DML statement that combines multiple DML operations. It performs INSERT, UPDATE, or DELETE operations on a target table based on the results of a join with a source table. This statement enables you to merge two tables based on a set of criteria. In previous versions of SQL Server, you have to create separate statements if you need to insert, update, or delete data in one table based on certain conditions in another table or query. With MERGE, you can include the logic for these data modifications in one statement. You specify a source record set, which could be a table join or a view, and a target table, and the join condition between the two. You then specify the DML statement that you want to execute when the records between the two data are matched or are not matched. One typical use for this is bulk loading data warehouse tables or maybe executing DML statements during off-peak hours. To demonstrate how the MERGE statement works, here's a very simple example. I'll create two tables and populate them with values.

USE Northwind
GO
CREATE TABLE Table1 (ID int, [Value] varchar(20), Comment varchar(50))
CREATE TABLE Table2 (ID int, [Value] varchar(20), Comment varchar(50))
GO
INSERT INTO Table1 VALUES (1, 'A', 'Inserted'), (2, 'B', 'Inserted')
INSERT INTO Table2 VALUES (1, 'C', 'Inserted'), (3, 'D', 'Inserted')
GO


Here is where I'll be using the MERGE statement. What I'll do is I'll use Table1 as my source data and Table2 as my target table. I'll look at Table1 based on the ID column and, if a match is found, I'll update the Comment column with a value Match Found from Source. If the row from the Table1 does not match another row in Table2, I'll insert a new row and specify the Comment column with a value No Match Found on Target. If the row from Table2 does not match another row from Table1, I'll update the Comment column value to No Match Found from Source. The MERGE statement for this scenario looks like this

MERGE Table2 AS targetTable
USING (SELECT ID, [Value] FROM Table1) SourceTable
ON (targetTable.ID = SourceTable.ID)
WHEN MATCHED THEN UPDATE SET Comment = 'Match Found from Source'
WHEN TARGET NOT MATCHED THEN INSERT VALUES (ID, [Value], 'No Match Found on Target')
WHEN SOURCE NOT MATCHED THEN UPDATE SET Comment = 'No Match Found from Source';
GO


If you analyze the query, it will update the first record in Table2 with ID=1 and set the Comment field value to Match Found from Source as they both have the same values. Since the second record in Table1 has a an ID value of 2, it will update the Comment field value to No Match Found on Target on Table2 since the ID column has a value of 3. And since the Table2 has a record with no matching ID value from Table1, it will update Comment field value to No Match Found on Target.

Having explained how the concept of the MERGE statement works, let's look at a practical application. Using the Northwind database, I have created a script which will update the Products table based on the sales made for the day. It will subtract the total number of units for a specific product from the Order Details table from the UnitsInStock column of the Products table. You can run this query at the end of the day when there is not much queries running during the day. Normally, we would either create a trigger or a stored procedure which is wrapped in a transaction to solve this problem.

USE Northwind

MERGE Products AS P
USING (SELECT ProductID, SUM(Quantity) FROM [Order Details] OD
JOIN Orders O
ON OD.OrderID = O.OrderID
AND O.OrderDate = GETDATE()
GROUP BY ProductID) AS src (ProductID, OrderQty)
ON (P.ProductID = src.ProductID)
WHEN MATCHED AND P.UnitsInStock - src.OrderQty <> 0
THEN UPDATE SET P.UnitsInStock = P.UnitsInStock - src.OrderQty
WHEN MATCHED AND P.UnitsInStock - src.OrderQty = 0
THEN DELETE;


Check out my SQL Server 2008 videos at BlogCastRepository.com

2 comments:

Abbas said...

You should not use

AND O.OrderDate = GETDATE()

as it may return unexpected results as Microsoft suggests.

" Caution:
It is important to specify only the columns from the target table that are used for matching purposes. That is, specify columns from the target table that are compared to the corresponding column of the source table. Do not attempt to improve query performance by filtering out rows in the target table in the ON clause, such as by specifying AND NOT target_table.column_x = value. Doing so may return unexpected and incorrect results. "

http://technet.microsoft.com/en-us/library/bb510625.aspx

Edwin Sarmiento said...

Thanks for highlighting this. I forgot to use this approach when using GETDATE()

O.OrderDate=(SELECT convert(datetime, floor(convert(float, getdate()))))

Google