Thursday, November 11, 2010

How to determine if a table has changed in SQL

Often you need to know if a sql tables data has changed.  The change can be a new row, deleted row or updated row.  To determine if a row has changed you can use the following command that will return a checksum identifying a change to the table.
You can also use a where clause.  For example get the checksum on just part of the table.



Returns the checksum value computed over a row of a table, or over a list of expressions. CHECKSUM is intended for use in building hash indices.
CHECKSUM ( * | expression [ ,...n ] )
Specifies that computation is over all the columns of the table. CHECKSUM returns an error if any column is of noncomparable data type. Noncomparable data types are text, ntext, image, and cursor, as well as sql_variant with any of the above types as its base type.
Is an expression of any type except a noncomparable data type.
Return Types
CHECKSUM computes a hash value, called the checksum, over its list of arguments. The hash value is intended for use in building hash indices. If the arguments to CHECKSUM are columns, and an index is built over the computed CHECKSUM value, the result is a hash index, which can be used for equality searches over the columns.
CHECKSUM satisfies the properties of a hash function: CHECKSUM applied over any two lists of expressions returns the same value if the corresponding elements of the two lists have the same type and are equal when compared using the equals (=) operator. For the purpose of this definition, NULL values of a given type are considered to compare as equal. If one of the values in the expression list changes, the checksum of the list also usually changes. However, there is a small chance that the checksum will not change.
BINARY_CHECKSUM and CHECKSUM are similar functions: they can be used to compute a checksum value on a list of expressions, and the order of expressions affects the resultant value. The order of columns used in the case of CHECKSUM(*) is the order of columns specified in the table or view definition, including computed columns.
CHECKSUM and BINARY_CHECKSUM return different values for the string data types, where locale can cause strings with different representation to compare equal. The string data types are char, varchar, nchar, nvarchar, or sql_variant (if its base type is a string data type). For example, the BINARY_CHECKSUM values for the strings "McCavity" and "Mccavity" are different. In contrast, in a case-insensitive server, CHECKSUM returns the same checksum values for those strings. CHECKSUM values should not be compared against BINARY_CHECKSUM values.
Using CHECKSUM to build hash indices
The CHECKSUM function may be used to build hash indices. The hash index is built by adding a computed checksum column to the table being indexed, then building an index on the checksum column.
-- Create a checksum index.
USE Northwind
ADD cs_Pname AS checksum(ProductName)
CREATE INDEX Pname_index ON Products (cs_Pname)

The checksum index can be used as a hash index, particularly to improve indexing speed when the column to be indexed is a long character column. The checksum index can be used for equality searches.

/*Use the index in a SELECT query. Add a second search 
condition to catch stray cases where checksums match, 
but the values are not identical.*/
FROM Products
WHERE checksum(N'Vegie-spread') = cs_Pname
AND ProductName = N'Vegie-spread'

Creating the index on the computed column materializes the checksum column, and any changes to the ProductName value will be propagated to the checksum column. Alternatively, an index could be built directly on the column indexed. However, if the key values are long, a regular index is not likely to perform as well as a checksum index.

See Also




1 comment:

sqlking said...

Great post! Been trying to do this for years!