SQL Server databases are the backbone of many enterprise applications, and good Transact-SQL (T-SQL) code is the best way to maximize SQL Server performance. Therefore, it is important for SQL developers to follow T-SQL best practices and guidelines when writing code. This article highlights some common T-SQL best practices to help ensure reliable, robust and efficient SQL code.
Choose appropriate data type
When you create a table, you must decide on the data type to use for column definitions. A data type defines the kind of data you can store in a column. You also use data types to define variables and stored procedure input and output parameters. You must select a data type for each column or variable appropriate to the data stored in that column or variable. In addition, you must consider storage requirements and choose data types that allow for efficient storage. For example, always use tinyint instead of smallint, int or bigint when you want to store whole positive integers between 0 and 255. This is because, tinyint is a fixed 1 byte field whereas smallint is 2 byte, int is 4 byte and bigint is an 8 byte fixed field.
Choosing the right data types also improves data integrity. For example, if you use a datetime data type for a column of dates, then only dates can be stored in this column. However, if you use a character or numeric data type for the column, then eventually someone is able to store any type of character or numeric data value in that column that does not represent a date.
Lastly, choosing the correct data type improves performance by resulting in the correct execution plan.
Avoid using DISTINCT or UNION clauses
Placing a DISTINCT clause in your T-SQL queries takes the results and removes duplicate rows. If you are sure that your query result set will only contain unique rows, one of the T-SQL best practices is to avoid using the DISTINCT clause, as it causes an unnecessary sorting operation.
The UNION clause also adds an additional sorting operation by eliminating duplicate records from two or more SELECT statements. If you use UNION to combine the results of two or more SELECT statements that contain only a single set of data, it is better to use the UNION ALL clause. The UNION ALL does not remove duplicates, and as a result requires the least SQL Server backend processing to perform the union operation.
Avoid using NOLOCK query hint
The NOLOCK query hint is one of the most common T-SQL best practices, but it can also be one of the worst. Most developers think that the risk of using a NOLOCK hint is the possibility of getting inconsistent data, since it only reads rows and doesn’t wait for others to commit other SQL statements such as SELECT and UPDATE. That is true, but there is more to it than just reading uncommitted rows. Transactions do more than just select, update and delete rows. For example, the transaction often requires an index that is updated or runs out of space on the data page. This may require the allocation of new pages and relocation of existing rows on the existing page to this new page, which is called a page split. Because of this, you may miss several rows or have rows twice, which usually is not allowed if you are running your queries without NOLOCK query hint.
Provide full column lists for SELECT or INSERT statement
Another T-SQL best practice is to always provide full column lists that are required for the SELECT and INSERT statement. For example, if you use SELECT * FROM in your code or in a stored procedure, the column list is resolved each time you run the SELECT statement. Moreover, SELECT or INSERT statements generate an error or returns a different set of columns if the underlying tables schema changes.
Therefore, when performing selects, avoid using SELECT * FROM [TableName], instead provide the full column list, as follows:
SELECT [col1],…[coln] FROM [TableName].
Similarly, when performing Inserts, use column list in the INSERT clause, as follows:
INSERT INTO [TableName] [col1],[col2]…[coln])
VALUES (‘Value1, Value2,…ValueN)
Some more T-SQL best practices
Use SET NOCOUNT ON. SET NOCOUNT ON within batches, stored procedures and triggers to increase performance. This is because, when specified, the statement does not return the number of rows affected.
Prefer EXISTS keyword over IN keyword. When checking for the existence of records, favor EXISTS keyword over the IN keyword. This is because the IN keyword operates on lists and returns the complete result set from subqueries before further processing. Subqueries using the EXISTS keyword return either TRUE or FALSE, which is faster because once the match is found, it will quit looking as the condition has proven true.
Avoid cursors. Avoid cursors as much as possible. Instead, use a set-based approach to updating or inserting data from one table to another.
Steer clear of dynamic SQL. Avoid using dynamic SQL; try to find alternatives that do not require dynamic SQL. If you use dynamic SQL, use sp_executesql instead of EXECUTE (EXEC) because sp_executesql is more efficient and versatile than EXECUTE. It supports parameter substitution and generates execution plans that are more likely to be reused by SQL Server.
Use schema-qualified object names. Refer to table names with schema name prefixes. For example, use SELECT * FROM [SchemaName].[TableName] instead of SELECT * FROM [TableName].