Tuesday, October 11, 2011
Improving ADO.NET Performance
Objectives
- Optimize your data access design.
- Choose between DataSets and DataReaders.
- Run efficient database commands.
- Pass data between layers efficiently.
- Perform efficient transactions.
- Optimize connection management.
- Evaluate the cost of paging through records.
- Evaluate criteria for analyzing data access performance.
- Apply performance considerations to binary large object (BLOB) manipulation.
Overview
Well-designed data access code and data processing commands are essential elements for application performance and scalability. Typically, the database is a focal point for application load because the majority of application requests require data that comes from a database.This chapter provides proven strategies for designing and implementing data access code for performance and scalability.
How to Use This Chapter
Use this chapter to improve the implementation of your data access code for performance and scalability. To get the most out of this chapter, consider the following:- Jump to topics or read beginning to end. The main headings in this chapter help you to quickly identify and then locate the topic that interests you. Alternatively, you can read the chapter beginning to end to gain a thorough appreciation of the issues that affect ADO.NET performance.
- Use the checklist. Use "Checklist: ADO.NET Performance" in the "Checklists" section of this guide to quickly view and evaluate the guidelines presented in this chapter.
- Use the "Architecture" section of this chapter to understand how ADO.NET works. By understanding the architecture, you can make better design and implementation choices. Understand core ADO.NET components, such as data provider objects and the DataSet object.
- Use the "Design Considerations" section of this chapter to understand the high-level decisions that will affect implementation choices for ADO.NET code.
- Read Chapter 13, "Code Review: .NET Application Performance" See the "Data Access" section for specific guidance.
- Measure your application performance. Read the "ADO.NET/Data Access" and ".NET Framework Technologies" sections of Chapter 15, "Measuring .NET Application Performance" to learn about the key metrics that you can use to measure application performance. You have to measure application performance so that you can identify and resolve performance issues.
- Test your application performance. Read Chapter 16, "Testing .NET Application Performance" to learn how to apply performance testing to your application. You have to apply a coherent testing process and analyze the results.
- Tune your application performance. Read the "ADO.NET Tuning" section of Chapter 17, "Tuning .NET Application Performance" to learn how to resolve performance issues that you identify through the use of tuning metrics.
- Tune SQL Server. Read Chapter 14, "Improving SQL Server Performance" to ensure that your Microsoft® SQL Server database is appropriately configured.
Architecture
ADO.NET relies on data providers to provide access to the underlying data source. Each data provider exposes a set of objects that you use to manage connections, retrieve data, and update data. The core objects are the following:- Connection
- Command
- DataReader
- DataAdapter
The basic ADO.NET architecture is shown in Figure 12.1.
Figure 12.1: ADO.NET architecture
The following list outlines the purpose of each of the main ADO.NET objects: - Connection. This object represents a connection to a database.
- Command. This object represents an SQL statement that is run while connected to a data source. This object can be a stored procedure or a direct SQL statement.
- DataReader. This object retrieves a read-only, forward-only stream of data from a database. The DataReader object is designed for connected scenarios and offers better performance than reading data into a DataSet object at the expense of functionality. For more information about how to use DataReader objects and DataSet objects, see "DataSet vs. DataReader" later in this chapter.
- DataAdapter. This object channels data to and from a DataSet object and the underlying data source. The DataAdapter object also provides enhanced batch update features that were previously associated with the ADO Recordset object.
- DataSet. The DataSet object represents a disconnected, cached set of data. The DataSet is independent of the provider and is not tied to the underlying data source that might have been used to populate it. DataSet can easily be passed from component to component through the various layers of an application, and it can be serialized as XML. You should be aware of the way a DataSet is internally constructed because the DataSet contains a potentially large number of internal objects. This means that a large number of memory allocations are required to construct a typical DataSet.
A DataSet consists of one or more DataTable objects together with DataRelation objects that maintain table relationship information. Each DataTable contains DataRow objects and DataColumn objects. Constraint objects are used to represent a constraint that can be enforced on one or more DataColumn objects.Note You can also use typed datasets that derive from the basic DataSet class. Typed datasets provide benefits at build time and at run time. For more information, see "Typed DataSets" later in this chapter.
- DataView. Although the DataView object is not shown in Figure 12.1, you can use a DataView to sort and filter data in a DataTable. This capability is often used for data binding.
Abstracting Data Access
ADO.NET is designed around a set of generic interfaces that abstract the underlying data processing functionality. You can use these interfaces directly to abstract your data access layer so that you can minimize the impact of changing the type of data source that you use. Abstracting data access is extremely helpful when you are designing systems where your customer chooses the database server.The core interfaces provided by ADO.NET are found in the System.Data namespace:
- IDbConnection. This is an interface for managing database connections.
- IDbCommand. This is an interface for running SQL commands.
- IDbTransaction. This is an interface for managing transactions.
- IDataReader. This is an interface for reading data returned by a command.
- IDataAdapter. This is an interface for channeling data to and from datasets.
- There is some small cost associated with a virtual call through an interface.
- Certain expanded functionality is lost when you use the generic interfaces. For example, the ExecuteXmlReader method is implemented by the SqlCommand object but not by the IDbCommand interface.
- There is no generic base exception type, so you must catch provider-specific exception types, such as SqlException, OleDbException, or OdbcException.
- When you use the generic interfaces, you cannot take advantage of database-specific types that are defined for the managed providers; for example, you cannot take advantage of SqlDbType in SqlClient and Oracle-specific types in the Oracle provider. Using specific database types is helpful for type checking and parameter binding.
More Information
For more information about how to use the generic interfaces to abstract your data access, see the following resources:- Knowledge Base article 313304, "HOW TO: Use Base Classes to Reduce Code Forking with Managed Providers in Visual C# .NET," at http://support.microsoft.com/default.aspx?scid=kb;en-us;313304. This article includes sample code.
- Use MSDN® to look at the interfaces that are described earlier in this section to identify the providers that implement each of the interfaces.
Performance and Scalability Issues
The following is a list of the main issues that can adversely affect the performance and scalability of data access in your application.- Inefficient queries. Queries that process and then return more columns or rows than necessary waste processing cycles that could best be used for servicing other requests. Queries that do not take advantage of indexes may also cause poor performance.
- Retrieving too much data. Too much data in your results is usually the result of inefficient queries. The SELECT * query often causes this problem. You do not usually need to return all the columns in a row. Also, analyze the WHERE clause in your queries to ensure that you are not returning too many rows. Try to make the WHERE clause as specific as possible to ensure that the least number of rows are returned.
- Inefficient or missing indexes. Query efficiency decreases when indexes are missing because a full table scan must be performed. Also, as your data grows, tables may become fragmented. Failure to periodically rebuild indexes may also result in poor query performance.
- Unnecessary round trips. Round trips significantly affect performance. They are subject to network latency and to downstream server latency. Many data-driven Web sites heavily access the database for every user request. While connection pooling helps, the increased network traffic and processing load on the database server can adversely affect performance. Keep round trips to an absolute minimum.
- Too many open connections. Connections are an expensive and scarce resource, which should be shared between callers by using connection pooling. Opening a connection for each caller limits scalability. To ensure the efficient use of connection pooling, avoid keeping connections open and avoid varying connection strings.
- Failure to release resources. Failing to release resources can prevent them from being reused efficiently. If you fail to close connections before the connections fall out of scope, they are not reclaimed until garbage collection occurs for the connection. Failing to release resources can cause serious resource pressure and lead to shortages and timeouts.
- Transaction misuse. If you select the wrong type of transaction management, you may add latency to each operation. Additionally, if you keep transactions active for long periods of time, the active transactions may cause resource pressure. Transactions are necessary to ensure the integrity of your data, but you need to ensure that you use the appropriate type of transaction for the shortest duration possible and only where necessary.
- Overnormalized tables. Overnormalized tables may require excessive joins for simple operations. These additional steps may significantly affect the performance and scalability of your application, especially as the number of users and requests increases.
Design Considerations
To help ensure that data access in your application is optimized for performance, there are several issues that you must consider and a number of decisions that you must make at design time:- Design your data access layer based on how the data is used.
- Cache data to avoid unnecessary work.
- Connect by using service accounts.
- Acquire late, release early.
- Close disposable resources.
- Reduce round trips.
- Return only the data you need.
- Use Windows authentication.
- Choose the appropriate transaction type.
- Use stored procedures.
- Prioritize performance, maintainability, and productivity when you choose how to pass data across layers.
- Consider how to handle exceptions.
- Use appropriate normalization.
Design Your Data Access Layer Based on How the Data Is Used
If you choose to access tables directly from your application without an intermediate data access layer, you may improve the performance of your application at the expense of maintainability. The data access logic layer provides a level of abstraction from the underlying data store. A well-designed data access layer exposes data and functionality based on how the data is used and abstracts the underlying data store complexity.Do not arbitrarily map objects to tables and columns, and avoid deep object hierarchies. For example, if you want to display a subset of data, and your design retrieves an entire object graph instead of the necessary portions, there is unnecessary object creation overhead. Evaluate the data you need and how you want to use the data against your underlying data store.
Cache Data to Avoid Unnecessary Work
Caching data can substantially reduce the load on your database server. By caching data, you avoid the overhead of connecting to your database, searching, processing, and transmitting data from your database server. By caching data, you directly improve performance and scalability in your application.When you define your caching strategy, consider the following:
- Is the data used application-wide and shared by all users, or is the data specific to each user? Data that is used across the application, such as a list of products, is a better caching candidate than data that is specific to each user.
- How frequently do you need to update the cache? Even though the source data may change frequently, your application may not need to update the cache as often. If your data changes too frequently, it may not be a good caching candidate. You need to evaluate the expense of updating the cache compared to the cost of fetching the data as needed.
- Where should you cache data? You can cache data throughout the application layers. By caching data as close as possible to the consumer of the data, you can reduce the impact of network latency.
- What form of the data should you cache? The best form of data to cache is usually determined by the form that your clients require the data to be in. Try to reduce the number of times that you need to transform data.
- How do you expire items in the cache? Consider the mechanism that you will use to expire old items in the cache and the best expiration time for your application.
Connect by Using Service Accounts
There are several ways to authenticate and to open a connection to the database. You can use SQL authentication that has an identity specified in the connection string. Or, you can use Windows authentication by using the process identity, by using a specific service identity, or by impersonating the original caller's identity.From a security perspective, you should use Windows authentication. From a performance perspective, you should use a fixed service account and avoid impersonation. The fixed service account is typically the process account of the application. By using a fixed service account and a consistent connection string, you help ensure that database connections are pooled efficiently. You also help ensure that the database connections are shared by multiple clients. Using a fixed service account and a consistent connection string is a major factor in helping application scalability.
More Information
For more information, see "The Trusted Subsystem Model" in the "Authentication and Authorization" chapter of "Building Secure ASP.NET Applications: Authentication, Authorization and Secure Communication" on MSDN at http://msdn.microsoft.com/en-us/library/aa302383.aspx. This chapter explains how to use service accounts or process identity to connect to a database. You can also use the chapter to learn about the advantages and disadvantages of Windows and SQL authentication.
Acquire Late, Release Early
Your application should share expensive resources efficiently by acquiring the resources late, and then releasing them as early as possible. To do so:- Open database connections right when you need them. Close the database connections as soon as you are finished. Do not open them early, and do not hold them open across calls.
- Acquire locks late, and release them early.
Close Disposable Resources
Usually, disposable resources are represented by objects that provide a Dispose method or a Close method. Make sure that you call one of these methods as soon as you are finished with the resource. For more information about closing Connection objects and DataReader objects, see "Connections" later in this chapter.Reduce Round Trips
Network round trips are expensive and affect performance. Minimize round trips by using the following techniques:- If possible, batch SQL statements together. Failure to batch work creates additional and often unnecessary trips to the database. You can batch text SQL statements by separating them with a semicolon or by using a stored procedure. If you need to read multiple result sets, use the NextResult method of the DataReader object to access subsequent result sets.
- Use connection pooling to help avoid extra round trips. By reusing connections from a connection pool, you avoid the round trips that are associated with connection establishment and authentication. For more information, see "Connections" later in this chapter.
- Do not return results if you do not need them. If you only need to retrieve a single value, use the ExecuteScalar method to avoid the operations that are required to create a result set. You can also use the ExecuteNonQuery method when you perform data definition language (DDL) operations such as the create table operation. This also avoids the expense of creating a result set.
- Use caching to bring nearly static data close to the consumer instead of performing round trips for each request.
Implicit Round Trips
Be aware that certain operations can cause implicit round trips. Typically, any operation that extracts metadata from the database causes an implicit round trip. For example, avoid calling DeriveParameters if you know the parameter information in advance. It is more efficient to fill the parameters collection by setting the information explicitly. The following code sample illustrates a call that causes an implicit round trip.// This causes an implicit round trip to the database
SqlCommandBuilder.DeriveParameters(cmd);
Return Only the Data You Need
Evaluate the data that your application actually requires. Minimize the data that is sent over the network to minimize bandwidth consumption. The following approaches help reduce data over the network:- Return only the columns and rows that you need.
- Cache data where possible.
- Provide data paging for large results. For more information about paging, see "Paging Records" later in this chapter.
Use Windows Authentication
From a security perspective, you should use Windows authentication to connect to Microsoft SQL Server. There are several advantages to using Windows authentication. For example, credentials are not passed over the network, database connection strings do not contain credentials, and you can apply standard Windows security policies to accounts. For example, you can enforce use of strong passwords and apply password expiration periods.From a performance perspective, SQL authentication is slightly faster than Windows authentication, but connection pooling helps minimize the difference. You also need to help protect the credentials in the connection string and in transit between your application and the database. Helping to protect the credentials adds to the overhead and minimizes the performance difference.
Note Generally, local accounts are faster than domain accounts when you use Windows authentication. However, the performance saving needs to be balanced with the administration benefits of using domain accounts.
Choose the Appropriate Transaction Type
Proper transaction management minimizes locking and contention, and provides data integrity. The three transaction types and their usage considerations include the following:- Native database support. Native database support for transactions permits you to control the transaction from a stored procedure. In SQL Server, use BEGIN TRAN, COMMIT TRAN, and ROLLBACK to control the transaction's outcome. This type of transaction is limited to a single call from your code to the database, although the SQL query or stored procedure can make use of multiple stored procedures.
- ADO.NET transactions. Manual transactions in ADO.NET enable you to span a transaction across multiple calls to a single data store. Both the SQL Server .NET Data Provider and the OLE DB .NET Data Provider implement the IDbTransaction interface and expose BeginTransaction on their respective connection object. This permits you to begin a transaction and to run multiple SQL calls using that connection instance and control the transaction outcome from your data access code.
- Enterprise Services distributed transactions. Use declarative, distributed transactions when you need transactions to span multiple data stores or resource managers or where you need to flow transaction context between components. Also consider Enterprise Services transaction support for compensating transactions that permit you to enlist nontransactional resources in a transaction. For example, you can use the Compensating Resource Manager to combine a file system update and a database update into a single atomic transaction. Enterprise Services distributed transactions use the services of the Microsoft Distributed Transaction Coordinator (DTC). The DTC introduces additional performance overhead. The DTC requires several round trips to the server and performs complex interactions to complete a transaction.
Use Stored Procedures
Avoid embedded SQL statements. Generally, well-designed stored procedures outperform embedded SQL statements. However, performance is not the only consideration. When you choose whether to store your SQL commands on the server by using stored procedures or to embed commands in your application by using embedded SQL statements, consider the following issues:- Logic separation. When you design your data access strategy, separate business logic from data manipulation logic for performance, maintainability, and flexibility benefits. Validate business rules before you send the data to the database to help reduce network round trips. Separate your business logic from data manipulation logic to isolate the impact of database changes or business rule changes. Use stored procedures to clarify the separation by moving the data manipulation logic away from the business logic so that the two do not become intertwined.
- SQL optimizations. Some databases provide optimizations to stored procedures that do not apply to dynamic SQL. For example, Microsoft SQL Server versions prior to SQL Server 2000 kept a cached execution plan for stored procedures. The cached execution plan for stored procedures reduced the need to compile each stored procedure request. SQL Server 2000 is optimized to cache query plans for both stored procedure and for dynamic SQL query plans.
- Tuning/deployment. Stored procedure code is stored in the database and permits database administrators to review data access code. Database administrators can tune both the stored procedures and the database, independent of the deployed application. The application does not always need to be redeployed when stored procedures change. Embedded SQL is deployed as part of the application code and requires database administrators to profile the application to identify the SQL actually used. Profiling the application complicates tuning, because the application must be redeployed if any changes are made.
- Network traffic sent to the server. Source code for stored procedures is stored on the server. Only the name and parameters are sent across the network to the server. Conversely, when you use embedded SQL, the full source of the commands must be transmitted each time the commands are run. When you use stored procedures, you can reduce the amount of data that is sent to the server when large SQL operations are frequently run.
- Simplified batching of commands. Stored procedures make it easy to batch work and provide simpler maintenance.
- Data security and integrity. With stored procedures, administrators can secure tables against direct access or manipulation, and they can only permit the execution of selected stored procedures. Both users and applications are granted access to the stored procedures that enforce data integrity rules. Embedded SQL usually requires advanced permissions on tables. Using advanced permissions on tables is a more complex security model to maintain.
- SQL Injection. Avoid using dynamically generated SQL with user input. SQL injection occurs when input from a malicious user is used to perform unauthorized actions, such as retrieving too much data or destructively modifying data. Parameterized stored procedures and parameterized SQL statements can help reduce the likelihood of SQL injection. Parameter collections force parameters to be treated as literal values so that the parameters are not treated as executable code. You should also constrain all user input to reduce the likelihood that a malicious user could use SQL injection to perform unauthorized actions.
More Information
For more information about how to prevent SQL injection, see Chapter 14, "Building Secure Data Access," in Improving Web Application Security: Threats and Countermeasures on MSDN at http://msdn.microsoft.com/en-us/library/aa302430.aspx.Prioritize Performance, Maintainability, and Productivity when You Choose How to Pass Data Across Layers
You should consider several factors when you choose an approach for passing data across layers:- Maintainability. Consider how hard it is to build and keep up with changes.
- Productivity. Consider how hard it is to implement the solution.
- Programmability. Consider how hard it is to code.
- Performance. Consider how efficient it is for collections, browsing, and serializing.
- DataSets. With this approach, you use a generic DataSet object. This approach offers great flexibility because of the extensive functionality of the DataSet. This includes serialization, XML support, ability to handle complex relationships, support for optimistic concurrency, and others. However, DataSet objects are expensive to create because of their internal object hierarchy, and clients must access them through collections. The DataSet contains collections of many subobjects, such as the DataTable, DataRow, DataColumn, DataRelation and Constraint objects. Most of these objects are passed with the DataSet between the layers. This is a lot of objects and a lot of data to be passed between the layers. It also takes time to fill a DataSet, because there are many objects that need to be instantiated and populated. All of this affects performance. Generally, the DataSet is most useful for caching when you want to create an in-memory representation of your database, when you want to work with relations between tables, and when you want to perform sorting and filtering operations.
- Typed DataSets. Instantiation and marshaling performance of the typed DataSet is roughly equivalent to the DataSet. The main performance advantage of the typed DataSet is that clients can access methods and properties directly, without having to use collections.
- DataReaders. This approach offers the optimum performance when you need to render data as quickly as possible. You should close DataReader objects as soon as possible and make sure that client applications cannot affect the amount of time the DataReader and, hence, the database connection is held open. The DataReader is very fast compared to a DataSet, but you should avoid passing DataReader objects between layers, because they require an open connection.
- XML. This is a loosely coupled approach that natively supports serialization and collections of data. For example, an XML document can contain data for multiple business entities. It also supports a wide range of client types. Performance issues to consider include the fact that XML strings can require substantial parsing effort, and large and verbose strings can consume large amounts of memory.
- Custom Classes. With this approach, you use private data members to maintain the object's state and provide public accessor methods. For simple types, you can use structures instead of classes, which means you avoid having to implement your own serialization. The main performance benefit of custom classes is that they enable you to create your own optimized serialization. You should avoid complex object hierarchies and optimize your class design to minimize memory consumption and reduce the amount of data that needs to be serialized when the object is passed between layers.
For more information about how to pass data across the layers, see "Designing Data Tier Components and Passing Data Through Tiers" at http://msdn.microsoft.com/en-us/library/ms978496.aspx.
Consider How to Handle Exceptions
In data access code in particular, you can use try/finally blocks to ensure that connections and other resources are closed, regardless of whether exceptions are generated. However, be aware of the following considerations:- Exceptions are expensive. Do not catch exceptions and then throw them again if your data access logic cannot add any value. A less costly approach is to permit the exception to propagate from the database to the caller. Similarly, do not wrap transaction attempts with try/catch blocks unless you plan to implement retry mechanisms.
- If you want to completely abstract your caller from the data-specific details, you have to catch the exception, you have to log detailed information to a log store, and then you have to return an enumerated value from a list of application-specific error codes. The log store could be a file or the Windows event log.
Use Appropriate Normalization
Overnormalization of a database schema can affect performance and scalability. For example, if you program against a fully normalized database, you are often forced to use cross-table joins, subqueries, and data views as data sources. Obtaining the right degree of normalization involves tradeoffs.On one hand, you want a normalized database to minimize data duplication, to ensure that data is logically organized, and to help maintain data integrity. On the other hand, it may be harder to program against fully normalized databases, and performance can suffer. Consider the following techniques:
- Start with a normalized model. Start with a normalized model and then de-normalize later.
- Reduce the cost of joins by repeating certain columns. Deep joins may result in the creation of temporary tables and table scans. To reduce the cost of joining across multiple tables, consider repeating certain columns.
- Store precomputed results. Consider storing precomputed results, such as subtotals, instead of computing them dynamically for each request.
Implementation Considerations
When you move from application design to application development, consider the implementation details of your ADO.NET code. You can improve resource management by acquiring connections late, by releasing them early, and by using connection pooling.When you run code on the server, prefer stored procedures. Stored procedures are optimized by the database and use provider-specific types when they pass parameters, to reduce processing. Choose the best transaction management mechanism, and then choose an appropriate isolation level. Keep transactions as short as possible, and avoid code that can lead to deadlocks. You can improve responsiveness by employing the correct paging strategy. Employing the correct paging strategy can also reduce server load.
By following best practice implementation guidelines, you can increase the performance of your ADO.NET code. The following sections summarize performance considerations for ADO.NET features and scenarios.
.NET Framework Data Providers
Microsoft .NET Framework Data Providers are divided into two categories: bridge providers and native providers. Bridge providers permit you to use data access libraries, such as OLE DB and Open Database Connectivity (ODBC). The bridge provider wraps the underlying data access library. Native providers, such as those for SQL Server and Oracle, typically offer performance improvements due, in part, to the fact that there is less abstraction. It is important to choose the correct data provider for your specific data source as described below:- Use System.Data.SqlClient for SQL Server 7.0 and later.
- Use System.Data.OleDb for SQL Server 6.5 or OLE DB providers.
- Use System.Data.ODBC for ODBC data sources.
- Use System.Data.OracleClient for Oracle.
- Use SQLXML managed classes for XML data and for SQL Server 2000.
Use System.Data.SqlClient for SQL Server 7.0 and Later
For SQL Server 7.0 or later, use the .NET Framework Data Provider for SQL Server, System.Data.SqlClient. It is optimized for accessing SQL Server and communicates directly by using the Tabular Data Stream (TDS) protocol. TDS is the native data transfer protocol of SQL Server.The commonly used classes in System.Data.SqlClient are SqlConnection, SqlCommand, SqlDataAdapter, and SqlDataReader.
Use System.Data.OleDb for SQL Server 6.5 or OLE DB Providers
For SQL Server 6.5 or OLE DB data sources, use the .NET Framework Data Provider for OLE DB, System.Data.OleDb. For example, in SQL Server 6.5 or earlier, you would use the OLE DB Provider for SQL Server (SQLOLEDB) with the .NET Framework Data Provider for OLE DB. Using the OLE DB provider is less efficient than using the .NET Framework Data Provider for SQL Server, because it calls through the OLE DB layer by using COM interop when communicating with the database.The commonly used classes in System.Data.OleDb are OleDbConnection, OleDbCommand, OleDbDataAdapter, and OleDbDataReader.
Note The .NET Framework Data Provider for OLE DB does not support the Microsoft OLE DB Provider for ODBC (MSDASQL). For ODBC data sources, use the .NET Framework Data Provider for ODBC instead.
Use System.Data.ODBC for ODBC Data Sources
For ODBC data sources, use the .NET Framework Data Provider for ODBC, System.Data.ODBC. This provider uses the native ODBC Driver Manager through COM interop.If you are using .NET Framework 1.0, you must download the .NET Framework Data Provider for ODBC from the Microsoft .NET Framework Developer Center on MSDN at http://msdn.microsoft.com/en-us/library/a6cd7c08.aspx. Note that the namespace is Microsoft.Data.Odbc. If you are using .NET Framework 1.1, it is included in the System.Data.Odbc namespace.
The commonly used classes in System.Data.Odbc are OdbcConnection, OdbcCommand, OdbcDataAdapter, and OdbcDataReader.
Use System.Data.OracleClient for Oracle
For Oracle data sources, use the .NET Framework Data Provider for Oracle, System.Data.OracleClient. This provider enables data access to Oracle data sources through Oracle client connectivity software. The data provider supports Oracle client software version 8.1.7 and later.If you are using .NET Framework 1.0, you must download the .NET Framework Data provider for Oracle from the Microsoft .NET Framework Developer Center on MSDN at http://www.microsoft.com/downloads/details.aspx?familyid=4f55d429-17dc-45ea-bfb3-076d1c052524&displaylang;=en. If you are using .NET Framework 1.1 or later, the .NET Framework Data provider for Oracle is included in the System.Data.OracleClient namespace in System.Data.OracleClient.dll.
The commonly used classes in System.Data.OracleClient are OracleConnection, OracleCommand, OracleDataAdapter, and OracleDataReader.
Use SQLXML Managed Classes for XML Data and SQL Server 2000
To manipulate data in a SQL Server database as XML, use SQLXML Managed Classes. You can download SQLXML 3.0 from the Data Access and Storage Developer Center on MSDN at http://msdn.microsoft.com/en-us/library/aa286527.aspx.More Information
For more information about .NET Framework Data Providers, see the following:- Knowledge Base article 313480, "INFO: Roadmap for .NET Data Providers," at http://support.microsoft.com/default.aspx?scid=kb;en-us;313480
- MSDN article, "Using .NET Framework Data Provider for Oracle to Improve .NET Application Performance," at http://msdn.microsoft.com/en-us/library/ms973871.aspx
- "Implementing a .NET Framework Data Provider" in .NET Framework Developer's Guide at http://msdn.microsoft.com/en-us/library/4ksaf9z5(VS.71).aspx
Connections
Database connections are an expensive and limited resource. Your approach to connection management can significantly affect the overall performance and scalability of your application. Issues to consider include acquiring and releasing connections, pooling, and authentication. To improve database connection performance and scalability, apply the following strategies to your connection management policy:- Open and close the connection in the method.
- Explicitly close connections.
- When using DataReaders, specify CommandBehavior.CloseConnection.
- Do not explicitly open a connection if you use Fill or Update for a single operation.
- Avoid checking the State property of OleDbConnection.
- Pool connections.
Open and Close the Connection in the Method
Acquire connections late and release them early. Opening connections before they are needed reduces the number of connections that are available and increases resource pressure. Close connections quickly to ensure that they can be reused as soon as possible. Do not hold on to connections. Holding on to connections reduces the connections that are available to other code and increases resource pressure. The general pattern is to open and close connections on a per-method basis.Explicitly Close Connections
Explicitly call the Close or Dispose methods on SqlConnection objects as soon as you finish using them to release the resources that they use. Do not wait for the connection to fall out of scope. The connection is not returned to the pool until garbage collection occurs. This delays the reuse of the connection and negatively affects performance and scalability. The following are guidelines to consider. These guidelines are specific to SqlConnection because of the way it is implemented. These guidelines are not universal for all classes that have Close and Dispose functionality.- Using either the Close method or the Dispose method is sufficient. You do not have to call one method after the other. There is no benefit to calling one method after the other.
- Dispose internally calls Close. In addition, Dispose clears the connection string.
- If you do not call Dispose or Close, and if you do not use the using statement, you are reliant upon the finalization of the inner object to free the physical connection.
- Use the using statement, instead of Dispose or Close, when you are working with a single type, and you are coding in Visual C#®. Dispose is automatically called for you when you use the using statement, even when an exception occurs.
- If you do not use the using statement, close connections inside a finally block. Code in the finally block always runs, regardless of whether an exception occurs.
- You do not have to set the SqlConnection reference to null or Nothing because there is no complex object graph. Setting object references to null or to Nothing is usually done to make a graph of objects unreachable.
Note Closing a connection automatically closes any active DataReader objects that are associated with the connection.
Closing Connections in Visual Basic .NET
The following Visual Basic® .NET code snippet shows how to explicitly close a connection as soon as the connection is no longer needed.Try
conn.Open()
cmd.ExecuteNonQuery()
customerCount = paramCustCount.Value
Catch ex As Exception
' handle exception
Finally
' This is guaranteed to run regardless of whether an exception occurs
' in the Try block.
If Not(conn is Nothing) Then
conn.Close()
End If
End Try
Closing Connections in C#
The following example shows how to close connections in C#.public void DoSomeWork()
{
SqlConnection conn = new SqlConnection(connectionString);
try
{
conn.Open();
// Do Work
}
catch (Exception e)
{
// Handle and log error
}
finally
{
if(null!=conn)
conn.Close();
}
}
Closing Connections with the Using Statement in C#
The using statement simplifies code for C# developers by automatically generating a try and finally block when the code is compiled. This ensures that the Dispose method is called even if an exception occurs. The following code fragment shows how to use the using statement.using (SqlConnection conn = new SqlConnection(connString))The C# compiler converts this code into the following equivalent code, which has a try and finally block to ensure that the Dispose method on the SqlConnection object is called, regardless of whether an exception occurs.
{
conn.Open();
. . .
} // Dispose is automatically called on the conn variable here
SqlConnection conn = new SqlConnection(connString);One limitation of the using statement is that you can only put a single type in the parentheses. If you want to ensure that Dispose is called on additional resources, you must nest the using statements as shown in the following example.
try
{
conn.Open();
}
finally
{
conn.Dispose();
}
using (SqlConnection conn = new SqlConnection(connString))
{
SqlCommand cmd = new SqlCommand("CustomerRead");
conn.Open();
using (SqlDataReader dr = cmd.ExecuteReader())
{
while (dr.Read())
Console.WriteLine(dr.GetString(0));
}
}
Note Using a nested using statement on the DataReader object is useful only if you need to perform further operations with the same connection after the inner using block. If you close the connection right away, this approach is of limited value because any active DataReader objects are closed automatically when the connection closes.
When Using DataReaders, Specify CommandBehavior.CloseConnection
When you create a DataReader object, specify the CommandBehavior.CloseConnection enumeration in your call to ExecuteReader. This ensures that when you close the DataReader, the connection is also closed. The following code fragment shows how to use the CommandBehavior enumeration.// Create connection and command. Open connection.The CommandBehavior.CloseConnection is especially helpful when you return a DataReader from a function, and you do not have control over the calling code. If the caller forgets to close the connection but closes the reader, both are closed when the DataReader is created by using CommandBehavior.CloseConnection. This is shown in the following code fragment.
. . .
SqlDataReader myReader= myCommand.ExecuteReader(CommandBehavior.CloseConnection);
// read some data
. . .
myReader.Close(); // The connection and reader are closed.
public SqlDataReader CustomerRead(int CustomerID)
{
//... create connection and command, open connection
return myCommand.ExecuteReader(CommandBehavior.CloseConnection);
}
//... client code
SqlDataReader myReader = CustomerRead(10248);
//... read some data
myReader.Close(); // reader and connection are closed
Do Not Explicitly Open a Connection if You Use Fill or Update for a Single Operation
If you perform a single Fill or Update operation, do not open the connection before you call the Fill method, because the DataAdapter automatically opens and closes the connection for you. The following code fragment shows how to call Fill.DataSet dSet = new DataSet("test");The SqlDataAdapter automatically opens the connection, runs the selected command, and then closes the connection when it is finished. This enables the connection to be open for the shortest period of time.
SqlConnection conn = new SqlConnection(connString);
SqlCommand cmd = new SqlCommand(sqlQuery,conn);
SqlDataAdapter dAdapter = new SqlDataAdapter(cmd);
dAdapter.Fill(dSet); // The connection was not explicitly opened.
// The connection is opened and closed by the DataAdapter automatically.
Note that if you need to perform multiple file or update operations, you need to open the connection before the first Fill or Update method and close it after the last one. Alternatively, you could wrap multiple Fill or Update operations inside a C# using block to ensure that the connection is closed after the last use.
Avoid Checking the State Property of OleDbConnection
If you need to monitor or check connection status and you are using an OleDbConnection, consider handling the StateChange event, and avoid checking the State property. This approach helps to minimize round trips.Using the State property increases application overhead, because each call results in a call to the OLE DB DBPROP_CONNECTIONSTATUS property (if the connection is an OleDbConnection) for an open connection.
Note The .NET Framework 2.0 (code named "Whidbey"), at the time of writing, provides an updated OLE DB .NET Data Provider that resolves this problem.The following code fragment shows how to implement the StateChange event. This event is raised when the state of the connection changes from open to closed or from closed to open.
OleDbConnection conn = new OleDbConnection(connStr);
// Set up a connection state change handler.
conn.StateChange += new StateChangeEventHandler(OnStateChange);
. . .
// StateChange event handler.
protected static void OnStateChange(object sender, StateChangeEventArgs args)
{
Console.WriteLine("The current Connection state has changed from {0} to {1}.",
args.OriginalState, args.CurrentState);
}
Note The ODBC provider also incurs similar overhead when using the State property.
Pool Connections
Creating database connections is expensive. You reduce overhead by pooling your database connections. Make sure you call Close or Dispose on a connection as soon as possible. When pooling is enabled, calling Close or Dispose returns the connection to the pool instead of closing the underlying database connection.You must account for the following issues when pooling is part of your design:
- Share connections. Use a per-application or per-group service account to connect to the database. This creates a single pool or a small number of pools, and it enables many client requests to share the same connections.
- Avoid per-user logons to the database. Each logon creates a separate pooled connection. This means that you end up with a large number of small pools. If you need a different user for each connection, disable pooling or set a small maximum size for the pool.
- Do not vary connection strings. Different connection strings generate different connection pools. For example, using different capitalization, extra spaces, or different ordering of attributes causes connections to go to different pools. The SQL Server .NET Data Provider performs a byte-by-byte comparison to determine whether connection strings match.
- Release connections. Do not cache connections. For example, do not put them in session or application variables. Close connections as soon as you are finished with them. Busy connections are not pooled.
- Passing connections. Do not pass connections between logical or physical application layers.
- Consider tuning your pool size if needed. For example, in the case of the .NET Framework Data Provider for SQL Server, the default minimum pool size is zero and the maximum is 100. You might need to increase the minimum size to reduce warm-up time. You might need to increase the maximum size if your application needs more than 100 connections.
- Connection pools are managed by the specific database provider. SqlClient, OleDB client, and third-party clients may provide different configuration and monitoring options.
- The .NET Framework Data Provider for SQL Server pools connections by using a pooling mechanism implemented in managed code. You control pooling behaviors such as lifetime and pool size through connection string arguments.
- The .NET Framework Data Provider for Oracle also pools connections by using a managed code solution.
- The .NET Framework Data Provider for OLE DB automatically uses OLE DB session pooling to pool connections. You control pooling behavior through connection string arguments.
- The .NET Framework Data Provider for ODBC uses ODBC connection pooling.
Monitoring Pooling
You can monitor connection pooling to determine that it is working as expected and to help you identify the best minimum and maximum pool sizes.Monitoring Pooling on a Computer that is Running SQL Server
You can monitor the number of open connections to SQL Server by using the SQL Server SQLServer:General Statistics performance counter object. This object is available only on a computer that is running SQL Server.
The connections are not specific to one particular application. If there are multiple applications accessing the server, this object reflects the total number of open connections for every application. Figure 12.2 shows the SQLServer:General Statistics object in the Performance Monitor tool.
Figure 12.2: Performance monitor showing the SQLServer:General Statistics counter
When monitoring SQLServer:General Statistics, you should observe the following: - The number of logins per second increases during application startup when the connection pool is established. The number of logins per second should then drop to zero and stay there. Repeated logins and logouts per second indicate that the connection pool is not being used because a different security context is being used to establish the connection.
- The User Connections value should stabilize and remain constant. If this value increases and you see a jagged pattern in the number of logins per second, you may be experiencing a connection leak in the connection pool.
The .NET Framework Data Provider for SQL Server provides several counters. The following counters are of particular significance:
- SqlClient: Current # connection pools
- SqlClient: Current # pooled and nonpooled connections
- SqlClient: Current # pooled connections
- SqlClient: Peak # pooled connections
The SqlClient: Peak # pooled connections counter indicates the maximum number of connections that are currently in use. If this value remains at its peak, consider measuring the performance impact of increasing the Max Pool Size attribute in your connection string. The default value is 100. If you see this value at its peak in conjunction with a high number of failed connections in the SqlClient: Total # failed connects counter, consider changing the value and monitoring performance.
Note These SqlClient counters may not be reset in .NET Framework version 1.1 when you stop and then restart an application. To reset the counters, stop the application and exit System Monitor, and then start the application and System Monitor again.
More Information
For more information about pooling connections, see the following resources on MSDN:- "Connection Pooling for the .NET Framework Data Provider for SQL Server" in .NET Framework Developer's Guide at http://msdn.microsoft.com/en-us/library/8xx3tyca(VS.71).aspx
- 164221, "INFO: How to Enable Connection Pooling in an ODBC Application" at http://support.microsoft.com/default.aspx?scid=kb;en-us;164221
- 166083, "INFO: How to Enable Connection Pooling in an OLE DB Application" at http://support.microsoft.com/default.aspx?scid=kb;en-us;166083
- 169470, "INFO: Frequently Asked Questions About ODBC Connection Pooling" at http://support.microsoft.com/default.aspx?scid=kb;en-us;169470
- 216950, "How to Enable ODBC Connection Pooling Performance Counters" at http://support.microsoft.com/default.aspx?scid=kb;en-us;216950
- 237977, "INFO: OLE DB Session Pooling Timeout Configuration" at http://support.microsoft.com/default.aspx?scid=kb;en-us;237977
- 316757, "BUG: SqlClient Connection Pooling That Uses Integrated Security Is Slower Than OleDb" at http://support.microsoft.com/default.aspx?scid=kb;en-us;316757
Commands
You use Command objects such as SqlCommand, OleDbCommand, or OdbcCommand to run SQL commands against the database. You can run dynamic SQL statements or stored procedures by setting the CommandType property. You can set the CommandText property to contain the name of a stored procedure or the SQL statement that you want to run.You use command Parameter objects, such as SqlParameter, OleDbParameter, or OdbcParameter, to specify the input and output parameters required by the current command. The recommended approach for running commands is to call stored procedures by using Parameter objects that provide parameter type checking. This approach provides both performance and security benefits. When you run commands against a database, consider the following recommendations:
- Validate SQL input and use Parameter objects.
- Retrieve only the columns and rows you need.
- Support paging over large result sets.
- Batch SQL statements to reduce round trips.
- Use ExecuteNonQuery for commands that do not return data.
- Use ExecuteScalar to return single values.
- Use CommandBehavior.SequentialAccess for very wide rows or for rows with BLOBs.
- Do not use CommandBuilder at run time.
Validate SQL Input and Use Parameter Objects
Validate all the input data that you use in SQL commands. Do not permit the client to retrieve more data than it should. Also, do not trust user input, and do not permit the client to perform operations that it should not perform. Doing so helps to lower the risk of SQL injection. By rejecting invalid data early before you issue a command that has the invalid data, you can improve performance by eliminating unnecessary database requests.Use Parameter objects when you build database commands. When you use Parameter objects, each parameter is automatically type checked. Checking the type is another effective countermeasure you can use to help prevent SQL injection. Ideally, use Parameter objects in conjunction with stored procedures to improve performance. For more information about using parameters, see "Parameters" later in this chapter.
Using Parameters with Stored Procedures
The following code sample illustrates how to use the Parameters collection.SqlDataAdapter myCommand = new SqlDataAdapter("AuthorLogin", conn);In the code sample, the @au_id parameter is treated as a literal value and not as code that can be run. Also, the parameter is checked for type and length. In the code fragment, the input value cannot be longer than 11 characters. If the data does not conform to the type or length that is defined by the parameter, an exception is generated.
myCommand.SelectCommand.CommandType = CommandType.StoredProcedure;
SqlParameter parm = myCommand.SelectCommand.Parameters.Add(
"@au_id", SqlDbType.VarChar, 11);
parm.Value = Login.Text;
Using stored procedures alone does not necessarily prevent SQL injection. The important thing to do is use parameters with stored procedures. If you do not use parameters, your stored procedures may be susceptible to SQL injection if the stored procedures use unfiltered input. For example, the following code fragment is susceptible to SQL injection.
SqlDataAdapter myCommand = new SqlDataAdapter("LoginStoredProcedure '" +
Login.Text + "'", conn);
Using Parameters with Dynamic SQL
If you cannot use stored procedures, you can still use parameters with dynamic SQL as shown in the following code fragment.SqlDataAdapter myCommand = new SqlDataAdapter(
"SELECT au_lname, au_fname FROM Authors WHERE au_id = @au_id", conn);
SqlParameter parm = myCommand.SelectCommand.Parameters.Add("@au_id",
SqlDbType.VarChar, 11);
parm.Value = Login.Text;
Retrieve Only the Columns and Rows You Need
Reduce unnecessary processing and network traffic by retrieving only the columns and the rows you need. Do not use the SELECT * query. This is poor practice because you might not know the schema, or it might change. It is easy to retrieve more data than you expect. Consider a scenario where you want four columns, but you perform an operation by using the SELECT * query on a 400-column table. In that scenario, you receive many more results than you expect. Instead, use WHERE clauses to filter the rows.Support Paging Over Large Result Sets
If you have a large result set that contains many rows of data, consider whether you can implement a paging technique to batch the retrieval of data. Batching the retrieval of data helps to reduce database server load, to reduce network traffic, and to put fewer memory requirements on the data access client. For more information, see "Paging Records" later in this chapter.Batch SQL Statements to Reduce Round Trips
Batching is the process of grouping several SQL statements in one trip to the server. The syntax in the following code fragment calls a stored procedure (that groups several queries) to return multiple result sets. The code uses the NextResult method of the DataReader object to advance to the next result set. NextResult can be called multiple times, and it returns true when another result set exists. It returns false when there are no more result sets.SqlCommand cmd = new SqlCommand();If you build complex SQL strings dynamically, you can use a StringBuilder object to reduce the performance cost of building the strings.
cmd.CommandText = "ReadCustomerAndOrders";
// The stored procedure returns multiple result sets.
SqlDataReader myReader = cmd.ExecuteReader();
if (myReader.read())
//... read first result set
reader.NextResult();
if (myReader.read())
//... read
More Information
You can also use stored procedures to batch SQL operations. For more information, see Knowledge Base article 311274, "HOW TO: Handle Multiple Results by Using the DataReader in Visual C# .NET," at http://support.microsoft.com/default.aspx?scid=kb;en-us;311274.Use ExecuteNonQuery for Commands That Do Not Return Data
If you want to run commands that do not retrieve data, use the ExecuteNonQuery method. For example, you would use ExecuteNonQuery for the following types of commands:- Data Definition Language commands such as CREATE and ALTER
- Data Modification Language commands such as INSERT, UPDATE, and DELETE
- Data Control Language commands such as GRANT and REVOKE.
SqlConnection conn = new SqlConnection(connString);
SqlCommand cmd = new SqlCommand(
"UPDATE Customer SET Freight = 45.44 WHERE CustomerID = 10248", conn);
cmd.ExecuteNonQuery();
Use ExecuteScalar to Return Single Values
If you want to retrieve a single value from your query by using a function such as COUNT(*) or SUM(Price), you can use a stored procedure output parameter, and then use the Command.ExecuteNonQuery method. This eliminates the overhead that is associated with creating a result set.The following stored procedure returns the number of rows in a Customers table.
CREATE PROCEDURE GetNumberOfCustomers(To call the stored procedure, use the following code.
@CustomerCount int OUTPUT)
AS
SELECT @CustomerCount = COUNT(*)
FROM Customers
static int GetCustomerCount()If you do not have control over the stored procedure, and if the stored procedure returns the number of rows as a return value, then you can use Command.ExecuteScalar as shown in the following code fragment. The ExecuteScalar method returns the value of the first column of the first row of the result set.
{
int customerCount = 0;
SqlConnection conn = new SqlConnection("server=(local);" +
"Integrated Security=SSPI;database=northwind");
SqlCommand cmd = new SqlCommand("GetNumberOfCustomers", conn );
cmd.CommandType = CommandType.StoredProcedure;
SqlParameter paramCustCount =
cmd.Parameters.Add("@CustomerCount", SqlDbType.Int );
paramCustCount.Direction = ParameterDirection.Output;
try
{
conn.Open();
cmd.ExecuteNonQuery();
customerCount = (int)paramCustCount.Value;
}
finally
{
if(null!=conn)
conn.Close();
}
return customerCount;
}
static int GetCustomerCountWithScalar()The previous code fragment requires the following stored procedure.
{
int customerCount = 0;
SqlConnection conn = new SqlConnection(
"server=(local);Integrated Security=SSPI;database=northwind");
SqlCommand cmd = new SqlCommand("GetCustomerCountWithScalar", conn );
cmd.CommandType = CommandType.StoredProcedure;
try
{
conn.Open();
customerCount = (int)cmd.ExecuteScalar();
}
finally
{
if(null!=conn)
conn.Close();
}
return customerCount;
}
CREATE PROCEDURE GetCustomerCountWithScalar
AS
SELECT COUNT(*) FROM Customers
Use CommandBehavior.SequentialAccess for Very Wide Rows or for Rows with BLOBs
Use the CommandBehavior.SequentialAccess enumeration for very wide rows or for rows that contain binary large object (BLOB) data. This permits you to return specific bytes from the retrieved row instead of returning the entire row. Returning the entire row may consume large amounts of memory because of the BLOB data.When you use CommandBehavior.SequentialAccess, the BLOB data is retrieved only when you reference it. For example, you can call the GetBytes method. The GetBytes method permits you to control the precise number of bytes that are read. The following code fragment shows how to use CommandBehavior.SequentialAccess.
SqlDataReader reader = cmd.ExecuteReader(CommandBehavior.SequentialAccess)Also, if you are performing optimistic locking against a table with very wide rows or against rows that contain BLOB data, use timestamps. Use timestamps instead of comparing all the fields in the table to the original versions. Using time stamps reduces the number of arguments by a value that is equal to n/2+1.
More Information
For a complete sample, see "Obtaining BLOB Values from a Database" in .NET Framework Developer's Guide on MSDN at http://msdn.microsoft.com/en-us/library/87z0hy49(VS.71).aspx.Do Not Use CommandBuilder at Run Time
CommandBuilder objects such as SqlCommandBuilder and OleDbCommandBuilder automatically generate the InsertCommand, UpdateCommand, and DeleteCommand properties of a DataAdapter. The CommandBuilder objects generate these properties based on the SelectCommand property of the DataAdapter. CommandBuilder objects are useful when you are designing and prototyping your application. However, you should not use them in production applications. The processing required to generate the commands affects performance. Manually create stored procedures for your commands, or use the Visual Studio® .NET design-time wizard and customize them later if necessary.Stored Procedures
This section discusses how to write and to call stored procedures for maximum performance. You should generally prefer stored procedures over direct SQL statements, because stored procedures perform better. Stored procedures perform better because the database can optimize the data access plan used by the procedure and then cache it for subsequent reuse. In addition, stored procedures provide security and maintenance benefits.- Use stored procedures.
- Use CommandType.Text with OleDbCommand.
- Use CommandType.StoredProcedure with SqlCommand.
- Consider using Command.Prepare.
- Use output parameters where possible.
- Consider SET NOCOUNT ON for SQL Server.
Use Stored Procedures
Stored procedures generally provide improved performance in comparison to SQL statements that are run directly. The following list explains the benefits of stored procedures compared to running data access logic directly from the middle tier of your application:- The database can prepare, optimize, and cache the execution plan so that the execution plan can be reused at a later time.
- Stored procedures pass less information over the network on the initial request, because they only need to transmit the procedure name and the parameters. Everything else is already at the server.
- Stored procedures abstract SQL statements from the client and business object developers and put responsibility for their maintenance in the hands of SQL experts.
- Stored procedures also provide maintenance and security benefits.
More Information
For more information about the security benefits of stored procedures and about how you can use them as a countermeasure for SQL injection, see the following:- Chapter 14, "Building Secure Data Access," in "Improving Web Application Security: Threats and Countermeasures" on MSDN at http://msdn.microsoft.com/en-us/library/aa302430.aspx
- "Stored Procedure and Trigger Execution" in Microsoft SQL Server 2000: SQL Server Architecture on MSDN at http://msdn.microsoft.com/en-us/library/aa175256(SQL.80).aspx
- MSDN article, "Query Recompilation in SQL Server 2000," at http://msdn.microsoft.com/en-us/library/aa902682(SQL.80).aspx
- Knowledge Base article 243586, "INF: Troubleshooting Stored Procedure Recompilation," at http://support.microsoft.com/default.aspx?scid=kb;en-us;243586
- SQL Server Magazine article, "Inside SQL Server: Stored Procedure Plans: Improve performance through caching" at http://www.sqlmag.com/Articles/Print.cfm?ArticleID=6113.
Use CommandType.Text with OleDbCommand
If you use an OleDbCommand object to call a stored procedure, use the CommandType.Text enumeration with the ODBC call syntax. If you use CommandType.StoredProcedure, ODBC call syntax is generated by the provider anyway. By using explicit call syntax, you reduce the work of the provider.You should also set the type and length of any parameters that the stored procedure requires. Set the type and length of the parameters to prevent the provider from performing an additional round trip to obtain the parameter information from the database. The following code fragment demonstrates how to use ODBC call syntax and CommandType.Text, and how to explicitly set parameter information.
using (OleDbConnection conn = new OleDbConnection(connStr))
{
OleDbCommand cmd = new OleDbCommand("call CustOrderHist(?)", conn);
cmd.CommandType = CommandType.Text;
OleDbParameter param = cmd.Parameters.Add("@CustomerID", OleDbType.Char, 5);
param.Value = "ALFKI";
conn.Open();
OleDbDataReader reader = cmd.ExecuteReader();
try
{
// List each product.
while (reader.Read())
Console.WriteLine(reader.GetString(0));
}
finally
{
reader.Close();
}
} // Dispose is called on conn here
Use CommandType.StoredProcedure with SqlCommand
If you are using the SqlCommand object, use CommandType.StoredProcedure when you call stored procedures. Do not use CommandType.Text because it requires extra parsing. The following code fragment shows how to set the CommandType property to avoid extra parsing on the server.SqlConnection conn = new SqlConnection(connString);
SqlCommand cmd = new SqlCommand("UpdateCustomerProcedure", conn);
cmd.CommandType = CommandType.StoredProcedure;
cmd.Parameters.Add(...
Consider Using Command.Prepare
If your application runs the same set of SQL queries multiple times, preparing those queries by using the Command.Prepare method may give you better performance. In ADO.NET, the SqlCommand.Prepare method calls the sp_prepare stored procedure for SQL Server 7. The SqlCommand.Prepare method calls sp_prepexec for SQL Server 2000 and later. SqlCommand.Prepare makes these calls instead of running a regular batch remote procedure call (RPC). The following code fragment shows how to use Command.Prepare.cmd.CommandText =Using the Prepare method does not yield a benefit if you are only going to run the statement one or two times. The next version of SQL Server will better leverage how plans are cached, so using it would not make a difference. You should only use the Prepare method for those statements that you run multiple times.
"insert into Region (RegionID, RegionDescription) values (@id, @desc)";
cmd.Parameters.Add ( "@id", SqlDbType.Int, 4, "RegionID") ;
cmd.Parameters.Add ( "@desc", SqlDbType.NChar, 50, "RegionDescription") ;
cmd.Parameters[0].Value = 5;
cmd.Parameters[1].Value = "North West";
cmd.Prepare();
cmd.ExecuteNonQuery();
cmd.Parameters[0].Value = 6;
cmd.Parameters[1].Value = "North East";
cmd.ExecuteNonQuery();
cmd.Parameters[0].Value = 7;
cmd.Parameters[1].Value = "South East";
cmd.ExecuteNonQuery();
cmd.Parameters[0].Value = 8;
cmd.Parameters[1].Value = "South West";
cmd.ExecuteNonQuery();
Use Output Parameters Where Possible
Use output parameters and ExecuteNonQuery to return small amounts of data instead of returning a result set that contains a single row. When you use output parameters and ExecuteNonQuery to return small amounts of data, you avoid the performance overhead that is associated with creating the result set on the server.The following code fragment uses a stored procedure to retrieve the product name and unit price for a specific product that is contained in the Products table in the Northwind database.
void GetProductDetails( int ProductID,
out string ProductName, out decimal UnitPrice )
{
using( SqlConnection conn = new SqlConnection(
"server=(local);Integrated Security=SSPI;database=Northwind") )
{
// Set up the command object used to run the stored procedure.
SqlCommand cmd = new SqlCommand( "DATGetProductDetailsSPOutput", conn );
cmd.CommandType = CommandType.StoredProcedure;
// Establish stored procedure parameters.
// @ProductID int INPUT
// @ProductName nvarchar(40) OUTPUT
// @UnitPrice money OUTPUT
// Must explicitly set the direction of the output parameters.
SqlParameter paramProdID =
cmd.Parameters.Add( "@ProductID", ProductID );
paramProdID.Direction = ParameterDirection.Input;
SqlParameter paramProdName =
cmd.Parameters.Add( "@ProductName", SqlDbType.VarChar, 40 );
paramProdName.Direction = ParameterDirection.Output;
SqlParameter paramUnitPrice =
cmd.Parameters.Add( "@UnitPrice", SqlDbType.Money );
paramUnitPrice.Direction = ParameterDirection.Output;
conn.Open();
// Use ExecuteNonQuery to run the command.
// Although no rows are returned, any mapped output parameters
// (and potential return values) are populated
cmd.ExecuteNonQuery( );
// Return output parameters from stored procedure.
ProductName = paramProdName.Value.ToString();
UnitPrice = (decimal)paramUnitPrice.Value;
}
}
Consider SET NOCOUNT ON for SQL Server
When you use SET NOCOUNT ON, the message that indicates the number of rows that are affected by the T-SQL statement is not returned as part of the results. When you use SET NOCOUNT OFF, the count is returned. Using SET NOCOUNT ON can improve performance because network traffic can be reduced. SET NOCOUNT ON prevents SQL Server from sending the DONE_IN_PROC message for each statement in a stored procedure or batch of SQL statements.For example, if you have eight operations in a stored procedure, eight messages are returned to the caller. Each message contains the number of rows affected by the respective statement. When you use SET NOCOUNT ON, you reduce the processing that SQL Server performs and the size of the response that is sent across the network.
Note In Query Analyzer, the DONE_IN_PROC message is intercepted and displayed as "N rows affected".
Parameters
Most SQL commands require input or output parameters, regardless of whether they are stored procedures or direct SQL statements. Each .NET Framework data provider provides a Parameter object implementation. You can use a Parameter object implementation in conjunction with a Command object. Some sample parameter objects include SqlParameter, OleDbParameter, and OdbcParameter. When you use parameters, consider the following recommendations:- Use the Parameters collection when you call a stored procedure.
- Use the Parameters collection when you build SQL statements.
- Explicitly create stored procedure parameters.
- Specify parameter types.
- Cache stored procedure SqlParameter objects.
Use the Parameters Collection When You Call a Stored Procedure
Use the Parameters collection property of the SqlCommand object to pass parameters to a stored procedure. By using strongly typed parameters, types do not need to be discovered at run time. You can also save round trips by checking the data type on the client; for example, you can save round trips by checking the Web server. This prevents wasted cycles and wasted bandwidth that is caused by passing invalid data to the database server. The following code fragment shows how to add a typed parameter to the Parameters collection.SqlDataAdapter adapter = new SqlDataAdapter("GetProductDesc",
conn);
adapter.SelectCommand.CommandType = CommandType.StoredProcedure;
SqlParameter parm = adapter.SelectCommand.Parameters.Add(
"@ProdID", SqlDbType.Int);
parm.Value = 10;
Use the Parameters Collection When You Build SQL Statements
Even if you do not use stored procedures for data access, you should still use the Parameters collection when you build your SQL statements in code. By using the Parameter collection and by explicitly setting the data type, you reduce the likelihood that the Parameter object could set an invalid type. The following code fragment shows how to use the Parameters collection when you build your SQL statements in code.SqlDataAdapter adapter = new SqlDataAdapter(
"SELECT ProductID, ProductName FROM Products WHERE ProductID = @ProdID", conn);
// Set the parameter including name and type.
SqlParameter parm = adapter.SelectCommand.Parameters.Add("@ProdID",
SqlDbType.Int);
// Set the parameter value.
parm.Value = 10;
Explicitly Create Stored Procedure Parameters
Identifying parameters at run time requires a round trip to the server for each use of a stored procedure. This is an expensive operation. Explicitly create parameters for stored procedures. Explicitly supply the parameter type, size, precision, and scale information to prevent the Command object from recreating them every time a command is run. The following code demonstrates how to set the type, size, and direction.void GetProductDetails( int productID,
out string productName, out decimal unitPrice)
{
using( SqlConnection conn = new SqlConnection(
"server=(local);Integrated Security=SSPI;database=Northwind") )
{
// Set up the command object used to run the stored procedure.
SqlCommand cmd = new SqlCommand( "GetProductDetails", conn );
cmd.CommandType = CommandType.StoredProcedure;
// Establish stored procedure parameters.
// @ProductID int INPUT
// @ProductName nvarchar(40) OUTPUT
// @UnitPrice money OUTPUT
// Must explicitly set the direction of output parameters.
SqlParameter paramProdID =
cmd.Parameters.Add( "@ProductID", SqlDbType.Int );
paramProdID.Direction = ParameterDirection.Input;
SqlParameter paramProdName =
cmd.Parameters.Add( "@ProductName", SqlDbType.VarChar, 40 );
paramProdName.Direction = ParameterDirection.Output;
SqlParameter paramUnitPrice =
cmd.Parameters.Add( "@UnitPrice", SqlDbType.Money );
paramUnitPrice.Direction = ParameterDirection.Output;
conn.Open();
cmd.ExecuteNonQuery( );
// Return output parameters from the stored procedure.
productName = paramProdName.Value.ToString();
unitPrice = (decimal)paramUnitPrice.Value;
}
}
Specify Parameter Types
When you create a new parameter, use the relevant enumerated type to specify the data type of the parameter. Use an enumerated type such as SqlDbType or OleDbType. This prevents unnecessary type conversions that are otherwise performed by the data provider.Cache Stored Procedure SqlParameter Objects
Often, applications must run commands multiple times. To avoid recreating the SqlParameter objects each time, cache them so that they can be reused later. A good approach is to cache parameter arrays in a Hashtable object. Each parameter array contains the parameters that are required by a particular stored procedure that is used by a particular connection. The following code fragment shows this approach.public static void CacheParameterSet(string connectionString,The following function shows the equivalent parameter retrieval function
string commandText,
params SqlParameter[] commandParameters)
{
if( connectionString == null || connectionString.Length == 0 )
throw new ArgumentNullException( "connectionString" );
if( commandText == null || commandText.Length == 0 )
throw new ArgumentNullException( "commandText" );
string hashKey = connectionString + ":" + commandText;
paramCache[hashKey] = commandParameters;
}
public static SqlParameter[] GetCachedParameterSet(string connectionString, string commandText)When parameters are retrieved from the cache, a cloned copy is created so that the client application can change parameter values, without affecting the cached parameters. The CloneParameters method is shown in the following code fragment.
{
if( connectionString == null || connectionString.Length == 0 )
throw new ArgumentNullException( "connectionString" );
if( commandText == null || commandText.Length == 0 )
throw new ArgumentNullException( "commandText" );
string hashKey = connectionString + ":" + commandText;
SqlParameter[] cachedParameters = paramCache[hashKey] as SqlParameter[];
if (cachedParameters == null)
{
return null;
}
else
{
return CloneParameters(cachedParameters);
}
}
private static SqlParameter[] CloneParameters(SqlParameter[] originalParameters)
{
SqlParameter[] clonedParameters = new SqlParameter[originalParameters.Length];
for (int i = 0, j = originalParameters.Length; i < j; i++)
{
clonedParameters[i] =
(SqlParameter)((ICloneable)originalParameters[i]).Clone();
}
return clonedParameters;
}
More Information
The code samples for the parameter caching approach that is shown above are based on samples from the Data Access Application Block. The Data Access Application Block implements this functionality in a generic data access component. For more information, see the Data Access Application Block on MSDN at http://msdn.microsoft.com/en-us/library/cc309504.aspx.DataSet vs. DataReader
When you need to retrieve multiple rows of data so that you can display or process the data in some other way, you have two basic choices. You can use a DataSet object or a DataReader object.The DataReader approach is generally quicker because it avoids the overhead that is associated with creating a DataSet object. The overhead that is associated with a DataSet object includes creating DataSet subobjects such as DataTables, DataRows, and DataColumns. However, the DataReader provides less flexibility, and is less suited to situations where you have to cache data and pass the data to components in an application that has multiple tiers.
Note The DataAdapter used to fill the DataSet uses a DataReader internally.Use a DataReader when the following conditions are true:
- You need forward-only, read-only access to data (the fire hose scenario), and you want to access the data as quickly as possible, and you do not need to cache it.
- You have a data container such as a business component that you can put the data in.
- You have to cache or pass the data between layers.
- You require an in-memory relational view of the data for XML or non-XML manipulation.
- You want to update some or all the retrieved rows, and you want to use the batch update facilities of the SqlDataAdapter class.
- You have to bind data to a control type that the DataReader cannot be bound to. Many Windows Forms controls capable of data binding require a data source that implements the IList interface. The DataSet implements IList, but the DataReader implements IEnumerable. IEnumerable supports data binding to most Web Form controls but not to certain Windows Forms controls. Check the data source requirements for the particular control type that you want to bind.
- You have to access multiple sets of data at the same time, and you do not want to hold open server resources.
DataReader
The DataReader provides a read-only, forward-only stream of data from a database. When you use DataReader objects such as SqlDataReader or OleDbDataReader, consider the following recommendations:- Close DataReader objects.
- Consider using CommandBehavior.CloseConnection to close connections.
- Cancel pending data.
- Consider using CommandBehavior.SequentialAccess with ExecuteReader.
- Use GetOrdinal when using an index-based lookup.
Close DataReader Objects
Close your DataReader object as soon as you are finished with it, either by calling its Close method or by calling its Dispose method. It is best to use a finally block to ensure that the DataReader is closed as shown in the following code fragment.using (SqlConnection conn = new SqlConnection(connString))
{
SqlCommand cmd = new SqlCommand("CustomerRead",conn);
conn.Open();
SqlDataReader dr = cmd.ExecuteReader();
try
{
while (dr.Read())
Console.WriteLine(dr.GetString(0));
}
finally
{
dr.Close();
}
}
Consider Using CommandBehavior.CloseConnection to Close Connections
If you need to return a DataReader from a method, consider using the CommandBehavior.CloseConnection method to ensure that the associated connection is closed when the DataReader is closed. The following code fragment shows this approach.public SqlDataReader RetrieveRowsWithDataReader()
{
SqlConnection conn = new SqlConnection(
"server=(local);Integrated Security=SSPI;database=northwind");
SqlCommand cmd = new SqlCommand("RetrieveProducts", conn );
cmd.CommandType = CommandType.StoredProcedure;
try
{
conn.Open();
// Generate the reader. CommandBehavior.CloseConnection causes
// the connection to be closed when the reader object is closed.
return( cmd.ExecuteReader( CommandBehavior.CloseConnection ) );
}
finally
{
if(null!=conn)
conn.Close();
}
}
// Display the product list using the console.
private void DisplayProducts()
{
SqlDataReader reader = RetrieveRowsWithDataReader();
try
{
while (reader.Read())
{
Console.WriteLine("{0} {1}",
reader.GetInt32(0).ToString(),
reader.GetString(1));
}
}
finally
{
if(null!= reader)
reader.Close(); // Also closes the connection due to the CommandBehavior
// enumerator used when generating the reader.
}
}
Cancel Pending Data
When you call the Close method, the method does not return until all the remaining data has been fetched. If you know you have pending data when you want to close your DataReader, you can call the Cancel method before you call Close to tell the server to stop sending data.This approach does not always result in a performance improvement, because Cancel is not guaranteed to make the server stop sending data. Control information is still exchanged after the call to Cancel, and the control information may or may not be interleaved with leftover data. Therefore, before you restructure your code to call Cancel before Close, test Cancel to learn if it actually helps in your particular scenario and to learn if you really need the extra performance at the expense of readability.
Note If you need output parameters, do not call Close until you have retrieved the output parameters. After you retrieve the output parameters, you can then call Close.
Consider Using CommandBehavior.SequentialAccess with ExecuteReader
If you do not have to have random access to columns, use CommandBehavior.SequentialAccess when you call the ExecuteReader method of the Command object.Use GetOrdinal When Using an Index-Based Lookup
Using an index or ordinal-based lookup is faster than using string-based column names. However, using an index adds code maintenance overhead. Using an index requires you to change the index when the query column-order changes or when table columns are changed. Instead of hard coding the values, you can use GetOrdinal to get the index as shown in the following code fragment.cmd.CommandText = "Select RegionDescription, RegionId from Region";
SqlDataReader dr = cmd.ExecuteReader();
int RegionId = dr.GetOrdinal("RegionId");
int RegionDescription = dr.GetOrdinal("RegionDescription");
while( dr.Read())
{
Console.WriteLine(dr[RegionId] + " - " + dr[RegionDescription]);
}
DataSet
If you need to work with a disconnected, cached set of data, you usually create a DataSet by using a DataAdapter. To help optimize the performance of DataSet objects, consider the following recommendations:- Reduce serialization.
- Use primary keys and Rows.Find for indexed searching.
- Use a DataView for repetitive non-primary key searches.
- Use the Optimistic concurrency model for datasets.
Reduce Serialization
DataSet serialization is more efficiently implemented in .NET Framework version 1.1 than in version 1.0. However, DataSet serialization often introduces performance bottlenecks. You can reduce the performance impact in a number of ways:- Use column name aliasing. The serialized data contains column names so that you can use column name aliasing to reduce the size of the serialized data.
- Avoid serializing multiple versions of the same data. The DataSet maintains the original data along with the changed values. If you do not need to serialize new and old values, call AcceptChanges before you serialize a DataSet to reset the internal buffers.
- Reduce the number of DataTable objects that are serialized. If you do not need to send all the DataTable objects contained in a DataSet, consider copying the DataTable objects you need to send into a separate DataSet.
More Information
For more information, see "How To: Improve Serialization Performance" in the "How To" section of this guide.Use Primary Keys and Rows.Find for Indexed Searching
If you need to search a DataSet by using a primary key, create the primary key on the DataTable. This creates an index that the Rows.Find method can use to quickly find the records that you want. Do not use DataTable.Select because DataTable.Select does not use indices.Use a DataView for Repetitive Non-Primary Key Searches
If you need to repetitively search by using non-primary key data, create a DataView that has a sort order. This creates an index that can be used to perform the search. This is best suited to repetitive searches because there is some cost to creating the index.The DataView object exposes the Find and FindRows methods so that you can query the data in the underlying DataTable. If you are only performing a single query, the processing that is required to create the index reduces the performance that is gained by using the index.
When you create a DataView object, use the DataView constructor that takes the Sort, RowFilter, and RowStateFilter values as constructor arguments along with the underlying DataTable. Using the DataView constructor ensures that the index is built once. If you create an empty DataView and set the Sort, RowFilter, or RowStateFilter properties afterwards, the index is built at least two times.
Use the Optimistic Concurrency Model for Datasets
There are two concurrency models that you can use when working with datasets in an environment that has multiple users. These two models are the pessimistic and optimistic models. When you read data and use the pessimistic model, locks are established and held until updates are made and the locks are released. Holding locks on server resources, in this case database tables, leads to contention issues. It is best to use granular locks for very short durations.The optimistic model does not lock the data when the data is read. The optimistic model locks the data just before the data is updated and releases the lock afterwards. There is less contention for data with the optimistic model, which is good for shared server scenarios; however, you should take into account the scenarios for managing the concurrency violations. A common technique you can use to manage concurrency violations is to implement a timestamp column or to verify against the original copy of data.
More Information
For more information about how to implement optimistic concurrency solutions, see "Optimistic Concurrency" in .NET Framework Developer's Guide at http://msdn.microsoft.com/en-us/library/aa0416cz(VS.71).aspx.XML and DataSet Objects
Data and schema information maintained within DataSet objects can be output as XML. Also, you can populate a DataSet object from an XML data stream. If you use XML and DataSets, consider the following recommendations:- Do not infer schemas at run time.
- Perform bulk updates and inserts by using OpenXML.
Do Not Infer Schemas at Run Time
Limit schema inference to design time. When you load a DataSet, ensure that your schema is not inferred, which can happen by default. The inference process is costly. To ensure that your existing schema is used and that no schema is inferred, pass XmlReadMode.IgnoreSchema to the ReadXml method.Perform Bulk Updates and Inserts by Using OpenXML
Different .NET Framework data providers enable you to do bulk updates and inserts by using the OpenXML method. You can use OpenXML to minimize SQL Server database calls, because you can use the OpenXML function to insert multiple rows of data in a single database call. OpenXML enables you to effectively package data together in a single call as XML, map it to a rowset view, and execute all of the inserts within the same database call. This helps reduce calls and resource utilization. The following code fragment shows you how to use OpenXML for updates and inserts.--This code UPDATES data.
UPDATE Employee
SET
Employee.FirstName = XMLEmployee.FirstName,
Employee.LastName = XMLEmployee.LastName
FROM OPENXML(@hDoc, 'NewDataSet/Employee')
WITH (EmployeeId Integer, FirstName varchar(100), LastName varchar(100)) XMLEmployee
WHERE Employee.EmployeeId = XMLEmployee.EmployeeId
--This code inserts new data.
Insert Into Employee
SELECT EmployeeId, FirstName, LastName
FROM OPENXML (@hdoc, '/NewDataSet/Employee',1)
WITH (EmployeeId Integer, FirstName varchar(100), LastName varchar(100)) XMLEmployee
Where XMLEmployee.EmployeeId Not IN (Select EmployeeID from Employee)
More Information
For a complete code sample that shows how to use the OpenXML method, see Knowledge Base article 315968, "HOW TO: Perform Bulk Updates and Inserts Using OpenXML with .NET Providers in Visual C# .NET," at http://support.microsoft.com/default.aspx?scid=kb;en-us;315968.For more information about XML and DataSet objects, see "Employing XML in the .NET Framework" in .NET Framework Developer's Guide at http://msdn.microsoft.com/en-us/library/2bcctyt8(VS.71).aspx.
Typed DataSets
A typed DataSet is a custom object that derives from the DataSet base class. It supports typed data access through an exposed set of properties specific to the encapsulated data.Use typed DataSet objects to avoid late-bound field access. The strongly typed accessors are provided by the typed DataSet and are faster because they eliminate column or table name lookups in a collection.
In addition to run-time performance benefits, typed datasets provide strong type checking and IntelliSense® by using custom field names at design time.
More Information
For more information about typed DataSet objects, see the following Knowledge Base articles:- 320714, "HOW TO: Create and Use a Typed DataSet by Using Visual C# .NET," at http://support.microsoft.com/default.aspx?scid=kb;en-us;320714
- 313486, "INFO: Roadmap for Visual Database Tools and Typed DataSets," at http://support.microsoft.com/default.aspx?scid=kb;en-us;313486
Types
When you access a data source by using stored procedures or dynamic SQL commands, make sure that you specify the precise data type for the input and output parameters. By specifying the precise database type, you can help performance in the following ways:- You help prevent internal type conversions from being performed by the data provider. Internal type conversions can lead to loss of precision.
- You help reduce round trips that the data provider might make to the database to discover type information at run time.
- You enable the data provider to perform type checks at the client and fail early with type exceptions. This helps avoid unnecessary round trips to the server.
Avoid Unnecessary Type Conversions
Type conversions can occur when you pass parameters to and from stored procedures or other SQL statements. To avoid type conversions, make sure that you:- Set the provider-specific type property of each Parameter object.
- Pass a consistent object type when you set the Value property of the Parameter object.
The following code demonstrates how to avoid type conversions for the .NET Framework Data Provider for SQL Server.
// Set the provider-specific type for the parameter.The type namespaces and type enumerations for each data provider are summarized in Table 12.1.
SqlParameter param = new SqlParameter("@Name", SqlDbType.NVarChar, 20);
// Use the right provider-specific type. In this case use SqlString, which
// corresponds to SqlDbType.NVarChar
param.Value = new SqlString("Frederick Smith");
Table 12.1: Database Provider Types
Provider | Type namespace | Type enumeration |
SQL Server | System.Data.SqlTypes | SqlDbType |
OLE DB | System.Data.OleDb | OleDbType |
ODBC | System.Data.OdbcType | OdbcType |
Oracle | System.Data.OracleClient | OracleType |
Exception Management
ADO.NET errors that are propagated through SqlException or OleDbException objects use custom error handling for specific data access code. Consider the following guidelines for exception handling in ADO.NET data access code:- Use the ConnectionState property. Avoid relying on an error handler to detect connection state availability. When you can, use the ConnectionState.Open or ConnectionState.Close method to check the state before use.
- Use try/finally to clean up resources. Use try/finally more often than try/catch/finally. Using finally gives you the option to close the connection, even if an exception occurs. If you develop in C#, the using statement provides this functionality with code that is easy to maintain, as shown in the following code fragment.
using( SqlConnection conn = new SqlConnection(
"server=(local);Integrated Security=SSPI;database=Northwind") )
{ . . . } - Use specific handlers to catch specific exceptions. If you know that there are scenarios where specific errors could possibly occur, use specific handlers. For example, if you want to know if a concurrency violation occurs when multiple updates are occurring, look for exceptions of type DBConcurrencyException. The specific exception classes SqlException and OleDbException provide a detailed message when errors occur. Use this message to log the details. Ensure that specific exceptions precede generic handlers, as shown in the following code fragment.
try
{ ...
}
catch (SqlException sqlex) // specific handler
{ ...
}
catch (Exception ex) // Generic handler
{ ...
}
Transactions
Transactions are important for ensuring data integrity but come at an operational cost. Selecting the right transaction management mechanism for your application can significantly improve scalability and performance. Key considerations include the type and quantity of resources involved and the isolation level required for the transactions. When you determine how you should manage transactions in your system, consider the following recommendations:- Use SQL transactions for server-controlled transactions on a single data store.
- Use ADO.NET transactions for client-controlled transactions on a single data store.
- Use DTC for transactions that span multiple data stores.
- Keep transactions as short as possible.
- Use the appropriate isolation level.
- Avoid code that can lead to deadlock.
- Set the connection string Enlist property to false.
Use SQL Transactions for Server-Controlled Transactions on a Single Data Store
If you need to write to a single data store, and if you can complete the operation in a single call to the database, use the transaction control provided by the SQL language on your database server. The transaction runs close to the data and reduces the cost of the transaction. Running the transaction close to the data also permits database administrators to tune the operation without changing the deployment of your application code. The following code fragment shows a simple T-SQL transaction performed in a stored procedure.BEGIN TRAN
UPDATE Orders SET Freight=@Freight Where OrderID=@OrderID
UPDATE [Order Details] SET Quantity=@Quantity Where OrderID=@OrderID
IF (@@ERROR > 0)
ROLLBACK TRANSACTION
ELSE
COMMIT TRANSACTION
Note If you need to control a transaction across multiple calls to a single data store, use ADO.NET manual transactions.
Use ADO.NET Transactions for Client-Controlled Transactions on a Single Data Store
If you need to make multiple calls to a single data store participate in a transaction, use ADO.NET manual transactions. The .NET Data Provider for SQL Server and the .NET Data Provider for Oracle use the appropriate transaction language to enforce transactions on all subsequent SQL commands.If you use SQL Profiler to monitor your use of ADO.NET manual transactions, you see that BEGIN TRAN, COMMIT TRAN, or ROLLBACK TRAN is run against the data store on your behalf by the provider. This enables you to control the transaction from your .NET Framework code and to maintain performance at a level that is similar to SQL transactions. The following code fragment shows how to use ADO.NET transactions.
SqlConnection conn = new SqlConnection(connString);When you use ADO.NET manual transactions, you can set the desired isolation level on the BeginTransacion method as shown in the following code fragment.
SqlTransaction trans = conn.BeginTransaction();
try
{
SqlCommand cmd = new SqlCommand("MyWriteProc",conn, trans);
cmd.CommandType = CommandType.StoredProcedure;
cmd.Parameters.Add(....
// additional transactioned writes to database
trans.Commit();
}
catch
{
trans.Rollback();
}
SqlConnection conn = new SqlConnection(connString);More Information
SqlTransaction trans = conn.BeginTransaction(IsolationLevel.ReadCommitted);
For a more information about isolation levels, see "Use the Appropriate Isolation Level" later in this chapter.
Use DTC for Transactions That Span Multiple Data Stores
Enterprise Services uses the Microsoft Distributed Transaction Coordinator (DTC) to enforce transactions. If you have a transaction that spans multiple data stores or resource manager types, it is best to use Enterprise Services to enlist the data sources in a distributed transaction. Using Enterprise Services to enlist the data sources in this scenario is simple to configure.The DTC performs the inter-data source communication and ensures that either all the data is committed or that none of the data is committed. This action creates an operational cost. If you do not have transactions that span multiple data sources, use SQL or ADO.NET manual transactions because they perform better.
Keep Transactions as Short as Possible
Design your code to keep transactions as short as possible to help minimize lock contention and to increase throughput. Avoid selecting data or performing long operations in the middle of a transaction.Use the Appropriate Isolation Level
Resource managers such as SQL Server and other database systems support various levels of isolation for transactions. Isolation shields operations from the effect of other concurrent transactions. Most resource managers support the four isolation levels shown in Table 12.2. The isolation level determines the types of operation that can occur. The types of operation that can occur include dirty reads, nonrepeatable reads, or phantoms.Table 12.2: Isolation Levels
Isolation level | Dirty reads | Nonrepeatable reads | Phantoms |
Read uncommitted | Yes | Yes | Yes |
Read committed | No | Yes | Yes |
Repeatable read | No | No | Yes |
Serializable | No | No | No |
Avoid Code That Can Lead to Deadlock
Consider the following general guidelines when you use transactions so that you can avoid causing deadlocks:- Always access tables in the same order across transactions in your application. The likelihood of a deadlock increases when you access tables in a different order each time you access them.
- Keep transactions as short as possible. Do not make blocking or long-running calls from a transaction. Keep the duration of the transactions short. One approach is to run transactions close to the data source. For example, run a transaction from a stored procedure instead of running the transaction from a different computer.
- Choose a level of isolation that balances concurrency and data integrity. The highest isolation level, serializable, reduces concurrency and provides the highest level of data integrity. The lowest isolation level, read uncommitted, gives the opposite result. For more information, see "Use the Appropriate Isolation Level" earlier in this chapter.
Set the Connection String Enlist Property to False
A pooled transactional object must enlist its connection into the current transaction manually. To enable it to do so, you must disable automatic transaction enlistment by setting the connection string Enlist property to False.Note This applies to a SqlConnection. For an OleDbConnection, you need to set OLE DB Services=-7 as a connection string parameter.Pooled components that maintain database connections might be used in different transactions by separate clients. A pooled transactional object must be able to determine if it is activated in a new transaction that is different from the last time it was activated.
Each time a pooled transactional object is activated, it should check for the presence of a COM+ transaction in its context by examining ContextUtil.Transaction. If a transaction is present and the connection is not already enlisted, the object should enlist its connection manually by calling the EnlistDistributedTransaction method of the Connection object.
More Information
For more information about transaction options and how to analyze transaction performance, see the following resources on MSDN:- "Performance Comparison: Transaction Control" at http://msdn.microsoft.com/en-us/library/ms978408.aspx
- "Implementing Database Transactions with Microsoft .NET" at http://msdn.microsoft.com/en-us/library/ms954625.aspx
Binary Large Objects
A binary large object (BLOB) is a binary resource such as an image, a sound or video clip, or a document. Storing BLOBs in a database can cause significant resource pressure. For example, large BLOBs can consume large amounts of memory, CPU, and networking resources on both the client and the server.You can choose to handle BLOBs as a whole or handle them in chunks.
Handling BLOBs as a whole is useful when the BLOB is not very large in size, and you require the complete BLOB to be in memory before you perform the operation. This approach tends to put excessive memory pressure on the server as well as on network bandwidth.
Compared to handling BLOBs as a whole, chunking does cause more round trips, but chunking creates less load on the server and reduces network bandwidth use. The network bandwidth is not excessively consumed because you transfer the data in chunks rather than passing the BLOB all at one time. Therefore, the server only has to take care of the immediate buffer passed to it. The server can either store the buffer to the disk or redirect it as an output stream to the client.
ADO.NET data providers do not provide the GetChunk and AppendChunk methods in the same way that Data Access Objects (DAO) and ActiveX Data Objects (ADO) do with Recordset objects. However, this section describes the alternate options that exist.
Consider the following when you are working with BLOBs:
- Use CommandBehavior.SequentialAccess and GetBytes to read data.
- Use READTEXT to read from SQL Server 2000.
- Use OracleLob.Read to read from Oracle databases.
- Use UpdateText to write to SQL Server databases.
- Use OracleLob.Write to write to Oracle databases.
- Avoid moving binary large objects repeatedly.
Use CommandBehavior.SequentialAccess and GetBytes to Read Data
The default behavior of the DataReader is to read an entire row into memory. All columns are accessible in any order until the next row is read.If you retrieve large BLOBs, reading the whole BLOB into memory may cause excessive memory consumption. Using CommandBehavior.SequentialAccess enables you to stream the data or to send the data in chunks from the column containing the BLOB by using the GetBytes, GetChars, or GetString methods.
The following code fragment shows how to use the SequentialAccess and GetBytes methods.
// Allocate a buffer to hold a BLOB chunk.
int bufferSize = 100; // the size of the buffer to hold interim chunks of the BLOB
byte[] outbyte = new byte[bufferSize]; // The buffer to hold the BLOB
SqlDataReader myReader = empCmd.ExecuteReader(CommandBehavior.SequentialAccess);
while (myReader.Read())
{
// The BLOB data is in column two. Must get the first column
// before the BLOB data.
empID = myReader.GetInt32(0); // First column
// Read the bytes into outbyte[] and retain the number of bytes returned.
retval = myReader.GetBytes(1, startIndex, outbyte, 0, bufferSize);
// Continue reading and writing while there are bytes beyond the
// Size of the buffer.
while (retval == bufferSize)
{
// Write data to a file or to a Web page (omitted for brevity).
. . .
// Reposition the start index to the end of the last buffer
// and fill the buffer.
startIndex += bufferSize;
retval = myReader.GetBytes(1, startIndex, outbyte, 0, bufferSize);
}
}
Note When you use CommandBehavior.SequentialAccess, you must retrieve columns in sequence. For example, if you have three columns, and the BLOB data is in the third column, you must retrieve the data from the first and second columns, before you retrieve the data from the third column.
More Information
For more information, see "Obtaining BLOB Values from a Database" in .NET Framework Developer's Guide at http://msdn.microsoft.com/en-us/library/87z0hy49(VS.71).aspx.Use READTEXT to Read from SQL Server 2000
The READTEXT command reads text, ntext, or image values from a text, ntext, or image column. The READTEXT command starts reading from a specified offset and reads the specified number of bytes. This command is available in SQL Server 2000 and later. This command enables you to read data in chunks by sending a fixed set of bytes over the network for each iteration. The following are the steps you must follow to use the READTEXT command:- Obtain a pointer to the BLOB by using the TEXTPTR command.
- Read the BLOB, by using the READTEXT command, in the required chunk size, with the help of the pointer that you obtained in step 1.
- Send the data to the client.
- Read the data on the client, and then store it in a buffer or a stream.
int BUFFER_LENGTH = 32768; // chunk size
// Obtain a pointer to the BLOB using TEXTPTR.
SqlCommand cmdGetPointer = new SqlCommand(
"SELECT @Pointer=TEXTPTR(Picture), @Length=DataLength(Picture)" +
"FROM Categories WHERE CategoryName='Test'", conn);
// Set up the parameters.
SqlParameter PointerOutParam = cmdGetPointer.Parameters.Add("@Pointer", SqlDbType.VarBinary, 100);
// Run the query.
// Set up the READTEXT command to read the BLOB by passing the following
// parameters: @Pointer pointer to blob, @Offset number of bytes to
// skip before starting the read, @Size number of bytes to read.
SqlCommand cmdReadBinary = new SqlCommand(
"READTEXT Categories.Picture @Pointer @Offset @Size HOLDLOCK", conn);
// Set up the parameters for the command.
SqlParameter SizeParam = cmdReadBinary.Parameters.Add("@Size", SqlDbType.Int);
SqlDataReader dr;
int Offset= 0;
Byte []Buffer = new Byte[BUFFER_LENGTH ];
// Read buffer full of data.
do {
// Add code for calculating the buffer size - may be less than
// BUFFER LENGTH for the last block.
dr = cmdReadBinary.ExecuteReader(CommandBehavior.SingleResult);
dr.Read();
dr.GetBytes(PictureCol, 0, Buffer, 0, System.Convert.ToInt32(SizeParam.Value));
Offset += System.Convert.ToInt32(SizeParam.Value);
OffsetParam.Value = Offset;
} while( //Check for the offset until it reaches the maximum size.);
More Information
For more information about the READTEXT command, see Knowledge Base article 317043, "HOW TO: Read and Write a File to and from a BLOB Column by Using Chunking in ADO.NET and Visual C# .NET," at http://support.microsoft.com/default.aspx?scid=kb;en-us;317043.Use OracleLob.Read to Read from Oracle Databases
To read BLOBs from an Oracle database, use the .NET Framework Data Provider for Oracle. This data provider provides the System.Data.OracleClient.OracleLob class that can read BLOBs. The following code fragment shows how the OracleLob.Read method enables you to read the data in chunks.byte[] buffer = new byte[100];
while((actual = blob.Read(buffer, 0/*buffer offset*/,
buffer.Length/*count*/)) >0)
{ //write the buffer to some stream
}
More Information
For more information about OracleLob.Read, see "OracleLob.Read Method" in .NET Framework Class Library at http://msdn.microsoft.com/en-us/library/system.data.oracleclient.oraclelob.read(VS.71).aspx.Use UpdateText to Write to SQL Server Databases
If you are using SQL Server, you can use the UpdateText function to write the data in chunks, as shown in the following code fragment.int BUFFER_LENGTH = 32768; // Chunk size.
// Set the existing BLOB to null and
// Obtain a pointer to the BLOB using TEXTPTR
SqlCommand cmdGetPointer = new SqlCommand(
"SET NOCOUNT ON;UPDATE Categories SET Picture = 0x0 WHERE CategoryName='Test';" +
"SELECT @Pointer=TEXTPTR(Picture) FROM Categories WHERE CategoryName='Test'",
cn);
// Set up the parameters.
// Run the query.
// Set up the UPDATETEXT command to read the BLOB by passing the following
// parameters: @Pointer pointer to blob, @Offset number of bytes to
// skip before starting the read, @Size number of bytes to read.
SqlCommand cmdUploadBinary = new SqlCommand(
"UPDATETEXT Categories.Picture @Pointer @Offset @Delete WITH LOG @Bytes", cn);
// Set up the parameters.
// Read buffer full of data and then run the UPDATETEXT statement.
Byte [] Buffer = br.ReadBytes(BUFFER_LENGTH);
while(Buffer.Length > 0)
{
PointerParam.Value = PointerOutParam.Value;
BytesParam.Value = Buffer;
cmdUploadBinary.ExecuteNonQuery();
DeleteParam.Value = 0; //Do not delete any other data.
Offset += Buffer.Length;
OffsetParam.Value = Offset;
Buffer = br.ReadBytes(BUFFER_LENGTH);
}
More Information
For more information about writing BLOB data to SQL Server, see "Conserving Resources When Writing BLOB Values to SQL Server" in .NET Framework Developer's Guide at http://msdn.microsoft.com/en-us/library/3517w44b(VS.71).aspx.Or, see Knowledge Base article 317043, "HOW TO: Read and Write a File to and from a BLOB Column by Using Chunking in ADO.NET and Visual C# .NET," at http://support.microsoft.com/default.aspx?scid=kb;en-us;317043.
Use OracleLob.Write to Write to Oracle Databases
You can write BLOBs to an Oracle database by using the .NET Framework data provider for Oracle. This data provider permits the System.Data.OracleClient.OracleLob class to write BLOBs. The OracleLob.Write method enables you to write data in chunks.More Information
For more information, see "OracleLob.Write Method," in .NET Framework Class Library at http://msdn.microsoft.com/en-us/library/system.data.oracleclient.oraclelob.write(VS.71).aspx.Avoid Moving Binary Large Objects Repeatedly
Avoid moving BLOB data more than one time. For example, if you build a Web application that serves images, store the images on the file system and the file names in the database instead of storing the images as BLOBs in the database.Storing the images as BLOBs in the database means that you must read the BLOB from the database to the Web server and then send the image from the Web server to the browser. Reading the file name from the database and having the Web server send the image to the browser reduces the load on the database server. It also reduces the data that is sent between the database and the Web server. This can significantly affect performance and scalability.
Paging Records
Paging records is a common application scenario. The records that you need to page through can often be based on user input. For example, they can be based on a search keyword entered through a search screen. Or, the records can be common to all users. For example, a product catalogue is a record that is common to all users.Paging costs can be divided into the following stages:
- Processing cost at the database. This includes processor and memory use, and disk I/O.
- Network cost for the amount of data sent across the network.
- Processing cost at the client. This includes the memory required to store records, and processor use for processing the records.
- Inefficient queries may increase the processing cost in all the stages mentioned in this section. The database has to process an increased number of rows, more data than is required is sent over the network, and the client has to process additional records to show the relevant ones to the user.
- Inappropriate caching of data to be paged. Some of the paging implementations require the client to cache data and then page through it. These solutions can lead to excessive memory pressure if the cache is maintained on a per-user basis.
- You can return the whole result set from the database to the client. The client caches the result set and then displays the most relevant results to the user by using the correct filtering mechanism.
- You can have the database assume the additional role of a filter by making the database return only the most relevant result set to the client.
More Information
For more information about how to choose and implement the best solution for your scenario, see "How To: Page Records in .NET Applications" in the "How To" section of this guide.Analyzing Performance and Scalability of Data Access
When you evaluate the performance and scalability of your data access decisions, you should examine the impact that your code has on the server, on the network, and on the client. A good data access solution uses server resources in a timely and efficient manner, transports only the data that is required, and permits the client to quickly consume the data and then release resources.Start by running simple logical operation tests, and then examine the key metrics and related questions listed in Table 12.3. Review the results, and then use the information in the table to improve performance and scalability.
Table 12.3: Metrics for Analyzing Data Access Performance
Metric | Questions |
Run time | How long did the operation take on the server? How long did the operation take on the client? |
Network trips | How many network trips were required to complete the operation? Is there a way to reduce or consolidate the trips? |
Index use | Did the operation use indexes? Was the index use efficient? |
Records processed/retrieved | Did the operation process more records than it returned? Did the operation return more records than you wanted? Can paging help reduce the records processed and returned? |
CPU use | Was CPU use on the server excessive? Can it be reduced by different SQL language or by computing data beforehand? |
Memory use | How much memory on the server was used to process the SQL operation? How much memory on the client was used to process the data that was retrieved? |
Network bandwidth | How much bandwidth did the operation use? Is there a way to decrease that amount used by returning fewer rows or fewer columns? |
Transactions | Are transactions creating a deadlock or failing? Is there a way to commit the transaction faster? |
- Pooling. Monitor pooling to ensure that connections are returned efficiently to the pool so that they can be reused. Ensure that you can close your connections early in your code.
- Locks. Monitor locks to find out whether the locks are held as long as they could be held. Find out if you can reduce the number of locks that you hold, and if you can shorten the duration that you hold the existing locks.
Tool Support
Use the following tools to monitor the metrics that are listed in Table 12.3:- SQL Query Analyzer. When you run an SQL command, you can use the Statistics tab to monitor the duration (in milliseconds), the affected rows, the server round trips, and the bytes transmitted. You can also use SQL Query Analyzer to show you the execution plan that SQL Server uses to run your SQL operation. You can use this feature to identify missed indexes that manifest as table scans.
- SQL Profiler. You can use SQL Profiler to monitor an enormous amount of information that includes cursors, locks, and transactions. Use this tool to identify the resources that are used, the operations that are performed, and the length of time (in milliseconds) that particular operations take.
- Performance Counters. Use performance counters to monitor connection pooling, index hits and misses, cache hits and misses, and locks.
More Information
For more information about measuring data access performance, see "ADO.NET/Data Access" inSummary
The database is often a focal point for application load because the majority of application requests require data that comes from a database. Therefore, developing efficient ADO.NET data access code is critical.This chapter has provided a brief overview of ADO.NET architecture and has highlighted the main performance and scalability issues that you need to be aware of when you develop data access code. By following the design and implementation guidelines in this chapter, you will greatly increase your chances of building data access code that enables your application to meet its performance objectives.