Server Architecture
Figure "Server Architecture" illustrates the main components of the relational engine portion of SQL Server. The illustrated components can be organized into three groupings of subsystems.
Server Architecture
Open Data Services(ODS)
ODS manages the network: it listens for new connections, cleans up failed connections, acknowledges "attentions" (cancellations of commands), coordinates threading services to SQL Server, and returns result sets, messages, and status values back to the client by using TDS.
Compilation components: On the left are the components that address compiling queries.
Command parser handles language events raised by ODS. It checks for proper syntax and translates Transact-SQL commands into an internal format that can be operated on. This internal format is known as a query tree. If the parser doesn't recognize the syntax, a syntax error is immediately raised and identifies where the error occurred.
The optimizer is the responsibility of the query optimizer to take the nonprocedural request expressed in SQL and translate it into a set of disk I/Os, filtering, and other procedural logic that efficiently satisfies the request.(optimizer只负责非程序化的SQL语句,即严格意义上的SQL、DML语句)< /font>
Execution components: On the right-hand side is the execution infrastructure. This is really a much smaller set of facilities. Once the compilation components have finished their job, they have created something that can be directly executed with just a few services thrown in.
SQL Manager: In the middle of the figure is something called the SQL Manager. It controls the flow of everything inside the SQL Server. RPC messages are handled by the SQL Manager. The pseudo system stored procedures are also logically a part of the SQL Manager. SQL statements typically coming in as TDS SQL Language messages are processed directly from the compilation side. Results are sent back out by components in the execution engine calling ODS to format the TDS results messages.(SQL Manager负责处理程序化的TSQL语句,即严格意义上的Transact-SQL或TSQL。也包括对其它的RPC events以及pseudo system stored procedures的处理)
Expression service. The Expression Services library is a component that does data conversion, predicate evaluation (filtering), and arithmetic calculations. It also works with the ODS layer to format output results into TDS messages.
For example:SELECT @MyQty=Qty/10 FROM MyTable
The expression manager copies the value of qty from the row set returned by the storage engine, multiplies it by 10, and stores the result in @myqty.
Other components:
Catalog services component, which handles data such definition statements as CREATE TABLE, CREATE VIEW, and so forth. Catalog services also handles system tables, materializing those that are really pseudo tables. The catalog services component is located primarily in the relational engine, but actually about one-third of it operates within the sphere of the storage engine, so it is treated as a shared component.
User Mode Scheduler (UMS): SQL Server's own internal scheduler for fibers and threads. There is a very sophisticated internal mechanism for scheduling how work is assigned to either fibers or threads, depending on how you've configured the server, and allows SQL Server to do the proper load balancing across processors on an SMP system. The UMS also keeps SQL Server from thrashing by running too many threads concurrently. Finally, there are the system procedures(指那些非pseudo system stored procedures) that people are familiar with; logically they are part of the relational engine. These system procedures are treated as part of the server because their purpose is to expose primitive server capabilities, like the system tables, at a higher and more appropriate level for application use. 应用程序使用这些存储过程,可以避免系统表变化时应用程序变得不可用。
Client/Server Interactions When Processing SQL
Ad hoc(non-parameterized SQL) & default result set: Here is an example of an ODBC call:(There is an almost direct equivalent of this call for OLE-DB, which we won't look at because the processing is practically identical to the ODBC call.)
SQLExecDirect(hstmt, "SELECT * FROM parts where partid = 7", SQL_NTS)
This is a classic example of ad hoc SQL. Clients all provide some notion of cursors, so one of the questions that must be asked by the client internally is what kind of result set or what kind of cursor is the programmer asking for. The fastest type is what the documentation calls a default result set. This type of cursor has also been historically called a firehouse cursor and sometimes it isn't even thought of as a cursor at all. After the SQL request is sent to the server, the server starts sending results back to the client and won't stop sending results until the client has consumed the entire set. This is like a giant firehouse pumping data out at the client.
RPC(parameterized SQL): Once the client has determined that it's a default result set, the next step is to determine if there are any parameter markers. One of the options when using this SQLExecDirect call in ODBC (and its equivalent in OLE-DB) is that, instead of supplying a specific value like 7 in the WHERE clause, you can pass in a parameter marker by replacing the constant with a question mark, as shown here:
SQLExecDirect(hstmt, "SELECT * FROM parts where partid = ?", SQL_NTS)
Note that you must separately provide the actual value of the parameter.
The client needs to know if there are any parameter markers present in this SQL statement, or is it true ad hoc. That will affect what the client does with this statement internally and determines what is actually sent as messages to the SQL Server. In the case where there is no question mark, it is clear that the client simply wants to send this request as SQL Language TDS messages, and then the client will sit at the end of the firehose and take the results back. The client can then return the result to the application based on the application's parameters. The client's internal processing choices can be a little obscure in terms of what you request through the ODBC or OLE DB APIs. For example, an application program doesn't directly request a default result set. Instead, in ODBC, if you ask for a cursor that is read-only and forward-only and gives you one row at a time, that defines it to be a firehose cursor (a default result set) as far as the client internals goes.
There is one main problem with a firehose cursor. The client can't send any other SQL statements down to the server until it has consumed all the rows. Because the result set may have a very large number of rows, some applications won't work well with firehose cursors.
Prior to SQL Server version 7.0, the SQLExecDirect call would be processed in much the same way whether or not parameter markers were substituted for the constant. If you specified a parameter marker, the client would actually take the value that you supplied through a different call and plug it in where the question mark was. The new statement with the substituted value was then sent down as an ad hoc SQL statement. There was no benefit of using parameterized SQL at the server. In SQL Server 7.0, however, if parameter markers are used with SQLExecDirect, the TDS message sent down to SQL Server isn't a SQL language message. Instead, it's sent down to server using the sp_executesql procedure, so it's an RPC as far as the TDS protocol is concerned. At the client, the result is basically the same. The client will get the firehose of data back.
Scrollable cursor: If you don't want this firehose of data back, you can always use a block or a scrollable cursor. In this case, the flow becomes very different. A call is made to the sp_cursoropen entry point (one of these pseudo-stored procedures) passing in the SQL text. The sp_cursoropen manipulates the SQL to add additional logic to enable it to scroll, it potentially redirects some results into a temp table, and then it gives a response with a handle to the cursor indicating that the cursor is now open. Still outside of the programmer's control, the client calls sp_cursorfetch, brings down one or more rows to the client that are then returned to the user application. The client can also use sp_cursor to reposition the cursor, or change certain statistics. When you're done processing the cursor, the client will call sp_cursorclose.
Fast forward-only cursor: Let's examine a simple case where we're just returning one row to the client. In the default result set case, you have one round trip of messages from the client to the server, and back again. There is the SQL message (or sp_executesql) going down to the server, and then the results coming back. In the case of a (nonfirehose) cursor for the same one row, you see what you've traditionally seen with SQL Server. There is a round-trip to do the open, a round-trip to do the fetch, and a round-trip to do the close. The process has used three times as many messages as the default result set would have used. In SQL Server 7.0, there is something called a fast forward-only cursor, which uses the same cursor infrastructure. It doesn't behave like a firehose, because it doesn't require that you process all the result rows prior to sending any additional SQL messages. So if you bring back five rows and there is still more data, you can still send an update down to the server.
A fast forward-only cursor is faster on the server than a regular cursor, and it lets you specify two additional options. One is called autofetch and one is called autoclose. Autofetch will return the first set of rows as part of the response message to the open. Autoclose automatically closes the cursor after the last row is read. Because it is forward-only and read-only, you can't scroll back. SQL Server simply passes a message back with that last set of data saying that the cursor is closed. If you're using fast forward-only cursors, you can get the communication down to the same one round-trip in messages for small numbers of rows. If you have large numbers of rows, you're at least only paying the additional cost for each block of rows. Cursor processing has gotten a lot closer to that default result set if you use fast forward-only cursors.
Flow of Client/server interactions
The Prepare/Execute Model
In addition to the execute direct model (invoked in ODBC with SQLExecDirect), there is another execution model exposed in ODBC and OLE-DB, called the prepare/execute model. Defining the SQL to be executed is done as a separate step from actually executing the SQL. Here's an example in ODBC:
SQLPrepare(hstmt, "SELECT * FROM parts where partid = ?", SQL_NTS)
SQLExecute(hstmt)
In 7.0, there are two pseudo system stored procedures that provide a native interface. For the prepare call, we again take a look at what kind of cursor it is, and then we either call sp_prepare or sp_cursorprepare. That does the compilation part of processing the SQL or the store procedure, but doesn't actually execute the plan. Instead, the pseudo system stored procedure returns a handle to the plan. Now your application can repeatedly re-execute the SQL, passing in different parameter values, for example, without needing to recompile.
In SQL Server 7.0, the prepare/execute method is a native feature of SQL Server. After the SQL statement has been prepared, it is executed. In the case of default result sets, that is just by the application programmer calling sp_execute with the handle supplied from the prepare operation, and then the statement runs. In the case of cursors, it looks exactly like the other cursor processing and, in fact, it has the same characteristics, including allowing autofetch and autoclose if the cursor is fast forward-only.
Prepare / Execute Model
Calling Stored Procedures
Stored procedures are generally invoked from ODBC and OLE-DB by sending a SQL statement down to the SQL Server that uses the ODBC canonical CALL syntax to call a procedure. It might look something like this:
SQLExecDirect(hstm, "{call addorder(?)}", SQL_NTS)
In the case of a default result set, it's a simple flow because this is what RPC messages were originally intended for. The client sends an RPC message to the server and gets back the results coming from the procedure. If it's a cursor, it's a little more complicated. The client calls sp_cursoropen, like with any other cursor. Sp_cursoropen has some logic built into it to detect whether the stored procedure contains only a single SELECT statement. If so, a cursor is opened on that SELECT. If it's not a single SELECT statement in the procedure, then the client gets back a message with an indicator that says "we opened this for you, but we're going to stream the results back to you as a firehose and you can present that to the user."
Calling stored procedures
The SQL Manager
The SQL Manager is the driving force in a lot of the server processing. It's really at the heart of the server. The SQL Manager deals with all of the requests to run stored procedures. It manages the procedure cache, it has the pseudo system stored procedures, and it's involved in auto parameterization of ad hoc queries, which we'll discuss shortly.
Typically, the SQL manager is invoked with an RPC message, as you ask SQL Server to do some work for you. However, when a SQL language statement comes in through a SQL message and goes into the compilation side of the engine, the SQL manager is also involved. It can be involved when a procedure or a batch has an EXEC statement in it, because the EXEC is actually calling the SQL manger. If the SQL statement passes the autoparameterization template, then the SQL manager is called to parameterize the queries. It's also called when ad hoc queries need to be placed in the cache.
<Microsoft SQL Server Query Processor Internals and Architecture>