Previous | Table of Contents | Next |
Often, people are looking for a quick way to increase the size of a table during testing. You can double the size of the sales table with this statement:
insert sales select * from sales
NOTE: Its only fair to warn you that this statement only works when you have no unique indexes, and when your table does not include a timestamp or identity column. It also wont work if you have a clustered index on the table.So when does it work? When you have a raw table (no indexes) with two or three hundred rows and need it to be much larger. Run it five or ten times and its also an excellent method for testing what happens when you run out of log space!
When you execute an INSERT SELECT statement, the server generally uses the direct mode of inserting, in which rows are added to the log and data pages concurrently. When the source table of an insert is the same as the destination table, SQL Server uses a deferred mode to insert the rows, where planned modifications are first noted in the log and then written to the log and data. The deferred mode writes two sets of log records, but it is a safer method.
Why does the optimizer use a deferred method? If you dont defer the writes to the data until you have recorded them in the log, the rows added by the INSERT are visible to the SELECT process. This leads to an infinite loop, where rows are added until the database runs out of space.
To confirm that the optimizer uses a deferred method, look at the optimization plan from showplan:
insert sales select * from sales
STEP 1 The type of query is INSERT The update mode is deferred FROMTABLE sales Nested iteration Table Scan FROMTABLE titles CONSTRAINT : nested iteration Using Clustered Index TOTABLE sales
The sales table is both source and destination for the INSERT , so the server uses a deferred method. Log writes are doubled, but the server avoids the infinite loop.
What about INSERT EXECUTE? The optimizer doesnt know whether the EXECUTE statement is reading the INSERT destination table or some other table. Unfortunatelyand this in my humble opinion is a bugthe server uses a direct method to perform the INSERT. Heres the output from the same operation, performed using an EXECUTE:
insert sales execute ("select * from sales")
STEP 1 The type of query is INSERT The update mode is direct TOTABLE sales STEP 1 The type of query is EXECUTE STEP 1 The type of query is SELECT FROMTABLE sales Nested iteration Table Scan Msg 1105, Level 17, State 2 Cant allocate space for object Syslogs in database tempdb because thelogsegment segment is full. If you ran out of space in Syslogs, dump the transaction log. Otherwise, use ALTER DATABASE or sp_extendsegment to increase the size of the segment.
The server used a direct method to insert the rows. As a result, it entered the infinite loop and kept inserting rows until it ran out of space in the logsegment , although its just as likely to run out of space in the default segment, where the data is stored.
The primary benefit of sourcing inserts from procedures is to use the results of a remote procedure call as the basis of an INSERT into a local table. (Theres more on remote procedures later today.) This is an important benefit for multiserver systems because it allows data to flow among servers.
NOTE: Remote procedure calls are not considered part of a transaction. You need to use distributed transactions to get multiserver transactions. Distributed transactions and the DTC are not covered in this book.
Using a procedure to source your inserts is a dangerous business. Make certain that you completely control the procedure that is executing and use careful transactional control and error checking. Watch out especially for problems with direct mode inserts when source and destination tables are the same.
Previous | Table of Contents | Next |