Sep 29, 2012

Supplement to page Splits: Change in SQL Server 2012


On one sentence, Page splits does not always make a 50/50 split in SQL Server 2012!

Recently I was preparing for a training and checking on page splits. Suddenly I thought that it is better to tell people about the numberof page splits can occur in a single statement and the amount of activities it can take. I found something new … well its new to me,  as my subsequent search over the internet didn't give any answers.



As I was preparing about the subject some time back, page split is something blogged by many:  They have unearthed many hidden issues with page splits.

One of the key issues I remembered about page split is how it splits the page.  Generally, if a row to be inserted into a page where it does not have enough free space, the page is split at the middle (half of the rows are deleted and inserted into new page) and the row is inserted based on where it fits.  In the order of clustered key if the page needs to come into first page it goes there, else it goes to the new (second) page. What if the row you are inserting is so big, it doesn't fit into half of the page? It leads to further  splits until it finds enough space.

Kalen Delaney blogged about this domino effect of page splits here 

However, When I wrote my script for the training, it didn't split into so many pages.  I then got the script Kalen used and tried it too. The maximum I got was 3 pages – 2 splits only.

When I dinged further, I found this difference.  In SQL Server 2012 the page split mechanism is changed.
First of all, it checks whether a formal 50/50 split could solve the issue.  If so, it will do so.  Otherwise, it will find the place the place where the data to be inserted and splits there.  The data less than the inserted row stays where it should be and the data greater than the inserted row goes to the new page.  SQL Server creates another new page and inserts this particular row.  Now the page links change appropriately so that the order is maintained.

I see this as a definite improvement. In SQL Server 2008 R2 and before, a situation can cause up to 11 page splits is now reduced to 2 splits only.  (Actually, the insert of a new row in to a new page is not considered as page split in that sense. New page to be created anyway)

The only issue I see here was this:  the new row is inserted after the page split.  It makes the inserted row (which should be in between the pages), to be located after both pages under ideal conditions. It makes the data to be saved in different physical order from the logical order.  It makes the disks to spin once more to get all data in certain situations.

Let’s see the example of Kalen and try it in both SQL Server 2008 R2 or before and in SQL Server 2012


-- Create the table
USE tempdb;
GO
SET NOCOUNT ON
GO
IF EXISTS (SELECT * FROM sys.tables
            WHERE name = 'split_page')
    DROP TABLE split_page;
GO
CREATE TABLE split_page
(id int identity(0,2) PRIMARY KEY,
id2 bigint default 0,
data1 varchar(33) null,
data2 varchar(8000) null);
GO

-- fill page until no more rows fit
INSERT INTO split_page DEFAULT VALUES;
GO 385


Kalen  has used DBCC IND undocumented command to identify the row.  To know the place where the split takes place you may like to use another undocumented function which accepts another undocumented data stored with the data:  sys.fn_PhysLocFormatter(%%physloc%%).   %%physloc%% refers to the physical location of the rows it comes as binary (6) which the function translates into file id : page id : row id format.  Please follow this post about the function and related information. Paul Randal has written something cool about this.

-- verify that there is only one data page
DBCC IND(tempdb, split_page, -1);
GO

SELECT sys.fn_PhysLocFormatter(%%physloc%%), * FROM split_page

-- Now insert one more row, this time filling the VARCHARs to the maximum length.
SET IDENTITY_INSERT split_page  ON;
GO
INSERT INTO split_page (id, id2, data1, data2)
SELECT 111, 0, replicate('a', 33), replicate('b', 8000);
GO
SET IDENTITY_INSERT split_page  OFF;
GO

DBCC IND(tempdb, split_page, -1);
GO
SELECT sys.fn_PhysLocFormatter(%%physloc%%), * FROM split_page


Try the same code in SQL Server 2012 and earlier versions and see the output. 

Special thanks to Kalen who posted about this in 2008 December which I believe Microsoft to rethink their page split strategy.

1 comment:

Unknown said...

attractive piece of information, I had come to know about your blog from my friend arjun, ahmedabad,i have read atleast eleven posts of yours by now, and let me tell you, your website gives the best and the most interesting information. This is just the kind of information that i had been looking for, i'm already your rss reader now and i would regularly watch out for the new posts, once again hats off to you! Thanks a lot once again, Regards, Single Row Function in sql


Post a Comment