On one sentence, Page splits does not always make a 50/50 split in SQL Server
2012!
Recently I was preparing for a training and checking on page splits. Suddenly I thought that it is better to tell people about the numberof page splits can occur in a single statement and the amount of activities it can take. I found
something new … well its new to me, as my subsequent search over the internet didn't give any answers.
As I was preparing about the subject some time back, page
split is something blogged by many: They
have unearthed many hidden issues with page splits.
One of the key issues I remembered about page split is how
it splits the page. Generally, if a row to
be inserted into a page where it does not have enough free space, the page is
split at the middle (half of the rows are deleted and inserted into new page) and
the row is inserted based on where it fits.
In the order of clustered key if the page needs to come into first page
it goes there, else it goes to the new (second) page. What if the row you are
inserting is so big, it doesn't fit into half of the page? It leads to further splits until it finds enough space.
Kalen Delaney blogged about this domino effect of page
splits here
However, When I wrote
my script for the training, it didn't split into so many pages. I then got the script Kalen used and tried it
too. The maximum I got was 3 pages – 2 splits only.
When I dinged further, I found this difference. In SQL Server 2012 the page split mechanism
is changed.
First of all, it checks whether a formal 50/50 split could
solve the issue. If so, it will do
so. Otherwise, it will find the place
the place where the data to be inserted and splits there. The data less than the inserted row stays
where it should be and the data greater than the inserted row goes to the new
page. SQL Server creates another new
page and inserts this particular row.
Now the page links change appropriately so that the order is maintained.
I see this as a definite improvement. In SQL Server 2008 R2
and before, a situation can cause up to 11 page splits is now reduced to 2
splits only. (Actually, the insert of a
new row in to a new page is not considered as page split in that sense. New
page to be created anyway)
The only issue I see here was this: the new row is inserted after the page
split. It makes the inserted row (which
should be in between the pages), to be located after both pages under ideal
conditions. It makes the data to be saved in different physical order from the logical
order. It makes the disks to spin once
more to get all data in certain situations.
-- Create the table
USE tempdb;
GO
SET NOCOUNT ON
GO
IF EXISTS (SELECT * FROM sys.tables
WHERE name = 'split_page')
DROP TABLE split_page;
GO
CREATE TABLE split_page
(id int identity(0,2) PRIMARY KEY,
id2 bigint default 0,
data1 varchar(33) null,
data2 varchar(8000) null);
GO
-- fill page until no more rows fit
INSERT INTO split_page DEFAULT VALUES;
GO 385
Kalen has used DBCC IND undocumented command to identify the row. To know the place where the split takes place you may like to use another undocumented function which accepts another undocumented data stored with the data: sys.fn_PhysLocFormatter(%%physloc%%). %%physloc%% refers to the physical location of the rows it comes as binary (6) which the function translates into file id : page id : row id format. Please follow this post about the function and related information. Paul Randal has written something cool about this.
-- verify that there is only one data
page
DBCC IND(tempdb, split_page, -1);
GO
SELECT sys.fn_PhysLocFormatter(%%physloc%%), * FROM split_page
-- Now insert one more row, this time
filling the VARCHARs to the maximum length.
SET IDENTITY_INSERT split_page ON;
GO
INSERT INTO split_page (id, id2, data1, data2)
SELECT 111, 0, replicate('a', 33), replicate('b', 8000);
GO
SET IDENTITY_INSERT split_page OFF;
GO
DBCC IND(tempdb, split_page, -1);
GO
SELECT sys.fn_PhysLocFormatter(%%physloc%%), * FROM split_page
Try the same code in SQL Server 2012 and earlier versions and see the output.
Special thanks to Kalen who posted about this in 2008
December which I believe Microsoft to rethink their page split strategy.
1 comment:
attractive piece of information, I had come to know about your blog from my friend arjun, ahmedabad,i have read atleast eleven posts of yours by now, and let me tell you, your website gives the best and the most interesting information. This is just the kind of information that i had been looking for, i'm already your rss reader now and i would regularly watch out for the new posts, once again hats off to you! Thanks a lot once again, Regards, Single Row Function in sql
Post a Comment