dbthoughts: July 2011

Jul 14, 2011

Denali CTP3, Juneau CTP3 & SQL Server 2008 R2 sp1 Available

Let me give a different order:

SQL Server 2008 R2 sp1

Yes, it is finally available. Check out the links below.

You can download the files based on the processor you have. In addition to bug fixes some interesting enhancements too are added to this service pack.

Dynamic Management Views for increased supportability:

sys.dm_exec_query_stats DMV is extended with additional columns to improve supportabilities over troubleshooting long-running queries. New DMVs and XEvents on select performance counters are introduced to monitor OS configurations and resource conditions related to the SQL Server instance.

FORCESEEK for improved querying performance :

Syntax for FORCESEEK index hint has been modified to take optional parameters allowing it to control the access method on the index even further. Using old style syntax for FORCESEEK remains unmodified and works as before. In addition to that, a new query hint, FORCESCAN has been added. It complements the FORCESEEK hint allowing specifying ‘scan’ as the access method to the index. No changes to applications are necessary if you do not plan to use this new functionality.

Data-tier Application Component Framework (DAC Fx) for improved database upgrades:

The new Data-tier Application (DAC) Framework v1.1 and DAC upgrade wizard enable the new in-place upgrade service for database schema management. The new in-place upgrade service will upgrade the schema for an existing database in SQL Azure and the versions of SQL Server supported by DAC. A DAC is an entity that contains all of the database objects and instance objects used by an application. A DAC provides a single unit for authoring, deploying, and managing the data-tier objects. For more information, see

Designing and Implementing Data-tier Applications.

Disk space control for PowerPivot:

This update introduces two new configuration settings that let you determine how long cached data stays in the system. In the new Disk Cache section on the PowerPivot configuration page, you can specify how long an inactive database remains in memory before it is unloaded. You can also limit how long a cached file is kept on disk before it is deleted.

As you may aware, SQL Server 2008 R2 is not supported on Windows XP.

On feature packs, some of the cools stuffs are added. The list below is my favourites

Report Builder 3.0 enhancements:

Additional visualizations including maps, spark lines and data bars are introduced. It also comes with report part gallery. It means you can develop report parts and add them to the reports you want.

Microsoft® System CLR Types for SQL Server® 2008 R2

This allows the CLR based data types of SQL Server, geography, geometry and hierarchyid to be installed outside the server. By doing so, a client application can use the same data types.

Microsoft® SQL Server® 2008 R2 Remote Blob Store

I first heard about it as part of Denali. However, I am surprised to see it as part of feature pack. This component allows the data to be stored in an external data store. Microsoft is providing a client side DLL and SQL Server stored procedures

Microsoft® Windows PowerShell Extensions for SQL Server® 2008 R2

How can I hide my joy to see something connected to PowerShell. In fact one of my favourite tools is PowerShell. It allows server administration, irrespective of whether it is Windows, SQL Server or Exchange, through a set of cmdlets. It eases the tasks of administrators. I hear that a couple of new cmdlets and a new provider have beed added. Need to have a look.

Denali CTP 3

Despite some earlier announcements, CTP3 was made available to public on 12th. Denali is the next version of SQL Server. Earlier references said it would be released in the second half of 2011, yes we are already there. However, Now the messages are suggesting that it will be named as SQL Server 2012 indicating the release date to end of this year, or somewhere in next year.

Let me warn you that you way start downloading files of 2.4 GB of size. Do you have the facility to do so? then go for.

Want to download? Try this link:

https://www.microsoft.com/betaexperience/pd/SQLDCTP3CTA/enus/

I’ll write about the features of Denali on a different post.

Juneau CTP3

It gives a different user interface, to database developers. It allows Visual Studio to be used to create all database objects from table to stored procedures. People who are familiar with “Data Dude” or “Visual Studio for Database Professionals” can consider Juneau as the next generation of that.

There is little more on it: You can do all design in your local machine, and deploy to SQL Server or SQL Azure. you don’t need different tools to find the changes happened, it is included in Juneau. Additionally, it all in one source control..

Try this link:

http://msdn.microsoft.com/en-us/data/hh297027

There are many more internal projects, Apollo, Velocity, Crescent and many more are on the pipeline. The information we receive are exciting. Lets wait to see them too.

Jul 13, 2011

Understanding Page Splits

Note: I thought of writing about page splits. Initially I started writing a blog and it slowly became a huge article. I was having two minds on whether I should write a series, of blogs or go for a big article. Finally, I have made up my mind for the first option and here is the first part of the article. My plan is to write different aspects of page splits in coming days…

In SQL Server data is stored into a page. A data page is 8 kb of size. It means when the rows are inserted they need to be inserted into a page. There is an exception when large blob data is inserted sql server can insert the data separately into another page and keep only the pointer (For more details refer: row overflow). Apart from that, a row should fit into the page and cannot split into multiple pages.

The decision on where the data should be inserted is decided by the clustered key. The data is logically stored in clustered key order. (I’ll discuss later on why the word “logically” is used here. But it is safe to assume the order of the pages is clustered key order) When a row need to be inserted or updated but it cannot fit into the empty space, the content of the row splits into two pages (without breaking the row) and it is called page split.
There are primarily two reasons for it to occur:
· Data inserts not in clustered key order
· Update of variable length columns
Let’s analyse each scenario:
Now let us assume we have a table which has a row size of 1300 bytes. It means a fully occupied page will have 6 rows. Let us assume that we have a table where a particular page has 5 rows having the clustered key values 2,4,6,8, 10 and 12. Now a new row is inserted with the clustered key of 5. It should be inserted into the same page as the value 5 falls between 4 and 6. But the page does not have space and it needs a new page. However, if the only the new row is copied into the new page, the clustered key order cannot be maintained.
Now SQL server will perform the operation of page split. It will create a new page and copy half of the rows into the new page. Now both pages will be half empty, so that the new page will be inserted into the page it needs to be inserted. In this case the first page will have rows 2,4,5 and 6 and the next page will have rows 8, 10 and 12.
The following code illustrates the point:

CREATE TABLE dbo.Note

(

NoteID int NOT NULL CONSTRAINT PK_Note PRIMARY KEY CLUSTERED (NoteID),

NoteText varchar(2000) NOT NULL

)

GO

INSERT INTO dbo.Note VALUES( 2,'Some notes here.')

INSERT INTO dbo.Note VALUES( 4,'Some notes here.')

INSERT INTO dbo.Note VALUES( 6,'Some notes here.')

INSERT INTO dbo.Note VALUES( 8,'Some notes here.')

INSERT INTO dbo.Note VALUES(10,'Some notes here.')

INSERT INTO dbo.Note VALUES(12,'Some notes here.')

As the data is ordered on clustered key (i.e. NoteID) this will not create page split. However, when the data for the next day is inserted it will cause page split.

INSERT INTO dbo.Note VALUES( 5,'New notes here.')

The reason for page split is it cannot hold the all the rows in the page. When a page split happens the row will be divided into two and the second half of the data will be moved into the new page.
There is another reason for page splits. Assume you have a variable length column in the table. Assume it is defined as varchar(2000). Since it is a variable length column, it will not occupy all 2000 bytes. Lets take this example:

CREATE TABLE dbo.Note

(

NoteID int NOT NULL CONSTRAINT PK_Note PRIMARY KEY CLUSTERED (NoteID),

NoteText varchar(2000) NOT NULL

)

GO

INSERT INTO dbo.Note VALUES ( 1,REPLICATE('a', 900))

INSERT INTO dbo.Note VALUES ( 2,REPLICATE('a', 900))

INSERT INTO dbo.Note VALUES ( 3,REPLICATE('a', 900))

INSERT INTO dbo.Note VALUES ( 4,REPLICATE('a', 900))

INSERT INTO dbo.Note VALUES ( 5,REPLICATE('a', 900))

INSERT INTO dbo.Note VALUES ( 6,REPLICATE('a', 900))

INSERT INTO dbo.Note VALUES ( 7,REPLICATE('a', 900))

INSERT INTO dbo.Note VALUES ( 8,REPLICATE('a', 900))

Even though the NoteText is specified to have 2000 bytes, they hold only 900 bytes. Now when a row is updated with a higher length string, it may need space more than what is available.

UPDATE dbo.Note SET NoteText = REPLICATE('c', 2000) WHERE NoteID = 3
Since there is no space to accommodate the additional 1100 bytes, it needs a page split. Again the page is divided where the original page will have only four rows ( NoteID <=4) and rest will be in new page.
There are couple of differences between a new page getting added and page split.

Data growth in the order of clustered key leads to new pages getting added. But it ALWAYS happen at the end of the page. Page split refers to a page getting inserted in the middle of the list
Adding new page starts with the blank page and only updates a pointer in the previous page. When page split occurs half of the rows are getting copied into the new page.

There are there any more reasons for page splits? Lets look it in another blog…