Nov 2, 2011

Will The Performance Improve If The Database Size is Smaller?

Few days back, when I was speaking at SSLUG (SQL Server Sri Lanka User Group) meeting on "managing VLDB" (VLDB stands for Very Large Data Bases) there was a question came on improving the performance of VLDB. There was a participant suggested that by shrinking the database we could improve the performance. I noted it then and answered later, but it could be useful for many if I share in the blog.  So I am sharing it here for the benefit of many.
Argument:  Large databases may take longer time to execute a query because scan operations may have to read more pages.
Answer: If you have an operation which scans the entire database (generally a maintenance or troubleshooting related operation) yes you may have to scan more pages.  However when you run a query against tables, you will touch only the pages allocated to the table and indexes.  If the table is clustered, each page will have a pointer to the next page and all read operations happen based on these pointers.  The performance is better, if you have read forward only. when you have free pages at the end of the database, it will not be touched at all. But if you have free pages  in between, shrink operation moves the pages at the end to the free space available. It will make the physical order different from logical order and forces read backward. This is the major concern for performance.
Argument: Shrink operations reduces the pages occupied by an object
Answer: NO. Please note this point:  Shrink operation does not remove the free space available within a page. Shrink operation does not compress the data either. The free space within a page can occur due to many reasons.

  1. If the clustered index was created with a fill factor  between 1-99 generally you expect some space within a page when the index is created/rebuilt.
  2. When page split happens due to updates there is some space left in old page. 
  3. When a row within a page is deleted their is free space created. (Okay, SQL Server does not initialize every bit of deleted row to 0. But rather the entry is removed from index and row mark within a page. SQL Server treats that area as free space)
Shrink operations work at page level and not at object level or row level.  If there is a free page in the middle and last non-free page will be moved there and that last page will be marked as free. Then it changes the pointers in the pages accordingly At the end, the free pages at the end could truncated if the options are set properly. The internal fragmentation (the fragmentation happens within a page due to page splits etc.)
Then what is the mechanism to remove the fragmentation? The answer is to rebuild the clustered index. As the data is ordered in the clustered index key, it can move the data pages according to the clustered key order.  This is why rebuilding a clustered index is a resource consuming operation.
 




   

Sep 14, 2011

SQL Server OLEDB Deprecation

Few days back I sent a tweet update on this matter using my twitter account @preethi_1965
For those who want to hear more on this, here there are few things:

Microsoft has officially announced that they are moving away from OLEDB. 
http://blogs.msdn.com/b/sqlnativeclient/archive/2011/08/29/microsoft-is-aligning-with-odbc-for-native-relational-data-access.aspx
http://social.technet.microsoft.com/Forums/en/sqldataaccess/thread/e696d0ac-f8e2-4b19-8a08-7a357d3d780f

There could be a couple of reasons behind it.
  • SNAC OLEDB does not scale upt o the customer expectation
  • ODBC is industry standard and It supports cross platform. To meet the complex market and technology needs, Microsoft needs to Maintain ODBC anyway. 
  • Maintaining both OLEDB and ODBC at the same time providing new features is a difficult task.
  • Interestingly, Initially Microsoft pushed for OLEDB as it can give a better performance over ODBC.  However, especially in 64 bit platforms, OLEDB didn't takeover.  According to insiders, OLEDB takes shorter paths, less code and provides better performance.    
With that said, SQL Server "Denali" most probably be the last version OLEDB.  As Denali CTP 3 is not yet feature complete, I expect another 6 to 12 months for the RTM to be released. It gives us space of another seven to ten years to change the code which uses OLEDB.  Until then we are rightfully expect Microsoft to support OLEDB.

Just a note, OLEDB is shipped in two forms. SNC(SQL Server Native Client) and MSOLAP. Microsoft is deprecating only the SNC OLEDB. OLEDB for other platforms will be supported continually. SSIS, SSRS and SSAS clients need not to be panicked. Especially SSIS is heavily depending on OLEDB currently and until something else is developed and in place for few years, it will be supported.

I saw an interesting article on  Impact of the SQL Server OLE DB Deprecation on Analysis Services.
Please have a look.



Sep 8, 2011

Rename sa login to improve security

Whenever I speak about security, I tell that we should minimally use sa account. SQL server by default keep the sa account disabled.
In addition to this, we can restrict sysadmin rights to few people, and give only what is needed.

Recently I found out that you can even rename sa account. It is a better practice so that people will not even know what the sa account is.
Simply go to security and logins and select sa account in object explorer. Right click there and select “rename” option. You replace the sa name with the new name and you are done. All the database user accounts mapped to sa login and sa owned jobs and objects will work without any issues. This is because even though you have renamed the login name you are still using the same SID 0x01


This means, you have an additional security; earlier Brute force mechanism needs to get only the password of sa. (It already know that they have a user named sa and sa has sysadmin rights. ) Now needs to pass three different options. Finding a valid login name; getting the password; the login should sysadmin have access.


Sep 4, 2011

Tool Evaluation: uCertify PrepEngine for 70-432: SQL Server Implementation and Maintenance.

Recently I was asked to evaluate an exam tool. A tool developed for 70-432 exam: MSTS: Implementation and maintenance of SQL Server 2008.

 
Before talking about the tool, Let me introduce my readers to the tool itself. It is “uCertify PrepEngine” from uCertify.com. The tool helps the user to learn and practice for the said exam before taking the actual exam. I know that they provide the preparation tool for various other exams too. 
To use the tool, you have two options: An evaluation version which gives limited usage. If you find it useful, (There is high possibility you may find it useful) you can upgrade it to a licensed edition.

 
 
Let me set this clear first: 100% money back guarantee and braindump free.
  • Ucertify supports 100% money back guarantee. In their own words, if you don’t get certified in the first attempt, fax them your score and we will return your money. As simple as that.
  • But it does not mean it is another collection of braindump. In fact, it is brain dump free. That means, you can learn the basics and sit for the exam and pass the exam without cheating. You can really be proud of your results.


Doesn’t that sound interesting? Don't you want to download the evaluation version?

Aug 30, 2011

UNION Vs UNION ALL

This issue came up during one of the training I conducted recently. There is a popular belief that UNION will ALWAYS perform a unique filter on the data.  While it is true that UNION Clause will ALWAYS RETURN unique set of data, filtering operation either sort or distinct select is not guaranteed.
Take this example:


SELECT SalesOrderID, SalesOrderDetailID, OrderQty, ProductID, UnitPrice, UnitPriceDiscount FROM SalesLT.SalesOrderDetail WHERE ProductID = 799
UNION
SELECT SalesOrderID, SalesOrderDetailID, OrderQty, ProductID, UnitPrice, UnitPriceDiscount FROM SalesLT.SalesOrderDetail WHERE ProductID = 989
SELECT SalesOrderID, SalesOrderDetailID, OrderQty, ProductID, UnitPrice, UnitPriceDiscount FROM SalesLT.SalesOrderDetail WHERE ProductID = 799
UNION ALL
SELECT SalesOrderID, SalesOrderDetailID, OrderQty, ProductID, UnitPrice, UnitPriceDiscount FROM SalesLT.SalesOrderDetail WHERE ProductID = 989

Aug 28, 2011

Removing unnecessary files using powershell

Now a days some people (including me) used to save whole heap of files into their hard disks. Word documents, excel sheets, powerpoint presentations, pictures  taken, pictures received from others and white papers (some times they are word documents), code files (including SQL),   Then you have Acrobat reader files, e-books, video and audio files.  Most of them take a lot of space. In addition to it, I also have multiple versions of it.

Aug 24, 2011

Speaking again at SSSLUG

I was scheduled to speak at this user group meeting (That's today.) I am ready with the presentation. However, after having a small hickup on some personal issues, I asked Dinesh Asanka to do it. He too was ready with the presentation, but at our office he got something. He presence was needed at office at that time. So, I am back to square one is doing the presentation.
I am speaking on SQL Server 2008 R2 StreamInsight. This session gives an introduction to Stream Insight.
by the way, we are planning to have live tweets of the session. If you are unable to come please watch the tweets Follow SqlSvrUniverse. 

Aug 11, 2011

Resources on Denali


onYesterday, Microsoft has announced the release of Product Guide v1 for SQL Server Code Name “Denali” Community Technology Preview 3 (CTP3). This is a single download that organizes Denali related content for easier viewing. It is available for download at http://go.microsoft.com/fwlink/?LinkID=225814

The SQL Server Code Name “Denali” CTP3 Product Guide includes useful resources and demos that will help IT Professionals better evaluate of CTP3.  This includes:
-          14 Product Datasheets
-          8 PowerPoint Presentations
-          5 Technical White Papers
-          13 Hands-On Lab Preview Documents
-          6 Click-Through Demonstrations
-          13 Self-Running Demonstrations
-          26 Links to On-Line References
-          44 Links to On-Line Videos including 26 Presentations from North America TechEd 2011
Microsoft also stated that they have a short windows of opportunity to shape the final release, so itmay be one of the rare opportunities to provide feedback on the features, usability and effectiveness. Microsoft expects the customers, communities and organizations to provide feedback on SQL Server CTP3 release in the next 60 days.
More to come, stay tuned.

Jul 14, 2011

Denali CTP3, Juneau CTP3 & SQL Server 2008 R2 sp1 Available

 

Let me give a different order:

SQL Server 2008 R2 sp1

Yes, it is finally available.  Check out the links below.

You can download the files based on the processor you have.  In addition to bug fixes some interesting enhancements too are added to this service pack.

  • Dynamic Management Views for increased supportability:
      sys.dm_exec_query_stats DMV is extended with additional columns to improve supportabilities over troubleshooting long-running queries. New DMVs and XEvents on select performance counters are introduced to monitor OS configurations and resource conditions related to the SQL Server instance.
  • FORCESEEK for improved querying performance :
      Syntax for FORCESEEK index hint has been modified to take optional parameters allowing it to control the access method on the index even further. Using old style syntax for FORCESEEK remains unmodified and works as before. In addition to that, a new query hint, FORCESCAN has been added. It complements the FORCESEEK hint allowing specifying ‘scan’ as the access method to the index. No changes to applications are necessary if you do not plan to use this new functionality.
  • Data-tier Application Component Framework (DAC Fx) for improved database upgrades:
      The new Data-tier Application (DAC) Framework v1.1 and DAC upgrade wizard enable the new in-place upgrade service for database schema management. The new in-place upgrade service will upgrade the schema for an existing database in SQL Azure and the versions of SQL Server supported by DAC. A DAC is an entity that contains all of the database objects and instance objects used by an application. A DAC provides a single unit for authoring, deploying, and managing the data-tier objects. For more information, see Designing and Implementing Data-tier Applications.
  • Disk space control for PowerPivot:
      This update introduces two new configuration settings that let you determine how long cached data stays in the system. In the new Disk Cache section on the PowerPivot configuration page, you can specify how long an inactive database remains in memory before it is unloaded. You can also limit how long a cached file is kept on disk before it is deleted.
  • As you may aware, SQL Server 2008 R2 is not supported on Windows XP.

    On feature packs, some of the cools stuffs are added.  The list below is my favourites

    • Report Builder 3.0 enhancements:

    Additional visualizations including maps, spark lines and data bars are introduced.  It also comes with report part gallery. It means you can develop report parts and add them to the reports you want.

    • Microsoft® System CLR Types for SQL Server® 2008 R2

    This allows the CLR based data types of SQL Server, geography, geometry and hierarchyid to be installed outside the server. By doing so, a client application can use the same data types. 

    • Microsoft® SQL Server® 2008 R2 Remote Blob Store

    I first heard about it as part of Denali.  However, I am surprised to see it as part of feature pack.  This component allows the data to be stored in an external data store. Microsoft is providing a client side DLL and SQL Server stored procedures

    • Microsoft® Windows PowerShell Extensions for SQL Server® 2008 R2

    How can I hide my joy to see something connected to PowerShell. In fact one of my favourite tools is PowerShell. It allows server administration, irrespective of whether it is Windows,  SQL Server or Exchange, through a set of cmdlets.  It eases the tasks of administrators.  I hear that a couple of new cmdlets and a new provider have beed added.  Need to have a look.

    Denali CTP 3

    Despite some earlier announcements, CTP3 was made available to public on 12th.  Denali is the next version of SQL Server.  Earlier references said it would be released in the second half of 2011, yes we are already there.  However, Now the messages are suggesting that it will be named as SQL Server 2012 indicating the release date to end of this year, or somewhere in next year.

    Let me warn you that you way start downloading files of 2.4 GB of size. Do you have the facility to do so? then go for.

    Want to download? Try this link:

    https://www.microsoft.com/betaexperience/pd/SQLDCTP3CTA/enus/

    I’ll write about the features of Denali on a different post.

    Juneau CTP3

    It gives a different user interface, to database developers. It allows Visual Studio to be used to create all database objects from table to stored procedures. People who are familiar with “Data Dude” or “Visual Studio for Database Professionals” can consider Juneau as the next generation of that.

    There is little more on it:  You can do all design in your local machine, and deploy to SQL Server or SQL Azure. you don’t need different tools to find the changes happened, it is included in Juneau. Additionally, it all in one source control..

    Try this link:

    http://msdn.microsoft.com/en-us/data/hh297027

    There are many more internal projects, Apollo, Velocity, Crescent and many more are on the pipeline. The information we receive are exciting. Lets wait to see them too.

    Jul 13, 2011

    Understanding Page Splits

    Note: I thought of writing about page splits. Initially I started writing a blog and it slowly became a huge article. I was having two minds on whether I should write a series, of blogs or go for a big article. Finally, I have made up my mind for the first option and here is the first part of the article. My plan is to write different aspects of page splits in coming days…


    In SQL Server data is stored into a page. A data page is 8 kb of size. It means when the rows are inserted they need to be inserted into a page. There is an exception when large blob data is inserted sql server can insert the data separately into another page and keep only the pointer (For more details refer: row overflow). Apart from that, a row should fit into the page and cannot split into multiple pages.

    The decision on where the data should be inserted is decided by the clustered key. The data is logically stored in clustered key order. (I’ll discuss later on why the word “logically” is used here. But it is safe to assume the order of the pages is clustered key order) When a row need to be inserted or updated but it cannot fit into the empty space, the content of the row splits into two pages (without breaking the row) and it is called page split.
    There are primarily two reasons for it to occur:
    · Data inserts not in clustered key order
    · Update of variable length columns
    Let’s analyse each scenario:
    Now let us assume we have a table which has a row size of 1300 bytes. It means a fully occupied page will have 6 rows. Let us assume that we have a table where a particular page has 5 rows having the clustered key values 2,4,6,8, 10 and 12. Now a new row is inserted with the clustered key of 5. It should be inserted into the same page as the value 5 falls between 4 and 6. But the page does not have space and it needs a new page. However, if the only the new row is copied into the new page, the clustered key order cannot be maintained.
    Now SQL server will perform the operation of page split. It will create a new page and copy half of the rows into the new page. Now both pages will be half empty, so that the new page will be inserted into the page it needs to be inserted. In this case the first page will have rows 2,4,5 and 6 and the next page will have rows 8, 10 and 12.
    The following code illustrates the point:
     

    CREATE TABLE dbo.Note

    (

        NoteID int NOT NULL CONSTRAINT PK_Note PRIMARY KEY CLUSTERED (NoteID),

        NoteText varchar(2000) NOT NULL

    )

    GO

    INSERT INTO dbo.Note VALUES( 2,'Some notes here.')

    INSERT INTO dbo.Note VALUES( 4,'Some notes here.')

    INSERT INTO dbo.Note VALUES( 6,'Some notes here.')

    INSERT INTO dbo.Note VALUES( 8,'Some notes here.')

    INSERT INTO dbo.Note VALUES(10,'Some notes here.')

    INSERT INTO dbo.Note VALUES(12,'Some notes here.')

    As the data is ordered on clustered key (i.e. NoteID) this will not create page split. However, when the data for the next day is inserted it will cause page split.

    INSERT INTO dbo.Note VALUES( 5,'New notes here.')

    The reason for page split is it cannot hold the all the rows in the page. When a page split happens the row will be divided into two and the second half of the data will be moved into the new page.
    There is another reason for page splits. Assume you have a variable length column in the table. Assume it is defined as varchar(2000). Since it is a variable length column, it will not occupy all 2000 bytes. Lets take this example:
     

    CREATE TABLE dbo.Note

    (

         NoteID int NOT NULL CONSTRAINT PK_Note PRIMARY KEY CLUSTERED (NoteID),

         NoteText varchar(2000) NOT NULL

    )

    GO

    INSERT INTO dbo.Note VALUES ( 1,REPLICATE('a', 900))

    INSERT INTO dbo.Note VALUES ( 2,REPLICATE('a', 900))

    INSERT INTO dbo.Note VALUES ( 3,REPLICATE('a', 900))

    INSERT INTO dbo.Note VALUES ( 4,REPLICATE('a', 900))

    INSERT INTO dbo.Note VALUES ( 5,REPLICATE('a', 900))

    INSERT INTO dbo.Note VALUES ( 6,REPLICATE('a', 900))

    INSERT INTO dbo.Note VALUES ( 7,REPLICATE('a', 900))

    INSERT INTO dbo.Note VALUES ( 8,REPLICATE('a', 900))

    Even though the NoteText is specified to have 2000 bytes, they hold only 900 bytes. Now when a row is updated with a higher length string, it may need space more than what is available.


    UPDATE dbo.Note SET NoteText = REPLICATE('c', 2000) WHERE NoteID = 3
    Since there is no space to accommodate the additional 1100 bytes, it needs a page split. Again the page is divided where the original page will have only four rows ( NoteID <=4) and rest will be in new page.
    There are couple of differences between a new page getting added and page split.
    1. Data growth in the order of clustered key leads to new pages getting added. But it ALWAYS happen at the end of the page. Page split refers to a page getting inserted in the middle of the list
    2. Adding new page starts with the blank page and only updates a pointer in the previous page.  When page split occurs half of the rows are getting copied into the new page.
    There are there any more reasons for page splits? Lets look it in another blog…   

    May 15, 2011

    Speaking Again: Wait Statistics

    Yes I am back from Hibernate mode.   Due to various reasons I couldn't update my blog for four long months. A couple of my friends were repeatedly reminding me.  Now, I am starting again.
    This month,  I am scheduled to speak about Wait Statistics at SQL Server Sri Lanka User Group (SSSLUG).  I gave this topic somewhere in the beginning of this year, By I need to wait till this month to speak about wait statistics.

    Wait statistics is a key part in tracking down application and server performance issues. I believe it is very important to Server Administrators, DBAs and Developers

    Due to Vesak holidays, the meeting is postponed to next week and I am expecting an official notice soon.
    See you all there...

    Jan 2, 2011

    MVP Award

    As some of you may know,  Microsoft has sent me a new year gift.  I have been awarded with MVP status.
    I thank all those recommended and proposed my name for this award.  this is really an encouragement to the work and service I do.
    I was never satisfied with all the work I have done in the past. I know as a fact that there is more I can do.