Jan 28, 2010

Best Method to archive data

Hi all,
Recently I got into a particular situation:
We have a process (process 1) which goes through all the data inserted today, from one table and do a complex calculation using multiple tables. If a particular condition is not met, I copy the data into a table (ProcessData).  This process will go though > 100K of rows but will insert <1000 rows into the table.
Then I have another process (Process B),  Which takes the rows from ProcessData table one by one and executes a complex processing.
Both Process A and  Process B are windows services working at off peak time.  They access some large and heavily used tables.  (Some tables are having more than 30 million rows and get around 160, 000 kits per day)
Now what is the best way of archiving the data from ProcessData table.
Suggestion 1:  When Process B completes a row delete the row and move the data into archive table
Suggestion 2:  Wait until Process B completes all rows and move all rows to Archive table and truncate the table
Suggestion 3:  When Process B completes a row mark the row (using a flag). Create another scheduled job to move the data using the flag periodically.
What is the best method?

Jan 25, 2010

Identifying Identity related functions

Almost after a week I am writing this post. The heave work load at office and some of the other commitments prevented me from writing during most of the last week.

This time I decided to write about Identity column as I see this issue coming again and again in various forums and blogs.

These are the principles:
When you create a table with the identity property, it does not have any values assigned to it.
USE tempdb;
CREATE TABLE t1 (id int IDENTITY(1,1) )
GO

  
Now we'll check each identity related command and see the results.

SELECT IDENT_CURRENT('t1'), SCOPE_IDENTITY(), @@IDENTITY


IDENT_CURRENT function will return the seed value. This will return null under one condition: The table does not have any column with identity property (or table does not exist).

As there is no statement in this batch has inserted value into this table, both SCOPE_IDENTITY and @@IDENTITY return null.

Now Let us insert a row into t1
INSERT INTO t1 DEFAULT VALUES

SELECT IDENT_CURRENT('t1'), SCOPE_IDENTITY(), @@IDENTITY


Once the first row is inserted, the current identity value will be set to 1.
As the statement in the query inserts a row to t1, scope_identity returns 1.  Also, as that is the last row to be inserted, @@Identity too will return 1


Now we'll do another experiment.
DROP TABLE t1
GO
SELECT IDENT_CURRENT('t1'), SCOPE_IDENTITY(), @@IDENTITY



Are tou receiving values different from what you expected?

  1. Ident_Current gives null
  2. Both Scope_Identity and @@Identity return a new value:1
As the table does not exist, Ident_Current returns null.
Scope_Identity returns a value even though the table does not have any values. Even if the table is not recreated, scope_identity returns the same value it returned when the table was existed.
@@Identity will return the last identity value inserted within the session. This is not tied to a table.  Currently even though we do not have any table with the identity column, @@Identity will return a value it got within the session.

These values will again reset to null when the session is closed.

I'll continue on Identity on future blogs as well.

Jan 19, 2010

New book from Red-Gate


Red-gate has released another free e-book recently: Brad's Sure Guide to SQL Server Maintenance Plans. This book is all about maintenance plans and covers both SQL Server 2005 and 2008. The book is designed not to just talk on theories but the practical aspects of maintenance plans. It is written in a way that new and accidental DBAs can understand things clearly. I recommend this book as a must have to all DBAs.

As you all know red-gate is committed to bring better products and bring better DBAs to the society. They have produced some great tools, out of which I love SQL Compare and SQL Data Compare. They also support many SQL Server related sites including www.sqlservercentral.com and www.simple-talk.com. Through their continuous support SQL Server Sri Lanka User group (SSSLUG) is able manage the monthly meetings. In addition to that they also produce great e-books.

I have not finished reading this book yet, but I need to admit that this book is well written.This book talks about many things you can do with maintenance plans.  If you are a DBA or want to be a DBA you need to study this book clearly.
Enough advertising :) You can get the book from this link

Jan 17, 2010

24 Hours of PASS: you can view the sessions now!

I am not sure how many of you remember about the 24 hours of online sessions which PASS had a few months before the PASS annual summit. I attended a few sessions including Kalen’s topic: “What’s simple about simple recovery model ?” But I couldn’t attend all the sessions and I had a few technical issues in connecting to the site as well. They were online events and initially they were not available for on-demand streaming. A couple of days later a couple of sessions were made available for on-demand streaming and PASS promised that all sessions will be available for on-demand streaming after the PASS annual summit.

The sessions are really valuable and they cover various tracks such as Development (Dev) Database Administration (DBA) Personal Development (PD) and Business Intelligence (BI)
I forget to check them later, but a few days back when I checked the site, they were made available.

Jan 15, 2010

Lies, damn lies and statistics

I was thinking for a long time to write something about statistics. In fact, I even know what to write: How much statistics are reliable and how they affect the query execution plans etc. I even thought of a heading: which you see now. But some or other reason, I didn’t do it.
Finally now I made it.

Statistics is an internal mechanism used by SQL Server to make decisions on query execution plans. For example, assume that we have a query like this (I am using AdventureWorks database which came with SQL Server 2005. It is available for download from Codeplex)


SELECT * FROM HumanResources.Employee WHERE ManagerID = 263
SELECT * FROM HumanResources.Employee WHERE ManagerID = 6

We have an index on ManagerID. Now the issue is you will get two possible (most likely) execution plans generated.