Sep 9, 2009

Getting The Table Usage

Recently I had an issue with replication. Suddenly due to some reason, some of the replications have failed but no error is reported in replciation monitor. We found out only when developers complained that the data they entered has not reflected properly in all places. We identified that the issue is with replication and quickly checked the tables he mentioned. There was a difference in number of rows between the source table and the destination table. We immediately wrote some scripts to transfer the remaining data and dropped the subscription and created it again.
When replication started working, (Okay it didn’t start working until we find out that there was a security issue and correct it.) I wanted to check what are the other tables were affected.

As I know the publication database and subscription database, we can get the data from sysarticles table from the publication database.

There were three methods before me to get row count.
  1. Selecting data by using SELECT COUNT(1) FROM <table> method. This result is accurate, but it will consume a lot of memory and time. This will result reading the entire clustered index (a full clustered index scan) or a table scan.
  2. Reading the rows FROM sysindexes for the clustered index or heap. This is the fastest method but there is a possibility of inaccuracy. (Even when I ran the test I got some slightly inaccurate results.) This method is fairly okay if you want to know the approximate row count, but in case you need to get the exact row count, this method is not recommended.
  3. Using sp_SpaceUsed system procedure. This method also does not give accurate numbers; additionally as this is a stored procedure, it does not gives the flexibility to add additional columns or remove unnecessary columns. Additionally, even though this stored procedure accepts schema_name.table_name format as input, on the output only table name is mentioned. This may give some issues if your database having multiple tables with the same name (in different schemas).
I went with the first method to get the row count and completed the issues as I got the weekend to resolve it. (Remember last weekend was a long weekend.) as I already had a script to work with and I added a few things to complete it. I have added the script with some modifications for you all to use it.

SELECT '
        SELECT ''source'' AS Table_Location,
               '''+ OBJECT_SCHEMA_NAME(a.objid)+''' AS SchemaName,
               '''+ OBJECT_NAME(a.objid) +''' AS TableName,
               COUNT(*) as row_count
        FROM '+ OBJECT_SCHEMA_NAME(a.objid)+'.'+ OBJECT_NAME(a.objid)+'
        UNION ALL
        SELECT ''destination'' AS Table_Location,
               '''+ a.dest_owner +''' AS SchemaName,
               '''+ a.dest_table +''' AS TableName,
               COUNT(*) as row_count
         FROM '+ s.dest_db +'.'+ a.dest_owner +'.'+ a.dest_table
FROM dbo.sysarticles a
INNER JOIN dbo.syssubscriptions s
ON a.artid = s.artid


Is this usefull?

    No comments:

    Post a Comment