Statistics VS Index Rebuild
Hi folks, a question came to me asking about if is necessary update the statistics even after an index rebuild. Because the answer is it depends and we don’t like this answer I did some tests to show you when you need update statistics after a rebuild and when not.
As we know SQL Server could create a statistic if the Auto Update Statistics parameter is Enabled. These statistics are that one starting with “_WA_Sys…”, be aware that SQL Server do not auto-create multi-column statistics automatically(except with indexes). We also have statistics creates by Indexes, and statistics created manually with the CREATE STATISTICS command.
So, let’s check all of this… First I’ll create a table with four columns, and populate with some garbage data.
use tempdb
GO
IF OBJECT_ID(‘T1’) IS NOT NULL
DROP TABLE T1
GO
CREATE TABLE T1 (Col1 Int, Col2 Int, Col3 Int, Col4 Int)
GO
INSERT INTO T1 VALUES(1,1,1,1)
GO 1000
CREATE CLUSTERED INDEX ix_Col1 ON T1(Col1)
CREATE INDEX ix_Col2 ON T1(Col2)
CREATE STATISTICS stats_Col3 ON T1(Col1)
GO
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
As you can see we have 3 statistics for the table T1. Two for the index and one called stats_Col3. Now in order to create an auto-statistics I’ll run the following query using the column C4 into the where condition.
SELECT * FROM T1
WHERE Col4 = 1
GO
If we check the sysindexes again, now we’ll see the auto-created statistic.
Following let’s check some different scenarios to rebuild an index.
Scenario 1
Now, the following script runs a DBCC REINDEX to rebuild the Clustered Index ix_Col1.
— Scenario 1: DBCC REINDEX in the Clustered Index
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
DBCC DBREINDEX(T1, ix_Col1)
GO
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
As we can see in the picture above, the statistics for the indexes were updated, but the auto-created and the manually don’t.
Scenario 2
— Scenario 2: DBCC REINDEX in the NonClustered Index
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
DBCC DBREINDEX(T1, ix_Col2)
GO
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
As expected now the only statistic updated was the ix_Col2.
Scenario 3
— Scenario 3: DBCC REINDEX without specify the index. That means all index must be updated.
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
DBCC DBREINDEX(T1)
GO
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
Now all statistics were updated, including the auto and manually created.
Scenario 4
— Scenario 4: ALTER INDEX to update the Clustered Index
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
ALTER INDEX ix_Col1 ON T1 REBUILD;
GO
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
Here, look that just to use the ALTER INDEX instead the DBCC REINDEX things are different. In the Scenario 1 we got the two index Updated, and using the ALTER INDEX we don’t, just the ix_Col1 was updated.
Scenario 5
— Scenario 5: ALTER INDEX to update the NonClustered Index
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
ALTER INDEX ix_Col2 ON T1 REBUILD;
GO
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
As expected now the only statistic updated was the ix_Col2. This is the same behavior that the Scenario 2.
Scenario 6
— Scenario 6: ALTER INDEX to update the ALL Index
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
ALTER INDEX ALL ON T1 REBUILD;
GO
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
Now both index statistics were update, but the auto and manually created don’t. Note that we expect that this act like the Scenario 3 where we use the DBCC REINDEX, but this is don’t have the same behavior.
It depends
As you can see in the scenarios above it depends to the command you are using may some statistics are updated and some not. But that’s is not all J, it also depends the SQL Server version you are using. I ran the scripts above into SQL Server 2008 R2 and SQL Server 2008 and the results were the same, but when I ran into SQL Server 2005 SP3 I got a slight difference.
Following we can see that the Scenario 1 don’t update all index statistics, just the clustered index.
— Scenario 1: DBCC REINDEX in the Clustered Index
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
DBCC DBREINDEX(T1, ix_Col1)
GO
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
The big surprise was when I run it into SQL Server 2000 SP4. The Scenario 1 acts very different to the other versions.
— Scenario 1: DBCC REINDEX in the Clustered Index
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
DBCC DBREINDEX(T1, ix_Col1)
GO
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
As you can see, when you update the Clustered Index all statistics are updated, including the auto and manually created.
Resume
Following are a resume of the scenarios.