Arquivo
Fooling Statistics and Number of CPUs v1
There are many cases when QO can create an execution plan that access the table using a join operator, and using parallelism. To see a sample I’ll show the index intersection.
It is used when SQL can join the result of two indexes to return the query.
An interesting sample could be the following query using AdventureWorksDWR2:
SELECT [ProductAlternateKey], [ProductSubcategoryKey]
FROM DimProduct
WHERE [ProductAlternateKey] = 'BA-8327'
OR [ProductSubcategoryKey] = 12
OPTION (RECOMPILE)
The query is using the indexes by ProductAlternateKey and ProductSubcategoryKey on the table DimProduct to read all the data, joining the results using the Hash Join and then it applies the filter on the where clause.
Now to simulate a parallel plan, we can do some good tricks :-). First you can fool SQL about the number of CPUs that you have. Second you can fool SQL about the size of your table.
To fool SQL about the number of CPUs you can start the SQL using the parameter –P16.
SQL Server Configuration Manager:
Error Log file after restart SQL Server:
Alternatively you can check how many CPUs do you have using the DMV sys.dm_os_sys_info.
SELECT cpu_count FROM sys.dm_os_sys_info
cpu_count
———–
16
And to fool SQL Server about the size of the table, we could change the statistics of the clustered index of the table. For that, we can use the UPDATE STATISTICS and specify the RowCount and PageCount options.
UPDATE STATISTICS DimProduct [PK_DimProduct_ProductKey] WITH RowCount = 500000,
PageCount = 100000
After run the update statistics our query starts to use parallelism to perform the reads and join using all CPUs in parallel.
SELECT [ProductAlternateKey], [ProductSubcategoryKey]
FROM DimProduct
WHERE [ProductAlternateKey] = 'BA-8327'
OR [ProductSubcategoryKey] = 12
OPTION (RECOMPILE)
“Esconder” código do Profiler
Estou em um cliente fazendo alguns trabalhos de segurança e precisei esconder um código SQL dos DBAs que tem permissão de ALTER TRACE (podem usar o Profiler), mas que não tem acesso aos objetos (procs, functions, views e etc…) do banco.
Ou seja, o cara pode ver os códigos enviados para o SQL Server, mas não tem acesso para manipulação dos objetos nos bancos de dados.
Uma maneira seria criptografar a procedure, isso faz com que o Profiler não mostre o código executado.
Se você não quiser criar uma proc para encapsular este código, você pode usar uma método, digamos que, alternativo.
(quem nunca pecou que atire a primeira pedra. Eu poderia estar matando, roubando ou coisa pior, mas não, só estou criando um métodozinho alternativo).
Use a function EncryptByPassPhrase, veja um exemplo:
SELECT 'Teste Código que ninguem pode ver pelo Profiler'
WHERE EncryptByPassPhrase('','') <> ''
Veja o que aparece no Profiler:
Is it nice or what? ![]()
#Fica_a_dica
Undocumented,"STATISTICS_ONLY", "DBCC AUTOPILOT" and "SET AUTOPILOT"
Introduction
As we know SQL Server uses a cost based optimizer to create the execution plans. That means SQL Server evaluates many possibilities to create an execution plan and chose for the plan with the lower cost.
One of the problems we have today is that if we want to predict how a query will perform adding a new index on the table, we need to wait for the creation of the index.
Sometimes waiting for the creation of the index is a nightmare, especially if the table is too big. Furthermore, after wait for 20 minutes for the creation of the index, when you go look at the query plan it is not using the index
.
So the question is, how to create an hypothetically index? Just to test if the index really will be useful for the query.
WITH STATISTICS_ONLY
To create an hypothetically index you could use an undocumented syntax in the create index command.
For instance:
USE AdventureWorksDW
GO
CREATE INDEX ix_FirstName ON DimCustomer(FirstName) WITH STATISTICS_ONLY = -1
GO
This will create the index only with the statistics (density and histogram stuff). You could check the index using the sp_helpindex:
sp_HelpIndex DimCustomer
DBCC SHOW_STATISTICS(DimCustomer, ix_FirstName)
My friend Benjamin Nevares did a very good explanation about this here:
PS: If you create the index using “WITH STATISTICS_ONLY = 0” SQL will not create the statistic for your table. Just the hypothetical index.
DBCC AUTOPILOT and SET AUTOPILOT
Now that we have the hypothetical index, how to use it?
Let’s try to use the “index” hint :
SELECT * FROM DimCustomer WITH(index=ix_FirstName)
WHERE FirstName = N’Eugene’
Msg 308, Level 16, State 1, Line 1
Index ‘ix_FirstName’ on table ‘DimCustomer’ (specified in the FROM clause) does not exist.
What about use the indexid?
SELECT * FROM DimCustomer WITH(index=5)
WHERE FirstName = N’Eugene’
Msg 307, Level 16, State 1, Line 1
Index ID 5 on table ‘DimCustomer’ (specified in the FROM clause) does not exist.
So how we can create a new query plan that consider this index?
Now the fun stuff. ![]()
The DBCC AUTOPILOT is used to tell to the Optimizer to consider an specific index to the query plan creation. This DBCC plus the SET AUTOPILOT ON it’s the way to allow the use of the index.
Following is the syntax of the commands:
SET AUTOPILOT ON|OFF
/*DBCC TRACEON (2588)
DBCC HELP(‘AUTOPILOT’)
*/
DBCC AUTOPILOT (typeid [, dbid [, {maxQueryCost | tabid [, indid [, pages [, flag [, rowcounts]]]]} ]])
Following is a complete scenario with an sample of the commands:
— Current Cost = 0,762133
— Clustered Index Scan on pkSELECT * FROM DimCustomer
WHERE FirstName = N’Eugene’
GO
— creating the index
— DROP INDEX ix_FirstName ON DimCustomerCREATE INDEX ix_FirstName ON DimCustomer(FirstName) WITH STATISTICS_ONLY = -1
GO
— Looking at the info necessary in the DBCC AUTOPILOT comandSELECT name, id, Indid, Dpages, rowcnt
FROM sysindexes
WHERE id = object_id(‘DimCustomer’)
GO
DBCC AUTOPILOT (5, 9, 0, 0, 0, 0, 0) — Starting with the TypeID
DBCC AUTOPILOT (6, 9, 37575172, 1, 0, 0, 0) — Clustered Index with TypeID 6DBCC AUTOPILOT (0, 9, 37575172, 2, 0, 0, 0) — All other index with TypeID 0
DBCC AUTOPILOT (0, 9, 37575172, 3, 0, 0, 0) — All other index with TypeID 0
DBCC AUTOPILOT (0, 9, 37575172, 5, 0, 0, 0) — All other index with TypeID 0
GO
SET AUTOPILOT ON
GO
— Query to create the estimated execution plan with the cost = 0,0750712
SELECT * FROM dbo.DimCustomer
WHERE FirstName = N’Eugene’
OPTION (RECOMPILE)
GO
SET AUTOPILOT OFF
GO
I realized that you can fool the Optimizer telling different numbers for the Pages and RowCount parameters of the DBCC AUTOPILOT. If you let it zero it use the Clustered index values.
Conclusion
There is a lot of mystery about this features, but I’m sure this will be a good start of tests for you.
I’m still playing with this, so be confortable to ask something or share a new discovery. ![]()
I don’t need to tell you to don’t use this is a production environment do I? This is a undocumented stuff so nobody can guaranty what it is really doing unless MS folks make it officially public and documented.
That’s all folks
Count(*), Count(1), Count(PK)
Responda rápido! O que é melhor?
1: SELECT COUNT(1)
2: FROM Produtos
3: GO
4: SELECT COUNT(*)
5: FROM Produtos
6: GO
7: SELECT COUNT(ID_Produto) -- PK
8: FROM Produtos
9: GO
Tanto faz. Na verdade o Query Optimizer traduz tudo para COUNT(*).
O que ????
Pois é, quer que eu prove?
Fácil, veja os planos:
1: SELECT COUNT(1) FROM Produtos
2: |--Compute Scalar(DEFINE:([Expr1003]=CONVERT_IMPLICIT(int,[Expr1004],0)))
3: |--Stream Aggregate(DEFINE:([Expr1004]=Count(*)))
4: |--Clustered Index Scan(OBJECT:([Produtos].[xpk_Produtos]))
5:
6: SELECT COUNT(*) FROM Produtos
7: |--Compute Scalar(DEFINE:([Expr1003]=CONVERT_IMPLICIT(int,[Expr1004],0)))
8: |--Stream Aggregate(DEFINE:([Expr1004]= Count(*)))
9: |--Clustered Index Scan(OBJECT:([Produtos].[xpk_Produtos]))
10:
11: SELECT COUNT(ID_Produto) FROM Produtos
12: |--Compute Scalar(DEFINE:([Expr1003]=CONVERT_IMPLICIT(int,[Expr1004],0)))
13: |--Stream Aggregate(DEFINE:([Expr1004]= Count(*)))
SQL11, Denali, O que tem de novo?
Como alguns já sabem o CTP1 do SQL Server 11 Codinome “Delani” foi lançado oficialmente esta semana no PASS.
Infelizmente a documentação liberada junto com o download do CTP1 tem pouquíssimas informações sobre o que realmente mudou e nesta versão.
Eu estou fazendo alguns testes e encontrei algumas coisas interessantes:
Novos Objetos
- sp_server_diagnostics – Como o nome já diz, esta SP faz um diagnostico do servidor e retorna o resultado separados em 5 categorias.
- System
- Resource
- Query_Processing
- Is_subSystem
- Events
Eu achei os resultados fantásticos, informações valiosíssimas sobre o que está acontecendo no seu servidor.
Segue um print da execução da proc e os resultados que ela apresenta.
Utilizei o XML Notepad para visualizar os dados XML da columa data, veja abaixo alguns prints:
- DMV sys.dm_exec_describe_first_result_set, dm_exec_describe_first_result_set_for_object – Pelo que o que percebi, estas DMVs retornam um metadata completíssimo de uma determinada consulta ou objeto… por ex:
select * from sys.dm_exec_describe_first_result_set(‘SELECT * FROM Produtos ORDER BY ID_Produto DESC’, ”, 0)
O Interessante é que da pra passar um T-SQL com erro, dai você pode ler o erro nas colunas de erro…
select * from sys.dm_exec_describe_first_result_set(‘SELECT * FROM Produtos ORDER BY ID_Produto DESCC’, ”, 1)
CREATE PROC st_Test
AS
SELECT * FROM Pedidos
GO
select * from sys.dm_exec_describe_first_result_set_for_object(Object_ID(‘st_Test’), 0)
- Novas regas para otimização das consultas, para encontrar estas regras utilizei a DVM sys.dm_exec_query_transformation_stats. Comparando a versão do SQL 11 com meu SQL 2008 R2.
Agora nos resta entender o que estas regras fazem 🙂
select * from sys.dm_exec_query_transformation_stats a
where not exists (select 1
from [NB_FABIANO\SQL2008R2_1].master.sys.dm_exec_query_transformation_stats b
where a.name = b.name)
Fique de olho.. logo devo postar mais sobre algumas novidades bem interessantes.. .
Abraços
IT Material VS Political Correctly Text
I really love to writing about something especially about SQL Server, I also like to read good stuff and it is much better if the text is funny, I call this a “clean read”.
I’m not talking about a lot of jokes and nothing related to the content of the material, but sometimes the read is better when we see some comparisons with our life…
I’ve seen and tried to write good texts using this, for instance, comparisons with movies characters, famous quotes by someone (Developers, Developers, Developers… J) or predictions ("Two years from now, spam will be solved." Bill Gates, 2004) and so on.
One day in a chat with Fernando Guerreiro (Blog | Twitter) I asked: – Why text editors don’t like jokes? And he wisely said, Jokes don’t work at a global level.
I struggle with this problem myself and now I know that, if we want to reach global readers we need be very carefully about the different cultures. I also learn that It’s impossible write something that will receive 100% of approbations for everybody.
So now, some questions rises: – What do you think about it? Do we have to be political correctly and write regular texts? Do we have to write a book or an article using those things? Why IT books in general have to looks like a Manual, and even sometimes quite boring?
I would love to hear your opinion, please leave a comment about your thoughts.
Statistics VS Index Rebuild
Hi folks, a question came to me asking about if is necessary update the statistics even after an index rebuild. Because the answer is it depends and we don’t like this answer I did some tests to show you when you need update statistics after a rebuild and when not.
As we know SQL Server could create a statistic if the Auto Update Statistics parameter is Enabled. These statistics are that one starting with “_WA_Sys…”, be aware that SQL Server do not auto-create multi-column statistics automatically(except with indexes). We also have statistics creates by Indexes, and statistics created manually with the CREATE STATISTICS command.
So, let’s check all of this… First I’ll create a table with four columns, and populate with some garbage data.
use tempdb
GO
IF OBJECT_ID(‘T1’) IS NOT NULL
DROP TABLE T1
GO
CREATE TABLE T1 (Col1 Int, Col2 Int, Col3 Int, Col4 Int)
GO
INSERT INTO T1 VALUES(1,1,1,1)
GO 1000
CREATE CLUSTERED INDEX ix_Col1 ON T1(Col1)
CREATE INDEX ix_Col2 ON T1(Col2)
CREATE STATISTICS stats_Col3 ON T1(Col1)
GO
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
As you can see we have 3 statistics for the table T1. Two for the index and one called stats_Col3. Now in order to create an auto-statistics I’ll run the following query using the column C4 into the where condition.
SELECT * FROM T1
WHERE Col4 = 1
GO
If we check the sysindexes again, now we’ll see the auto-created statistic.
Following let’s check some different scenarios to rebuild an index.
Scenario 1
Now, the following script runs a DBCC REINDEX to rebuild the Clustered Index ix_Col1.
— Scenario 1: DBCC REINDEX in the Clustered Index
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
DBCC DBREINDEX(T1, ix_Col1)
GO
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
As we can see in the picture above, the statistics for the indexes were updated, but the auto-created and the manually don’t.
Scenario 2
— Scenario 2: DBCC REINDEX in the NonClustered Index
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
DBCC DBREINDEX(T1, ix_Col2)
GO
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
As expected now the only statistic updated was the ix_Col2.
Scenario 3
— Scenario 3: DBCC REINDEX without specify the index. That means all index must be updated.
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
DBCC DBREINDEX(T1)
GO
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
Now all statistics were updated, including the auto and manually created.
Scenario 4
— Scenario 4: ALTER INDEX to update the Clustered Index
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
ALTER INDEX ix_Col1 ON T1 REBUILD;
GO
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
Here, look that just to use the ALTER INDEX instead the DBCC REINDEX things are different. In the Scenario 1 we got the two index Updated, and using the ALTER INDEX we don’t, just the ix_Col1 was updated.
Scenario 5
— Scenario 5: ALTER INDEX to update the NonClustered Index
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
ALTER INDEX ix_Col2 ON T1 REBUILD;
GO
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
As expected now the only statistic updated was the ix_Col2. This is the same behavior that the Scenario 2.
Scenario 6
— Scenario 6: ALTER INDEX to update the ALL Index
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
ALTER INDEX ALL ON T1 REBUILD;
GO
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
Now both index statistics were update, but the auto and manually created don’t. Note that we expect that this act like the Scenario 3 where we use the DBCC REINDEX, but this is don’t have the same behavior.
It depends
As you can see in the scenarios above it depends to the command you are using may some statistics are updated and some not. But that’s is not all J, it also depends the SQL Server version you are using. I ran the scripts above into SQL Server 2008 R2 and SQL Server 2008 and the results were the same, but when I ran into SQL Server 2005 SP3 I got a slight difference.
Following we can see that the Scenario 1 don’t update all index statistics, just the clustered index.
— Scenario 1: DBCC REINDEX in the Clustered Index
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
DBCC DBREINDEX(T1, ix_Col1)
GO
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
The big surprise was when I run it into SQL Server 2000 SP4. The Scenario 1 acts very different to the other versions.
— Scenario 1: DBCC REINDEX in the Clustered Index
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
DBCC DBREINDEX(T1, ix_Col1)
GO
SELECT name, Stats_date(id, indid) AS LastUpdate
FROM sysindexes
WHERE id = Object_id(‘T1’)
GO
As you can see, when you update the Clustered Index all statistics are updated, including the auto and manually created.
Resume
Following are a resume of the scenarios.
Cursos T-SQL da SolidQ no Brasil
Pessoal, é com GRANDE prazer que falo a vocês que estamos trazendo os cursos de T-SQL da SolidQ para o Brasil.
Eu estou trabalhando na tradução do material de 2 cursos criados pelo nosso amigo e gênio Itzik Ben-Gan, vai por mim, tem muita coisa boa, e informações privilegiadas que você só verá no curso… vale muito a pena.
Portanto se você é DBA ou Desenvolvedor, não perca a chance…. logo estarei publicando mais informações sobre isso, fique de olho.
É claro que o treinamento no Brasil terá alguns “toques” especiais que irei incluir no curso
, com minha experiência de como usamos o T-SQL no Brasil mais o material da SolidQ, não tenho dúvida que iremos ministrar o melhor curso de T-SQL da América Latina.
Por enquanto para aguçar sua vontade, segue uma breve descrição dos dois cursos de T-SQL.
Fundamentos do T-SQL (Não se deixe enganar pelo nome, é um curso com muitos exemplos nível 400)
- Os tópicos cobertos por esse curso incluem: Background para Consultas e Programação em T-SQL; Consultas em uma única tabela; Joins; Sub-Consultas; Table Expressions; Operações com Conjuntos; Pivot, Unpivot e Groupins Sets; Modificação de Dados; Transações; Overview dos Objetos de Programação.
- Para complementar a experiência de aprendizado os alunos receberão exercícios que irão habilitá-los de praticar o que eles aprenderam.
T-SQL Querying Avançado, Programação e Tuning para SQL Server 2005 e 2008 (Nivel 400)
- Este curso é direcionado em escrever e aperfeiçoar consultas e programação com T-SQL no SQL Server 2005 e 2008. Neste curso você irá aprender os detalhes e capacidades do T-SQL nas seguintes áreas: Logical Query Processing, Query Tunning, SubQueries, Ranking, Functions, Joins e Grouping sets; Aggregating e Pivot de dados; TOP and APPLY; Modificações de dados; Problemas relacionados a Data Types; Objects Programáveis (SQL dinâmico, Views, User Defined Functions, Stored Procedures, Triggers, Transações e Concorrência, Controle de Exceções); Árvores e Hierarquias.
Galera, imperdível… agora deixo outra supresa… Estou preparando um curso e um seminário de Indexação e Execution Plans… logo irei publicar o conteudo do curso.
Agora deixo uma pergunta… você gostaria de fazer estes cursos? Responda aqui no blog com um comentário…
Abraços
Memory Grant, Sort Warnings
When I wrote this is an indication, I don´t mean 100% of certain that if you add more memory you will have a performance gain. But sort warnings can indicate memory pressure.
For instance, suppose a very concurrently scenario, with many queries requiring memory grant. When I run a query the server calculates the memory limit for that query. By default, this is 25% (20% on 32bit SQL 2005) of total query memory (which is set by memory broker as about 90% of server memory). This per-query limit helps to prevent one query from dominating the whole server.
Following is some picture of a simulation of this scenario.
1024 MB of Memory
EXEC sys.sp_configure N’max server memory (MB)’, N’1024′
GO
RECONFIGURE WITH OVERRIDE
GO
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS
GO
512 MB of Memory
EXEC sys.sp_configure N’max server memory (MB)’, N’512′
GO
RECONFIGURE WITH OVERRIDE
GO
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS
GO
256 MB of Memory
EXEC sys.sp_configure N’max server memory (MB)’, N’256′
GO
RECONFIGURE WITH OVERRIDE
GO
DBCC FREEPROCCACHE
DBCC DROPCLEANBUFFERS
GO
64 MB of Memory
As you can see according to the memory change, the SQL can´t grant memory to some queries. And with just a few memory available the sort is exhibited.
Instead of change the available memory I could add more threads into SQLQueryStress to simulate the concurrency into the server, if you do this, may you see
some Sort Warnings into Profiler.
I’m not saying that you are wrong, I totally agree with you that SQL uses cardinality to estimate the amount of memory to the query, and the main problem is, a wrong estimation can lead to a bad memory grant. The point here is, yes, the sort warning can indicate a memory pressure.
SQLQueryStrees by Adam Machanic http://www.datamanipulation.net
T-SQL VS *
Hoje li uma muito boa…
Lembro uma vez em que meu amigo Felipe Ferreira (Blog|Twitter) disse…
– Não use asterisco em suas consultas, nunca, asterisco é coisa do Demonio! 🙂
Agora te pergunto, como fazer isso sem violar esta regra sagrada? Que tal o código abaixo?
🙂
Abraços





