SQL Server T-SQL query to get Index fragmentation information

nwlqm0z1  于 2023-08-02  发布在  其他
关注(0)|答案(2)|浏览(91)

I've been developing a query to get index fragmentation information using DMVs.

However, the query gives more results than expected. I believe the problem is in joins.

Any thoughts?

select distinct '['+DB_NAME(database_id)+']' as DatabaseName,
    '['+DB_NAME(database_id)+'].['+sch.name+'].['
    + OBJECT_NAME(ips.object_id)+']' as TableName,
    i.name as IndexName,
    ips.index_type_desc as IndexType,
    avg_fragmentation_in_percent as avg_fragmentation,
    SUM(row_count) as Rows
FROM
    sys.indexes i INNER JOIN
    sys.dm_db_index_physical_stats(NULL,NULL,NULL,NULL,'LIMITED') ips ON
        i.object_id = ips.object_id INNER JOIN
    sys.tables tbl ON tbl.object_id  = ips.object_id INNER JOIN
    sys.schemas sch ON sch.schema_id = tbl.schema_id INNER JOIN
    sys.dm_db_partition_stats ps ON ps.object_id = ips.object_id
WHERE
    avg_fragmentation_in_percent <> 0.0 AND ips.database_id = 6
    AND OBJECT_NAME(ips.object_id) not like '%sys%'
GROUP BY database_id, sch.name, ips.object_id, avg_fragmentation_in_percent,
    i.name, ips.index_type_desc
ORDER BY avg_fragmentation_in_percent desc
sy5wg1nm

sy5wg1nm1#

I think you need index_id in the joins against sys.dm_db_partition_stats and sys.indexes .

It is probably better to use the first parameter of sys.dm_db_index_physical_stats to filter on db instead of the where clause ips.database_id = 6 .

I do not understand the distinct , group by or sum(row_count) clauses.

Here is a query you can try and see if it does what you want.

select
  db_name(ips.database_id) as DataBaseName,
  object_name(ips.object_id) as ObjectName,
  sch.name as SchemaName,
  ind.name as IndexName,
  ips.index_type_desc,
  ps.row_count
from sys.dm_db_index_physical_stats(6,NULL,NULL,NULL,'LIMITED') as ips
  inner join sys.tables as tbl
    on ips.object_id = tbl.object_id
  inner join sys.schemas as sch
    on tbl.schema_id = sch.schema_id  
  inner join sys.indexes as ind
    on ips.index_id = ind.index_id and
       ips.object_id = ind.object_id
  inner join sys.dm_db_partition_stats as ps
    on ps.object_id = ips.object_id and
       ps.index_id = ips.index_id and
       ps.partition_number = ips.partition_number
gcuhipw9

gcuhipw92#

This is an old post, but I think it still gets hits.

Here’s the answer to the original question. There is no need for distinct and sum. In fact, the sum will give you inaccurate results. sys.dm_db_partition_stats should be joined on Object_ID AND Index_ID. That’s what’s causing your dupes. The way to include row count is in an OUTER APPLY.

Here’s the first iteration of a working query:

SELECT DB_NAME(DB_ID()) AS DatabaseName, Schema_Name(T.schema_id) AS SchemaName, T.Name AS TableName, R.Rows, I.Name AS IndexName, I.Type_Desc AS IndexType, I.is_primary_key AS PK, I.Is_Disabled AS Disabled,
    I.Fill_Factor,  S.avg_fragmentation_in_percent AS Fragmentation_Pct, S.avg_fragment_size_in_pages AS Fragmentation_Pages,  S.Page_Count
FROM sys.Tables                T (NOLOCK)
INNER JOIN sys.Indexes         I (NOLOCK) ON T.object_id = I.object_id
LEFT OUTER JOIN sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, NULL) S ON  T.object_id = S.object_id AND I.index_id = S.index_id
OUTER APPLY (SELECT TOP 1 Rows FROM sys.partitions P WHERE P.object_id = T.object_id) R
WHERE T.Name = 'MY_TABLE'
ORDER BY T.Name, I.Is_Primary_Key DESC, I.Name

You can jigger with the WHERE clause to suit your needs.

I use sys.partitions to get the row count, but it’s all the same. The idea is that we only want one record from that table to get the row count, and it’s not normalized.

Note that sys.dm_db_index_physical_stats only returns information for indexes that are enabled, hence the OUTER JOIN.

Here’s a version of the above query that performs much better. sys.dm_db_index_physical_stats will search the entire server if you let it. The first parameter limits the search to the current database, but it’s still gathering information on all tables and indexes in the database, even if you don’t need them. Moving it to an OUTER APPLY means it’s only doing it’s thing for tables and indexes you’re actually interested in, which speed things up enormously.

SELECT DB_NAME(DB_ID()) AS DatabaseName, Schema_Name(T.schema_id) AS SchemaName, T.Name AS TableName, R.Rows, I.Name AS IndexName, I.Type_Desc AS IndexType, I.is_primary_key AS PK, I.Is_Disabled AS Disabled,
    I.Fill_Factor,  S.avg_fragmentation_in_percent AS Fragmentation_Pct, S.avg_fragment_size_in_pages AS Fragmentation_Pages,  S.Page_Count
FROM sys.Tables                T (NOLOCK)
INNER JOIN sys.Indexes         I (NOLOCK) ON T.object_id = I.object_id
OUTER APPLY (SELECT TOP 1 Rows FROM sys.partitions P WHERE P.object_id = T.object_id) R
OUTER APPLY (SELECT TOP 1 * FROM sys.dm_db_index_physical_stats(DB_ID(), T.object_id, I.index_id, NULL, 'LIMITED') WHERE I.is_disabled = 0) S
WHERE T.Name = 'MY_TABLE'
ORDER BY T.Name, I.Is_Primary_Key DESC, I.Name

One other thing to note is that sys.dm_db_index_physical_stats throws an error if you ask it for information about a disabled index. Hence the WHERE clause in the second OUTER APPLY.

相关问题