BorderLayoutBoxedLayoutOpenLayoutMaximum textMedium textSmall text


Register
Tuesday, March 09, 2010
MyStreamMinimize
Print  

Entries for the 'Indexes' Category

Greater Fort Worth SQL Server Users Group

Posted by Jason Massie Click to IM Jason Massie on Saturday, April 11, 2009 at 2:07 PM to SQL Server 2008, SQL performance tuning, Indexes, PASS, DMVs
1523 Views | 0 Comments | Article Rating

I will be speaking at the Greater Fort Worth SQL Server Users group this Wednesday. The talk will be on Troubleshooting with the DMV’s. If you are a DFW SQL’er come on out. Directions and more info can be found here.

If you don’t come, I am sending my boy after you with blow torch and a pair of pliers.

email it! |   |   | 

Mixed Workloads Part 3

Posted by Jason Massie Click to IM Jason Massie on Tuesday, March 31, 2009 at 8:17 AM to SQL Server 2008, SQL Server 2005, SQL performance tuning, tsql, Indexes, consolidation, editorial, syndicate
1585 Views | 0 Comments | Article Rating

In part 1, I talk about what I consider feeble attempts at implementing a reporting server through log shipping, mirroring\snapshots and, to a lesser extent, replication. Unless you invest in a real ETL solution, I argued that it is better to run a mixed workload. I talked about the architectural advantages of running mixed workloads in part 2. In a nutshell, doubling the hardware and cutting the data in half.

In this final post, we will talk about new features in SQL Server 2008 and some features that have been around a while that can help with mixed workloads.There are also some bad practices that could be the right answer that we won’t  talk about but let us mention triggers, table valued functions, 20 table outer joins, some correlated subqueries and table variables. These are options but usually not good ones. In the right circumstances, they could be right like an end of year report.

Here are the main tools in your arsenal:

  • Resource governor
  • Covering filtered Indexes
  • Indexed views
  • Partitioning + data compression
  • Persisted computed columns
  • Archiving on the same server
  • After hours denormalization(indexes views with deferred updates)

Resource Governor

Chances are you do not want to limit throughput of your OLTP queries. The resource governor does not do a good job with these queries anyway because their duration is usually so short. However, let’s say you have reports that run by executives. You can put them in a workload group that gives them as much resources as possible without affecting OLTP traffic. You may also have a less important group of reports from the marketing or sales teams that you can limit further. One caveat to the resource governor is it cannot limit, disk IO so if that is your bottleneck, this will not help much.

Covering Filtered indexes

Filtered indexes are a great new feature in SQL Server 2008. When optimizing for reporting queries on your OLTP system, you are probably going to be touching a lot of rows so covering the query is important. For example, the order fulfillment team works off a report of unfulfilled orders that pulls in order data, customer data, shipping data etc. In this case, you would add covering filtering indexes on each of those tables. The filtered indexes reduce write overhead on your OLTP writes and reduces read overhead of your reporting.

Indexed Views

Indexed views take filtered indexed view a step further. You can create indexes on multiple tables. Think of it as denormalization alongside your OLTP optimized schema. In the previous order fulfillment example, we can basically persist that report and have it updated in real time. There is more overhead to your OLTP transactions so weigh the pro’s and con’s. Test if possible.Unfortunately, you cannot defer changes to your indexed views but I believe there is a feature request for this on Connect and I will tell you about a workaround shortly.

Partitioning and Compression

This is the dynamic duo when mixing workloads. Unfortunately, the nitty gritty details would require their own post. For example, one mixed workload may benefit from compression on the hottest partition while the older data should be uncompressed. However, another workload may benefit from the opposite. The key here is really understanding your workload, data and hardware limitations. Most importantly, plan then TEST, TEST, TEST! Once you partition, you loose online operations so if you do it wrong, you are stuck.

 Persisted Computed Columns

This is an easy one. It is a simple trade off. Writes take a little more CPU and space in exchange for reduced CPU time when you report. Take your orders table, for example. You could calculate and save shipping costs when you insert the rows. If it adds a few milliseconds to the insert but shaves seconds off the hourly open orders report that the execs are looking at, it may be an easy decision.

Archival

This might not always be possible depending on your data. It may not be necessary if you have finely tuned indexes. However, it could make a night and day difference. If you need the data, UNION ALL’ing the production table with the archive table has little overhead. I do suggest you keep the archive database on the same server unless it will rarely be accessed. Trying to do this with linked servers is bad.

After hours denormalization

This is basically precreating reports during off hours. Think of it as indexed view with deferred updates. You can UNION with the OLTP tables if you need realtime data in your report. In an ideal world, touching less rows in the OLTP table and then UNIONing with the denormalized data will result in the best of both worlds if you need real time data.

The final word

As the concurrency and size of data scales, both a pseudo reporting database and a mixed work load scenario will not meet business requirements. A business requirement of real time data may dictate a mixed workload. There may be plenty of workloads where scaling out and scaling up both meet performance demands. I just wanted to play devil’s advocate and  let you know there is another option when planning reporting.

email it! |   |   | 

Key Lookup Threshold

Posted by Jason Massie Click to IM Jason Massie on Thursday, February 05, 2009 at 8:05 AM to SQL Server 2008, SQL Server 2005, query optimizer, SQL performance tuning, Indexes
2273 Views | 1 Comments | Article Rating

Gail talks about bookmark lookups…. err.. key lookups in this post. So are they good or bad? Well, like many things in SQL, it depends. The main factor is the number of rows returned. A few rows are fine but the cost rises sharply with larger result sets. There comes a point where the threshold is crossed and a scan is more efficient. This is because a scan leverages sequential IO while a lookup does random IO.

Here are the results of the code at the end of the post.

RowsIO 
Scan422 
1 row4 
5 rows12 
10 rows22 
25 rows52 
50 rows114 
100 rows217 
250 Rows526<--Threshold
500 Rows1043 
1000 Rows4141 

 

As you can see, at 250 rows, we have crossed the threshold and it is cheaper do a scan. If you are passing in a literal, the optimizer can detect this and switch to a scan. If it is a stored proc or parameterized SQL, a plan is cached the first time it is run. Problems happen when the result size greatly varies depending on the parameter. There are ways around this all with their pro’s and con’s. Here are some:

  • A covering index.
  • A statement level recompile hint
  • Plan guides and hints

Here is the sample code that can repro these numbers on SQL Server 2008.

CREATE TABLE #temp
(
id INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
c1 CHAR(1000) DEFAULT( 'blah'),
c2 INT
)
INSERT INTO #temp(c2)
VALUES
(1)
GO 
INSERT INTO #temp(c2)
VALUES
(5)
GO 5
INSERT INTO #temp(c2)
VALUES
(10)
GO 10
INSERT INTO #temp(c2)
VALUES
(25)
GO 25
INSERT INTO #temp(c2)
VALUES
(50)
GO 50
INSERT INTO #temp(c2)
VALUES
(100)
GO 100
INSERT INTO #temp(c2)
VALUES
(250)
GO 250
INSERT INTO #temp(c2)
VALUES
(500)
GO 500
INSERT INTO #temp(c2)
VALUES
(1000)
GO 1000
INSERT INTO #temp(c2)
VALUES
(1000)
GO 1000

CREATE INDEX ix ON #temp(c2)

--The baseline
SET STATISTICS io ON
SELECT
*
FROM #temp WITH (INDEX=1)
WHERE c2 = 1

--1 row returned
SELECT *
FROM #temp
WHERE c2 = 1

--5 rows returned
SELECT *
FROM #temp
WHERE c2 = 5

--10 rows returned
SELECT *
FROM #temp
WHERE c2 = 10

--25 rows returned
SELECT *
FROM #temp
WHERE c2 = 25

--50 rows returned
SELECT *
FROM #temp
WHERE c2 = 50

--100 rows returned
SELECT *
FROM #temp
WHERE c2 = 100

--250 rows returned
--Must begin using hints because the optimizer can tell that a scan is better
SELECT *
FROM #temp WITH (INDEX=ix, forceseek)
WHERE c2 = 250

--1000 rows returned
--Must begin using hints because the optimizer can tell that a scan is better
SELECT *
FROM #temp WITH (INDEX=ix, forceseek)
WHERE c2 = 1000

email it! |   |   | 

Is 80/20 a 90’s Estimate?

Posted by Jason Massie Click to IM Jason Massie on Friday, January 23, 2009 at 3:08 PM to SQL Shorts, SQL Server 2005, Indexes, editorial
1760 Views | 15 Comments | Article Rating

It has often been said that even the most write intensive OLTP databases have an average of 80% reads and 20% writes. I am thinking interactivity has increased in applications because technology has allowed it to get more write intensive but I am just guessing.

Run this on your OLTP Crown Jewels and post the results. While not very scientific, it will give us some anecdotal data. The query could also be useful when sizing new hardware.

I came up with 97/3, 74/26 and 60/40 on three high volume databases.


 

--This query has minimal impact.

--Looks at index metadata to determine read\write ratio since the last restart of the instance.
SELECT  
CAST(SUM(user_seeks+user_scans+user_lookups) AS decimal)/CAST(SUM(user_updates)+SUM(user_seeks+user_scans+user_lookups) AS decimal)
AS ReadPercent
, CAST(SUM(user_updates) AS decimal)/CAST(SUM(user_updates) + SUM(user_seeks + user_scans + user_lookups) AS decimal
AS WriteRatio
FROM sys.dm_db_index_usage_stats


Post your results!

email it! |   |   | 

Updated Unused Index Query

Posted by Jason Massie Click to IM Jason Massie on Tuesday, January 13, 2009 at 5:41 PM to SQL Server 2008, SQL Server 2005, SQL performance tuning, tsql, Indexes
2580 Views | 4 Comments | Article Rating

I have cleaned up and updated my unused index query. It returns a list of indexes ordered by ascending reads. If the instance has been up for a long time, it is probably safe to drop indexes with zero or close to zero reads. If the reads are low and the writes are high, this may help improve your transactions per second count. Otherwise, you are just cleaning up unused space. Use caution though. A missing index is worse than an unused index.

SELECT objectname=OBJECT_NAME(s.OBJECT_ID)
,
indexname=i.name
, i.index_id  
, reads=user_seeks + user_scans + user_lookups  
, writes user_updates  
, p.rows
FROM sys.dm_db_index_usage_stats s JOIN sys.indexes i 
ON i.index_id = s.index_id AND s.OBJECT_ID = i.OBJECT_ID  
JOIN sys.partitions p ON p.index_id = s.index_id AND s.OBJECT_ID = p.OBJECT_ID
WHERE OBJECTPROPERTY(s.OBJECT_ID,'IsUserTable') = 1  
AND s.database_id = DB_ID()  
AND
i.type_desc = 'nonclustered'
AND i.is_primary_key = 0
AND i.is_unique_constraint = 0
AND p.rows > 10000
ORDER BY reads, rows DESC

I will also be putting this query up at the SSP WIKI.

email it! |   |   | 

Finding Index Scans due to Implicit Conversions

Posted by Jason Massie Click to IM Jason Massie on Monday, January 12, 2009 at 7:00 PM to SQL Server 2008, SQL Server 2005, query optimizer, SQL performance tuning, tsql, Indexes, Procedure Cache, DMVs
2451 Views | 8 Comments | Article Rating

It is fairly widely known that implicit conversions of data types can cause scans. However, you may not be able to control what goes in your database, you may have inherited an app and we just are not perfect.

On top of that some can be tricky. For example, did you know that scope_identity and @@identity both return numeric data types. An AVG of a SMALLINT column returns an INT.

Here is a simple repro which the same query does an index scan instead of an index seek

CREATE TABLE t1
(
id INT IDENTITY(1,1) PRIMARY KEY CLUSTERED
, c1 VARCHAR(100)
)

GO

INSERT INTO t1(c1)
SELECT 1
GO 10000

INSERT INTO t1(c1)
SELECT 0
GO 10

CREATE INDEX ix ON t1 (c1)

CREATE PROC p1
@blah NVARCHAR(100)
AS
SELECT
id FROM t1
WHERE c1 = @blah

EXEC p1 N'0'

CREATE PROC p2
@blah VARCHAR(100)
AS
SELECT
id FROM t1
WHERE c1 = @blah

EXEC p2 '0'

DROP TABLE t1
DROP PROC p1
DROP PROC p2

 

I found a sample query from Umachandar Jayachandran and made some mod’s to it. It basically scans your procedure cache for index scans containing implicit conversions.

I have posted it at the SQLServerPedia Wiki here. It is the 2nd query listed.

Note: Implicit conversions may not always result in a scan.

email it! |   |   | 

Never Index a BIT?

Posted by Jason Massie Click to IM Jason Massie on Tuesday, December 09, 2008 at 8:03 AM to SQL Server 2008, SQL performance tuning, Indexes
1994 Views | 4 Comments | Article Rating

Never say never, right? For example, you have an orders table. It has an IsShipped bit column which is what marks an order as done to the business. You might have a more complex version of this example so employees can see the outstanding order count in the app:

CREATE TABLE#temp
   
(
               
c1INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,
               
c2 VARCHAR(10) DEFAULT  'bob',
               
c3 bit
   
)

INSERT INTO#temp(c3)
       
VALUES
   
(1)
GO 5000000

INSERT INTO#temp(c3)
       
VALUES
   
(0)
GO 500 

--13,000 IO's
SET STATISTICS io ON
SELECT
COUNT(*)
   
FROM #temp
   
WHEREc3 = 0

CREATE INDEXix ON#temp(c3) include (c2

--6 IO's 
SET STATISTICS io ON
SELECT
COUNT(*)
   
FROM #temp
   
WHEREc3 = 0
           

In SQL 2008, we can get a little better with filtered indexes but really all we are saving is space.

CREATE INDEXix2 ON#temp(c3) include (c2
   
WHEREc3 = 0

--4 IO's

SET STATISTICS ioON
SELECT
COUNT(*)
   
FROM #temp
   
WHEREc3 = 0

email it! |   |   | 

5 Quick Tips for the Query using the “Wrong” Index

Posted by Jason Massie Click to IM Jason Massie on Thursday, October 30, 2008 at 10:59 PM to SQL Server 2008, SQL Server 2005, query optimizer, Indexes
1537 Views | 1 Comments | Article Rating

I quote “wrong” because with the info the optimizer has, it thinks it has the right index. :)

1. Look for cardinality problems. You can do this by looking at the actual rows versus estimated rows in the query plan. If they are off, chances are you are not following the best practices listed here.

2. Conversions in the WHERE clause can cause issues. Be sure to look at query plan for implicit conversions

3. Complicated logic in the WHERE clause can also cause problems. Consider indexed persisted computed columns or indexed views.

4. Constraints give the optimizer more info when choosing a plan with the “right” index. It can take a PK or unique constraint into consideration but it does not know about similar logic in a trigger.

5. Large datasets are being returned but the index is not covering. An index with just the columns in the WHERE clause is ok if just a few rows are returned but it does not take much for an index scan to be more efficient.

#1 is the most important one because the link gives 5 or 7 more reason why the “wrong” index is chosen.

P.S. If all else fails, update stats and if that does not work, make the index covering.

email it! |   |   | 

Indexed Views vs. Filtered Indexes Part 2

Posted by Jason Massie Click to IM Jason Massie on Monday, August 25, 2008 at 9:21 AM to SQL Server 2008, SQL performance tuning, tsql, Indexes
1274 Views | 1 Comments | Article Rating

image In the first part, we looked at the a very simple example with a single table SELECT using covering indexes. This one is a little more complex. We are still covering but we are joining two tables, ordering and grouping. Filtered indexes do not support joins so we have to create two indexes. In this test, the indexed view wins but filtered indexes come a respectable 2nd place compared to normal indexes.

 

 

The DDL(2.5M in tblitem and 25M in tblitemdetails)

CREATE TABLE [dbo].[tblItem](
 
    [Itemid] [int] IDENTITY(1,1) NOT NULL,
 
    [ItemName] [varchar](100) NULL,
 
    [Datecreated] [datetime] NULL,
 
    [Closed] [bit] NULL,
 
PRIMARY KEY CLUSTERED 
 
(
 
    [Itemid] ASC
 
) ON [PRIMARY]
 
) ON [PRIMARY]
 
GO
 
CREATE TABLE [dbo].[tblItemDetails](
    [ItemDetailsID] [int] IDENTITY(1,1) NOT NULL,
    [ItemID] [int] NULL,
    [Color] [varchar](20) NULL,
    [Size] [varchar](20) NULL,
    [Flavor] [varchar](20) NULL,
PRIMARY KEY CLUSTERED 
(
    [ItemDetailsID] ASC
) ON [PRIMARY]
) ON [PRIMARY]
 
GO
 
SET ANSI_PADDING OFF
GO
 
ALTER TABLE [dbo].[tblItemDetails]  WITH CHECK ADDCONSTRAINT [fk01] FOREIGN KEY([ItemID])
REFERENCES [dbo].[tblItem] ([Itemid])
GO
 
ALTER TABLE [dbo].[tblItemDetails] CHECK CONSTRAINT [fk01]
GO
 
 
--Create normal and filtered indexes
create index ix01 on tblitem(Datecreated, ItemName)  
 
create index ix02 on tblitem(Datecreated, ItemName) where Datecreated > '2007-08-22' and itemid > 0 and itemid < 50 
 
create index ix01 on tblitemdetails(itemid, color)  
 
create index ix02 on tblitemdetails(itemid, color) where itemid > 0 and itemid < 50
 
--Create indexed views
alter view ivw01
 
with schemabinding
 
as 
 
select  a.ItemID, b.Color, COUNT_big(*) as cnt
from dbo.tblItem a join dbo.tblItemDetails b on a.Itemid= b.ItemID 
where Datecreated > convert(datetime, '2007-08-22', 101) and a.itemid > 0 and a.itemid < 50
group by a.ItemID, b.Color
 
create unique clustered index ix01 on ivw01(itemid)
create index ix02 on ivw01(itemid, color) 


Test Queries

set statistics io on
 
--Using a normal covering index
 
select  a.ItemID, b.Color, COUNT_big(*) as cnt
from dbo.tblItem a with (index=ix01) join dbo.tblItemDetails b with (index=ix01) on a.Itemid= b.ItemID 
where Datecreated > convert(datetime, '2007-08-22', 101) and a.itemid > 0 and a.itemid < 50
group by a.ItemID, b.Color
order by cnt
 
 
 
--using a filtered index
 
select  a.ItemID, b.Color, COUNT_big(*) as cnt
from dbo.tblItem a with (index=ix02) join dbo.tblItemDetails b with (index=ix02) on a.Itemid= b.ItemID 
where Datecreated > convert(datetime, '2007-08-22', 101) and a.itemid > 0 and a.itemid < 50
group by a.ItemID, b.Color
order by cnt
 
 
--using an indexed view
 
select * from ivw01
order by cnt

IO numbers:

Normal covering indexes

(8 row(s) affected)
Table 'tblItemDetails'. Scan count 8, logical reads 24, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblItem'. Scan count 1, logical reads 991, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Filtered covering indexes

(8 row(s) affected)
Table 'tblItem'. Scan count 1, logical reads 99, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'tblItemDetails'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Indexed View

(8 row(s) affected)
Table 'ivw01'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Note: By increasing the result set 100 fold(itemid > 0 and itemid < 5000), we see very similar results except with merge joins.

 

Query Plans:

The query plans are different. Most notably being the scans for the filtered index and indexed view vs. the normal index. This is fine though. They as just a fraction of the size of the normal index.

image

 

Conclusion

The mere fact that the indexed view runs with 50 times less IO overhead does not make it the automatic choice. What if you are on standard edition? Filtered indexes work on standard edition. Also we filtered on a date column. What if you need a real time rolling total? You have to use a normal index. Know you tools and then test, test and test some more.

email it! |   |   | 

Filtered Indexes VS. Indexed Views

Posted by Jason Massie Click to IM Jason Massie on Thursday, August 21, 2008 at 10:29 PM to SQL Server 2008, Indexes
1517 Views | 1 Comments | Article Rating

Note: This is a simple example and YMMV. I may test queries with lookups, hashes, aggregations etc later. Let me know if you have an interesting test case.

I decided to run a quick test comparing covering filtered indexes vs. covering indexed views vs. a normal covering index. Read performance-wise, it is a statistical draw.

The DDL and queries

USE [demodb]
GO
 
 
CREATE TABLE [dbo].[tblItem](
    [Itemid] [int] IDENTITY(1,1) NOT NULL,
    [ItemName] [varchar](100) NULL,
    [Datecreated] [datetime] NULL,
    [Closed] [bit] NULL,
PRIMARY KEY CLUSTERED 
(
    [Itemid] ASC
) ON [PRIMARY]
) ON [PRIMARY]
 
GO
 
create index ix01 on tblitem(Datecreated, ItemName)  
create index ix02 on tblitem(Datecreated, ItemName) where Datecreated > '2007-08-22'
 
go
 
create view ivw01
with schemabinding
as 
select ItemName, Itemid from dbo.tblItem
where Datecreated > convert(datetime, '2007-08-22', 101)
 
create unique clustered index ix01 on ivw01(itemid)
create index ix02 on ivw01(ItemName) 
 
set statistics io on
 
--Using a normal covering index
select ItemName, Itemid from tblItem with (index=ix01)
where Datecreated > convert(datetime, '2007-08-22', 101)
 
--using a filtered index
select ItemName, Itemid from tblItem with (index=ix02)
where Datecreated > convert(datetime, '2007-08-22', 101)
 
--using an indexed view
select * from ivw01

The graphical query plans

There are no surprises in the query plan. It does a scan on the filtered index but that doesn't matter since it is ...err... filtered.

image

Statistics IO Output

Normal covering index:

(346129 row(s) affected)
Table 'tblItem'. Scan count 1, logical reads 991, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Filtered Index:

(346129 row(s) affected)
Table 'tblItem'. Scan count 1, logical reads 990, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Indexed view:

(346129 row(s) affected)
Table 'tblItem'. Scan count 1, logical reads 991, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Other Considerations

Some other things to consider is the fact that the filtered index is a fraction of the size of the normal index. The indexed views are an enterprise feature but filtered indexes are not. The optimizer may have a hard time choosing a filtered index but these problems are not present with indexed views. So the right choice may vary with different apps even if the schema is the same.

email it! |   |   | 

NTSSUG June Presentation

Posted by Jason Massie Click to IM Jason Massie on Thursday, July 17, 2008 at 2:37 PM to query optimizer, SQL performance tuning, Indexes, PASS, Data collection
1196 Views | 0 Comments | Article Rating

Here are the slides and scripts from my June presentation at the North Texas SQL Server Users Group. The presentation was on the Data Collector in SQL Server 2008 but it was short so I also touched on filtered indexes, query hashes and troubleshooting. Enjoy and let me know if you have any questions.

Also Peter Debetta is presenting TONIGHT at 6:30 on the following topics:

"We'll be discussing the various ways you can secure your SQL Server data via encryption, including the new Transparent Data Encryption feature of SQL Server 2008."

 Stop by.

email it! |   |   | 

Stupid XML SSMS Trick

Posted by Jason Massie Click to IM Jason Massie on Tuesday, July 15, 2008 at 5:40 PM to SQL Server 2008, SQL Server 2005, tsql, Indexes
1377 Views | 2 Comments | Article Rating

This might be HAWTALTA for some of you. I just thought of doing this today even though I have been using xml showplan since I was knee high to a junebug. You can make query results clickable just by casting them as XML. For example:

select cast('select 1' as xml)

By clicking the resulting "select 1", a new query editor window opens with "select 1" ready to execute. This would be useful for scripts that generate scripts. Here is a practical example. (Forgive the formatting. I cannot get too wide in the post.)

SELECT  sys.objects.name
, (avg_total_user_cost * avg_user_impact) * (user_seeks + user_scans) as Impact
,  cast('CREATE NONCLUSTERED INDEX ~NewNameHere~ ON ' 
+ sys.objects.name + ' ( ' + mid.equality_columns + 
CASE WHEN mid.inequality_columns IS NULL THEN '' 
ELSE CASE WHEN mid.equality_columns IS NULL 
THEN '' ELSE ',' END + mid.inequality_columns END + ' ) ' 
+ CASE WHEN mid.included_columns IS NULL 
THEN '' ELSE 'INCLUDE (' + mid.included_columns + ')' 
END + ';' as xml) AS CreateIndexStatement
, mid.equality_columns
, mid.inequality_columns
, mid.included_columns
FROM sys.dm_db_missing_index_group_stats AS migs 
INNER JOIN sys.dm_db_missing_index_groups AS mig 
ON migs.group_handle = mig.index_group_handle 
INNER JOIN sys.dm_db_missing_index_details AS mid 
ON mig.index_handle = mid.index_handle 
INNER JOIN sys.objects WITH (nolock) 
ON mid.object_id = sys.objects.object_id
WHERE     (migs.group_handle IN
(SELECT TOP (500) group_handle
FROM  sys.dm_db_missing_index_group_stats WITH (nolock)
ORDER BY (avg_total_user_cost * avg_user_impact) * (user_seeks + user_scans) DESC))
ORDER BY 2 DESC 

 

Here are what some of the results look like.

results

 

Update : You also have to associate the XML file extension with the SQL editor to be executable.

 extension

email it! |   |   | 

Reminder: SSWUG Virtual Conference Starts Tomorrow

Posted by Jason Massie Click to IM Jason Massie on Monday, June 23, 2008 at 2:35 PM to SQL Server 2008, SQL performance tuning, Indexes, SSRS
878 Views | 0 Comments | Article Rating

Just a quick reminder that the SSWUG virtual conference starts tomorrow. I hope you can attend. They have given me a discount code to distribute: VIPJM2008DIS I do not get any extra cha-ching from you using it. I just state that so this post doesn't sound spammy.

Actually hearing my recorded voice makes me cringe so this will definitely be interesting for me. :)

 

Here are the sessions I am doing:

Filtered Indexes and Statistics in SQL 2008

SQL Reporting Services for the DBA

SQL Profiler: Configuration, Analysis, and SQL Server 2008 Enhancements

http://vconferenceonline.com/sswug/demo.asp

email it! |   |   | 

Attn DFW SQL'ers

Posted by Jason Massie Click to IM Jason Massie on Wednesday, June 18, 2008 at 2:58 PM to SQL Server 2008, SQL performance tuning, Indexes, PASS, Data collection, RC0
1215 Views | 0 Comments | Article Rating

Reminder: You can catch me at the North Texas SQL Server User's Group tomorrow night. The Data Collection presentation that I put together may be short so I have a few backup slides and demo's on filtered indexes, queries hashes and the 2 second SQL Server 2008 upgrade. Directions and details can be found at http://northtexas.sqlpass.org

Hope to see you there.

email it! |   |   | 

SSWUG Virtual Conference Discount Code

Posted by Jason Massie Click to IM Jason Massie on Thursday, June 12, 2008 at 6:49 PM to SQL Server 2008, Indexes, SSRS, Professional Development
866 Views | 0 Comments | Article Rating

I just got a discount code for the SSWUG conference. The code is: VIPJM2008DIS . It is good for 10% off. I am not a sales person so watch the commercial. Hope to "see" you there.

 

email it! |   |   | 

SQL Server goes GREEN.

Posted by Jason Massie Click to IM Jason Massie on Thursday, April 10, 2008 at 10:42 AM to Indexes, Offtopic
628 Views | 0 Comments | Article Rating

Greg Linwood filed a connect with MSFT to remove nonclustered indexes (NCI) from the backups of SQL Server. The idea being less backup overhead and less power. That is a good idea. :) Check out his blog and then vote at the connect site.

email it! |   |   | 

RE: Filtered indexes not compatible plan reuse

Posted by Jason Massie Click to IM Jason Massie on Friday, March 21, 2008 at 11:40 PM to SQL Server 2008, query optimizer, Indexes, CTP6, Procedure Cache
1732 Views | 1 Comments | Article Rating

Before the original post, I had submitted a connect item for something similar.

It turns out that filtered indexes may not be used when auto-parameterization(or for stored proc) occurs without a recompile hint. This is sort of understandable but I can think of workarounds and I suspect additional logic will be added in the future as mentioned below. Here is the feedback from a connect item I submitted.

"

Thanks for your feedback. The query in question, select lastname from Person.Contact where ContactID <=(100-20) is auto-parameterized by SQL Server into the following form, using the standard auto-parameterization rules: select lastname from Person.Contact where ContactID <=(@1-@2) These rules are designed to minimize compile time cost for simple queries like this, possibly at the expense of additional optimizations such as using a filtered index. Admittedly, it is a difficult tradeoff. In a future release, we'll consider extending the design to make a better decision in a cost-based way. For now, this behavior is by design.

"

alt head: You can't have your cake and eat it too.

email it! |   |   | 

SQL shorts

Posted by Jason Massie Click to IM Jason Massie on Friday, March 21, 2008 at 1:27 PM to SQL Server 2008, SQL Shorts, SQL Server 2005, Windows Server 2008, tsql, Indexes, Offtopic, Virtualization, CTP6
2395 Views | 1 Comments | Article Rating

We regret to inform you of a passing in the SQL community.   Read more here.

Rick Heiges has a post on my favorite new feature in SQL Server 2008, filtered indexes

So does Decipherinfosys along with a good description on the difference of indexes and statistics if you need some background.

SQLBlogcasts has gotten an upgrade and Tony posts some great stats. Congrats!

Not SQL per say but the Hyper-V release candidate has been released. Speaking of Hyper-V, Sriram posts his slide decks on virtualizing SQL. Part 1 and Part 2.

This is a great starting point for SQL Server 2008 as is this. These come by way of the MSDN\Technet update blog.

The SQL Server 2005 sp3 debate continues.

The SQL ISV team posts a performance improving cursor rewrite sample. However, it is not ANSI compliant :) which is odd since most ISV tsql code needs to be portable.

Paul Nielson will be releasing a DVD.

While we are at it, check out the new SQL Server social network.

alt head: Got ADD?

email it! |   |   | 

Filtered indexes not compatible plan reuse??

Posted by Click to IM Jason Massie on Sunday, March 16, 2008 at 5:31 AM to SQL Server 2008, query optimizer, Indexes, CTP6, Procedure Cache
2437 Views | 0 Comments | Article Rating

So I was messing around with filtered indexes tonight. Yes, beta software is my idea of a wild Saturday night. Hey, I played some Smash Bros first. ;) I was trying to figure out what happens if a plan is created using a covering but filtered index and try to reuse it with a parameter that is covered vertically but not horizontally. Let's look at this example: 

create database dbtest01;

--Force parameterization

alter database dbtest01 set parameterization forced;

 

use dbtest01

--create test table 1

create table t1

(id int identity(1,1),

c1 int,

c2 int)

 

--insert dummy data

declare @ctr int = 2000000

while @ctr > 0

begin

insert into t1(c1, c2) values (@ctr, @ctr)

select @ctr=@ctr-1

end

 

 

--create clustered indexes

create clustered index CI on t1(id)

 

 

--Let create a covering indexes that is pretty extreme.

create index ix01 on t1(c1) where c1 = 1990001

 

 

--Test queries

--ctrl + m

dbcc freeproccache

set statistics io on

 

--Does a CI scan.

select c1 from t1 

where c1 = 1990001

 

--Nope, index hints returns an error

--Query processor could not produce a query plan because of the hints defined in this query. Resubmit the query without specifying any hints and without using SET FORCEPLAN.

select c1 from t1  with (index=ix01)

where c1 = 1990001

 

--Ahh, recompile works. Now it uses the filtered index

select c1 from t1 

where c1 = 1990001

option(recompile)

 

 

--What about simple parameterization and a stored proc?

alter database dbtest01 set parameterization simple;

 

create proc proc1

@p1 int

as

select c1 from t1 

where c1 = @p1

 

--It does the CI scan

exec proc1 1990001

 

 

alter proc proc1

@p1 int

as

select c1 from t1 

where c1 = @p1

option(recompile)

 

--It uses the filtered index :)

exec proc1 1990001

email it! |   |   | 

SQL Server 2008 filtered indexes in 5 minutes

Posted by Jason Massie Click to IM Jason Massie on Saturday, February 23, 2008 at 11:11 PM to SQL Server 2008, SQL performance tuning, tsql, Indexes, CTP6, In 5 minutes
3679 Views | 1 Comments | Article Rating

My jaw literally dropped when I saw it. If this works as advertised, it has the potential of changing everything. The days of over indexing will be over. Dynamic indexing off of the missing and unused index DMV's could be possible especially with added support. I also think this will better accomplish what people tried to do with partitioning for performance reasons in SQL Server 2005.

It will be really interesting to see how it gets applied in production and where this leads in the next versions. Maybe an autoindex checkbox will replace the DBA :)

This is a small and quick example. We will have to see how it scales to larger environments.

 

use adventureworks

--Let's nullify a random 400

update  Production.WorkOrder

set EndDate = null

where WorkOrderID in (select top 400 WorkOrderID from Production.WorkOrder where enddate is not null)


--Let's give a random 200 a recent date to mimic prod data

update top (200) Production.WorkOrder

set duedate = getdate()-10

where enddate is null

--Base query
--This is query type that should be simple yet common
--Let's get open WO's that will be due soon

set statistics io on

select p.Name, wo.OrderQty, wo.DueDate

from Production.WorkOrder wo JOIN Production.Product p on wo.ProductID=p.ProductID

where wo.EndDate is null and wo.DueDate >= '2007-11-24'

 --CI Scan with loop join

--Table 'WorkOrder'. Scan count 1, logical reads 530, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
--Now let's create a covering index

create index ix01 on Production.WorkOrder(EndDate,DueDate, ProductID) include (OrderQty)

--Run the base query

set statistics io on
select
p.Name, wo.OrderQty, wo.DueDate
from
Production.WorkOrder wo JOIN Production.Product p on wo.ProductID=p.ProductID
where
wo.EndDate is null and wo.DueDate >= '2007-11-24'

--Does NCI seek

--Table 'WorkOrder'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

 

--Clean up

drop index [Production].[WorkOrder].[ix01]

 

--Create Filter index

create index ix01 on Production.WorkOrder(EndDate,DueDate, ProductID) include (OrderQty) where EndDate is null and DueDate >= '2007-11-24'

--Run base query

set statistics io on
select
p.Name, wo.OrderQty, wo.DueDate

from Production.WorkOrder wo JOIN Production.Product p on wo.ProductID=p.ProductID

where wo.EndDate is null and wo.DueDate >= '2007-11-24'

 

--It uses it. We have a real small sample but it performs better

--Table 'WorkOrder'. Scan count 1, logical reads 2, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
--Now let's look at size
--Create unfiltered index for comparison

create index ix02 on Production.WorkOrder(EndDate,DueDate, ProductID) include (OrderQty)

--note the new syntax

declare @dbid int = db_id()

declare @objid int = object_id('[Production].[WorkOrder]')

select * from sys.dm_db_index_physical_stats(@dbid, @objid, null, null, 'detailed') ps join sys.indexes i

on ps.object_id=i.object_id and ps.index_id=i.index_id

and i.name in ('ix01', 'ix02') and i.type_desc='NONCLUSTERED'

--We are looking at 2 page vs. 301 pages

That is a huge difference in size. The major point is not the fact that we retrieve the same while taking up so much less space on disk but so much less space in memory as well. We are going to get into this much more!

 

 Technorati Tags: ,,

email it! |   |   | 

The problem with local variables

Posted by Jason Massie Click to IM Jason Massie on Friday, January 25, 2008 at 9:32 AM to SQL Server 2005, query optimizer, SQL performance tuning, Indexes
1235 Views | 0 Comments | Article Rating

Have you ever been writing a query and just cannot get it to use the right index. This could be one of the reasons why. Let's use this query with local variables as our example.

declare @Start datetime
declare @End datetime
select @Start = '2004-08-01 00:00:00.000'
select @End = '2004-07-28 00:00:00.000'
select ProductID from sales.SalesOrderDetail where ModifiedDate >= @End and ModifiedDate <= @Start
It generates this plan:

SQL Server Clustered index scan

but we have an index on ModifiedDate. There are many reason why SQL would not use this index but, for this post, we will assume we have eliminated them. Finally, we hard code the dates and we get this plan.

SQL Server index seek with bookmark lookup

So why is it doing this? The reason is because the query optimizer cannot accurately use the statistics to estimate how many rows are returned with local variables. Let's look at how we can tell there is a problem with the cardinality estimates. In the query with the local variables, the optimizer thinks we are getting 10918.5 rows so we do the index scan. In the query with hard coded literals, the estimated rows and actual rows are the same and accurate.

SQL Server Cardinality underestimation

How can we fix this? There are several way. This is another situation that makes a case for stored procedures or parameterized queries.

create proc pDemo01 @Start datetime, @End datetime
as
select ProductID from sales.SalesOrderDetail
where ModifiedDate >= @End and ModifiedDate <= @Start

The stored proc generates the proper plan. However, you will run into the same problem if you modify the parameter within the stored proc like select @start = @start-90. In this case, to should use sp_executeSQL. What if you cannot use a stored proc because it is a 3rd party app or some other reason? A covering index is probably the solution. Once we create this index, it will always be used:

create index ix01 on sales.SalesOrderDetail(ModifiedDate) include (ProductID)

We could use a plan guide or an index hint with a forceseek(SQL 2008) but performance will be really bad when we really do need to get 10k rows. The same problem can happen with stored proc's but that is another post.

To get deeper into this subject, check out this.

email it! |   |   | 

The forceseek hint in SQL Server 2008

Posted by Jason Massie Click to IM Jason Massie on Thursday, December 06, 2007 at 10:38 PM to Windows Server 2008, query optimizer, SQL performance tuning, tsql, Indexes
1284 Views | 0 Comments | Article Rating

Have you every had a query that should totally be using an existing index and doing a seek but it keeps doing a clustered index scan? In SQL Server 2008, you can force it to do a seek. That may or may not be a good thing. Let's take a look at an AdventureWorks example.

--logical reads 1246

select ProductID, OrderQty from sales.SalesOrderDetail

where ProductID > 776

This query generates this plan.


Now let's add the hint.

--logical reads 271264

select ProductID, OrderQty from sales.SalesOrderDetail with (forceseek)

where ProductID > 776


This query generates this plan.

The second query plan contains the index seek but logical reads are 20 times more. This is because it has to do lookup for OrderQty. Obviously using the hint would be a bad thing.

So when would be a good time to use this hint? Let's look at this query
create proc p1
@p int
as
select c1, c2 from t1 where c3 = @p

t1 has 1 million rows
99% of c3 =1
.5% of c3 = 2
.5% of c3 = 3
99% of the queries run with @p = 2 or 3

The problem is if the query compiles with c3 = 1. The optimizer will choose a CI scan. In SQL 2005, we could recompile each time at the statement level or we could use an optimize for hint. Now we have another tool in the bag of tricks.

Use with caution.

email it! |   |   | 

Take it a step further with the unused and missing index DMV's

Posted by Jason Massie Click to IM Jason Massie on Sunday, November 25, 2007 at 10:33 AM to SQL Server 2005, query optimizer, SQL performance tuning, tsql, memory bottleneck, Indexes
3589 Views | 0 Comments | Article Rating

In SQL 2000 I wanted to find unused indexes but found it to be difficult. I had to capture a huge trace that fully represents the workload and “trust” that the ITW knows what it’s talking about. You could also use the scan started trace event filtered by dbid, objectid, indexid. This procedure was very tedious.

In SS2005, a quick query of the dynamic management views lets you know which indexes are not being used. You can do this in a few minutes what had previously taken days or weeks. This alone is a very powerful feature.

I am proposing that you take it a step further. We are going to use a little logic and the missing index DMV’s to combine indexes and remove indexes that are still used but redundant. The steps in this process would look like this:

1.       Remove unused indexes with the unused index script

2.       Get your list of tables to analyze.

3.       Remove redundant but used indexes.

4.       Revue missing index DMV’s for mistaken index drops.

5.       Combine indexes that where it is logical to do so.

6.       Revue missing index DMV’s for mistaken index drops.

This methodology is most effective and viable when these conditions are met:

·         The server has been online and thus collecting stats for a long time.

·         The server is not pushing a hardware bottleneck. If so, this should be done during maint window.

·         The server is enterprise edition and the tables allow online operations (i.e. no LOB data or partitions).

What are the benefits of removing unused indexes?

·         Reduced writes during updates

·         Reduced space usage

·         Reduced backup\restore space and time

·         Reduced index maintenance time

What are the benefits of removing unused indexes, used but redundant indexes and combining indexes?

·         Less memory footprint for the same amount of data

·         A different angle to index tuning

·         Indexes are more likely to be covering

·         Reduced writes during updates

·         Reduced space usage

·         Reduced backup\restore space and time

·         Reduced index maintenance time

Now, let’s get down to business…

 

Remove Unused Indexes

This is pretty straightforward and relatively safe. You want to make sure that your server has been up long enough to get good index usage. This includes infrequent operations like month end reporting, etc.

--Unused indexes

declare @dbid int 

select @dbid = db_id() 

 

select object_name(s.object_id) as ObjName

, i.name as IndName

, i.index_id 

, user_seeks + user_scans + user_lookups  as reads

, user_updates as writes 

, sum(p.rows) as rows

from sys.dm_db_index_usage_stats s join sys.indexes i on s.object_id = i.object_id  and i.index_id = s.index_id 

join sys.partitions p on s.object_id = p.object_id and p.index_id= s.index_id

where objectproperty(s.object_id,'IsUserTable') = 1 and s.index_id> 0 and s.database_id = @dbid 

group by object_name(s.object_id), i.name, i.index_id, user_seeks+ user_scans + user_lookups, user_updates

order by reads, writes desc 

 

You can most likely drop any indexes with zero or close to zero reads. The more rows, the more space you will reclaim. The more writes, the better write performance you get by removing them. If the index is not being read but writes are minimal and rows are low, there is little benefit of dropping it. Keep in mind that not having an index that you need is a lot worse than having an index you don’t need in most situations. I suggest starting off with a conservative approach.

Now that we have gotten rid of the low hanging fruit, we can get deeper. I get a list of tables by size and work my way down.

select object_name(object_id), max(rows)

from sys.partitions

group by object_name(object_id)

order by 2 desc

After running that query we see that tEvent is the largest table and should provide nice gains. This is where we start. Let’s use this schema for our example.

create table tEvent

(EventID int primary key clustered,

EventType int,

EventName varchar(100),

EventDetailID int,

CustomerID int,

CompanyID int,

DateOpen datetime,

DateClose datetime)

 

create index ind1 on tEvents(EventDetailID);

 

create index ind2 on tEvents(CustomerID);

 

create index ind3 on tEvents(EventDetailID, CustomerID, CompanyID);

 

create index ind4 on tEvents(CompanyID, DateClose, EventType);

 

create index ind5 on tEvents(CustomerID, DateClose, EventType);

Removing used but redundant indexes

If we do a sp_helpindex, we can see that ind3 should satisfy queries currently using ind1 and ind2. The keyword is “should”. It’s a fairly safe bet so we drop ind1 and ind2. Now we should monitor the missing index DMV to see if there is any negative impact. We should also see the read count substantially increase by using the unused index query filtered by tEvent. Here is the missing index monitoring query:

--Missing indexes

SELECT sys.objects.name

, (avg_total_user_cost * avg_user_impact) * (user_seeks + user_scans) as Impact

, mid.equality_columns

, mid.inequality_columns

, mid.included_columns

FROM sys.dm_db_missing_index_group_stats AS migs

INNER JOIN sys.dm_db_missing_index_groups AS mig ON migs.group_handle = mig.index_group_handle

INNER JOIN sys.dm_db_missing_index_details AS mid ON mig.index_handle = mid.index_handle

INNER JOIN sys.objects WITH (nolock) ON mid.object_id = sys.objects.object_id

WHERE     (migs.group_handleIN

(

SELECT     TOP (5000) group_handle

FROM sys.dm_db_missing_index_group_stats WITH (nolock)

ORDER BY (avg_total_user_cost * avg_user_impact) * (user_seeks + user_scans) DESC)

)

and objectproperty(sys.objects.object_id, 'isusertable')=1and name = 'tEvent'

ORDER BY 2 DESC , 3 desc

As long as the indexes do not come back up in this report, we should be ok. Now we are servicing the same queries with 2 less indexes. Those are index pages that are no longer taking up buffer pool space!

Combining indexes

The same concept can be applied to the following scenario but a little more “feel” and understanding of how the app accesses the data is needed.

create index ind4 on tEvents(CompanyID, DateClose, EventType);

 

create index ind5 on tEvents(CustomerID, DateClose, EventType);

We know a former DBA added these. It is logical for us, based on our knowledge of the app and the cardinality of the data, to try to replace these indexes with this one.

create index ind6 on tEvents(CustomerID, CompanyID,  DateClose, EventType) with (online=on, maxdop=8);

This index should satisfy all queries using both indexes. Again, we want go back to the missing index report to see if SQL thinks it needs one of those indexes.

Now work your way down your list of tables by row count. I was able to reduce the size of a 200GB database by 15% in addition to removing the unused indexes. That’s a big gain! The only problem I ran into was changing the name of an index that had a hint. There were a few indexes that I had to add back but since my approach was conservative it was nothing drastic and completely online.

Let me know if you have any other tips or questions regarding this topic.

email it! |   |   | 

Page 1 of 1First   Previous   Next   Last   


Copyright 2006 by Statistics IO, My SQL Server Blog