Claude Code and MSSQL MCP Server – Part 1

(DBA fuel)

In this new series of blog posts, we are going to look into hooking Claude Code into the MSSQL MCP Server. I am going to leave most of the conclusions up to the audience; however, I will do a summary at the end. A small forewarning: this is a screenshot-heavy post, and direct screenshots of Claude Code do not ‘read’ nicely. This series is inspired by Oli Flindall’s last few posts on Claude. In addition, due to the ever-increasing advancement of the models, this post will likely age poorly.

I have been using Claude Code since practically day one, and it has transformed the way I think about doing anything technical on my side projects. I thought it would be cool to see what Claude can do via this new MCP server, as it comes.

I am using Claude Code on a Windows 11 environment under WSL. Assuming you also have Claude Code installed, setting up the MCP is fairly straightforward. Just navigate to your project folder of choice in something like VS Code and follow the instructions on the MCP page. Currently, Claude Code via the MCP is only pointing to a small VM with nothing important or large on it.

For all prompts, unless otherwise explicitly stated, Sonnet will be used (on the Max plan). Opus 4 has been erratic the past few weeks, and in my own experience, Sonnet has given me more stable results. I will do future posts on the recent Opus 4.1 (as of August 2025) as and when it proves it can do as I ask and not go off-piste. In addition, Claude will not be able to execute any installed stored procedures, so it will have to find its way through the prompts.

Initial Setup

The out-of-the-box experience may not be what you want, as it will try and use Microsoft Entra authentication unless you make some changes. In my case, I asked Claude Code to re-jig things to facilitate good old SQL login authentication (I couldn’t get the workaround to work). Once this is done, to add the SQL Server MCP to Claude Code, run something akin to this:

claude mcp add mssql-server \
    -e SERVER_NAME="sksqs25dveh_dv" \
    -e DATABASE_NAME="master" \
    -e SQL_USERNAME="sa" \
    -e SQL_PASSWORD="********" \
    -e READONLY="false" \
    -e TRUST_SERVER_CERTIFICATE="true" \
    -- node /mnt/c/coding/SQL_MCP/SQL-AI-samples/MssqlMcp/Node/dist/index.js

Yes, I am using ‘sa’….

Once this is done, we can check if it is present by typing /mcp in Claude Code:

Looks good. Sweet! Job done as far as plumbing this in.

Basic Commands

To test the waters, let’s ask Claude to list the databases on SKSQS25DV\EH_DV:

Cool.

Next, let’s ask Claude to list the size of StackOverflow2010:

This is correct, but I would not describe it as a substantial database. Nitpicks, but just saying.
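For reference, a request like this typically resolves to a query against sys.master_files. This is my own sketch of an equivalent, not necessarily the exact query the MCP tool ran:

```sql
-- Hypothetical equivalent of "list the size of StackOverflow2010"
SELECT DB_NAME(database_id) AS database_name,
       type_desc,
       name AS logical_file_name,
       size * 8 / 1024 AS size_mb   -- size is reported in 8 KB pages
FROM sys.master_files
WHERE database_id = DB_ID(N'StackOverflow2010');
```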

More complex prompts

Let’s ask Claude what he thinks of the storage performance of the SQL Server:

It’s at this point that I will start splitting up the output into small excerpts, as it will make for painful reading otherwise…

These numbers look intuitively correct, as it is indeed a fast NVMe device hosting the data and log files. It has provided some unnecessary extra insight, suggesting that the tpcc_2025 database is used for TPC-C benchmarking. This is correct.
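For context, the usual way to pull these latency figures out of SQL Server is sys.dm_io_virtual_file_stats; a hedged sketch of what such a check looks like (my own, not Claude's exact query):

```sql
-- Average read/write latency per database file (cumulative since instance startup)
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS avg_read_latency_ms,
       vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_latency_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON vfs.database_id = mf.database_id
 AND vfs.file_id = mf.file_id;
```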

Next, I asked it to look for the most CPU-intensive query in the plan cache. I forgot to screenshot the prompt (doh), but I did not ask it for either the total or average. I just let it run with it:

The query it picked out was indeed the most resource-intensive individual (total CPU) query and is part of the New Orders mechanism used by HammerDB. Unfortunately, I made a schoolboy error and did not stop my running workload, so the numbers are not quite aligned in my sp_blitzcache output, but you get the idea:
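For anyone following along, the sp_BlitzCache comparison came from a call akin to this (parameter values are illustrative):

```sql
-- First Responder Kit: plan cache sorted by total CPU
EXEC dbo.sp_BlitzCache @SortOrder = 'cpu', @Top = 10;
```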

Let’s see what it produces when I ask it to fetch missing indexes and combine them into fewer:

Here is the result:

Note the ‘2.5M’ improvement measure (more on this in a while). I have seen these derived scores/improvement ratings somewhere else…

I did not test any of the recommendations, as this is just to show you what it produces at stock settings. I still think it is pretty cool. Anything that can, broadly speaking, point you vaguely in the right direction is a good start.

Even more complex prompt

I decided to low-ball Claude by asking it to look through the XML of the execution plans for any anti-patterns in the top 10 queries (didn’t specify by what category):
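For reference, sifting through plan XML like this boils down to XQuery against sys.dm_exec_query_plan. A minimal sketch of the idea (hunting for clustered index scans, as one example; this is my illustration, not what Claude actually ran):

```sql
-- Flag the top CPU consumers whose cached plan XML contains a clustered index scan
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT TOP (10)
       qs.total_worker_time,
       qs.execution_count,
       qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE qp.query_plan.exist('//RelOp[@PhysicalOp="Clustered Index Scan"]') = 1
ORDER BY qs.total_worker_time DESC;
```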

Here is the full output:

# SQL Server Execution Plan Anti-Pattern Analysis
*Analysis Date: August 7, 2025*
*Server: SKSQS25DVEH_DV*

## Executive Summary

This document presents an analysis of the top 10 CPU-intensive queries from the SQL Server plan cache, focusing on identifying performance anti-patterns and optimization opportunities. The analysis reveals several critical issues primarily in TPC-C benchmark workloads running on the `tpcc_2025` database.

## Query Performance Overview

| Rank | Total CPU Time (μs) | Execution Count | Avg CPU (μs) | Database | Primary Issue |
|------|--------------------|-----------------|--------------|-----------|--------------| 
| 1 | 2,341,464,328 | 255,352,163 | 9 | tpcc_2025 | Complex UPDATE logic |
| 2 | 1,697,520,420 | 15,324,439 | 111 | tpcc_2025 | High-frequency operations |
| 3 | 719,596,500 | 255,352,164 | 3 | tpcc_2025 | Frequent lookups |
| 7 | 350,445,448 | 1,532,483 | 228 | tpcc_2025 | Complex subqueries |

## Detailed Anti-Pattern Analysis

### 🔴 Critical Issues (High Impact)

#### 1. Complex Stock Update Logic - `[dbo].[neword]` Procedure
**Statistics:**
- **Total CPU Time:** 2,341,464,328 microseconds (~39 minutes cumulative)
- **Executions:** 255,352,163
- **Average CPU:** 9 microseconds per execution

**Anti-Pattern:** Complex CASE expression in UPDATE statement with conditional inventory logic
```sql
UPDATE dbo.stock
SET s_quantity = s_quantity - @no_ol_quantity + 
    CASE WHEN (s_quantity > @no_ol_quantity) THEN 0 ELSE 91 END,
    -- Complex CASE logic for distribution info
    @no_ol_dist_info = CASE @no_d_id WHEN 1 THEN s_dist_01 ... END
```

**Impact:** Despite low per-execution cost, massive execution volume creates significant cumulative CPU load.

**Recommendations:**
- Simplify conditional logic in UPDATE statements
- Consider breaking complex CASE expressions into multiple simpler operations
- Review if inventory replenishment logic (adding 91) can be optimized
- Consider using computed columns for frequently accessed CASE expressions

#### 2. Serializable Isolation Level Usage - `[dbo].[ostat]` Procedure
**Statistics:**
- **Total CPU Time:** 350,445,448 microseconds
- **Executions:** 1,532,483
- **Average Duration:** High variance with serializable operations

**Anti-Pattern:** Excessive use of SERIALIZABLE isolation level causing blocking and contention.

**Impact:** Creates unnecessary blocking and reduces concurrency in OLTP workload.

**Recommendations:**
- Reduce isolation level to READ COMMITTED or REPEATABLE READ where possible
- Use snapshot isolation for read-heavy operations
- Implement application-level optimistic concurrency control

#### 3. Double-Sorting Customer Lookup
**Statistics:**
- **Complex Query Pattern:** TOP 50 PERCENT with subsequent ORDER BY DESC

**Anti-Pattern:** Inefficient customer name lookup using double sorting logic:
```sql
-- Inefficient pattern
SELECT TOP 50 PERCENT ... ORDER BY c_last ASC
-- Then ORDER BY DESC in outer query
```

**Impact:** Causes unnecessary CPU overhead for customer searches.

**Recommendations:**
- Rewrite query to use single sorting operation
- Create composite index: `(c_last, c_w_id, c_d_id, c_first)`
- Consider using ROW_NUMBER() with proper indexing instead of TOP 50%

### 🟡 Medium Impact Issues

#### 4. High-Frequency Single-Row Operations
**Pattern:** Repeated individual lookups instead of batch operations
- Item table lookups: 255M+ executions
- Customer table lookups: 25M+ executions

**Recommendations:**
- Implement query batching where possible
- Use table-valued parameters for bulk operations
- Consider caching frequently accessed reference data

#### 5. Missing Index Opportunities
**Affected Tables:** Customer, Item, Stock tables in TPC-C workload

**Recommendations:**
- Add covering indexes for frequently queried columns
- Consider partitioning for high-volume tables
- Implement filtered indexes for common query patterns

## Database-Specific Findings

### tpcc_2025 Database
- **Workload Type:** TPC-C OLTP benchmark
- **Primary Issues:** High-frequency operations with minor inefficiencies compound into major performance impacts
- **Key Procedures:** `neword`, `ostat` handling millions of executions

### StackOverflow2010 Database
- **Lower Activity:** Only one query (#9) in top 10
- **Issue:** Complex analytical query with high logical reads (40,865 per execution)
- **Recommendation:** Add previously identified missing indexes

## Priority Recommendations

### Immediate Actions (Next 24-48 Hours)
1. **Review Isolation Levels** - Reduce SERIALIZABLE usage in `ostat` procedure
2. **Index Creation** - Implement composite customer index for name-based lookups
3. **Query Rewrite** - Fix double-sorting customer lookup pattern

### Short-Term (1-2 Weeks)
1. **Stored Procedure Optimization** - Simplify complex CASE logic in `neword`
2. **Index Analysis** - Implement missing indexes for StackOverflow2010
3. **Monitoring Setup** - Implement Query Store for ongoing plan analysis

### Long-Term (1-3 Months)
1. **Architecture Review** - Consider query batching opportunities
2. **Partitioning Strategy** - Evaluate partitioning for high-volume tables
3. **Caching Layer** - Implement application-level caching for reference data

## Monitoring Recommendations

1. **Enable Query Store** - For better historical plan analysis
2. **Set Up Alerts** - Monitor for plan changes and regression
3. **Regular Reviews** - Weekly analysis of top resource-consuming queries
4. **Baseline Metrics** - Establish performance baselines before optimization

## Conclusion

The analysis reveals that while individual query performance is generally good (low per-execution costs), the massive execution volume in the TPC-C workload amplifies even minor inefficiencies. The primary focus should be on:

1. Reducing blocking through isolation level optimization
2. Simplifying complex conditional logic
3. Implementing targeted indexing strategies
4. Establishing ongoing performance monitoring

These optimizations should provide significant cumulative performance improvements across the high-volume OLTP workload.

---
*Generated by Claude Code SQL Server Analysis*
*For questions or additional analysis, refer to the SQL Server performance tuning documentation.*

The output is certainly nice and neat. The validity of the output was not reviewed properly, although I did check dbo.ostat in tpcc_2025 for any table hints regarding the serializable isolation level (it’s there):

       , @os_c_middle = customer.c_middle, @os_c_last = customer.c_last
       FROM dbo.customer WITH (repeatableread)
       WHERE customer.c_id = @os_c_id AND customer.c_d_id = @os_d_id AND customer.c_w_id = @os_w_id
       END
       BEGIN
       SELECT TOP (1) @os_o_id = o_id, @os_o_carrier_id = o_carrier_id, @os_entdate = o_entry_d
       FROM dbo.orders WITH (serializable)
       WHERE orders.o_d_id = @os_d_id
       AND orders.o_w_id = @os_w_id
       AND orders.o_c_id = @os_c_id
       ORDER BY orders.o_id DESC
       IF @@ROWCOUNT = 0
       PRINT 'No orders for customer';
       END
       SELECT order_line.ol_i_id
       , order_line.ol_supply_w_id
       , order_line.ol_quantity
       , order_line.ol_amount
       , order_line.ol_delivery_d
       FROM dbo.order_line WITH (repeatableread)
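If you want to sweep a whole database for modules that hard-code isolation hints like this, a quick (hypothetical) sketch against sys.sql_modules does the job:

```sql
-- Find stored procedures/functions whose definitions contain isolation-level table hints
SELECT OBJECT_SCHEMA_NAME(m.object_id) + N'.' + OBJECT_NAME(m.object_id) AS module_name
FROM sys.sql_modules AS m
WHERE m.definition LIKE N'%(serializable)%'
   OR m.definition LIKE N'%(repeatableread)%';
```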

At this stage, I could have gone on all day prompting for stuff, but I decided to introduce Context7, another cool MCP server, into the fray.

Bringing in some knowledge – context7

Please note, do not expect continuity between results in the early parts of this blog post, as I have run other workloads on this server and run DBCC FREEPROCCACHE for testing purposes for other projects. Please view the outputs as-is.

To read more about context7, documentation can be found here. It can essentially provide Claude Code with hooks into more up-to-date documentation. That is the way I understand it anyway. Setting it up is easy. Navigate to your project folder and run something like this:
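At the time of writing, the context7 documentation suggests a one-liner along these lines (the package name is taken from their README, so verify it before running):

```
claude mcp add context7 -- npx -y @upstash/context7-mcp
```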

Launch Claude Code and check it is configured by typing /mcp again:

Nice.

I asked Claude to find the most resource-intensive query run against the StackOverflow2010 database using context7:

It discovered Query Store and went to town:
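A Query Store query in this vein (my own sketch, not necessarily what Claude ran) would look something like the following. Note that Query Store reports CPU time and duration in microseconds, which is exactly where unit confusion can creep in:

```sql
-- Top consumers by total CPU; Query Store stores avg_cpu_time in microseconds
SELECT TOP (10)
       qt.query_sql_text,
       SUM(rs.count_executions) AS executions,
       SUM(rs.avg_cpu_time * rs.count_executions) / 1000000.0 AS total_cpu_seconds
FROM sys.query_store_query_text AS qt
JOIN sys.query_store_query AS q
  ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan AS p
  ON q.query_id = p.query_id
JOIN sys.query_store_runtime_stats AS rs
  ON p.plan_id = rs.plan_id
GROUP BY qt.query_sql_text
ORDER BY total_cpu_seconds DESC;
```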

It came up with this:

This was not what I expected, as it was not a big query (I even ran it to check my sanity). I decided to challenge this:

Of course I am right…

Here is Claude’s next answer:

It corrected the units but then did not re-perform the analysis correctly. I decided to point it at the query myself to see what it makes of it:

I’m absolutely right (again)! I still think it may have now confused microseconds and milliseconds again with its durations and Total CPU Time (should be 30,000+ seconds, not 30). I did not verify any of this at the time.

Trying Again

I decided to clear the context window and re-roll the dice, and ask it a similar question as before, but this time to find the most resource-intensive query (not specified by what metric or database):

Looks like it hooked into context7 again to look for documents.

It gets a bit confused around the server name but saves the situation by deciding to find the query across all databases:

Using Query Store again:

Well, looking at Query Store for total CPU, we can see it is indeed a big consumer of CPU:

I recognise these top 5 as regular queries on my test VM, so I think it has performed a little better this time. The top query will run overnight if you let it, due to the scalar function. As I write this post, I now regret doing half a job and not running sp_blitzcache again to compare to the gold standard of plan cache analysis (assuming I had not restarted the VM at some point).

I next decided to see what the top ten queries by total CPU on the StackOverflow2010 database looked like, as I felt Claude was doing a little better:

Much better output (seconds are seconds this time):

It did not hypothesise that 749 was possibly the SQL in the function called by 747. Anyway, here is a comparison from sp_QuickieStore (badass) using default parameters:

Broadly speaking, Claude finds most of the queries, but they are in a different order than sp_QuickieStore.
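For anyone wanting to reproduce the comparison, running sp_QuickieStore at defaults is as simple as this (assuming it is installed on the instance):

```sql
-- Erik Darling's sp_QuickieStore with default parameters against StackOverflow2010
EXEC dbo.sp_QuickieStore @database_name = N'StackOverflow2010';
```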

Next, I asked Claude to perform index analysis this time using context7:

Here were the results:

There was a little more insight, and the second index on dbo.Posts had an extra column. Note the ‘Impact: ~5.9M improvement’ commentary on these. I’m sure I have seen this style of feedback before somewhere…

Finally, I asked Claude for anti-patterns (by sifting through the XML) for the queries run against StackOverflow2010:

# StackOverflow2010 Execution Plan Anti-Pattern Analysis

## Executive Summary

Analysis of the top CPU-consuming queries in StackOverflow2010 database reveals several critical execution plan anti-patterns that are significantly impacting performance. These anti-patterns are causing extensive resource consumption and should be addressed immediately.

## Key Findings

**Total Impact:** 115.9 seconds of CPU time across just 3 analyzed queries
**Primary Issues:** Scalar function abuse, missing indexes, full table scans, expensive sorting operations

---

## Anti-Pattern #1: Scalar Function in Performance-Critical Path 🚨 **CRITICAL**

### Query Details
- **CPU Impact:** 57.0 seconds (49% of analyzed workload)
- **Executions:** 25
- **Query:** `dbo.fn_IsHighValueUser` scalar function

### Anti-Pattern Analysis
```sql
-- PROBLEMATIC SCALAR FUNCTION
CREATE FUNCTION dbo.fn_IsHighValueUser(@UserId INT)
RETURNS BIT
AS
BEGIN
    DECLARE @PostCount INT;
    DECLARE @Reputation INT;
    
    SELECT @Reputation = Reputation
    FROM dbo.Users
    WHERE Id = @UserId;
    
    SELECT @PostCount = COUNT(*)
    FROM dbo.Posts
    WHERE OwnerUserId = @UserId;
    
    IF @Reputation > 10000 AND @PostCount > 100
        RETURN 1;
    
    RETURN 0;
END;
```

### Execution Plan Issues Identified:
1. **Scalar Function Anti-Pattern:** Forces row-by-row execution when used in WHERE clauses
2. **Multiple Table Scans:** Function contains two separate queries executed per row
3. **Missing Index Impact:** 99.9945% impact missing index on `Posts.OwnerUserId`
4. **Nested Execution:** Function called 25 times, each triggering expensive operations

### Context7 Best Practice Violations:
- ❌ Scalar functions in WHERE clauses prevent set-based operations
- ❌ Functions with multiple queries cause excessive context switching  
- ❌ Non-deterministic functions prevent plan reuse

### Recommended Solution:
```sql
-- REFACTOR TO INLINE TABLE-VALUED FUNCTION OR CTE
WITH HighValueUsers AS (
    SELECT u.Id, u.Reputation,
           COUNT(p.Id) as PostCount,
           CASE WHEN u.Reputation > 10000 AND COUNT(p.Id) > 100 
                THEN 1 ELSE 0 END as IsHighValue
    FROM dbo.Users u
    LEFT JOIN dbo.Posts p ON u.Id = p.OwnerUserId
    GROUP BY u.Id, u.Reputation
)
SELECT * FROM HighValueUsers WHERE IsHighValue = 1;
```

---

## Anti-Pattern #2: Full Table Scan with Expensive Join 🚨 **HIGH PRIORITY**

### Query Details  
- **CPU Impact:** 22.4 seconds (19% of analyzed workload)
- **Executions:** 6
- **Query:** `SELECT TOP 100 u.DisplayName, p.* FROM Posts p JOIN Users u WHERE p.ParentId = @ParentId ORDER BY u.DisplayName DESC`

### Execution Plan Issues Identified:

#### 2.1 Missing Index Anti-Pattern
```xml
<MissingIndexGroup Impact="22.7731">
    <MissingIndex Database="[StackOverflow2010]" Schema="[dbo]" Table="[Posts]">
        <ColumnGroup Usage="EQUALITY">
            <Column Name="[OwnerUserId]" ColumnId="14"/>
            <Column Name="[ParentId]" ColumnId="15"/>
        </ColumnGroup>
        <ColumnGroup Usage="INCLUDE">
            <!-- All columns included due to SELECT * -->
        </ColumnGroup>
    </MissingIndex>
</MissingIndexGroup>
```

#### 2.2 Expensive Operations Chain:
1. **Clustered Index Scan:** Reading 3.7M rows from Posts table (`EstimatedRowsRead="3.7292e+06"`)
2. **Hash Match Join:** Expensive hash join between Posts and Users
3. **TopN Sort:** Sorting 1.07M rows to get TOP 100 (`EstimateRowsWithoutRowGoal="1.07442e+06"`)
4. **Parallelism Overhead:** Multiple gather streams operations

#### 2.3 Performance Metrics:
- **Estimated Total Cost:** 2,620.95
- **Sort Cost:** 2,016.56 (77% of total cost)
- **Estimated Rows:** 1.07M rows processed for 100 result rows

### Context7 Best Practice Violations:
- ❌ SELECT * anti-pattern forcing covering index needs
- ❌ Missing indexes on filtered columns  
- ❌ Expensive sorting on non-indexed columns
- ❌ Join order not optimized for selective predicates

---

## Anti-Pattern #3: Table Scan for Simple Aggregation 🚨 **HIGH PRIORITY**

### Query Details
- **CPU Impact:** 21.1 seconds (18% of analyzed workload)  
- **Executions:** 9
- **Query:** `SELECT COUNT_BIG(*) FROM Posts WHERE ParentId = @ParentId`

### Execution Plan Issues Identified:

#### 3.1 Critical Missing Index
```xml
<MissingIndexGroup Impact="99.9933">
    <MissingIndex Database="[StackOverflow2010]" Schema="[dbo]" Table="[Posts]">
        <ColumnGroup Usage="EQUALITY">
            <Column Name="[ParentId]" ColumnId="15"/>
        </ColumnGroup>
    </MissingIndex>
</MissingIndexGroup>
```

#### 3.2 Inefficient Execution Plan:
- **Clustered Index Scan:** Full table scan of 3.7M rows (`EstimatedRowsRead="3.7292e+06"`)
- **Hash Match Aggregate:** Expensive aggregation operation
- **Cost:** 594.012 for simple COUNT operation
- **Parallel Execution:** Unnecessarily complex parallel plan for simple query

### Context7 Best Practice Violations:
- ❌ Full table scan for filtered aggregation
- ❌ Missing covering index prevents efficient seeks
- ❌ Over-parallelization for simple operations

---

## Anti-Pattern #4: DisplayName Lookup Performance Issue

### Query Details  
- **CPU Impact:** 15.2 seconds
- **Executions:** 470  
- **Query:** `SELECT * FROM Users WHERE DisplayName = @1`

### Issues Identified:
- **High execution frequency:** 470 executions suggest N+1 query pattern
- **Missing index:** No index on DisplayName column causing table scans
- **SELECT * anti-pattern:** Retrieving unnecessary columns

---

## Comprehensive Remediation Plan

### Immediate Actions (P0 - Critical)

1. **Create Missing Indexes:**
```sql
-- High-impact indexes identified from execution plans
CREATE INDEX IX_Posts_ParentId_Includes 
ON dbo.Posts (ParentId) 
INCLUDE (OwnerUserId, AcceptedAnswerId, AnswerCount, Body, /* ... all columns */);

CREATE INDEX IX_Posts_OwnerUserId 
ON dbo.Posts (OwnerUserId);

CREATE INDEX IX_Users_DisplayName 
ON dbo.Users (DisplayName);

CREATE INDEX IX_Posts_OwnerUserId_ParentId_Includes
ON dbo.Posts (OwnerUserId, ParentId) 
INCLUDE (/* covering columns */);
```

2. **Refactor Scalar Function:**
```sql
-- Replace fn_IsHighValueUser with inline logic or table-valued function
-- Eliminate row-by-row processing
```

### Short-term Actions (P1 - High Priority)

1. **Query Optimization:**
```sql
-- Replace SELECT * with specific column lists
-- Optimize JOIN order based on selectivity
-- Add appropriate WHERE clause indexes
```

2. **Parameter Sensitive Plan Optimization:**
   - The queries show PSP (Parameter Sensitive Plan) hints
   - Monitor for plan regression with different parameter values
   - Consider query hints if parameter sniffing causes issues

### Long-term Actions (P2 - Medium Priority)

1. **Application Architecture Review:**
   - Address N+1 query patterns (470 DisplayName lookups)
   - Implement query result caching
   - Batch operations where possible

2. **Monitoring Implementation:**
   - Set up Query Store monitoring for plan regression
   - Monitor wait statistics for I/O bottlenecks  
   - Track missing index recommendations

---

## Performance Impact Projections

### Expected Improvements:
- **Scalar Function Elimination:** 90-95% reduction in execution time
- **Missing Index Implementation:** 99.99% reduction in scan costs
- **Query Refactoring:** 80-90% reduction in CPU and I/O

### Risk Assessment:
- **Low Risk:** Index additions (non-breaking changes)
- **Medium Risk:** Scalar function refactoring (requires application testing)
- **High Risk:** Query structure changes (requires regression testing)

---

## References

Based on Context7 best practices for SQL Server execution plan analysis:
- [Microsoft SQL Server Performance Tuning Guide](https://docs.microsoft.com/sql-server/performance)
- Execution Plan Anti-Patterns identified using `sys.dm_exec_query_plan` analysis
- Missing Index recommendations from execution plan XML analysis

---

## Technical Notes

**Analysis Method:** Execution plan XML parsing using SQL Server DMVs
**Scope:** Top 5 CPU-consuming queries from StackOverflow2010 database  
**Tools:** SQL Server Management Studio, Context7 best practices framework
**Date:** 2025-01-16

**Key Performance Counters:**
- `EstimatedTotalSubtreeCost`: Primary cost metric
- `MissingIndexGroup Impact`: Index recommendation priority
- `EstimatedRowsRead` vs `EstimateRows`: Efficiency ratio analysis

Output looks nice and neat (again). I decided to test its recommended rewrite for sp_GetHighValueUsers:

WITH HighValueUsers AS (
    SELECT u.Id, u.Reputation,
           COUNT(p.Id) as PostCount,
           CASE WHEN u.Reputation > 10000 AND COUNT(p.Id) > 100 
                THEN 1 ELSE 0 END as IsHighValue
    FROM dbo.Users u
    LEFT JOIN dbo.Posts p ON u.Id = p.OwnerUserId
    GROUP BY u.Id, u.Reputation
)
SELECT * FROM HighValueUsers WHERE IsHighValue = 1;

It ran in a few seconds and brought back 4,484 rows (no idea if the results are correct):

I went to test the OG version but got bored after 18 hours:

Not to mention it was slamming one of my disks:

As for the rest of the recommendations, I only glanced at them. However, I am disappointed there was no continuity from the index recommendations in the earlier section in this output (I ran this prompt straight after). I appreciate that I asked it to look at the XML only, but it is worth mentioning.

Bringing in the ensemble (sub agents)

Sub-agents were recently introduced by Anthropic to parallelise work and manage context better when working with more complex concepts. I have been using them extensively on other projects, and they have saved the day from a time point of view (fewer mistakes, more efficient use of tokens, less reliance on Opus). They allow you to spawn a sub-agent with its own ‘persona’ or role it has been given. For this blog post, I just pulled down all these pre-made ones from VoltAgent. Here is how I installed them:
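The install amounts to cloning the repository and copying the agent definitions into Claude Code's agents folder. A hedged sketch follows; the repository URL and directory layout are assumptions based on the VoltAgent project at the time of writing, so check its README first:

```
git clone https://github.com/VoltAgent/awesome-claude-code-subagents.git
mkdir -p ~/.claude/agents
cp awesome-claude-code-subagents/categories/*/*.md ~/.claude/agents/
```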

To check they are being picked up in Claude Code, close and re-open it (if it was open) and run /agents:

Buzzin.

As you can see, this is complete overkill as we will only be directing Claude Code to use three of them. These will be database-admin, sql-pro and performance-engineer.

Here are the definitions for all three:

DATABASE ADMIN

---
name: database-admin
description: Manage database operations, backups, replication, and monitoring. Handles user permissions, maintenance tasks, and disaster recovery. Use PROACTIVELY for database setup, operational issues, or recovery procedures.
model: sonnet
---

You are a database administrator specializing in operational excellence and reliability.

## Focus Areas
- Backup strategies and disaster recovery
- Replication setup (master-slave, multi-master)
- User management and access control
- Performance monitoring and alerting
- Database maintenance (vacuum, analyze, optimize)
- High availability and failover procedures

## Approach
1. Automate routine maintenance tasks
2. Test backups regularly - untested backups don't exist
3. Monitor key metrics (connections, locks, replication lag)
4. Document procedures for 3am emergencies
5. Plan capacity before hitting limits

## Output
- Backup scripts with retention policies
- Replication configuration and monitoring
- User permission matrix with least privilege
- Monitoring queries and alert thresholds
- Maintenance schedule and automation
- Disaster recovery runbook with RTO/RPO

Include connection pooling setup. Show both automated and manual recovery steps.

PERFORMANCE ENGINEER

---
name: performance-engineer
description: Profile applications, optimize bottlenecks, and implement caching strategies. Handles load testing, CDN setup, and query optimization. Use PROACTIVELY for performance issues or optimization tasks.
model: opus
---

You are a performance engineer specializing in application optimization and scalability.

## Focus Areas
- Application profiling (CPU, memory, I/O)
- Load testing with JMeter/k6/Locust
- Caching strategies (Redis, CDN, browser)
- Database query optimization
- Frontend performance (Core Web Vitals)
- API response time optimization

## Approach
1. Measure before optimizing
2. Focus on biggest bottlenecks first
3. Set performance budgets
4. Cache at appropriate layers
5. Load test realistic scenarios

## Output
- Performance profiling results with flamegraphs
- Load test scripts and results
- Caching implementation with TTL strategy
- Optimization recommendations ranked by impact
- Before/after performance metrics
- Monitoring dashboard setup

Include specific numbers and benchmarks. Focus on user-perceived performance.

SQL-PRO

---
name: sql-pro
description: Write complex SQL queries, optimize execution plans, and design normalized schemas. Masters CTEs, window functions, and stored procedures. Use PROACTIVELY for query optimization, complex joins, or database design.
model: sonnet
---

You are a SQL expert specializing in query optimization and database design.

## Focus Areas

- Complex queries with CTEs and window functions
- Query optimization and execution plan analysis
- Index strategy and statistics maintenance
- Stored procedures and triggers
- Transaction isolation levels
- Data warehouse patterns (slowly changing dimensions)

## Approach

1. Write readable SQL - CTEs over nested subqueries
2. EXPLAIN ANALYZE before optimizing
3. Indexes are not free - balance write/read performance
4. Use appropriate data types - save space and improve speed
5. Handle NULL values explicitly

## Output

- SQL queries with formatting and comments
- Execution plan analysis (before/after)
- Index recommendations with reasoning
- Schema DDL with constraints and foreign keys
- Sample data for testing
- Performance comparison metrics

Support PostgreSQL/MySQL/SQL Server syntax. Always specify which dialect.

Note, you can specify the model for each of the sub-agents. The performance-engineer is using Opus, which we have not seen thus far.

In the interests of time, I will skip ahead to looking at the top 10 queries rather than the simple prompts:

Nice. It looks like the three sub-agents are ‘doing stuff’ as well as referencing context7.

Here is the output:

Cool. It gives us some nicely formatted output. It would have been nice to have the actual queries themselves listed rather than the ‘Stored Procedures’.

I challenged Claude to dig deeper:

Here is the output:

It seemed to give up on providing the SQL at number 7:

It does highlight locking, which can often feature in my top 10 wait stats if I abuse HammerDB. These locks are due to running far too many virtual users for the number of warehouses, and not necessarily a ‘problem’ as LCK waits are way down the list when running things sensibly like a normal person. Cool that it is mentioned though.

Finally (at this point, I had seen enough for one weekend), it is time to see if using the sub-agents can give us an improvement in index analysis. It was at this point, whilst watching the Claude Code prompt window, that I saw some freaky output. Let’s start with the initial prompt:

It spooled up what looks like four sessions (not sure what the technical term for these is). It’s at this point that we have a behaviour change:

It wants to insert data or pretend to insert data? This would be a horrendous experience to go through on a production database if you have set the MCP with write capability and given it an elevated user. As this is my smallest, least important VM, let’s button through it, answering yes every time:

No idea what is going on here (I can see DBCC FREEPROCCACHE commented out). Is it going to run it?!

After all that excitement (I don’t think it made any changes), here is the output:

Here is the said markdown output:

# SQL Queries Used for StackOverflow2010 Missing Index Analysis

This document contains all the SQL queries executed against the SKSQS25DVEH_DV server to analyze the StackOverflow2010 database and identify missing indexes.

## **1. Missing Index Analysis - Core DMV Query**

This was the primary query to identify all missing indexes with their impact scores:

```sql
SELECT 
    d.database_id,
    d.object_id,
    d.statement AS table_name,
    d.equality_columns,
    d.inequality_columns,
    d.included_columns,
    gs.avg_total_user_cost,
    gs.avg_user_impact,
    gs.user_seeks,
    gs.user_scans,
    gs.unique_compiles,
    (gs.avg_total_user_cost * gs.avg_user_impact * (gs.user_seeks + gs.user_scans)) AS missing_index_score
FROM sys.dm_db_missing_index_details d
INNER JOIN sys.dm_db_missing_index_groups g ON d.index_handle = g.index_handle
INNER JOIN sys.dm_db_missing_index_group_stats gs ON g.index_group_handle = gs.group_handle
WHERE d.database_id = DB_ID('StackOverflow2010')
ORDER BY missing_index_score DESC
```

**Results**: 9 missing indexes identified, with impact scores ranging from 12,613 to 2,527,651.

### **Key Results Data:**

| Table | Equality Columns | Included Columns | Impact Score | User Seeks | Avg Impact % |
|-------|------------------|-------------------|--------------|------------|--------------|
| Posts | ParentId | [AcceptedAnswerId], [AnswerCount], [Body]... | 2,527,651 | 26 | 74.32% |
| Posts | OwnerUserId | null | 1,544,259 | 26 | 99.99% |
| Posts | OwnerUserId, ParentId | [AcceptedAnswerId], [AnswerCount], [Body]... | 1,348,958 | 12 | 52.54% |
| Posts | ParentId | null | 1,069,109 | 18 | 99.99% |
| Users | DisplayName | null | 769,266 | 1350 | 99.42% |
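The DMV columns above map fairly directly onto `CREATE INDEX` statements. As a rough sketch (the generated index naming convention here is my own, not anything Claude produced), the recommendations can be turned into DDL like this:

```sql
-- Sketch: generate CREATE INDEX statements from the missing index DMVs.
-- The IX_ naming convention is my own; review the output before running anything.
SELECT
    'CREATE NONCLUSTERED INDEX IX_'
        + OBJECT_NAME(d.object_id, d.database_id)
        + '_' + CONVERT(varchar(10), g.index_group_handle)
        + ' ON ' + d.statement
        + ' (' + ISNULL(d.equality_columns, '')
        + CASE WHEN d.equality_columns IS NOT NULL
                AND d.inequality_columns IS NOT NULL THEN ', ' ELSE '' END
        + ISNULL(d.inequality_columns, '') + ')'
        + ISNULL(' INCLUDE (' + d.included_columns + ')', '') AS create_statement
FROM sys.dm_db_missing_index_details d
INNER JOIN sys.dm_db_missing_index_groups g
    ON d.index_handle = g.index_handle
WHERE d.database_id = DB_ID('StackOverflow2010');
```

As always with missing index DMV output, treat these as suggestions rather than a to-do list; the DMVs happily recommend near-duplicate indexes.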

## **2. Table Analysis - Row Counts and Structure**

This query provided table sizes and index counts:

```sql
SELECT 
    OBJECT_SCHEMA_NAME(t.object_id) AS schema_name,
    OBJECT_NAME(t.object_id) AS table_name,
    t.object_id,
    SUM(CASE WHEN i.type IN (0, 1) THEN p.rows ELSE 0 END) AS row_count,
    COUNT(i.index_id) AS index_count
FROM StackOverflow2010.sys.tables t
LEFT JOIN StackOverflow2010.sys.indexes i ON t.object_id = i.object_id
LEFT JOIN StackOverflow2010.sys.partitions p ON t.object_id = p.object_id AND i.index_id = p.index_id
GROUP BY t.object_id, OBJECT_SCHEMA_NAME(t.object_id), OBJECT_NAME(t.object_id)
ORDER BY row_count DESC
```

**Results**: Identified table sizes and confirmed each table has only 1 index (primary key).

### **Table Size Results:**

| Object ID | Row Count | Index Count | Table Name (Inferred) |
|-----------|-----------|-------------|----------------------|
| 197575742 | 10,143,364 | 1 | Votes |
| 5575058 | 3,875,183 | 1 | Comments |
| 101575400 | 3,729,195 | 1 | Posts |
| 2121058592 | 1,102,019 | 1 | Badges |
| 165575628 | 299,398 | 1 | Users |
| 69575286 | 161,519 | 1 | PostLinks |

## **3. Current Index Inventory**

This query examined existing indexes and their usage:

```sql
SELECT 
    i.name AS index_name,
    OBJECT_NAME(i.object_id) AS table_name,
    i.type_desc,
    i.is_unique,
    i.is_primary_key,
    i.fill_factor,
    ius.user_seeks,
    ius.user_scans,
    ius.user_lookups,
    ius.user_updates
FROM StackOverflow2010.sys.indexes i
LEFT JOIN StackOverflow2010.sys.dm_db_index_usage_stats ius 
    ON i.object_id = ius.object_id AND i.index_id = ius.index_id
WHERE OBJECT_NAME(i.object_id) IN ('Posts', 'Comments', 'Votes', 'Users', 'Badges')
ORDER BY OBJECT_NAME(i.object_id), i.index_id
```

**Results**: Returned 0 records, indicating no usage statistics were available (likely due to a recent restart).
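Empty usage stats usually just mean the counters have been reset, since `sys.dm_db_index_usage_stats` is cleared on instance restart. A quick sanity check (my own addition, not something Claude ran) is to see how long the instance has actually been up:

```sql
-- How long have the usage-stat DMV counters been accumulating?
-- sys.dm_db_index_usage_stats resets on instance restart.
SELECT
    sqlserver_start_time,
    DATEDIFF(HOUR, sqlserver_start_time, SYSDATETIME()) AS hours_since_restart
FROM sys.dm_os_sys_info;
```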

## **4. Database Table Catalog**

This query provided the complete table inventory:

```sql
SELECT 
    name as table_name,
    object_id,
    schema_id,
    type_desc
FROM StackOverflow2010.sys.tables 
ORDER BY name
```

**Results**: Confirmed 9 tables in the database.

### **Complete Table Inventory:**

| Table Name | Object ID | Schema ID | Type |
|------------|-----------|-----------|------|
| Badges | 2121058592 | 1 | USER_TABLE |
| Comments | 5575058 | 1 | USER_TABLE |
| LinkTypes | 37575172 | 1 | USER_TABLE |
| PostLinks | 69575286 | 1 | USER_TABLE |
| Posts | 101575400 | 1 | USER_TABLE |
| PostTypes | 133575514 | 1 | USER_TABLE |
| Users | 165575628 | 1 | USER_TABLE |
| Votes | 197575742 | 1 | USER_TABLE |
| VoteTypes | 229575856 | 1 | USER_TABLE |

## **5. Current Index Structure Analysis**

This query showed all existing indexes with their key columns:

```sql
SELECT 
    t.name AS table_name,
    i.name AS index_name,
    i.type_desc,
    i.is_unique,
    i.is_primary_key,
    STRING_AGG(c.name, ', ') WITHIN GROUP (ORDER BY ic.key_ordinal) AS key_columns
FROM StackOverflow2010.sys.tables t
JOIN StackOverflow2010.sys.indexes i ON t.object_id = i.object_id
LEFT JOIN StackOverflow2010.sys.index_columns ic ON i.object_id = ic.object_id AND i.index_id = ic.index_id
LEFT JOIN StackOverflow2010.sys.columns c ON ic.object_id = c.object_id AND ic.column_id = c.column_id
WHERE i.index_id > 0
GROUP BY t.name, i.name, i.type_desc, i.is_unique, i.is_primary_key, i.index_id
ORDER BY t.name, i.index_id
```

**Results**: Revealed that every table has only a single clustered primary key index on the Id column.

### **Current Index Structure:**

| Table | Index Name | Type | Unique | Primary Key | Key Columns |
|-------|------------|------|--------|-------------|-------------|
| Badges | PK_Badges__Id | CLUSTERED | true | true | Id |
| Comments | PK_Comments__Id | CLUSTERED | true | true | Id |
| LinkTypes | PK_LinkTypes__Id | CLUSTERED | true | true | Id |
| PostLinks | PK_PostLinks__Id | CLUSTERED | true | true | Id |
| Posts | PK_Posts__Id | CLUSTERED | true | true | Id |
| PostTypes | PK_PostTypes__Id | CLUSTERED | true | true | Id |
| Users | PK_Users_Id | CLUSTERED | true | true | Id |
| Votes | PK_Votes__Id | CLUSTERED | true | true | Id |
| VoteTypes | PK_VoteType__Id | CLUSTERED | true | true | Id |

## **6. Earlier CPU Analysis Queries (Context for Workload)**

From previous analysis of the server's CPU-intensive queries, these queries provided workload context:

```sql
-- Top CPU consuming queries
SELECT TOP 20 
    qs.total_worker_time / 1000 AS total_cpu_ms,
    qs.total_worker_time / qs.execution_count / 1000 AS avg_cpu_ms,
    qs.execution_count,
    qs.total_elapsed_time / 1000 AS total_duration_ms,
    qs.total_elapsed_time / qs.execution_count / 1000 AS avg_duration_ms,
    qs.total_logical_reads,
    qs.total_logical_reads / qs.execution_count AS avg_logical_reads,
    qs.creation_time,
    qs.last_execution_time,
    SUBSTRING(st.text, 1, 500) AS query_text_preview
FROM sys.dm_exec_query_stats qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
ORDER BY qs.total_worker_time DESC
```

```sql
-- Resource impact analysis
WITH QueryStats AS (
    SELECT 
        qs.sql_handle,
        qs.plan_handle,
        qs.execution_count,
        qs.total_worker_time,
        qs.total_elapsed_time,
        qs.total_logical_reads,
        qs.total_physical_reads,
        qs.total_logical_writes,
        (qs.total_worker_time + qs.total_logical_reads * 10 + qs.total_physical_reads * 100) AS resource_impact,
        qs.creation_time,
        qs.last_execution_time
    FROM sys.dm_exec_query_stats qs
)
SELECT TOP 10
    rq.execution_count,
    rq.total_worker_time / 1000 AS total_cpu_ms,
    rq.total_worker_time / rq.execution_count / 1000 AS avg_cpu_ms,
    rq.total_logical_reads,
    rq.total_logical_reads / rq.execution_count AS avg_logical_reads,
    rq.total_elapsed_time / 1000 AS total_duration_ms,
    rq.total_elapsed_time / rq.execution_count / 1000 AS avg_duration_ms,
    rq.resource_impact,
    rq.creation_time,
    rq.last_execution_time,
    SUBSTRING(st.text, 1, 200) AS query_text_preview
FROM QueryStats rq
CROSS APPLY sys.dm_exec_sql_text(rq.sql_handle) st
ORDER BY rq.resource_impact DESC
```

## **Analysis Summary from Query Results**

### **Key Findings from the SQL Queries:**

#### **1. Database Structure** (from queries #2, #4, #5):
- **Total Tables**: 9 tables with 18+ million total rows
- **Current Indexing**: Only primary key clustered indexes exist (one per table)
- **Missing Infrastructure**: No foreign key indexes, covering indexes, or secondary indexes

#### **2. Missing Index Impact** (from query #1):
- **Total Missing Indexes**: 9 high-impact missing indexes identified
- **Top Impact Score**: 2,527,651 (Posts.ParentId covering index)
- **Combined Top 3**: Impact score of 5.4M for the highest priority indexes
- **Table Distribution**: Posts table dominates missing index recommendations (7 out of 9)

#### **3. Query Patterns** (inferred from missing index columns):
- **Parent-Child Navigation**: Heavy use of ParentId (questions→answers relationships)
- **User Activity Queries**: Frequent OwnerUserId lookups (user's posts, contributions)
- **User Search**: DisplayName searches on Users table (1,350 seeks observed)
- **Comment Threading**: PostId-based Comment lookups

#### **4. Performance Impact Indicators**:
- **Improvement Potential**: avg_user_impact values of 99%+ indicate scan-to-seek conversion opportunities
- **Query Frequency**: High user_seeks counts show frequent usage patterns
- **Query Cost**: avg_total_user_cost values up to 2,139 show expensive query operations
- **Resource Waste**: Massive table scans for simple lookups

### **Critical Performance Issues Identified:**

1. **No Secondary Indexes**: Every non-primary key query results in full table scans
2. **Missing Foreign Key Indexes**: All JOIN operations are inefficient
3. **No Covering Indexes**: Frequent key lookups required for data retrieval
4. **No Temporal Indexes**: Date-based queries are extremely inefficient

### **Recommended Action Priority:**

| Priority | Index Recommendation | Expected Improvement | Implementation Complexity |
|----------|---------------------|---------------------|--------------------------|
| **CRITICAL** | Posts.ParentId (Covering) | 74.32% improvement, 2.5M impact score | Medium |
| **CRITICAL** | Posts.OwnerUserId | 99.99% improvement, 1.5M impact score | Low |
| **HIGH** | Users.DisplayName | 99.42% improvement, 769K impact score | Low |
| **MEDIUM** | Posts composite indexes | 52.54% improvement, 1.3M impact score | High |
| **LOW** | Comments.PostId | 98.75% improvement, 12K impact score | Low |

## **Consolidated Index Strategy**

Based on these SQL query results, the recommended consolidation reduces 9 individual missing indexes to 4 optimized composite indexes:

1. **Posts - User Activity Index**: Covers OwnerUserId-based queries with temporal sorting
2. **Posts - Parent-Child Index**: Covers ParentId relationships with score-based sorting  
3. **Users - DisplayName Index**: Covers user search functionality
4. **Comments - Post Relationship Index**: Covers comment threading
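The report does not show the exact DDL Claude proposed, so the following is only my guess at what the four consolidated indexes might look like, with key and INCLUDE column choices inferred from the missing index output above (I have deliberately left the wide `Body` column out of any INCLUDE list):

```sql
-- Hypothetical DDL for the four consolidated indexes.
-- Column choices are inferred from the DMV output above,
-- not taken from Claude's actual proposal.
CREATE NONCLUSTERED INDEX IX_Posts_OwnerUserId
    ON dbo.Posts (OwnerUserId, CreationDate);

CREATE NONCLUSTERED INDEX IX_Posts_ParentId
    ON dbo.Posts (ParentId, Score)
    INCLUDE (AcceptedAnswerId, AnswerCount);

CREATE NONCLUSTERED INDEX IX_Users_DisplayName
    ON dbo.Users (DisplayName);

CREATE NONCLUSTERED INDEX IX_Comments_PostId
    ON dbo.Comments (PostId);
```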

These SQL queries provided the empirical foundation for the missing index analysis, revealing a database that is structurally sound but completely unoptimized for typical Q&A platform query patterns, with massive performance improvement opportunities available through strategic index creation.

---

**Analysis performed on**: SKSQS25DVEH_DV SQL Server instance  
**Database**: StackOverflow2010  
**Analysis date**: August 2025  
**Tools used**: SQL Server DMVs, specialized database analysis agents, Context7 best practices

Do you recall I have been mentioning the impact 'score' it gives us? I challenged it on where it gets this idea from:

I knew it. Looks like it is inspired by sp_BlitzIndex and other resources…

Summary

  • Plugging Claude Code into a SQL Server is a fun curiosity in a safe, non-important environment
  • It’s no replacement for running tried and tested community-driven scripts
    • Just run the proper scripts directly
  • It is hard to tell if the addition of context7 and the sub-agents truly helped
    • One could just keep prompting using stock settings until the desired results were displayed.
      • The context window would allow for some improvement as you went along
    • You could probably just use Opus (assuming it is playing ball) and waste a load of money
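On the "just run the proper scripts" point, the community route really is a one-liner. Assuming you have the First Responder Kit installed (parameter behaviour may change between releases, so check the documentation):

```sql
-- Run the community index diagnostics directly (First Responder Kit).
EXEC dbo.sp_BlitzIndex @DatabaseName = 'StackOverflow2010', @Mode = 4;
```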

There will be more parts to this shortly, and we will experiment more with the sub-agents doing further (probably dumber) tasks.

Resources

https://www.brentozar.com/first-aid
