Thursday, May 27, 2010

it's been a while

I'm still around... I know I haven't been as diligent with this blog as I should be. I've been buried in normal day-to-day tasks, not doing anything all that exciting.

The Reporting Cluster has gone live, and the discovery that users have been doing some interesting things with their scheduled subscriptions has come to my attention. I've done a little digging and have a few interesting SSRS queries that show some dups of parameters by owner.

We have over 1,000 daily subscriptions, with 70 + of them being dups (sometimes dups of dups)... so that makes 200 subscriptions not needed right there.

THEN there are some reports that don't make sense to run every day, yet we have users doing that.

On top of that, we have some reports that have 2 and 3 'modes' to show different views, with the design being that one customer would want only one of the modes.... well we have users running all possible modes of the report all on the same shared schedule.

Lots of words, but here is where the rubber hits the road. I failed, as a DBA and a Developer to clearly communicate and train the end users on the usage of our new reporting system.

The old regime held tight the reigns of reporting with an iron fist, demanding that the IT Group schedule each and every report. And while I can understand wanting that level of control, I personally don't want the responsibility. I have a perfectly good tool that end users can use to schedule their own reports.

I made the (stupid) assumption that they could use it with little education. I was wrong.

SO I am holding a training session and will be encouraging the users to clean up their subscriptions all in the name of easing pressures on our servers.

What was the pressure? Well the SQL back end wasn't hurt TOO badly, but when you have 4 of the copies of the same stored procedure running with the exact same parameters at the same time, and they are written so to not lock/block they will step all over each other, take hours to run instead of a minute and cause the SSRS front end to have all sorts of issues.

SSRS was delivering reports between 5 to 25 hours late. yup. All because a couple hundred reports caused thrashing.

In other "NEWS". We have cleared up the last major performance issue on our first SQL Cluster. Turns out the iSCSI array only had 1 gbps connection to it when it could handle 4 gbps. AND each server only had 1 gbps connection to the array.

AND there were several servers pointing to it.... great. So ANY one server could satruate the array's network and cause it to grind to a halt.

the network admin didn't see the issue initially because the disks weren't thrashing about.. they were loafing to be honest.

We have quadrupled the total throughput to the array and doubled each servers bandwidth to the array allowing MUCH faster access.

The great news is that having such limited bandwidth to the disks forced us to tune all of our major SQL Code for limited IO, which is something we had a problem with anyway. I plan on adding data compression to the mix to further enhance our IO throughput.

I have compressed several of our key large tables getting an average of 50% (give or take 10%) compression. Sure it taxes the CPU a little, but we have a dual quad core system that is pretty beefy, and only 2 Gbps to the array.. so helping the IO bottle neck even more will net faster performance.

This seems to bear out in testing, where by I have had this in both the dev and staging environments for 3 weeks.... I will start rolling it out this summer little by little, an hopefully it will help stave off my need for another terabyte of storage for the next 3 years.

BTW
The code to show duplicate subs by Owner

SELECT c.name, us.username as owner, CONVERT(VARCHAR(8000),PARAMETERS), COUNT(*)
FROM ReportServer.dbo.Subscriptions s
JOIN ReportServer.dbo.Catalog c
ON c.ItemID = s.Report_OID
JOIN ReportServer.dbo.ReportSchedule rs
ON rs.SubscriptionID = s.SubscriptionID
JOIN ReportServer.dbo.Users uc
ON uc.UserID = c.ModifiedByID
JOIN ReportServer.dbo.Users us
ON us.UserID = s.OwnerId
JOIN msdb.dbo.sysjobs j
ON j.name = CONVERT(NVARCHAR(128),rs.ScheduleId)
GROUP BY CONVERT(VARCHAR(8000),PARAMETERS) , c.name,us.UserName
HAVING COUNT(*) > 1
ORDER BY 4 DESC