Wednesday, December 30, 2009

MTL or VAS memory Pressure

So I've been re-combating my VAS memory issue (also called MTL or Mem To Leave). It was to the point that I opened a ticket up with Microsoft. Then one of my developers alerted me to the fact that a new system was put in place about the same time the issue happened, and that the server was creating (and closing) 46 or so connections to the SQL Server per second.

That is a lot of connections.

The system was shut off and all of the sudden my VAS memory pressure eased. What I ended up with, however, is a set of queries that can aid in trouble shooting this sort of error, and was plenty of information for Microsoft's tech support to help me focus in on where the error was.

All of that said, I am going to continue down the path with Microsoft because I need to guard against this sort of issue in the future.

I am going to work on packaging the entire diagnostic script (and maybe if you are good a .rdl of the graphical report I use to track it all)

Now, this script will utilize some Red Gate SQL Backup XP's so it won't be as useful for non-red gate users. I will make 2 versions the full blown Red Gate version and a 'stripped down' non red gate version.

The main difference? I have a (cobbled together) SSRS report that shows me trends of data collection for various memory segments that is from a XP that ships with Red Gate SQL Backup.

So the lesson learned? I have a lot more stuff I need to monitor and log. I'm sure all the seasoned DBA's out there already knew this but it is a good thing to know and to keep track of!

Labels: , , , ,

Friday, December 11, 2009

Happy TC Day!

Today I mockingly celebrate a 'pivotal' day in my companies IT Shop history. You see, I am the first person hired to be full time in our shop. Prior to that our company had brought in a contracting company who used 3 offshore teams.

2 years ago today there were 2 of us in the department and we were making a hiring decision. The model we were choosing was up in the air, so we were looking for a jack of all trades who can program, do SQL and most importantly do Business Analsys work. So we brought in 2 candidates. I didn't like either, but time was of the essense. So I picked TC from the litter and so did the other IT guy.

So TC came into our shop for 3 glorious days. He met with a couple of our CXO's to get the feel for the company, he setup his laptop. He played with his iPhone. He started a half-arsed data dictionary, then, on Friday (after starting on Wednesday) our schedule was light enough that we decided to go grab lunch in his honor.

He had other plans. He told us he had "Insurance" papers to sign, and we never saw him again. After 2.5 days of being in the office, and giving nothing to the company he had the recruiter call us and tell us that she would drop off his company issued laptop.

That changed our direction. We decided to take more time on interviewing people. We also decided to start moving away from offshore and we built a dedicated dev team and I celebrate this day because of the man who had the gall to be a more dishonrable person than I have run into in my professional career.

Thank you TC for showing me how not to behave!

Labels: , ,

Friday, December 04, 2009

How do I use SQL Server

I've been following a series of blogs of real DBA's who are talking about how they use SQL Server. As a reformed Programmer/Analyst turning into a DBA is an interesting journey and I figure I'll try to add what little I can to the community. I echo the statements of others that it may be easier to say what SQL is not doing in my company, but I figured I'd join in the fun as well.
I work at a small/mid sized pharmaceutical marketing data company. The business leaders would have you think we do marketing, but really we deal in the business of data. We setup promotions where by patients receive the ability to get a financial reward for filling a prescription. This can be instant at the Pharmacy (on a large data network), or via a rebate. Payments from the rebate can be from a check that we cut here on site, or from a debit card that we can issue and place funds on.
The main thing that our customers are interested in are reports and data. We have lots of data coming and going, in and out all the time.
Our environment is ever changing in regards to the infrastructure. When I started, there was no DBA, no IT Managers, we had an off-shore coordinator and 3 teams of off shore developers who rolled a lot of their own code. So I inherited several SQL Server 2000 Standard edition servers all linked together using Linked Servers on some relatively weak hardware. There were performance issues galore, and one of the more interesting features was a home rolled "replication" of sorts that used triggers, sync tables, SQL Agent jobs and stored procs to push data from server to server to have some sort of High Availability. That wasn't so good. I had blocking, locking, deadlocking, long running processes and no one person who knew more than 33% of the picture.
Now we have moved to 2 SQL 2008 Enterprise clusters on windows 2008 (One cluster is on Windows 2008 R2, and I highly recommend that). We jumped SQL 2005 and I have limited knowledge of 2005 myself so moving from SQL 2000 to 2008 has been a huge learning curve for me, and I'm still coming in to knowledge of all the cool 'new' features that make life easier.
ETL is a huge part of our business, but highly over looked. We have some hybrid systems that utilize custom .NET code that then pushes to SQL Server to manipulate the data via stored procs. We have some DTS, some SSIS and actually are moving towards BizTalk.
Our OLTP is handled 100% by SQL 2008 at this time, there are 50+ databases on the one cluster for a grand total of half a terabyte of data. Not hugely impressive, but considering we do 50,000 transactions a day on just one of these databases the growth rate is rather impressive.
Reporting is handled by a mis-mash of technologies. We have a crude datamart using SQL 2000 enterprise that uses linked servers, views and stored procs to create 'fact tables' in a nightly process that runs 9-11 hours 6 days a week. We are moving to having another SQL 2008 cluster that is a transactional replication subscriber to the OLTP cluster. This reporting cluster will have a subset of the databases of the OLTP cluster because we don't report on all of our data. the app config databases, app security databases, error logging, system logging etc databases (almost all legacy) will not be replicated.
We will then create OLAP cubes and dimensions and what not (as I learn SSAS) to create a data warehouse and data marts.
We use 2 home rolled reporting solutions that are being slowly taken out of the enterprise and being replaced with SQL Server Reporting Services 2008. We have the internal SSRS 'portal' that we are using, it is what ships with SSRS and our internal folk use that for ad-hoc reporting, to schedule reports to be emailed or delivered to a folder to have another process FTP it to customers.
We are also developing a SSRS portion to our data portals for customers to directly link to their reports online. This means that our reporting database has to be very responsive and up 99.999% of the time.
I am currently using RedGate products to handle many tasks. RedGate SQL backup 6.0, RedGate SQL Response 1.2 I lean on most, along with some scripts that I have written myself or modified to maintain indexes and take care of system integrity checks. We also use other tools to monitor the physical boxes but that is a job for the sys-admin.
I also have a few SSRS reports I have written to keep tabs on data growth and VDI memory use.
I've been using SQL Server in one way shape or form for almost 15 years now dating back to SQL 6.5 on Windows NT 4, but really that was more from a programmer and Database Architect stand. Sure, I also was on very small teams where there wasn't a DBA so I did some of that (Backups and what not) but now that I'm in an area where data is what we do, I'm moving more in a DBA role and learning as I go. I suppose that is how most people become DBA's.

Labels: , ,

Tuesday, December 01, 2009

Odd "Replication" error

So I have been attempting to do some replication on my SQL Server 2008 Enterprise (64 bit) cluster.

Well, I was building a subscription on a 2008 Enterprise SQL Server on Windows 2008 Enterprise to be subscribed to by a 2008 Enterprise 64 bit SQL Server on Windows 2008 R2.

However the subscriber doesn't really enter this equasion.

I had built several large publications, but one really gave me an issue.

The error I got ( and could not find ANYWHERE on the 'net) was

Error messages:
Message: Data conversion failed
Stack: at Microsoft.SqlServer.Replication.Snapshot.SqlServer.NativeBcpOutProvider.ThrowNativeBcpOutException(CConnection* pNativeConnectionWrapper)

Now, there was more to that stack dump but the key was the Message: Data conversion Failed.

So I did what a programmer (by training) would do, and first I searched the error, and since I didn't see anything about it I narrowed the possibilites. I figured it was a table that was bombing when trying to build the BCP files because that is the process that was failing. So I did a little manual 'binary' search, split the tables in half and got the error. Split them again, didn't. Did the second half got it... after a while I got it down to one table.

Great. Something was corrupted in my table. But what? Then daily life at the office got in the way and I had to drop it. Today I did a DBCC CheckTable with a few options to be specific

WITH ALL_ERRORMSGS, EXTENDED_LOGICAL_CHECKS, TABLOCK, DATA_PURITY

The data purity told me that an updatedate column was out of range... but the data it brought back was less than helpful. So I ran a couple of queries that did the trick.

SELECT ID,CreateDate FROM Prescriber
WHERE UpdateDate < '1/1/1753 12:00:00 AM' OR UpdateDate > '12/31/9999 11:59:59 PM'


and

SELECT ID FROM Prescriber WHERE
((DATEPART(ms,UpdateDate)+ (1000*DATEPART(s,UpdateDate)) + (1000*60*DATEPART(mi,UpdateDate)) + (1000*60*60*DATEPART(hh,UpdateDate)))/(1000*0.00333))
> 25919999


And I found about 830 rows that must have been BCP'ed in many moons ago that had bad dates. I nulled the dates out (as they aren't needed really) and bam. Everything works.

So, long story short, Conversion error points to bad data. And in retrospect my debugging worked out well, because it took 5 minutes to scan that one table, and there were several much larger tables in this database so a DBCC CheckDB is out of the question.

NOW

My more expireced DBA readers will say "Why aren't you doing CHCECKDB weekly.... yeah, I have a script that I'm still testing for this. I need to tweak it, but real life has prohibited that.

Labels: , , , ,