Friday, April 26, 2013

DBA Part 3: Pit Crew of the Company


I am a die-hard Formula 1 and Le Mans-style endurance racing fan. I have watched every F1 race and qualifying session broadcast in the US for over 10 years. I watch ALMS and other Le Mans-style racing too.

They are vastly different forms of racing that require different approaches and have different rules, yet they have one thing in common: the less time the car spends in the pits being serviced, the better the team’s chance of victory.

In many ways a DBA is like this as well.  More often than not these days, a company’s data needs to be accessible 24 hours a day, 7 days a week, 365.25 days a year.  

Not only must it be accessible, but retrieval must be quick, and the data must be right.

What this means for your friendly local DBA is that every move must be planned out with precision, and there are tasks that must happen to keep everything smooth.  Granted, any good (read lazy) DBA will automate as many of these as possible, and then keep tabs on how the automation is running with logs and reports.  A big portion of the DBA’s time is spent on making sure that THEY get the data they need on how things are working.

In this (labored) metaphor, the database server (or data) is the race car, the end user is the driver and the DBA is the pit crew. DBAs do not drive the car, and we are not the car; we build, maintain and fix the car as the need arises.

When the server (car) enters the race (goes into production), it is costly to bring it in for service, whether that is the index building and maintenance mentioned before or checking for corruption.

There are a few tools your DBA has in his toolbag for this. First and foremost is knowledge of your business cycle: understanding the daily, weekly, monthly and quarterly cycles of your business is crucial. You may hear your DBA ask all sorts of questions about your business. He isn’t nosy; he is trying to do what is best for the company. There are times in endurance racing where the team will drive to a pace, and times the team will go flat out. In many ways the daily and weekly maintenance windows are these times.

In most businesses based in the US, nothing can happen until after 9PM. Those of us on the east coast have to take into account the pesky west-coasters and keep things running in ‘production mode’ for them until their business day is over. Some businesses cater to end users who may not use their services until after the workday is over, in which case it is midnight or later when the window opens.
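The arithmetic behind that 9PM figure can be sketched in a few lines of Python. This is only an illustration, not anything from a real scheduler: the function name `window_opens_eastern` is made up for this post, and the 6PM Pacific close of business is my assumption about when the west-coasters go home.

```python
from datetime import date, datetime, time
from zoneinfo import ZoneInfo  # standard library in Python 3.9+

EASTERN = ZoneInfo("America/New_York")
PACIFIC = ZoneInfo("America/Los_Angeles")


def window_opens_eastern(day, close_of_business=time(18, 0)):
    """Return when the maintenance window opens, expressed in Eastern time.

    Hypothetical helper: assumes the window opens the moment the
    west-coast business day ends (6PM Pacific unless told otherwise).
    """
    west_close = datetime.combine(day, close_of_business, tzinfo=PACIFIC)
    return west_close.astimezone(EASTERN)


# 6PM Pacific on the day of this post is 9PM Eastern.
print(window_opens_eastern(date(2013, 4, 26)))  # 2013-04-26 21:00:00-04:00
```

If your users stay busy until 9PM Pacific, pass `time(21, 0)` instead and the window opens at midnight Eastern, which is the "midnight or later" case.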

The DBA will work hard at getting the backups, defrags, reorgs and any nightly roll-ups for reporting done in this time. The data must still be available, but it is okay for it to be a bit... slow.

The DBA also makes sure the databases are in tip-top shape and checks for all sorts of issues using automated tasks and reports based upon data collection. There will be charts about locks, blocks, CPU utilization and disk utilization. You may hear him utter things about split pages, buffer cache hit ratio or any number of other things in a language you may not understand. This is normal!

This is the normal routine.  What happens when the car breaks, or even worse, the driver crashes it?

Professional pit crews plan for just about every possible scenario and practice each and every move many times until it is second nature. Pit crews also know their car inside and out and can diagnose issues fairly quickly. For instances where something is truly broken beyond repair, they will have spares on hand to get the car back into action as quickly as possible.

The same is true for a DBA. We must know our data, our infrastructure, our applications and our businesses so well that when something goes wrong (it always does) we can respond properly and quickly.

This brings me to one key point of a DBA’s design and development work: High Availability. High Availability, or HA, is simply a way to make sure that if one set of hardware were to break, there is a second nearby to take over without much fuss. I have built this many times over in my travels, and it works well for all sorts of issues, both planned (patching) and unplanned (memory going bad in a server).

Having the HA plan built and well documented is essential to the smooth operation of an IT shop. The DBA’s first responsibility here is to make it as seamless as possible so that the fewest possible people need to be involved in any failure. What good is HA if the dev staff has to be dragged out of bed kicking and screaming to fix an issue? Budget is also a huge part of this. The DBA wants to do the best possible job, and may have the inclination to spend copious amounts of cash on this issue. Trust me, he has the best interest of the company in mind. Left to their own devices, most good DBAs don’t want to do work that isn’t needed.

After the HA plan is documented, approved and built, the DBA must test it and schedule a routine test of the plan to make sure it works. This is nerve-racking, because if it doesn’t work the DBA is on the spot to fix the issue. This is the company’s data, and when the test fails, the company is most likely down.

The DBA must also practice for all sorts of other issues. Practice restoring data, practice debugging or performance tuning. Learn about all of the bits and pieces the database depends on (the database engine of course, but also the OS, hardware, networking and storage). The DBA cannot possibly know all of this in depth, but must keep a working knowledge of all of these subjects and have a close and good relationship with the people who do know. There has to be trust on both sides, a mutual respect, and if that is there, then some healthy joking. I mean, hardware people are only good for plugging stuff in after all ;)

A good DBA will have all of this going on in the back of his mind at all times.  Always keeping an eye on how things are going, always thinking about how things can go wrong and planning and practicing for every possibility.

Many have key scripts written, and instructions for most errors. Many use Google (I know I do) to find things that they don’t know. DBAs like the boring life, and work hard to have one. When they are surprised, the stress level builds.

When an emergency does happen, please remember this. The DBA will have a Director, a VP and possibly a C-level person in their office until things are fixed. This is very stressful. DBAs are often asked questions that they do not have the answer to. I mean, if I KNEW why it was broken right away, it would be fixed, or better yet would never have gotten to this state.

DBAs are stressed and are paid to be negative. We are paid to think about every single way data can be made unavailable, corrupted or destroyed, and then to formulate a plan to keep that from happening and to fix it when (not if) it does happen. I know I’m slipping into DR here, and I promise I will touch on that more deeply later, but remember this always. The DBA has lots of important tasks to do, and if they do not jump on your query or answer your question straight away, it isn’t because they don’t want to. It is because they are trying to keep the race car running, keep the valuable data from breaking down or crashing, and fix it when it does.

Stick with me. Next time I’ll talk about how a DBA architects the data inside of applications, for you people who have homegrown apps, and how the DBA works with the infrastructure manager to build a high-performance system using bits of technology that many don’t see.

Until then, bring your DBA a donut and thank them for all the hard work they do. Most DBAs respond well to donuts and praise, and who knows, maybe your query request will be done more quickly. Not that I’m advocating bribery or flattery...
