Thursday, April 25, 2013

DBA Part 2, Shepherd of the data


In my intro I mentioned that the DBA is the shepherd of the company’s data.  First we must define what a shepherd is and what one does.  In ancient times, a shepherd took care of a flock or flocks of domesticated animals.  These animals were pretty much helpless, easy prey, and needed constant care and attention.  They needed good fresh pastures to eat, clean water to drink, and protection from the odd wolf, bear or lion.


In biblical accounts, a shepherd took great pains to care for his flock, counting each one, knowing them by name and what each animal would do.


This is not a stretch for a DBA.  The DBA cares for and protects the company’s data, and stays alert to threats to the data and to the servers it is stored on.  A DBA will know his environment and his servers.  He will know which alerts mean what, when to spring into action, and when to let things run their course.

The first duty of data protection is security.  The DBA is responsible for designing and maintaining a security methodology that keeps the data safe from prying eyes looking to steal or destroy it.  The DBA must also understand the business and the applications that need access to the data, so that the method is easy to maintain, easy to administer, and creates fewer problems than it solves.

Security is one of the big items many users simply do not understand.  Many users think that data access should be wide open and free range.  Users tend to be offended when access is restricted and good security policies are in place.

I understand the frustration, but like most issues, this one is more complex than it first appears, and it gets more complex the deeper you go.

Databases are not like Excel spreadsheets.  They aren’t laid out in a way that is easy to understand for people who have not studied relational databases.  This isn’t done for job security, but rather to meet other goals of a good DBA.

The process that complicates data retrieval is called normalization.  There are several normal forms, from first through fifth (a sixth has also been defined), and most good databases are in third normal form.  Normalization is a complex subject in and of itself.  In short, data is split out into various tables, linked by keys, to reduce redundancy and dependency.  For a quick look into normalization (I won’t go deep here), you can read this: https://en.wikipedia.org/wiki/Database_normalization

DBAs want to protect the normalized base data from end users and keep them from having direct SQL query access.  Getting accurate results from normalized data takes practice and a good grasp of SQL (Structured Query Language), the language used to retrieve the data.
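To make normalization and its query pitfalls concrete, here is a minimal sketch in Python using the built-in sqlite3 module.  The customers and orders tables, their columns, and the sample rows are made up for illustration and do not come from any real system.  It shows a customer name stored once, a join that reassembles the data, and how leaving out the join condition quietly produces a wrong answer.

```python
import sqlite3

# In-memory database with two hypothetical, normalized tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        total       REAL NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Raj');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# Correct: match each order to its customer through the key.
per_customer = conn.execute("""
    SELECT c.name, COUNT(o.order_id), SUM(o.total)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.customer_id, c.name
    ORDER BY c.customer_id
""").fetchall()

# Wrong: no join condition, so every order is paired with every customer.
inflated_total = conn.execute(
    "SELECT SUM(o.total) FROM customers AS c, orders AS o"
).fetchone()[0]

# The real total, taken straight from the orders table.
true_total = conn.execute("SELECT SUM(total) FROM orders").fetchone()[0]

print(per_customer)
print(true_total, inflated_total)
```

The second query runs without any error and returns double the real total, which is exactly the kind of mistake that can reach a customer unnoticed.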

I’ve been in environments where power users were granted this access and consequently sent incorrect data to customers, causing all sorts of issues and reducing the customers’ confidence in the data provided to them.

Keeping the database locked down so that nobody can see it is also not an option.  There are legitimate uses for data (otherwise why store it?), though often a user needs only a subset of it.  The DBA plans for this and creates a security framework based on roles and groups, so that adding or removing people is simple and quick.  To do this the DBA must understand the company’s data, the org chart of the company, and the applications.
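The idea behind roles and groups can be sketched in a few lines of Python.  This is only a model of the concept, not how any particular database engine implements it, and every role, user, and table name below is hypothetical.  Permissions attach to roles, people attach to roles, and nobody is granted access to a table directly.

```python
# Hypothetical roles mapped to the (table, action) pairs they allow.
ROLE_GRANTS = {
    "reporting_reader": {("sales_summary", "SELECT")},
    "billing_clerk": {("invoices", "SELECT"), ("invoices", "UPDATE")},
}

# Hypothetical users mapped to the roles they hold.
user_roles = {
    "alice": {"reporting_reader"},
    "bob": {"reporting_reader", "billing_clerk"},
}


def is_allowed(user, table, action):
    """Return True if any role held by the user grants the action on the table."""
    return any(
        (table, action) in ROLE_GRANTS.get(role, set())
        for role in user_roles.get(user, set())
    )


# A new hire gets access by joining a role; no table-level grants change.
user_roles["carol"] = {"reporting_reader"}

# Someone leaving the company is a single removal.
user_roles.pop("bob")

print(is_allowed("carol", "sales_summary", "SELECT"))
print(is_allowed("carol", "invoices", "UPDATE"))
print(is_allowed("bob", "invoices", "SELECT"))
```

Because the table permissions live on the roles, staff changes never touch them, which is what keeps the maintenance simple and quick.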

Like a shepherd, the DBA protects the data against predators too.  You may think that there is no way somebody can get into your network and get to your data, but I beg to differ.  There are many methods that hackers and others can use to either see data they aren’t supposed to see or destroy data that they shouldn’t have access to, all using seemingly legitimate means.  I refer you to an XKCD comic that is a staple among DBAs, “Exploits of a Mom” (https://xkcd.com/327/), better known as the one with little Bobby Tables.

This may seem far-fetched, but it is common and dangerous.  A DBA protects against these threats as well as others.
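One well-known attack of this kind is SQL injection, where text typed into an ordinary form ends up being run as part of a query.  Below is a minimal sketch using Python’s built-in sqlite3 module.  The students table and both lookup functions are made up for illustration.  The first function glues user input into the SQL text, and the second passes it as a bound parameter, which is the standard defense.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    INSERT INTO students (name) VALUES ('Alice'), ('Bob');
""")


def find_unsafe(name):
    # Vulnerable: the input becomes part of the SQL statement itself.
    query = "SELECT name FROM students WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_safe(name):
    # Safer: the input is bound as a value and is never parsed as SQL.
    return conn.execute(
        "SELECT name FROM students WHERE name = ?", (name,)
    ).fetchall()


hostile = "x' OR '1'='1"

# The unsafe query becomes: ... WHERE name = 'x' OR '1'='1'
leaked = find_unsafe(hostile)

# The safe query looks for a student literally named "x' OR '1'='1".
blocked = find_safe(hostile)

print(len(leaked), len(blocked))
```

The unsafe version hands back every row in the table to somebody who only filled in a name field, while the parameterized version finds nothing.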

I will tell a story about a great data analyst named JB.  JB is perhaps one of the most gifted analysts I have ever run across: careful, intelligent, and extremely adept at writing SQL code.

One day I got a call from JB, who informed me that an entire table had its first and last names changed to “Joe” and “Bob” respectively.  JB had legitimate read access to this data, but perhaps he should not have had update rights.  That was a debate we could have had later.  At the time, I had half a million people who needed their proper names returned.
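The story does not depend on exactly how the update happened, but one common way for this to occur is an UPDATE statement that is missing its WHERE clause.  As a hedged sketch of one possible guard (not what was actually in place at the time), the Python below runs an update inside a transaction and rolls it back if it touches more rows than the caller expected.  The people table and the guarded_update helper are hypothetical.

```python
import sqlite3


def guarded_update(conn, sql, params, max_rows):
    """Run an UPDATE, but roll it back if it touches more rows than expected."""
    cur = conn.execute(sql, params)  # sqlite3 opens a transaction implicitly
    touched = cur.rowcount
    if touched > max_rows:
        conn.rollback()
        raise RuntimeError(
            f"refusing to change {touched} rows (limit {max_rows})"
        )
    conn.commit()
    return touched


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE people (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    INSERT INTO people VALUES (1, 'Ann', 'Lee'), (2, 'Raj', 'Patel'), (3, 'Mei', 'Chen');
""")

# An intended single-row fix goes through.
changed = guarded_update(
    conn, "UPDATE people SET first_name = ? WHERE id = ?", ("Anne", 1), max_rows=1
)

# The same kind of statement without a WHERE clause is refused.
try:
    guarded_update(
        conn, "UPDATE people SET first_name = 'Joe', last_name = 'Bob'", (), max_rows=1
    )
    rejected = False
except RuntimeError:
    rejected = True

names_after = conn.execute(
    "SELECT first_name, last_name FROM people ORDER BY id"
).fetchall()
print(changed, rejected, names_after)
```

A guard like this does not replace good permissions or backups, but it turns a table-wide accident into an error message.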

DBAs must protect against threats both internal and external, and must think of things like little Bobby Tables as well as well-meaning and gifted internal folks digging around in the database.

Shepherds feed their flock and provide water.  Without these two basic needs all of their animals will die.  Likewise, a DBA must maintain the company’s databases daily, making sure that many tasks are accomplished every day, week and month.  If the company is healthy and growing, so are its databases.  Most companies I’ve worked for do not like to archive things off, opting instead to have huge databases.  When I started in this line of work, having a 1 GB database was huge.  Now I have databases with tables that are over 300 GB, and multiple terabytes of data on my servers.  Keeping access to this data fast and reliable is a huge chore.

Yes, much of it can be automated, but a good DBA keeps tabs on the health and wellbeing of the databases.  There are constructs such as indexes that are designed to make data retrieval fast.  A DBA must not only monitor the existing indexes and keep them from becoming fragmented, but must also keep tabs on how they are used, looking for indexes that should be removed and indexes that should be added.  Every index speeds up some reads but adds work to writes, and most databases have new data added, old data updated, and all data retrieved.  A DBA must therefore understand how the database is used in order to design the right indexing strategy for the application at hand.
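To see what an index changes, here is a small sketch using Python’s built-in sqlite3 module.  The orders table, the index name, and the data are invented for the example, and SQLite stands in for whatever engine you actually run.  It asks the database how it plans to answer the same query before and after an index exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

QUERY = "SELECT total FROM orders WHERE customer_id = ?"


def plan(sql):
    """Return SQLite's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, (42,)).fetchall()
    # The last column of each row holds the human-readable detail text.
    return " | ".join(row[-1] for row in rows)


before = plan(QUERY)  # no usable index yet, so the whole table is scanned
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
after = plan(QUERY)   # the planner can now search through the new index

print(before)
print(after)
```

The exact wording of the plan differs between SQLite versions, but the first plan should report a scan of the table and the second should name idx_orders_customer.  Other engines offer their own equivalents, along with statistics on how often each index is actually used.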

The DBA must also check for corruption in a database and repair any issues found, along with keeping tabs on how the server’s resources are used: memory, processor, networking, and storage (disks).  A good DBA will note the trends and proactively get the ball rolling on upgrades and other tweaks, so that any outages in service are planned and work towards a better solution, rather than coming as a surprise when a disk runs out of space or a query times out because the processor is too busy.
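Both habits can be sketched briefly in Python.  The first function runs SQLite’s built-in integrity check, and the second makes a rough, straight-line estimate of how many days remain before a disk fills up.  The function names and the sample numbers are assumptions made for this example, and server engines have their own tools for the same jobs.

```python
import sqlite3


def integrity_ok(conn):
    """Run SQLite's integrity check; a healthy database reports a single 'ok'."""
    rows = conn.execute("PRAGMA integrity_check").fetchall()
    return rows == [("ok",)]


def days_until_full(daily_used_gb, capacity_gb):
    """Estimate days until the disk is full from evenly spaced daily samples.

    Uses the average daily growth between the first and last sample, which is
    a deliberately simple model. Returns None if usage is flat or shrinking.
    """
    if len(daily_used_gb) < 2:
        raise ValueError("need at least two samples")
    growth_per_day = (daily_used_gb[-1] - daily_used_gb[0]) / (len(daily_used_gb) - 1)
    if growth_per_day <= 0:
        return None
    return (capacity_gb - daily_used_gb[-1]) / growth_per_day


healthy = integrity_ok(sqlite3.connect(":memory:"))

# Made-up samples: used space in GB on four consecutive days, 1000 GB disk.
remaining = days_until_full([700, 710, 720, 730], capacity_gb=1000)

print(healthy, remaining)
```

With these sample numbers the disk grows by 10 GB a day and has 270 GB left, so the estimate is 27 days, which is enough warning to schedule an upgrade instead of discovering the problem during an outage.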

This post is turning into a long read, so I will wrap it up here.  Without a good and careful DBA shepherding the databases, a company runs the risk of many dangers and snares, from issues that arise from greater database usage and growth to external threats.  Having a DBA who wants to understand your business and who wants to know all about the underlying technology the database lives on (not just the database engine, but the operating system, hardware, storage, and networking) is essential to any going concern that has a database.

This is stressful work, and a misstep can cause some big problems.  So when the DBA isn’t responsive, please remember they are taking care of the data like a shepherd takes care of his sheep!

In the next installment I will show how the DBA is part of a pit crew for a race car, and while there is some overlap it will be another angle into the mind of a DBA.
