Posted: January 6, 2011 in Information Security

Data classification is the first step to ensure the integrity of your files

Data classification will ensure the long-term security of your company.

If you were to give most IT folks at a mid-sized business a pen and ask them to draw a diagram of the types of data coming into and out of their network, they probably couldn’t. To be fair, it’s not entirely their fault. Most organizations run at least 5 to 10 major applications, each dealing with a mixture of data types and each connected to at least one other system that it sends data to or receives data from.

How did we get here?

IT professionals are often told to simply “make it work,” which isn’t a bad thing, but it can mean that a system, application or server was implemented without putting a lot of thought into what kind of data would be handled or how it would be protected. I get it – when you need something, you need it now. But just like building a house, we need to lay a foundation for our data in order to manage and support it down the road. When the roof is collapsing, the last thing you want is to have to repour your foundation.

Enter the data classification scheme

Most folks in regulated industries (like healthcare and banking) are already familiar with data classification. For some years, federal laws have required that certain entities have a data classification scheme that outlines the types of data they deal with and how such data is processed and protected. When we do business with these organizations, we like to know that our personal information (such as SSNs, account numbers or information on our health) is protected. Although it may seem like an easy task to comply with, I routinely find that even these industries can have difficulties in classifying their data.

So, why is everybody so crazy about having a data classification scheme anyway? The answer’s really simple: without knowing what kind of data you have and where it resides, you can’t possibly protect it.

Controls

In the security industry, we refer to a control as any procedure or device that is designed to govern how people or computers interact with a piece of data. A great example of a control is a password, which most of us are very familiar with. Many of us use these controls on a daily basis to access things like our online banking, personal computers or Facebook accounts. Firewalls are another example of a control. Believe it or not, firewalls weren’t always common, which meant that there was little protection separating our networks from the dangers of the Internet. Even passwords have evolved with the times. Only five years ago, six-character passwords were acceptable – now we demand at least eight.

There are two major control pitfalls that can occur if you don’t have a data classification scheme. One is applying too few controls, with the result being that data isn’t protected well enough. The other occurs when controls are paid for and applied in the wrong areas. Imagine that the second pitfall is a lot like trying to hit a bulls-eye with a dart – while blindfolded. Not applying enough controls means that your data could be at risk; applying them in the wrong places means you’ve wasted money.

Two Types of Data

Let’s take a look at the two basic kinds of data that most organizations handle: internal and external.

Internal Data

Internal data typically refers to resources that keep the organization operating smoothly and efficiently. Payroll, accounting, sales forecasts, client lists and proprietary data (like patents or schematics) fall into this category. Also included in this category is information on how the organization’s IT infrastructure – the routers, switches and servers – is implemented. For instance, you wouldn’t want anybody outside the organization to have information on the version of specific applications and services you leverage – this could allow a bad guy to craft an attack targeting your organization.

External Data

External data includes information on your customers or clients, business associates, partners and other entities that share data with you. It could include information on a client that you are developing a business proposal for, or the credit card information of a customer purchasing a sandwich and bag of chips. You may also be receiving external data through automated processes, such as data imports via FTP or web submissions from other organizations sharing data with you.

Classifying Data

I’m going to assume by now that you have at least located your data. Be sure to consider both electronic data and data in paper form. Think about all the methods your organization uses to import and export data. Do you share with credit bureaus? Do you receive data from state or local agencies? All of these paths should be considered. Next, consider the following:

  • Are there any applicable laws that mandate safeguarding of specific types of data?
  • Do any agreements with third parties mandate how data should be safeguarded or handled?

If you answered yes to either of those questions, you need to handle and safeguard the data exactly as prescribed by those laws and contractual agreements. Since dealing with each regulatory framework is way outside the scope of this discussion, let’s assume that you answered no to both of the above questions. The next step is to look at data based on importance. Ask yourself whether the data is required for the smooth and continual operation of business functions. What about patents or trademarks? If another brewery gets a copy of your famous pale ale recipe, how much longer will you be in business? Chances are at least some of the data you handle falls into this category. This type of data is your critical data, and next to data that you’re legally obligated to safeguard, this is your most important data (they could also be one and the same).

Critical systems

Any system that processes, stores or transmits this critical data should be labeled as a critical system. Why? Because controls should be implemented where the need for control is the greatest – whether that data is internal or external in nature. If this data becomes unavailable, lost or compromised in some way, it may disrupt your ability to remain in business. This is where we’re going to focus most of our efforts; however, you still need to consider data that is not critical to your success, but might still result in embarrassing press releases or legal action if it were lost. As a general rule, you should treat all data that you receive from others as critical.
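To make the rule concrete, here is a minimal sketch of how system labels can be derived from data labels. The system names, data types and the "unclassified" placeholder label are all made up for illustration; the only rule taken from the discussion above is that a system touching any critical data is itself critical.

```python
# Hypothetical inventory: the data types each system processes, stores or transmits.
SYSTEM_DATA = {
    "recipe-file-server": {"recipes"},
    "payroll-app": {"payroll"},
    "web-store": {"customer-cards", "product-catalog"},
}

# Hypothetical labels already assigned to each data type.
DATA_LABELS = {
    "recipes": "critical",
    "customer-cards": "critical",   # received from others, so treated as critical
    "payroll": "sensitive",
    "product-catalog": "unclassified",  # placeholder label, not part of the scheme
}


def critical_systems(system_data, data_labels):
    """Return the names of systems that handle at least one critical data type."""
    return sorted(
        name
        for name, data_types in system_data.items()
        if any(data_labels.get(d) == "critical" for d in data_types)
    )


print(critical_systems(SYSTEM_DATA, DATA_LABELS))
# prints ['recipe-file-server', 'web-store']
```

Notice that the web store is critical even though most of what it serves is harmless: one critical data type is enough.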

Document everything

Since data can be an abstract concept, it’s even more important to document the decisions that you make. Also keep in mind that the data doesn’t always belong to you. For example, in the financial industry, the Board of Directors of a bank bears the sole responsibility for safeguarding data. So even if Johnny Computerguy decides that the data classification scheme is adequate, it’s really not up to him since he’s not going to be sued into oblivion in the event of a breach. Always make sure that you have sign-off from the data owners on how the data is classified and handled.
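One simple way to keep track of this is a record per decision that notes who owns the data and whether they have signed off. The sketch below is only an illustration: the field names, owners and dates are hypothetical, and a spreadsheet would do the same job.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClassificationDecision:
    data_type: str
    label: str                           # e.g. "critical" or "sensitive"
    owner: str                           # the person or body accountable for the data
    signed_off_on: Optional[str] = None  # date of owner approval, None if still pending


def pending_sign_off(decisions):
    """Return the data types whose classification the owner has not yet approved."""
    return [d.data_type for d in decisions if d.signed_off_on is None]


# Hypothetical decisions for the brewery example.
decisions = [
    ClassificationDecision("recipes", "critical", "Head Brewer", "2011-01-06"),
    ClassificationDecision("payroll", "sensitive", "Finance Director"),
]

print(pending_sign_off(decisions))
# prints ['payroll']
```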

Other data types

Next, consider the other types of internal data you handle. Payroll information is a great example of data that isn’t critical, since it can be quickly and easily restored or recreated in most cases, but should be protected from other employees internally. Since one employee knowing another employee’s salary is undesirable, but probably won’t put you out of business, you might label this data sensitive.

Organize your data types

At the end of the process, you want to be able to place all types of data you handle into containers. External data and details about your information systems should be considered critical, so we’ll place all of that data into one bucket. Your payroll information, client lists and forecasts are all types of data that you wouldn’t want competitors to have, so we’ll stick that information into sensitive. Based on these groupings, we also know that any systems that house critical data should also be labeled as critical.
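The sorting described above can be sketched in a few lines. This is an illustration under my own assumptions, not a standard: the flag names and the "unlabeled" fallback are invented, and the rules simply mirror the two buckets used in this article.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class DataType:
    name: str
    received_from_others: bool = False  # external data
    describes_it_systems: bool = False  # details about your infrastructure
    business_critical: bool = False     # losing it could put you out of business
    limited_audience: bool = False      # damaging, but survivable, if exposed


def classify(data_type):
    """Apply the article's rules in order: critical first, then sensitive."""
    if (data_type.received_from_others
            or data_type.describes_it_systems
            or data_type.business_critical):
        return "critical"
    if data_type.limited_audience:
        return "sensitive"
    return "unlabeled"  # hypothetical fallback for anything not yet reviewed


def bucket(data_types):
    """Group data type names by their label."""
    buckets = defaultdict(list)
    for data_type in data_types:
        buckets[classify(data_type)].append(data_type.name)
    return dict(buckets)


# Hypothetical inventory for the brewery example.
inventory = [
    DataType("customer card numbers", received_from_others=True),
    DataType("network diagrams", describes_it_systems=True),
    DataType("pale ale recipe", business_critical=True),
    DataType("payroll", limited_audience=True),
    DataType("client lists", limited_audience=True),
    DataType("sales forecasts", limited_audience=True),
]

for label, names in bucket(inventory).items():
    print(label, "->", ", ".join(names))
```

Checking the critical rules first matters: a data type that is both received from others and of limited audience lands in the critical bucket, which is the safer of the two.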

Applying Controls

Now that you have your data somewhat sorted (this is good enough for our needs), it’s time to actually implement controls around our data. Although we aren’t going to go too deep into controls, systems that house critical data should be isolated from everyone without a need to interact with the data – that includes external users (contractors, script kiddies on the Internet, etc.) and internal personnel.

Access controls

Since someone in your bookkeeping department very likely isn’t involved in the brewing process, they probably don’t need any level of access to the server or file share that houses your recipes. Likewise, your brewmasters don’t work in accounting, so they don’t need access to any of the information labeled sensitive. Access controls aren’t a new concept (passwords, right?), so make sure that you apply them appropriately based on job function and “need to know.” Other technical controls work at the network level, placing systems into logical groups so that only the users and systems that need to work with the data can reach them.
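As a rough sketch of "need to know," access can be expressed as a lookup from job function to the resources that job requires, with everything else denied. The job functions and resource names below are hypothetical; in practice this mapping lives in your directory service or file-share permissions rather than in a script.

```python
# Hypothetical mapping of job functions to the resources they need to do their work.
NEED_TO_KNOW = {
    "brewmaster": {"recipes"},
    "bookkeeper": {"payroll", "accounting"},
}


def can_access(job_function, resource):
    """Allow access only if the job function explicitly needs the resource."""
    return resource in NEED_TO_KNOW.get(job_function, set())


print(can_access("brewmaster", "recipes"))   # prints True
print(can_access("bookkeeper", "recipes"))   # prints False
print(can_access("contractor", "payroll"))   # prints False (unknown roles get nothing)
```

The important design choice is the default: a role or resource that nobody has thought about yet gets no access, rather than full access.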

Detective controls

Aside from the controls that most folks already have, you will need to consider adding detective controls. True to their name, these controls are designed to detect unwanted or illicit behavior and notify someone in the event that data is accessed by an unauthorized user. This is more important than people know, since in order to prosecute someone you will need to be able to prove they carried out the act in question. Unfortunately, turning on auditing and logging features of each system doesn’t actually do much if someone isn’t reviewing the logs. There are a variety of tools available that will parse logs for specific signs of unwanted behavior, but those are outside the scope of our discussion. At the very least, know that most major software applications and operating systems support some level of logging.
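To give a feel for what reviewing logs means, here is a minimal sketch that scans access log lines and flags reads of a monitored resource by anyone not on its approved list. The log format, user names and resources are invented for this example; real systems each have their own log formats, and this is not a substitute for a proper log analysis tool.

```python
# Hypothetical list of who is approved to touch each monitored resource.
AUTHORIZED = {
    "recipes": {"brewmaster1"},
}

# Hypothetical log lines in a made-up "timestamp key=value ..." format.
SAMPLE_LOG = [
    "2011-01-06T09:15:00 user=brewmaster1 resource=recipes action=read",
    "2011-01-06T09:20:00 user=bookkeeper1 resource=recipes action=read",
    "2011-01-06T09:25:00 user=bookkeeper1 resource=lunch-menu action=read",
]


def parse_line(line):
    """Split one log line into a dict; assumes every field after the timestamp is key=value."""
    timestamp, *fields = line.split()
    entry = dict(field.split("=", 1) for field in fields)
    entry["timestamp"] = timestamp
    return entry


def unauthorized_access(log_lines, authorized):
    """Return entries where a monitored resource was accessed by an unapproved user."""
    alerts = []
    for line in log_lines:
        if not line.strip():
            continue  # skip blank lines
        entry = parse_line(line)
        allowed = authorized.get(entry["resource"])
        if allowed is not None and entry["user"] not in allowed:
            alerts.append(entry)
    return alerts


for alert in unauthorized_access(SAMPLE_LOG, AUTHORIZED):
    print(f"{alert['timestamp']} {alert['user']} accessed {alert['resource']}")
# prints 2011-01-06T09:20:00 bookkeeper1 accessed recipes
```

Resources that aren’t in the approved list are ignored here, which keeps the noise down and keeps attention on the critical data you identified earlier.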

Physical controls

Finally, you need to think about the physical controls that will need to be implemented to restrict access to data that’s off-limits. It’s one thing for somebody to browse to a network share and grab a bunch of confidential documents (but you’ve already restricted that access, right?), but if they can walk into a file room and obtain those same documents, you’ve still got a problem on your hands. Physical controls are tricky to implement, since they can be a hindrance to your employees and can become costly. Focus on spreading physical controls across several areas, such as cameras and access controls.

Conclusion

In short, there’s no real right or wrong way to classify data, as long as you know where it lives and how it’s protected. In the event that you decide to upgrade your hardware or applications at some point, think about where budgetary resources can be applied to provide the best protection at the best value. Build your defenses around systems and data that would really impact your bottom line if they were lost, stolen or damaged. With a data classification scheme, regardless of how complex it is, you will sleep better at night knowing that you know where your data is.

I graduated from LSU with a B.F.A. in Graphic Design and a minor in Art History. My passions are web design and front-end development with over 10 years of experience.

