Information fingerprinting

Data protection solutions have typically filtered content by matching patterns and keywords. Raj Dhingra of PortAuthority Technologies introduces a new method called 'information fingerprinting.' It uses filters to actually learn the context of data.

Hi, I'm Raj Dhingra, Vice President of Product Management and Marketing at PortAuthority Technologies, and today we're going to talk about information fingerprinting, a technique for being able to accurately identify and stop information from leaking. In our previous video we talked about why content filtering is not enough. Let's talk about some other techniques that can be used to protect your good stuff, your confidential information, whether that's sitting in a customer data database or is actually a document sitting on a file server or document management system.

Let's take the example of customer data. You've got name, social security number, zip plus four, account number. I'm going to go into the database and I'm going to extract ten records that contain name, social security number, date of birth and zip. Let's take a look at the different kinds of filters that can help stop this information from leaking.

The first class is global filters, which rely primarily on file type. So a global filter can stop an encrypted file from leaking. You can say stop encrypted files. On the other hand, when it looks at an Excel file, it will not stop it from leaking because Excel files might be allowed in your company policy.
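A global filter of this kind can be sketched as a simple policy lookup on file type. This is only an illustration, not PortAuthority's implementation: the `POLICY` table, the type labels, and the `global_filter` function are hypothetical, and the sketch assumes the file type has already been detected.

```python
# Hypothetical policy table: file type -> allowed to leave the network?
POLICY = {"encrypted": False, "excel": True}


def global_filter(file_type):
    """Return True if a file of this type should be blocked.

    Unknown types are allowed here, which is an assumption of this sketch.
    """
    return not POLICY.get(file_type, True)


print(global_filter("encrypted"))  # blocked by policy
print(global_filter("excel"))      # allowed, whatever the file contains
```

The decision never looks inside the file, which is why an Excel file full of customer records passes.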

Let's take the next class of filters, which is tokens. Tokens use keywords, patterns and expressions. So we take this example of my Excel spreadsheet that contains the name, social security number, date of birth and zip plus four. When tokens are being used to identify social security numbers, while they will correctly identify the SSNs, they will also pick up the zip plus four as SSNs. As a result, you have false positives. So we have a limitation where this information is now going to leak out, creating an insecure environment.
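The false positive described here is easy to reproduce. The sketch below assumes a token filter that treats any nine digits with optional separators as a social security number; the pattern, the `token_filter` function, and the sample row are made up for illustration.

```python
import re

# Hypothetical token-style pattern: nine digits grouped 3-2-4, with
# optional dash or space separators, is treated as a possible SSN.
LOOSE_SSN = re.compile(r"\b\d{3}[- ]?\d{2}[- ]?\d{4}\b")


def token_filter(text):
    """Return every substring the loose pattern flags as an SSN."""
    return LOOSE_SSN.findall(text)


# Made-up record: name, SSN, date of birth, zip plus four.
row = "Jane Doe, 123-45-6789, 1970-01-31, 94105-1234"
print(token_filter(row))  # ['123-45-6789', '94105-1234']
```

A zip plus four is also nine digits, so the pattern cannot tell it apart from a social security number. Tightening the pattern to require both dashes would fix this row, but it would then miss SSNs written without separators.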

The third class of filters is contextual filters. And that is very different from either global or token-based filters. In the contextual filter case, we're now starting to learn the actual data and the context of the data. That is what information fingerprinting is about, and information fingerprinting will actually go through, for example, your entire database that might contain 100,000 records, or it might contain one million records. And it will learn the specifics of the name, the actual social security numbers, the zip plus four, the account numbers.

So as a result, when the fingerprinting is complete, this information about the actual customers is now stored securely in a fingerprint database. As a result, when we start to use filtering techniques for information that might be leaking, let's say this Excel document is now attached to an email, the information fingerprinting based identification will now very accurately and precisely identify that this social security number maps into a particular customer's name and address. As a result, we've got very precise identification of sensitive information occurring when we start to use information fingerprinting. In this particular case, when it sees zip plus four, it's not going to identify that as sensitive information.
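One way to picture the idea is a lookup against hashes of the real records. The sketch below is an assumption-laden illustration, not PortAuthority's algorithm: the function names, the comma-separated row format, and the use of unsalted SHA-256 are all choices made for this example, and a production system would need a keyed or otherwise hardened hash because nine-digit numbers are easy to enumerate.

```python
import hashlib
import re


def fingerprint(value):
    """Hash a normalized field value so the raw data is not stored."""
    normalized = re.sub(r"[^0-9a-z]", "", value.lower())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def build_fingerprint_db(records):
    """Map each real SSN fingerprint to its customer's name fingerprint."""
    return {fingerprint(ssn): fingerprint(name) for name, ssn in records}


def find_sensitive(row, db):
    """Flag a field only if it is a known SSN and its owner's name
    appears in the same comma-separated row."""
    fields = [f.strip() for f in row.split(",")]
    prints = {fingerprint(f) for f in fields}
    return [f for f in fields if db.get(fingerprint(f)) in prints]


# Made-up customer record used to build the fingerprint database.
db = build_fingerprint_db([("Jane Doe", "123-45-6789")])

row = "Jane Doe, 123-45-6789, 1970-01-31, 94105-1234"
print(find_sensitive(row, db))  # ['123-45-6789']
```

The zip plus four is not flagged because its fingerprint was never learned from the database. Since values are normalized before hashing, the same number written as `123 45 6789` still matches.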

So to summarize, contextual filtering does a few things. First, it learns your data; it precisely builds the fingerprints down to the very granular level. It can then accurately and reliably identify what the sensitive information is. It is also resilient to the data being manipulated, so it really provides very accurate and very reliable identification. As a result, now you can stop sensitive information from leaking from the organization.