By Jerome Pesenti, Vivisimo
Posted on ZDNet News: Nov 9, 2006 7:17:00 PM

Commentary--As search capabilities move out of applications and across the organization, the potential for security breaches increases greatly. Yet, despite the buzz around search these days, security is one aspect that few have paid much attention to.

When implemented carelessly, search engines can uncover flaws in existing security frameworks and can either expose restricted content itself or reveal the existence of hidden information to unauthorized users. Either scenario can have dire consequences.

Before introducing an enterprise search solution, organizations must carefully document their requirements, understand how their search solution handles security, test extensively before deployment, and monitor performance after the rollout.

Search engine security defined
"Search engine security" is a form of access control restricted to the context of a search application and is primarily about ensuring that in the course of using a search application, users can only access information they are permitted to see. Search engine security is only concerned with the information accessed through the search application and not with information delivered by other means. For example, once a user clicks on a link pointing to an independent resource, it is no longer in the domain of the search application. Further, search engine security is not about hiding links––if links to sensitive content exist, that is a problem that needs to be addressed at the source, not by hiding them in the search engine.

In addition, unlike security for the systems storing information, search engine security is only concerned with read access. Access control in general is critical for obvious reasons: confidentiality, privacy, competitive intelligence, etc. On the one hand, compared to other types of computer security breaches, access control violations are often limited and do not have the domino effect that results from most computer security breaches. On the other hand, search engine security is often seen as the most critical of the access control issues because a search engine can expedite the process of revealing security holes.

Imagine a system with millions of files, some of which contain unprotected confidential or critical information. Without a search engine, a malicious user would have to scour all of these files one by one. With a search engine, a user can just type well-chosen keywords targeting critical or confidential information. Here is where read access comes into play. Read access involves not only being able to read the content, but also knowing that it exists. Consider, for example, an employee who searches the intranet for terms such as:

- List of passwords
- Social security numbers
- Salary of John Doe
- Employee reviews
- Company secrets

Even allowing a user to know that a file with certain content exists can be a serious security breach. Imagine searches such as: "X should be fired," "merger with Y" or "June sales are low." The simple title of these results could reveal extremely sensitive information.
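
To make the point about titles concrete, here is a minimal sketch in Python of a search function that checks permissions before matching, so that a document the user cannot read never contributes a title to the result list. The `Document` class, the `allowed_groups` field, the `search` function and the sample index are all hypothetical names chosen for illustration; they do not describe any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    title: str
    body: str
    allowed_groups: frozenset  # hypothetical: groups permitted to read this document


def search(index, query, user_groups):
    """Return titles of matching documents that the user is allowed to read."""
    terms = query.lower().split()
    hits = []
    for doc in index:
        # Check permission first, so unreadable documents are never matched
        # and their titles can never appear in the results.
        if not (doc.allowed_groups & user_groups):
            continue
        text = (doc.title + " " + doc.body).lower()
        if all(term in text for term in terms):
            hits.append(doc.title)
    return hits


# Hypothetical sample index for illustration only.
INDEX = [
    Document("Cafeteria menu", "Lunch menu for June", frozenset({"staff", "hr"})),
    Document("June sales are low", "Draft sales memo for the board", frozenset({"executives"})),
]

if __name__ == "__main__":
    print(search(INDEX, "june", frozenset({"staff"})))
    print(search(INDEX, "june", frozenset({"executives"})))
```

A staff member searching for "june" sees only the cafeteria menu; the existence of the sales memo is not disclosed.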

No silver bullet
Organizations have taken a variety of approaches to search engine security, from simple identification and authentication to "access control list" (ACL) indexing and federated search, and even mixed approaches that combine elements of these. Each approach has benefits and drawbacks that make it the right or wrong approach for any given organization.

The bottom line is that search engine security is one of the most challenging problems for enterprises. This sheer complexity makes it critical to get right, and it often requires a custom approach for each organization. There is no silver bullet; search applications therefore need to offer a variety of security tools that are easily configurable yet able to handle the most complex and demanding security requirements. Each organization must understand its needs, its IT infrastructure and its security goals in order to find the solution that fits best.

In general, a good search engine application will mirror a pre-existing security framework and rely as much as possible on external processes to conduct authentication and authorization. This will decrease the complexity of the implementation, the need for synchronization and, consequently, the risk of errors. Furthermore, it needs to be flexible but as out-of-the-box as possible, because too much configuration complexity is a sure recipe for introducing errors that have potentially dire security consequences.
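
One way to picture "relying on external processes" is a search layer that keeps no permissions of its own and simply asks the existing system whether a user may read each candidate result. The sketch below assumes this; `secure_results`, `file_server_can_read` and the in-memory ACL table are hypothetical stand-ins, not a real API. It also fails closed: if the external check raises an error, the result is hidden.

```python
from typing import Callable, Iterable, List

# Hypothetical signature: (username, resource_id) -> True if the user may read it.
Authorizer = Callable[[str, str], bool]


def secure_results(username: str, candidate_ids: Iterable[str], can_read: Authorizer) -> List[str]:
    """Keep only the candidates that the external authorizer approves."""
    visible = []
    for resource_id in candidate_ids:
        try:
            allowed = can_read(username, resource_id)
        except Exception:
            # Fail closed: if the external check breaks, hide the result.
            allowed = False
        if allowed:
            visible.append(resource_id)
    return visible


# Stand-in for a pre-existing security framework (assumption for illustration).
_FILE_SERVER_ACL = {
    "hr/salaries.xls": {"alice"},
    "public/handbook.pdf": {"alice", "bob"},
}


def file_server_can_read(username: str, resource_id: str) -> bool:
    # Raises KeyError for unknown resources; the caller treats that as "deny".
    return username in _FILE_SERVER_ACL[resource_id]


if __name__ == "__main__":
    candidates = ["hr/salaries.xls", "public/handbook.pdf"]
    print(secure_results("bob", candidates, file_server_can_read))
```

Because the search layer holds no copy of the permissions, there is nothing to synchronize; the trade-off is one external check per candidate result at query time.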

So how should organizations proceed when considering deploying a search engine? Here are a few simple steps to handle the security conundrum:

1. Collect all necessary information beforehand to define the specifications as exhaustively as possible. Design choices need to be made early on. Determine the different dimensions of the search security requirements: username and security group domains, ACL granularity (domains, documents and content), freshness requirements, etc.

2. Create a test environment that is as realistic as possible and able to demonstrate the feasibility of all the security aspects identified in the specifications.

3. Test before you buy! Do not buy a product on paper; test it in a real environment to see if it can handle all of your security requirements. Pay great attention to the ease of configuration, the likelihood of committing configuration errors and the expertise of the support staff.

4. Test before you launch. A search engine is a great tool to expose security holes in your pre-existing framework. Before deploying your search application organization-wide, employ a small set of trusted users to try to identify security problems and resolve them before the launch.

5. Monitor the use and the performance of the application and modify its configuration accordingly.

6. Enjoy the power of an organization-wide, secure, fast and well configured search engine!
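
The pre-launch testing in step 4 can be partly automated. The following is a rough sketch, assuming the search application can be called as a function taking a query and a username: it runs a list of sensitive probe queries as a low-privilege test account and reports any query that returns results. `audit_search`, `PROBE_QUERIES` and `fake_search` are hypothetical names, and `fake_search` is only a stand-in for a real deployment.

```python
def audit_search(search_fn, probe_queries, test_user):
    """Run each probe query as a low-privilege user and collect any hits."""
    findings = {}
    for query in probe_queries:
        results = search_fn(query, test_user)
        if results:
            # Any hit for a sensitive query is a finding to review by hand.
            findings[query] = list(results)
    return findings


# Probe queries taken from the examples earlier in this article.
PROBE_QUERIES = ["list of passwords", "social security numbers", "employee reviews"]


def fake_search(query, user):
    # Stand-in for the real search application (assumption for illustration):
    # it leaks one file for one query regardless of who is asking.
    leaked = {"list of passwords": ["passwords.txt"]}
    return leaked.get(query, [])


if __name__ == "__main__":
    for query, hits in audit_search(fake_search, PROBE_QUERIES, "intern").items():
        print(f"Review needed: {query!r} returned {hits}")
```

An empty report does not prove the deployment is secure; it only shows that these particular probes found nothing for this particular account.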

Biography
Jerome Pesenti is chief scientist and co-founder of Vivisimo, Inc., a leading provider of search software and expertise. He was previously a visiting scientist at Carnegie Mellon's computer science department, carrying out research on document clustering and data mining. He can be contacted at jpesenti@vivisimo.com.
