Featured in REMOTE Site & Equipment Magazine - August/September 2002

Managing Remote Sites as One, Unified Facilities Network



Ten years ago, it was common for mission-critical IT systems to be concentrated in large, centralized 'raised floor' centers, which were manned around the clock. Today, the benefits of distributed networks have produced exponential growth in the number and geographic diversity of mission-critical sites. Now, a 6'x6' data closet can host critical server and router assets. This new distributed topology translates into more sites to manage, many of them 'dark sites' and many in places where 7x24 staffing is financially prohibitive. These mission-critical remote sites include branch offices, Telco closets, POPs, cellular towers, call centers, and outsourced colocation facilities. Not only has this growth in remote sites gone unmatched by corresponding growth in the facility staff who physically watch over the sites and react to network-threatening facilities issues, but service-level expectations have also grown to a daunting 99.999% uptime.
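To put the 99.999% figure in perspective, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name and the choice of an average 365.25-day year are assumptions made here, not part of any product or standard. It shows that "five nines" leaves roughly five minutes of downtime per year.

```python
# Illustrative only: converts an uptime target into a yearly downtime budget.
MINUTES_PER_YEAR = 365.25 * 24 * 60  # assumed average year, including leap days


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the minutes of downtime per year allowed by an uptime target."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(target)
        print(f"{target}% uptime allows about {minutes:.1f} minutes of downtime per year")
```

Seen this way, a single unattended cooling failure at a dark site can consume the entire yearly budget.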

Internet Service Providers (ISPs) are an extreme example of the distributed mission-critical environment. Typically, an ISP has at least two larger data centers, a couple of critical call centers and many hundreds of unmanned "points of presence" (POPs) distributed throughout area codes and prefixes. A customer dials into the local POP in order to gain access to the network and the Internet. Each POP becomes an extension of the main mission-critical data center. If the equipment fails in one of these small equipment rooms, the customer loses service. Retail operations, including banks and pharmacies, follow the same topology.

With the distributed nature of IT networks and systems, availability will be increasingly dependent on the remote sites, as well as on an in-depth understanding of the weakest link. Securing and managing each and every remote site--especially the weakest link--is imperative in order to ensure overall IT network and system availability.
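The weakest-link point can be made concrete with a small calculation. The sketch below assumes that a customer's service depends on every link in a chain being up (for example, the local POP and the main data center) and that the links fail independently; both are simplifying assumptions, and the availability figures are hypothetical.

```python
from math import prod


def chain_availability(link_availabilities):
    """Availability of a path in which every link must be up.

    Assumes independent failures, so the result is the product of the
    individual availabilities (each expressed as a fraction from 0 to 1).
    """
    links = list(link_availabilities)
    if not links:
        raise ValueError("at least one link is required")
    if any(not 0.0 <= a <= 1.0 for a in links):
        raise ValueError("each availability must be between 0 and 1")
    return prod(links)


if __name__ == "__main__":
    data_center = 0.99999  # hypothetical: a well-protected main data center
    remote_pop = 0.999     # hypothetical: an unmanned POP with little monitoring
    end_to_end = chain_availability([data_center, remote_pop])
    print(f"End-to-end availability: {end_to_end:.5f}")
```

Under these assumptions the end-to-end figure can never exceed that of the weakest site, no matter how well the data center itself is protected.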

The starting point is understanding the root causes of network and system downtime and the techniques or tools that help avoid such costly events. This involves not just monitoring a network of simple inputs, as dry contacts provide today, but more importantly providing easy access to, and detailed visibility of, every sensory input available at the machine level. With complete access to all the sensory inputs, companies can create a virtual presence for managing critical remote equipment without the cost of being there.

This leveraged "virtual omnipresence" has already been established enterprise-wide within corporate data networks. The boom in network and systems management software sales, and more recently in the specific segment of root cause software, is evidence of the worldwide appetite for solutions to measure, assess, predict and, ultimately, avoid network and system outages. Sales of these software packages are in the tens of billions of dollars annually by all survey accounts, and the packages have become critical for IT groups that manage IT uptime globally from one location.

Despite our best efforts and the best IT software management packages, failures occur. Why? For starters, studies are beginning to show that 30% to 50% of IT failures have a root cause in power, fire, environmental and physical security problems, not in the hardware and software monitored by the fancy network and systems management software packages. Apparently, we have only been looking at half of the uptime equation. Despite the changes in the facilities management model over the last ten years, our tools have not kept pace with the change in monitoring demands. We don't do enough to manage our distributed IT and facility ecosystem as a single unified network.

Obstacles in Creating One Unified Facilities Network

Along with the exponential growth of remote sites, there has been a proliferation of incompatible equipment, standards, and methods. To begin with, the setup, management, and maintenance of the equipment vary from vendor to vendor. Even equipment from different product lines within the same vendor family can be incompatible. Given that a typical enterprise has hundreds or thousands of equipment components from many different vendors, one can readily see the challenge in cost-effectively managing remote site equipment.

Meanwhile, the technology that is currently installed at many facilities was developed using the industrial automation technology of the 1970s and 1980s. Traditionally, these workstation-centric, modem-oriented systems take too long and cost too much to install and operate. Additionally, for dark sites, the workstation itself creates a single point of failure that is intolerable by modern standards. If the PC freezes and there is an alarm, no one can be notified. Moreover, the incompatibilities among equipment and systems create enormous inefficiencies, making them difficult to scale. As a result, both the building automation systems and the network management systems fall short in their ability to unite all these remote machinery assets because of their inherent technology limitations.

Imagine trying to extract information in a uniform way from all the remote equipment in order to track utilization, perform failure analysis, predict performance, and anticipate replacement. Until now, the only option was customized systems integration. A lot of time and money is spent creating and supporting custom connections among the various types of equipment. Even more time and resources are spent just trying to extract information from each of these proprietary systems.

Bottom line: with traditional technology, there is no single, common language to unite all the equipment at these remote sites, and hence no way to achieve operating efficiencies and leverage across them.
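What a "common language" might look like can be sketched in code. The example below is purely hypothetical: the two vendor formats, the field names, and the tolerance limits are invented for illustration and do not describe any real product, including e-Guardian. It shows two incompatible device reports being translated into one shared record type, so that a single out-of-tolerance check can cover both.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """One normalized measurement, whatever equipment it came from."""
    site: str
    equipment: str
    metric: str
    value: float
    unit: str


def from_vendor_a(payload: dict) -> Reading:
    # Hypothetical vendor A: a UPS that reports a dictionary of fields.
    return Reading(payload["site"], "ups", "battery_charge",
                   float(payload["batt_pct"]), "percent")


def from_vendor_b(line: str) -> Reading:
    # Hypothetical vendor B: an air conditioner that reports "site;temp_f;value".
    site, _field, raw = line.split(";")
    celsius = (float(raw) - 32.0) * 5.0 / 9.0  # convert Fahrenheit to Celsius
    return Reading(site, "hvac", "temperature", round(celsius, 1), "celsius")


def out_of_tolerance(readings, limits):
    """Return the readings whose value is outside the (low, high) range for its metric."""
    flagged = []
    for reading in readings:
        low, high = limits[reading.metric]
        if not low <= reading.value <= high:
            flagged.append(reading)
    return flagged


if __name__ == "__main__":
    readings = [
        from_vendor_a({"site": "POP-12", "batt_pct": "41"}),
        from_vendor_b("POP-12;temp_f;95.0"),
    ]
    # Assumed limits, for illustration only.
    limits = {"battery_charge": (50.0, 100.0), "temperature": (15.0, 30.0)}
    for reading in out_of_tolerance(readings, limits):
        print(f"{reading.site}: {reading.equipment} {reading.metric} = "
              f"{reading.value} {reading.unit}")
```

The design point is that only the small per-vendor adapters know about proprietary formats; everything downstream works on one record type.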

Foundation for Creating One Unified Network

One can take the seven-layer OSI model and envision a foundational layer upon which the IT network and systems depend. We refer to this critical facilities foundation as the Zero Layer, and it includes the critical power, environmental, fire safety, space, and physical security machinery.

What is interesting is that most corporations today lack rapid, remote visibility into these foundational elements, despite their billions of dollars in network and systems management packages and root cause engines, and despite the acknowledgement that up to half of all failures can be attributed to Zero Layer failure.

Thus the distributed IT network and system has not been effectively managed as an enterprise, and this disjointedness, caused by a lack of root cause understanding and of tools to manage and assess these causes at the facilities foundational layer, has led to famous disasters. Service providers, Internet companies, banks and manufacturers alike have spent time in the newspaper because of outages caused by failed generators during rolling blackouts, water leaks or simply failed air cooling at unmanned sites.

Keep in mind that software tools can do only so much to avoid disasters within the IT network and systems, but there is value in being able to rapidly assess the viability of the different foundational layer components after a disaster. For example, how long might it now take your company to assess the viability of its facilities systems after a major earthquake? The answer to that question is almost certainly different from the answer to this one: how long would it take to assess the viability of your network connection after the same natural disaster? If the distributed IT network and systems were managed as one unified network, the answers would match, because you would have remote, unified, global visibility into all areas of the facilities foundational layer, including power, fire, environmental and physical security systems, just as you do into server health.

Emerging Solutions for Creating One Unified Network

New tools such as NetBrowser Communications' e-Guardian are now coming to the market to begin to address this forgotten piece, the facilities foundation layer. They acknowledge that there is indeed an interconnectedness between the seven-layer OSI network model and the more mundane inputs that make it possible: power, fire, environmental and physical security.

Built on new-era architectures that directly monitor and predict the health and well-being of the foundational layer and link it to the rest of the distributed IT network and systems, these tools promise to provide the same visibility into the IT enterprise that a CFO would expect of his or her financial system.

One perspective shared recently is "Solutions such as NetBrowser's dramatically improve infrastructure system availability. They do so by improving the key parameters that measure availability - Mean Time To Failure (MTTF) and Mean Time To Restore (MTTR). This allows enterprises to achieve high availability in their mission-critical infrastructure for a cost that is small in relation to the value of the investment." - David DiQuinzio, President, Strategic Facilities Inc.
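The two measures named in the quote are commonly tied together by the textbook steady-state relation, availability = MTTF / (MTTF + MTTR). The sketch below applies that general relation; it is not a calculation taken from NetBrowser or Strategic Facilities, and the hour figures are hypothetical. It illustrates why faster restoration, for example through earlier notification of a facilities fault, raises availability even when the failure rate is unchanged.

```python
def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability from mean time to failure and mean time to restore."""
    if mttf_hours <= 0:
        raise ValueError("MTTF must be positive")
    if mttr_hours < 0:
        raise ValueError("MTTR cannot be negative")
    return mttf_hours / (mttf_hours + mttr_hours)


if __name__ == "__main__":
    mttf = 10_000.0  # hypothetical: hours between failures at a remote site
    # Hypothetical restore times: a slow truck roll versus a rapid, informed response.
    for mttr in (8.0, 1.0):
        print(f"MTTR {mttr:>4.1f} h -> availability {availability(mttf, mttr):.5%}")
```

Raising MTTF (fewer failures) and lowering MTTR (faster restoration) both push the ratio toward 1, which is the sense in which the quote treats them as the key parameters.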

Better Securing Of Unmanned Sites

For example, a leading telecommunications company had a $2.5M per year problem: at its remote telecom sites (Central Offices), trained thieves would break into the facility and steal telephone switch components for sale on the black market. Entering the facility, powering down a switch and removing the card took about seven minutes. Meanwhile, though simple door alarms were employed, the network operations team was so overwhelmed by nuisance alarms at over 650 sites that it ignored most of them out of sheer necessity. Once a stolen card was found to be the problem, the CCTV equipment had to be accessed locally and searched--a tedious and often unproductive exercise.

As a result, they searched for a solution that allowed them to manage their remote sites as one unified network and to replace the CCTV with real-time networked cameras.

The firm finally selected e-Guardian®, which allowed it to proactively safeguard its data and telecommunications equipment against physical intrusions at its remote unmanned sites with integrated, enterprise-class, real-time remote surveillance technology. For the first time, the team could see hundreds of cameras across all the remote sites from one browser. In addition, they were notified at network speeds of any specific physical intrusion, such as the telephone switch cabinet being accessed. Within one minute, the network team and local security team received images that fully documented the incident. Best of all, they eliminated the cost and inconvenience of the cumbersome closed-circuit television system at these unmanned sites. They were able to know what was happening as it happened, even on a wireless PDA.

Within one month, the Telco was able to pinpoint the intruders, and three individuals responsible for the equipment theft ring were subsequently apprehended. The notification occurred in real time and the police were dispatched to the site, all within the seven minutes it would normally take to carry out such a theft.

Better Ensuring Uptime and Availability Enterprise-Wide

In another case, a leading enterprise storage company needed to reduce outage risk and ensure high availability to meet its Service Level Agreements in its enterprise application outsourcing services. Its data centers support over forty corporate customers and contain over 175 terabytes of live customer data, on over 650 heterogeneous servers running eight different operating systems and five different databases.

A major technical hurdle that it faced was the lack of an adequate solution to monitor all its facilities equipment at different sites. First, its existing Windows-based monitoring solution was unable to unite all the facilities infrastructure equipment within its data centers under a single platform, thus limiting the visibility of information about the equipment's condition. Second, there was no way to access and view the data from remote locations without installing client software or dealing with cumbersome remote management tools. Third, the system was neither secure nor reliable, as it relied on modem-based communication with its glaring potential for network security breaches. Fourth, the system had limited scalability, requiring a server to be installed at each individual location in order to provide acceptable performance. Lastly, the system offered no backup or fail-safe mechanism in the event of component failure.

As a result, the system never performed adequately in alerting engineering and facilities staff to out-of-tolerance conditions. Additionally, the total cost of ownership was higher than expected, requiring a dedicated system administrator and significant IT support during installations and upgrades. Any new software update had to be applied to each individual client and server. With its zero-tolerance position on system failures and the need to guarantee the highest availability to its customers (per their SLAs), the company needed to find an alternative system.

By adopting a solution such as e-Guardian®, the company gained a scalable enterprise solution that enabled its data centers to unite all of their facilities machinery into one manageable, proactive system accessible from anywhere at any time. With this solution, all of its engineering, facilities and NOC personnel can use and benefit from a single solution that monitors all their mission-critical equipment and facilities. Moreover, its employees receive a unified view of all facilities assets from anywhere, using a simple web browser or wireless PDA. The solution provides data encryption to ensure network security, bandwidth-friendly communication that optimizes performance, and the ability to work through the firewall. Finally, it provides full scalability and robust N+1 redundancy.

In conclusion, proactively managing your remote sites as one complete enterprise network can help businesses avoid the costly outages and downtime associated with unplanned failures in their facilities machinery. Getting to this information is a challenging task that few companies ever accomplish, especially given today's external pressures and implementation obstacles. As companies seek solutions to automate their entire infrastructure, they must look holistically at all the key requirements and leverage the benefits of new technologies coming to the market over the next few years.

Jonathan Buckley (jbuckley@netbrowser.com) is the VP, Marketing and Business Development of NetBrowser Communications. NetBrowser has pioneered and patented an enterprise monitoring software suite, e-Guardian®, for what it calls The Zero Layer™, or the facility foundations layer upon which critical IT systems depend. NetBrowser's Fortune 1000 customer base has plenty of stories of how they avoided disasters using this new technology.

 

Requirements for Managing Remote Sites as One Unified Network

What are some of the requirements for safeguarding remote sites and unlocking more value when managing them as one unified network?

Make sure your solution: incorporates software tools that provide a unified view of all the power, fire, environmental and physical security equipment in all parts of your enterprise.
So that: you manage all of the foundational equipment in one cost-effective, integrated fashion regardless of the number of locations. IT and Facilities can understand the status of remote site health without delay, through one portal.

Make sure your solution: scales from the enterprise right down to the computer rack and back up again, making sure that not even a Telco closet becomes a missing link in the chain, and is managed through one portal accessible from anywhere.
So that: you cover the entire input level of the supply chain rather than only selected components, to reduce downtime and cost.

Make sure your solution: allows flexibility in accessing the system data.
So that: you eliminate machine vendor dependency, save money, and have one view of all vendors, types and locations of equipment.

Make sure your solution: provides the highest level of data security.
So that: the corporate IT security department is at ease, with no dependency on modems, no security holes in the firewalls, and no users coming from outside the firewall to access information.

Make sure your solution: works without hogging much bandwidth on your corporate network.
So that: you avoid the additional cost of dedicated data lines and conflicts with IT over traffic concerns. You can grow the solution without worrying about future network growth needs.

Make sure your solution: provides the same level of N+1 redundancy you expect from critical network systems.
So that: you get a dependable monitoring solution that monitors and safeguards your data, no matter what happens.

Make sure your solution: provides for ease of installation and maintenance.
So that: you get a dependable monitoring solution that monitors and safeguards your data, no matter what happens.

 


Reprinted by permission from REMOTE Site & Equipment Magazine, August/September 2002 issue.