By Jonathan Buckley
Ten years ago, it was common for mission-critical IT systems to be concentrated
in large, centralized 'raised floor' centers, which were manned around
the clock. Today, the benefits of distributed networks have resulted in
an exponential growth in the number and geographic diversity of mission-critical
sites. Now, a 6'x6' data closet can be the host to critical server and
router assets. This new distributed topology translates into more sites
to manage, many of them 'dark sites', many of them in places financially
prohibitive to man 7x24. This proliferation of mission-critical remote
sites includes branch offices, Telco closets, POPs, cellular towers, call
centers, and outsourced collocation facilities. This growth in remote sites
has not been matched by corresponding growth in the facility staff who
physically watch over the sites and react to network-threatening facilities
issues. At the same time, the service demand standard has grown to a
daunting 99.999% uptime.
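
To put the 99.999% figure in perspective, an uptime target can be converted
into a yearly downtime budget. The short Python sketch below is an
illustration only; the function name and the 365-day year are assumptions made
for this example and do not come from any monitoring product.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, assuming a non-leap year


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Return the yearly downtime budget, in minutes, for an uptime target."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be a percentage between 0 and 100")
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return MINUTES_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    # Compare common uptime targets, ending with "five nines".
    for target in (99.0, 99.9, 99.99, 99.999):
        budget = allowed_downtime_minutes(target)
        print(f"{target}% uptime allows {budget:.2f} minutes of downtime per year")
```

Under these assumptions, a 99.999% target leaves a little over five minutes
of total downtime per year, which is less time than it takes to drive to most
unmanned sites.
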
Internet Service Providers (ISPs) are an extreme example of the distributed
mission-critical environment. Typically, an ISP has at least two larger
data centers, a couple of critical call centers, and many hundreds of
unmanned "points of presence" (POPs) distributed throughout area codes
and prefixes. A customer dials into the local POP in order to gain access
to the network and Internet. Each POP becomes an extension of the main
mission-critical data center. If the equipment fails in one of these small
equipment rooms, the customer loses service. Retail operations, including
banks and pharmacies, follow the same topology.
Given the distributed nature of IT networks and systems, availability will
depend increasingly on remote sites and on an in-depth understanding
of the weakest link. Securing and managing each and every remote site,
especially the weakest link, is imperative in order to ensure overall IT
network and system availability.
The starting point is understanding the root causes of network and system
downtime, along with the techniques and tools that help avoid such costly
outages. This involves not just monitoring a network of simple inputs, as
provided by dry contacts today, but more importantly providing easy access
to, and detailed visibility into, every sensor input available at the
machine level. With complete access to all of these inputs, companies can
create a virtual presence for managing critical remote equipment without
the cost of being there.
This leveraged "virtual omnipresence" has already been established
enterprise-wide within corporate data networks. The boom in sales of network
and systems management software and, more recently, of root cause software
is evidence of the worldwide appetite for solutions to measure, assess,
predict and, ultimately, avoid network and system outages. Sales
of these software packages are in the tens of billions of dollars annually
by all survey accounts, and the packages have become critical for IT groups
that manage IT uptime globally from one location.
Despite our best efforts and the best IT software management packages,
failures occur. Why? For starters, studies are beginning to show that
30% to 50% of IT failures have a root cause in power, fire, environmental
and physical security problems, not the hardware and software monitored
by the fancy network and systems management software packages. Apparently,
we have only been looking at half the uptime equation. Despite the changes
in the facilities management model over the last ten years, our tools have
not kept pace with the change in monitoring demands. We don't do enough
to manage our distributed IT and facility ecosystem as a single unified
network.
Obstacles in Creating One Unified Facilities Network
Along with the exponential growth of remote sites, there has been a proliferation
of incompatible equipment, standards, and methods. To begin with, the
setup, management, and maintenance of the equipment vary for each vendor.
Even equipment from different product lines within the same vendor family
can be incompatible. Given that a typical enterprise has hundreds or thousands
of equipment components from many different vendors, one can readily see
the challenge in cost-effectively managing remote site equipment.
Meanwhile, the technology that is currently installed at many facilities
was developed using industrial automation technology of the 1970s and 1980s.
Traditionally, these workstation-centric, modem-oriented systems take
too long and cost too much to install and operate. Additionally, for dark
sites, the workstation itself creates a single point of failure found
to be intolerable by modern standards. If the PC freezes and there is
an alarm, no one can be notified. Moreover, the incompatibilities among
equipment and systems create enormous inefficiencies and make these
installations difficult to scale. As a result, both the building automation
systems and the network management systems fall short in their ability to
unite all these remote machinery assets, due to their inherent technology
limitations.
Imagine trying to extract information in a uniform way from all the remote
equipment, in order to track utilization, perform failure analysis, predict
performance, and anticipate replacement. Until now, the only option was
customized systems integration. A lot of time and money is spent creating
and supporting custom connections among the various types of equipment.
Even more time and resources are spent just trying to extract information
from each of these proprietary systems.
Bottom line: with traditional technology, there is no single, common language
to unite all the equipment at these remote sites, and hence no way to
achieve operating efficiencies and leverage across them.
Foundation for Creating One Unified Network
One can take the seven-layer OSI model and envision a foundational layer
upon which the IT network and systems depend. We refer to this critical
facilities foundation as the Zero Layer, and it includes the critical
power, environmental, fire safety, space, and physical security machinery.
What is interesting is that most corporations today lack rapid, remote
visibility into these foundational elements, despite their billions of
dollars in network and systems management packages and root cause engines,
and despite the acknowledgement that up to half of all failures can be
attributed to Zero Layer failure.
Thus the distributed IT network and system has not been effectively managed
as an enterprise. This disjointedness, due to a lack of root cause
understanding and of tools to manage and assess these causes at the
facilities foundational layer, has led to famous disasters. Service
providers, Internet companies, banks and manufacturers alike have spent
time in the newspaper because of outages due to failed generators during
rolling blackouts, water leaks, or simply failed air cooling at unmanned
sites.
Keep in mind that there is only so much that software tools can do to help
avoid disasters within the IT network and systems, but there is value in
being able to rapidly assess the viability of the different foundational
layer components post-disaster. For example, how long might it now take for
your company to assess the viability of its facilities systems after a
major earthquake? The answer to that question is almost certainly different
from the answer to this one: how long would it take to assess the viability
of your network connection after the same natural disaster? If the
distributed IT network and systems were managed as one unified network, the
answers would match, because you would have remote, unified, global
visibility into all areas of the facilities foundational layer, including
power, fire, environmental and physical security systems, just as you do
for server health.
Emerging Solutions for Creating One Unified Network
New tools such as NetBrowser Communications' e-Guardian are now coming
to the market to begin to address this forgotten piece of the facilities
foundation layer. They acknowledge that there is indeed an
interconnectedness between the seven-layer OSI network model and
the more mundane inputs that make it possible: power, fire, environmental
and physical security.
Built on new-era architectures that directly monitor and predict the health
and well-being of the foundational layer and link it to the rest of
the distributed IT network and systems, these tools now hold promise to
provide the same visibility into the IT enterprise as a CFO would expect
of his or her financial system.
One perspective shared recently: "Solutions such as NetBrowser's
dramatically improve infrastructure system availability. They do so by
improving the key parameters that measure availability - Mean Time To
Failure (MTTF) and Mean Time To Restore (MTTR). This allows enterprises
to achieve high availability in their mission-critical infrastructure
for a cost that is small in relation to the value of the investment."
- David DiQuinzio, President, Strategic Facilities Inc.
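
The two parameters named in the quote are commonly combined into a
steady-state availability figure, availability = MTTF / (MTTF + MTTR). The
Python sketch below shows how shortening the time to restore raises
availability even when the failure rate is unchanged. The function name and
the sample figures (10,000 hours between failures; 4 hours versus 30 minutes
to restore) are hypothetical and are not taken from NetBrowser or from the
quoted source.

```python
def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: the fraction of time a system is up."""
    if mttf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTTF must be positive and MTTR must not be negative")
    return mttf_hours / (mttf_hours + mttr_hours)


if __name__ == "__main__":
    # Hypothetical figures, chosen only to illustrate the formula.
    mttf = 10_000.0      # mean hours between failures
    slow_restore = 4.0   # hours to restore when staff must travel to the site
    fast_restore = 0.5   # hours to restore when the fault is diagnosed remotely

    for label, mttr in (("slow restore", slow_restore),
                        ("fast restore", fast_restore)):
        print(f"{label}: {availability(mttf, mttr):.5%} available")
```

The design point is that remote visibility mainly attacks MTTR: the equipment
fails just as often, but the fault is located and repaired sooner.
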
Better Securing Of Unmanned Sites
For example, a leading telecommunications company had a $2.5M per year
problem: at its remote telecom sites (Central Offices), trained thieves
would break into the facility and steal telephone switch components for
sale on the black market. Entering the facility, powering down
a switch and removing the card took about seven minutes. Meanwhile, though
simple door alarms were employed, the network operations team was so
overwhelmed by nuisance alarms at over 650 sites that it ignored most out of
sheer necessity. When a stolen card was found to be the cause of a problem,
CCTV equipment had to be accessed locally and searched, a tedious and often
unproductive exercise.
As a result, the company searched for a solution that would allow it to manage
its remote sites as one unified network and to replace the CCTV with
real-time networked cameras.
The firm finally selected e-Guardian®, which allowed it to proactively
safeguard its data and telecommunications equipment against physical
intrusions at its remote unmanned sites with integrated, enterprise-class,
real-time remote surveillance technology. For the first time, staff could
see hundreds of cameras across all the remote sites from one browser.
In addition, they were notified of any specific physical intrusion, for
example when a telephone switch cabinet was accessed, all at network speeds.
Within one minute, the network team and local security team received images
that fully documented the incident. Best of all, the firm eliminated the
cost and the problems of the cumbersome closed-circuit television system at
these unmanned sites. Staff were able to know what was happening when it
happened, even on a wireless PDA.
Within one month, the Telco was able to pinpoint the intruders, and three
individuals responsible for the equipment theft ring were subsequently
apprehended. The notification occurred in real time and the police were
dispatched to the site, all within the seven minutes it would normally take
to carry out such a theft.
Better Ensuring Uptime and Availability Enterprise-Wide
In another case, a leading enterprise storage company needed to reduce
outage risk and ensure high availability to meet its Service Level Agreements
in its enterprise application outsourcing services. Its data centers support
over forty corporate customers and contain over 175 terabytes of live
customer data, on over 650 heterogeneous servers running eight different
operating systems and five different databases.
A major technical hurdle that it faced was the lack of an adequate solution
to monitor all its facilities equipment at different sites. First, its
existing Windows-based monitoring solution was unable to unite all the
facilities infrastructure equipment within its data centers under a
single platform, thus limiting the visibility of information about the
equipment's condition. Second, there was no way to access and view the
data from remote locations without installing client software or dealing
with cumbersome remote management tools. Third, the system was neither
secure nor reliable, as it utilized modem-based communication with its
glaring potential for network security breaches. Fourth, the system had
limited scalability, requiring a server to be installed at each individual
location in order to provide acceptable performance. Lastly, the system
offered no backup or fail-safe mechanism in the event of component failure.
As a result, the system never performed adequately in alerting engineering
and facilities staff to out-of-tolerance conditions. Additionally, the
total cost of ownership was higher than expected, requiring a special system
administrator and significant IT support during installations and upgrades.
Any new software updates needed to be completed on each individual client
and server. With its zero-tolerance position on system failures and the
need to guarantee the highest availability to its customers (per its
SLAs), the company needed to find an alternative system.
By adopting a solution such as e-Guardian®, it was able to provide
a scalable enterprise solution that enabled its data center to unite all
of its facilities machinery into one manageable, proactive system accessible
from anywhere at any time. With this solution, all of its engineering,
facilities and NOC personnel can use and receive the benefits of a
single solution that monitors all their mission-critical equipment and
facilities. Moreover, its employees receive a unified view of all facilities
assets from anywhere, using a simple web browser or wireless PDA. The solution
provides data encryption to ensure network security, bandwidth-friendly
communication that optimizes performance, and the ability to navigate the
firewall. Finally, it provides full scalability and robust N+1 redundancy.
In conclusion, proactively managing remote sites as one complete
enterprise network can help businesses avoid the costly outages and downtime
associated with unplanned failure in their facilities machinery. Getting
to this information is a challenging task that few companies ever accomplish,
especially given the external pressures and implementation obstacles today.
As companies seek solutions to automate their entire infrastructure, they
must look holistically at all the key requirements and leverage the benefits
of new technologies coming to the market over the next few years.
Jonathan Buckley (jbuckley@netbrowser.com)
is the VP, Marketing and Business Development of NetBrowser Communications.
NetBrowser has pioneered and patented an enterprise monitoring software
suite, e-Guardian®, for what it calls The Zero Layer, or the
facility foundations layer upon which critical IT systems depend. NetBrowser's
Fortune 1000 customer base has plenty of stories of how they avoided disasters
using this new technology.
Requirements for Managing Remote Sites as One Unified Network
What are some of the requirements for safeguarding and better unlocking
the value when managing remote sites as one unified network?

| Make sure your solution | So that |
| --- | --- |
| Incorporates software tools that provide a unified view of all the power, fire, environmental and physical security equipment in all parts of your enterprise. | You manage all of the foundational equipment in one cost-effective, integrated fashion regardless of the number of locations. IT and Facilities can understand, without delay, the status of remote site health through one portal. |
| Scales from the enterprise right down to the computer rack and back up again, making sure that not even a Telco closet becomes a missing link in the chain, and can be managed through one portal accessible from anywhere. | You cover the entire input level of the supply chain rather than only selective components, to reduce downtime and cost. |
| Allows flexibility in accessing the system data. | You eliminate machine vendor dependency, save money, and have one view of all vendors, types and locations of equipment. |
| Provides the highest level of data security. | The corporate IT security department is at ease: there is no dependency on modems, no security holes in the firewalls, and no users coming from outside the firewall to access information. |
| Works without hogging much bandwidth on your corporate network. | You avoid the additional cost of dedicated data lines and conflicts with IT over traffic concerns, and you can grow the solution without worrying about future network growth needs. |
| Provides the same level of N+1 redundancy you expect from critical network systems. | You get a dependable monitoring solution that monitors and safeguards your data, no matter what happens. |
| Provides for ease of installation and maintenance. | You get a dependable monitoring solution that monitors and safeguards your data, no matter what happens. |