exposures e.g. fire, flood, earthquake at once per 250
years each, means that external factors will limit the facility
to at best Class 0.75.
MTech’s fault tree analysis of electric and cooling systems
cannot determine Class ratings without important limits
and assumptions. Our studies have shown that so-called
preventive maintenance is a major, in some cases
dominant, cause of system failure. A realistic “budget” for
the Class rating of a very high performance facility might
include the following terms:
Facility Power and Cooling Systems Unreliability:   Class 0.5
Maintenance-induced Failures:                       Class 0.5
Sum of fires, floods, other 250-year events:        Class 1
Facility Class rating:                              Class 2
This example shows a useful property of Class ratings:
they can be added, so long as each term represents
separate threats to the facility.
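The budget arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not MTech's fault tree method: the dictionary keys and function names are hypothetical, and the second function assumes the threats are statistically independent, which the text does not state. Under that assumption the simple sum slightly overstates the combined Class, and the difference is negligible for small terms.

```python
import math

# Illustrative budget from the text. Each term is a Class value,
# i.e. a percent probability of failure over one year of operation.
budget = {
    "power_and_cooling": 0.5,
    "maintenance_induced": 0.5,
    "external_250_year_events": 1.0,
}


def class_sum(terms):
    """Add Class terms directly, as in the budget above."""
    return sum(terms)


def class_independent(terms):
    """Combine Class terms assuming independent threats (our assumption).

    The facility survives the year only if it survives every threat.
    """
    survive = math.prod(1.0 - c / 100.0 for c in terms)
    return 100.0 * (1.0 - survive)


simple = class_sum(budget.values())         # 2.0
exact = class_independent(budget.values())  # slightly below 2.0

print(f"Simple sum:        Class {simple:.2f}")
print(f"Independent model: Class {exact:.2f}")
```

For terms this small the two results agree to within about 0.02, which is why adding Class terms is a practical shortcut.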
Another useful attribute is comparison between facilities.
A Class 5 facility is 5 times more likely to fail than a Class
1 facility, and 10 times more likely to fail than a Class 0.5
facility.
Reliability cannot be compared so easily. A Class 5 facility has
95% reliability for one year, a Class 1 facility has 99%. The
difference in reliability is only 4%. It is tempting to say that
a Class 1 facility is five times more reliable than a Class 5
facility, but that is incorrect.
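The comparison can be checked numerically. The sketch below uses hypothetical function names and takes one-year reliability in percent to be 100 minus the Class, consistent with the 95% and 99% figures quoted above.

```python
def reliability_pct(facility_class):
    """One-year reliability in percent for a given Class."""
    return 100.0 - facility_class


def failure_ratio(class_a, class_b):
    """How many times more likely facility A is to fail than facility B."""
    return class_a / class_b


def reliability_ratio(class_a, class_b):
    """Ratio of the one-year reliability of facility A to that of facility B."""
    return reliability_pct(class_a) / reliability_pct(class_b)


# Class 5 versus Class 1: failure likelihood differs by a factor of 5.
fail_5_vs_1 = failure_ratio(5.0, 1.0)

# Class 1 versus Class 5: reliability differs by a factor of 99/95 only.
rel_1_vs_5 = reliability_ratio(1.0, 5.0)

print(f"Failure ratio:     {fail_5_vs_1:.2f}")
print(f"Reliability ratio: {rel_1_vs_5:.3f}")
```

The failure ratio is 5, while the reliability ratio is only about 1.04, which is why ratios of reliability figures say little about relative risk.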
DETERMINING CRITICAL FACILITY CLASS
Calculation with fault tree analysis has many useful
attributes, but is not the only way to predict or measure
critical facility Class. Reliability block diagrams, when used
appropriately, can perform similar calculations.
5
Measurement and observation may be used. A single
facility that operates for some years without any failures
might be tempted to claim “Class 0” performance, but
would risk rambunctious skepticism from informed
customers and damaged credibility. A more defensible
and conservative approach would be to estimate or
calculate the expected Class of the facility, and then use
statistical analysis to show that the years of zero downtime
are consistent with those claims within a given level of
statistical confidence.
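One simple way to frame that statistical check is sketched below. It assumes that each year is an independent trial with the same probability of failure, which is our simplification and not a method prescribed in this article; the function names are hypothetical. The first function gives the chance that a facility of a given Class runs failure-free for some number of years. The second gives the largest Class that is still consistent with an observed failure-free record at a chosen one-sided confidence level.

```python
def prob_zero_failures(facility_class, years):
    """Probability of no failure in `years` consecutive years.

    Assumes independent years with a constant annual failure probability.
    """
    p = facility_class / 100.0
    return (1.0 - p) ** years


def class_upper_bound(failure_free_years, confidence=0.95):
    """Largest Class consistent with zero failures at the given confidence.

    Solves (1 - p) ** n = 1 - confidence for p and reports it in percent.
    """
    alpha = 1.0 - confidence
    return 100.0 * (1.0 - alpha ** (1.0 / failure_free_years))


# A Class 1 facility is quite likely to show ten clean years ...
p_clean = prob_zero_failures(1.0, 10)

# ... but ten clean years alone only bound the Class loosely.
bound_10y = class_upper_bound(10)

print(f"P(no failure in 10 years | Class 1) = {p_clean:.3f}")
print(f"95% upper bound on Class after 10 clean years = {bound_10y:.1f}")
```

Under these assumptions, ten failure-free years are consistent with a Class 1 claim but cannot by themselves demonstrate it: at 95% confidence they only rule out Classes above roughly 26. This is why a calculated or estimated Class, supported by the operating record, is more defensible than a claim based on the record alone.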
Owners and operators of large fleets of data centers can
collect and publish data supporting Class ratings of their
entire fleet. A co-location provider who could offer
potential customers evidence of fleet Class performance
along with best- and worst-case facilities in the fleet would
have a significant competitive advantage over a single
facility owner, who must collect their data much more
slowly.
Original equipment manufacturers can also employ Class
ratings, so long as they carefully define the assumptions
and limits of the claim. MTech’s client Active Power
recently published a white paper showing that the
unreliability for the CleanSource 750HD UPS with
extended runtime was 0.36% for short outages (less than
10 seconds). That claim might be expressed as “The
CleanSource 750HD provides a Class 0.5 component for
critical facility electric power systems subjected to 1 short
outage per year.” Because the 0.36% figure makes no
allowance for scheduled maintenance, Class 0.5 is
offered as a likely example.
An important component of Class rating is the frequency
of demands placed upon the system. Demand failures are
events that require standby systems to operate, or active
systems to change operating state, often by switching. If a
standby diesel/generator set has a 1% probability of not
starting when utility power fails, a facility with a single
generator cannot achieve Class 10 performance if there
are 10 or more utility outages per year. In a highly reliable
urban network distribution system, it is entirely possible to
support a claim of Class 10 performance with no standby
generator at all.
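The generator example can be worked through as follows. This sketch considers only failure to start on demand, treats each utility outage as an independent demand, and ignores every other failure mode; those are our simplifying assumptions, and the function name is hypothetical.

```python
def class_from_demand_failures(p_fail_on_demand, demands_per_year):
    """Class contribution (percent per year) from failures on demand.

    Assumes each demand is independent and that a single failed demand
    counts as a facility failure for the year.
    """
    survive = (1.0 - p_fail_on_demand) ** demands_per_year
    return 100.0 * (1.0 - survive)


# Single generator with a 1% chance of failing to start.
one_outage = class_from_demand_failures(0.01, 1)
ten_outages = class_from_demand_failures(0.01, 10)

print(f"1 outage per year:   Class {one_outage:.2f}")
print(f"10 outages per year: Class {ten_outages:.2f}")
```

With ten outages per year the start failures alone contribute a little under Class 10 (the simple product 10 x 1% gives 10), leaving almost no room in the budget for any other cause of failure. With one outage per year the same generator contributes about Class 1.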
A credible Class rating must disclose all assumptions and
data used to produce the claim, including the frequency
and duration of utility outages, assumptions about on-site
and off-site fuel supplies, allowance for failures produced
by maintenance and other sources of human error, and
exclusions of certain events such as major storms,
earthquakes, or fires.
MTech proposes the Class metric as an aid for discussion
and improvement of mission critical facility performance. It
can be used in design, operation, maintenance, and
failure analysis. Class, the probability of failure over a full
year of operation, is related to but distinct from reliability.
MTech makes no claim of intellectual property rights to
the term Class or its use to characterize the performance
of critical facilities. We encourage the use, discussion,
criticism, and debate of Class as a critical facility
performance metric. Those who wish to use Class to
characterize their facilities should describe in detail the
methods, assumptions, and limits that were used to
produce the claim.
7X24 MAGAZINE SPRING 2015