Introduction

 

Customers who are considering Unified Messaging (UM) will at some stage ask: "How many Exchange 2007 servers running the UM role will I need to support my users?" The short answer is: "That depends." This blog post illustrates the factors on which the answer depends, and in so doing gives a longer and more helpful answer.

 

Factors Influencing the Number of Users Supported

 

Supply: Availability of Resources

The principal resource that is in demand is the UM server's ability to sustain concurrent conversations (sessions) with telephone callers. Every voice-over-IP session requires the UM server to dedicate some network bandwidth, processing time and virtual memory. All of these are in limited supply, and so the UM server can handle only a finite number of telephone calls.

 

Strictly speaking, packet-switched (IP) networks do not exhibit the same characteristics as circuit-switched networks, and are not generally susceptible to the same kinds of traffic analysis. However, Exchange Server 2007 Unified Messaging imposes an absolute limit (which the administrator can configure, if required) on the number of concurrent calls that a given UM server will accept.

 

Given this limit, and assuming that the UM server is able to sustain this load, well-established techniques can be applied to the question of UM sizing. In particular, Erlang traffic analysis can be used to estimate the number of concurrent calls required to support the UM user population. This provides a mathematical model of competition for a limited number of ports. Such analysis has been used for many years for circuit-switched telephony equipment.
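
As a rough illustration of the kind of model involved, the sketch below computes the Erlang B blocking probability. The choice of Erlang B is an assumption: the post only says "Erlang traffic analysis", but it later states that calls are not queued, which is the situation Erlang B describes. The function name erlang_b and its parameters are illustrative only and are not part of any Exchange tool.

```python
def erlang_b(offered_erlangs: float, ports: int) -> float:
    """Return the probability that a new call finds all `ports` busy.

    offered_erlangs is the average number of simultaneous calls that callers
    would generate if no call were ever blocked (arrival rate x mean duration).
    Assumes the Erlang B model: blocked calls are cleared, not queued.
    """
    if offered_erlangs < 0 or ports < 0:
        raise ValueError("traffic and port count must be non-negative")
    blocking = 1.0  # with zero ports, every call is blocked
    for n in range(1, ports + 1):
        # Standard recurrence: B(A, n) = A*B(A, n-1) / (n + A*B(A, n-1))
        blocking = offered_erlangs * blocking / (n + offered_erlangs * blocking)
    return blocking


if __name__ == "__main__":
    # Two ports offered two erlangs of traffic: 40% of calls are blocked.
    print(f"{erlang_b(2.0, 2):.3f}")
```

The recurrence is used instead of the textbook factorial formula because it stays numerically stable for the port counts (tens to hundreds) discussed in this post.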

 

If the number of concurrent calls required is greater than that supported by a single UM server, the scale-out architecture offered by Exchange Server 2007 UM allows the capacity to be increased by adding UM servers to the Dial Plan.

 

Demand: Usage by UM-Enabled Users

UM-enabled users can consume UM server communications resources in several ways:

·         Directly

-        By calling into the UM pilot number, logging into their mailboxes, and accessing their messages, calendar, contacts and/or the directory

-        By using a UM server (under the control of Outlook or OWA) to play back voice content on a telephone

·         Indirectly

-        Each time that a caller to UM uses the system to identify the user and then:

   -         Transfers the call to the user's phone

   -         Sends a voice message (without calling the user's phone)

-        Each time that someone calls the user, fails to reach them and instead:

   -         Leaves a voice message via UM

   -         Leaves a fax message via UM

 

Demand: Usage by Unauthenticated Callers

People who call into UM over the phone, but do not log into a mailbox, are unauthenticated callers. UM resources may be used in servicing their requests. Some of this usage has already been described above as "indirect" usage on behalf of UM-enabled users. However, there may also be usage that cannot be associated with UM-enabled users.

 

Each time that a caller to UM uses the system to identify a non-UM-enabled user (who may appear in the directory as a user or a contact) they may:

-        Transfer the call to the user's phone

-        Send a voice message to the user

-        Use an Automated Attendant to transfer to another number, another Auto Attendant, or listen to recorded audio

 

Demand: Busy Hour and Probability of Not Finding a Resource

In most real-life systems, demand will not be distributed evenly through the day. There will be periods when the demand is low (often at night and in the early morning), and periods of high demand (perhaps shortly after business hours begin, and after lunch). Traditionally, this has been simplified to a single statistic: the fraction (or percentage) of calls arriving in the busiest hour of the day.

 

If calls arrive with equal probability during the day, the busy hour (like all other hours) will receive (100 × 1 / 24)% = 4.2% of those calls.

 

A figure of 14% for calls arriving in the busy hour is considered fairly typical. Here, about one-seventh of all the day's calls arrive in just one hour.

 

UM system managers should try to arrange good service for callers. The most important metric is the fraction of calls offered to UM that actually become connected. Rather than expensively over-provisioning the system, an administrator may wish to choose a specific target level of service. This equates to a probability that, at times of greatest demand, a caller will receive a busy tone because UM has reached its limit of concurrent calls.

 

Traffic analysis calculations typically specify both the busy hour occupancy (e.g. 14%) and the probability (P) of finding all "ports" busy.  A value of P = 0.01 means that a caller trying to reach UM in the busiest hour of the day will, on average, have a 1% probability of getting a busy tone instead. Higher values of P therefore correspond to lower levels of service.
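
To make the meaning of P concrete, the following sketch finds the largest amount of busy-hour traffic (in erlangs) that a given number of ports can carry while keeping the probability of a busy tone at or below P. It assumes the Erlang B model (blocked calls are cleared rather than queued); the function names and the bisection search are illustrative choices, not something prescribed by Exchange.

```python
def erlang_b(offered_erlangs: float, ports: int) -> float:
    """Erlang B blocking probability, computed with the standard recurrence."""
    blocking = 1.0
    for n in range(1, ports + 1):
        blocking = offered_erlangs * blocking / (n + offered_erlangs * blocking)
    return blocking


def max_offered_traffic(ports: int, target_blocking: float) -> float:
    """Largest offered traffic (erlangs) with blocking <= target_blocking."""
    if ports < 1 or not 0.0 < target_blocking < 1.0:
        raise ValueError("need ports >= 1 and 0 < target_blocking < 1")
    low, high = 0.0, float(ports)
    # Blocking rises with traffic, so widen the bracket until it is exceeded.
    while erlang_b(high, ports) < target_blocking:
        high *= 2.0
    # Bisect: `low` always satisfies the target, `high` never beats it.
    for _ in range(60):
        mid = (low + high) / 2.0
        if erlang_b(mid, ports) > target_blocking:
            high = mid
        else:
            low = mid
    return low


if __name__ == "__main__":
    for p in (0.01, 0.05):
        print(p, round(max_offered_traffic(100, p), 1))
```

A higher P admits more traffic on the same number of ports, which is the trade-off between cost and level of service described above.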

 

Sample Calculations

 

To illustrate the sort of results that are predicted, Figure 1 shows the results computed for the number of users supported, under various conditions, as a function of the maximum number of concurrent calls allowed. These calculations assume that calls to UM are not queued; either they are answered as soon as possible after they are offered, or the caller receives a "busy" indication and must hang up and retry. This corresponds to the behavior of the majority of PBXs.
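
A calculation of this kind can be sketched as follows. This is not the code used to produce the figures; it is a minimal reconstruction that assumes the Erlang B model (which matches the "no queueing" assumption) and takes the per-user busy-hour usage figures of 33.6 and 100.8 seconds from the assumptions table later in this post. The function names are hypothetical.

```python
def erlang_b(offered_erlangs: float, ports: int) -> float:
    """Erlang B blocking probability, computed with the standard recurrence."""
    blocking = 1.0
    for n in range(1, ports + 1):
        blocking = offered_erlangs * blocking / (n + offered_erlangs * blocking)
    return blocking


def users_supported(ports: int, target_blocking: float,
                    busy_hour_seconds_per_user: float) -> int:
    """Largest user count whose busy-hour traffic keeps blocking <= target."""
    def blocking_for(users: int) -> float:
        # Offered traffic in erlangs = total busy-hour call seconds / 3600.
        return erlang_b(users * busy_hour_seconds_per_user / 3600.0, ports)

    low, high = 0, 1
    # Double the upper bound until it violates the service level.
    while blocking_for(high) <= target_blocking:
        low, high = high, high * 2
    # Integer bisection: `low` meets the target, `high` does not.
    while high - low > 1:
        mid = (low + high) // 2
        if blocking_for(mid) <= target_blocking:
            low = mid
        else:
            high = mid
    return low


if __name__ == "__main__":
    for ports in (10, 20, 40, 60, 100):
        light = users_supported(ports, 0.01, 33.6)
        heavy = users_supported(ports, 0.01, 100.8)
        print(ports, light, heavy)
```

Running the loop for P = 0.01 and P = 0.05 gives one pair of values per usage profile at each port count, which is the shape of data the figures plot.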

 

Two basic kinds of system are illustrated: one with "light" usage (upper pair of curves), and one with "heavy" usage (lower pair of curves). The pairing of the curves shows that the effect of variations of service level (P = 0.01; P = 0.05) is fairly minor compared to variations in call traffic.

 

 

Figure 1. Sample Calculations (Small Systems)

 

Clearly, UM can support a user population that is considerably bigger than the number of concurrent calls it can handle. This is easy to understand; users (or callers trying to reach them) don't spend all their time on the phone.

 

Figure 1 was calculated according to the following assumptions:

Assumption                                                         "Light"   "Heavy"
-----------------------------------------------------------------  -------   -------
Average number of call-answered voice messages per user per day        4         8
Average duration of call-answered voice message, seconds              25        25
Average duration of greeting, seconds                                  5         5
Average number of fax messages per user per day                        1         2
Average duration of fax call, seconds                                 60        60
Average number of UM logins per user per day                         0.5         3
Average duration of UM user session, seconds                         120       120
Average usage per user in busy hour (14%), seconds                  33.6     100.8
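
The last row of these assumptions follows from the others. The sketch below reproduces the arithmetic, assuming that each call-answered voice message occupies a port for the greeting plus the message; that assumption is inferred from the fact that it reproduces the 33.6 and 100.8 second figures. The class and field names are illustrative only.

```python
from dataclasses import dataclass

BUSY_HOUR_FRACTION = 0.14  # share of the day's calls arriving in the busy hour


@dataclass(frozen=True)
class UsageProfile:
    voice_messages_per_day: float
    voice_message_seconds: float
    greeting_seconds: float
    fax_messages_per_day: float
    fax_call_seconds: float
    logins_per_day: float
    session_seconds: float

    def seconds_per_day(self) -> float:
        """Total time per user per day that a UM port is occupied."""
        # Assumption: a voice message call lasts greeting + message.
        voice = self.voice_messages_per_day * (
            self.greeting_seconds + self.voice_message_seconds)
        fax = self.fax_messages_per_day * self.fax_call_seconds
        logins = self.logins_per_day * self.session_seconds
        return voice + fax + logins

    def busy_hour_seconds(self, fraction: float = BUSY_HOUR_FRACTION) -> float:
        """Port time per user that falls within the busy hour."""
        return fraction * self.seconds_per_day()


LIGHT = UsageProfile(4, 25, 5, 1, 60, 0.5, 120)
HEAVY = UsageProfile(8, 25, 5, 2, 60, 3, 120)

if __name__ == "__main__":
    for name, profile in (("light", LIGHT), ("heavy", HEAVY)):
        print(name, profile.seconds_per_day(),
              round(profile.busy_hour_seconds(), 1))
```

Dividing the busy-hour seconds for the whole user population by 3600 gives the offered traffic in erlangs, which is the input that Erlang calculations need.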

 

In "light" usage, each user receives a handful of voice and fax messages per day, and only rarely logs in over the phone (on average, once every two days). The assumption is that they mostly use a mail client (Outlook or OWA) to access these messages. Play-on-phone is not included, so voice messages are assumed to be played back with local multimedia.

 

In "heavy" usage, users treat the system as "experts": they log in several times a day (to check messages, calendar etc.) and receive more call-answered messages, too. This approximates fairly heavy system use, and the number of users supported is correspondingly smaller.

 

Naturally, there are scenarios that fall between these curves, and some that even represent lighter or heavier usage.  There is no allowance here for Automated Attendants, which also compete for resources.

 

 

Figure 2. Sample Calculations (Larger Systems)

 

Figure 2 shows the same calculations, extended to larger numbers of concurrent calls. The two "heavy" usage curves are very close together; the higher quality of service (P = 0.01) is the lower of the curves.

 

At this scale, the number of supported users is approximately proportional to the maximum number of concurrent calls. The non-linear portion of the curves (which was visible in Figure 1) is confined to the smallest systems.

 

The key statistic is the average fraction of the busy hour that each user (or caller trying to reach a user) spends interacting with UM over the phone. It does not matter how this time is spent (e.g. a user listening to e-mail, or the system receiving a call-answered fax for the user), only how much time is spent.

 

Each UM Server should support a maximum of 60 - 100 concurrent calls. (More precise details will be available by the time that Exchange Server 2007 is released). For large systems, the calculations above suggest that the number of UM-enabled users supported per UM server is of the order of 3000 (heavy usage) to 10000 (light usage).
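
Putting the pieces together, a first estimate of the number of UM servers for a given user population might be sketched as below. This rests on several assumptions that should be stated loudly: the Erlang B model; calls spread across the servers in the Dial Plan so that their ports behave as one pool; and a per-server limit of 100 concurrent calls, which is the upper end of the range quoted above rather than a confirmed figure. The function names and the spare_servers parameter are hypothetical.

```python
import math


def ports_needed(offered_erlangs: float, target_blocking: float) -> int:
    """Smallest port count whose Erlang B blocking is <= target_blocking."""
    if not 0.0 < target_blocking < 1.0:
        raise ValueError("target_blocking must be between 0 and 1")
    if offered_erlangs <= 0:
        return 0
    blocking, ports = 1.0, 0
    # Add one port at a time using the Erlang B recurrence.
    while blocking > target_blocking:
        ports += 1
        blocking = offered_erlangs * blocking / (ports + offered_erlangs * blocking)
    return ports


def um_servers_needed(users: int, busy_hour_seconds_per_user: float,
                      target_blocking: float = 0.01,
                      calls_per_server: int = 100,   # assumed limit
                      spare_servers: int = 0) -> int:
    """Estimate the UM server count, treating all server ports as one pool."""
    offered = users * busy_hour_seconds_per_user / 3600.0  # erlangs
    ports = ports_needed(offered, target_blocking)
    return math.ceil(ports / calls_per_server) + spare_servers


if __name__ == "__main__":
    # 9000 "heavy" users (100.8 busy-hour seconds each), P = 0.01.
    print(um_servers_needed(9000, 100.8))
    # The same population with one extra server for availability.
    print(um_servers_needed(9000, 100.8, spare_servers=1))
```

If calls are not shared evenly between servers, each server behaves as a smaller independent pool and the estimate becomes optimistic, so the result is best treated as a lower bound.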

 

Caveat

 

The calculations shown are only intended to illustrate plausible patterns of usage. Each customer will be slightly different, and establishing the average patterns of usage should be a part of any effort to scope the required extent of UM deployment. Additionally, customers may wish to provide one or more "redundant" UM servers to increase overall system availability.

 

- Michael Wilson