Systems Engineering in Healthcare: Decision Making in the Face of Variation and Uncertainty

© 2004 Wayne G. Fischer, PhD



A major healthcare organization’s new ambulatory clinic building (ACB) was targeted to open four years after groundbreaking.  Five major outpatient clinics were scheduled to occupy this new facility, along with ancillary services.  Allocating space among these outpatient clinics in the new ACB was driven mainly by how many exam rooms each clinic would need by the move-in date.  The number of rooms is driven on two levels: 1) on a strategic level by the uncertain number of total patient visits forecasted for the move-in date, and five years after that; and 2) on an operational level by the varying number of patients seen daily and by the varying amount of time patients spend in the rooms.  Monte Carlo simulation was used to determine the critical output distribution: number of exam rooms for a targeted room utilization rate.  Risk was quantified as the percent of time each clinic would have to operate above the targeted room utilization, for a given number of rooms, in order to see all patients in a nominal eight-hour day.  Consequence was defined as the estimated cost (facilities and staff) due to that risk.

While the initial objective was to determine the number of exam rooms each clinic should have in the new ACB, the results prompted clinics to rethink their operations.  It became apparent that no clinic could continue to grow while maintaining the operational status quo, and expect that any reasonable number of rooms would suffice – especially five years after move-in.  Efforts were initiated to decrease variation and increase room utilization in the clinics.


Plans had been set and ground broken for the new Ambulatory Clinic Building (ACB).  Major occupants of the new facility would be five clinics.  Now came a critical question: “How sure are we that the space allocated to each clinic will meet their needs?  And not just at time of occupancy, but what about future needs – say, five years later?”

One major determinant of clinic space requirements is size and number of exam rooms.  Size is more or less fixed by equipment and furnishing needs. Estimates of each clinic’s number of rooms were initially based on five turns per day; however, this number was anecdotal, having been referenced in one literature article but with no supporting data.  Executive leadership wanted more confidence in space allocation, especially given that the prevailing mindset throughout all clinics was, “We need more rooms.”

The need for exam rooms depends on the number of patients seen each day and the length of time patients spend in the rooms.  And when future room needs are brought into the decision process, a third factor must be considered: forecasted annual patient visits.


The number, type, date, and time of each clinic’s patient visits are available in a database.  Strategic Planning and Medical Informatics periodically update forecasts of future patient visit volumes.  But no data were available for patient times in the exam rooms.  Performance Improvement (PI) decided direct observation with manual data capture was the most expedient approach.  PI designed a generic data collection sheet and worked with each clinic to customize it so that each patient type and time-in-room were captured, along with each caregiver type and time-in-room.

In order to strike a balance between cost of manual acquisition of data and the desire to capture a representative sample of patient types and times, five days of operation were observed – Monday through Friday.  Number of visits captured ranged from about 200 to over 600, depending on the clinic.  These sample sizes were deemed large enough to construct representative histograms of patient in-room times, to which continuous distributions would be fit for the Monte Carlo models (1).

For a sample of number of patient visits per day, the most recent three months of visits for each clinic were pulled from the database.  With the average time-in-room for each patient type, the numbers of visits of each patient type, and the number of exam rooms, monthly average utilizations could be calculated, assuming a nominal eight-hour clinic day:

(1)            Room time required = [ # patient visits ] x [ average time-in-room ]

(2)            Room time available = [ # exam rooms ] x [ 8 hours / day ] x [ # working days ]

(3)            % Utilization = 100 x [ Room time required ] / [ Room time available ]

Equation (1) is used for each type of patient visit, and the results summed for total Room time required.  These equations can be used to calculate the average utilization for any period of time – day, month, or year.
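
As a minimal sketch of Equations (1)-(3), the Python function below computes percent utilization for any period.  The function name, the visit types, and all numbers in the usage lines are hypothetical values chosen for illustration, not data from the study; average times-in-room are assumed to be expressed in hours.

```python
def percent_utilization(visits_by_type, avg_hours_by_type, n_rooms,
                        n_working_days, hours_per_day=8.0):
    """Percent room utilization per Equations (1)-(3)."""
    # Equation (1), applied to each visit type and summed
    room_time_required = sum(
        visits_by_type[t] * avg_hours_by_type[t] for t in visits_by_type
    )
    # Equation (2): nominal eight-hour clinic day by default
    room_time_available = n_rooms * hours_per_day * n_working_days
    # Equation (3)
    return 100.0 * room_time_required / room_time_available


# Hypothetical month: two visit types, 10 rooms, 20 working days
visits = {"new": 400, "follow_up": 1200}
avg_hours = {"new": 1.0, "follow_up": 0.5}
utilization = percent_utilization(visits, avg_hours,
                                  n_rooms=10, n_working_days=20)
print(f"{utilization:.1f}%")  # 1000 room-hours required of 1600 available
```

The same function covers a day, a month, or a year simply by changing the visit counts and the number of working days.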

Average room utilizations can always be calculated, even for a future point in time.  But when the decision must take into account multiple points in the future (as almost all planning decisions should), averages do not tell the whole story and, worse, can be very misleading (2).  This is because averages do not convey the impact of the variation that is a very real part of all work processes, including those of healthcare.  And in this particular instance there are three sources of variation that must be accounted for in estimating the number of exam rooms for the future:

  • Patient times-in-room
  • Number of patient visits / day
  • Number of patient visits forecasted for the future
    – move-in and five years after for this work

While the first two arise as a natural consequence of the variation in daily operations, the third is due to uncertainty of the future rather than to normal operations.  Those who engage in forecasting any aspect of the future know a particular result is dependent on a set of assumptions about that future, sometimes called a “scenario.”  Change any assumption and the “future” changes.  Also, numerical projections are dependent on the models used to project – for example, regression versus time-series.  In addition, in our case we had two sources of historical data for input – so each clinic had multiple values of forecasted visits for the move-in year and for five years after.  We assumed the various forecasted annual visits were equally likely – resulting in what’s known as a uniform distribution (Figure 3 for Clinic “A”).

To assess the impact of variation on operations, we must quantitatively take into account each source of variation.  This is not accomplished by calculating “worst case,” “most likely,” and “best case” scenarios – wherein the “worst,” “most likely,” and “best” values of key variables are presumed to occur simultaneously.  Reality rarely – if ever – allows such simplistic outcomes.  And even if the intent is to bound the range of all possible outcomes (however improbable), this approach is of no use in understanding the tradeoffs between risk and consequence that are caused by system variation.

A much more useful representation of reality is Monte Carlo simulation (1).  Every input variable that has significant variation or uncertainty is represented by a histogram or probability distribution.  Each of these distributions is randomly sampled for a single value, and this set of values is used to calculate the values of the output variables.  This process of random sampling and calculation is repeated hundreds of times (typically 500-1000), allowing probability distributions to be built for the output variables.  [This assumes the input variables vary independently of one another.  If this is not the case, correlation coefficients can be used within the Monte Carlo simulation to take any linear association into account (3).  Palisade Corporation’s @RISK (4, 5) was used for all the Monte Carlo modeling and simulations in this work.]

Results – Utilizations

Table 1 gives the average monthly room utilizations and turns for the five clinics originally scheduled to move into the new ACB.  Both are given – and should be tracked together in any set of key operational metrics – because while intuition may suggest they are positively correlated, they sometimes are not.  For example, one patient in an exam room for eight hours is a room utilization of 100% but a turn of only one.  On the other hand, ten patients treated in four hours (with the room empty the other four) is a turn of 10 but a room utilization of only 50%.
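
The two examples above can be checked with a small sketch.  The helper name is hypothetical, and the ten patients are assumed (for illustration only) to spend an equal 24 minutes each in the room, which totals the four hours given in the text.

```python
def room_day_metrics(visit_minutes, minutes_per_day=480):
    """Return (turns, percent utilization) for one room on one day.

    visit_minutes: minutes each patient occupied the room.
    minutes_per_day: nominal eight-hour day = 480 minutes.
    """
    turns = len(visit_minutes)
    utilization = 100.0 * sum(visit_minutes) / minutes_per_day
    return turns, utilization


one_long_visit = room_day_metrics([480])        # one patient, eight hours
ten_short_visits = room_day_metrics([24] * 10)  # ten patients, four hours total
print(one_long_visit)    # high utilization, low turns
print(ten_short_visits)  # high turns, lower utilization
```

Tracking the pair of numbers together is what exposes the difference between a room that is busy and a room that is productive.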

Table 1.  Average Monthly Room Utilization / Turns (at time of study)


Month 1        Month 2        Month 3
48% / 4.0      50% / 4.2      50% / 4.2
53% / 4.6      59% / 5.1      57% / 4.9
51% / 2.9      49% / 2.8      49% / 2.8
27% / 1.9      28% / 2.0      27% / 1.9
42% / 4.8      40% / 4.6      40% / 4.6

As an example of how averages can be misleading, Table 2 shows the average room utilization on a daily basis for Clinic “A.”

Table 2.  Average Daily Room Utilization for Clinic “A” (at time of study)

Day / Date        Room Utilization        Turns

Notice the difference in ranges of the two averages.  Since monthly averages are over a longer time period than daily averages, and averaging is a “smoothing” operation, the wider scatter in daily averages is to be expected.  [But workers function in systems that vary hour-to-hour, sometimes minute-to-minute.  This is one reason “averages” can be so misleading in process improvement efforts – and why the nemesis of variation must be continually attacked using the science of variation: statistics.  A good place to start is with histograms and Statistical Process Control charts for key variables (6,7).]
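
The smoothing effect of averaging can be illustrated with a short sketch.  The daily utilizations below are randomly generated from an assumed normal distribution (mean 50%, standard deviation 12%); they are hypothetical and are not the Clinic “A” data.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical daily utilizations (%) for 3 months of 20 working days each
daily = [random.gauss(50, 12) for _ in range(60)]

# Monthly averages: one mean per block of 20 working days
monthly = [statistics.mean(daily[i:i + 20]) for i in range(0, 60, 20)]

daily_range = max(daily) - min(daily)
monthly_range = max(monthly) - min(monthly)
print(f"daily range:   {daily_range:.1f} points")
print(f"monthly range: {monthly_range:.1f} points")
```

Because each monthly value is an average of the daily values within it, the monthly range can never exceed the daily range, and in practice it is far narrower.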

In order to give staff a visual representation of the week’s worth of patient times-in-room data that were collected for each of their clinics, a “tile plot” was created using Excel (Figure 1).   Each row represents an exam room of a clinic and each column equals five minutes.  Each row (“room”) extends across columns representing “hours” from 7am to 5pm.  [Clinic days obviously run longer – until all patients are seen.]  The red segments represent periods when patients were in the exam rooms alone or with one or more caregivers.  White segments signify empty room periods, light gray signifies unscheduled rooms (not used), and dark gray segments are scheduled rooms that were used but not observed.

Figure 1.  Tile plot for Clinic “A” patient times-in-room – Monday through Friday

This tile plot is typical of the more than twenty clinics that were ultimately analyzed: lots of “white space.”  To break the results down even further, a second tile plot was constructed, Figure 2.

Figure 2.  Tile plot for Clinic “A” patient alone in room (gray) and patient with one or more caregivers (red) – Monday

This tile plot is also typical of all the other clinics: the exam rooms (unintentionally) too often serve as a second waiting area – the effective utilizations are much lower than those calculated.

[A third tile plot used different colors for each type of caregiver to “decompose” the red space of Figure 2 into the actual times of caregivers in the rooms with patients.]

Results – Distributions of Required Rooms for Given Utilizations

Figures 3-5 show the histograms and fitted probability distributions of Clinic “A” for the three sources of variation described above.  To determine the desired output distribution, these steps of the Monte Carlo simulation are followed:

  1. Select year (move-in or five years after) and assumed room utilization
  2. Randomly select single value of Patients / Year from its distribution (Figure 3), then “center” the Patients / Day distribution (Figure 4) around the average (= Patients per Year / 250 working days)
  3. Randomly select Patients / Day and Patient Time in Room (Figure 5) from their respective distributions
  4. Using rearranged versions of Equations 1-3, calculate required number of rooms
  5. Iterate steps 2-4 for 1000 “days”
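
The steps above can be sketched in Python as follows.  This is an illustration under stated assumptions, not the @RISK model used in the study: the function name, the continuous uniform range for annual visits, the resampling of observed daily counts re-centered on the forecast average, the lognormal time-in-room, and every numeric input are hypothetical.  Step 4 uses Equations (1)-(3) solved for the number of rooms on a single day: rooms = (patients x hours in room) / (8 hours x target utilization).

```python
import math
import random

WORKING_DAYS_PER_YEAR = 250   # from step 2
HOURS_PER_DAY = 8.0           # nominal clinic day


def simulate_required_rooms(annual_low, annual_high, daily_sample,
                            time_mu, time_sigma, target_utilization,
                            n_days=1000, seed=None):
    """Return a list of required-room counts, one per simulated day.

    Assumed (illustrative) input distributions: uniform annual visits,
    observed daily counts re-centered on the forecast daily average,
    and lognormal time-in-room in hours.
    """
    rng = random.Random(seed)
    sample_mean = sum(daily_sample) / len(daily_sample)
    # Deviations of the observed daily counts from their own average
    deviations = [d - sample_mean for d in daily_sample]

    rooms = []
    for _ in range(n_days):                       # step 5
        # Step 2: draw annual visits, then center the daily distribution
        annual_visits = rng.uniform(annual_low, annual_high)
        daily_mean = annual_visits / WORKING_DAYS_PER_YEAR
        # Step 3: draw patients/day and time-in-room
        patients = max(0.0, daily_mean + rng.choice(deviations))
        hours_in_room = rng.lognormvariate(time_mu, time_sigma)
        # Step 4: Equations (1)-(3) rearranged for number of rooms
        required = (patients * hours_in_room
                    / (HOURS_PER_DAY * target_utilization))
        rooms.append(math.ceil(required))
    return rooms


# Step 1: choose the year (via the forecast range) and the utilization.
# All inputs below are hypothetical, not the study's data.
rooms = simulate_required_rooms(
    annual_low=30000, annual_high=40000,
    daily_sample=[110, 125, 140, 150, 135, 120, 160, 145],
    time_mu=math.log(1.0), time_sigma=0.3,
    target_utilization=0.50, n_days=1000, seed=42,
)
print(min(rooms), max(rooms))
```

The sketch follows the listed steps literally, drawing one time-in-room value per simulated day; a histogram of the returned list plays the role of the output distributions in Figures 6-9.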

Figures 6-9 show the resulting distributions of required exam rooms for Clinic “A” at move-in and five years after, for assumed average room utilizations of 50% and 65%.

Figure 3.  Clinic “A” uniform distribution of patient visits / year (move-in year)

Figure 4.  Clinic “A” sample distribution of patients / day (year of study)

Figure 5.  Clinic “A” sample distribution of patient times in exam rooms (year of study)

Figure 6.  Clinic “A” Calculated Required # of Exam Rooms for 50% Utilization (move-in year)

Figure 7.  Clinic “A” Calculated Required # of Exam Rooms for 65% Utilization (move-in year)

Figure 8.  Clinic “A” Calculated Required # of Exam Rooms for 50% Utilization (five years after move-in)

Figure 9.  Clinic “A” Calculated Required # of Exam Rooms for 65% Utilization (five years after move-in)

Discussion – Rooms, Risk, & Consequences

As would be expected, the average number of rooms required increases with increasing patients seen, and decreases with increasing utilization.  But the real story is the effects of the variation and uncertainty on the required number of exam rooms.

If the total area under any distribution is considered as encompassing 100% of all possible outcome values, then we can talk about the effects of our decisions in terms of risks and the consequences should any of those risks actually occur.  For example, if we wanted only a 5% risk of not seeing all patients in any given eight-hour day during the move-in year while maintaining a 50% room utilization rate (Figure 6), we select the value on the horizontal axis in Figure 6 that would leave to its right 5% of the total area under the curve.  This value is 53 exam rooms.  At the time this work was being done, Clinic “A” was slated to get 44 rooms, incurring a 10% risk (or one day out of every two weeks – on average) of not seeing every patient on any given nominal eight-hour day.
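
Reading a room count off the distribution for a chosen risk, and the reverse, can be sketched as follows.  The two function names are hypothetical, and the list of simulated values is a stand-in chosen for easy checking, not output from the study’s model.

```python
def risk_for_rooms(simulated_rooms, n_rooms):
    """Fraction of simulated days that need more than n_rooms."""
    over = sum(1 for r in simulated_rooms if r > n_rooms)
    return over / len(simulated_rooms)


def rooms_for_risk(simulated_rooms, risk):
    """Smallest simulated room count whose risk is at most `risk`."""
    for candidate in sorted(set(simulated_rooms)):
        if risk_for_rooms(simulated_rooms, candidate) <= risk:
            return candidate
    return max(simulated_rooms)  # reached only if risk is negative


# Stand-in simulation output: 100 "days" needing 1, 2, ..., 100 rooms
simulated = list(range(1, 101))
print(rooms_for_risk(simulated, 0.05))  # rooms leaving 5% of days to the right
print(risk_for_rooms(simulated, 90))    # risk if only 90 rooms are provided
```

The first function corresponds to picking the value on the horizontal axis that leaves the chosen area to its right; the second corresponds to asking what risk a contemplated number of rooms implies.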

If room utilization is maintained at 65% during the move-in year (Figure 7), only 41 rooms are needed to have a 5% risk of not seeing every patient on any given day.  If the actual number of rooms is the contemplated 44, the risk falls to 3%.

Five years after move-in, when more patients are forecasted, more rooms are needed to maintain the same level of risk: 77 rooms for 5% risk at 50% utilization (Figure 8), but only 59 rooms at 65% utilization (Figure 9).  The contemplated 44 rooms would yield risks of 32% and 15% at 50% (Figure 8) and 65% (Figure 9) utilization, respectively, of not seeing all patients in any given eight-hour clinic day.

The executive team set the 5% risk level.  Two important learning points presented themselves:

  1. Almost all decisions based on any data at all are implicitly taking on roughly 50% risk – because those data are almost always averages, and averages divide the area under their distributions approximately in half (exactly so when the distribution is symmetric).
  2. If we set our risk level at 5% and resource for that level of operations, then 95% of the time we will be over-resourced – incurring costs that are not productive.  [Not strictly true since the workers in the system will find other things to do to “keep busy,” but these other things will be of significantly less value to the patients (i.e., indirect care vs. direct care).]

And these two points bring us to consequences.  The more variation in our work systems, the worse the consequences in the very real sense of value to our customers, costs incurred, and revenue lost – no matter the desired risk level.  If we resource to maintain a low risk of not servicing all customers in a fixed length of time, we incur unproductive costs a high proportion of time.  If we resource to maintain a high risk of not servicing all customers in a fixed length of time (that is, to minimize unproductive costs due to underutilization of resources), we incur lost revenue.

This is precisely why, once the desired characteristics of products or services are defined, continuous improvement is all about, first, identifying and removing Special Causes of variation; second, adjusting system “aim” to the desired target; and finally, reducing Common Causes of variation to their economic minimum (8, 9).  Theoretically (if it were possible) we would like no variation in our work systems – then we’d always resource them at exactly the right level.

Table 3 shows the risk analysis result for all five clinics.  Target room utilization was fixed at either 50% or 65%, and risk level at 5%.  The last three columns of the table are number of rooms – those contemplated by Facilities Planning and those estimated by risk analysis for the move-in year and five years after.

Table 3.  Summary of Risk Analysis Results

The models can also be used to estimate the risk levels if the number of rooms contemplated by Facilities Planning were used:

Table 4.  Risks Associated with Contemplated Exam Rooms

Key Learnings

Besides the quantitative analyses that allowed a full appreciation of the effects of variation on use of clinic exam rooms (or any resource for that matter), other organizational learnings were realized:

  • Neither room turns nor % utilization alone is sufficient to comprehend the “productivity” of this resource.
  • Averages are not representative of the reality of daily operations of work systems.
  • The existing paradigm of clinic personnel, “We need more rooms!” was not supported by the results.
  • The objection “Don’t mess with my practice!” need not be an obstacle.  There is ample opportunity in the “white space” (room empty) and the “gray space” (patient alone in room).
  • The prior, unstated and not comprehended, risk tolerance of 50% was (initially at least) lowered to 5%.
  • A new focus was achieved: “Let’s improve before move-in!”
  • The risk analysis methodology (Monte Carlo simulation) was accepted as a valid representation of reality.  The VP sponsoring the study extended the work to two more phases: those clinics that would gain the space vacated by those moving to the ACB, and then all remaining clinics.

Some Key Assumptions

A great statistician and teacher (George E. P. Box) made the observation, “All models are wrong; some models are useful.”  This work is no exception.  The saying reminds us that no model perfectly reflects reality, but merely serves as an approximation to that reality.  We build models in the hope of gaining some insights into the interdependencies of complex systems.

The major assumptions underlying this work are:

  • Sampled or selected data are representative of actual variation in Patient Times and Patients per Day (i.e., major sources of variation are captured)
  • Patient Times are not significantly impacted by number of Patients (i.e., clinical and administrative practices remain essentially unchanged over the range of Patients per Day)
  • Variation in Patients per Day does not change with a change in its average (i.e., its distribution has the same shape over the range of forecasted volumes for the move-in year and five years after)


Performance improvement initiatives to reduce variation were launched in the original five clinics.  Unfortunately, when the analyses of the five clinics were repeated a year after the entire study was completed (it lasted a year because we ended up analyzing all of the outpatient clinics, more than 20), there was no change in the results.  A year after that, the analysis of Clinic “A” alone was repeated – no change.  A few years after that, to establish a baseline as part of a quality improvement project required for an internal education program, the Clinic “A” analysis was repeated again – still no change.  The moral: It takes more than incontrovertible evidence (i.e., data and the results of the analyses) and enthusiasm to make that proverbial horse drink…it takes a committed, focused, and adamant leadership – something many organizations still lack.


  1. Practical Management Science (2nd ed – chapters 11 and 12), Wayne L. Winston and S. Christian Albright, Duxbury (2001)
  2. “The Flaw of Averages: Decisions Based on Averages are Wrong on Average,” Sam Savage, Harvard Business Review, November 2002 (pp 20-21)
  3. Practical Management Science, op. cit., pp 600-603
  4. Palisade Corporation, 798 Cascadilla Street, Ithaca, NY 14850, (accessed 28 July 2012)
  5. Practical Management Science, op. cit., pp 582-596
  6. “Statistical Process Control as a Tool for Research and Healthcare Improvement,” Quality and Safety in Health Care, 2003 (v12, pp 458-464)
  7. Measuring Quality Improvement in Healthcare, Raymond G. Carey and Robert C. Lloyd, Quality Resources (1995)
  8. Improving Performance through Statistical Thinking, Galen C. Britz, Donald W. Emerling, Lynne B. Hare, Roger W. Hoerl, Stuart J. Janis, and Janice E. Shade, ASQ Quality Press (2000)
  9. Statistical Thinking: Improving Business Performance (chapters 1-3), Roger Hoerl and Ronald Snee, Duxbury (2002)


Wayne G. Fischer, PhD
Statistician – University of Texas Medical Branch, Galveston, TX 77555-0132


version 09.19.2012