HHDS.16 - Queueing Theory and Modeling

This is Chapter 16 of 50 in a summary of the textbook Handbook of Healthcare Delivery Systems.

Chapter 16 Summary
Queueing Theory and Modeling

Chapter Author
Linda V. Green - Columbia University

Some Commentary

Great chapter. Very practical and counter-intuitive observations.

The really big takeaway is that one cannot properly assess bed capacity based on occupancy levels alone. As occupancy (utilization) increases, the queue grows at an accelerating rate. In general, utilization above roughly 85% starts to produce sharply longer delays and queues.

1. Queues: An Introduction

A.K. Erlang developed queueing theory in 1904 to manage the Danish telephone system. By nature, it deals with delays, which occur when the “demand for a service and the capacity available to meet that demand” are mismatched.

One benefit of modeling queues is that little data (perhaps only 3 variables) is required, compared with the much more data-intensive types of healthcare modeling covered in past chapters.

The first models we will look at today assume a ‘steady state’, meaning that you can look at the model at any time and see, on average, the same delays. The last model, ‘time-varying demands’, looks at an example where the state changes throughout the day.

2. Queueing Theory Fundamentals

Queueing theory has two major sides:

  • customers: this is the side waiting; it can be customers, widgets, patients waiting for beds, etc.

  • servers: this is the side providing the service, such as a staff member, an MRI machine slot, etc.

The queue can be either ‘visible’ (an actual line people wait in) or ‘invisible’: waiting in line at the store vs. waiting on hold on the phone.

A queue may have an infinite ‘waiting room’, or it may be capped at a specific limit where customers walk away if the line is too long, or are unable to queue.

The queue can be processed as a single line, or with each server having their own queue in parallel.

A queue may be processed first-come, first-served (FCFS), or it may be based on priority. Priority queues can be either preemptive (a higher priority customer entering the queue interrupts a server who is seeing another customer) or non-preemptive (the higher priority customer waits until a server is free).

3. Utilization, Delays, and System Size

Utilization: “the average number of busy servers divided by the total number of servers x 100”

General Utilization Principles

  • the higher the utilization, the longer the wait list

  • as utilization increases, the queue grows faster

There is an ‘elbow’ or ‘kick of the curve’ where even small increases in utilization result in dramatic increases in queue length. The exact place where this occurs depends on:

  • variability: more variability in the system (in arrival times and service times) means queues build up at lower utilization

  • system size: larger systems are able to function at higher levels of utilization before they enter the ‘kick of the curve’.

“If variability is lower, the model will overestimate delays while the converse is true if variability is greater”
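To make the ‘kick of the curve’ concrete, here is a minimal sketch using the standard single-server (M/M/1) formula for the mean number of customers waiting, Lq = ρ² / (1 - ρ), where ρ is utilization. The single-server setting and the function name are assumptions chosen for illustration; they are not taken from the chapter.

```python
def mm1_mean_queue_length(utilization: float) -> float:
    """Mean number of customers waiting (not in service) in an M/M/1 queue."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1) for a stable queue")
    return utilization ** 2 / (1.0 - utilization)


# Print the average queue at a few utilization levels to show the elbow.
for rho in (0.50, 0.55, 0.70, 0.80, 0.85, 0.90, 0.95):
    print(f"utilization {rho:.0%}: mean queue length {mm1_mean_queue_length(rho):.2f}")
```

In this model, going from 90% to 95% utilization more than doubles the average queue, while going from 50% to 55% adds only a fraction of a customer.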

4. Simple Queueing Models

Poisson Process: used for many healthcare processes where patients arrive one at a time, at random intervals, and each arrival does not depend on other customers or on an external event.
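A Poisson process can be simulated by drawing independent, exponentially distributed gaps between arrivals. The sketch below does that with Python's standard library; the rate of 4 patients per hour, the 24-hour horizon, and the function name are hypothetical values picked only for illustration.

```python
import random


def poisson_arrival_times(rate_per_hour: float, hours: float, seed: int = 0) -> list[float]:
    """Simulate arrival times (in hours) of a Poisson process on [0, hours)."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    times: list[float] = []
    t = 0.0
    while True:
        # Gaps between consecutive Poisson arrivals are exponential.
        t += rng.expovariate(rate_per_hour)
        if t >= hours:
            return times
        times.append(t)


# Hypothetical example: 4 patients per hour on average over one day.
arrivals = poisson_arrival_times(rate_per_hour=4.0, hours=24.0)
print(len(arrivals), "simulated arrivals in 24 hours")
```

The count will hover around rate × hours (here 96) but differ from run to run when the seed changes, which is exactly the randomness the queueing models account for.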

M/M/s Model: relies on (1) Poisson process, (2) unlimited waiting room, (3) identical servers, (4) single queue. A desired ‘service level’ can be built into the calculation, such as “85% of calls answered within 20 seconds”.
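As a rough sketch of how such a service level can be checked, the code below uses the standard Erlang C formula for the probability of waiting in an M/M/s queue, and the first-come, first-served result P(wait > t) = C · exp(-(sμ - λ)t). The call volume and handling time are hypothetical, and the function names are illustrative rather than anything from the chapter.

```python
import math


def erlang_c(servers: int, offered_load: float) -> float:
    """Probability an arrival has to wait in an M/M/s queue (Erlang C)."""
    if offered_load >= servers:
        return 1.0  # unstable: the queue grows without bound
    blocking = 1.0  # Erlang B, built up recursively for numerical stability
    for k in range(1, servers + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    rho = offered_load / servers
    return blocking / (1.0 - rho * (1.0 - blocking))


def service_level(servers: int, arrival_rate: float, service_rate: float,
                  target_wait: float) -> float:
    """Fraction of customers whose wait is at most target_wait (M/M/s, FCFS)."""
    load = arrival_rate / service_rate
    if load >= servers:
        return 0.0
    tail = math.exp(-(servers * service_rate - arrival_rate) * target_wait)
    return 1.0 - erlang_c(servers, load) * tail


def servers_needed(arrival_rate: float, service_rate: float,
                   target_wait: float, target_level: float) -> int:
    """Smallest number of servers that meets the service-level target."""
    servers = 1
    while service_level(servers, arrival_rate, service_rate, target_wait) < target_level:
        servers += 1
    return servers


# Hypothetical call center: 2 calls per minute, 3-minute average call,
# target of 85% answered within 20 seconds (1/3 of a minute).
needed = servers_needed(arrival_rate=2.0, service_rate=1 / 3,
                        target_wait=1 / 3, target_level=0.85)
print("agents needed:", needed)
```

Only three inputs are needed (arrival rate, service rate, and number of servers), which is the low data requirement mentioned in the introduction.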

The M/M/s model can be extended to include ‘priorities’, which allows the user to calculate the number of physicians required “to assure that 90% of emergent and urgent patients will be seen by a physician within 45 minutes”.

Another way to extend the M/M/s model is to place a cap on the number of customers in the queue.

Queues in Healthcare

When a hospital wants to request more beds under American ‘Certificate of Need’ requirements, it must have an average occupancy level of at least 85%. In a prior chapter, 85% was also roughly the utilization at which factories start to look at expanding capacity.

5. Queues: Fixed Capacity

Fixed capacity models are typically useful when the ‘server’ is actually an object, such as a hospital bed, MRI slot, or operating room.

Note: many fixed capacity objects are actually ‘scheduled’, meaning that the concepts of queueing and Poisson processes do not entirely apply at baseline, because utilization is driven by the appointment schedule. The exception is when additional walk-in patients are added to fill unfilled appointment slots.

Example of Fixed Capacity Queues: The Obstetrics Unit

The chapter clearly demonstrates how reducing or increasing the number of beds in an obstetrics unit changes the probability that a patient is delayed in getting a bed because of a queue.

The chapter authors used baseline data from the unit to build their model: an average arrival rate of 14.8 patients per day, an average length of stay of 2.9 days, and 56 beds on the unit.

As you can see in the table below, the delays to get a bed increase dramatically as utilization rises above 80%.

The American College of Obstetricians and Gynecologists suggests a maximum occupancy level of 75% in the obstetrics unit. In this model, that corresponds to a probability of delay of 1-2%.
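The baseline figures quoted above are enough to reproduce this kind of calculation. The sketch below assumes an M/M/s model with beds as the servers and uses the standard Erlang C formula for the probability of delay; the chapter's own tables may rest on details not captured here, so the printed numbers should be treated as illustrative rather than as the chapter's results.

```python
ARRIVALS_PER_DAY = 14.8      # baseline arrival rate from the text
LENGTH_OF_STAY_DAYS = 2.9    # baseline average length of stay from the text
BASELINE_BEDS = 56           # baseline unit size from the text


def erlang_c(servers: int, offered_load: float) -> float:
    """Probability an arrival has to wait in an M/M/s queue (Erlang C)."""
    if offered_load >= servers:
        return 1.0  # unstable: demand exceeds capacity
    blocking = 1.0  # Erlang B, built up recursively
    for k in range(1, servers + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    rho = offered_load / servers
    return blocking / (1.0 - rho * (1.0 - blocking))


# Offered load = average number of beds in use if no one ever waited.
offered_load = ARRIVALS_PER_DAY * LENGTH_OF_STAY_DAYS


def utilization(beds: int) -> float:
    """Average occupancy as a fraction of available beds."""
    return offered_load / beds


# Vary the number of beds around the baseline and report occupancy and delay.
for beds in range(46, 67, 2):
    print(f"{beds} beds: utilization {utilization(beds):.1%}, "
          f"probability of delay {erlang_c(beds, offered_load):.1%}")
```

With the baseline of 56 beds, utilization works out to 14.8 × 2.9 / 56, or about 77%.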

The other consideration is that an increase in the rate of admission can have a dramatic effect on utilization, as seen below.

This second model demonstrates the effect of different size units, and the desired probability of delay.

Flexible beds as the solution to fixed capacity models

A medicine department with 100 flexible beds has a dramatically shorter queue than the same department divided into 10 individual subspecialty units.
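The pooling effect can be sketched with the same M/M/s (Erlang C) formula. The example assumes a hypothetical demand that keeps 80 beds busy on average, split evenly across the subspecialties, with identical lengths of stay and fully interchangeable beds; none of these numbers come from the chapter.

```python
def erlang_c(servers: int, offered_load: float) -> float:
    """Probability an arrival has to wait in an M/M/s queue (Erlang C)."""
    if offered_load >= servers:
        return 1.0  # unstable: demand exceeds capacity
    blocking = 1.0  # Erlang B, built up recursively
    for k in range(1, servers + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    rho = offered_load / servers
    return blocking / (1.0 - rho * (1.0 - blocking))


TOTAL_BEDS = 100
UNITS = 10
HYPOTHETICAL_TOTAL_LOAD = 80.0  # assumed average beds in use (80% utilization)

# One pooled department: every patient can use any of the 100 beds.
pooled_delay = erlang_c(TOTAL_BEDS, HYPOTHETICAL_TOTAL_LOAD)

# Ten separate units: each has 10 beds and one tenth of the demand.
split_delay = erlang_c(TOTAL_BEDS // UNITS, HYPOTHETICAL_TOTAL_LOAD / UNITS)

print(f"pooled 100 beds: probability of delay {pooled_delay:.1%}")
print(f"10 units of 10:  probability of delay {split_delay:.1%}")
```

Both configurations run at the same 80% utilization, yet the small units delay a far larger share of patients. This is the system-size effect from section 3.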

The use of ‘swing beds’ allows a bed to be used by at least two different departments, for example for flu patients in one season and, in another season, for a service whose demand peaks countercyclically.

6. Queues: Time-Varying Demands

The rate at which patients arrive at the emergency department varies with the time of day, as this graphic demonstrates.

The authors discuss an example where they optimized the staffing of emergency room physicians by dividing the day into 2-hour periods and solving an M/M/s model for each period, with the target that “no more than 20% of patients wait more than 1 hour before being seen by a provider”.
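The period-by-period approach can be sketched as follows: treat each 2-hour block as its own steady-state M/M/s queue and pick the smallest number of physicians that meets the waiting target. The arrival rates and the service rate below are hypothetical placeholders, not the chapter's data, and the function names are illustrative.

```python
import math


def erlang_c(servers: int, offered_load: float) -> float:
    """Probability an arrival has to wait in an M/M/s queue (Erlang C)."""
    if offered_load >= servers:
        return 1.0  # unstable: demand exceeds capacity
    blocking = 1.0  # Erlang B, built up recursively
    for k in range(1, servers + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    rho = offered_load / servers
    return blocking / (1.0 - rho * (1.0 - blocking))


def prob_wait_exceeds(servers: int, arrival_rate: float, service_rate: float,
                      threshold: float) -> float:
    """P(wait > threshold) in an M/M/s queue served first-come, first-served."""
    load = arrival_rate / service_rate
    if load >= servers:
        return 1.0
    tail = math.exp(-(servers * service_rate - arrival_rate) * threshold)
    return erlang_c(servers, load) * tail


def staff_per_period(arrival_rates, service_rate, threshold_hours=1.0,
                     max_fraction=0.20):
    """Smallest staffing level per period that meets the waiting target."""
    schedule = []
    for rate in arrival_rates:
        servers = 1
        while prob_wait_exceeds(servers, rate, service_rate,
                                threshold_hours) > max_fraction:
            servers += 1
        schedule.append(servers)
    return schedule


# Hypothetical arrivals (patients per hour) for twelve 2-hour blocks from midnight.
HYPOTHETICAL_ARRIVALS = [3, 2, 2, 3, 6, 9, 10, 10, 9, 8, 6, 4]
HYPOTHETICAL_SERVICE_RATE = 2.5  # assumed patients seen per physician per hour

schedule = staff_per_period(HYPOTHETICAL_ARRIVALS, HYPOTHETICAL_SERVICE_RATE)
for i, (rate, staff) in enumerate(zip(HYPOTHETICAL_ARRIVALS, schedule)):
    print(f"{2 * i:02d}:00-{2 * i + 2:02d}:00  {rate} patients/hour -> {staff} physicians")
```

One design caveat: solving each block independently ignores any backlog carried over from the previous block, so this is an approximation of the time-varying system rather than an exact model.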

All tables and figures in this article are from Chapter 16 of the Handbook of Healthcare Delivery Systems (2010).
