Example 6: Working with SciPy

This example demonstrates how to use SciPy to obtain parametric distributions for use with QPLEX. In particular, it constructs a Poisson pmf with parameter \(\lambda=3\) for the number of arrivals and a discretized lognormal distribution with parameters \(\mu=-2\) and \(\sigma^2=0.25\) for the service durations.

The rough outline of the Python script is:

from qplex import StandardMultiserver

# Compute pmf of the number of arrivals and assign to number_of_arrivals_pmf

# Compute service duration pmf and assign to service_duration_pmf

engine = StandardMultiserver()
engine.number_of_servers = 5
engine.number_of_arrivals_pmf = number_of_arrivals_pmf
engine.service_duration_pmf = service_duration_pmf
engine.step()
print(engine.get_number_of_entities_in_system_pmf())

To compute the two pmfs, the following additional imports are required:

import numpy as np
from scipy.stats import poisson, lognorm

If SciPy is not already installed in your working environment, do so as shown below. This will also install NumPy if necessary.

pip3 install scipy

Since both the supports of the Poisson and lognormal distributions are unbounded, they must be truncated. This example will use the truncation tolerance \(10^{-4}\) and truncate only from above.

The SciPy function poisson.ppf calculates the inverse cumulative distribution function of a Poisson random variable. Here, it is used to determine the smallest integer such that the probability that a Poisson random variable with parameter 3 exceeding it is less than or equal to the truncation tolerance.

truncation_tolerance = 1e-4
arrival_rate = 3.0
max_value = int(poisson.ppf(1-truncation_tolerance, arrival_rate))
print(max_value)

And here is the output:

11

Accordingly, the support of the pmf of the number of arrivals to be created will be {0, 1, ..., 11}.

The desired probabilities are calculated using the SciPy function poisson.pmf:

support = range(int(max_value) + 1)
probs = poisson.pmf(support, arrival_rate)
print(probs)

And here is the output:

[4.97870684e-02 1.49361205e-01 2.24041808e-01 2.24041808e-01
 1.68031356e-01 1.00818813e-01 5.04094067e-02 2.16040315e-02
 8.10151179e-03 2.70050393e-03 8.10151179e-04 2.20950322e-04]

The variable probs is a NumPy array. Here is the code to convert it to a dictionary:

number_of_arrivals_truncated_pmf = {value: probs[value] for value in support}
print(number_of_arrivals_truncated_pmf)

And here is the output:

{0: 0.049787068367863944, 1: 0.14936120510359185, 2: 0.22404180765538775, 3: 0.22404180765538775, 4: 0.16803135574154085, 5: 0.10081881344492458, 6: 0.05040940672246224, 7: 0.02160403145248382, 8: 0.008101511794681432, 9: 0.002700503931560479, 10: 0.0008101511794681439, 11: 0.00022095032167312998}

Due to truncation, the probabilities in number_of_arrivals_truncated_pmf will not quite sum to one. But this is of no consequence since QPLEX automatically normalizes any pmf it receives from Python.

The lognormal distribution is continuous and measured in real time units, while service durations for QPLEX engines need to be measured in units of model time. Therefore, computing the service duration pmf requires using a mesh size to convert real time to model time. The number of model time units is obtained by rounding up real time on the resulting mesh, since service durations must be strictly positive. In this example, the mesh size is taken as 0.1. Other than that, the general plan is similar to that for the Poisson pmf.

Start by determining the largest value for truncation, using the same truncation tolerance as before:

mu = -2
sigma_squared = 0.25
mesh_size = 0.1
max_value = lognorm.ppf(1-truncation_tolerance, np.sqrt(sigma_squared), scale=np.exp(mu))
print(max_value)

And here is the output:

0.8689308272635914

So the support for the service duration pmf to be created will be {1, ..., 9}.

Calculate the right end points of each of the nine discretization intervals and the cumulative distribution function at those points:

mesh_points = mesh_size * np.arange(1, np.ceil(max_value/mesh_size)+1)
print(mesh_points)
cdf_on_mesh = lognorm.cdf(mesh_points, np.sqrt(sigma_squared), scale=np.exp(mu))
print(cdf_on_mesh)

And here is the output:

[0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9]
[0.27253296 0.78263527 0.94431376 0.98489852 0.99552169 0.99855097
 0.99949307 0.99981008 0.99992446]

The desired probabilities at each mesh point are determined using NumPy's diff function, which calculates the differences between the values in its input. The prepend argument of 0 ensures that the probability at the first mesh point 0.1 is included in the output.

probs = np.diff(cdf_on_mesh, prepend=0)
print(probs)

And here is the output:

[2.72532962e-01 5.10102306e-01 1.61678490e-01 4.05847643e-02
 1.06231693e-02 3.02927820e-03 9.42096507e-04 3.17016790e-04
 1.14374225e-04]

Finally, construct the dictionary to be used for the service duration pmf:

support = range(1, len(probs) + 1)
service_duration_truncated_pmf = {value: probs[value-1] for value in support}
print(service_duration_truncated_pmf)

And here is the output:

{1: 0.27253296168426255, 2: 0.5101023060442261, 3: 0.16167849038959536, 4: 0.04058476431355662, 5: 0.010623169259521448, 6: 0.003029278199496388, 7: 0.0009420965072201026, 8: 0.0003170167903396859, 9: 0.00011437422542381892}

Once again, these probabilities do not quite sum to one, but that will be corrected by QPLEX when it receives the pmf.

Here is the complete code:

import numpy as np
from scipy.stats import poisson, lognorm

from qplex import StandardMultiserver

truncation_tolerance = 1e-4

arrival_rate = 3.0
max_value = int(poisson.ppf(1-truncation_tolerance, arrival_rate))
support = range(max_value + 1)
probs = poisson.pmf(support, arrival_rate)
number_of_arrivals_truncated_pmf = {value: probs[value] for value in support}

mu = -2
sigma_squared = 0.25
mesh_size = 0.1
max_value = lognorm.ppf(1-truncation_tolerance, np.sqrt(sigma_squared), scale=np.exp(mu))
mesh_points = mesh_size * np.arange(1, np.ceil(max_value/mesh_size)+1)
cdf_on_mesh = lognorm.cdf(mesh_points, np.sqrt(sigma_squared), scale=np.exp(mu))
probs = np.diff(cdf_on_mesh, prepend=0)
support = range(1, len(probs) + 1)
service_duration_pmf = {value: probs[value-1] for value in support}

engine = StandardMultiserver()
engine.number_of_servers = 5
engine.number_of_arrivals_pmf = number_of_arrivals_truncated_pmf
engine.service_duration_pmf = service_duration_pmf
engine.step()
print(engine.get_number_of_entities_in_system_pmf())

And here is the output:

{0: 0.049790622752576764, 1: 0.14937186825773033, 2: 0.22405780238659542, 3: 0.22405780238659542, 4: 0.16804335178994667, 5: 0.10082601107396805, 6: 0.050413005536983975, 7: 0.021605573801564568, 8: 0.008102090175586711, 9: 0.0027006967251955723, 10: 0.0008102090175586717, 11: 0.00022096609569781942}

The number of entities in the system at time 1 is simply the number of arrivals between time 0 and time 1. Its distribution was given earlier, but the probabilities have been automatically normalized by the engine.