(Rabin)
Acknowledgements: We are grateful to Martijn Moraal, Jeroen Schutte, Fides Aarts and Johan Uijen (students on Frits Vaandrager's Analysis of Embedded Systems course) for contributions to this case study.
This case study is based on Rabin's solution to the well known mutual exclusion problem [Rab82].
Let P1 ... Pn be n processes that from time to time need to execute a critical section in which exactly one of them is allowed to employ a shared resource. The processes can coordinate their activities by use of a common resource.
The model is represented as an MDP. We let N denotes the number of processes. The PRISM source code for the case where N = 3 is show below:
// N-processor mutual exclusion [Rab82] // gxn/dxp 03/12/08 // to remove the need for fairness constraints for this model it is sufficent // to remove the self loops from the model // the step corresponding to a process making a draw has been split into two steps // to allow us to identify states where a process will draw without knowing the value // randomly drawn // to correctly model the protocol and prevent erroneous behaviour, the two steps are atomic // (i.e. no other process can move one the first step has been made) // as for example otherwise an adversary can prevent the process from actually drawing // in the current round by not scheduling it after it has performed the first step mdp // size of shared counter const int K = 6; // 4+ceil(log_2 N) // global variables (all modules can read and write) global c : [0..1]; // 0/1 critical section free/taken global b : [0..K]; // current highest draw global r : [1..2]; // current round // formula for process 1 drawing formula draw = p1=1 & (b<b1 | r!=r1); // formula to keep drawing phase atomic // (a process can only move if no other process is in the middle of drawing) formula go = (draw2=0&draw3=0); module process1 p1 : [0..2]; // local state // 0 remainder // 1 trying // 2 critical section b1 : [0..K]; // current draw: bi r1 : [0..2]; // current round: ri draw1 : [0..1]; // performed first step of drawing phase // remain in remainder // [] go & p1=0 -> (p1'=0); // enter trying [] go & p1=0 -> (p1'=1); // make a draw [] go & draw & draw1=0 -> (draw1'=1); [] draw1=1 -> 0.5 : (b1'=1) & (r1'=r) & (b'=max(b,1)) & (draw1'=0) + 0.25 : (b1'=2) & (r1'=r) & (b'=max(b,2)) & (draw1'=0) + 0.125 : (b1'=3) & (r1'=r) & (b'=max(b,3)) & (draw1'=0) + 0.0625 : (b1'=4) & (r1'=r) & (b'=max(b,4)) & (draw1'=0) + 0.03125 : (b1'=5) & (r1'=r) & (b'=max(b,5)) & (draw1'=0) + 0.03125 : (b1'=6) & (r1'=r) & (b'=max(b,6)) & (draw1'=0); // enter critical section and randomly set r to 1 or 2 [] go & p1=1 & b=b1 & r=r1 & c=0 -> 0.5 : (r'=1) & (c'=1) & (b'=0) & (b1'=0) & (r1'=0) & (p1'=2) + 0.5 : (r'=2) & (c'=1) & (b'=0) & (b1'=0) & (r1'=0) & (p1'=2); // loop when trying and cannot make a draw or enter critical section // [] go & p1=1 & r1=r & b1<=b & ((c=0 & b1!=b) | c=1) -> (p1'=p1); // leave crictical section [] go & p1=2 -> (p1'=0) & (c'=0); // stay in critical section // [] go & p1=2 -> (p1'=2); endmodule // construct further modules through renaming module process2 = process1 [p1=p2, b1=b2, r1=r2, draw1=draw2, draw2=draw1 ] endmodule module process3 = process1 [p1=p3, b1=b3, r1=r3, draw1=draw3, draw3=draw1 ] endmodule // formulas/labels for use in properties: // number of processes in critical section formula num_procs_in_crit = (p1=2?1:0)+(p2=2?1:0)+(p3=2?1:0); // one of the processes is trying label "one_trying" = p1=1|p2=1|p3=1; // one of the processes is in the critical section label "one_critical" = p1=2|p2=2|p3=2; // maximum current draw of the processes formula maxb = max(b1,b2,b3);
The table below shows statistics for each model we have built. The first section gives the number of states in the MDP representing the model. The second part refers to the MTBDD which represents this MDP: the total number of nodes and the number of leaves (terminal nodes).
N: | Model: | MTBDD: | |||
States: | Nodes: | Leaves: | |||
3 | 27,766 | 27,966 | 7 | ||
4 | 668,836 | 63,135 | 7 | ||
5 | 27,381,358 | 131,591 | 8 | ||
6 | 624,265,622 | 196,376 | 8 | ||
7 | 1.3e+11 | 271,825 | 8 | ||
8 | 2.8e+12 | 357,938 | 8 | ||
9 | 1.6e+13 | 723,514 | 9 | ||
10 | 3.5e+15 | 894,897 | 9 |
The table below gives the times taken to construct the models. There are two stages. Firstly, the system description (in our module based language) is parsed and converted to the MTBDD representing it. Secondly, reachability is performed to identify non-reachable states and the MTBDD is filtered accordingly. Reachability is performed using a BDD based fixpoint algorithm. The number of iterations for this process is also given.
N: | Construction: | Reachability: | |||
Time (s): | Time (s): | Iter.s: | |||
3 | 0.527 | 0.23 | 21 | ||
4 | 1.415 | 0.81 | 24 | ||
5 | 4.776 | 3.85 | 27 | ||
6 | 10.19 | 8.59 | 30 | ||
7 | 20.16 | 17.62 | 33 | ||
8 | 35.63 | 30.51 | 36 | ||
9 | 97.92 | 86.76 | 39 | ||
10 | 188.4 | 169.4 | 42 |
When model checking, we have the following requirements:
These correspond to fairness assumptions on the adversary/scheduler. However, by following the approach employed by [DFP04] for the dining philosophers case study, in fact we can bypass this restriction by removing the self-loops from the model.
We have verified the following PRISM properties in all states of the model:
// Mutual exclusion: at any time t there is at most one process in its critical section phase num_procs_in_crit <= 1 // Liveness: if a process is trying, then eventually a process enters the critical section "one_trying" => P>=1 [ F "one_critical" ]
One requirement that we have been unable to verify is:
One problem with this requirement is that it cannot be expressed by a PCTL formula as property corresponds to the probability of a conjunction of path formulae (expressing the properties a process participates in a round, M processes participate in a round and a process enters the critical section) being greater than a certain bound. However, in PCTL we are restricted to comparing only the probability of single path formulae with bounds.
On the other hand, this property could be expressed in PCTL*, but we would still need to consider a different formula for each possible value of M (M ranges from 1 up to the number of processes).
We note that Saias [Sai92] proved that this property does not hold. Furthermore, Kushilevitz and Rabin [KR92] have presented a modified version of this algorithm which solves this problem.
Based on ideas from Martijn Moraal and Jeroen Schutte, we instead consider a weaker property. More precisely, we compute:
As shown in the PRISM description above, to allow for the analysis of this property, we modified the model by dividing the step corresponding to a process making a draw into two. This allowed us to identify states where a process will draw without knowing what value the process will randomly pick. To correctly model the protocol and prevent erroneous behaviour, we kept these two steps atomic (i.e. no other process can move once the first step has been made). For example, if we do not add this restriction, then the adversary can prevent the process from actually drawing in the current round by not scheduling it after it has performed the first step.
The actual property, concerning process 1, verified in PRISM is given below:
// minimum probability process 1 enters the criticial section next Pmin=?[ !"one_critical" U (p1=2) {draw1=1 & !"one_critical"}{min} ]
The use of the filter in the property expresses the fact that we are only interested in the probability in states for which:
and we take the minimum value over this set of states.
Verifying this property returns the value 0 which is due to the fact that the adversary can use the values previously drawn by other processes. This is similar to the proof Saias [Sai92] presents for proving that Rabin's bounded waiting property does not hold. To demonstrate this fact we present the results for the following property which restricts attention to those states where the values drawn by the other processes are below k.
const int k; Pmin=?[ !"one_critical" U (p1=2) {draw1=1 & !"one_critical" & maxb<=k}{min} ]
The tables below present results for this property as k varies (note that the case k=4+ceil(log2 N) corresponds to the Prob-Win property above).
N: | Model Checking: | Probability: | |||
Time (s): | Iter.s: | ||||
3 | 0.053 | 8 | 0.237457 | ||
4 | 0.164 | 11 | 0.180014 | ||
5 | 0.476 | 14 | 0.144384 | ||
6 | 4.570 | 17 | 0.119936 | ||
7 | 10.17 | 20 | 0.102551 | ||
8 | 21.11 | 23 | 0.089578 | ||
9 | 88.64 | 26 | 0.079959 | ||
10 | 199.6 | 29 | 0.071948 |
N: | Model Checking: | Probability: | |||
Time (s): | Iter.s: | ||||
3 | 0.053 | 8 | 0.237457 | ||
4 | 0.164 | 11 | 0.180014 | ||
5 | 0.476 | 14 | 0.144384 | ||
6 | 4.570 | 17 | 0.119936 | ||
7 | 10.17 | 20 | 0.102551 | ||
8 | 21.11 | 23 | 0.089578 | ||
9 | 88.64 | 26 | 0.079959 | ||
10 | 199.6 | 29 | 0.071948 |
N: | Model Checking: | Probability: | |||
Time (s): | Iter.s: | ||||
3 | 0.053 | 8 | 0.208007 | ||
4 | 0.164 | 11 | 0.174957 | ||
5 | 0.476 | 14 | 0.144384 | ||
6 | 4.570 | 17 | 0.119936 | ||
7 | 10.17 | 20 | 0.102551 | ||
8 | 21.11 | 23 | 0.089578 | ||
9 | 88.64 | 26 | 0.079959 | ||
10 | 199.6 | 29 | 0.071948 |
N: | Model Checking: | Probability: | |||
Time (s): | Iter.s: | ||||
3 | 0.053 | 8 | 0.114257 | ||
4 | 0.164 | 11 | 0.104644 | ||
5 | 0.476 | 14 | 0.096728 | ||
6 | 4.570 | 17 | 0.089208 | ||
7 | 10.17 | 20 | 0.082461 | ||
8 | 21.11 | 23 | 0.076397 | ||
9 | 88.64 | 26 | 0.071337 | ||
10 | 199.6 | 29 | 0.066468 |
N: | Model Checking: | Probability: | |||
Time (s): | Iter.s: | ||||
3 | 0.053 | 8 | 0.059570 | ||
4 | 0.164 | 11 | 0.056793 | ||
5 | 0.476 | 14 | 0.054858 | ||
6 | 4.570 | 17 | 0.052572 | ||
7 | 10.17 | 20 | 0.050404 | ||
8 | 21.11 | 23 | 0.048347 | ||
9 | 88.64 | 26 | 0.046794 | ||
10 | 199.6 | 29 | 0.044992 |
N: | Model Checking: | Probability: | |||
Time (s): | Iter.s: | ||||
3 | 0.053 | 8 | 0.030273 | ||
4 | 0.164 | 11 | 0.029327 | ||
5 | 0.476 | 14 | 0.029109 | ||
6 | 4.570 | 17 | 0.028432 | ||
7 | 10.17 | 20 | 0.027773 | ||
8 | 21.11 | 23 | 0.027131 | ||
9 | 88.64 | 26 | 0.026903 | ||
10 | 199.6 | 29 | 0.026345 |
N: | Model Checking: | Probability: | |||
Time (s): | Iter.s: | ||||
3 | 0.053 | 8 | 0.0 | ||
4 | 0.164 | 11 | 0.0 | ||
5 | 0.476 | 14 | 0.014903 | ||
6 | 4.570 | 17 | 0.014671 | ||
7 | 10.17 | 20 | 0.014441 | ||
8 | 21.11 | 23 | 0.014216 | ||
9 | 88.64 | 26 | 0.014392 | ||
10 | 199.6 | 29 | 0.014225 |
N: | Model Checking: | Probability: | |||
Time (s): | Iter.s: | ||||
3 | - | - | - | ||
4 | - | - | - | ||
5 | 0.476 | 14 | 0.0 | ||
6 | 4.570 | 17 | 0.0 | ||
7 | 10.17 | 20 | 0.0 | ||
8 | 21.11 | 23 | 0.0 | ||
9 | 88.64 | 26 | 0.007395 | ||
10 | 199.6 | 29 | 0.007337 |
N: | Model Checking: | Probability: | |||
Time (s): | Iter.s: | ||||
3 | - | - | - | ||
4 | - | - | - | ||
5 | - | - | - | ||
6 | - | - | - | ||
7 | - | - | - | ||
8 | - | - | - | ||
9 | 88.64 | 26 | 0.0 | ||
10 | 199.6 | 29 | 0.0 |