Distributed Deadlock Detection
Introduction
- What are the two main issues in detecting deadlocks ?
- Maintaining WFG
- Searching WFG
- How to resolve a deadlock ? To resolve a deadlock we have to break the cycle in the WFG. This involves rolling back one or more processes and allocation resources to correct the allocation. This change should be immediately passed on to all the sites so that phantom deadlocks are not reported
- What are the correctness criteria of a deadlock detection algorithm?
- Progress
- Detect all deadlocks in finite time
- Detect a deadlock the instant it is formed i.e it should report the instant a cycle is formed in the WFG. It should not wait for more processes to be added to the cycle
- Safety
- No phantom or false deadlocks
- The absence of a global state and a global clock means that sites often receive wrong and out-of-date information about the system. This leads to phantom deadlocks being reported
- Progress
Deadlock Models
AND Model
- What is the AND model?
- Each request may ask for more that one resource.
- The request is complete only when all of the resources asked have been allocated.
- What is the sign that a process is in a deadlock? Can it be in a deadlock without exhibiting this sign? Give an example
- The presence of a cycle in the WFG indicates a deadlock but the vice versa is not true i.e there may be a deadlocked process without it being in the cycle
- Here $P_{44}$ is deadlocked even though it is not part of a cycle
- This can be detected using the Chandy-Misra-Hass algorithm for AND model
OR Model
- What is the OR model
- Each request may ask for more that one resource.
- The request is complete if any of the resources asked has been allocated.
- Can a deadlock be detected using a cycle ?
- The presence of a cycle in the WFG doesn’t indicate a deadlock.
- What is an indicator of deadlock? Can a process be deadlocked without exhibiting this sign
- The presence of a Knot in the WFG indicates a deadlock.
- If a process is not in a knot it doesn’t mean that it can’t be deadlocked.
- So a process is said to be deadlocked if it is in a knot or can only reach processes in a knot
- Formally define deadlock using dependent sets
- Formally a set of processes $S$ are said to be deadlocked if all of the processes of $S$ are blocked , the dependent set of all the process in $S$ are subsets of $S$ and there are no grant message in transit.
- This can be detected using the Chandy-Misra-Hass algorithm for OR Model
Knapp’s Classification
- What are the types of deadlock algorithms ?
According to Knapp there are 4 types of deadlock detection algorithms
- Path-pushing
- Edge-chasing
- Diffusion computation
- Global state
Path Pushing
In this method each site sends it local WFG to all its neighbors. A global WFG is being built on all the sites. When a site has sufficient information, it runs the detection algorithm as states if there is a deadlock or not
Edge Chasing
- What is edge chasing? In this method each site sends a special message known as the probe message. If a process is running then it discards this probe message. If it is blocked then the process sends out probe messages along its outgoing edges. If the original instigator receives a probe message then there is a deadlock in the system.
- What is the main advantage of edge-chasing?
- The main advantage of this system is that the message size is constant and small.
Diffusion computation
In this method deadlock detection is done as part of the WFG construction. But the actual WFG is never constructed. This is accomplished through echo algorithms. The initiator sends query messages along all of its outgoing WFG edges and the receiving process does the same. When a process receives such query message if it is running it discards it, if not it sends back a reply only after receiving a reply for all the queries it had sent. Deadlock is detected if the initiator receives a reply for all queries it had sent
Global state
Global state can be used to detect deadlock. In the context of distributed systems two key things are in favor of this method:
- Global state can be built without freezing the underlying computation
- Even though the global state may not be the state of the system at any physical point of time, if the system was stable before initiation then that will hold in the snapshot