Update: 2020-03-07:
This approach works only with a set of instances in which every two of them interfere with each other. See @snaury 's comments below: #14 (comment)
The livelock problem
Epaxos specifies that in an SCC, the lowest seq commands should be executed
first.
And if there is a stream of interfering commands, execution thread needs to wait until walking through the entire SCC, before it could execute any command.
This is the livelock.
The solution provided by epaxos is to prioritize completing old commands over
proposing new commands.
This solution might bring in some latency because we need to stop
proposing a new command to break the command chain.
I've been thinking maybe there is some other way to solve the livelock problem.
We assume that all commands mentioned below interfere with each other.
Ordering
Assume we determine the command order in an SCC by a tuple (seq, replicaID).
Then the first command to execute is the one with the least (seq, replicaID)
in an SCC.
Find the first command safe to execute
Provided with a committed, un-executed, interfering command list
CMDs = {Ra.A, Rb.B, Rc.C ..}(Ra.A is a command owned by replica Ra).
Assume that Ra.A has the least (seq, replicaID).
Define a function replicas(cmds): to extract a list of command owner replica,
e.g.: replicas({Ra.A, Rb.B}) = {Ra, Rb}.
Define allDeps, the union of all dependency of CMDs: allDeps = Ra.A.deps U Rb.B.deps U Rc.C.deps...
If CMDs contains only the least seq commands(with two un-executed interfering command Ra.X and Ra.Y:
if Ra.X.seq < Ra.Y.seq, and Ra.Y is in CMDs, Ra.X.seq must be in CMDs):
-
For an un-executed command X not in CMDs and owned by one of replicas(CMDs), we have:
X.seq > Ra.A.seq.
Because X.seq is greater than the seq of its preceding command on a same
replica.
Thus X should never be executed before Ra.A.
-
For an un-executed command Y not in CMDs and NOT owned by any of replicas(CMDs),
if replicas(allDeps) ⊆ replicas(CMDs),
then Y depends on CMDs.
Because two committed interfering command U and V have at least one dependency
relationship.
Thus Y should never be executed before Ra.A.
With (1) and (2), we have the conclusion
that Ra.A is the command that should be executed first.
The abstract algorithm
Initialize an empty CMDs.
Add the next least (seq, replicaID) command and its deps into CMDs.
Repeat this step until we find a CMDs so that replicas(allDeps) ⊆ replicas(CMDs), then
execute the least command in CMDs.
Thus we could execute command even before the entire SCC committed.
Because the number of replicas is finite and small, this process would finish quickly,
even when there are a lot of interfering commands.
In this way searching for an SCC is no more required and there won't be a livelock problem anymore.
Update: 2020-03-07:
This approach works only with a set of instances in which every two of them interfere with each other. See @snaury 's comments below: #14 (comment)
The livelock problem
Epaxos specifies that in an SCC, the lowest
seqcommands should be executedfirst.
And if there is a stream of interfering commands, execution thread needs to wait until walking through the entire SCC, before it could execute any command.
This is the livelock.
The solution provided by epaxos is to prioritize completing old commands over
proposing new commands.
This solution might bring in some latency because we need to stop
proposing a new command to break the command chain.
I've been thinking maybe there is some other way to solve the livelock problem.
We assume that all commands mentioned below interfere with each other.
Ordering
Assume we determine the command order in an SCC by a tuple
(seq, replicaID).Then the first command to execute is the one with the least
(seq, replicaID)in an SCC.
Find the first command safe to execute
Provided with a committed, un-executed, interfering command list
CMDs = {Ra.A, Rb.B, Rc.C ..}(Ra.Ais a command owned by replicaRa).Assume that
Ra.Ahas the least(seq, replicaID).Define a function
replicas(cmds): to extract a list of command owner replica,e.g.:
replicas({Ra.A, Rb.B}) = {Ra, Rb}.Define
allDeps, the union of all dependency ofCMDs:allDeps = Ra.A.deps U Rb.B.deps U Rc.C.deps...If
CMDscontains only the leastseqcommands(with two un-executed interfering commandRa.XandRa.Y:if
Ra.X.seq < Ra.Y.seq, andRa.Yis inCMDs,Ra.X.seqmust be inCMDs):For an un-executed command X not in
CMDsand owned by one ofreplicas(CMDs), we have:X.seq > Ra.A.seq.Thus X should never be executed before
Ra.A.For an un-executed command Y not in
CMDsand NOT owned by any ofreplicas(CMDs),if
replicas(allDeps) ⊆ replicas(CMDs),then Y depends on
CMDs.Thus Y should never be executed before
Ra.A.With (1) and (2), we have the conclusion
that
Ra.Ais the command that should be executed first.The abstract algorithm
Initialize an empty
CMDs.Add the next least
(seq, replicaID)command and itsdepsintoCMDs.Repeat this step until we find a
CMDsso thatreplicas(allDeps) ⊆ replicas(CMDs), thenexecute the least command in
CMDs.Thus we could execute command even before the entire SCC committed.
Because the number of replicas is finite and small, this process would finish quickly,
even when there are a lot of interfering commands.
In this way searching for an SCC is no more required and there won't be a livelock problem anymore.