Feature #3176
closedNACK in multicast strategy
Added by Junxiao Shi about 9 years ago. Updated over 7 years ago.
100%
Description
Extend multicast strategy to support NACK.
- Design how NACK should be processed in multicast strategy.
- Implement.
- Update NFD Developer Guide.
Updated by Junxiao Shi about 9 years ago
- Blocked by Feature #3156: NACK in forwarding and best-route added
Updated by Junxiao Shi about 9 years ago
- Assignee set to Anonymous
Design can start now.
Implementation cannot start until code for #3156 is approved.
Updated by Anonymous about 9 years ago
I think the design we talked about is as follows:
If all upstreams return a NACK: Return a NACK of the lowest severity to all downstreams.
This leads to the question: How to rank the severity among NACK reasons? NoRoute > DuplicateNonce > Congestion?
Updated by Ashlesh Gawande over 8 years ago
- Assignee changed from Anonymous to Ashlesh Gawande
Updated by Anonymous over 8 years ago
Let me know if you need any help with the design. But I think it should be rather straight forward, as written above.
Updated by Ashlesh Gawande over 8 years ago
So just to make sure, does this sound good:
on afterReceiveInterest: send Nack [No-Route] if no eligible next hop
on afterReceiveNack: send Nack to downstream(s) if all the upstreams have been Nacked else wait
?
Updated by Junxiao Shi over 8 years ago
I agree with note-7. This is basically the same procedure as best-route v4.
Therefore, you should introduce a base class; both best-route v4 and multicast v2 (yes, you need to increment the version number) should inherit from that base class.
Updated by Anonymous over 8 years ago
I agree with the design, but think you should be more specific about which NACK Reasons you use in afterReceiveNack.
When you get 3 different NACK Reasons, which one do you send on?
Also: what happens on a PIT timeout? Is the PIT entry lifetime always the same as the Interest lifetime? Can different links have different timeouts?
I already have my guesses on what the answers to these questions are, but I would prefer them to be written down.
Updated by Ashlesh Gawande over 8 years ago
Junxiao Shi wrote:
I agree with note-7. This is basically the same procedure as best-route v4.
Therefore, you should introduce a base class; both best-route v4 and multicast v2 (yes, you need to increment the version number) should inherit from that base class.
Okay, does this include consumer retransmissions?
Klaus Schneider wrote:
I agree with the design, but think you should be more specific about which NACK Reasons you use in afterReceiveNack.
When you get 3 different NACK Reasons, which one do you send on?
Also: what happens on a PIT timeout? Is the PIT entry lifetime always the same as the Interest lifetime? Can different links have different timeouts?
I already have my guesses on what the answers to these questions are, but I would prefer them to be written down.
Lowest severity NACK (note-3) using the ranking from note-4 will be sent when all OutRecords are nacked.
To the best of my knowledge (pray enlighten if wrong):
On PIT timeout, NACK is sent anyway
PIT entry life time is the same as the last interest's lifetime (as per dev guide).
Different links shouldn't have different timeouts
Updated by Junxiao Shi over 8 years ago
I agree with note-7. This is basically the same procedure as best-route v4.
Therefore, you should introduce a base class; both best-route v4 and multicast v2 (yes, you need to increment the version number) should inherit from that base class.Okay, does this include consumer retransmissions?
First, I want to correct note-8: the shared functionality can either be implemented as a base class or as a helper class/function.
Helper class/function should be preferred over a base class because it allows better composability.
Second, it's better to keep consumer retransmission out of the base class or helper class/function, because the policy of consumer retransmission suppression is independent from how Nacks are handled.
On PIT timeout, NACK is sent anyway
When PIT entry times out (before expire pending Interest trigger), don't send Nacks.
At this moment, the PIT entry at downstream is also timed out, and Nacks won't match a PIT entry and would be dropped.
Updated by Ashlesh Gawande about 8 years ago
- Status changed from New to In Progress
1)
Please find first patch at: https://gerrit.named-data.net/#/c/3195/1, abandoned
Patch at: https://gerrit.named-data.net/#/c/3013/
I moved compareLessSevere and predicate_NextHop_Eligible
to pit-algorithm.
Seems like afterReceiveNack is almost similar to that of BestRoute2, but I am not sure how to pull out common things?
2)
For the tests should I delete multicast v1 test?
I will add more test in patchset2. I am looking into BestRoute2 tests to see how best to test.
Updated by Junxiao Shi about 8 years ago
https://gerrit.named-data.net/3013 patchset5 has a major difference from best-route v4: "one out-record not Nacked, which is also a downstream" special case (see #3033-7 and #3033-10) is not considered.
I wonder what's the rationale?
Updated by Ashlesh Gawande about 8 years ago
Overlooked it (I ran the test for it from best-route in multicast but forgot to change the strategy on each node to multicast and it passed. So I thought it does not pertain to multicast - should have gone through the scenario). Will include it in next patchset. Should I post something here about it before doing that?
Updated by Ashlesh Gawande about 8 years ago
https://gerrit.named-data.net/#/c/3013/6
The functions for afterReceiveNack are same in multicast and best route now to handle https://redmine.named-data.net/issues/3033#note-7.
Can someone please suggest how to use a common function - I am not sure how to separate the logging messages for the different strategies.
Updated by Ashlesh Gawande about 8 years ago
So we were discussing about handling Duplicate NACKs. Would this be a desired behavior:
When a duplicate NACK is received, check the inRecords of the PIT Entry and see if there is any interest with the same name pending. Then send that interest which will have a different Nonce (with preference given to local application, ex NLSR) to the inFace of the NACK.
Updated by Anonymous about 8 years ago
So a "Duplicate NACK" means that the sender has received an Interest with a duplicate nonce, right? (I still think that the term "Duplicate NACK" isn't descriptive/intuitive enough)
I think it makes sense to then try other nonces in the PIT entry, but the question is in which order to try them?
An alternative would be to, after the NACK, send an Interest packet that contains all received nonces so far. If at least one of them isn't duplicate, the upstream host will add the face to the inRecord. Otherwise, it will send another NACK.
Updated by Junxiao Shi about 8 years ago
Reply to note-16 and note-17:
I have a recommendation of Nack-Duplicate processing applicable to all strategies (and may involve forwarding pipelines).
See #1966-7.
Updated by Anonymous about 8 years ago
Here's the text from Junxiao's note-18
Iterate through all unexpired PIT in-records, and pick a PIT in-record whose
last incoming Nonce is not recorded as duplicated in the PIT out-record. When
multiple PIT in-records are eligible, the newest PIT in-record should be used
because its Nonce is less likely to be seen by the upstream.
Implementation concernIn NFD strategy API, send Interest action does not allow strategy to specify which
Nonce to use in an outgoing Interest packet. Either this API should be extended, or
this procedure should be implemented in the outgoing Interest pipeline.
I agree with the design, as it seems simpler as my suggestion of sending multiple nonces.
Updated by Lan Wang about 8 years ago
how about give the first preference to local applications that sent the same interest (same name)? if there are multiple local faces or there are no local faces, then use time to decide (newest nonce is preferred).
Updated by Anonymous about 8 years ago
Can you state the reason for preferring local interfaces?
Is the assumption that local applications will always use an unseen nonce? What if a local app, for whatever reason, retransmits with the same nonce on a different interface?
Maybe we can think about the probability of getting a duplicate nonce for different decision rules and then create a hierarchy like:
- Choose nonce of local face (criteria A)
- If none or multiple of (A), choose face with highest/lowest of criteria B.
- If none or multiple choices left, choose highest/loweest of criteria C.
- ...
- Choose randomly.
The assumptions that we already have are:
- Newer PIT in-records are less likely to result in a duplicate nonce than older ones.
- PIT in-records from local faces are less likely to result in a duplicate nonce than in-records from non-local faces.
Maybe there are others.
Updated by Junxiao Shi about 8 years ago
- Related to Feature #1756: Let strategy pick outgoing Interest packet added
Updated by Ashlesh Gawande almost 8 years ago
I have run experiments with the following code to determine whether retransmission with new Nonce from pitEntry on Duplicate benefits NLSR convergence speed or reduces Duplicate NACKs.
Currently NLSR uses /localhop prefix for Sync and non-localhop for application prefix which are on multicast strategy.
The code is same as BestRoutev4 for NACK handling except it has the simple retransmission code based on Junxiao's suggestion in https://redmine.named-data.net/issues/3176#note-19.
void
MulticastStrategy::afterReceiveNack(const Face& inFace, const lp::Nack& nack,
const shared_ptr<pit::Entry>& pitEntry)
{
if (nack.getReason() == lp::NackReason::DUPLICATE) {
time::steady_clock::TimePoint newest = time::steady_clock::TimePoint::min();
pit::InRecord* newest_inR = nullptr;
bool existInOutRecord = false;
for (const pit::InRecord& inR : pitEntry->getInRecords()) {
// Is outRecord.getLastNonce in not in inRecords
existInOutRecord = false;
for (const pit::OutRecord& outR : pitEntry->getOutRecords()) {
if (inR.getLastNonce() == outR.getLastNonce()) {
existInOutRecord = true;
break;
}
}
// If does not exist in outRecord and is the newest and the inR is not from inFace
if (!existInOutRecord && inR.getLastRenewed() > newest && inFace.getId() != inR.getFace().getId()) {
newest = inR.getLastRenewed();
newest_inR = &const_cast<pit::InRecord&>(inR);
}
} // for outRecords
// send interest with new Nonce to inFace
if (newest_inR != nullptr) {
this->sendInterest(pitEntry, const_cast<Face&>(inFace), newest_inR->getInterest());
NFD_LOG_ERROR("Special: " << newest_inR->getInterest() << " from=" << newest_inR->getFace().getId() <<
" newPitEntry-to=" << inFace.getId());
}
}
Followed by same code as BestRoutev4
I ran experiments with and without (i.e. NACK handling is the same as BestRoutev4) this retransmission code and found no significant reduction in Duplicate NACks or faster convergence.
I ran 10 experiments for each with 34 node topology (current testbed) in Mini-NDN.
Avg Dupl NACKs with retransmission code: 87223
Avg Dupl NACKs w/o retransmission code: 90879
Convergence time is 30 seconds, I cannot go lower for either.
Since it does not help, it seems like I should not put it in?
So for NLSR we are using /localhop prefix to reduce Duplicate NACKs on Sync and plan to do it for LSA prefix as well.
Otherwise if I put both sync and LSA prefix on non-localhop prefix, NLSR stops converging for bigger topology and there is a storm of NACK duplicates. I am not sure if this is because of computing power available.
I have tried to turn off all logging (no ndndump, NFD, or NLSR logs) and still the topology does not converge in any reasonable time (I have tried up to 5 minutes but it still does not converge).
The largest topology I have that converges is the 10 node topology.
(And this is on a two octacore (each has 16 threads) machine - with 160 GB RAM, 30GB dedicated to RAMDISK for these experiments).
Updated by Ashlesh Gawande almost 8 years ago
I have run more experiments to determine whether the retransmission of interest with new Nonce on duplicate NACK reduces duplicate NACKs. On average the number of Duplicate NACKs is very slightly higher for the patch with retransmission. So I will skip it and push a new patch.
Updated by Junxiao Shi almost 8 years ago
- Target version changed from v0.5 to v0.6
Updated by Junxiao Shi almost 8 years ago
- Related to Feature #3879: ASF strategy should propagate NACK if all nexthops are NACKed added
Updated by Junxiao Shi almost 8 years ago
- Blocks Feature #3968: Forward Interest to ad hoc incoming face added
Updated by Junxiao Shi over 7 years ago
- Status changed from In Progress to Resolved
- % Done changed from 0 to 100
Code is merged in https://gerrit.named-data.net/3013. NFD devguide update is needed, but can be deferred to be together with #2062.
Updated by Ashlesh Gawande over 7 years ago
NFD devguide is updated, can somebody please review?
(Should I add myself to the author list?)
Updated by Alex Afanasyev over 7 years ago
(Should I add myself to the author list?)
Yes
Updated by Junxiao Shi over 7 years ago
- Status changed from Resolved to Closed
NFD-devguide looks correct.