
Feature #3176

NACK in multicast strategy

Added by Junxiao Shi almost 4 years ago. Updated over 2 years ago.

Status:
Closed
Priority:
Normal
Category:
Forwarding
Target version:
Start date:
Due date:
% Done:

100%

Estimated time:
3.00 h

Description

Extend multicast strategy to support NACK.

  • Design how NACK should be processed in multicast strategy.
  • Implement.
  • Update NFD Developer Guide.

Related issues

Related to NFD - Feature #1756: Let strategy pick outgoing Interest packet (Closed)

Related to NFD - Task #3879: ASF strategy should propagate NACK if all nexthops are NACKed (New, 2016-12-06)

Blocked by NFD - Feature #3156: NACK in forwarding and best-route (Closed)

Blocks NFD - Feature #3968: Forward Interest to ad hoc incoming face (Closed)

History

#1 Updated by Junxiao Shi almost 4 years ago

  • Blocked by Feature #3156: NACK in forwarding and best-route added

#2 Updated by Junxiao Shi almost 4 years ago

  • Assignee set to Klaus Schneider

Design can start now.

Implementation cannot start until code for #3156 is approved.

#3 Updated by Klaus Schneider almost 4 years ago

I think the design we talked about is as follows:

If all upstreams return a NACK: Return a NACK of the lowest severity to all downstreams.

This leads to the question: how should severity be ranked among NACK reasons? NoRoute > DuplicateNonce > Congestion?

#4 Updated by Junxiao Shi almost 4 years ago

Answer to note-3: see #3032.

#5 Updated by Ashlesh Gawande about 3 years ago

  • Assignee changed from Klaus Schneider to Ashlesh Gawande

#6 Updated by Klaus Schneider about 3 years ago

Let me know if you need any help with the design. But I think it should be rather straightforward, as written above.

#7 Updated by Ashlesh Gawande almost 3 years ago

So just to make sure, does this sound good:

on afterReceiveInterest: send a Nack [No-Route] if there is no eligible next hop

on afterReceiveNack: send a Nack to the downstream(s) if all the upstreams have been Nacked; otherwise wait

?

#8 Updated by Junxiao Shi almost 3 years ago

I agree with note-7. This is basically the same procedure as best-route v4.

Therefore, you should introduce a base class; both best-route v4 and multicast v2 (yes, you need to increment the version number) should inherit from that base class.

#9 Updated by Klaus Schneider almost 3 years ago

I agree with the design, but think you should be more specific about which NACK Reasons you use in afterReceiveNack.

When you get 3 different NACK Reasons, which one do you send on?

Also: what happens on a PIT timeout? Is the PIT entry lifetime always the same as the Interest lifetime? Can different links have different timeouts?

I already have my guesses on what the answers to these questions are, but I would prefer them to be written down.

#10 Updated by Ashlesh Gawande almost 3 years ago

Junxiao Shi wrote:

I agree with note-7. This is basically the same procedure as best-route v4.

Therefore, you should introduce a base class; both best-route v4 and multicast v2 (yes, you need to increment the version number) should inherit from that base class.

Okay, does this include consumer retransmissions?

Klaus Schneider wrote:

I agree with the design, but think you should be more specific about which NACK Reasons you use in afterReceiveNack.

When you get 3 different NACK Reasons, which one do you send on?

Also: what happens on a PIT timeout? Is the PIT entry lifetime always the same as the Interest lifetime? Can different links have different timeouts?

I already have my guesses on what the answers to these questions are, but I would prefer them to be written down.

The lowest-severity NACK (note-3), using the ranking from note-4, will be sent when all out-records are nacked.

To the best of my knowledge (pray enlighten if wrong):
On PIT timeout, the NACK is sent anyway.
The PIT entry lifetime is the same as the last Interest's lifetime (as per the dev guide).
Different links shouldn't have different timeouts.

#11 Updated by Junxiao Shi almost 3 years ago

I agree with note-7. This is basically the same procedure as best-route v4.

Therefore, you should introduce a base class; both best-route v4 and multicast v2 (yes, you need to increment the version number) should inherit from that base class.

Okay, does this include consumer retransmissions?

First, I want to correct note-8: the shared functionality can either be implemented as a base class or as a helper class/function.
Helper class/function should be preferred over a base class because it allows better composability.

Second, it's better to keep consumer retransmission out of the base class or helper class/function, because the policy of consumer retransmission suppression is independent from how Nacks are handled.

On PIT timeout, NACK is sent anyway

When PIT entry times out (before expire pending Interest trigger), don't send Nacks.
At this moment, the PIT entry at downstream is also timed out, and Nacks won't match a PIT entry and would be dropped.

#12 Updated by Ashlesh Gawande almost 3 years ago

  • Status changed from New to In Progress

1)
Please find the first patch at: https://gerrit.named-data.net/#/c/3195/1 (abandoned)

Patch at: https://gerrit.named-data.net/#/c/3013/

I moved compareLessSevere and predicate_NextHop_Eligible to pit-algorithm.

Seems like afterReceiveNack is nearly identical to that of BestRoute2, but I am not sure how to pull out the common parts?

2)
For the tests, should I delete the multicast v1 tests?
I will add more tests in patchset 2. I am looking at the BestRoute2 tests to see how best to test.

#13 Updated by Junxiao Shi almost 3 years ago

https://gerrit.named-data.net/3013 patchset5 has a major difference from best-route v4: "one out-record not Nacked, which is also a downstream" special case (see #3033-7 and #3033-10) is not considered.
I wonder what's the rationale?

#14 Updated by Ashlesh Gawande almost 3 years ago

Overlooked it. (I ran the best-route test for it in multicast but forgot to change the strategy on each node to multicast, and it passed, so I thought it did not pertain to multicast; I should have gone through the scenario.) Will include it in the next patchset. Should I post something here about it before doing that?

#15 Updated by Ashlesh Gawande almost 3 years ago

https://gerrit.named-data.net/#/c/3013/6

The afterReceiveNack functions are now the same in multicast and best route, to handle https://redmine.named-data.net/issues/3033#note-7.

Can someone please suggest how to use a common function? I am not sure how to separate the logging messages for the different strategies.

#16 Updated by Ashlesh Gawande over 2 years ago

So we were discussing how to handle Duplicate NACKs. Would this be the desired behavior:

When a duplicate NACK is received, check the in-records of the PIT entry to see whether any Interest with the same name is pending. Then send that Interest, which will have a different Nonce (with preference given to a local application, e.g. NLSR), to the incoming face of the NACK.

#17 Updated by Klaus Schneider over 2 years ago

So a "Duplicate NACK" means that the sender has received an Interest with a duplicate nonce, right? (I still think that the term "Duplicate NACK" isn't descriptive/intuitive enough)

I think it makes sense to then try other nonces in the PIT entry, but the question is in which order to try them?

An alternative would be, after the NACK, to send an Interest packet that contains all Nonces received so far. If at least one of them isn't a duplicate, the upstream host will add the face to the in-record. Otherwise, it will send another NACK.

#18 Updated by Junxiao Shi over 2 years ago

Reply to note-16 and note-17:

I have a recommendation of Nack-Duplicate processing applicable to all strategies (and may involve forwarding pipelines).
See #1966-7.

#19 Updated by Klaus Schneider over 2 years ago

Here's the text from Junxiao's note-18

Iterate through all unexpired PIT in-records, and pick a PIT in-record whose last incoming Nonce is not recorded as duplicate in the PIT out-record. When multiple PIT in-records are eligible, the newest PIT in-record should be used because its Nonce is less likely to have been seen by the upstream.

Implementation concern:

In the NFD strategy API, the send Interest action does not allow the strategy to specify which Nonce to use in an outgoing Interest packet. Either this API should be extended, or this procedure should be implemented in the outgoing Interest pipeline.

I agree with the design, as it seems simpler than my suggestion of sending multiple Nonces.

#20 Updated by Lan Wang over 2 years ago

How about giving first preference to local applications that sent the same Interest (same name)? If there are multiple local faces, or there are no local faces, then use time to decide (the newest Nonce is preferred).

#21 Updated by Klaus Schneider over 2 years ago

Can you state the reason for preferring local interfaces?

Is the assumption that local applications will always use an unseen nonce? What if a local app, for whatever reason, retransmits with the same nonce on a different interface?

Maybe we can think about the probability of getting a duplicate nonce for different decision rules and then create a hierarchy like:

  1. Choose nonce of local face (criteria A)
  2. If none or multiple of (A), choose face with highest/lowest of criteria B.
  3. If none or multiple choices are left, choose the highest/lowest of criteria C.
  4. ...
  5. Choose randomly.

The assumptions that we already have are:

  • Newer PIT in-records are less likely to result in a duplicate nonce than older ones.
  • PIT in-records from local faces are less likely to result in a duplicate nonce than in-records from non-local faces.

Maybe there are others.

#22 Updated by Junxiao Shi over 2 years ago

  • Related to Feature #1756: Let strategy pick outgoing Interest packet added

#23 Updated by Ashlesh Gawande over 2 years ago

I have run experiments with the following code to determine whether retransmitting with a new Nonce from the PIT entry on a Duplicate NACK benefits NLSR convergence speed or reduces Duplicate NACKs.
Currently NLSR uses a /localhop prefix for Sync and a non-localhop prefix for the application prefix, both on multicast strategy.

The code is the same as BestRoutev4 for NACK handling, except it adds the simple retransmission code based on Junxiao's suggestion in https://redmine.named-data.net/issues/3176#note-19.

void
MulticastStrategy::afterReceiveNack(const Face& inFace, const lp::Nack& nack,
                                    const shared_ptr<pit::Entry>& pitEntry)
{
  if (nack.getReason() == lp::NackReason::DUPLICATE) {
    time::steady_clock::TimePoint newest = time::steady_clock::TimePoint::min();
    pit::InRecord* newest_inR = nullptr;

    for (const pit::InRecord& inR : pitEntry->getInRecords()) {
      // check whether this in-record's last Nonce has already been sent upstream,
      // i.e. it appears as the last Nonce of some out-record
      bool existInOutRecord = false;
      for (const pit::OutRecord& outR : pitEntry->getOutRecords()) {
        if (inR.getLastNonce() == outR.getLastNonce()) {
          existInOutRecord = true;
          break;
        }
      }

      // eligible if its Nonce has not been sent upstream, it is the newest so far,
      // and it did not arrive from the face that sent the Nack
      if (!existInOutRecord && inR.getLastRenewed() > newest && inFace.getId() != inR.getFace().getId()) {
        newest = inR.getLastRenewed();
        newest_inR = &const_cast<pit::InRecord&>(inR);
      }
    } // for in-records

    // retransmit the Interest (which carries a different Nonce) to the Nack's incoming face
    if (newest_inR != nullptr) {
      this->sendInterest(pitEntry, const_cast<Face&>(inFace), newest_inR->getInterest());
      NFD_LOG_ERROR("Special: " << newest_inR->getInterest() << " from=" << newest_inR->getFace().getId() <<
                    " newPitEntry-to=" << inFace.getId());
    }
  }

  // ... followed by the same code as BestRoutev4
}

I ran experiments with and without this retransmission code (i.e. NACK handling identical to BestRoutev4) and found no significant reduction in Duplicate NACKs or faster convergence.
I ran 10 experiments for each with the 34-node topology (current testbed) in Mini-NDN.
Avg Duplicate NACKs with retransmission code: 87223
Avg Duplicate NACKs w/o retransmission code: 90879

Convergence time is 30 seconds; I cannot go lower for either.

Since it does not help, it seems like I should not put it in?


So for NLSR we are using the /localhop prefix to reduce Duplicate NACKs on Sync, and plan to do the same for the LSA prefix.

Otherwise, if I put both the Sync and LSA prefixes on non-localhop prefixes, NLSR stops converging for the bigger topology and there is a storm of Duplicate NACKs. I am not sure if this is because of the available computing power.

I have tried turning off all logging (no ndndump, NFD, or NLSR logs) and still the topology does not converge in any reasonable time (I have tried up to 5 minutes).

The largest topology I have that converges is the 10-node topology.

(And this is on a machine with two octa-core CPUs (16 threads each) and 160 GB of RAM, 30 GB of which is dedicated to a RAMDISK for these experiments.)

#24 Updated by Ashlesh Gawande over 2 years ago

I have run more experiments to determine whether retransmitting the Interest with a new Nonce on a Duplicate NACK reduces Duplicate NACKs. On average, the number of Duplicate NACKs is very slightly higher with the retransmission patch, so I will skip it and push a new patch.

#25 Updated by Junxiao Shi over 2 years ago

  • Target version changed from v0.5 to v0.6

#26 Updated by Junxiao Shi over 2 years ago

  • Related to Task #3879: ASF strategy should propagate NACK if all nexthops are NACKed added

#27 Updated by Junxiao Shi over 2 years ago

  • Blocks Feature #3968: Forward Interest to ad hoc incoming face added

#28 Updated by Junxiao Shi over 2 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 0 to 100

Code is merged in https://gerrit.named-data.net/3013. NFD devguide update is needed, but can be deferred to be together with #2062.

#29 Updated by Ashlesh Gawande over 2 years ago

NFD devguide is updated, can somebody please review?
(Should I add myself to the author list?)

#30 Updated by Alex Afanasyev over 2 years ago

(Should I add myself to the author list?)

Yes

#31 Updated by Junxiao Shi over 2 years ago

  • Status changed from Resolved to Closed

NFD-devguide looks correct.
