
Feature #3176

NACK in multicast strategy

Added by Junxiao Shi almost 4 years ago. Updated over 2 years ago.

Status:
Closed
Priority:
Normal
Category:
Forwarding
Target version:
Start date:
Due date:
% Done:

100%

Estimated time:
3.00 h

Description

Extend multicast strategy to support NACK.

  • Design how NACK should be processed in multicast strategy.
  • Implement.
  • Update NFD Developer Guide.

Related issues

Related to NFD - Feature #1756: Let strategy pick outgoing Interest packet (Closed)

Related to NFD - Task #3879: ASF strategy should propagate NACK if all nexthops are NACKed (New, 2016-12-06)

Blocked by NFD - Feature #3156: NACK in forwarding and best-route (Closed)

Blocks NFD - Feature #3968: Forward Interest to ad hoc incoming face (Closed)

History

#1 Updated by Junxiao Shi almost 4 years ago

  • Blocked by Feature #3156: NACK in forwarding and best-route added

#2 Updated by Junxiao Shi almost 4 years ago

  • Assignee set to Klaus Schneider

Design can start now.

Implementation cannot start until code for #3156 is approved.

#3 Updated by Klaus Schneider almost 4 years ago

I think the design we talked about is as follows:

If all upstreams return a NACK: Return a NACK of the lowest severity to all downstreams.

This leads to the question: how should severity be ranked among NACK reasons? NoRoute > DuplicateNonce > Congestion?

#4 Updated by Junxiao Shi almost 4 years ago

Answer to note-3: see #3032.

#5 Updated by Ashlesh Gawande about 3 years ago

  • Assignee changed from Klaus Schneider to Ashlesh Gawande

#6 Updated by Klaus Schneider about 3 years ago

Let me know if you need any help with the design. But I think it should be rather straightforward, as written above.

#7 Updated by Ashlesh Gawande almost 3 years ago

So just to make sure, does this sound good:

on afterReceiveInterest: send a Nack [No-Route] if there is no eligible next hop

on afterReceiveNack: send a Nack to the downstream(s) if all the upstreams have been Nacked; otherwise wait

?

#8 Updated by Junxiao Shi almost 3 years ago

I agree with note-7. This is basically the same procedure as best-route v4.

Therefore, you should introduce a base class; both best-route v4 and multicast v2 (yes, you need to increment the version number) should inherit from that base class.

#9 Updated by Klaus Schneider almost 3 years ago

I agree with the design, but think you should be more specific about which NACK Reasons you use in afterReceiveNack.

When you get 3 different NACK Reasons, which one do you send on?

Also: what happens on a PIT timeout? Is the PIT entry lifetime always the same as the Interest lifetime? Can different links have different timeouts?

I already have my guesses on what the answers to these questions are, but I would prefer them to be written down.

#10 Updated by Ashlesh Gawande almost 3 years ago

Junxiao Shi wrote:

I agree with note-7. This is basically the same procedure as best-route v4.

Therefore, you should introduce a base class; both best-route v4 and multicast v2 (yes, you need to increment the version number) should inherit from that base class.

Okay, does this include consumer retransmissions?

Klaus Schneider wrote:

I agree with the design, but think you should be more specific about which NACK Reasons you use in afterReceiveNack.

When you get 3 different NACK Reasons, which one do you send on?

Also: what happens on a PIT timeout? Is the PIT entry lifetime always the same as the Interest lifetime? Can different links have different timeouts?

I already have my guesses on what the answers to these questions are, but I would prefer them to be written down.

The lowest-severity NACK (note-3), using the ranking from note-4, will be sent when all out-records are nacked.

To the best of my knowledge (pray enlighten if wrong):
On PIT timeout, the NACK is sent anyway.
The PIT entry lifetime is the same as the last Interest's lifetime (as per the dev guide).
Different links shouldn't have different timeouts.

#11 Updated by Junxiao Shi almost 3 years ago

I agree with note-7. This is basically the same procedure as best-route v4.

Therefore, you should introduce a base class; both best-route v4 and multicast v2 (yes, you need to increment the version number) should inherit from that base class.

Okay, does this include consumer retransmissions?

First, I want to correct note-8: the shared functionality can either be implemented as a base class or as a helper class/function.
Helper class/function should be preferred over a base class because it allows better composability.

Second, it's better to keep consumer retransmission out of the base class or helper class/function, because the policy of consumer retransmission suppression is independent from how Nacks are handled.

On PIT timeout, NACK is sent anyway

When PIT entry times out (before expire pending Interest trigger), don't send Nacks.
At this moment, the PIT entry at downstream is also timed out, and Nacks won't match a PIT entry and would be dropped.

#12 Updated by Ashlesh Gawande almost 3 years ago

  • Status changed from New to In Progress

1)
Please find the first patch at: https://gerrit.named-data.net/#/c/3195/1 (abandoned)

Patch at: https://gerrit.named-data.net/#/c/3013/

I moved compareLessSevere and predicate_NextHop_Eligible to pit-algorithm.

Seems like afterReceiveNack is nearly identical to that of BestRoute2, but I am not sure how to pull out the common parts?

2)
For the tests, should I delete the multicast v1 tests?
I will add more tests in patchset 2. I am looking at the BestRoute2 tests to see how best to test.

#13 Updated by Junxiao Shi almost 3 years ago

https://gerrit.named-data.net/3013 patchset5 has a major difference from best-route v4: "one out-record not Nacked, which is also a downstream" special case (see #3033-7 and #3033-10) is not considered.
I wonder what's the rationale?

#14 Updated by Ashlesh Gawande almost 3 years ago

Overlooked it. (I ran the best-route test for it in multicast but forgot to change the strategy on each node to multicast, and it passed, so I thought it did not pertain to multicast; I should have gone through the scenario.) Will include it in the next patchset. Should I post something here about it before doing that?

#15 Updated by Ashlesh Gawande almost 3 years ago

https://gerrit.named-data.net/#/c/3013/6

The afterReceiveNack functions are now the same in multicast and best route, to handle https://redmine.named-data.net/issues/3033#note-7.

Can someone please suggest how to use a common function? I am not sure how to separate the logging messages for the different strategies.

#16 Updated by Ashlesh Gawande over 2 years ago

So we were discussing how to handle Duplicate NACKs. Would this be the desired behavior:

When a duplicate NACK is received, check the in-records of the PIT entry to see whether any Interest with the same name is pending. Then send that Interest, which will have a different Nonce (with preference given to a local application, e.g. NLSR), to the incoming face of the NACK.

#17 Updated by Klaus Schneider over 2 years ago

So a "Duplicate NACK" means that the sender has received an Interest with a duplicate nonce, right? (I still think that the term "Duplicate NACK" isn't descriptive/intuitive enough)

I think it makes sense to then try other nonces in the PIT entry, but the question is in which order to try them?

An alternative would be, after the NACK, to send an Interest packet that contains all Nonces received so far. If at least one of them isn't a duplicate, the upstream host will add the face to the in-record. Otherwise, it will send another NACK.

#18 Updated by Junxiao Shi over 2 years ago

Reply to note-16 and note-17:

I have a recommendation of Nack-Duplicate processing applicable to all strategies (and may involve forwarding pipelines).
See #1966-7.

#19 Updated by Klaus Schneider over 2 years ago

Here's the text from Junxiao's note-18

Iterate through all unexpired PIT in-records, and pick a PIT in-record whose last incoming Nonce is not recorded as duplicate in the PIT out-record. When multiple PIT in-records are eligible, the newest PIT in-record should be used because its Nonce is less likely to have been seen by the upstream.

Implementation concern:

In the NFD strategy API, the send Interest action does not allow the strategy to specify which Nonce to use in an outgoing Interest packet. Either this API should be extended, or this procedure should be implemented in the outgoing Interest pipeline.

I agree with the design, as it seems simpler than my suggestion of sending multiple Nonces.

#20 Updated by Lan Wang over 2 years ago

How about giving first preference to local applications that sent the same Interest (same name)? If there are multiple local faces, or there are no local faces, then use time to decide (the newest Nonce is preferred).

#21 Updated by Klaus Schneider over 2 years ago

Can you state the reason for preferring local interfaces?

Is the assumption that local applications will always use an unseen nonce? What if a local app, for whatever reason, retransmits with the same nonce on a different interface?

Maybe we can think about the probability of getting a duplicate nonce for different decision rules and then create a hierarchy like:

  1. Choose nonce of local face (criteria A)
  2. If none or multiple of (A), choose face with highest/lowest of criteria B.
  3. If none or multiple choices are left, choose the highest/lowest of criteria C.
  4. ...
  5. Choose randomly.

The assumptions that we already have are:

  • Newer PIT in-records are less likely to result in a duplicate nonce than older ones.
  • PIT in-records from local faces are less likely to result in a duplicate nonce than in-records from non-local faces.

Maybe there are others.

#22 Updated by Junxiao Shi over 2 years ago

  • Related to Feature #1756: Let strategy pick outgoing Interest packet added

#23 Updated by Ashlesh Gawande over 2 years ago

I have run experiments with the following code to determine whether retransmitting with a new Nonce from the PIT entry on a Duplicate NACK benefits NLSR convergence speed or reduces Duplicate NACKs.
Currently NLSR uses a /localhop prefix for Sync and a non-localhop prefix for the application prefix, both on multicast strategy.

The code is the same as BestRoutev4 for NACK handling, except it adds the simple retransmission code based on Junxiao's suggestion in https://redmine.named-data.net/issues/3176#note-19.

void
MulticastStrategy::afterReceiveNack(const Face& inFace, const lp::Nack& nack,
                                    const shared_ptr<pit::Entry>& pitEntry)
{
  if (nack.getReason() == lp::NackReason::DUPLICATE) {
    time::steady_clock::TimePoint newest = time::steady_clock::TimePoint::min();
    pit::InRecord* newest_inR = nullptr;

    for (const pit::InRecord& inR : pitEntry->getInRecords()) {
      // check whether this in-record's last Nonce has already been sent upstream,
      // i.e. it appears as the last Nonce of some out-record
      bool existInOutRecord = false;
      for (const pit::OutRecord& outR : pitEntry->getOutRecords()) {
        if (inR.getLastNonce() == outR.getLastNonce()) {
          existInOutRecord = true;
          break;
        }
      }

      // eligible if its Nonce has not been sent upstream, it is the newest so far,
      // and it did not arrive from the face that sent the Nack
      if (!existInOutRecord && inR.getLastRenewed() > newest && inFace.getId() != inR.getFace().getId()) {
        newest = inR.getLastRenewed();
        newest_inR = &const_cast<pit::InRecord&>(inR);
      }
    } // for in-records

    // retransmit the Interest (which carries a different Nonce) to the Nack's incoming face
    if (newest_inR != nullptr) {
      this->sendInterest(pitEntry, const_cast<Face&>(inFace), newest_inR->getInterest());
      NFD_LOG_ERROR("Special: " << newest_inR->getInterest() << " from=" << newest_inR->getFace().getId() <<
                    " newPitEntry-to=" << inFace.getId());
    }
  }

  // ... followed by the same code as BestRoutev4
}

I ran experiments with and without this retransmission code (i.e. NACK handling identical to BestRoutev4) and found no significant reduction in Duplicate NACKs or faster convergence.
I ran 10 experiments for each with the 34-node topology (current testbed) in Mini-NDN.
Avg Duplicate NACKs with retransmission code: 87223
Avg Duplicate NACKs w/o retransmission code: 90879

Convergence time is 30 seconds; I cannot go lower for either.

Since it does not help, it seems like I should not put it in?


So for NLSR we are using the /localhop prefix to reduce Duplicate NACKs on Sync, and plan to do the same for the LSA prefix.

Otherwise, if I put both the Sync and LSA prefixes on non-localhop prefixes, NLSR stops converging for the bigger topology and there is a storm of Duplicate NACKs. I am not sure if this is because of the available computing power.

I have tried turning off all logging (no ndndump, NFD, or NLSR logs) and still the topology does not converge in any reasonable time (I have tried up to 5 minutes).

The largest topology I have that converges is the 10-node topology.

(And this is on a machine with two octa-core CPUs (16 threads each) and 160 GB of RAM, 30 GB of which is dedicated to a RAMDISK for these experiments.)

#24 Updated by Ashlesh Gawande over 2 years ago

I have run more experiments to determine whether retransmitting the Interest with a new Nonce on a Duplicate NACK reduces Duplicate NACKs. On average, the number of Duplicate NACKs is very slightly higher with the retransmission patch, so I will skip it and push a new patch.

#25 Updated by Junxiao Shi over 2 years ago

  • Target version changed from v0.5 to v0.6

#26 Updated by Junxiao Shi over 2 years ago

  • Related to Task #3879: ASF strategy should propagate NACK if all nexthops are NACKed added

#27 Updated by Junxiao Shi over 2 years ago

  • Blocks Feature #3968: Forward Interest to ad hoc incoming face added

#28 Updated by Junxiao Shi over 2 years ago

  • Status changed from In Progress to Resolved
  • % Done changed from 0 to 100

Code is merged in https://gerrit.named-data.net/3013. NFD devguide update is needed, but can be deferred to be together with #2062.

#29 Updated by Ashlesh Gawande over 2 years ago

NFD devguide is updated, can somebody please review?
(Should I add myself to the author list?)

#30 Updated by Alex Afanasyev over 2 years ago

(Should I add myself to the author list?)

Yes

#31 Updated by Junxiao Shi over 2 years ago

  • Status changed from Resolved to Closed

NFD-devguide looks correct.
