Feature #1999

Strategy for access router

Added by Junxiao Shi about 5 years ago. Updated over 4 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: Forwarding
Target version:
Start date:
Due date:
% Done: 100%
Estimated time: 18.00 h

Description

Develop a strategy suitable for local site prefix on access/edge router.

This strategy should have the following features:

  • Make use of multiple paths in the FIB entry, and remember which path can lead to contents.
    • The paths could be installed by nfd-autoreg, in which case some paths may not reach the producer.
  • Recovery from packet loss in last-hop link.
    • Consumer retransmission is a hint for strategy to retransmit.
  • Expect producer mobility.

Related issues

Related to NFD - Bug #2403: AccessStrategy: after producer failure, Interests still goes to last nexthop (Rejected)

Blocks NFD - Bug #2055: No strategy can support realtime traffic over lossy link when autoreg is used (Closed, 2014-10-13)

Blocked by NFD - Feature #2272: Strategy API: access to FaceTable (Closed)

Blocked by NFD - Feature #2295: Scheduler: ScopedEventId (Closed)

Blocked by NFD - Feature #2314: Measurements::findLongestPrefixMatch(pit::Entry) and MeasurementsAccessor::findLongestPrefixMatch (Closed)

Blocked by NFD - Task #2300: Face: use Signal (Closed)

Blocked by ndn-cxx - Bug #2318: DummyClientFace: setInterestFilter(InterestFilter, OnInterest) is not effective by itself (Closed)

Blocked by ndn-cxx - Task #2319: DummyClientFace: use Signal (Closed)

Blocked by NFD - Task #2377: Abstract retransmission suppression logic (Closed)

Blocks NFD - Bug #2452: AccessStrategy: Measurements entry lifetime is not extended (Closed, 2015-01-30)

History

#1 Updated by Junxiao Shi about 5 years ago

  • Blocks Bug #2055: No strategy can support realtime traffic over lossy link when autoreg is used added

#2 Updated by Junxiao Shi about 5 years ago

  • Description updated (diff)
  • Status changed from New to In Progress

I presented the idea for this strategy to Beichuan at the 20141016 research meeting. The idea is:

  1. The strategy remembers the last working nexthop of each prefix; the granularity of this knowledge is "one level up", aka the parent of the Data Name. The strategy also maintains an RTT estimator for the last working face at each prefix. In addition, there's a global RTT estimator for each face, not associated with any prefix.
  2. Upon incoming Interest to be forwarded, if there's knowledge of last working nexthop, the Interest is sent to that nexthop. If there's no response within RTO, it's multicasted to all other nexthops. If there's no knowledge of last working nexthop, the Interest is multicasted directly.
  3. 100ms after each multicast, if consumer retransmits, the retransmitted Interest will be multicasted.
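The per-prefix and global RTT estimators referred to above can be sketched as follows. This is a minimal, self-contained illustration of an RFC 6298-style estimator with a 1-second initial RTO; the class name and the 200 ms floor are illustrative assumptions, not NFD's actual RttEstimator API.

```cpp
#include <algorithm>
#include <chrono>
#include <cmath>

// Sketch of the RTT estimator the design calls for: smoothed RTT and RTO
// roughly following RFC 6298. Names and constants are illustrative only.
class RttEstimatorSketch
{
public:
  using Duration = std::chrono::duration<double, std::milli>;

  void
  addMeasurement(Duration rtt)
  {
    if (!m_hasSample) {
      m_srtt = rtt;
      m_rttvar = rtt / 2;
      m_hasSample = true;
    }
    else {
      // RFC 6298: RTTVAR = (1-beta)*RTTVAR + beta*|SRTT - RTT|, beta = 1/4
      m_rttvar = 0.75 * m_rttvar + 0.25 * Duration(std::abs((m_srtt - rtt).count()));
      // RFC 6298: SRTT = (1-alpha)*SRTT + alpha*RTT, alpha = 1/8
      m_srtt = 0.875 * m_srtt + 0.125 * rtt;
    }
  }

  Duration
  computeRto() const
  {
    if (!m_hasSample) {
      return Duration(1000.0); // initial RTO: 1 second
    }
    // RTO = SRTT + 4*RTTVAR, with an assumed 200 ms lower bound
    return std::max(m_srtt + 4.0 * m_rttvar, Duration(200.0));
  }

private:
  bool m_hasSample = false;
  Duration m_srtt{0};
  Duration m_rttvar{0};
};
```

When the last working nexthop has no per-prefix estimator yet, the strategy would seed one by copying the face's global estimator, per assumption 2 below.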

The following assumptions are made:

  • NACK is not used.
  • If a laptop has multiple producers, their response times are close.
    This is because the strategy only maintains one RTT estimator per prefix for the last working nexthop. When the last working nexthop changes, the RTT estimator is copied from the global RTT estimator.
    If a laptop has a fast producer (such as ping) and a slow producer (such as HTTP proxy), the strategy is less efficient.
  • FIB is mostly correct. The strategy is most efficient if laptops use remote prefix registration. Same prefix could be registered by multiple laptops, and there could be link failure or end host failure.
    Although the strategy works with autoreg, it's less efficient, because Interest is multicasted to laptops that cannot serve the contents.

I expect this strategy to perform no worse than NCC. It's intended as a replacement of NCC at last hop.

#3 Updated by Lan Wang about 5 years ago

Why not maintain one RTT at the registered prefix level (not parent)? At least for now there should not be a scalability issue.

#4 Updated by Junxiao Shi almost 5 years ago

Why not maintain one RTT at the registered prefix level (not parent)? At least for now there should not be a scalability issue.

Per-prefix RTT estimators are maintained at parent of Data Name, not parent of Route prefix.
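In other words, the measurement key for a Data packet is its Name with the final component removed (Name::getPrefix(-1) in ndn-cxx). A toy sketch of this keying, operating on NDN URI strings purely for illustration (the function name is hypothetical, not an NFD API):

```cpp
#include <string>

// Toy sketch: the measurement entry for a Data packet is keyed at the
// parent of the Data Name, i.e. the name with its last component dropped.
std::string
measurementPrefixOf(const std::string& dataNameUri)
{
  std::size_t lastSlash = dataNameUri.rfind('/');
  if (lastSlash == std::string::npos || lastSlash == 0) {
    return "/"; // a single-component name maps to the root prefix
  }
  return dataNameUri.substr(0, lastSlash);
}
```

For example, Data named /ndn/edu/site/user/file/seg0 would be measured under /ndn/edu/site/user/file, regardless of which Route prefix matched it.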

#5 Updated by Lan Wang almost 5 years ago

OK. Maybe at the registered prefix level would be good enough for most situations. But it doesn't hurt to do it at the parent of the data name level, as long as scalability is not an issue.

#6 Updated by Junxiao Shi almost 5 years ago

Please review the design in access-router-strategy_20141022.pptx

#7 Updated by Beichuan Zhang almost 5 years ago

what if a prefix is multihomed to more than one gateway? say, /ndn/presentations content is available via Memphis and Arizona gateways. are you going to differentiate "external" and "internal" faces and only multicast among internal faces?

#8 Updated by Junxiao Shi almost 5 years ago

what if a prefix is multihomed to more than one gateway? say, /ndn/presentations content is available via Memphis and Arizona gateways. are you going to differentiate "external" and "internal" faces and only multicast among internal faces?

No.

This scenario violates the assumption used in this strategy: every nexthop connects to a laptop in one hop.

Therefore, the access router strategy is unsuitable for this namespace.

The access router strategy is intended to be practical for today's testbed operations, where each local site prefix sits on only one access router.

#9 Updated by Junxiao Shi almost 5 years ago

The 20141110 meeting with Beichuan approved the design.

We also recognize the limitation of this strategy that it cannot efficiently support multi-homed contents, and identify the need for another strategy to support that use case.

#10 Updated by Junxiao Shi almost 5 years ago

  • Blocked by Feature #2272: Strategy API: access to FaceTable added

#11 Updated by Junxiao Shi almost 5 years ago

  • % Done changed from 20 to 30

#12 Updated by Junxiao Shi almost 5 years ago

#13 Updated by Junxiao Shi almost 5 years ago

  • Blocked by Feature #2314: Measurements::findLongestPrefixMatch(pit::Entry) and MeasurementsAccessor::findLongestPrefixMatch added

#14 Updated by Junxiao Shi almost 5 years ago

#15 Updated by Junxiao Shi almost 5 years ago

#2300 blocks this because it's needed for unit testing.

#16 Updated by Junxiao Shi almost 5 years ago

  • Blocked by Bug #2318: DummyClientFace: setInterestFilter(InterestFilter, OnInterest) is not effective by itself added

#17 Updated by Junxiao Shi almost 5 years ago

#2318 and #2319 block this because they are needed for unit testing.

#18 Updated by Junxiao Shi almost 5 years ago

  • Blocked by Task #2319: DummyClientFace: use Signal added

#19 Updated by Junxiao Shi almost 5 years ago

  • Tracker changed from Task to Feature

Work on this Feature is paused due to several issues blocking this. I'll work on these before continuing with this Feature.

#20 Updated by Klaus Schneider almost 5 years ago

I've been thinking about the design of this very promising strategy proposal. Here are some questions which I hope are helpful for clarification.

As I understand it, the main problem is that the NCC strategy does not permit re-transmission to the same interface before the InterestLifetime is over.

  1. Is the consumer re-transmission (which happens after an RTO timeout, right?) much faster/shorter than the InterestLifetime? What are the expected durations for both? As asked in #2055, is it possible to reduce the InterestLifetime?

  2. Is re-transmission on the same interface beneficial? In particular:

  3. How likely is packet loss on the access link, considering that link-layer (WiFi) re-transmissions reduce local losses?

  4. How likely is it that the next packet after a loss is successful? My guess would be that packet loss occurs in bursts and that the next Interest on this link is also likely to be lost (e.g., a laptop moving into bad WiFi reception).

  5. Is sending to the same interface again more likely to return Data than trying another one? If the FIB is correct, using another link may be better.

  6. How do you intend to validate the performance of this strategy and show that it is indeed better than NCC in the mentioned scenario?

Sorry that this sounds much more critical than intended. Keep up the good work.

#21 Updated by Junxiao Shi almost 5 years ago

  1. InterestLifetime SHOULD be set to the duration in which the Data is useful to the consumer. Consumer SHOULD NOT attempt to manipulate InterestLifetime in response to network conditions.
    Consumer MAY retransmit before InterestLifetime expires.
  2. Yes. Retransmitting on the same face can recover from a possible packet loss.
  3. On NDN testbed, the access link is a UDP tunnel over public Internet. Packet loss is non-negligible.
  4. The next packet, aka the retransmitted packet, is typically at least an RTO apart. Thus, a burst of packet loss won't affect the retransmission.
  5. Whenever the first choice doesn't work within RTO, all nexthops listed in the FIB entry will be used.
  6. The strategy can be validated with NDN-RTC and other real applications.

#22 Updated by Klaus Schneider almost 5 years ago

InterestLifetime SHOULD be set to the duration in which the Data is useful to the consumer. Consumer SHOULD NOT attempt to manipulate InterestLifetime in response to network conditions.

Consumer MAY retransmit before InterestLifetime expires.

Thanks, I didn't know that. Can you tell me the reason for this? I guess the InterestLifetime is kept high to not discard packets that could possibly be useful for the consumer?

Seems like I am not the only one who makes the mistake of matching the InterestLifetime with RTO (http://onlinepresent.org/proceedings/vol2_2012/109.pdf http://www.lists.cs.ucla.edu/pipermail/ndnsim/2013-November/000964.html).
Maybe you want to document that somewhere.

The strategy can be validated with NDN-RTC and other real applications.

What about simulations? This question is a bit more general, but I am wondering how one can prove that a new strategy is worth the effort of higher complexity (following Occam's razor to prefer the simplest solution for a given problem).

#23 Updated by Junxiao Shi almost 5 years ago

Seems like I am not the only one who makes the mistake of matching the InterestLifetime with RTO (http://onlinepresent.org/proceedings/vol2_2012/109.pdf http://www.lists.cs.ucla.edu/pipermail/ndnsim/2013-November/000964.html).
Maybe you want to document that somewhere.

Agreed. This shall appear in ndn-cxx Application Developer Guide, but it's a separate Task.

What about simulations? This question is a bit more general, but I am wondering how one can prove that a new strategy is worth the effort of higher complexity (following Occam's razor to prefer the simplest solution for a given problem).

I don't want to block this Feature on ndnSIM. It could be separate Task.

#24 Updated by Junxiao Shi almost 5 years ago

  • % Done changed from 30 to 80

http://gerrit.named-data.net/1545 is ready for initial review. Please give feedback.

There's currently no test case for "recovery from packet loss in last-hop link". I'll add one to this Change later.

#25 Updated by Junxiao Shi almost 5 years ago

  • Blocked by Task #2377: Abstract retransmission suppression logic added

#26 Updated by Alex Afanasyev over 4 years ago

I know that I'm kind of late, but I'm against having RTO/retransmission inside the strategy.

As I remember, the way to address link problems was to introduce limited recovery inside the link (2.5) layer. This layer would do retransmissions much more efficiently and quickly than the usually coarse RTO estimation (the initial value is 1 second). I would say just allowing retransmitted Interests to go through will be enough. Otherwise, there will be client-based retransmission, RTO retransmission, and interaction of both (I know the timer is cancelled, but there could be some interactions with previous hops).

#27 Updated by Junxiao Shi over 4 years ago

This strategy is designed to solve the problems now, in a limited scenario as described in design slides.

I agree with the idea of link layer retx, but it won't come in v0.3.

Also, there is no RTO retx in this strategy.

  1. forward to face X
  2. RTO timeout
  3. multicast to all nexthops except face X and the downstream
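Step 3 above can be sketched as a simple face-selection filter. FaceId is a plain integer here, and the function name is illustrative; this is not NFD's actual strategy API.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

using FaceId = int;

// Sketch of the post-RTO multicast: send to every nexthop in the FIB entry
// except the first-choice face X and the downstream (incoming) face.
std::vector<FaceId>
selectMulticastFaces(const std::vector<FaceId>& nexthops,
                     FaceId firstChoice, FaceId downstream)
{
  std::vector<FaceId> out;
  std::copy_if(nexthops.begin(), nexthops.end(), std::back_inserter(out),
               [&] (FaceId f) { return f != firstChoice && f != downstream; });
  return out;
}
```

So the strategy never re-forwards to face X on timeout; the timeout only widens the forwarding to the remaining nexthops.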

#28 Updated by Alex Afanasyev over 4 years ago

There is a form of RTO retx in this strategy. Given that the RTO timeout is not tied to the Measurement entry lifetime, the following could happen:

  • incoming interest
  • found Measurement entry, forward to single next hop, set up RTO
  • RTO expires, broadcast interest to all next hops
  • ...
  • incoming interest
  • found Measurement entry, forward to single next hop, set up RTO
  • RTO expires, broadcast interest to all next hops
  • ...
  • (repeat until Measurement entry is not found)

I'm not sure that multicast after RTO is useful... At least the RTO should reset the Measurement entry, so there is no delay in multicasting the Interest next time.

#29 Updated by Junxiao Shi over 4 years ago

I don't understand. Please be precise in your scenario, and include:

  • Interest Name, Selectors, InterestLifetime
  • list of faces (with FaceId), and their RTT
  • what's stored in Measurements initially
  • time (milliseconds from scenario starts) of each event
  • incoming FaceId of each packet

After you provide this information, I'll determine whether there's a problem.

#30 Updated by Junxiao Shi over 4 years ago

  • Status changed from In Progress to Code review
  • % Done changed from 80 to 100

Packet loss recovery test case is added.

#31 Updated by Alex Afanasyev over 4 years ago

I've added an Overhead test case to show my concern about the strategy.

#32 Updated by Junxiao Shi over 4 years ago

I've seen the Overhead test case added in commit 2c1243ada4b8199e9c3357c8b71836b3e82a749f.
The strategy's behavior may be suboptimal in this scenario, but it's still correct.
Changing the design may cause suboptimal behavior in other scenarios.
This problem should be addressed in a separate issue.

The strategy is implemented using the design approved in note-9.
If there's no other problem, the code should be approved; that test case will be deleted before merging and re-added in the future issue.
I disagree with changing the design again in this issue.
I want to let the current design out for a real-world test.

#33 Updated by Junxiao Shi over 4 years ago

  • Related to Bug #2403: AccessStrategy: after producer failure, Interests still goes to last nexthop added

#34 Updated by Junxiao Shi over 4 years ago

The design concern mentioned in notes 26-29 and 31-32 is split to #2403.

#35 Updated by Alex Afanasyev over 4 years ago

The intention of my comments is to raise awareness of the issues. I'm not asking for design changes as part of this task.

#36 Updated by Junxiao Shi over 4 years ago

  • Status changed from Code review to Closed

#37 Updated by Junxiao Shi over 4 years ago

A brief description of the new strategy is added to NFD Developer Guide.

#38 Updated by Junxiao Shi over 4 years ago

Uploading newer design slides, which are supposed to be the same as what's implemented in commit:80ee7cb931eb48327659621fa1b294f47f79f1ec

#39 Updated by Junxiao Shi over 4 years ago

  • Blocks Bug #2452: AccessStrategy: Measurements entry lifetime is not extended added
