Feature #1999

Strategy for access router

Added by Junxiao Shi about 5 years ago. Updated over 4 years ago.

Status: Closed
Priority: Normal
Assignee:
Category: Forwarding
Target version:
Start date:
Due date:
% Done: 100%
Estimated time: 18.00 h

Description

Develop a strategy suitable for local site prefix on access/edge router.

This strategy should have the following features:

  • Make use of multiple paths in the FIB entry, and remember which path can lead to contents.
    • The paths could be installed by nfd-autoreg, in which case some paths may not reach the producer.
  • Recovery from packet loss in last-hop link.
    • Consumer retransmission is a hint for strategy to retransmit.
  • Expect producer mobility.

Related issues

Related to NFD - Bug #2403: AccessStrategy: after producer failure, Interests still goes to last nexthop (Rejected)

Blocks NFD - Bug #2055: No strategy can support realtime traffic over lossy link when autoreg is used (Closed, 2014-10-13)

Blocked by NFD - Feature #2272: Strategy API: access to FaceTable (Closed)

Blocked by NFD - Feature #2295: Scheduler: ScopedEventId (Closed)

Blocked by NFD - Feature #2314: Measurements::findLongestPrefixMatch(pit::Entry) and MeasurementsAccessor::findLongestPrefixMatch (Closed)

Blocked by NFD - Task #2300: Face: use Signal (Closed)

Blocked by ndn-cxx - Bug #2318: DummyClientFace: setInterestFilter(InterestFilter, OnInterest) is not effective by itself (Closed)

Blocked by ndn-cxx - Task #2319: DummyClientFace: use Signal (Closed)

Blocked by NFD - Task #2377: Abstract retransmission suppression logic (Closed)

Blocks NFD - Bug #2452: AccessStrategy: Measurements entry lifetime is not extended (Closed, 2015-01-30)

History

#1 Updated by Junxiao Shi about 5 years ago

  • Blocks Bug #2055: No strategy can support realtime traffic over lossy link when autoreg is used added

#2 Updated by Junxiao Shi about 5 years ago

  • Description updated (diff)
  • Status changed from New to In Progress

I presented the idea for this strategy to Beichuan at the 20141016 research meeting. The idea is:

  1. The strategy remembers the last working nexthop of each prefix; the granularity of this knowledge is "one level up", aka the parent of the Data Name. The strategy also maintains an RTT estimator for the last working face at each prefix. In addition, there's a global RTT estimator for each face, not associated with any prefix.
  2. Upon incoming Interest to be forwarded, if there's knowledge of last working nexthop, the Interest is sent to that nexthop. If there's no response within RTO, it's multicasted to all other nexthops. If there's no knowledge of last working nexthop, the Interest is multicasted directly.
  3. 100ms after each multicast, if consumer retransmits, the retransmitted Interest will be multicasted.
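The per-prefix and global RTT estimators referred to above can be sketched as follows. This is a minimal, self-contained illustration of an RFC 6298-style estimator with a 1-second initial RTO; the class name and the 200 ms floor are illustrative assumptions, not NFD's actual RttEstimator API.

```cpp
#include <algorithm>
#include <chrono>
#include <cmath>

// Sketch of the RTT estimator the design calls for: smoothed RTT and RTO
// roughly following RFC 6298. Names and constants are illustrative only.
class RttEstimatorSketch
{
public:
  using Duration = std::chrono::duration<double, std::milli>;

  void
  addMeasurement(Duration rtt)
  {
    if (!m_hasSample) {
      m_srtt = rtt;
      m_rttvar = rtt / 2;
      m_hasSample = true;
    }
    else {
      // RFC 6298: RTTVAR = (1-beta)*RTTVAR + beta*|SRTT - RTT|, beta = 1/4
      m_rttvar = 0.75 * m_rttvar + 0.25 * Duration(std::abs((m_srtt - rtt).count()));
      // RFC 6298: SRTT = (1-alpha)*SRTT + alpha*RTT, alpha = 1/8
      m_srtt = 0.875 * m_srtt + 0.125 * rtt;
    }
  }

  Duration
  computeRto() const
  {
    if (!m_hasSample) {
      return Duration(1000.0); // initial RTO: 1 second
    }
    // RTO = SRTT + 4*RTTVAR, with an assumed 200 ms lower bound
    return std::max(m_srtt + 4.0 * m_rttvar, Duration(200.0));
  }

private:
  bool m_hasSample = false;
  Duration m_srtt{0};
  Duration m_rttvar{0};
};
```

When the last working nexthop has no per-prefix estimator yet, the strategy would seed one by copying the face's global estimator, per assumption 2 below.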

The following assumptions are made:

  • NACK is not used.
  • If a laptop has multiple producers, their response times are close.
    This is because the strategy only maintains one RTT estimator per prefix for the last working nexthop. When the last working nexthop changes, the RTT estimator is copied from the global RTT estimator.
    If a laptop has a fast producer (such as ping) and a slow producer (such as HTTP proxy), the strategy is less efficient.
  • FIB is mostly correct. The strategy is most efficient if laptops use remote prefix registration. Same prefix could be registered by multiple laptops, and there could be link failure or end host failure.
    Although the strategy works with autoreg, it's less efficient, because Interest is multicasted to laptops that cannot serve the contents.

I expect this strategy to perform no worse than NCC. It's intended as a replacement of NCC at last hop.

#3 Updated by Lan Wang about 5 years ago

Why not maintain one RTT at the registered prefix level (not parent)? At least for now there should not be a scalability issue.

#4 Updated by Junxiao Shi almost 5 years ago

Why not maintain one RTT at the registered prefix level (not parent)? At least for now there should not be a scalability issue.

Per-prefix RTT estimators are maintained at parent of Data Name, not parent of Route prefix.
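In other words, the measurement key for a Data packet is its Name with the final component removed (Name::getPrefix(-1) in ndn-cxx). A toy sketch of this keying, operating on NDN URI strings purely for illustration (the function name is hypothetical, not an NFD API):

```cpp
#include <string>

// Toy sketch: the measurement entry for a Data packet is keyed at the
// parent of the Data Name, i.e. the name with its last component dropped.
std::string
measurementPrefixOf(const std::string& dataNameUri)
{
  std::size_t lastSlash = dataNameUri.rfind('/');
  if (lastSlash == std::string::npos || lastSlash == 0) {
    return "/"; // a single-component name maps to the root prefix
  }
  return dataNameUri.substr(0, lastSlash);
}
```

For example, Data named /ndn/edu/site/user/file/seg0 would be measured under /ndn/edu/site/user/file, regardless of which Route prefix matched it.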

#5 Updated by Lan Wang almost 5 years ago

OK. Maybe at the registered prefix level would be good enough for most situations. But it doesn't hurt to do it at the parent of the data name level, as long as scalability is not an issue.

#6 Updated by Junxiao Shi almost 5 years ago

Please review the design in access-router-strategy_20141022.pptx

#7 Updated by Beichuan Zhang almost 5 years ago

what if a prefix is multihomed to more than one gateway? say, /ndn/presentations content is available via Memphis and Arizona gateways. are you going to differentiate "external" and "internal" faces and only multicast among internal faces?

#8 Updated by Junxiao Shi almost 5 years ago

what if a prefix is multihomed to more than one gateway? say, /ndn/presentations content is available via Memphis and Arizona gateways. are you going to differentiate "external" and "internal" faces and only multicast among internal faces?

No.

This scenario violates the assumption used in this strategy: every nexthop connects to a laptop in one hop.

Therefore, the access router strategy is unsuitable for this namespace.

The access router strategy is intended to be practical for today's testbed operations, where each local site prefix sits on only one access router.

#9 Updated by Junxiao Shi almost 5 years ago

The 20141110 meeting with Beichuan approved the design.

We also recognize the limitation of this strategy that it cannot efficiently support multi-homed contents, and identify the need for another strategy to support that use case.

#10 Updated by Junxiao Shi almost 5 years ago

  • Blocked by Feature #2272: Strategy API: access to FaceTable added

#11 Updated by Junxiao Shi almost 5 years ago

  • % Done changed from 20 to 30

#12 Updated by Junxiao Shi almost 5 years ago

#13 Updated by Junxiao Shi almost 5 years ago

  • Blocked by Feature #2314: Measurements::findLongestPrefixMatch(pit::Entry) and MeasurementsAccessor::findLongestPrefixMatch added

#14 Updated by Junxiao Shi almost 5 years ago

#15 Updated by Junxiao Shi almost 5 years ago

#2300 blocks this because it's needed for unit testing.

#16 Updated by Junxiao Shi almost 5 years ago

  • Blocked by Bug #2318: DummyClientFace: setInterestFilter(InterestFilter, OnInterest) is not effective by itself added

#17 Updated by Junxiao Shi almost 5 years ago

#2318 and #2319 block this because they are needed for unit testing.

#18 Updated by Junxiao Shi almost 5 years ago

  • Blocked by Task #2319: DummyClientFace: use Signal added

#19 Updated by Junxiao Shi almost 5 years ago

  • Tracker changed from Task to Feature

Work on this Feature is paused due to several issues blocking this. I'll work on these before continuing with this Feature.

#20 Updated by Klaus Schneider almost 5 years ago

I've been thinking about the design of this very promising strategy proposal. Here are some questions which I hope are helpful for clarification.

As I understand it, the main problem is that the NCC strategy does not permit re-transmission to the same interface before the InterestLifetime is over.

  1. Is the consumer re-transmission (which happens after an RTO timeout, right?) much faster/shorter than the InterestLifetime? What are the expected durations for both? As asked in #2055, is it possible to reduce the InterestLifetime?

  2. Is re-transmission on the same interface beneficial? In particular:

  3. How likely is packet loss on the access link, considering that link-layer (WiFi) re-transmissions reduce local losses?

  4. How likely is it that the next packet after a loss is successful? My guess would be that packet loss occurs in bursts and that the next Interest on this link is also likely to be lost (e.g., a laptop moving into bad WiFi reception).

  5. Is sending to the same interface again more likely to return Data than trying another one? If the FIB is correct, using another link may be better.

  6. How do you intend to validate the performance of this strategy and show that it is indeed better than NCC in the mentioned scenario?

Sorry that this sounds much more critical than intended. Keep up the good work.

#21 Updated by Junxiao Shi almost 5 years ago

  1. InterestLifetime SHOULD be set to the duration in which the Data is useful to the consumer. Consumer SHOULD NOT attempt to manipulate InterestLifetime in response to network conditions.
    Consumer MAY retransmit before InterestLifetime expires.
  2. Yes. Retransmitting on the same face can recover from a possible packet loss.
  3. On NDN testbed, the access link is a UDP tunnel over public Internet. Packet loss is non-negligible.
  4. The next packet, aka the retransmitted packet, is typically at least an RTO apart. Thus, a burst of packet loss won't affect the retransmission.
  5. Whenever the first choice doesn't work within RTO, all nexthops listed in the FIB entry will be used.
  6. The strategy can be validated with NDN-RTC and other real applications.

#22 Updated by Klaus Schneider almost 5 years ago

InterestLifetime SHOULD be set to the duration in which the Data is useful to the consumer. Consumer SHOULD NOT attempt to manipulate InterestLifetime in response to network conditions.

Consumer MAY retransmit before InterestLifetime expires.

Thanks, I didn't know that. Can you tell me the reason for this? I guess the InterestLifetime is kept high to not discard packets that could possibly be useful for the consumer?

Seems like I am not the only one who makes the mistake of matching the InterestLifetime with RTO (http://onlinepresent.org/proceedings/vol2_2012/109.pdf http://www.lists.cs.ucla.edu/pipermail/ndnsim/2013-November/000964.html).
Maybe you want to document that somewhere.

The strategy can be validated with NDN-RTC and other real applications.

What about simulations? This question is a bit more general, but I am wondering how one can prove that a new strategy is worth the effort of higher complexity (following Occam's razor to prefer the simplest solution for a given problem).

#23 Updated by Junxiao Shi almost 5 years ago

Seems like I am not the only one who makes the mistake of matching the InterestLifetime with RTO (http://onlinepresent.org/proceedings/vol2_2012/109.pdf http://www.lists.cs.ucla.edu/pipermail/ndnsim/2013-November/000964.html).
Maybe you want to document that somewhere.

Agreed. This shall appear in ndn-cxx Application Developer Guide, but it's a separate Task.

What about simulations? This question is a bit more general, but I am wondering how one can prove that a new strategy is worth the effort of higher complexity (following Occam's razor to prefer the simplest solution for a given problem).

I don't want to block this Feature on ndnSIM. It could be separate Task.

#24 Updated by Junxiao Shi almost 5 years ago

  • % Done changed from 30 to 80

http://gerrit.named-data.net/1545 is ready for initial review. Please give feedback.

There's currently no test case for "recovery from packet loss in last-hop link". I'll add one to this Change later.

#25 Updated by Junxiao Shi almost 5 years ago

  • Blocked by Task #2377: Abstract retransmission suppression logic added

#26 Updated by Alex Afanasyev over 4 years ago

I know that I'm kind of late, but I'm against having RTO/retransmission inside the strategy.

As I remember, the way to address link problems was to introduce limited recovery inside the link (2.5) layer. This layer would do retransmissions much more efficiently and quickly than the usually coarse RTO estimation (the initial value is 1 second). I would say just allowing retransmitted Interests to go through will be enough. Otherwise, there will be client-based retransmission, RTO retransmission, and interaction of both (I know the timer is cancelled, but there could be some interactions with previous hops).

#27 Updated by Junxiao Shi over 4 years ago

This strategy is designed to solve the problems now, in a limited scenario as described in design slides.

I agree with the idea of link layer retx, but it won't come in v0.3.

Also, there is no RTO retx in this strategy.

  1. forward to face X
  2. RTO timeout
  3. multicast to all nexthops except face X and the downstream
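Step 3 above can be sketched as a simple face-selection filter. FaceId is a plain integer here, and the function name is illustrative; this is not NFD's actual strategy API.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

using FaceId = int;

// Sketch of the post-RTO multicast: send to every nexthop in the FIB entry
// except the first-choice face X and the downstream (incoming) face.
std::vector<FaceId>
selectMulticastFaces(const std::vector<FaceId>& nexthops,
                     FaceId firstChoice, FaceId downstream)
{
  std::vector<FaceId> out;
  std::copy_if(nexthops.begin(), nexthops.end(), std::back_inserter(out),
               [&] (FaceId f) { return f != firstChoice && f != downstream; });
  return out;
}
```

So the strategy never re-forwards to face X on timeout; the timeout only widens the forwarding to the remaining nexthops.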

#28 Updated by Alex Afanasyev over 4 years ago

There is a form of RTO retx in this strategy. Given that the RTO timeout is not tied to the Measurement entry lifetime, the following could happen:

  • incoming interest
  • found Measurement entry, forward to single next hop, set up RTO
  • RTO expires, broadcast interest to all next hops
  • ...
  • incoming interest
  • found Measurement entry, forward to single next hop, set up RTO
  • RTO expires, broadcast interest to all next hops
  • ...
  • (repeat until Measurement entry is not found)

I'm not sure that multicast after RTO is useful... At least the RTO should reset the Measurement entry, so there is no delay in multicasting the Interest next time.

#29 Updated by Junxiao Shi over 4 years ago

I don't understand. Please be precise in your scenario, and include:

  • Interest Name, Selectors, InterestLifetime
  • list of faces (with FaceId), and their RTT
  • what's stored in Measurements initially
  • time (milliseconds from scenario starts) of each event
  • incoming FaceId of each packet

After you provide this information, I'll determine whether there's a problem.

#30 Updated by Junxiao Shi over 4 years ago

  • Status changed from In Progress to Code review
  • % Done changed from 80 to 100

Packet loss recovery test case is added.

#31 Updated by Alex Afanasyev over 4 years ago

I've added an Overhead test case to show my concern about the strategy.

#32 Updated by Junxiao Shi over 4 years ago

I've seen the Overhead test case added in commit 2c1243ada4b8199e9c3357c8b71836b3e82a749f.
The strategy's behavior may be suboptimal in this scenario, but it's still correct.
Changing the design may cause suboptimal behavior in other scenarios.
This problem should be addressed in a separate issue.

The strategy is implemented using the design approved in note-9.
If there's no other problem, the code should be approved; that test case will be deleted before merging and re-added in the future issue.
I disagree with changing the design again in this issue.
I want to let the current design out for a real-world test.

#33 Updated by Junxiao Shi over 4 years ago

  • Related to Bug #2403: AccessStrategy: after producer failure, Interests still goes to last nexthop added

#34 Updated by Junxiao Shi over 4 years ago

The design concern mentioned in notes 26-29 and 31-32 is split to #2403.

#35 Updated by Alex Afanasyev over 4 years ago

The intention of my comments is to raise awareness of the issues. I'm not asking for design changes as part of this task.

#36 Updated by Junxiao Shi over 4 years ago

  • Status changed from Code review to Closed

#37 Updated by Junxiao Shi over 4 years ago

A brief description of the new strategy is added to NFD Developer Guide.

#38 Updated by Junxiao Shi over 4 years ago

Uploading newer design slides, which are supposed to be the same as what's implemented in commit:80ee7cb931eb48327659621fa1b294f47f79f1ec

#39 Updated by Junxiao Shi over 4 years ago

  • Blocks Bug #2452: AccessStrategy: Measurements entry lifetime is not extended added
