Bug #3230
best-route strategy: initial retx suppression is too low
Status: Closed
% Done: 100%
Description
The best-route strategy has a consumer retransmission suppression feature.
A consumer retransmission, from the same or a different downstream face, is suppressed (not forwarded) if it arrives within a suppression duration calculated with an exponential back-off algorithm.
The initial suppression period is 1ms.
This setting causes excessive retransmissions when multiple consumers request the same content.
One notable example is:
- Run one NDNRTC producer, connected to one NDN testbed router.
- Run two NDNRTC consumers fetching the same stream at the same bitrate, connected to another NDN testbed router.
- Observe the NDN testbed bandwidth map.
Expected: most traffic is forwarded on one path between the two routers.
Actual: about 10% of the traffic is also forwarded on a second path.
It's believed that this observation is caused by Interests from the second consumer being treated as consumer retransmissions.
When the two consumers are not perfectly synchronized, and the second Interest arrives more than 1ms but less than one RTT after the first, it is forwarded on a secondary path.
Setting a larger initial suppression period, such as 10ms, should prevent this behavior, while still allowing a consumer to retransmit the Interest if the application detects a packet loss.
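To illustrate the difference between a 1ms and a 10ms initial period, here is a minimal Python sketch of exponential back-off suppression. It is not NFD's code: the class name `RetxSuppressor`, the multiplier of 2, the 250ms cap, and keying state by name only (ignoring Selectors) are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class RetxSuppressor:
    """Illustrative exponential back-off suppression, keyed by Interest name."""
    initial_ms: float
    multiplier: float = 2.0   # assumed back-off factor
    max_ms: float = 250.0     # assumed upper bound on the suppression period
    # name -> (time of last forwarding in ms, current suppression period in ms)
    _state: dict = field(default_factory=dict)

    def on_interest(self, name: str, now_ms: float) -> str:
        entry = self._state.get(name)
        if entry is None:
            # First Interest for this name: forward it and start at the initial period.
            self._state[name] = (now_ms, self.initial_ms)
            return "new"
        last_forward_ms, period_ms = entry
        if now_ms - last_forward_ms < period_ms:
            # Arrived within the suppression period: do not forward.
            return "suppress"
        # Outside the period: forward again and back off exponentially, up to the cap.
        next_period = min(period_ms * self.multiplier, self.max_ms)
        self._state[name] = (now_ms, next_period)
        return "forward"


# Second consumer's Interest arrives 5ms after the first one.
with_1ms = RetxSuppressor(initial_ms=1.0)
with_1ms.on_interest("/ndnrtc/frame/1", 0.0)
print(with_1ms.on_interest("/ndnrtc/frame/1", 5.0))    # forward

with_10ms = RetxSuppressor(initial_ms=10.0)
with_10ms.on_interest("/ndnrtc/frame/1", 0.0)
print(with_10ms.on_interest("/ndnrtc/frame/1", 5.0))   # suppress
```

With the 1ms setting, the Interest arriving 5ms later is treated as a retransmission worth forwarding, which is the suspected source of the traffic on the second path. With 10ms it is suppressed, and an application that retransmits after 10ms or more still gets its Interest forwarded.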
Updated by Junxiao Shi about 9 years ago
The 20151008 conference call discussed this problem.
The change to 10ms is approved.
Other ideas raised during the call are recorded below.
Beichuan believes the best initial suppression period should be the Round Trip Time between this forwarder and the content source.
Alex points out that the "two NDNRTC consumers" scenario arises because NFD does not distinguish between an Interest with the same Name+Selectors coming from the same face and one coming from a different face.
- An Interest with the same Name+Selectors coming from the same face is a retransmission.
- An Interest with the same Name+Selectors coming from a different face indicates that another consumer is requesting the same Data, and should be suppressed the first time it arrives.
However, Beichuan points out that Interests with the same Name+Selectors coming from different downstream faces can still come from the same consumer.
This can happen if the consumer is not using the same best-route strategy.
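The distinction Alex describes could be sketched as follows. This is only an illustration of the idea from the call, not something NFD implements; `PitEntry`, `classify`, and the returned labels are hypothetical names, and the entry is assumed to already correspond to one Name+Selectors combination.

```python
class PitEntry:
    """Illustrative pending-Interest record for one Name+Selectors combination."""

    def __init__(self):
        # Faces from which this Interest has already been received.
        self.downstream_faces = set()

    def classify(self, face_id: int) -> str:
        if not self.downstream_faces:
            kind = "new"                # first Interest for this entry
        elif face_id in self.downstream_faces:
            kind = "retransmission"     # same face asked again
        else:
            kind = "other-consumer"     # first request from a different face
        self.downstream_faces.add(face_id)
        return kind


entry = PitEntry()
print(entry.classify(256))  # new
print(entry.classify(257))  # other-consumer
print(entry.classify(256))  # retransmission
```

Under this idea, only the "retransmission" case would go through the back-off check, while "other-consumer" would be suppressed on first arrival. Beichuan's caveat applies directly: a single consumer reaching the forwarder over two downstream faces would be labeled "other-consumer" here even though it is retransmitting.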
Alex also suggests that the initial suppression period could be set to 10% of InterestLifetime, because RTT cannot be reliably measured.
The benefit is that the application can influence the initial suppression period by changing the InterestLifetime.
The drawback is that some applications may be willing to wait a long time for a Data (e.g. 10s), but may want to retransmit the Interest sooner than 10% of that (e.g. 1s).
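The arithmetic behind this proposal and its drawback can be written out as a small sketch. The function name `initial_suppression_ms` and the `percent` parameter are made up for this example; the 4s figure is used as the usual default InterestLifetime.

```python
def initial_suppression_ms(interest_lifetime_ms: int, percent: int = 10) -> int:
    """Initial suppression period as a percentage of InterestLifetime (integer ms)."""
    return interest_lifetime_ms * percent // 100


print(initial_suppression_ms(4000))    # 400  (4s lifetime)
print(initial_suppression_ms(10000))   # 1000 (10s lifetime)
```

An application that sets a 10s InterestLifetime would have its retransmissions suppressed for the first 1s under this rule, even if it detects a loss much earlier.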
After NDNLPv2 introduces link reliability improvements, the need for application-level retransmission may be reduced but won't be eliminated.
Updated by Junxiao Shi about 9 years ago
- Status changed from New to In Progress
- Assignee set to Junxiao Shi
- Target version set to v0.4
Updated by Junxiao Shi about 9 years ago
- Status changed from In Progress to Code review
- % Done changed from 0 to 100
- Estimated time set to 1.00 h
Updated by Junxiao Shi about 9 years ago
- Status changed from Code review to Closed
Updated by Junxiao Shi almost 9 years ago
- Status changed from Closed to Feedback
- % Done changed from 100 to 90
I'm reopening this because the NFD Developer Guide contains incorrect information about the retx period.
Updated by Junxiao Shi almost 9 years ago
- Status changed from Feedback to Closed
- % Done changed from 90 to 100
The devguide is updated in nfd-docs:commit:6c602414e63d1e14398d132e273b9b80f6feea0e.
Updated by Jeff Burke over 8 years ago
Updated by Junxiao Shi over 8 years ago
We should not further increase the suppression period, because it would impede an application's attempt to retransmit when it detects a packet loss using a high-level mechanism.
I suggest waiting for NDNLPv2 link reliability improvements, after which the suppression period can be raised again.
Estimated timeline: more than 4 months.
Updated by Jeff Burke over 8 years ago
Ok. First, we'll try to figure out definitively if we're indeed observing something caused by this suppression. (#3551) (At some point I'd like to better understand the assumptions here - is link reliability required for best route to function properly with this type of traffic?)