Bug #2592
Ineffective duplicate suppression for satisfied Interest
Status: open
% Done: 0%
Description
Topology:
A---R---B
    |
    C
Steps to reproduce:
- at time 0ms, A sends an Interest to R with InterestLifetime=4000ms; R forwards this Interest to C
- at time 10ms, C returns Data to R, which is returned to A; the Data is cached in R's ContentStore
- at time 20ms, B sends a duplicate Interest to R (same Nonce as step 1 Interest)
Expected: R suppresses the Interest in step 3
Actual: R returns Data from ContentStore to B
Root cause:
After DeadNonceList was introduced in commit:a110f268c7ced8eddffe437abd5ac9cf923a905f, a PIT entry can detect a duplicate Nonce only if one of its in-records or out-records contains that Nonce.
However, in the incoming Data pipeline, the "mark PIT satisfied" step deletes all in-records.
Thus, a PIT entry is unable to detect a duplicate Nonce after it is satisfied.
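The reproduction steps and root cause can be illustrated with a toy model. This is a minimal sketch, not NFD code: `PitEntry`, its method names, and the face labels are hypothetical, and the sketch assumes the out-record toward C is also removed when Data arrives from C, which the report implies but does not state.

```python
from dataclasses import dataclass, field


@dataclass
class PitEntry:
    """Toy PIT entry: each record maps a face label to the Nonce seen on it."""
    in_records: dict = field(default_factory=dict)
    out_records: dict = field(default_factory=dict)
    satisfied: bool = False

    def has_duplicate_nonce(self, nonce):
        # A Nonce is detectable only while some in-record or out-record holds it.
        return (nonce in self.in_records.values()
                or nonce in self.out_records.values())

    def on_incoming_interest(self, face, nonce):
        self.in_records[face] = nonce

    def on_outgoing_interest(self, face, nonce):
        self.out_records[face] = nonce

    def on_incoming_data(self, face):
        # Assumption: the out-record for the face that delivered Data is removed.
        self.out_records.pop(face, None)
        # "mark PIT satisfied" deletes all in-records, as described above.
        self.in_records.clear()
        self.satisfied = True


def reproduce():
    entry = PitEntry()
    nonce = 0x1234
    entry.on_incoming_interest("A", nonce)    # 0 ms: Interest from A
    entry.on_outgoing_interest("C", nonce)    # R forwards to C
    detectable_before = entry.has_duplicate_nonce(nonce)
    entry.on_incoming_data("C")               # 10 ms: Data from C
    detectable_after = entry.has_duplicate_nonce(nonce)  # 20 ms: B's duplicate
    return detectable_before, detectable_after
```

In this model `reproduce()` reports that the Nonce is detectable while the entry is pending and no longer detectable once it is satisfied, which matches the "Actual" behavior in the report.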
Updated by Alex Afanasyev over 9 years ago
I see an issue, though I don't see a big problem here.
Updated by Junxiao Shi over 9 years ago
This affects measurements of other nodes.
Topology:
consumer--A-\
          | |
          D--R--C--producer
          | |
          B-/
- A sends Interest to R and D
- R forwards Interest to C
- D sends Interest to R, but it's suppressed
- C returns Data, which is returned to A
- D retries and sends Interest via B
- B sends Interest to R
- R returns Data to B from ContentStore due to this Bug, which is then returned to D
D learns that B is a better nexthop than R, but actually R is better.
See also http://www.lists.cs.ucla.edu/pipermail/nfd-dev/2015-February/000888.html.
Updated by Junxiao Shi over 9 years ago
The 20150302 conference call discussed this problem.
This Bug itself is an implementation problem: Nonces are deleted from the PIT entry during the straggler period.
But fixing this Bug would barely help with the "wrong measurements" described in note-2:
Suppose "B sends to R" happens after R's PIT entry is erased by the straggler timer (which can happen if D has RttEstimator-based retx and its RTO is greater than R's straggler timer). R would then be unable to detect the duplicate Nonce and would return the Data, making D believe B is a better nexthop than R.
In fact, timing from the original report shows that "B sends to R" happened one millisecond before R's straggler timer expired.
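The timing argument above can be written down as a small decision function. This is only a sketch under stated assumptions: the function name and parameters are hypothetical, the 100 ms straggler value in the usage example is an illustrative number rather than one taken from this report, and DeadNonceList is ignored.

```python
def can_detect_duplicate(arrival_ms, data_ms, straggler_ms,
                         keeps_nonces_when_satisfied):
    """Return True if R's PIT entry can still match the duplicate Nonce.

    keeps_nonces_when_satisfied=False models the current behavior (this Bug);
    True models a fix that keeps Nonces during the straggler period.
    """
    if arrival_ms < data_ms:
        # Entry is still pending: in-records and out-records hold the Nonce.
        return True
    if arrival_ms < data_ms + straggler_ms:
        # Entry is satisfied but not yet erased (straggler period).
        return keeps_nonces_when_satisfied
    # Entry has been erased by the straggler timer; nothing is left to match.
    return False


# Hypothetical numbers: Data arrives at 10 ms, straggler timer is 100 ms.
just_before_expiry = can_detect_duplicate(109, 10, 100, True)
after_expiry = can_detect_duplicate(150, 10, 100, True)
```

With these numbers, a fix helps an Interest arriving one millisecond before expiry, but an Interest arriving after the entry is erased is served from the ContentStore either way.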
There are a few possible solutions to fix both this Bug and the "wrong measurements" issue:
- Extend DeadNonceList to also detect duplicates that have no risk of looping.
  DeadNonceList was originally designed to prevent loops, and doesn't need to prevent multi-path arrivals that carry no looping hazard.
  Extending its usage would cause DeadNonceList to use much more memory.
  This solution would make R not respond to B's Interest even after the straggler timer has expired.
  However, if another consumer sends an Interest via B with a unique Nonce, R would return Data, which would be cached in B's ContentStore and could then be returned to D.
  Thus, a more fundamental issue is: D's strategy needs to be smarter, and shouldn't base its measurements on one retrieval that could come from an off-path cache.
- Generate unique Nonces for probing.
  A could use unique Nonces for the Interest to R and the Interest to D, so that R won't suppress D's Interest.
  This means a duplicate Nonce always indicates looping, because multi-path arrivals would always carry unique Nonces.
- Don't use Nonces for duplicate suppression.
  R could ignore Nonces completely, and return Data to whoever requested it, without caring whether there's a loop or a multi-path arrival.
  Data can go around one cycle but won't loop forever, because the in-record is gone when Data loops back.
  This solution would also solve #1966.
  This shouldn't cause congestion, because when D sends an Interest to R it should have accounted for the returning Data.
  A related question: if Nonce isn't used for duplicate suppression, it could be dropped from the packet.
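The first option (extending DeadNonceList) and its limitation can be sketched as follows. This is a toy model under loud assumptions: the `DeadNonceList` and `Forwarder` classes here are hypothetical stand-ins, not NFD's implementation, the FIFO capacity is arbitrary, and the rule "record every Nonce of a satisfied Interest" is one possible reading of the proposal.

```python
from collections import OrderedDict


class DeadNonceList:
    """Toy bounded set of (name, nonce) pairs with FIFO eviction."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()

    def add(self, name, nonce):
        key = (name, nonce)
        self._entries[key] = None
        self._entries.move_to_end(key)
        # Evict the oldest entries once the capacity is exceeded.
        while len(self._entries) > self.capacity:
            self._entries.popitem(last=False)

    def has(self, name, nonce):
        return (name, nonce) in self._entries


class Forwarder:
    """Toy forwarder R: checks the DeadNonceList before the ContentStore."""

    def __init__(self, dnl_capacity=1024):
        self.dnl = DeadNonceList(dnl_capacity)
        self.content_store = {}

    def on_data(self, name, payload, pending_nonces):
        # Assumption: every Nonce of the satisfied Interest is recorded,
        # even when there is no looping hazard.
        self.content_store[name] = payload
        for nonce in pending_nonces:
            self.dnl.add(name, nonce)

    def on_interest(self, name, nonce):
        if self.dnl.has(name, nonce):
            return "suppressed"
        if name in self.content_store:
            return "data"
        return "forwarded"
```

In this model, R suppresses B's duplicate regardless of the straggler timer, but an Interest for the same name with a fresh Nonce still gets Data from the ContentStore, and a small capacity forgets Nonces quickly. That mirrors the memory cost and the off-path cache concern noted in the first option.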
Updated by Alex Afanasyev almost 9 years ago
- Target version changed from v0.4 to v0.5
Updated by Junxiao Shi almost 7 years ago
- Assignee deleted (Junxiao Shi)
- Target version deleted (v0.5)