Bug #3990
Closed
ERROR: Interest size exceeds maximum limit on exclude interest for large topologies
100%
Description
I was working on the issue of using Chronosync for NLSR instead of the old fork (nsync).
The small changes I made in Chronosync and NLSR can be found at:
https://gist.github.com/agawande/d9a6f3cc1245c133bb48c8ac58353579
For smaller topologies, NLSR converges and there are no errors. For bigger topologies, such as one with 58 nodes and 350+ links, NLSR crashes with:
ERROR: Interest size exceeds maximum limit
(I have included a small mini-ndn experiment in the gist that reproduces the error on the default 4 node topology by advertising 500 prefixes in succession).
This error seems to happen after expressing an interest with an exclude filter:
https://github.com/named-data/ChronoSync/blob/f42aa2c05b5cbc73f7f592c4a6cee3f205f84e07/src/logic.cpp#L780
(Nsync has no problems with either large topologies or advertising 500 prefixes)
Updated by Ashlesh Gawande over 7 years ago
- Blocks Task #2400: Determine the necessary changes to use current Chronosync as dependency added
Updated by Ashlesh Gawande over 7 years ago
I changed the code so that if the exclude interest is larger than MAX_NDN_PACKET_SIZE, it is not sent and the function returns early.
This seems to work: NLSR can converge and recover.
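The size check described above could look roughly like the following sketch. It is not the actual patch from this ticket, and it does not use ndn-cxx. `ExcludeInterestModel`, `estimatedWireSize`, `canSendExcludeInterest`, and the per-entry cost of 34 bytes are hypothetical names and assumptions made for illustration. The value 8800 is meant to mirror `ndn::MAX_NDN_PACKET_SIZE` from ndn-cxx and is restated here as an assumption. In real code one would presumably compare the wire-encoded size of the Interest against that constant instead of estimating it.

```cpp
#include <cstddef>

// Assumed to mirror ndn::MAX_NDN_PACKET_SIZE from ndn-cxx (bytes).
constexpr std::size_t MAX_NDN_PACKET_SIZE = 8800;

// Assumed cost of one excluded implicit-digest component:
// a 32-byte SHA-256 digest plus 2 bytes of TLV type and length.
constexpr std::size_t EXCLUDE_ENTRY_SIZE = 32 + 2;

// Hypothetical stand-in for an exclude interest; not an ndn-cxx type.
struct ExcludeInterestModel {
  std::size_t baseSize;   // name, nonce, lifetime, TLV headers (assumed total)
  std::size_t nExcluded;  // number of entries in the exclude filter
};

// Rough estimate of the encoded size under the assumptions above.
std::size_t estimatedWireSize(const ExcludeInterestModel& interest) {
  return interest.baseSize + interest.nExcluded * EXCLUDE_ENTRY_SIZE;
}

// Guard: true if the interest fits in one packet and may be sent,
// false if it should be dropped (the caller simply returns).
bool canSendExcludeInterest(const ExcludeInterestModel& interest) {
  return estimatedWireSize(interest) <= MAX_NDN_PACKET_SIZE;
}
```

Under these assumptions, an interest with a 100-byte base fits about 255 excluded digests before the guard rejects it. If each received sync data added one entry to the filter, which is itself an assumption, a burst of 500 updates could exceed the limit.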
Then I ran a few NLSR convergence experiments in Mini-NDN:
- 10 node experiment (run on my laptop)
- NSync : 2150 Sync interests, converges in at least 35 seconds
- Chronosync: 6320 Sync interests, converges in at least 35 seconds
- 33 node current testbed (run on powerful machine)
- NSync : 12000 Sync interests, converges in at least 30 seconds
- Chronosync: 31493 Sync interests, converges in at least 30 seconds
So Chronosync sends roughly 2.6 to 2.9 times as many sync interests as the old implementation (6320/2150 ≈ 2.9 and 31493/12000 ≈ 2.6).
I saw that an exclude interest is sent every time sync data is received.
Alex's comment here confirms this:
https://gerrit.named-data.net/#/c/3605/11/src/logic.cpp (Exclude filter change)
logic: Sending exclude filter always when receiving sync data is not ideal solution, as it simply inflates the number of interests sent.
The solution we thought would work is to send such interests only when simultaneous data generation is detected.
This will incur additional delay, but would not incur unnecessary cost.
If I want to follow this suggestion in addition to the MAX_NDN_PACKET_SIZE check, how can I detect simultaneous data generation?
(I think this is why I get the size error in the small 4-node topology, and even in a 2-node topology, when only one node is producing data in quick succession.)
Updated by Ashlesh Gawande over 7 years ago
- Status changed from New to Closed
- Assignee set to Ashlesh Gawande
- % Done changed from 0 to 100