Bug #3976
Closed
NLSR not converging due to NFD faces/create semantics change
100%
Description
NLSR currently creates an adjacency by invoking the NFD faces/create command and expecting a 200 response.
Since #3232 (https://gerrit.named-data.net/3012), the NFD faces/create command responds with code 409 if a face with the same RemoteUri already exists, which breaks NLSR's expectation.
To solve this problem, NLSR should either use the faces/query dataset to find the existing face, or inspect the response body that comes with code 409 to determine whether the existing face matches the expectation.
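The first option (looking up the existing face in the faces dataset) could be sketched roughly as follows. This is only an illustration: FaceEntry and findFaceByRemoteUri are hypothetical names, and the struct is a simplified stand-in for a dataset entry, not the actual ndn-cxx type.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for one entry of the faces dataset.
// The real dataset entries carry more fields; only these are assumed here.
struct FaceEntry {
  uint64_t faceId;
  std::string localUri;
  std::string remoteUri;
};

// Return a pointer to the first face whose RemoteUri equals `remoteUri`,
// or nullptr if the dataset contains no such face.
const FaceEntry*
findFaceByRemoteUri(const std::vector<FaceEntry>& faces, const std::string& remoteUri)
{
  for (const auto& face : faces) {
    if (face.remoteUri == remoteUri) {
      return &face;
    }
  }
  return nullptr;
}
```

If a match is found, NLSR could reuse its faceId instead of issuing faces/create again; otherwise it would create the face as before.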
Updated by Junxiao Shi over 7 years ago
- Related to Bug #3232: Inaccurate log message when changing FacePersistency added
Updated by Junxiao Shi over 7 years ago
- Subject changed from NLSR not converging with latest NFD to NLSR not converging due to NFD faces/create semantics change
- Description updated (diff)
- Priority changed from High to Urgent
Updated by Ashlesh Gawande over 7 years ago
If I do the following fix:
void
HelloProtocol::onRegistrationFailure(const ndn::nfd::ControlResponse& response,
                                     const ndn::Name& name,
                                     const ndn::time::milliseconds& timeout)
{
  _LOG_DEBUG(response.getText() << " (code: " << response.getCode() << ")");
  // Temporary fix until #2954
  if (response.getText() == "Face with remote URI already exists") {
    // Treat the existing face as if it had just been created.
    ndn::nfd::ControlParameters faceParams(response.getBody());
    _LOG_WARN(response.getText() << " (code: " << response.getCode() << ") "
              << faceParams.getFaceId());
    onRegistrationSuccess(faceParams, name, timeout);
    return;
  }
  // ...
}
NLSR still does not converge.
In a two-node topology, node b's NLSR sends a HELLO Interest to node a,
but NFD's best-route strategy reports no route:
1487969928.411582 DEBUG: [ContentStore] find /ndn/a-site/%C1.Router/cs/a/NLSR/INFO/%07%1E%08%03ndn%08%06b-site%08%08%C1.Router%08%02cs%08%01b L
1487969928.411616 DEBUG: [ContentStore] no-match
1487969928.411621 DEBUG: [Forwarder] onContentStoreMiss interest=/ndn/a-site/%C1.Router/cs/a/NLSR/INFO/%07%1E%08%03ndn%08%06b-site%08%08%C1.Router%08%02cs%08%01b
1487969928.411636 TRACE: [Strategy] lookupFib noLinkObject found=/
1487969928.411668 DEBUG: [BestRouteStrategy2] /ndn/a-site/%C1.Router/cs/a/NLSR/INFO/%07%1E%08%03ndn%08%06b-site%08%08%C1.Router%08%02cs%08%01b?ndn.MustBeFresh=1&ndn.InterestLifetime=3000&ndn.Nonce=3969901325 from=257 noNextHop
1487969928.411683 DEBUG: [Forwarder] onOutgoingNack face=257 nack=/ndn/a-site/%C1.Router/cs/a/NLSR/INFO/%07%1E%08%03ndn%08%06b-site%08%08%C1.Router%08%02cs%08%01b~NoRoute OK
1487969928.411696 TRACE: [LinkService] [id=257,local=unix:///run/b.sock,remote=fd://21] sendNack
I think I am missing some step here.
Also, the face from a to b is persistent,
but the face from b to a is on-demand
(a starts before b, and its HELLO reaches b).
Updated by Ashlesh Gawande over 7 years ago
Oh I think I need to update the persistency:
https://redmine.named-data.net/issues/3232#note-1
Updated by Ashlesh Gawande over 7 years ago
Okay, so this does not work because I am not actually calling registerPrefixInNfd.
Updated by Junxiao Shi over 7 years ago
Reply to note-3:
- Never rely on (or even look at) response.getText(). The correct condition is response.getCode() == 409.
- Code 409 from faces/create just means that the face already exists. Routing convergence depends not only on the existence of the face, but also on the existence of a route to reach the peer NLSR in NFD-RIB. You are getting a Nack due to the lack of a RIB route.
- The correct place to apply a quick fix is in FaceController::createFaceInNfd. Instead of calling onFailure directly, when encountering code 409 and the conflicting face has the correct LocalUri and RemoteUri, treat it as success.
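The quick fix described in the list above might look roughly like this. It is a sketch under stated assumptions, not the actual FaceController code: CreateFaceResponse, isUsableFace and handleCreateFaceResponse are hypothetical names, and the struct assumes the 409 response body has already been decoded into the conflicting face's FaceId, LocalUri and RemoteUri. The decision is based on the status code only, never on the response text.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>

// Hypothetical, simplified stand-in for a decoded faces/create response.
// For code 409 the URIs are assumed to describe the conflicting face.
struct CreateFaceResponse {
  uint32_t code;
  uint64_t faceId;
  std::string localUri;
  std::string remoteUri;
};

// A face is usable if it was just created (200), or if it already
// existed (409) and matches the expected LocalUri and RemoteUri.
bool
isUsableFace(const CreateFaceResponse& response,
             const std::string& expectedLocalUri,
             const std::string& expectedRemoteUri)
{
  if (response.code == 200) {
    return true;
  }
  if (response.code == 409) {
    return response.localUri == expectedLocalUri &&
           response.remoteUri == expectedRemoteUri;
  }
  return false;
}

// Dispatch to onSuccess with the FaceId, or to onFailure with the code.
void
handleCreateFaceResponse(const CreateFaceResponse& response,
                         const std::string& expectedLocalUri,
                         const std::string& expectedRemoteUri,
                         const std::function<void(uint64_t)>& onSuccess,
                         const std::function<void(uint32_t)>& onFailure)
{
  if (isUsableFace(response, expectedLocalUri, expectedRemoteUri)) {
    onSuccess(response.faceId);
  }
  else {
    onFailure(response.code);
  }
}
```

Note that onSuccess here only confirms that a usable face exists. The caller would still have to register a RIB route over that face, since the Nack in the log is caused by the missing route rather than by the missing face.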
Updated by Ashlesh Gawande over 7 years ago
Thanks Junxiao, this is better!
https://gerrit.named-data.net/#/c/3724/
Also, one question: in a two-node topology, NLSR on node a starts first and does not get a 409, but node b does.
So when you create a UDP face from one NFD, is a face also created at the other NFD?
Updated by Ashlesh Gawande over 7 years ago
- Status changed from New to Code review
- % Done changed from 0 to 90
Updated by Ashlesh Gawande over 7 years ago
- Status changed from Code review to Closed
- % Done changed from 90 to 100