Project

General

Profile

Actions

Feature #3931

closed

Implement NDNLP Link Reliability Protocol

Added by Eric Newberry almost 8 years ago. Updated over 7 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
Faces
Target version:
Start date:
Due date:
% Done:

100%

Estimated time:

Description

Implement the NDNLP Link Reliability Protocol, as per the design in #3823. This issue includes implementing the repeatable Ack field in ndn-cxx and acknowledgements, frame loss detection, and retransmissions in NFD's GenericLinkService.


Related issues 3 (0 open3 closed)

Related to NFD - Feature #3823: Congestion Control: design Local Link Loss DetectionClosedEric Newberry

Actions
Blocks NFD - Feature #4003: FaceMgmt: enable link reliabilityClosedEric Newberry

Actions
Blocked by ndn-cxx - Bug #4156: Encoding LpPackets after adding repeating field causes other fields to be lostClosedEric Newberry06/27/2017

Actions
Actions #1

Updated by Eric Newberry almost 8 years ago

  • Related to Feature #3823: Congestion Control: design Local Link Loss Detection added
Actions #2

Updated by Eric Newberry almost 8 years ago

  • Status changed from New to In Progress
  • % Done changed from 0 to 10
Actions #3

Updated by Eric Newberry almost 8 years ago

  • Status changed from In Progress to Code review
  • % Done changed from 10 to 100
Actions #4

Updated by Eric Newberry almost 8 years ago

I noticed that the current implementation of NDNLP in ndn-cxx does not put repeatable fields of the same TLV-TYPE in the order they were added to the packet. I will submit a patch shortly that changes this behavior. I believe that inserting them in the order they are added works best for unit testing Acks in the reliability feature.

Actions #5

Updated by Alex Afanasyev almost 8 years ago

I still have a question about https://gerrit.named-data.net/#/c/3626/. When this commit is merged, what are the implications? Is reliability gets suddenly enabled on all/some links? If not, how can it be enabled?

Actions #6

Updated by Eric Newberry almost 8 years ago

Alex Afanasyev wrote:

I still have a question about https://gerrit.named-data.net/#/c/3626/. When this commit is merged, what are the implications? Is reliability gets suddenly enabled on all/some links? If not, how can it be enabled?

Like the other link services, this feature is enabled using a boolean value in GenericLinkService::Options. In this case, the option is further nested under LpReliability::Options. By default, the feature is disabled. Eventually, there should be a flag or option added to the faces/update management command to control this feature.

Actions #7

Updated by Eric Newberry almost 8 years ago

I was thinking that it might be better to enable reliability by default for certain types of links (depending on the Transport type - perhaps UnicastUdpTransport and EthernetTransport). In addition, I think that it should be possible to enable/disable this feature through FaceMgmt. However, I believe this should be an issue for a later commit, since the current implementation of LpReliability in https://gerrit.named-data.net/#/c/3626 is disabled by default.

Actions #8

Updated by Junxiao Shi almost 8 years ago

Actions #9

Updated by Eric Newberry almost 8 years ago

Eric Newberry wrote:

I was thinking that it might be better to enable reliability by default for certain types of links (depending on the Transport type - perhaps UnicastUdpTransport and EthernetTransport).

I realized that EthernetTransport is multicast, so disregard that one.

Actions #10

Updated by Eric Newberry over 7 years ago

  • % Done changed from 100 to 80

Most of the code for the reliability feature has been merged, although Klaus, Beichuan, and I have agreed to change the retransmission design again. I will post more details about the design change in #3823 soon.

Actions #11

Updated by Eric Newberry over 7 years ago

  • % Done changed from 80 to 100

The new design using TxSequence numbers in Acks has been uploaded for code review.

Actions #12

Updated by Junxiao Shi over 7 years ago

I'm looking at https://gerrit.named-data.net/3848 patchset7 and I feel that the data structure is inefficient.
Having separate m_unackedFrags and m_netPkts containers and linking them via TxSequenceNumber result in unnecessary lookups: you'll need a std::map lookup to find a linked entry.
Instead, they can be linked via iterators and pointers:

  • UnackedFrag has a shared_ptr to NetPkt. NetPkt has a set of iterators into m_unackedFrags.
  • When a fragment is acknowledged, find it in m_unackedFrags, delete its iterator in the NetPkt. If this the last TxSequenceNumber of a NetPkt, the set of iterators in NetPkt would be empty, and it's time to increment NetPkt-level counters. Finally, delete from m_unackedFrags.
  • When the algorithm retransmits a fragment, a new TxSequenceNumber would be assigned. In this operation, the fragment's iterator in NetPkt should be updated.
  • When the algorithm decides to give up on a fragment, other fragments belonging to the same NetPkt should be discarded as well. They can be found via the set of iterators in NetPkt.

The above procedure also makes no assumption on a particular fragmentation protocol that uses the SequenceNumber field, and eliminates the conditionals about whether the NetPkt is fragmented or unfragmented.
It simply treats the fragments passed to handleOutgoing call as belonging to the same NetPkt.

Actions #13

Updated by Junxiao Shi over 7 years ago

From https://gerrit.named-data.net/#/c/3848/25/daemon/face/lp-reliability.cpp:

    pkt.list<lp::AckField>(); // If this isn't here, the next add will cause all other fields to not encode
    pkt.add<lp::AckField>(ackSeq);

I cannot reproduce this problem using the minimal snippet below:

// g++ -std=c++11 -o x x.cpp $(pkg-config --cflags --libs libndn-cxx)

#include <ndn-cxx/lp/packet.hpp>
#include <iostream>

int
main()
{
  ndn::lp::Packet lpPacket;
  lpPacket.set<ndn::lp::FragIndexField>(0x8000);
  lpPacket.add<ndn::lp::AckField>(0x7001);
  lpPacket.add<ndn::lp::AckField>(0x7002);
  ndn::Block b = lpPacket.wireEncode();
  std::cout.write(reinterpret_cast<const char*>(b.wire()), b.size());
}

Its output is as expected:

64 LpPacket 10
  52 FragIndex 02
     80 00
  fd 03 44 Ack 02
     70 01
  fd 03 44 Ack 02
     70 02

You may try to find the problem by:

  1. Collect the sequence of function calls invoked on the lp::Packet in question.
  2. Confirm the problem exists in this sequence. If it doesn't, this is possibly a memory error (dangling pointer, etc), use valgrind to debug the original program.
  3. Reduce the sequence until the problem disappears.
  4. Find the minimal sequence that causes the problem. If you haven't found the problem by then, post a snippet of that minimal sequence and I'll have a look.
Actions #14

Updated by Eric Newberry over 7 years ago

It appears that inserting an Ack was causing all later non-Ack elements to be removed from the packet. I found that adding "m_wire.parse()" to lp::Packet::add just before finding the position of the new sub-element fixed the issue. I'll push a patch to Gerrit shortly.

Actions #15

Updated by Junxiao Shi over 7 years ago

inserting an Ack was causing all later non-Ack elements to be removed from the packet

I cannot reproduce this issue.
I'm trying with this snippet:

// g++ -std=c++11 -o x x.cpp $(pkg-config --cflags --libs libndn-cxx)

#include <ndn-cxx/lp/packet.hpp>
#include <iostream>

int
main()
{
  ndn::lp::Packet lpPacket;
  //lpPacket.set<ndn::lp::FragIndexField>(0x8000);
  lpPacket.add<ndn::lp::AckField>(0x7001);
  lpPacket.add<ndn::lp::AckField>(0x7002);

  lpPacket.set<ndn::lp::FragCountField>(0x9000);
  //lpPacket.add<ndn::lp::FragCountField>(0x9000);
  //lpPacket.set<ndn::lp::TxSequenceField>(0x9800);
  //lpPacket.add<ndn::lp::TxSequenceField>(0x9800);

  ndn::Block b = lpPacket.wireEncode();
  std::cout.write(reinterpret_cast<const char*>(b.wire()), b.size());
}

I've tried commenting and uncommenting the FragIndex line, using one of four lines to set/add a "later" field in which "later" is interpreted as "added later" or "appears later in encoding".
None of them makes any field disappear.

In any case, this belongs to a separate issue. Please report a bug in ndn-cxx tracker with a minimal snippet to reproduce, before attempting any fix.

Actions #16

Updated by Eric Newberry over 7 years ago

Junxiao, try putting

std::cout << lpPacket.wireEncode().size() << std::endl;

between the Ack adds. The size of the packet should become 8, only including the LpPacket TLV-TYPE and TLV-LENGTH, as well as the Ack.

Actions #17

Updated by Eric Newberry over 7 years ago

  • Blocked by Bug #4156: Encoding LpPackets after adding repeating field causes other fields to be lost added
Actions #18

Updated by Junxiao Shi over 7 years ago

  • Status changed from Code review to Closed
  • Start date deleted (01/22/2017)
Actions

Also available in: Atom PDF