Feature #3931
closedImplement NDNLP Link Reliability Protocol
100%
Description
Implement the NDNLP Link Reliability Protocol, as per the design in #3823. This issue includes implementing the repeatable Ack field in ndn-cxx and acknowledgements, frame loss detection, and retransmissions in NFD's GenericLinkService.
Updated by Eric Newberry almost 8 years ago
- Related to Feature #3823: Congestion Control: design Local Link Loss Detection added
Updated by Eric Newberry almost 8 years ago
- Status changed from New to In Progress
- % Done changed from 0 to 10
Updated by Eric Newberry almost 8 years ago
- Status changed from In Progress to Code review
- % Done changed from 10 to 100
Updated by Eric Newberry almost 8 years ago
I noticed that the current implementation of NDNLP in ndn-cxx does not put repeatable fields of the same TLV-TYPE in the order they were added to the packet. I will submit a patch shortly that changes this behavior. I believe that inserting them in the order they are added works best for unit testing Acks in the reliability feature.
Updated by Alex Afanasyev almost 8 years ago
I still have a question about https://gerrit.named-data.net/#/c/3626/. When this commit is merged, what are the implications? Is reliability gets suddenly enabled on all/some links? If not, how can it be enabled?
Updated by Eric Newberry almost 8 years ago
Alex Afanasyev wrote:
I still have a question about https://gerrit.named-data.net/#/c/3626/. When this commit is merged, what are the implications? Is reliability gets suddenly enabled on all/some links? If not, how can it be enabled?
Like the other link services, this feature is enabled using a boolean value in GenericLinkService::Options. In this case, the option is further nested under LpReliability::Options. By default, the feature is disabled. Eventually, there should be a flag or option added to the faces/update management command to control this feature.
Updated by Eric Newberry almost 8 years ago
I was thinking that it might be better to enable reliability by default for certain types of links (depending on the Transport type - perhaps UnicastUdpTransport and EthernetTransport). In addition, I think that it should be possible to enable/disable this feature through FaceMgmt. However, I believe this should be an issue for a later commit, since the current implementation of LpReliability in https://gerrit.named-data.net/#/c/3626 is disabled by default.
Updated by Junxiao Shi almost 8 years ago
- Blocks Feature #4003: FaceMgmt: enable link reliability added
Updated by Eric Newberry almost 8 years ago
Eric Newberry wrote:
I was thinking that it might be better to enable reliability by default for certain types of links (depending on the Transport type - perhaps UnicastUdpTransport and EthernetTransport).
I realized that EthernetTransport is multicast, so disregard that one.
Updated by Eric Newberry over 7 years ago
- % Done changed from 100 to 80
Most of the code for the reliability feature has been merged, although Klaus, Beichuan, and I have agreed to change the retransmission design again. I will post more details about the design change in #3823 soon.
Updated by Eric Newberry over 7 years ago
- % Done changed from 80 to 100
The new design using TxSequence numbers in Acks has been uploaded for code review.
Updated by Junxiao Shi over 7 years ago
I'm looking at https://gerrit.named-data.net/3848 patchset7 and I feel that the data structure is inefficient.
Having separate m_unackedFrags
and m_netPkts
containers and linking them via TxSequenceNumber result in unnecessary lookups: you'll need a std::map
lookup to find a linked entry.
Instead, they can be linked via iterators and pointers:
- UnackedFrag has a
shared_ptr
to NetPkt. NetPkt has a set of iterators intom_unackedFrags
. - When a fragment is acknowledged, find it in
m_unackedFrags
, delete its iterator in the NetPkt. If this the last TxSequenceNumber of a NetPkt, the set of iterators in NetPkt would be empty, and it's time to increment NetPkt-level counters. Finally, delete fromm_unackedFrags
. - When the algorithm retransmits a fragment, a new TxSequenceNumber would be assigned. In this operation, the fragment's iterator in NetPkt should be updated.
- When the algorithm decides to give up on a fragment, other fragments belonging to the same NetPkt should be discarded as well. They can be found via the set of iterators in NetPkt.
The above procedure also makes no assumption on a particular fragmentation protocol that uses the SequenceNumber field, and eliminates the conditionals about whether the NetPkt is fragmented or unfragmented.
It simply treats the fragments passed to handleOutgoing call as belonging to the same NetPkt.
Updated by Junxiao Shi over 7 years ago
From https://gerrit.named-data.net/#/c/3848/25/daemon/face/lp-reliability.cpp:
pkt.list<lp::AckField>(); // If this isn't here, the next add will cause all other fields to not encode
pkt.add<lp::AckField>(ackSeq);
I cannot reproduce this problem using the minimal snippet below:
// g++ -std=c++11 -o x x.cpp $(pkg-config --cflags --libs libndn-cxx)
#include <ndn-cxx/lp/packet.hpp>
#include <iostream>
int
main()
{
ndn::lp::Packet lpPacket;
lpPacket.set<ndn::lp::FragIndexField>(0x8000);
lpPacket.add<ndn::lp::AckField>(0x7001);
lpPacket.add<ndn::lp::AckField>(0x7002);
ndn::Block b = lpPacket.wireEncode();
std::cout.write(reinterpret_cast<const char*>(b.wire()), b.size());
}
Its output is as expected:
64 LpPacket 10
52 FragIndex 02
80 00
fd 03 44 Ack 02
70 01
fd 03 44 Ack 02
70 02
You may try to find the problem by:
- Collect the sequence of function calls invoked on the
lp::Packet
in question. - Confirm the problem exists in this sequence. If it doesn't, this is possibly a memory error (dangling pointer, etc), use
valgrind
to debug the original program. - Reduce the sequence until the problem disappears.
- Find the minimal sequence that causes the problem. If you haven't found the problem by then, post a snippet of that minimal sequence and I'll have a look.
Updated by Eric Newberry over 7 years ago
It appears that inserting an Ack was causing all later non-Ack elements to be removed from the packet. I found that adding "m_wire.parse()" to lp::Packet::add
just before finding the position of the new sub-element fixed the issue. I'll push a patch to Gerrit shortly.
Updated by Junxiao Shi over 7 years ago
inserting an Ack was causing all later non-Ack elements to be removed from the packet
I cannot reproduce this issue.
I'm trying with this snippet:
// g++ -std=c++11 -o x x.cpp $(pkg-config --cflags --libs libndn-cxx)
#include <ndn-cxx/lp/packet.hpp>
#include <iostream>
int
main()
{
ndn::lp::Packet lpPacket;
//lpPacket.set<ndn::lp::FragIndexField>(0x8000);
lpPacket.add<ndn::lp::AckField>(0x7001);
lpPacket.add<ndn::lp::AckField>(0x7002);
lpPacket.set<ndn::lp::FragCountField>(0x9000);
//lpPacket.add<ndn::lp::FragCountField>(0x9000);
//lpPacket.set<ndn::lp::TxSequenceField>(0x9800);
//lpPacket.add<ndn::lp::TxSequenceField>(0x9800);
ndn::Block b = lpPacket.wireEncode();
std::cout.write(reinterpret_cast<const char*>(b.wire()), b.size());
}
I've tried commenting and uncommenting the FragIndex line, using one of four lines to set/add a "later" field in which "later" is interpreted as "added later" or "appears later in encoding".
None of them makes any field disappear.
In any case, this belongs to a separate issue. Please report a bug in ndn-cxx tracker with a minimal snippet to reproduce, before attempting any fix.
Updated by Eric Newberry over 7 years ago
Junxiao, try putting
std::cout << lpPacket.wireEncode().size() << std::endl;
between the Ack adds. The size of the packet should become 8, only including the LpPacket TLV-TYPE and TLV-LENGTH, as well as the Ack.
Updated by Eric Newberry over 7 years ago
- Blocked by Bug #4156: Encoding LpPackets after adding repeating field causes other fields to be lost added
Updated by Junxiao Shi over 7 years ago
- Status changed from Code review to Closed
- Start date deleted (
01/22/2017)