Bug #3554 (closed)
SegmentFetcher restarts from segment0 upon Nack
100%
Description
Snippet to reproduce:
// g++ -o x -std=c++0x x.cpp $(pkg-config --cflags --libs libndn-cxx)
#include <ndn-cxx/security/key-chain.hpp>
#include <ndn-cxx/security/signing-helpers.hpp>
#include <ndn-cxx/security/validator-null.hpp>
#include <ndn-cxx/util/dummy-client-face.hpp>
#include <ndn-cxx/util/segment-fetcher.hpp>
#include <iostream>

namespace ndn {
namespace demo {

using util::DummyClientFace;
using util::SegmentFetcher;

KeyChain g_keyChain;

// Create a signed Data packet; optionally mark it as the final segment.
shared_ptr<Data>
makeData(const Name& name, bool isFinalBlock = false)
{
  auto data = make_shared<Data>(name);
  if (isFinalBlock) {
    data->setMetaInfo(MetaInfo().setFinalBlockId(name.at(-1)));
  }
  g_keyChain.sign(*data, ndn::signingWithSha256());
  return data;
}

int
main(int argc, char** argv)
{
  DummyClientFace face;
  int nNacks = 2; // number of Nacks to send for segment1

  face.onSendInterest.connect([&] (const Interest& interest) {
    std::cout << interest << std::endl;
    const Name& name = interest.getName();

    // Discovery Interest (no segment component): answer with segment0.
    if (!name.at(-1).isSegment()) {
      auto data0 = makeData(Name(name).appendVersion().appendSegment(0));
      face.getIoService().post([&face, data0] { face.receive(*data0); });
      return;
    }

    // Answer the first two Interests for segment1 with Nack-Duplicate.
    uint64_t segmentNo = name.at(-1).toSegment();
    if (segmentNo == 1 && nNacks-- > 0) {
      lp::Nack nack(interest);
      nack.setReason(lp::NackReason::DUPLICATE);
      face.getIoService().post([&face, nack] { face.receive(nack); });
      return;
    }

    // Everything else is answered with Data; segment3 is the final block.
    auto data1 = makeData(name, segmentNo == 3);
    face.getIoService().post([&face, data1] { face.receive(*data1); });
  });

  SegmentFetcher::fetch(face,
                        Interest("ndn:/A"),
                        make_shared<ValidatorNull>(),
                        bind([] { std::cout << "COMPLETE" << std::endl; }),
                        bind([] { std::cout << "ERROR" << std::endl; }));

  face.processEvents();
  return 0;
}

} // namespace demo
} // namespace ndn

int
main(int argc, char** argv)
{
  return ndn::demo::main(argc, argv);
}
This snippet uses SegmentFetcher to fetch 4 segments.
The first and second Interests for segment1 are answered with Nack-Duplicate.
All other responses are Data.
Expected: SegmentFetcher retrieves segment2 after receiving Data for segment1.
Actual: SegmentFetcher discards segment1 Data and restarts from segment0, as the Interest log shows:
/A?ndn.ChildSelector=1&ndn.MustBeFresh=1&ndn.Nonce=369644762
/A/%FD%00%00%01Sz%F6%FC%1E/%00%01?ndn.ChildSelector=0&ndn.Nonce=509290956
/A/%FD%00%00%01Sz%F6%FC%1E/%00%01?ndn.ChildSelector=0&ndn.Nonce=944403614
/A/%FD%00%00%01Sz%F6%FC%1E/%00%01?ndn.ChildSelector=0&ndn.Nonce=1709422619
/A/%FD%00%00%01Sz%F6%FC%1E/%00%00?ndn.ChildSelector=0&ndn.Nonce=3626967908
/A/%FD%00%00%01Sz%F6%FC%1E/%00%01?ndn.ChildSelector=0&ndn.Nonce=2208639054
/A/%FD%00%00%01Sz%F6%FC%1E/%00%02?ndn.ChildSelector=0&ndn.Nonce=765477500
/A/%FD%00%00%01Sz%F6%FC%1E/%00%03?ndn.ChildSelector=0&ndn.Nonce=84957429
COMPLETE
Root cause: in SegmentFetcher::reExpressInterest, afterSegmentReceived is always called with isSegmentZeroExpected=true.
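To illustrate the root cause, here is a minimal sketch that models the fetch loop without ndn-cxx. The function fetchOrder and its parameter preserveFlagOnRetry are hypothetical names made up for this illustration, and the loop is a simplified assumption of how the flag is handled, not the actual SegmentFetcher code. It assumes the discovery Interest was already answered with segment0, that segment1 is nacked twice, and that segment3 is the final block, as in the snippet above.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model, not the ndn-cxx implementation.
// Returns the segment numbers requested after the discovery Interest.
// preserveFlagOnRetry == false models the reported behavior: a retry after
// a Nack always hands isSegmentZeroExpected=true to the Data handler.
std::vector<uint64_t>
fetchOrder(bool preserveFlagOnRetry)
{
  std::vector<uint64_t> requested;
  int nacksLeft = 2;               // segment1 is nacked twice
  const uint64_t finalSegment = 3; // segment3 carries the FinalBlockId

  // Discovery was answered with segment0, so the next request is segment1
  // and the Data handler no longer expects segment0.
  uint64_t segment = 1;
  bool isSegmentZeroExpected = false;

  while (true) {
    requested.push_back(segment);

    if (segment == 1 && nacksLeft > 0) {
      // Nack received: re-express the Interest for the same segment.
      --nacksLeft;
      if (!preserveFlagOnRetry) {
        isSegmentZeroExpected = true; // the bug: flag is reset on retry
      }
      continue;
    }

    // Data received for `segment`.
    if (isSegmentZeroExpected && segment != 0) {
      // Handler believes segment0 is still missing: discard and restart.
      segment = 0;
      isSegmentZeroExpected = false;
      continue;
    }

    if (segment == finalSegment) {
      break; // all segments received
    }
    ++segment;
    isSegmentZeroExpected = false;
  }
  return requested;
}
```

Under these assumptions, fetchOrder(false) requests segments 1, 1, 1, 0, 1, 2, 3, which matches the Interest log above after the discovery Interest, while fetchOrder(true) requests 1, 1, 1, 2, 3, which is the expected behavior.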
Updated by Junxiao Shi over 8 years ago
- Assignee set to Muktadir Chowdhury
- Estimated time set to 2.00 h
I'm assigning this to @Muktadir, who wrote the SegmentFetcher::reExpressInterest function.
Updated by Muktadir Chowdhury over 8 years ago
- Status changed from New to In Progress
Updated by Muktadir Chowdhury over 8 years ago
I submitted a patch with unit tests for the bug. However, the patch does not pass the Jenkins build: a unit test fails there, but it does not fail on my machines. I cannot reproduce the error on either of them (OSX 10.10.5 and Ubuntu 14.04 64-bit), so it is difficult for me to debug it with Valgrind. Can anybody give me some ideas?
Updated by Junxiao Shi over 8 years ago
The error message reported by Jenkins on Ubuntu 14.04 64-bit is:
http://jenkins.named-data.net/job/ndn-cxx/3476/OS=Ubuntu-14.04-64bit/consoleText
Entering test suite "UtilSegmentFetcher"
Entering test case "ZeroComponentName"
unknown location(0): fatal error in "ZeroComponentName": memory access violation at address: 0x7fe59f0d07f1: invalid permissions
Test is aborted
Leaving test case "ZeroComponentName"; testing time: 680mks
Leaving test suite "UtilSegmentFetcher"
You should be able to reproduce the error when:
- ndn-cxx is configured with
  ./waf -j1 --color=yes configure --debug --enable-shared --disable-static --with-tests --without-pch --with-examples
- Tests are executed as
  valgrind ./build/unit-tests --log_level=all
If not, paste the full output of valgrind so that others can have a look.
Updated by Muktadir Chowdhury over 8 years ago
I pushed a new patch. It built successfully on all machines except OSX 10.11. Here is the link to the output: http://jenkins.named-data.net/job/ndn-cxx/3477/OS=OSX-10.11/console
It looks like OpenSSL is not installed on that machine.
Updated by Junxiao Shi over 8 years ago
Reply to note-5:
INSTALL.rst does not specify OpenSSL as a dependency, so ndn-cxx shouldn't rely on OpenSSL anywhere and should not install a header that includes OpenSSL.
You may delete the offending header in a separate commit before your commit.
Alternatively, introduce OpenSSL as a dependency, with necessary updates in documentation and CI scripts.
If you take this route, report a separate bug and use a separate commit.
Updated by Muktadir Chowdhury over 8 years ago
There is a header file called openssl.hpp that includes all the OpenSSL headers, and it is not used anywhere. The ndn-cxx 0.3 release notes say that the unused OpenSSL dependency was removed. Should I remove the openssl.hpp file altogether, or just remove the includes in it?
Updated by Junxiao Shi over 8 years ago
Answer to note-7:
Yes, you can delete openssl.hpp in a separate commit under this issue.
Updated by Muktadir Chowdhury over 8 years ago
- Status changed from In Progress to Closed