Bug #3275
closedMeasurements::get(Fib::s_emptyEntry) crash
Added by susmit shannigrahi about 9 years ago. Updated about 9 years ago.
100%
Description
Steps to reproduce:
- construct
Fib
andMeasurements
attached to the sameNameTree
- call
fib.findLongestPrefixMatch("/")
; since FIB is empty, this returnsFib::s_emptyEntry
- call
measurements.get
with the FIB entry obtained from step2
Expected: a Measurements entry of ndn:/ is created
Actual: crash
Files
nfd-core-dump.txt (20.9 KB) nfd-core-dump.txt | susmit shannigrahi, 10/19/2015 02:47 PM | ||
den_log (32 KB) den_log | Valgrind log of crashed NFD. | susmit shannigrahi, 10/26/2015 01:03 PM |
Updated by Alex Afanasyev about 9 years ago
Can you give (steps to create) environment to reproduce an error? I don't see an obvious reason why this failure happens.
Does it happens only with the latest version of with 0.3.4 too?
Updated by susmit shannigrahi about 9 years ago
Alex Afanasyev wrote:
Can you give (steps to create) environment to reproduce an error? I don't see an obvious reason why this failure happens.
Does it happens only with the latest version of with 0.3.4 too?
We are using 0.3.4, not git HEAD.
Steps to reproduce:
Have strategy installed on a node (Let's call it nodeA)
The node should have same routes(/prefix/test) to multiple data producers (nodeA connected to producerA and producerB)
Kill the producer at producerA. Make sure routes for both producerA and producerB are still present at nodeA.
Send an Interest for the prefix (/prefix/test). Sometime it takes a few tries to reproduce this crash.
Updated by Davide Pesavento about 9 years ago
In the backtrace I see some funny numbers in the use count of some shared_ptr
s (very large values, negative values, ...). I suspect some kind of memory corruption is going on. Did you compile ndn-cxx and NFD yourself, both with the same compiler version?
Updated by susmit shannigrahi about 9 years ago
Yes, we compiled both with same compiler version.
Updated by Davide Pesavento about 9 years ago
Are you running NFD on the same machine where it was compiled?
Updated by susmit shannigrahi about 9 years ago
Yes, we are running on the same machine.
Updated by susmit shannigrahi about 9 years ago
- Priority changed from Normal to High
Folks, please take a look if you can. This is a blocker for a demo we are trying to do in three weeks.
Updated by Davide Pesavento about 9 years ago
Try building and running with AddressSanitizer, see if it reports anything. Or under valgrind.
Updated by susmit shannigrahi about 9 years ago
Does this log help? Please let me know if I can provide anything else to expedite this.
Updated by susmit shannigrahi about 9 years ago
Here is the valgrind log.
This is how I ran it.
# valgrind --log-file=/tmp/den_log --track-origins=yes nfd
Updated by susmit shannigrahi about 9 years ago
Ok, we found the problem.
// FIB lookup
shared_ptr<fib::Entry> fibEntry = m_fib.findLongestPrefixMatch(*pitEntry);
this->dispatchToStrategy(pitEntry, bind(&Strategy::afterReceiveInterest, _1,
cref(inFace), cref(interest), fibEntry, pitEntry));
A fibEntry is created by longest prefix match. If there is no match, an emptyEntry is returned. The return value isn't checked before passing it to dispatchToStrategy(). This causes a crash if fibEntry is nullptr.
Here is the code that fixes the segfault.
// FIB lookup
shared_ptr<fib::Entry> fibEntry = m_fib.findLongestPrefixMatch(*pitEntry);
// XXX This assert fails
//BOOST_ASSERT(fibEntry->m_nameTreeEntry != nullptr);
if (fibEntry->m_nameTreeEntry == nullptr)
{
NFD_LOG_WARN("Received Interest " << interest.getName() << " before setting up FIB entry");
return;
}
// dispatch to strategy
this->dispatchToStrategy(pitEntry, bind(&Strategy::afterReceiveInterest, _1,
cref(inFace), cref(interest), fibEntry, pitEntry));
}
Updated by Junxiao Shi about 9 years ago
- Subject changed from NFD crashes when custom strategy tries to access measurement table to Measurements::get(Fib::s_emptyEntry) crash
- Description updated (diff)
- Category changed from Forwarding to Tables
- Assignee set to Junxiao Shi
- Priority changed from High to Normal
Original report:
We are trying to use a simple random weighted strategy with two nodes. When one node can't serve data, strategy should forward the Interests to the other node.
The strategy code is here: https://github.com/dibenede/nfd-strategies/tree/master/weighted-load-balancer
We see NFD crashes while measurement info is accessed. I am attaching the full trace.
The finding in note-11 is correct, but the solution is wrong: it's legit to receive Interest without FIB entry, and a strategy may choose to forward it (e.g. flooding).
I'll think about a better solution.
Updated by Junxiao Shi about 9 years ago
- Status changed from New to Code review
- Target version set to v0.4
- % Done changed from 0 to 100
- Estimated time set to 1.00 h
There are three possible solutions:
SOLUTION A
Fib::s_emptyEntry
should be replaced with a NameTree-attached entry, i.e. FIB entry for ndn:/
cannot be erased.
SOLUTION B
NameTree::get
can have a special case for Fib::s_emptyEntry
.
SOLUTION C
Measurements::get
can have a special case for Fib::s_emptyEntry
.
I'm unsure which solution is better.
http://gerrit.named-data.net/2585 implements solution C.
Updated by Haowei Yuan about 9 years ago
Does Solution C mean that Interests with different names will share the same measurement entry if there are no matching FIB entries?
Updated by Junxiao Shi about 9 years ago
Does Solution C mean that Interests with different names will share the same measurement entry if there are no matching FIB entries?
Every solution will return the same Measurements entry, ndn:/
, if the strategy invokes Measurements::get(const fib::Entry&)
with the FIB entry found by longest prefix match on an empty FIB.
There's no semantic difference among the three solutions.
Updated by Junxiao Shi about 9 years ago
- Status changed from Code review to Closed