Project

General

Profile

Actions

Bug #3275

closed

Measurements::get(Fib::s_emptyEntry) crash

Added by susmit shannigrahi about 9 years ago. Updated about 9 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
Tables
Target version:
Start date:
Due date:
% Done:

100%

Estimated time:
1.00 h

Description

Steps to reproduce:

  1. construct Fib and Measurements attached to the same NameTree
  2. call fib.findLongestPrefixMatch("/"); since FIB is empty, this returns Fib::s_emptyEntry
  3. call measurements.get with the FIB entry obtained from step2

Expected: a Measurements entry of ndn:/ is created

Actual: crash


Files

nfd-core-dump.txt (20.9 KB) nfd-core-dump.txt susmit shannigrahi, 10/19/2015 02:47 PM
den_log (32 KB) den_log Valgrind log of crashed NFD. susmit shannigrahi, 10/26/2015 01:03 PM
Actions #1

Updated by Alex Afanasyev about 9 years ago

Can you give (steps to create) environment to reproduce an error? I don't see an obvious reason why this failure happens.

Does it happens only with the latest version of with 0.3.4 too?

Actions #2

Updated by susmit shannigrahi about 9 years ago

Alex Afanasyev wrote:

Can you give (steps to create) environment to reproduce an error? I don't see an obvious reason why this failure happens.

Does it happens only with the latest version of with 0.3.4 too?

We are using 0.3.4, not git HEAD.

Steps to reproduce:

  • Have strategy installed on a node (Let's call it nodeA)

  • The node should have same routes(/prefix/test) to multiple data producers (nodeA connected to producerA and producerB)

  • Kill the producer at producerA. Make sure routes for both producerA and producerB are still present at nodeA.

  • Send an Interest for the prefix (/prefix/test). Sometime it takes a few tries to reproduce this crash.

Actions #3

Updated by Davide Pesavento about 9 years ago

In the backtrace I see some funny numbers in the use count of some shared_ptrs (very large values, negative values, ...). I suspect some kind of memory corruption is going on. Did you compile ndn-cxx and NFD yourself, both with the same compiler version?

Actions #4

Updated by susmit shannigrahi about 9 years ago

Yes, we compiled both with same compiler version.

Actions #5

Updated by Davide Pesavento about 9 years ago

Are you running NFD on the same machine where it was compiled?

Actions #6

Updated by susmit shannigrahi about 9 years ago

Yes, we are running on the same machine.

Actions #7

Updated by susmit shannigrahi about 9 years ago

  • Priority changed from Normal to High

Folks, please take a look if you can. This is a blocker for a demo we are trying to do in three weeks.

Actions #8

Updated by Davide Pesavento about 9 years ago

Try building and running with AddressSanitizer, see if it reports anything. Or under valgrind.

Actions #9

Updated by susmit shannigrahi about 9 years ago

Does this log help? Please let me know if I can provide anything else to expedite this.

Actions #10

Updated by susmit shannigrahi about 9 years ago

Here is the valgrind log.

This is how I ran it.


# valgrind --log-file=/tmp/den_log --track-origins=yes nfd


Actions #11

Updated by susmit shannigrahi about 9 years ago

Ok, we found the problem.

  // FIB lookup
  shared_ptr<fib::Entry> fibEntry = m_fib.findLongestPrefixMatch(*pitEntry);
  this->dispatchToStrategy(pitEntry, bind(&Strategy::afterReceiveInterest, _1,
                                          cref(inFace), cref(interest), fibEntry, pitEntry));

A fibEntry is created by longest prefix match. If there is no match, an emptyEntry is returned. The return value isn't checked before passing it to dispatchToStrategy(). This causes a crash if fibEntry is nullptr.

Here is the code that fixes the segfault.

  // FIB lookup
  shared_ptr<fib::Entry> fibEntry = m_fib.findLongestPrefixMatch(*pitEntry);

  // XXX This assert fails
  //BOOST_ASSERT(fibEntry->m_nameTreeEntry != nullptr);
  if (fibEntry->m_nameTreeEntry == nullptr)
    {
      NFD_LOG_WARN("Received Interest " << interest.getName() << " before setting up FIB entry");
      return;
    }

  // dispatch to strategy
  this->dispatchToStrategy(pitEntry, bind(&Strategy::afterReceiveInterest, _1,
                                          cref(inFace), cref(interest), fibEntry, pitEntry));
}
Actions #12

Updated by susmit shannigrahi about 9 years ago

  • Category set to Forwarding
Actions #13

Updated by Junxiao Shi about 9 years ago

  • Subject changed from NFD crashes when custom strategy tries to access measurement table to Measurements::get(Fib::s_emptyEntry) crash
  • Description updated (diff)
  • Category changed from Forwarding to Tables
  • Assignee set to Junxiao Shi
  • Priority changed from High to Normal

Original report:

We are trying to use a simple random weighted strategy with two nodes. When one node can't serve data, strategy should forward the Interests to the other node.
The strategy code is here: https://github.com/dibenede/nfd-strategies/tree/master/weighted-load-balancer
We see NFD crashes while measurement info is accessed. I am attaching the full trace.

The finding in note-11 is correct, but the solution is wrong: it's legit to receive Interest without FIB entry, and a strategy may choose to forward it (e.g. flooding).

I'll think about a better solution.

Actions #14

Updated by Junxiao Shi about 9 years ago

  • Status changed from New to Code review
  • Target version set to v0.4
  • % Done changed from 0 to 100
  • Estimated time set to 1.00 h

There are three possible solutions:

SOLUTION A

Fib::s_emptyEntry should be replaced with a NameTree-attached entry, i.e. FIB entry for ndn:/ cannot be erased.

SOLUTION B

NameTree::get can have a special case for Fib::s_emptyEntry.

SOLUTION C

Measurements::get can have a special case for Fib::s_emptyEntry.

I'm unsure which solution is better.

http://gerrit.named-data.net/2585 implements solution C.

Actions #15

Updated by Haowei Yuan about 9 years ago

Does Solution C mean that Interests with different names will share the same measurement entry if there are no matching FIB entries?

Actions #16

Updated by Junxiao Shi about 9 years ago

Does Solution C mean that Interests with different names will share the same measurement entry if there are no matching FIB entries?

Every solution will return the same Measurements entry, ndn:/, if the strategy invokes Measurements::get(const fib::Entry&) with the FIB entry found by longest prefix match on an empty FIB.

There's no semantic difference among the three solutions.

Actions #17

Updated by Junxiao Shi about 9 years ago

  • Status changed from Code review to Closed
Actions

Also available in: Atom PDF