Bug #2403

closed

AccessStrategy: after producer failure, Interests still go to last nexthop

Added by Junxiao Shi about 9 years ago. Updated over 8 years ago.

Status: Rejected
Priority: Normal
Assignee: -
Category: Forwarding
Target version: -
Start date:
Due date:
% Done: 0%
Estimated time:

Description

Test case to reproduce:

BOOST_FIXTURE_TEST_CASE(Overhead, TwoLaptopsFixture)
{
  /*
   *             /------------------\                              /------------------\
   *             | intervalConsumer |                              | intervalConsumer |
   *             \------------------/              A               \------------------/
   *                      ^ v                      f                        ^ v
   *                      | v /laptops/M           t                        | v /laptops/M
   *                      |                        e                        |
   *                      v                        r                        v
   *      /laptops << +--------+ >> /laptops                /laptops << +--------+ >> /laptops
   *           +----->| router |<------+           1             +----->| router |<------+
   *           |      +--------+       |                         |      +--------+       |
   *      10ms |                       | 20ms  === s ==>    10ms |                       | 20ms
   *           v                       v           e             v                       v
   *      +---------+             +---------+      c        +---------+             +---------+
   *      | laptopA |             | laptopB |      o        | laptopA |             | laptopB |
   *      +---------+             +---------+      n        +---------+             +---------+
   *           ^  v                                d
   *           |  v /laptops/M
   *           v
   *    /--------------\
   *    | echoProducer |
   *    \--------------/
   */

  // both laptopA and laptopB have the prefix in router FIB; only laptopA hosts a producer
  topo.registerPrefix(router, linkA->getFace(router), "ndn:/laptops");
  topo.registerPrefix(router, linkB->getFace(router), "ndn:/laptops");

  shared_ptr<TopologyAppLink> producerA = topo.addAppFace(laptopA, "ndn:/laptops/A");
  topo.addEchoProducer(*producerA->getClientFace());

  // to build up "correct" RTO value
  shared_ptr<TopologyAppLink> consumer = topo.addAppFace(router);
  topo.addIntervalConsumer(*consumer->getClientFace(), "ndn:/laptops/A",
                           time::milliseconds(10), 100);
  this->advanceClocks(time::milliseconds(1), time::seconds(1));

  auto me = topo.getForwarder(router).getMeasurements().findLongestPrefixMatch("ndn:/laptops/A");
  BOOST_REQUIRE(me != nullptr);

  auto mi = me->getStrategyInfo<fw::AccessStrategy::MtInfo>();
  BOOST_REQUIRE(mi != nullptr);

  auto rto = mi->rtt.computeRto();
  BOOST_CHECK(time::milliseconds(10) < rto && rto < time::milliseconds(30));

  BOOST_CHECK_EQUAL(linkA->getFace(router)->m_sentInterests.size(), 100);
  BOOST_CHECK_EQUAL(linkA->getFace(laptopA)->m_sentDatas.size(), 100);
  BOOST_CHECK_EQUAL(linkB->getFace(router)->m_sentInterests.size(), 3);

  linkA->getFace(router)->m_sentInterests.clear();
  linkA->getFace(laptopA)->m_sentDatas.clear();
  linkB->getFace(router)->m_sentInterests.clear();

  // actual test
  producerA->fail();

  topo.addIntervalConsumer(*consumer->getClientFace(), "ndn:/laptops/A",
                           time::milliseconds(31), 10);

  // unicast to laptopA
  this->advanceClocks(time::milliseconds(1), time::milliseconds(10));
  BOOST_CHECK_EQUAL(linkA->getFace(router)->m_sentInterests.size(), 1);
  BOOST_CHECK_EQUAL(linkB->getFace(router)->m_sentInterests.size(), 0);

  // multicast
  this->advanceClocks(time::milliseconds(1), time::milliseconds(20));
  BOOST_CHECK_EQUAL(linkA->getFace(router)->m_sentInterests.size(), 1);
  BOOST_CHECK_EQUAL(linkB->getFace(router)->m_sentInterests.size(), 1);

  // unicast to laptopA
  this->advanceClocks(time::milliseconds(1), time::milliseconds(10));
  BOOST_CHECK_EQUAL(linkA->getFace(router)->m_sentInterests.size(), 2);
  BOOST_CHECK_EQUAL(linkB->getFace(router)->m_sentInterests.size(), 1);

  // multicast
  this->advanceClocks(time::milliseconds(1), time::milliseconds(20));
  BOOST_CHECK_EQUAL(linkA->getFace(router)->m_sentInterests.size(), 2);
  BOOST_CHECK_EQUAL(linkB->getFace(router)->m_sentInterests.size(), 2);

  // unicast to laptopA
  this->advanceClocks(time::milliseconds(1), time::milliseconds(10));
  BOOST_CHECK_EQUAL(linkA->getFace(router)->m_sentInterests.size(), 3);
  BOOST_CHECK_EQUAL(linkB->getFace(router)->m_sentInterests.size(), 2);

  // multicast
  this->advanceClocks(time::milliseconds(1), time::milliseconds(20));
  BOOST_CHECK_EQUAL(linkA->getFace(router)->m_sentInterests.size(), 3);
  BOOST_CHECK_EQUAL(linkB->getFace(router)->m_sentInterests.size(), 3);

  this->advanceClocks(time::milliseconds(5), time::seconds(1));

  BOOST_CHECK_EQUAL(linkA->getFace(router)->m_sentInterests.size(), 10);
  BOOST_CHECK_EQUAL(linkB->getFace(router)->m_sentInterests.size(), 10);

  this->advanceClocks(time::milliseconds(5), time::seconds(5));

  // all consumers should be done more than 4 seconds ago
  me = topo.getForwarder(router).getMeasurements().findLongestPrefixMatch("ndn:/laptops/A");
  BOOST_CHECK(me == nullptr);
}

Actual (as shown in the test assertions): after producer failure, each Interest is first unicast to the nexthop where the producer was, and then multicast after an RTO. This continues until the MeasurementsEntry expires.

Expected: after producer failure, Interests are multicast right away.
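
The timing that the test assertions encode can be approximated with a small model. This is only a sketch under stated assumptions, not NFD code: the names SentCounts, countFired, sentAfterProducerFailure and checkAgainstTestCase are hypothetical, the first Interest is assumed to leave at t = 0, and the RTO is assumed to be exactly 20 ms (the test only requires it to lie between 10 ms and 30 ms).

```cpp
#include <cassert>

// Interests the router has sent so far on each link (toy model, not NFD code).
struct SentCounts
{
  int toLastNexthop;  // corresponds to linkA in the test case
  int toOtherNexthop; // corresponds to linkB in the test case
};

// Number of periodic events that have fired by nowMs, when event i fires at
// offsetMs + i * intervalMs and at most maxCount events exist.
// Assumes intervalMs > 0.
constexpr int
countFired(int nowMs, int offsetMs, int intervalMs, int maxCount)
{
  return nowMs < offsetMs
           ? 0
           : (((nowMs - offsetMs) / intervalMs + 1) > maxCount
                ? maxCount
                : ((nowMs - offsetMs) / intervalMs + 1));
}

// Observed behavior: every Interest is unicast to the last nexthop when it
// arrives, and multicast to the other nexthop one RTO later.
constexpr SentCounts
sentAfterProducerFailure(int nowMs, int intervalMs, int rtoMs, int nInterests)
{
  return {countFired(nowMs, 0, intervalMs, nInterests),
          countFired(nowMs, rtoMs, intervalMs, nInterests)};
}

// Replays the checkpoints of the test case (31 ms interval, 10 Interests)
// with the assumed 20 ms RTO.
inline void
checkAgainstTestCase()
{
  assert(sentAfterProducerFailure(10, 31, 20, 10).toLastNexthop == 1);
  assert(sentAfterProducerFailure(10, 31, 20, 10).toOtherNexthop == 0);
  assert(sentAfterProducerFailure(30, 31, 20, 10).toOtherNexthop == 1);
  assert(sentAfterProducerFailure(40, 31, 20, 10).toLastNexthop == 2);
  assert(sentAfterProducerFailure(60, 31, 20, 10).toOtherNexthop == 2);
}
```

Under these assumptions the model gives the same counts as the assertions at the 10 ms, 30 ms, 40 ms and 60 ms checkpoints: the other nexthop always lags the last nexthop by one RTO.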


Related issues 1 (0 open, 1 closed)

Related to NFD - Feature #1999: Strategy for access router (Closed, Junxiao Shi)

Actions #1

Updated by Junxiao Shi about 9 years ago

Actions #2

Updated by Junxiao Shi about 9 years ago

  • Description updated (diff)

This design problem was raised by Alex in commit 2c1243ada4b8199e9c3357c8b71836b3e82a749f.

I disagree with this design change.

When a producer does not return Data within RTO, it could be one of the following cases:

  1. There's no Data matching the Interest, and the producer does not support producer-generated NACK.
  2. Producer fails temporarily (e.g. overloaded), and no alternate producer exists.
  3. Producer fails permanently, and no alternate producer exists.
  4. Producer has moved.

In case 1, suppose the consumer retransmits:

("unicast count" and "multicast count" include the initial Interest; "N" is the number of nexthops in the FIB entry; "delay" is the extra time for the second Interest to reach a producer that can answer it)

                 unicast count  multicast count  delay
current design   2              2N-2             N/A
proposed change  1              2N-1             N/A

In case 1, suppose the consumer sends another Interest that has matching Data at the producer:

                 unicast count  multicast count  delay
current design   2              N-1              none
proposed change  1              2N-1             none

In case 2, suppose the consumer retransmits, or sends another Interest:

                 unicast count  multicast count  delay
current design   2              N-1              none
proposed change  1              2N-1             none

In case 3, suppose the consumer retransmits, or sends another Interest:

                 unicast count  multicast count  delay
current design   2              2N-2             N/A
proposed change  1              2N-1             N/A

In case 4, suppose the consumer retransmits, or sends another Interest:

                 unicast count  multicast count  delay
current design   2              2N-2             RTO
proposed change  1              2N-1             none

The current design has advantages in most scenarios: its outgoing Interest count (unicast plus multicast) is less than or equal to that of the proposed change.
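
As a rough check of the comparison above, the counts from the tables can be written as functions of N and summed. This is a sketch only: the names Scenario, InterestCount, currentDesign, proposedChange and checkCurrentNeverWorse are hypothetical and not part of NFD, and the values are copied from the tables rather than derived from the strategy code.

```cpp
#include <cassert>

// Outgoing Interest counts at the router for two consecutive Interests
// (the initial Interest plus a retransmission or a follow-up Interest).
struct InterestCount
{
  int unicast;
  int multicast;

  constexpr int
  total() const
  {
    return unicast + multicast;
  }
};

// The scenarios from the tables; enumerator names are illustrative only.
enum class Scenario {
  NoDataRetransmit,    // case 1, consumer retransmits
  NoDataThenOtherName, // case 1, next Interest has matching Data
  TemporaryFailure,    // case 2
  PermanentFailure,    // case 3
  ProducerMoved        // case 4
};

// Current design: the second Interest is multicast again only in the
// scenarios where the tables list 2N-2; otherwise the count stays at N-1.
constexpr InterestCount
currentDesign(Scenario s, int n)
{
  return (s == Scenario::NoDataThenOtherName || s == Scenario::TemporaryFailure)
           ? InterestCount{2, n - 1}
           : InterestCount{2, 2 * n - 2};
}

// Proposed change: the tables list the same counts for every scenario.
constexpr InterestCount
proposedChange(Scenario, int n)
{
  return InterestCount{1, 2 * n - 1};
}

// Asserts that the current design never sends more Interests in total than
// the proposed change, for 1..maxN nexthops.
inline void
checkCurrentNeverWorse(int maxN)
{
  const Scenario all[] = {Scenario::NoDataRetransmit, Scenario::NoDataThenOtherName,
                          Scenario::TemporaryFailure, Scenario::PermanentFailure,
                          Scenario::ProducerMoved};
  for (int n = 1; n <= maxN; ++n) {
    for (Scenario s : all) {
      assert(currentDesign(s, n).total() <= proposedChange(s, n).total());
    }
  }
}
```

With N = 4, for example, both designs send 8 Interests in total when the producer has moved, while for a temporary failure the current design sends 5 and the proposed change sends 8.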

The current design has a disadvantage in case 4: it incurs one RTO of delay. However, once the new producer is found, the nexthop toward it becomes the "last nexthop", and future Interests do not suffer this delay.

Actions #3

Updated by Junxiao Shi over 8 years ago

  • Status changed from New to Rejected

The 20151124 conference call agreed to reject this issue.

Beichuan says AccessStrategy is a short-term solution, so it doesn't need a fix if nothing breaks.
We should focus on designing some more universally applicable strategy.
