Bug #2403
AccessStrategy: after producer failure, Interests still go to last nexthop
Description
Test case to reproduce:
BOOST_FIXTURE_TEST_CASE(Overhead, TwoLaptopsFixture)
{
  /*
   *         /------------------\                                  /------------------\
   *         | intervalConsumer |                                  | intervalConsumer |
   *         \------------------/                A                 \------------------/
   *                  ^ v                        f                          ^ v
   *                  | v /laptops/M             t                          | v /laptops/M
   *                  |                          e                          |
   *                  v                          r                          v
   *  /laptops << +--------+ >> /laptops                    /laptops << +--------+ >> /laptops
   *       +----->| router |<------+             1               +----->| router |<------+
   *       |      +--------+       |                             |      +--------+       |
   *  10ms |                       | 20ms    === s ==>      10ms |                       | 20ms
   *       v                       v             e               v                       v
   *  +---------+             +---------+        c          +---------+             +---------+
   *  | laptopA |             | laptopB |        o          | laptopA |             | laptopB |
   *  +---------+             +---------+        n          +---------+             +---------+
   *       ^  v                                  d
   *       |  v /laptops/M
   *       v
   *  /--------------\
   *  | echoProducer |
   *  \--------------/
   */
  // both laptops have the prefix in the router FIB; only laptopA runs a producer
  topo.registerPrefix(router, linkA->getFace(router), "ndn:/laptops");
  topo.registerPrefix(router, linkB->getFace(router), "ndn:/laptops");

  shared_ptr<TopologyAppLink> producerA = topo.addAppFace(laptopA, "ndn:/laptops/A");
  topo.addEchoProducer(*producerA->getClientFace());

  // to build up "correct" RTO value
  shared_ptr<TopologyAppLink> consumer = topo.addAppFace(router);
  topo.addIntervalConsumer(*consumer->getClientFace(), "ndn:/laptops/A",
                           time::milliseconds(10), 100);
  this->advanceClocks(time::milliseconds(1), time::seconds(1));

  auto me = topo.getForwarder(router).getMeasurements().findLongestPrefixMatch("ndn:/laptops/A");
  BOOST_REQUIRE(me != nullptr);
  auto mi = me->getStrategyInfo<fw::AccessStrategy::MtInfo>();
  BOOST_REQUIRE(mi != nullptr);
  auto rto = mi->rtt.computeRto();
  BOOST_CHECK(time::milliseconds(10) < rto && rto < time::milliseconds(30));

  BOOST_CHECK_EQUAL(linkA->getFace(router)->m_sentInterests.size(), 100);
  BOOST_CHECK_EQUAL(linkA->getFace(laptopA)->m_sentDatas.size(), 100);
  BOOST_CHECK_EQUAL(linkB->getFace(router)->m_sentInterests.size(), 3);
  linkA->getFace(router)->m_sentInterests.clear();
  linkA->getFace(laptopA)->m_sentDatas.clear();
  linkB->getFace(router)->m_sentInterests.clear();

  // actual test
  producerA->fail();
  topo.addIntervalConsumer(*consumer->getClientFace(), "ndn:/laptops/A",
                           time::milliseconds(31), 10);

  // unicast to laptopA
  this->advanceClocks(time::milliseconds(1), time::milliseconds(10));
  BOOST_CHECK_EQUAL(linkA->getFace(router)->m_sentInterests.size(), 1);
  BOOST_CHECK_EQUAL(linkB->getFace(router)->m_sentInterests.size(), 0);
  // multicast
  this->advanceClocks(time::milliseconds(1), time::milliseconds(20));
  BOOST_CHECK_EQUAL(linkA->getFace(router)->m_sentInterests.size(), 1);
  BOOST_CHECK_EQUAL(linkB->getFace(router)->m_sentInterests.size(), 1);

  // unicast to laptopA
  this->advanceClocks(time::milliseconds(1), time::milliseconds(10));
  BOOST_CHECK_EQUAL(linkA->getFace(router)->m_sentInterests.size(), 2);
  BOOST_CHECK_EQUAL(linkB->getFace(router)->m_sentInterests.size(), 1);
  // multicast
  this->advanceClocks(time::milliseconds(1), time::milliseconds(20));
  BOOST_CHECK_EQUAL(linkA->getFace(router)->m_sentInterests.size(), 2);
  BOOST_CHECK_EQUAL(linkB->getFace(router)->m_sentInterests.size(), 2);

  // unicast to laptopA
  this->advanceClocks(time::milliseconds(1), time::milliseconds(10));
  BOOST_CHECK_EQUAL(linkA->getFace(router)->m_sentInterests.size(), 3);
  BOOST_CHECK_EQUAL(linkB->getFace(router)->m_sentInterests.size(), 2);
  // multicast
  this->advanceClocks(time::milliseconds(1), time::milliseconds(20));
  BOOST_CHECK_EQUAL(linkA->getFace(router)->m_sentInterests.size(), 3);
  BOOST_CHECK_EQUAL(linkB->getFace(router)->m_sentInterests.size(), 3);

  this->advanceClocks(time::milliseconds(5), time::seconds(1));
  BOOST_CHECK_EQUAL(linkA->getFace(router)->m_sentInterests.size(), 10);
  BOOST_CHECK_EQUAL(linkB->getFace(router)->m_sentInterests.size(), 10);

  this->advanceClocks(time::milliseconds(5), time::seconds(5));
  // all consumers should be done more than 4 seconds ago
  me = topo.getForwarder(router).getMeasurements().findLongestPrefixMatch("ndn:/laptops/A");
  BOOST_CHECK(me == nullptr);
}
Actual (as shown in the test assertions): after producer failure, each Interest is first unicast to where the producer was, and then multicast after an RTO. This continues until the MeasurementsEntry expires.
Expected: after producer failure, Interests are multicast right away.
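The timing behind the "actual" assertions can be sketched with a toy model. This is not NFD code: `FaceCounts` and `countSent` are hypothetical names, and the model assumes an RTO of exactly 20 ms (the test only bounds it between 10 ms and 30 ms), a first Interest sent at time 0, and no Data ever returning after the producer fails.

```cpp
#include <cassert>

// Toy model of the "actual" behavior; all times are in milliseconds.
struct FaceCounts
{
  int linkA; // Interests sent toward laptopA (the last working nexthop)
  int linkB; // Interests sent toward laptopB (reached only by multicast)
};

// Counts Interests sent up to and including time `now`, assuming Interest i
// is issued at i * interval, unicast to laptopA immediately, and multicast
// to laptopB one RTO later because no Data comes back.
inline FaceCounts
countSent(int now, int interval, int rto, int nInterests)
{
  FaceCounts counts{0, 0};
  for (int i = 0; i < nInterests; ++i) {
    int unicastAt = i * interval;
    int multicastAt = unicastAt + rto;
    if (unicastAt <= now) {
      ++counts.linkA;
    }
    if (multicastAt <= now) {
      ++counts.linkB;
    }
  }
  return counts;
}
```

With `interval = 31` and `rto = 20`, evaluating `countSent` at 10, 30, 40, 60, 70 and 90 ms gives the same per-link counts as the checkpoints in the test case: every Interest costs one unicast first, and the multicast only follows an RTO later.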
Updated by Junxiao Shi almost 10 years ago
- Related to Feature #1999: Strategy for access router added
Updated by Junxiao Shi almost 10 years ago
- Description updated
This design problem was raised by Alex in commit 2c1243ada4b8199e9c3357c8b71836b3e82a749f.
I disagree with this design change.
When a producer does not return Data within the RTO, one of the following cases applies:
1. There's no Data matching the Interest, and the producer does not support producer-generated NACK.
2. Producer fails temporarily (e.g. overloaded), and no alternate producer exists.
3. Producer fails permanently, and no alternate producer exists.
4. Producer has moved.
In case 1, suppose the consumer retransmits:
("unicast count" and "multicast count" include the initial Interest; "N" is the number of nexthops in the FIB entry; "delay" is the extra time for the second Interest to reach a producer that can answer it)
| | unicast count | multicast count | delay |
|---|---|---|---|
| current design | 2 | 2N-2 | N/A |
| proposed change | 1 | 2N-1 | N/A |
In case 1, suppose the consumer sends another Interest that has matching Data at the producer:
| | unicast count | multicast count | delay |
|---|---|---|---|
| current design | 2 | N-1 | none |
| proposed change | 1 | 2N-1 | none |
In case 2, suppose the consumer retransmits, or sends another Interest:
| | unicast count | multicast count | delay |
|---|---|---|---|
| current design | 2 | N-1 | none |
| proposed change | 1 | 2N-1 | none |
In case 3, suppose the consumer retransmits, or sends another Interest:
| | unicast count | multicast count | delay |
|---|---|---|---|
| current design | 2 | 2N-2 | N/A |
| proposed change | 1 | 2N-1 | N/A |
In case 4, suppose the consumer retransmits, or sends another Interest:
| | unicast count | multicast count | delay |
|---|---|---|---|
| current design | 2 | 2N-2 | RTO |
| proposed change | 1 | 2N-1 | none |
The current design has advantages in most scenarios: its total outgoing Interest count (unicast plus multicast) is less than or equal to that of the proposed change.
The current design has a disadvantage in case 4: it incurs one RTO delay; however, once the new producer is found, its nexthop becomes the "last nexthop" and future Interests won't suffer this delay.
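The totals behind this comparison can be checked with a short sketch. It encodes one reading of the tables, which the issue does not spell out: in the current design each of the two Interests is unicast once, and each Interest that gets no answer is then multicast to the other N-1 nexthops; in the proposed change the first Interest is handled the same way, and the second is multicast to all N nexthops right away. The names `InterestCount`, `currentDesign`, `proposedChange` and `total` are hypothetical.

```cpp
#include <cassert>

// Outgoing Interests over two consumer Interests (initial one included).
struct InterestCount
{
  int unicast;
  int multicast;
};

// Current design: both Interests are unicast to the last nexthop; every
// unanswered one is then multicast to the other n - 1 nexthops.
// `secondAnswered` is true when the last nexthop answers the second Interest.
inline InterestCount
currentDesign(int n, bool secondAnswered)
{
  return {2, secondAnswered ? n - 1 : 2 * (n - 1)};
}

// Proposed change: the first Interest is unicast, then multicast to n - 1
// nexthops; the second is multicast to all n nexthops right away.
inline InterestCount
proposedChange(int n)
{
  return {1, (n - 1) + n};
}

inline int
total(InterestCount c)
{
  return c.unicast + c.multicast;
}
```

Under these assumptions the current design sends 2N Interests in total when the second Interest goes unanswered and N+1 when it is answered, against 2N for the proposed change in every case, so the current design never sends more.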
Updated by Junxiao Shi almost 9 years ago
- Status changed from New to Rejected
The 20151124 conference call agreed to reject this issue.
Beichuan says AccessStrategy is a short-term solution, so it doesn't need a fix if nothing breaks.
We should focus on designing a more universally applicable strategy.