Project

General

Profile

Feature #1207

CS policy interface

Added by Alex Afanasyev over 5 years ago. Updated about 4 years ago.

Status:
Closed
Priority:
Normal
Category:
Tables
Target version:
Start date:
02/26/2015
Due date:
% Done:

100%

Estimated time:
12.00 h

Description

Develop a policy interface for ContentStore.

  • The ContentStore itself indexes entries by Name and serves lookups.
  • The policy should be consulted for admission control and cleanup.
  • There can be multiple policies in the code repository, but only one active policy is chosen either in configuration file or at compile time.
  • Runtime change of policy (after inserting some entries) is disallowed.

This Feature also includes moving the current embedded "priority FIFO" cleanup feature into a policy, and make it as the default.

NFD Developer Guide should be updated to reflect this change.


Related issues

Blocked by NFD - Task #1212: CS structureClosed

Blocked by NFD - Task #2254: Rewrite ContentStore based on ContentStore stubClosed

Blocks NFD - Feature #2219: CS policy: LRUClosed

History

#1 Updated by Junxiao Shi over 5 years ago

  • Category set to Tables
  • Target version set to v0.1
  • Estimated time set to 6.00 h
  • Parent task set to #1211

#2 Updated by Junxiao Shi over 5 years ago

  • Description updated (diff)

#3 Updated by Junxiao Shi over 5 years ago

  • Target version deleted (v0.1)
  • Parent task deleted (#1211)

20140307 conference call decides that policy interface isn't needed at this time, but cache policy will be needed in "longer term" when cache is much larger.

#4 Updated by Alex Afanasyev almost 5 years ago

  • Target version set to v0.3

The design is not yet decided, but I'm assigning target version 0.3

#5 Updated by Alex Afanasyev almost 5 years ago

  • Related to Task #1706: Reduce implicit digest computation in ContentStore added

#6 Updated by Junxiao Shi almost 5 years ago

#7 Updated by Junxiao Shi almost 5 years ago

  • Related to deleted (Task #1706: Reduce implicit digest computation in ContentStore)

#8 Updated by Junxiao Shi almost 5 years ago

  • Blocked by Task #2254: Rewrite ContentStore based on ContentStore stub added

#9 Updated by Lan Wang over 4 years ago

  • Assignee set to Minsheng Zhang

#10 Updated by Minsheng Zhang over 4 years ago

  • Status changed from New to In Progress
  • Start date set to 02/26/2015

Seems to me that different data can have different cache policy in the CS. So every time there is a different data with different cache policy comes, it should check the corresponding queue for replacement? (If the isUnsolicited and isStale is empty)

#11 Updated by Davide Pesavento over 4 years ago

  • Target version changed from v0.3 to v0.4

Advancing target version since v0.3 is already out...

#12 Updated by Junxiao Shi over 4 years ago

I saw some code submissions in http://gerrit.named-data.net/1868. However, there isn't an approved design for this issue.

Please first submit a design document/slides for review. Upload this document as attachment on Redmine.

#13 Updated by Minsheng Zhang over 4 years ago

I've attached a class diagram and flow chart of all the operations to review. After that I will implement it.

#14 Updated by Junxiao Shi over 4 years ago

  • Tracker changed from Task to Feature
  • Estimated time changed from 6.00 h to 15.00 h

The current CS has a "priority FIFO" policy; it's not LRU policy.

I suggest separating the code changes into two commits:

  • CS policy interface: this commit shall add a policy abstraction, and implement a policy that have exactly same "priority FIFO" policy
  • LRU policy: this commit shall add a pure Least-Recently-Used policy, which doesn't care about whether Data is unsolicited and/or stale

Estimated Time was entered a year ago. A lot has changed since then, and this Feature is becoming much more complex, so I'm increasing the estimation.


In current CS, the following declarations are specific for the "priority FIFO" policy:

Queue
QueueIt
QueueType
EntryImpl::queueType
EntryImpl::queueIt
EntryImpl::moveStaleEvent
Cs::attachQueue
Cs::detachQueue
Cs::moveToStaleQueue

As of the first commit described above, these declarations should deleted.
They can reappear inside the policy type.

EntryImpl can have some generic fields that is to be used by the active policy (similar to StrategyInfoHost).

Or, if necessary, move "query Entry" concept and operator< to Entry base type, and let each policy declare its own EntryImpl type that contains necessary fields.

#15 Updated by Minsheng Zhang over 4 years ago

I agree with moving the "query Entry" and operator< to Entry base type and make the EntryImpl have some generic fields similar to StrategyInfoHost. If we move these two into the base type, why not move the canStale and unsetUnsolicited operations into Entry base type as these operations should be used for all the Entries?

#16 Updated by Alex Afanasyev over 4 years ago

  • not all cache items may need to be cached. The current assumption
    that policy is striclty "replacement" should be related. There are
    many useful use-cases that policy should be "placement" (i.e.,
    determine whether to cache item in the first place)

    For example, onInsert method should return true/false to indicate
    whether item actually inserted or not.

  • "evictPick" is very weird name to be used in policy. I prefer a more
    verbose variant that helps understanding what the method is
    doing. For example:

     pickItemForEviction
    
  • I missed before that Cs::evictPick() returns a tuple of
    std::tuple<TableIt, std::string>, where the second parameter is
    purely for log purposes. This is not acceptable, as there is
    creation of std::string() upon every single eviction event, even
    when debug logging is not enabled.

    Log statement needs to be moved to the policy. When list-based
    chaining policy is used (see below), a similar log can be displayed
    without the need of introducing extra overhead.

  • Cs::evictPick should be removed

  • Obscure TableIt should be renamed to Cs::iterator or something like that

  • policy should allow chaining.

    • inheritance-based chaining (good, but more complex and requires compile-time changes)
    • list-based chaining (also good and simplifies maintenance)
      • Each policy is strictly independent and manages its own queues.
      • Only one policy (at a time) holds on to the CS item

Here are some implementation ideas for list-based chaining.

...

// insert()
bool isPolicyInserted = false;
for (auto& policy : m_policies) {
  if (!isNewEntry) {
    if (policy->onUpdate(it))
      break;
  }
  else {
    isPolicyInserted = policy->onInsert(it);
    if (isPolicyInserted)
      break;
  }
}

if (!isPolicyInserted) {
  m_table.erase(it);
}

...

// evict()
while (this->size() > m_limit) {
  for (auto& policy : m_policies) {
    TableIt it = policy->evict(m_table.end()); // should call onErase internally
    if (it != m_table.end()) {
      NFD_LOG_DEBUG("evict " << it->getName() << " from " << policy->getName());
      m_table.erase();
    }
  }
}  

I'm assuming that we will implement:

  • UnsolicitedFifo
  • StaleFifo
  • Fifo (with internal pointer to StaleFifo to move items there after they become stale)
  • Lru (with internal pointer to StaleFifo to move items there after they become stale)

The problems that I can think of:

  • don't know how to move from UnsolicitedFifo to other queues
  • reimplementation needed, if we want to have UnsolicitedLru/StaleLru

#17 Updated by Junxiao Shi over 4 years ago

I think there's some misunderstanding in note-16.
There shouldn't be multiple active policies: it increases complexity significantly.

My vision is to have a policy interface that is functionally same as ndnSIM 1.0. Specifically:

  • There is only one active policy.
    Code repository can have multiple alternate policies for different network environments (eg. end host vs router), but which policy to use is a choice in configuration.
  • There are two indexes:
    • An index by Name is maintained in ContentStore, and is used for lookups.
    • Another index is maintained in the policy, and is used for cleanups.
  • ContentStore capacity is controlled by the policy. The policy can decide not to insert an entry, and can decide to evict entries.

#18 Updated by Alex Afanasyev over 4 years ago

What I tried to show examples in note-16 doesn't contradict the goals you mentioned. It is just the way policy is constructed can be different from ndnSIM 1.0: in 1.0 I used inheritance as a way to combine sub-policies and I feel that it is not too convenient. "List" of policies is just a way to construct a specific policy. This list can also be internal to a specific policy...

#19 Updated by Minsheng Zhang over 4 years ago

I think it's better that ContentStore capacity is controlled by the CS class not the Policy. For example, when we need to change the policy, we need to get the capacity and pass it to the Policy. So if we use CS to control the capacity, we don't need to pass this to the Policy...

#20 Updated by Junxiao Shi over 4 years ago

note-16 is just the way policy is constructed can be different from ndnSIM 1.0: in 1.0 I used inheritance as a way to combine sub-policies and I feel that it is not too convenient. "List" of policies is just a way to construct a specific policy. This list can also be internal to a specific policy...

Yes, there could be a 'ListPolicy' that combines multiple policies. But ContentStore should see only one policy.

'ListPolicy' is out of scope of this issue.

I think it's better that ContentStore capacity is controlled by the CS class not the Policy. For example, when we need to change the policy, we need to get the capacity and pass it to the Policy. So if we use CS to control the capacity, we don't need to pass this to the Policy...

No. Capacity is an attribute of the policy, because policy is the only component that should take care of cleanups.

#21 Updated by Junxiao Shi over 4 years ago

  • Subject changed from CS policy interface and LRU policy to CS policy interface
  • Description updated (diff)
  • Estimated time changed from 15.00 h to 12.00 h

I'm splitting LRU policy to #2219. This Feature should focus on the interface only.

#22 Updated by Junxiao Shi over 4 years ago

#23 Updated by Junxiao Shi over 4 years ago

#24 Updated by Junxiao Shi over 4 years ago

I looked at the cs::Policy API design in commit:c0595655da34b8ddd868cd2c4c58dbaf0218b955.

Design comments:

  • Method names "onX" are confusing. It's better to indicate either "beforeX" or "afterX".
  • public virtual methods should be avoided, before they have the dual role of public API and internal implementation. Instead, declare private virtual methods, and invoke them from public non-virtual methods.
  • setLimit is not general enough. The policy base class can have a "hard limit". Policy implementations can have other limits. For example, a policy may define a "soft limit", once exceeded, would start evicting packets from local producers but still accept packets from remote hosts.
  • onInsert shouldn't return boolean. Not accepting a new entry is simply expressed as immediately evicting this entry.
  • pickItemForEviction shouldn't be exposed. It's policy's responsibility to keep CS size under hard limit. CS size can go over limit after insertion or after changing limit, and the policy can inform CS to erase entries with a signal.

I suggest the following public API:

class Policy : noncopyable
{
public:
  /** \brief gets hard limit (in number of entries)
   */
  size_t
  getHardLimit() const;

  /** \brief sets hard limit (in number of entries)
   *  \post getHardLimit() == nMaxEntries
   *  \post cs.size() <= getHardLimit()
   *
   *  The policy may evict entries if necessary.
   */
  void
  setHardLimit(size_t nMaxEntries);

  /** \brief emits when an entry is being evicted
   *
   *  CS should erase the entry upon signal emission.
   */
  signal::signal<Policy, iterator> beforeEvict;

  /** \brief invoked by CS after an entry is inserted
   *  \post cs.size() <= getHardLimit()
   *
   *  The policy may evict entries if necessary.
   *  During this process, \p i might be evicted.
   */
  void
  afterInsert(iterator i);

  /** \brief invoked by CS before an entry is erased due to management command
   *  \warning CS must not invoke this method if an entry is erased due to eviction.
   */
  void
  beforeErase(iterator i);

  /** \brief invoked by CS before an entry is used to match a lookup
   *
   *  The policy may witness this usage to make better eviction decisions in the future.
   */
  void
  beforeUse(iterator i);
};

Since this is a complex Feature, I suggest the assignee to upload only headers (including protected and private parts) for review, and write implementation code after headers are reviewed.

#25 Updated by Alex Afanasyev over 4 years ago

I don't think I agree with two of the comments

  • setLimit is not general enough. The policy base class can have a "hard limit". Policy implementations can have other limits. For example, a policy may define a "soft limit", once exceeded, would start evicting packets from local producers but still accept packets from remote hosts.

This is an implementation detail. I do not think this detail should be defined in the policy interface.

  • onInsert shouldn't return boolean. Not accepting a new entry is simply expressed as immediately evicting this entry.

Why? If item is not inserted, it should indicate directly that it will not be inserted.

#26 Updated by Junxiao Shi over 4 years ago

  • setLimit is not general enough. The policy base class can have a "hard limit". Policy implementations can have other limits. For example, a policy may define a "soft limit", once exceeded, would start evicting packets from local producers but still accept packets from remote hosts.

This is an implementation detail. I do not think this detail should be defined in the policy interface.

I could accept keeping the name as setLimit, but the semantics is still "hard limit" and should be indicated in Doxygen description.

  • onInsert shouldn't return boolean. Not accepting a new entry is simply expressed as immediately evicting this entry.

Why? If item is not inserted, it should indicate directly that it will not be inserted.

afterInsert (formerly onInsert) takes iterator parameter type, which can be obtained only after an entry is created and inserted into the lookup index.

To undo this insertion is same as to evict this entry.
afterEvict signal would cause CS to erase this entry from the lookup index.
There would be additional complexity if (1) policy must avoid emitting afterEvict signal for this entry (2) CS must look at afterInsert return value and do the same thing as what it does for afterEvict.

#27 Updated by Alex Afanasyev over 4 years ago

I could accept keeping the name as setLimit, but the semantics is still "hard limit" and should be indicated in Doxygen description.

That is ok with me.

afterInsert (formerly onInsert) takes iterator parameter type, which can be obtained only after an entry is created and inserted into the lookup index.

To undo this insertion is same as to evict this entry.
afterEvict signal would cause CS to erase this entry from the lookup index.
There would be additional complexity if (1) policy must avoid emitting afterEvict signal for this entry (2) CS must look at afterInsert return value and do the same thing as what it does for afterEvict.

I'm still not 100% convinced. I understand that it is necessary to insert entry first into the underlying container before doing anything else. However, I do not clearly see why not creating signal "afterEvict" when entry wasn't actually inserted would complicate processing. In any case, I am not going to insist on this, just want to make a statement that this looks a little bit weird to me.

#28 Updated by Junxiao Shi over 4 years ago

I do not clearly see why not creating signal "afterEvict" when entry wasn't actually inserted would complicate processing.

When *after*Insert is invoked, the entry is already inserted.

There is no such thing that an "entry wasn't actually inserted".

Suppose the policy concludes that the inserted entry should be evicted immediately, its choices are:

DESIGN return value of afterInsert beforeEvict signal CS processing upon signal CS processing upon 'false'
A false emitted erase if not newest entry erase
B false emitted erase do nothing
C false not emitted erase erase
D void emitted erase not applicable
  • In DESIGN-A and DESIGN-C, either Policy or ContentStore must use a conditional to check whether the entry is newest entry.
  • DESIGN-B is basically same as DESIGN-D, because the 'false' return value is ignored.

DESIGN-D as proposed in note-24 is the simplest, because:

  • Unlike DESIGN-A or DESIGN-C, no conditional is needed in either Policy or ContentStore.
  • The processing of erasing the newest entry or any older entry is exactly the same on ContentStore side. Recall that this doesn't involve timers because all timers should be inside Policy.

#29 Updated by Junxiao Shi over 4 years ago

Current CS is able to distinguish whether a Data is unsolicited, and differs eviction decisions upon that.

The acceptance of unsolicited Data is to be dropped in #2181, so the Policy design does not take this into consideration.

In order to seamlessly port existing "priority FIFO" policy to be used before #2181 closes, but also avoid polluting Policy API with unsolicited parameter, the suggestion is:

  • Policy API should not have unsolicited parameter.
  • CS API remained unchanged in #1207.
  • CS entry can have fields to indicate whether Data is unsolicited. The "priority FIFO" policy may inspect those fields.

#30 Updated by Lan Wang over 4 years ago

#2181 is about unsolicited data from local apps. What about unsolicited data received from other nodes? As far as I know, the vehicular networking applications rely heavily on nodes caching unsolicited data.

#31 Updated by Davide Pesavento over 4 years ago

Lan Wang wrote:

As far as I know, the vehicular networking applications rely heavily on nodes caching unsolicited data.

Correct. We don't want to lose the ability of (optionally) caching unsolicited data coming from other nodes.

#32 Updated by Junxiao Shi over 4 years ago

#2181 is about unsolicited data from local apps. What about unsolicited data received from other nodes? As far as I know, the vehicular networking applications rely heavily on nodes caching unsolicited data.

This makes sense. I've noted this in #2181 note-8.

But still, policy API doesn't need an unsolicited parameter. CS entry has an unsolicited field and policy implementations can peek it.

#33 Updated by Minsheng Zhang over 4 years ago

Because when setLimit() or afterInsert(iterator i) called, it needs to make sure that cs.size() <= getLimit(). Somehow policy needs to know size of CS.

I think there are two options to have the size of CS.

  1. pass the cs reference to the policy(in the constructor Policy(const Name& name, const Cs& cs)
    or has a function setCs(const Cs& cs))

  2. policy has a function virtual size_t size() = 0 to return the entries it has got.

So which one do you think is better?

#34 Updated by Junxiao Shi over 4 years ago

Not every policy needs to index ContentStore entries. For example, the "random eviction" policy doesn't need its own index.

Therefore, .setCs(const Cs* cs) is a good solution.

Having a setter (rather than a constructor parameter) makes it easier to create policy instances, but policy code that needs to access CS would need BOOST_ASSERT(m_cs != nullptr).

#35 Updated by Junxiao Shi over 4 years ago

  • Description updated (diff)

When updating NFD Developer Guide, be sure to describe the policy interface in detail. Reference: strategy API section.

#36 Updated by Minsheng Zhang about 4 years ago

  • % Done changed from 0 to 90

Update nfs-docs for cache policy api description.

#37 Updated by Junxiao Shi about 4 years ago

  • Status changed from In Progress to Closed
  • % Done changed from 90 to 100

I confirm devguide has been updated. I will not review the text.

Also available in: Atom PDF