Feature #1207
closedCS policy interface
Added by Alex Afanasyev almost 11 years ago. Updated over 9 years ago.
100%
Description
Develop a policy interface for ContentStore.
- The ContentStore itself indexes entries by Name and serves lookups.
- The policy should be consulted for admission control and cleanup.
- There can be multiple policies in the code repository, but only one active policy is chosen either in configuration file or at compile time.
- Runtime change of policy (after inserting some entries) is disallowed.
This Feature also includes moving the current embedded "priority FIFO" cleanup feature into a policy, and make it as the default.
NFD Developer Guide should be updated to reflect this change.
Files
Content Store Cache Policy design.pdf (75.6 KB) Content Store Cache Policy design.pdf | Minsheng Zhang, 03/19/2015 02:41 PM |
Updated by Junxiao Shi almost 11 years ago
- Category set to Tables
- Target version set to v0.1
- Estimated time set to 6.00 h
- Parent task set to #1211
Updated by Junxiao Shi over 10 years ago
- Target version deleted (
v0.1) - Parent task deleted (
#1211)
20140307 conference call decides that policy interface isn't needed at this time, but cache policy will be needed in "longer term" when cache is much larger.
Updated by Alex Afanasyev about 10 years ago
- Target version set to v0.3
The design is not yet decided, but I'm assigning target version 0.3
Updated by Alex Afanasyev about 10 years ago
- Related to Task #1706: Reduce implicit digest computation in ContentStore added
Updated by Junxiao Shi almost 10 years ago
- Has duplicate Feature #2219: CS policy: LRU added
Updated by Junxiao Shi almost 10 years ago
- Related to deleted (Task #1706: Reduce implicit digest computation in ContentStore)
Updated by Junxiao Shi almost 10 years ago
- Blocked by Task #2254: Rewrite ContentStore based on ContentStore stub added
Updated by Minsheng Zhang over 9 years ago
- Status changed from New to In Progress
- Start date set to 02/26/2015
Seems to me that different data can have different cache policy in the CS. So every time there is a different data with different cache policy comes, it should check the corresponding queue for replacement? (If the isUnsolicited and isStale is empty)
Updated by Davide Pesavento over 9 years ago
- Target version changed from v0.3 to v0.4
Advancing target version since v0.3 is already out...
Updated by Junxiao Shi over 9 years ago
I saw some code submissions in http://gerrit.named-data.net/1868. However, there isn't an approved design for this issue.
Please first submit a design document/slides for review. Upload this document as attachment on Redmine.
Updated by Minsheng Zhang over 9 years ago
I've attached a class diagram and flow chart of all the operations to review. After that I will implement it.
Updated by Junxiao Shi over 9 years ago
- Tracker changed from Task to Feature
- Estimated time changed from 6.00 h to 15.00 h
The current CS has a "priority FIFO" policy; it's not LRU policy.
I suggest separating the code changes into two commits:
- CS policy interface: this commit shall add a policy abstraction, and implement a policy that have exactly same "priority FIFO" policy
- LRU policy: this commit shall add a pure Least-Recently-Used policy, which doesn't care about whether Data is unsolicited and/or stale
Estimated Time was entered a year ago. A lot has changed since then, and this Feature is becoming much more complex, so I'm increasing the estimation.
In current CS, the following declarations are specific for the "priority FIFO" policy:
Queue
QueueIt
QueueType
EntryImpl::queueType
EntryImpl::queueIt
EntryImpl::moveStaleEvent
Cs::attachQueue
Cs::detachQueue
Cs::moveToStaleQueue
As of the first commit described above, these declarations should deleted.
They can reappear inside the policy type.
EntryImpl
can have some generic fields that is to be used by the active policy (similar to StrategyInfoHost
).
Or, if necessary, move "query Entry" concept and operator<
to Entry
base type, and let each policy declare its own EntryImpl
type that contains necessary fields.
Updated by Minsheng Zhang over 9 years ago
I agree with moving the "query Entry" and operator< to Entry base type and make the EntryImpl have some generic fields similar to StrategyInfoHost. If we move these two into the base type, why not move the canStale and unsetUnsolicited operations into Entry base type as these operations should be used for all the Entries?
Updated by Alex Afanasyev over 9 years ago
not all cache items may need to be cached. The current assumption
that policy is striclty "replacement" should be related. There are
many useful use-cases that policy should be "placement" (i.e.,
determine whether to cache item in the first place)For example, onInsert method should return true/false to indicate
whether item actually inserted or not."evictPick" is very weird name to be used in policy. I prefer a more
verbose variant that helps understanding what the method is
doing. For example:pickItemForEviction
I missed before that Cs::evictPick() returns a tuple of
std::tuple<TableIt, std::string>
, where the second parameter is
purely for log purposes. This is not acceptable, as there is
creation of std::string() upon every single eviction event, even
when debug logging is not enabled.Log statement needs to be moved to the policy. When list-based
chaining policy is used (see below), a similar log can be displayed
without the need of introducing extra overhead.Cs::evictPick should be removed
Obscure TableIt should be renamed to Cs::iterator or something like that
policy should allow chaining.
- inheritance-based chaining (good, but more complex and requires compile-time changes)
- list-based chaining (also good and simplifies maintenance)
- Each policy is strictly independent and manages its own queues.
- Only one policy (at a time) holds on to the CS item
Here are some implementation ideas for list-based chaining.
...
// insert()
bool isPolicyInserted = false;
for (auto& policy : m_policies) {
if (!isNewEntry) {
if (policy->onUpdate(it))
break;
}
else {
isPolicyInserted = policy->onInsert(it);
if (isPolicyInserted)
break;
}
}
if (!isPolicyInserted) {
m_table.erase(it);
}
...
// evict()
while (this->size() > m_limit) {
for (auto& policy : m_policies) {
TableIt it = policy->evict(m_table.end()); // should call onErase internally
if (it != m_table.end()) {
NFD_LOG_DEBUG("evict " << it->getName() << " from " << policy->getName());
m_table.erase();
}
}
}
I'm assuming that we will implement:
- UnsolicitedFifo
- StaleFifo
- Fifo (with internal pointer to StaleFifo to move items there after they become stale)
- Lru (with internal pointer to StaleFifo to move items there after they become stale)
The problems that I can think of:
- don't know how to move from UnsolicitedFifo to other queues
- reimplementation needed, if we want to have UnsolicitedLru/StaleLru
Updated by Junxiao Shi over 9 years ago
I think there's some misunderstanding in note-16.
There shouldn't be multiple active policies: it increases complexity significantly.
My vision is to have a policy interface that is functionally same as ndnSIM 1.0. Specifically:
- There is only one active policy.
Code repository can have multiple alternate policies for different network environments (eg. end host vs router), but which policy to use is a choice in configuration. - There are two indexes:
- An index by Name is maintained in ContentStore, and is used for lookups.
- Another index is maintained in the policy, and is used for cleanups.
- ContentStore capacity is controlled by the policy. The policy can decide not to insert an entry, and can decide to evict entries.
Updated by Alex Afanasyev over 9 years ago
What I tried to show examples in note-16 doesn't contradict the goals you mentioned. It is just the way policy is constructed can be different from ndnSIM 1.0: in 1.0 I used inheritance as a way to combine sub-policies and I feel that it is not too convenient. "List" of policies is just a way to construct a specific policy. This list can also be internal to a specific policy...
Updated by Minsheng Zhang over 9 years ago
I think it's better that ContentStore capacity is controlled by the CS class not the Policy. For example, when we need to change the policy, we need to get the capacity and pass it to the Policy. So if we use CS to control the capacity, we don't need to pass this to the Policy...
Updated by Junxiao Shi over 9 years ago
note-16 is just the way policy is constructed can be different from ndnSIM 1.0: in 1.0 I used inheritance as a way to combine sub-policies and I feel that it is not too convenient. "List" of policies is just a way to construct a specific policy. This list can also be internal to a specific policy...
Yes, there could be a 'ListPolicy' that combines multiple policies. But ContentStore should see only one policy.
'ListPolicy' is out of scope of this issue.
I think it's better that ContentStore capacity is controlled by the CS class not the Policy. For example, when we need to change the policy, we need to get the capacity and pass it to the Policy. So if we use CS to control the capacity, we don't need to pass this to the Policy...
No. Capacity is an attribute of the policy, because policy is the only component that should take care of cleanups.
Updated by Junxiao Shi over 9 years ago
- Subject changed from CS policy interface and LRU policy to CS policy interface
- Description updated (diff)
- Estimated time changed from 15.00 h to 12.00 h
I'm splitting LRU policy to #2219. This Feature should focus on the interface only.
Updated by Junxiao Shi over 9 years ago
- Has duplicate deleted (Feature #2219: CS policy: LRU)
Updated by Junxiao Shi over 9 years ago
- Blocks Feature #2219: CS policy: LRU added
Updated by Junxiao Shi over 9 years ago
I looked at the cs::Policy
API design in commit:c0595655da34b8ddd868cd2c4c58dbaf0218b955.
Design comments:
- Method names "onX" are confusing. It's better to indicate either "beforeX" or "afterX".
- public virtual methods should be avoided, before they have the dual role of public API and internal implementation. Instead, declare private virtual methods, and invoke them from public non-virtual methods.
setLimit
is not general enough. The policy base class can have a "hard limit". Policy implementations can have other limits. For example, a policy may define a "soft limit", once exceeded, would start evicting packets from local producers but still accept packets from remote hosts.onInsert
shouldn't return boolean. Not accepting a new entry is simply expressed as immediately evicting this entry.pickItemForEviction
shouldn't be exposed. It's policy's responsibility to keep CS size under hard limit. CS size can go over limit after insertion or after changing limit, and the policy can inform CS to erase entries with a signal.
I suggest the following public API:
class Policy : noncopyable
{
public:
/** \brief gets hard limit (in number of entries)
*/
size_t
getHardLimit() const;
/** \brief sets hard limit (in number of entries)
* \post getHardLimit() == nMaxEntries
* \post cs.size() <= getHardLimit()
*
* The policy may evict entries if necessary.
*/
void
setHardLimit(size_t nMaxEntries);
/** \brief emits when an entry is being evicted
*
* CS should erase the entry upon signal emission.
*/
signal::signal<Policy, iterator> beforeEvict;
/** \brief invoked by CS after an entry is inserted
* \post cs.size() <= getHardLimit()
*
* The policy may evict entries if necessary.
* During this process, \p i might be evicted.
*/
void
afterInsert(iterator i);
/** \brief invoked by CS before an entry is erased due to management command
* \warning CS must not invoke this method if an entry is erased due to eviction.
*/
void
beforeErase(iterator i);
/** \brief invoked by CS before an entry is used to match a lookup
*
* The policy may witness this usage to make better eviction decisions in the future.
*/
void
beforeUse(iterator i);
};
Since this is a complex Feature, I suggest the assignee to upload only headers (including protected and private parts) for review, and write implementation code after headers are reviewed.
Updated by Alex Afanasyev over 9 years ago
I don't think I agree with two of the comments
setLimit
is not general enough. The policy base class can have a "hard limit". Policy implementations can have other limits. For example, a policy may define a "soft limit", once exceeded, would start evicting packets from local producers but still accept packets from remote hosts.
This is an implementation detail. I do not think this detail should be defined in the policy interface.
onInsert
shouldn't return boolean. Not accepting a new entry is simply expressed as immediately evicting this entry.
Why? If item is not inserted, it should indicate directly that it will not be inserted.
Updated by Junxiao Shi over 9 years ago
setLimit
is not general enough. The policy base class can have a "hard limit". Policy implementations can have other limits. For example, a policy may define a "soft limit", once exceeded, would start evicting packets from local producers but still accept packets from remote hosts.This is an implementation detail. I do not think this detail should be defined in the policy interface.
I could accept keeping the name as setLimit
, but the semantics is still "hard limit" and should be indicated in Doxygen description.
onInsert
shouldn't return boolean. Not accepting a new entry is simply expressed as immediately evicting this entry.Why? If item is not inserted, it should indicate directly that it will not be inserted.
afterInsert
(formerly onInsert
) takes iterator
parameter type, which can be obtained only after an entry is created and inserted into the lookup index.
To undo this insertion is same as to evict this entry.
afterEvict
signal would cause CS to erase this entry from the lookup index.
There would be additional complexity if (1) policy must avoid emitting afterEvict
signal for this entry (2) CS must look at afterInsert
return value and do the same thing as what it does for afterEvict
.
Updated by Alex Afanasyev over 9 years ago
I could accept keeping the name as
setLimit
, but the semantics is still "hard limit" and should be indicated in Doxygen description.
That is ok with me.
afterInsert
(formerlyonInsert
) takesiterator
parameter type, which can be obtained only after an entry is created and inserted into the lookup index.
To undo this insertion is same as to evict this entry.
afterEvict
signal would cause CS to erase this entry from the lookup index.
There would be additional complexity if (1) policy must avoid emittingafterEvict
signal for this entry (2) CS must look atafterInsert
return value and do the same thing as what it does forafterEvict
.
I'm still not 100% convinced. I understand that it is necessary to insert entry first into the underlying container before doing anything else. However, I do not clearly see why not creating signal "afterEvict" when entry wasn't actually inserted would complicate processing. In any case, I am not going to insist on this, just want to make a statement that this looks a little bit weird to me.
Updated by Junxiao Shi over 9 years ago
I do not clearly see why not creating signal "afterEvict" when entry wasn't actually inserted would complicate processing.
When *after*Insert is invoked, the entry is already inserted.
There is no such thing that an "entry wasn't actually inserted".
Suppose the policy concludes that the inserted entry should be evicted immediately, its choices are:
DESIGN | return value of afterInsert |
beforeEvict signal |
CS processing upon signal | CS processing upon 'false' |
---|---|---|---|---|
A | false | emitted | erase if not newest entry | erase |
B | false | emitted | erase | do nothing |
C | false | not emitted | erase | erase |
D | void | emitted | erase | not applicable |
- In DESIGN-A and DESIGN-C, either Policy or ContentStore must use a conditional to check whether the entry is newest entry.
- DESIGN-B is basically same as DESIGN-D, because the 'false' return value is ignored.
DESIGN-D as proposed in note-24 is the simplest, because:
- Unlike DESIGN-A or DESIGN-C, no conditional is needed in either Policy or ContentStore.
- The processing of erasing the newest entry or any older entry is exactly the same on ContentStore side. Recall that this doesn't involve timers because all timers should be inside Policy.
Updated by Junxiao Shi over 9 years ago
Current CS is able to distinguish whether a Data is unsolicited, and differs eviction decisions upon that.
The acceptance of unsolicited Data is to be dropped in #2181, so the Policy design does not take this into consideration.
In order to seamlessly port existing "priority FIFO" policy to be used before #2181 closes, but also avoid polluting Policy API with unsolicited parameter, the suggestion is:
- Policy API should not have unsolicited parameter.
- CS API remained unchanged in #1207.
- CS entry can have fields to indicate whether Data is unsolicited. The "priority FIFO" policy may inspect those fields.
Updated by Lan Wang over 9 years ago
#2181 is about unsolicited data from local apps. What about unsolicited data received from other nodes? As far as I know, the vehicular networking applications rely heavily on nodes caching unsolicited data.
Updated by Davide Pesavento over 9 years ago
Lan Wang wrote:
As far as I know, the vehicular networking applications rely heavily on nodes caching unsolicited data.
Correct. We don't want to lose the ability of (optionally) caching unsolicited data coming from other nodes.
Updated by Junxiao Shi over 9 years ago
#2181 is about unsolicited data from local apps. What about unsolicited data received from other nodes? As far as I know, the vehicular networking applications rely heavily on nodes caching unsolicited data.
This makes sense. I've noted this in #2181 note-8.
But still, policy API doesn't need an unsolicited parameter. CS entry has an unsolicited field and policy implementations can peek it.
Updated by Minsheng Zhang over 9 years ago
Because when setLimit() or afterInsert(iterator i) called, it needs to make sure that cs.size() <= getLimit(). Somehow policy needs to know size of CS.
I think there are two options to have the size of CS.
pass the cs reference to the policy(in the constructor Policy(const Name& name, const Cs& cs)
or has a function setCs(const Cs& cs))policy has a function virtual size_t size() = 0 to return the entries it has got.
So which one do you think is better?
Updated by Junxiao Shi over 9 years ago
Not every policy needs to index ContentStore entries. For example, the "random eviction" policy doesn't need its own index.
Therefore, .setCs(const Cs* cs)
is a good solution.
Having a setter (rather than a constructor parameter) makes it easier to create policy instances, but policy code that needs to access CS would need BOOST_ASSERT(m_cs != nullptr)
.
Updated by Junxiao Shi over 9 years ago
- Description updated (diff)
When updating NFD Developer Guide, be sure to describe the policy interface in detail. Reference: strategy API section.
Updated by Minsheng Zhang over 9 years ago
- % Done changed from 0 to 90
Update nfs-docs for cache policy api description.
Updated by Junxiao Shi over 9 years ago
- Status changed from In Progress to Closed
- % Done changed from 90 to 100
I confirm devguide has been updated. I will not review the text.