Feature #4050
closedContent Store Manager
100%
Description
Design Content Store management protocol CsMgmt and implement the corresponding manager for it.
In this issue, the protocol shall support retrieving and updating CS config, including:
- Get the maximum number of entries
- Set the maximum number of entries (entries in excess are evicted)
- Enable/disable the CS
- This could be split into two orthogonal flags: (a) whether to cache incoming Data or not; and (b) whether to serve Data from the cache or not
Updated by Junxiao Shi over 7 years ago
What's the application scenario, for each proposed feature?
Updated by Davide Pesavento over 7 years ago
Monitoring/measurement/experimentation.
Updated by Qi Zhao over 7 years ago
I am also interested in this feature. I think the controller should have this feature and it can help with experimentation as well. NDN-CC should also provide this feature as well.
Updated by Junxiao Shi over 7 years ago
My main concern is on the scalability of the enumeration operation as part of the Management protocol.
On the NDN testbed, each router has 65536-entry CS. Suppose each record is 50 octets, and each Data packet in the dataset carries 8000 octets of payload, a full enumerate would produce 409 Data packets. Creating and signing 409 Data packets takes more than 1 second (according to #1589-1 benchmark on v1 KeyChain), which means NFD cannot process any new packets for more than 1 second, which is unacceptable.
Management thread (#3565) can eliminate the signing overhead, but it still requires the CS to be locked for some time in order to copy the packet names.
Monitoring/measurement/experimentation.
These purposes can be achieved with ndnSIM, where the scenario code has direct access to the nfd::Cs
object. We just need to add methods to flush and enable/disable the CS.
Updated by Davide Pesavento over 7 years ago
Junxiao Shi wrote:
My main concern is on the scalability of the enumeration operation as part of the Management protocol.
Yes, that is my concern as well. However:
- The management protocol should not be constrained by current implementation limitations or inefficiencies, a future version of NFD may remove these limitations or a different forwarder may not have them in the first place.
- Enumeration can be a privileged operation.
- I'm not sure we actually need full CS enumeration. In fact, it would be enough as a first step to have a "query" mgmt interface that given a name + selectors returns any matching entry, basically wrapping
Cs::find()
. - Additionally, in the case of filtered enumeration, the CS manager can return only the first
N
matching entries, whereN
is a configuration parameter.
Monitoring/measurement/experimentation.
These purposes can be achieved with ndnSIM
I'm not interested in simulations.
Updated by Junxiao Shi over 7 years ago
it would be enough as a first step to have a "query" mgmt interface that given a name + selectors returns any matching entry, basically wrapping
Cs::find()
.
You already have it (untested snippet):
auto interest = make_shared<Interest>("/A");
interest->setTag(make_shared<lp::NextHopFaceId>(254));
face.expressInterest(*interest, nullptr, nullptr, nullptr);
This Interest can only be forwarded to the special "content store" face, which means it will retrieve any Data present in the CS, but will be Nacked (with standards-conforming forwarding plane implementation) if the Data is missing.
Updated by Davide Pesavento over 7 years ago
Junxiao Shi wrote:
interest->setTag(make_shared<lp::NextHopFaceId>(254));
My understanding is that: (1) there's no guarantee that NextHopFaceId
will be respected, and (2) everything concerning FaceId 254 is an internal implementation detail and must not be relied upon.
Updated by Davide Pesavento over 7 years ago
Another question is whether we need a new "cs" privilege for this manager or not.
Also, can we add the "max number of entries" to the GeneralStatus
dataset, or do we need to create a separate CsStatus
dataset for just one number?
Updated by Davide Pesavento over 7 years ago
- Status changed from New to In Progress
Updated by Davide Pesavento over 7 years ago
- Assignee changed from Davide Pesavento to Pauline Chan
Updated by Junxiao Shi over 7 years ago
can we add the "max number of entries" to the
GeneralStatus
dataset, or do we need to create a separateCsStatus
dataset for just one number?
This can belong to status/general
, if there's no plan for more fields.
Updated by Davide Pesavento over 7 years ago
The only 2 things I can think of right now are:
- Obtain the current state of the "cache" and "serve" flags (the "get" counterpart of the "set" described above).
- Get statistics on the CS, e.g. hit/miss count. However, I would expect monitoring software to poll these counters much more frequently than other relatively static values, so keeping them separate might not be a bad idea.
Finally, status/general
already provides NCsEntries
, so it seems natural to add a MaxCsEntries
there.
Updated by Pauline Chan over 7 years ago
Draft of the specification:
https://gist.github.com/pchan1/29d603fb490ac3fc706ce3caba6eafeb
Suggestions are welcomed
Updated by Junxiao Shi over 7 years ago
I've reviewed design as of revision 797a7998bac6dc65ddeb9eb1c4bde9bd7c76f372.
I do not understand this sentence:
This command does not have a hard limit and does not account for limits that exceed memory space
What's a use case for a CS with "cache" enabled but "serve" disabled?
What's a use case for a CS with "serve" enabled but "cache" disabled?
Why are
set-limit
andset-flags
commands needed?
Similar functionality is already supported as (1) modifynfd.conf
(2) send SIGHUP to trigger config reload.
What's a use case that cannot be supported through existing config reload mechanism?
Hint: a valid use case should require runtime changes of each setting, and the justification needs to state why modifyingnfd.conf
is impractical.Assuming there is a compelling use case that requires commands, why not defining a single command that controls all attributes of the CS including capacity limit, serve/cache flags, and eviction policy?
Should
flush
command support flushing only Data packets under a certain prefix?
If so,erase
is a better name.list
dataset could be huge.
Suppose the CS is at default capacity of 65536 packets and is completely full, and the average entry length is 32 octets, the dataset would have 263 segments each with 8000 octets of payload.
Generating and signing so many Data packets would cause a 664ms delay (according to #1589-1 benchmark), which is unacceptable.
However, this is not a protocol problem.
Updated by Davide Pesavento over 7 years ago
Junxiao Shi wrote:
I do not understand this sentence:
This command does not have a hard limit and does not account for limits that exceed memory space
It means that we make no attempt to limit the CS limit based on currently available RAM or other metrics. So if you ask for a 50GB limit on a 1GB machine, you'll probably trigger an OOM condition.
- What's a use case for a CS with "cache" enabled but "serve" disabled?
For testing/experimentation, it can simplify or accelerate cache warm-up in some wireless scenarios. Also for demonstration purposes (I saw a CICN demo where they turned off "serving data from the CS").
- What's a use case for a CS with "serve" enabled but "cache" disabled?
We don't really have a strong use case in mind at the moment.
However, if the cache-on serve-off use case above is approved, then we still need at least 2 bits to represent all 3 possible states, therefore adding the cache-off serve-on combination comes basically for free.
- Why are
set-limit
andset-flags
commands needed? Similar functionality is already supported as (1) modifynfd.conf
(2) send SIGHUP to trigger config reload. What's a use case that cannot be supported through existing config reload mechanism? Hint: a valid use case should require runtime changes of each setting, and the justification needs to state why modifyingnfd.conf
is impractical.
Changing nfd.conf
requires root privileges.
Also, I could ask the same thing about StrategyChoice management...
- Assuming there is a compelling use case that requires commands, why not defining a single command that controls all attributes of the CS including capacity limit, serve/cache flags, and eviction policy?
What's your idea, specifically? We're open to suggestions on this.
- Should
flush
command support flushing only Data packets under a certain prefix? If so,erase
is a better name.
Maybe, but we didn't have a use case. However, I agree that the protocol can use "erase" as command verb, with one parameter (the prefix), and it doesn't necessarily have to match nfdc
command line.
list
dataset could be huge. Suppose the CS is at default capacity of 65536 packets and is completely full, and the average entry length is 32 octets, the dataset would have 263 segments each with 8000 octets of payload. Generating and signing so many Data packets would cause a 664ms delay (according to #1589-1 benchmark), which is unacceptable. However, this is not a protocol problem.
We already discussed about this (see above). We'll probably make this a "query" operation rather than "list".
For "query", the question is whether it finds only a single match or multiple matches. In the latter case, one can probably construct a query that matches every packet in the CS, so the scaling problem is still there. However, a CS Manager implementation may decide to truncate the response if it's larger than a predefined threshold.
Updated by Junxiao Shi over 7 years ago
Changing
nfd.conf
requires root privileges.
This reason is invalid. The sysadmin can create a UNIX group as the owner of /etc/ndn/nfd.conf
, and add users who are authorized to modify this file to the group.
Also, I could ask the same thing about StrategyChoice management...
Initially NFD management was designed to rely entirely on nfdc
commands. table.strategy_choice
config option was added later.
why not defining a single command that controls all attributes of the CS including capacity limit, serve/cache flags, and eviction policy?
What's your idea, specifically? We're open to suggestions on this.
I'd prefer a single set-config
command for all these settings, because they can be changed together.
In particular, changing eviction policy requires clearing the entire CS. If both eviction policy and capacity limit are to be changed, it's more efficient to do them together. Otherwise, assuming there is a clear
method on the existing eviction policy that is faster than repeatedly calling evict
, in case the caller lowers capacity limit and then changes eviction policy, there is useless work during the first command for evicting excess entries first, given the second command would clear the CS anyways.
Symmetrically, there should be a config
StatusDataset that reveals the current config.
The set of available eviction policies needs a separate cs/available-policies
dataset, to support nfdc
autocompletion.
I agree that the protocol can use "erase" as command verb, with one parameter (the prefix), and it doesn't necessarily have to match
nfdc
command line.
erase
is more general and should be preferred. My use case is to clean up poisoned Data.
For "query", the question is whether it finds only a single match or multiple matches. In the latter case, one can probably construct a query that matches every packet in the CS, so the scaling problem is still there. However, a CS Manager implementation may decide to truncate the response if it's larger than a predefined threshold.
query
definitely should support multiple matches. The desired number of entries should be provided as a parameter in the Interest, with an implementation-defined upper bound. Paging support is unnecessary.
Also, the Dispatcher's StatusDataset signing can be optimized: it can defer signing a segment until an Interest asking for this segment arrives. This still requires signing all the segments, but it does not block NFD for long time, and gives other faces a chance to compete for NFD forwarding processing.
Updated by Davide Pesavento over 7 years ago
- Related to Feature #4219: ContentStore hit/miss counters added
Updated by Davide Pesavento over 7 years ago
Junxiao Shi wrote:
Changing
nfd.conf
requires root privileges.This reason is invalid. The sysadmin can create a UNIX group as the owner of
/etc/ndn/nfd.conf
, and add users who are authorized to modify this file to the group.
Your solution is inferior, as it is too coarse-grained (all or nothing). With NFD mgmt, the admin can grant only the "cs" privilege to a user, disallowing all other changes.
I agree that the protocol can use "erase" as command verb, with one parameter (the prefix), and it doesn't necessarily have to match
nfdc
command line.
erase
is more general and should be preferred. My use case is to clean up poisoned Data.
Sure. What I meant was that we can have a nfdc cs flush
command (similar to ip address flush
) in addition to nfdc cs erase <prefix>
.
Updated by Junxiao Shi over 7 years ago
I agree with note-19.
I feel this scope of this issue is too board and thus it's going to take very long time (I'm estimating more than 6 months). I suggest splitting get/set config, enumeration, and deleting data as three issues. Then, we can start with get/set config which is the easiest.
Updated by Davide Pesavento over 7 years ago
- Assignee changed from Pauline Chan to Davide Pesavento
- % Done changed from 0 to 10
Updated by Junxiao Shi over 7 years ago
- Description updated (diff)
- Target version set to v0.7
Updated by Davide Pesavento about 7 years ago
- Status changed from In Progress to Feedback
- Assignee deleted (
Davide Pesavento) - Target version deleted (
v0.7)
Updated by Davide Pesavento about 7 years ago
- Related to Feature #4317: Content Store enumeration added
Updated by Davide Pesavento about 7 years ago
- Related to Feature #4318: Content Store flush/erase command added
Updated by Junxiao Shi almost 7 years ago
- Status changed from Feedback to In Progress
- Assignee set to Junxiao Shi
- Target version set to v0.7
- % Done changed from 10 to 20
Protocol update for cs/config
command and cs/info
dataset extension are in CsMgmt rev5 and ControlCommand rev34.
Updated by Davide Pesavento almost 7 years ago
Junxiao Shi wrote:
Protocol update for
cs/config
command andcs/info
dataset extension are in CsMgmt rev5 and ControlCommand rev34.
LGTM.
I suggest adding NCsEntries
to cs/info
as well, so that we have both capacity and current # of entries in the same dataset.
Updated by Junxiao Shi almost 7 years ago
I suggest adding
NCsEntries
tocs/info
as well, so that we have both capacity and current # of entries in the same dataset.
Agreed. This is added in CsMgmt rev7.
Updated by Junxiao Shi almost 7 years ago
- % Done changed from 20 to 30
https://gerrit.named-data.net/4453 declares cs/config
command in ndn-cxx.
Updated by Junxiao Shi almost 7 years ago
why don't we require the client to set the mask appropriately, as in FaceUpdateCommand
Agreed.
ControlCommand rev35 and FaceMgmt rev73 and CsMgmt rev8 update the protocol. Definition of Mask field has been moved to ControlCommand page.
Updated by Junxiao Shi almost 7 years ago
https://gerrit.named-data.net/4453 patchset3 implements the updated protocol.
Updated by Junxiao Shi almost 7 years ago
- % Done changed from 30 to 40
https://gerrit.named-data.net/4467 adds CS enablement flags to Cs
class.
Updated by Junxiao Shi almost 7 years ago
https://gerrit.named-data.net/4468 adds Capacity and CS enablement flags to CsInfo
struct.
It is a design choice to not have getFlags
and to use the name getNStored
instead of getNCsEntries
. These function names are suitable for the semantics of this struct, instead of matching the encoding.
Updated by Davide Pesavento almost 7 years ago
Junxiao Shi wrote:
It is a design choice [...] to use the name
getNStored
instead ofgetNCsEntries
.
Why not getSize
or getNEntries
?
These function names are suitable for the semantics of this struct, instead of matching the encoding.
This sentence is completely meaningless.
Updated by Junxiao Shi almost 7 years ago
Why not
getSize
orgetNEntries
?
getSize
does not indicate whether the unit is “entries” or “octets”.
getNEntries
is not better than getNStored
, ie “number of stored entries”.
the field in the serialized/deserialized way is called
NCsEntries
(I thought it is the same until check). I think it should simply match that one and be called getNCsEntries / setNCsEntries
No.
- TLV-TYPE does not have a name. It’s just a number.
- The constant naming that number is indeed “NCsEntries”. This constant name is defined in the context of general status dataset. “CS” part would be duplicate in the context of
CsInfo
struct. - Public API of
CsInfo
is designed to make sense to the caller, not to match the encoding.
Updated by Davide Pesavento almost 7 years ago
Junxiao Shi wrote:
Why not
getSize
orgetNEntries
?
getSize
does not indicate whether the unit is “entries” or “octets”.
Neither does getNStored
.
getNEntries
is not better thangetNStored
, ie “number of stored entries”.
getNEntries
provides a more uniform naming since "entries" is used elsewhere in management with the same meaning. And see also the previous point.
the field in the serialized/deserialized way is called
NCsEntries
(I thought it is the same until check). I think it should simply match that one and be called getNCsEntries / setNCsEntriesNo.
- TLV-TYPE does not have a name. It’s just a number.
- The constant naming that number is indeed “NCsEntries”. This constant name is defined in the context of general status dataset. “CS” part would be duplicate in the context of
CsInfo
struct.
True. But getNEntries
doesn't contain this redundancy.
Updated by Junxiao Shi almost 7 years ago
CsInfo::wireDecode
has a bug: when it encounters Capacity element, the value is incorrectly assigned to m_nHits
instead of m_capacity
. This is fixed in https://gerrit.named-data.net/4512. This bug was not caught earlier because:
- The unit test passes the wire encoding to
wireDecode
, and then checks the decodedCsInfo
equals the original. - To check the decoded
CsInfo
equals the original, the unit test usesoperator==
. operator==
does not invoke individual getters, but simply checks the encodings are equal.wireDecode
caches the encoding, sooperator==
is comparing the cached encoding.- As a result, there's no test coverage checking the correctness in decoded fields.
I have changed the test case to check the return value of each getter. To avoid the same mistake in the future, TLV struct decoding test cases should always check with individual getters.
Updated by Junxiao Shi almost 7 years ago
- % Done changed from 40 to 70
https://gerrit.named-data.net/4514 implements cs/config
command.
https://gerrit.named-data.net/4513 adds capacity and enablement flags to cs/info
dataset (kept as draft until Change 4512 merges).
Updated by Junxiao Shi almost 7 years ago
Remaining part of this issue is adding nfdc commands. Design is below, but implementation needs to wait after https://gerrit.named-data.net/4489 to avoid conflicts.
nfdc cs info will switch to a multi-line format similar to nfdc status
.
CS information:
capacity=17274
admit=on
serve=off
nEntries=4615
nHits=8958
nMisses=13540
There will be a new nfdc cs config command:
nfdc cs config [capacity <CAPACITY>] [admit on|off] [serve on|off]
Updated by Davide Pesavento almost 7 years ago
Junxiao Shi wrote:
Remaining part of this issue is adding nfdc commands. Design is below, but implementation needs to wait after https://gerrit.named-data.net/4489 to avoid conflicts.
4489 has been merged.
I agree with the proposed syntax for config and output format for info.
Updated by Davide Pesavento almost 7 years ago
- Related to Task #4498: 'nfdc cs' should be accepted as alias of 'nfdc cs info' added
Updated by Davide Pesavento almost 7 years ago
- Blocks Task #4497: Release 0.6.1 added
Updated by Davide Pesavento almost 7 years ago
Junxiao, do you think you can finish this ASAP so that we can have it in v0.6.1?
Updated by Davide Pesavento almost 7 years ago
- Blocks deleted (Task #4497: Release 0.6.1)
Updated by Junxiao Shi almost 7 years ago
- % Done changed from 70 to 90
https://gerrit.named-data.net/4597 introduces text::OnOff
helper for printing boolean as 'on' or 'off'.
https://gerrit.named-data.net/4598 adds nfdc cs config
command. Addition to nfdc cs info
will come in another commit.
Updated by Junxiao Shi almost 7 years ago
- Status changed from In Progress to Code review
- % Done changed from 90 to 100
https://gerrit.named-data.net/4604 nfdc cs info
includes config fields; also updating XML, XSD, XSL, tested manually with online validator.
Extras: the other vertical output in nfdc status show
is implemented alike. xml::Flag
struct prints optional XML element for boolean field, so that unit test of formatItemXml
does not need both true and false cases to achieve full coverage.
Updated by Junxiao Shi almost 7 years ago
- Status changed from Code review to Closed