Project

General

Profile

Feature #2411

ContentStore API: asynchronous lookup

Added by Junxiao Shi over 4 years ago. Updated over 2 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
Tables
Target version:
Start date:
Due date:
% Done:

100%

Estimated time:
9.00 h

Description

Change ContentStore lookup API from:

const Data* find(const Interest& interest) const;

to:

typedef std::function<void(const Interest&, const Data&)> HitCallback;
typedef std::function<void(const Interest&)> MissCallback;
void find(const Interest& interest, HitCallback hit, MissCallback miss) const;

This allows ContentStore implementation to perform lookup operations asynchronously.

This asynchronous API enables the following use case:

On a low-memory device, it's desirable to store part of ContentStore on external storage, in order to increase the capacity of ContentStore.
A possible design is: keep the index in main memory, but store some Data packets on a flash drive.

Under such design, a ContentStore in low-memory device can synchronously determine whether a match exists, but needs to load the matched Data packet from the flash drive asynchronously.

Although the API is asynchronous and does not specify a timeout, ContentStore implementation should invoke the callback quickly (within a few milliseconds).

If the match result is already known, the callback may be invoked within find function.

Caller of find function must guarantee validity of const Interest& argument until either HitCallback or MissCallback is invoked.

The const Interest& passed to HitCallback and MissCallback is the same instance of the argument passed to find function.

const Data& passed to HitCallback is guaranteed to be valid only for the duration of this callback.
However, this Data object is always created by make_shared and therefore the callback can safely obtain shared ownership with shared_from_this in order to extend its validity.

Forwarding pipelines must be changed accordingly: the second half of "Incoming Interest pipeline" is split to "ContentStore Hit pipeline" and "ContentStore Miss pipeline".


Related issues

Blocked by NFD - Feature #2185: Recognize CachingPolicy=NoCacheClosed

History

#1 Updated by Alex Afanasyev over 4 years ago

Would it be better if find has 2 callbacks, one for success and one for failure? This would simplify dispatching in Forwarder:

typedef std::function<void(const Data&)> CacheHitCallback;
typedef std::function<void()> CacheMissCallback;
void asyncFind(Interest interest, CacheHitCallback ch, CacheMissCallback cm) const;

I'm also suggesting to add "async" to explicitly indicate that the call is asyncrhornous.

#2 Updated by Junxiao Shi over 4 years ago

Using two callbacks requires two binds or lambdas, which causes parameters such as references to PIT entry to be copied twice.

I plan to use a lambda inside incoming Interest pipeline as the single callback, and put a conditional in that lambda to enter hit or miss pipeline.

'async' is unnecessary because there isn't a synchronous operation.

Example: ndn::Face::expressInterest is asynchronous, but its name doesn't contain 'async'.

#3 Updated by Alex Afanasyev over 4 years ago

Using two callbacks requires two binds or lambdas, which causes parameters such as references to PIT entry to be copied twice.

This is a really bad argument, giving tons of other copyings that happen throughout Forwarder and Strategy classes.

I plan to use a lambda inside incoming Interest pipeline as the single callback, and put a conditional in that lambda to enter hit or miss pipeline.

Anyhow. I wouldn't object that hard, but it is less explicit than separate callback.

'async' is unnecessary because there isn't a synchronous operation.

Example: ndn::Face::expressInterest is asynchronous, but its name doesn't contain 'async'.

I kind of can go either way. However, expressInterest is not "getData", it as a phrase shows that we are "expressing Interests" for some data (for me, just a different way to say async). Also, you yourself were suggesting to be more explicit in other places. What's harm to be explicit here?

#4 Updated by Davide Pesavento over 4 years ago

I would tend to agree with Alex on both points, although not strongly, so I'd be ok with either of the proposed APIs.

One question: what's the guaranteed lifetime of the "returned" pointer/reference? In other words, would a weak_ptr make sense?

#5 Updated by Junxiao Shi over 4 years ago

Currently, the return Data packet is valid synchronously only.

This reminds me that return type needs to be shared_ptr:
a ContentStore may load a Data packet from flash drive but not retain it in memory.
In this case, a shared_ptr is necessary to delete the Data packet after Forwarder has done with it.


I disagree with 'async' naming when synchronous operation does not exist.

The function signature already indicates it's asynchronous.

I disagree with two callbacks due to reason stated in note-2.

#6 Updated by Davide Pesavento over 4 years ago

Junxiao Shi wrote:

I disagree with two callbacks due to reason stated in note-2.

This is premature optimization and does not constitute a valid reason against the two-callbacks approach.

#7 Updated by Davide Pesavento over 4 years ago

Junxiao Shi wrote:

This reminds me that return type needs to be shared_ptr:
a ContentStore may load a Data packet from flash drive but not retain it in memory.
In this case, a shared_ptr is necessary to delete the Data packet after Forwarder has done with it.

A possible alternative is guaranteeing the validity of the Data only for the duration of the callback. If the callback wants to retain the Data for a longer time, it must make a copy.

#8 Updated by Junxiao Shi over 4 years ago

  • Description updated (diff)

Updated description to use two callbacks, and added a paragraph about validity duration of returned Data.

#9 Updated by Junxiao Shi over 4 years ago

forwarding-pipelines_20150211.pptx is the updated pipelines design to support this API change.

This issue is more about Forwarding than Tables.

#10 Updated by Lan Wang over 4 years ago

  • Assignee set to Minsheng Zhang

#11 Updated by Minsheng Zhang over 4 years ago

To support async, does that mean we have two threads, one for CS hit pipeline and another one for CS miss pipeline?

#12 Updated by Junxiao Shi over 4 years ago

To support async, does that mean we have two threads, one for CS hit pipeline and another one for CS miss pipeline?

No, everything is still in one thread.

In this Feature, when void Cs::find(Interest interest, HitCallback hit, MissCallback miss) const is called, it synchronously performs the lookup, and immediately invokes either hit or miss callback.

This Feature is to change the API without changing how ContentStore works.

If a future ContentStore impl needs to run the lookup procedure in a separate thread, it shall ensure hit or miss is invoked in the main thread.

#13 Updated by Minsheng Zhang over 4 years ago

But how does this support async, still not quite understand about it.

#14 Updated by Alex Afanasyev over 4 years ago

For now it will be just a conceptual async. This separation by itself would allow us in the future to implement really async content store which will not needed to be fully synchronized with the forwarding pipelines.

#15 Updated by Minsheng Zhang over 4 years ago

  • Status changed from New to Code review
  • % Done changed from 0 to 40

#16 Updated by Junxiao Shi over 4 years ago

  • Blocked by Feature #2185: Recognize CachingPolicy=NoCache added

#17 Updated by Junxiao Shi over 4 years ago

Please reassign this issue to me after all code changes are completed and merged. I'll take care of updating NFD Developer Guide after the updates for #2185.

#18 Updated by Junxiao Shi over 4 years ago

From Alex's review on commit:6d8e68c9f59cab507804709c83ae5622afb140bb

change prototype to typedef std::function<void(const Interest&)> MissCallback;

This is not symmetric with HitCallback.
Either pass Interest to both callbacks, or to neither.

Also, please explain why Interest should be passed to callbacks.

#19 Updated by Alex Afanasyev over 4 years ago

My suggestion was based entirely on the use case: the interest was already passed to the callback, but in an obscured (and error-prone) way by binding reference of the interest, lifetime of which is assumed to exist until the invocation.

I don't see how "symmetric" argument matters here. However, I think we should add Interest as a parameter to HitCallback as well. Though, we can add it later, when the need arises (if arises).

#20 Updated by Junxiao Shi over 4 years ago

I disagree with having Interest in one callback but not the other. ContentStore should have a consistent API regardless of use case.

Either pass const Interest& in both callbacks, or keep the current API unchanged.

#21 Updated by Alex Afanasyev over 4 years ago

I prefer then that the interest is supplied to both callbacks.

#22 Updated by Junxiao Shi over 4 years ago

  • Description updated (diff)
  • Status changed from Code review to In Progress

Spec changed according to note-21.

#23 Updated by Junxiao Shi over 4 years ago

  • Assignee changed from Minsheng Zhang to Junxiao Shi
  • % Done changed from 40 to 70

Code is merged. I'll update NFD Developer Guide.

#24 Updated by Junxiao Shi over 4 years ago

  • Status changed from In Progress to Resolved
  • Start date deleted (01/25/2015)
  • % Done changed from 70 to 100

NFD Developer Guide is updated.

#25 Updated by Junxiao Shi over 4 years ago

  • Status changed from Resolved to Closed

#26 Updated by Junxiao Shi over 2 years ago

https://gerrit.named-data.net/3652 improves the test suite.

Also available in: Atom PDF