https://redmine.named-data.net/https://redmine.named-data.net/favicon.ico?14759811232018-08-03T10:32:22ZNDN project issue tracking systemndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=237602018-08-03T10:32:22ZDavide Pesavento
<ul><li><strong>Description</strong> updated (<a title="View differences" href="/journals/23760/diff?detail_id=20821">diff</a>)</li><li><strong>Start date</strong> deleted (<del><i>08/03/2018</i></del>)</li></ul> ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=238052018-08-07T16:47:57ZAshlesh Gawande
<ul></ul><p>One solution is to:</p>
<ul>
<li>Stop passing shared_ptr around</li>
<li>Remove deprecated functions</li>
</ul>
ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=238062018-08-07T19:23:40ZDavide Pesavento
<ul></ul><p>Ashlesh Gawande wrote:</p>
<blockquote>
<p>One solution is to:</p>
<ul>
<li>Stop passing shared_ptr around</li>
</ul>
</blockquote>
<p>Uhm no, there is a reason why we do that. SegmentFetcher needs to manage its own lifetime. You could pass <code>weak_ptr</code> around, instead of <code>shared_ptr</code>, and give ownership to the caller, but that's an incompatible API change, because current consumers expect the fetch to continue even if they don't retain the <code>shared_ptr</code>.<br>
I would not disagree with changing this, but you need to find existing consumers of the new <code>start()</code>-based API and fix them. And you still need to cancel outstanding operations in an orderly fashion.</p>
ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=238092018-08-08T06:26:02ZAshlesh Gawande
<ul></ul><p>1) Yes, this will affect a lot of users if they are (expectedly) relying on not managing the shared_ptr. NLSR does rely on segment fetcher managing itself. If we were to change segment fetcher as you suggested, then in the application do we need to keep track of shared_ptr for each instance to make sure it does not go out of scope and destructs? Also, can you please elaborate on "cancel outstanding operations in an orderly fashion?"</p>
<p>2) If it is a lot of work to change dependent applications, how about a stop() function that does the following:</p>
<ul>
<li>Cancel any scheduled events</li>
<li>Manipulate (m_timeLastSegmentReceived + m_options.maxTimeout) such that it is less than time::steady_clock::now()</li>
<li>Call afterTimeoutCb</li>
</ul>
<p>But again applications who wish to cancel any pending segment fetchers so as to release memory when they are successfully terminated need to keep track of each segment fetcher and call stop on each when destructor is called.</p>
ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=238102018-08-08T06:58:07ZJunxiao Shi
<ul></ul><p>I disagree with any breaking changes regarding requiring the calling app to manage <code>SegmentFetcher</code> instance lifetime.</p>
<p>SegmentFetcher should keep tracking of <code>PendingInterestId*</code> it has expressed. <code>SegmentFetcher::cancel</code> simply cancels all pending Interests, which stops the transfer, and <em>naturally</em> releases <code>shared_ptr<SegmentFetcher></code> because no callback function still references it.</p>
ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=238112018-08-08T08:47:46ZAshlesh Gawande
<ul></ul><p>What about the scheduled timeout callback, afterTimeoutCb?<br>
<a href="https://github.com/named-data/ndn-cxx/blob/master/src/util/segment-fetcher.cpp#L148">https://github.com/named-data/ndn-cxx/blob/master/src/util/segment-fetcher.cpp#L148</a><br>
We also need to cancel pending events from scheduler.</p>
ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=238132018-08-08T17:09:14ZDavide Pesavento
<ul></ul><p>Ashlesh Gawande wrote:</p>
<blockquote>
<p>1) Yes, this will affect a lot of users if they are (expectedly) relying on not managing the shared_ptr.</p>
</blockquote>
<p>Not really. I'm only talking about the users of the <em>new</em> <code>start()</code>-based API. We can ignore the users of the deprecated API since we won't be changing that. I'm not expecting many users of the new API, given that it was added not long ago and it's not in any released version of ndn-cxx.</p>
<blockquote>
<p>If we were to change segment fetcher as you suggested, then in the application do we need to keep track of shared_ptr for each instance to make sure it does not go out of scope and destructs?</p>
</blockquote>
<p>Correct.</p>
<blockquote>
<p>Also, can you please elaborate on "cancel outstanding operations in an orderly fashion?"</p>
</blockquote>
<p>Cancel pending interests and scheduled events.</p>
ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=238142018-08-08T17:11:15ZDavide Pesavento
<ul></ul><p>Junxiao Shi wrote:</p>
<blockquote>
<p>I disagree with any breaking changes regarding requiring the calling app to manage <code>SegmentFetcher</code> instance lifetime.</p>
</blockquote>
<p>Again, as I said, it wouldn't really be a breaking change since the new API hasn't appeared in any released version of ndn-cxx yet.</p>
ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=238812018-08-21T20:06:33ZAshlesh Gawande
<ul></ul><p>So if I make the following changes to the segment fetcher:</p>
<ul>
<li>Pass around weak_ptr instead of shared_ptr</li>
<li>Add a destructor where I erase pending interests and scheduled events</li>
</ul>
<p>Then how would an application manage the memory?</p>
<p>In mgmt/nfd/controller, the user of the new API, I added a vector to store the shared_ptr.<br>
But I cannot erase the stored shared_ptr in the call back connected to the shared_ptr w/o heap-use-after-free.</p>
<pre><code> auto fetcher = SegmentFetcher::start(m_face, Interest(prefix), m_validator, fetcherOptions);
m_fetchers.push_back(fetcher);
if (processResponse) {
fetcher->onComplete.connect([this, processResponse] (ConstBufferPtr bufferPtr) {
processResponse(std::move(bufferPtr));
for (auto it = m_fetchers.begin(); it != m_fetchers.end();) {
if (it->use_count() == 1) {
it = m_fetchers.erase(it);
}
else {
++it;
}
}
});
}
if (onFailure) {
fetcher->onError.connect([=] (uint32_t code, const std::string& msg) {
processDatasetFetchError(onFailure, code, msg);
});
}
</code></pre> ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=238822018-08-22T02:38:21ZJunxiao Shi
<ul></ul><blockquote>
<p>Then how would an application manage the memory?</p>
<p>In mgmt/nfd/controller, the user of the new API, I added a vector to store the shared_ptr.</p>
</blockquote>
<p>Revert that.</p>
<blockquote>
<p>But I cannot erase the stored shared_ptr in the call back connected to the shared_ptr w/o heap-use-after-free.</p>
</blockquote>
<p>My first idea is to capture shared_ptr into both callbacks, but this wouldn’t work: the callback is still hooked on a signal and thus would never release the shared_ptr.</p>
<p>My second idea can work:</p>
<pre><code>struct PtrWrapper
{
shared_ptr<SegmentFetcher> fetcher;
};
// when you start fetching
wrapper = new PtrWrapper();
wrapper.fetcher = SegmentFetcher::start(
// in a callback
delete wrapper;
</code></pre>
<p>Basically it turns a shared_ptr into a bare pointer. Incredibly ugly, isn’t it? This is exactly the reason that a caller shouldn’t be expected to maintain the lifetime of SegmentFetcher.<br>
In fact, the reason to name the API <code>SegmentFetcher::start</code> Is to convey that the caller doesn’t need to keep the return value. If you change semantics, you should let the caller invoke SegmentFetcher constructor, and then provide a new <code>start</code> implementation that internally maintains the object lifetime.</p>
<p>My third idea: convert the API into a <strong>Promise</strong>. There are Promise libraries out there.</p>
ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=238832018-08-22T08:41:22ZAshlesh Gawande
<ul></ul><p>Okay thanks.<br>
I think for now, providing stop() function that does the following is quicker:</p>
<ul>
<li>Cancel pending interests</li>
<li>Cancel scheduled events</li>
</ul>
<p>Then in PSync I can have a shared_ptr member variable for segment fetcher, on which I call stop() before assigning it a new segment fetcher via start.<br>
Also, I can provide a stop function in PSync consumer which calls stop on segment fetcher member variable, that will be called in unit test destructor so that ASan leak-sanitizer does not complain.</p>
ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=238992018-08-23T16:04:19ZAshlesh Gawande
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>Code review</i></li><li><strong>Assignee</strong> set to <i>Ashlesh Gawande</i></li><li><strong>% Done</strong> changed from <i>0</i> to <i>100</i></li></ul> ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=239002018-08-23T16:45:54ZDavide Pesavento
<ul></ul><p>Junxiao Shi wrote:</p>
<blockquote>
<p>Basically it turns a shared_ptr into a bare pointer. Incredibly ugly, isn’t it? This is exactly the reason that a caller shouldn’t be expected to maintain the lifetime of SegmentFetcher.</p>
</blockquote>
<p>Junxiao, don't spread FUD please. "maintaining the lifetime" here simply means that when the last <code>shared_ptr</code> goes out of scope, the segment fetcher is automatically stopped... it's not that different from the usual RAII principle.</p>
<blockquote>
<p>In fact, the reason to name the API <code>SegmentFetcher::start</code> Is to convey that the caller doesn’t need to keep the return value.</p>
</blockquote>
<p>I don't know where you got this idea from... the word "start" does not imply or even suggest anything like that.</p>
<blockquote>
<p>If you change semantics, you should let the caller invoke SegmentFetcher constructor, and then provide a new <code>start</code> implementation that internally maintains the object lifetime.</p>
</blockquote>
<p>That's not going to help with the original issue that triggered this whole discussion: the memory "leak".</p>
ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=239012018-08-23T16:47:51ZDavide Pesavento
<ul></ul><p>Ashlesh Gawande wrote:</p>
<blockquote>
<p>In mgmt/nfd/controller, the user of the new API, I added a vector to store the shared_ptr.<br>
But I cannot erase the stored shared_ptr in the call back connected to the shared_ptr w/o heap-use-after-free.</p>
</blockquote>
<p>You're doing something wrong... are the segment fetcher callbacks checking that their <code>weak_ptr</code> is still valid? I can't really tell what's going on if I don't see the code.</p>
ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=239062018-08-24T10:27:10ZAshlesh Gawande
<ul></ul><p>Davide Pesavento wrote:</p>
<blockquote>
<p>Ashlesh Gawande wrote:</p>
<blockquote>
<p>In mgmt/nfd/controller, the user of the new API, I added a vector to store the shared_ptr.<br>
But I cannot erase the stored shared_ptr in the call back connected to the shared_ptr w/o heap-use-after-free.</p>
</blockquote>
<p>You're doing something wrong... are the segment fetcher callbacks checking that their <code>weak_ptr</code> is still valid? I can't really tell what's going on if I don't see the code.</p>
</blockquote>
<p>You are right, I was not checking if <code>weak_ptr</code> is still valid. Updated the same gerrit change.</p>
ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=239112018-08-24T18:39:46ZDavide Pesavento
<ul><li><strong>Target version</strong> set to <i>v0.7</i></li></ul> ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=239122018-08-24T18:45:09ZDavide Pesavento
<ul></ul><p>I was wondering if we should instead have a "scoped handle" type, separate from <code>SegmentFetcher</code> itself, that automatically cancels the associated fetch operation when destructed. This design would be consistent with several other facilities already available in ndn-cxx, such as <code>ScopedConnection</code>, <code>ScopedEventId</code>, and soon <code>ScopedPendingInterest</code> (<a class="issue tracker-2 status-5 priority-2 priority-default closed" title="Feature: Scoped expressed Interest (Closed)" href="https://redmine.named-data.net/issues/4316">#4316</a>).</p>
ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=239132018-08-24T18:47:35ZJunxiao Shi
<ul></ul><blockquote>
<p>I was wondering if we should instead have a "scoped handle" type, separate from <code>SegmentFetcher</code> itself, that automatically cancels the associated fetch operation when destructed.</p>
</blockquote>
<p>Yes. It’ll take a while before I get to that template, however.</p>
ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=239622018-09-01T21:09:38ZAshlesh Gawande
<ul><li><strong>File</strong> <a href="/attachments/856">heap-free-after-use.txt</a> <a class="icon-only icon-download" title="Download" href="/attachments/download/856/heap-free-after-use.txt">heap-free-after-use.txt</a> added</li></ul><p>Problem with storing the shared_ptr as a class member in SegmentFetcher is when to destroy it? I destroy it after onComplete and after onError. I also stopped passing weak_ptr around since we have the instance in the class and simply check if the member is alive before proceeding in the functions.</p>
<p>All the test where data is received passes w/o any problems.</p>
<p>I added a test (similar to TestStatusDataset/Failures/Timeout test to see why that was producing heap-free-after-use):</p>
<pre><code class="cpp syntaxhl" data-language="cpp"><span class="n">BOOST_AUTO_TEST_CASE</span><span class="p">(</span><span class="n">StatusR</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">DummyValidator</span> <span class="n">acceptValidator</span><span class="p">;</span>
<span class="n">SegmentFetcher</span><span class="o">::</span><span class="n">Options</span> <span class="n">options</span><span class="p">;</span>
<span class="n">options</span><span class="p">.</span><span class="n">maxTimeout</span> <span class="o">=</span> <span class="mi">3000</span><span class="n">_ms</span><span class="p">;</span>
<span class="p">{</span>
<span class="k">auto</span> <span class="n">fetcher</span> <span class="o">=</span> <span class="n">SegmentFetcher</span><span class="o">::</span><span class="n">start</span><span class="p">(</span><span class="n">face</span><span class="p">,</span> <span class="n">Interest</span><span class="p">(</span><span class="s">"/localhost/nfd/faces/list"</span><span class="p">),</span>
<span class="n">acceptValidator</span><span class="p">,</span> <span class="n">options</span><span class="p">);</span>
<span class="p">}</span>
<span class="n">advanceClocks</span><span class="p">(</span><span class="mi">500</span><span class="n">_ms</span><span class="p">,</span> <span class="mi">7</span><span class="p">);</span>
<span class="p">}</span>
</code></pre>
<p>which leads to heap-free-after-use. If I don't reset the fetcher after onError, then I get memory leaks.</p>
<p>@Junxiao, can you please elaborate your solution once again?</p>
ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=239632018-09-02T00:04:12ZDavide Pesavento
<ul></ul><p>Ashlesh Gawande wrote:</p>
<blockquote>
<p>I also stopped passing weak_ptr around since we have the instance in the class and simply check if the member is alive before proceeding in the functions.</p>
</blockquote>
<p>Well this is wrong. The mere act of checking the member <code>shared_ptr</code> is undefined behavior, because the instance of SegmentFetcher that contains the <code>shared_ptr</code> might no longer exist at the time of checking, hence the use-after-free.</p>
ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=239652018-09-03T15:29:49ZAshlesh Gawande
<ul></ul><p>I removed those checks, but still got the heap-use-after-free. So I am not sure what's wrong - I have uploaded the code.</p>
ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=239662018-09-03T17:32:12ZDavide Pesavento
<ul></ul><p>Removing the checks doesn't solve the problem I pointed out, it makes it worse. This part of your change was correct a few patch sets ago, I don't know why you decided to change it... anyway, I'll upload a new revision to show my design suggestion.</p>
ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=241112018-10-02T17:43:28ZAshlesh Gawande
<ul><li><strong>File</strong> <a href="/attachments/861">asan</a> <a class="icon-only icon-download" title="Download" href="/attachments/download/861/asan">asan</a> added</li></ul><p>So I think while running OutOfScopeTimeout test, we get heap-use-after-free in scheduler.cpp at <code>while (!m_queue.empty()) {</code><br>
because when we reset the shared_ptr, m_this, in onError it destructs the scheduler. But the io service is still running so it can trigger some call back in scheduler (I think).</p>
<p>Could a potential solution be to let users pass a scheduler to the segment fetcher (not tested yet)?</p>
ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=241122018-10-02T18:04:46ZJunxiao Shi
<ul></ul><blockquote>
<p>Could a potential solution be to let users pass a scheduler to the segment fetcher (not tested yet)?</p>
</blockquote>
<p>No, the API should remain unchanged.</p>
ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=241132018-10-02T18:12:21ZAshlesh Gawande
<ul></ul><p>Junxiao Shi wrote:</p>
<blockquote>
<blockquote>
<p>Could a potential solution be to let users pass a scheduler to the segment fetcher (not tested yet)?</p>
</blockquote>
<p>No, the API should remain unchanged.</p>
</blockquote>
<p>Okay, any suggestion on getting around the destruction of scheduler?<br>
(I have to reset m_this in onError, otherwise ASan will report memory leaks).</p>
ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=241212018-10-02T20:01:22ZDavide Pesavento
<ul></ul><p>Ashlesh Gawande wrote:</p>
<blockquote>
<p>Okay, any suggestion on getting around the destruction of scheduler?</p>
</blockquote>
<p>Capture a <code>shared_ptr</code> in the scheduled callback, to ensure <code>SegmentFetcher</code>, and therefore <code>Scheduler</code>, stays alive? <code>Scheduler</code> is unable to guarantee its own survival, callers must ensure the instance remains valid while there are outstanding callbacks.</p>
<p>We've had this problem before, see e.g. <a class="issue tracker-1 status-5 priority-2 priority-default closed" title="Bug: UtilFaceUri/CanonizeEmptyCallback triggers use-after-free in dns::Resolver (Closed)" href="https://redmine.named-data.net/issues/2653">#2653</a>. And I'm starting to think that <code>Scheduler</code> is causing more problems than it solves...</p>
<blockquote>
<p>(I have to reset m_this in onError, otherwise ASan will report memory leaks).</p>
</blockquote>
<p>Do you expect the new <code>stop()</code> function to behave synchronously?</p>
ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=241262018-10-03T16:58:11ZDavide Pesavento
<ul><li><strong>Blocks</strong> <i><a class="issue tracker-3 status-1 priority-2 priority-default" href="/issues/4637">Task #4637</a>: ndncatchunks: use SegmentFetcher</i> added</li></ul> ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=241702018-10-18T17:10:53ZAshlesh Gawande
<ul><li><strong>Blocks</strong> <i><a class="issue tracker-2 status-5 priority-3 priority-high3 closed" href="/issues/4662">Feature #4662</a>: Segment hello and sync data in partial sync</i> added</li></ul> ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=241742018-10-18T17:11:47ZAshlesh Gawande
<ul><li><strong>Blocks</strong> <i><a class="issue tracker-3 status-5 priority-2 priority-default closed" href="/issues/4716">Task #4716</a>: Segment sync data in full sync</i> added</li></ul> ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=242432018-11-01T08:40:09ZAshlesh Gawande
<ul><li><strong>Status</strong> changed from <i>Code review</i> to <i>Closed</i></li></ul> ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=243432018-11-21T14:12:38ZAshlesh Gawande
<ul><li><strong>Related to</strong> <i><a class="issue tracker-2 status-1 priority-2 priority-default" href="/issues/4776">Feature #4776</a>: Re-design SegmentFetcher</i> added</li></ul> ndn-cxx - Feature #4692: SegmentFetcher: allow applications to stop/cancel fetchhttps://redmine.named-data.net/issues/4692?journal_id=269042020-06-29T23:42:50ZDavide Pesavento
<ul><li><strong>Tags</strong> set to <i>API</i></li></ul>