Task #3807
openContent store profiling
Added by Chengyu Fan about 8 years ago. Updated about 1 year ago.
50%
Description
Profile the performance of content store, and understand where the bottlenecks are.
Files
callgrind_nfd_0.5.0_cs_benchmark_FindMissInsert.out (131 KB) callgrind_nfd_0.5.0_cs_benchmark_FindMissInsert.out | Callgrind profiling output file for CS findMissInsert test on a ONL Host8core machine | Chengyu Fan, 10/11/2016 12:33 PM | |
callgrind_nfd_0.5.0_cs_benchmark_InsertFindHit.out (134 KB) callgrind_nfd_0.5.0_cs_benchmark_InsertFindHit.out | Callgrind profiling output file for CS InsertFindHit test on a ONL Host8core machine | Chengyu Fan, 10/11/2016 12:33 PM | |
callgrind_nfd_0.5.0_cs_benchmark_Leftmost.out (37.1 KB) callgrind_nfd_0.5.0_cs_benchmark_Leftmost.out | Callgrind profiling output file for CS LeftMost test on a ONL Host8core machine | Chengyu Fan, 10/11/2016 12:33 PM | |
callgrind_nfd_0.5.0_cs_benchmark_Rightmost.out (38.9 KB) callgrind_nfd_0.5.0_cs_benchmark_Rightmost.out | Callgrind profiling output file for CS rightMost test on a ONL Host8core machine | Chengyu Fan, 10/11/2016 12:33 PM | |
callgrind_nfd_0.5.0_cs_benchmark_InsertFindHit_with_gerrit_name-component-3262.out (111 KB) callgrind_nfd_0.5.0_cs_benchmark_InsertFindHit_with_gerrit_name-component-3262.out | allgrind profiling output file for CS InsertFindHit test on a ONL Host8core machine with gerrit 3262 | Chengyu Fan, 10/19/2016 12:20 PM | |
callgrind_nfd_0.5.0_cs_benchmark_FindMissInsert_with_gerrit_name-component-3262.out (130 KB) callgrind_nfd_0.5.0_cs_benchmark_FindMissInsert_with_gerrit_name-component-3262.out | Callgrind profiling output file for CS findMissInsert test on a ONL Host8core machine with gerrit 3262 | Chengyu Fan, 10/19/2016 12:20 PM |
Updated by Chengyu Fan about 8 years ago
- File callgrind_nfd_0.5.0_cs_benchmark_FindMissInsert.out callgrind_nfd_0.5.0_cs_benchmark_FindMissInsert.out added
- File callgrind_nfd_0.5.0_cs_benchmark_InsertFindHit.out callgrind_nfd_0.5.0_cs_benchmark_InsertFindHit.out added
- File callgrind_nfd_0.5.0_cs_benchmark_Leftmost.out callgrind_nfd_0.5.0_cs_benchmark_Leftmost.out added
- File callgrind_nfd_0.5.0_cs_benchmark_Rightmost.out callgrind_nfd_0.5.0_cs_benchmark_Rightmost.out added
Uploaded the callgrind output files for cs-benchmark. Each test case has its own callgrind file.
According to the output file:
For test cases "insertFindHit" and "findMissInsert", the major contributor are
nfd::cs::EntryImpl::operator<(nfd::cs::EntryImpl const&) const 91%
ndn::Name::compare(unsigned long, unsigned long, ndn::Name const& ...) 80%nfd::cs::compareDataWithData() uses much more time than nfd::cs::compareQueryWithData(): 63% vs. 27%
ndn::name::Component::compare() uses half of the running time
Updated by Davide Pesavento about 8 years ago
- Target version changed from v0.5 to v0.6
v0.5 has already been released.
Updated by Junxiao Shi about 8 years ago
ndn::name::Component::compare()
uses half of the running time
https://gerrit.named-data.net/3262 is an attempt to optimize name::Component::compare
. Can @Chengyu Fan run profiling again with this patch?
Updated by Chengyu Fan about 8 years ago
Junxiao Shi wrote:
ndn::name::Component::compare()
uses half of the running timehttps://gerrit.named-data.net/3262 is an attempt to optimize
name::Component::compare
. Can @Chengyu Fan run profiling again with this patch?
Will do
Updated by Chengyu Fan about 8 years ago
- File callgrind_nfd_0.5.0_cs_benchmark_FindMissInsert_with_gerrit_name-component-3262.out callgrind_nfd_0.5.0_cs_benchmark_FindMissInsert_with_gerrit_name-component-3262.out added
- File callgrind_nfd_0.5.0_cs_benchmark_InsertFindHit_with_gerrit_name-component-3262.out callgrind_nfd_0.5.0_cs_benchmark_InsertFindHit_with_gerrit_name-component-3262.out added
- % Done changed from 0 to 100
Junxiao Shi wrote:
ndn::name::Component::compare()
uses half of the running timehttps://gerrit.named-data.net/3262 is an attempt to optimize
name::Component::compare
. Can @Chengyu Fan run profiling again with this patch?
I have run the profiling again with gerrit patch 3262 (https://gerrit.named-data.net/3262)
However, there is no distinct difference for the name::Component::compare()
time percentage with 3262 and without 3262.
I have also put the results in https://www.dropbox.com/sh/ars2l07kd93q1g1/AADuE9eTc3Ss7qFeDF14KiJXa/nfd-profiling/3807-cs-benchmark-profiling?dl=0
Updated by Junxiao Shi about 8 years ago
I compared callgrind_nfd_0.5.0_cs_benchmark_FindMissInsert.out
with callgrind_nfd_0.5.0_cs_benchmark_FindMissInsert_with_gerrit_name-component-3262.out
.
After ndn-cxx:commit:010f0868cd204f75f661acc4320803d783786213, name::Component::compare
indeed takes the expected "fast path", but it's overall overhead is almost the same.
In the old "slow path", each name::Component::compare
invokes Block::value
twice and Block::value_size
four times (both indirectly calling Block::hasValue
).
In the new "fast path", each name::Component::compare
invokes Block::wire
twice and Block::size
twice (both indirectly calling Block::hasWire
) and Block::hasWire
twice.
Although Block::size
is cheaper than Block::value_size
, Block::hasWire
is more expensive than Block::hasValue
, so that the overhead of both code paths break even.
Updated by Chengyu Fan about 8 years ago
Junxiao Shi wrote:
I compared
callgrind_nfd_0.5.0_cs_benchmark_FindMissInsert.out
withcallgrind_nfd_0.5.0_cs_benchmark_FindMissInsert_with_gerrit_name-component-3262.out
.
After ndn-cxx:commit:010f0868cd204f75f661acc4320803d783786213,name::Component::compare
indeed takes the expected "fast path", but it's overall overhead is almost the same.In the old "slow path", each
name::Component::compare
invokesBlock::value
twice andBlock::value_size
four times (both indirectly callingBlock::hasValue
).
In the new "fast path", eachname::Component::compare
invokesBlock::wire
twice andBlock::size
twice (both indirectly callingBlock::hasWire
) andBlock::hasWire
twice.
AlthoughBlock::size
is cheaper thanBlock::value_size
,Block::hasWire
is more expensive thanBlock::hasValue
, so that the overhead of both code paths break even.
I should make this clearer. The patch did change the CS behavior, but the overhead is the same. "fast path" is not fast.
Updated by Davide Pesavento almost 7 years ago
- Category deleted (
Integration Tests) - Target version deleted (
v0.6) - % Done changed from 100 to 50
Updated by Davide Pesavento about 1 year ago
- Category set to Tables
- Assignee deleted (
Chengyu Fan) - Start date deleted (
10/11/2016)