Task #3807

Content store profiling

Added by Chengyu Fan over 3 years ago. Updated over 2 years ago.

Status: New
Priority: Normal
Assignee:
Category: -
Target version: -
Start date: 10/11/2016
Due date:
% Done: 50%
Estimated time:

Description

Profile the performance of content store, and understand where the bottlenecks are.


Files

callgrind_nfd_0.5.0_cs_benchmark_FindMissInsert.out (131 KB) Callgrind profiling output file for CS findMissInsert test on an ONL Host8core machine. Chengyu Fan, 10/11/2016 12:33 PM
callgrind_nfd_0.5.0_cs_benchmark_InsertFindHit.out (134 KB) Callgrind profiling output file for CS InsertFindHit test on an ONL Host8core machine. Chengyu Fan, 10/11/2016 12:33 PM
callgrind_nfd_0.5.0_cs_benchmark_Leftmost.out (37.1 KB) Callgrind profiling output file for CS LeftMost test on an ONL Host8core machine. Chengyu Fan, 10/11/2016 12:33 PM
callgrind_nfd_0.5.0_cs_benchmark_Rightmost.out (38.9 KB) Callgrind profiling output file for CS rightMost test on an ONL Host8core machine. Chengyu Fan, 10/11/2016 12:33 PM
callgrind_nfd_0.5.0_cs_benchmark_InsertFindHit_with_gerrit_name-component-3262.out (111 KB) Callgrind profiling output file for CS InsertFindHit test on an ONL Host8core machine with gerrit 3262. Chengyu Fan, 10/19/2016 12:20 PM
callgrind_nfd_0.5.0_cs_benchmark_FindMissInsert_with_gerrit_name-component-3262.out (130 KB) Callgrind profiling output file for CS findMissInsert test on an ONL Host8core machine with gerrit 3262. Chengyu Fan, 10/19/2016 12:20 PM
#1

Updated by Chengyu Fan over 3 years ago

Uploaded the callgrind output files for cs-benchmark. Each test case has its own callgrind file.

According to the output file:

  1. For test cases "insertFindHit" and "findMissInsert", the major contributors are

    nfd::cs::EntryImpl::operator<(nfd::cs::EntryImpl const&) const 91%
    ndn::Name::compare(unsigned long, unsigned long, ndn::Name const& ...) 80%

  2. nfd::cs::compareDataWithData() uses much more time than nfd::cs::compareQueryWithData(): 63% vs. 27%

  3. ndn::name::Component::compare() uses half of the running time
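
To illustrate why the comparison functions can dominate such a profile, here is a minimal sketch. It assumes the CS table behaves like an ordered set keyed by name, so every insert or lookup performs several name comparisons, each of which compares components one by one. ToyComponent, ToyName, ToyTable and countComparesForInserts are hypothetical stand-ins invented for this example, not the NFD or ndn-cxx types, and the counter only counts calls; it does not measure time.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <set>
#include <string>
#include <vector>

// Toy stand-ins for illustration only; NOT the ndn-cxx / NFD classes.
using ToyComponent = std::string;
using ToyName = std::vector<ToyComponent>;

// Number of component comparisons performed so far (illustration only).
static std::size_t g_componentCompares = 0;

// Compare two components: a shorter component sorts first,
// components of equal length are compared byte by byte.
int compareComponent(const ToyComponent& a, const ToyComponent& b)
{
  ++g_componentCompares;
  if (a.size() != b.size()) {
    return a.size() < b.size() ? -1 : 1;
  }
  return std::memcmp(a.data(), b.data(), a.size());
}

// Compare two names component by component; a proper prefix sorts first.
int compareName(const ToyName& a, const ToyName& b)
{
  const std::size_t n = std::min(a.size(), b.size());
  for (std::size_t i = 0; i < n; ++i) {
    const int c = compareComponent(a[i], b[i]);
    if (c != 0) {
      return c;
    }
  }
  if (a.size() == b.size()) {
    return 0;
  }
  return a.size() < b.size() ? -1 : 1;
}

// Strict weak ordering used by the toy table.
struct ToyNameLess
{
  bool operator()(const ToyName& a, const ToyName& b) const
  {
    return compareName(a, b) < 0;
  }
};

// Assumption: the table is an ordered set of names.
using ToyTable = std::set<ToyName, ToyNameLess>;

// Insert `count` distinct names that share a three-component prefix and
// return how many component comparisons the inserts triggered.
std::size_t countComparesForInserts(std::size_t count)
{
  g_componentCompares = 0;
  ToyTable table;
  for (std::size_t i = 0; i < count; ++i) {
    ToyName name{"ndn", "cs", "benchmark", std::to_string(i)};
    table.insert(name);
  }
  assert(table.size() == count); // all names are distinct
  return g_componentCompares;
}
```

Calling countComparesForInserts(n) for growing n shows the comparison count rising faster than n, because each insert walks the tree and every step re-compares the shared prefix components. That is consistent with the comparison functions appearing at the top of the profile, though it says nothing about their per-call cost.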

#2

Updated by Davide Pesavento over 3 years ago

  • Target version changed from v0.5 to v0.6

v0.5 has already been released.

#3

Updated by Junxiao Shi over 3 years ago

ndn::name::Component::compare() uses half of the running time

https://gerrit.named-data.net/3262 is an attempt to optimize name::Component::compare. Can Chengyu Fan run profiling again with this patch?

#4

Updated by Chengyu Fan over 3 years ago

Junxiao Shi wrote:

ndn::name::Component::compare() uses half of the running time

https://gerrit.named-data.net/3262 is an attempt to optimize name::Component::compare. Can Chengyu Fan run profiling again with this patch?

Will do

#5

Updated by Chengyu Fan over 3 years ago

Junxiao Shi wrote:

ndn::name::Component::compare() uses half of the running time

https://gerrit.named-data.net/3262 is an attempt to optimize name::Component::compare. Can Chengyu Fan run profiling again with this patch?

I have run the profiling again with gerrit patch 3262 (https://gerrit.named-data.net/3262).

However, the percentage of time spent in name::Component::compare() shows no noticeable difference with and without 3262.
I have also put the results in https://www.dropbox.com/sh/ars2l07kd93q1g1/AADuE9eTc3Ss7qFeDF14KiJXa/nfd-profiling/3807-cs-benchmark-profiling?dl=0

#6

Updated by Junxiao Shi over 3 years ago

I compared callgrind_nfd_0.5.0_cs_benchmark_FindMissInsert.out with callgrind_nfd_0.5.0_cs_benchmark_FindMissInsert_with_gerrit_name-component-3262.out.
After ndn-cxx:commit:010f0868cd204f75f661acc4320803d783786213, name::Component::compare indeed takes the expected "fast path", but its overall overhead is almost the same.

In the old "slow path", each name::Component::compare invokes Block::value twice and Block::value_size four times (both indirectly calling Block::hasValue).
In the new "fast path", each name::Component::compare invokes Block::wire twice and Block::size twice (both indirectly calling Block::hasWire) and Block::hasWire twice.
Although Block::size is cheaper than Block::value_size, Block::hasWire is more expensive than Block::hasValue, so the overheads of the two code paths break even.
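
The call counts above can be tallied in a small sketch. It assumes that every Block::value / Block::value_size call triggers exactly one Block::hasValue check and every Block::wire / Block::size call triggers exactly one Block::hasWire check; the comment only says "indirectly calling", so this one-to-one ratio is an assumption. CallTally, UnitCosts, slowPathTally, fastPathTally and pathCost are hypothetical names made up for this example, and no unit costs are supplied here: real ones would have to be read from the callgrind files.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical tally of Block accessor calls made by ONE Component::compare.
struct CallTally
{
  std::size_t value = 0;     // Block::value
  std::size_t valueSize = 0; // Block::value_size
  std::size_t hasValue = 0;  // Block::hasValue, direct and indirect
  std::size_t wire = 0;      // Block::wire
  std::size_t size = 0;      // Block::size
  std::size_t hasWire = 0;   // Block::hasWire, direct and indirect
};

// Old "slow path": value twice, value_size four times.
CallTally slowPathTally()
{
  CallTally t;
  t.value = 2;
  t.valueSize = 4;
  // Assumption: one hasValue check per accessor call.
  t.hasValue = t.value + t.valueSize;
  assert(t.hasValue == 6);
  return t;
}

// New "fast path": wire twice, size twice, plus two direct hasWire calls.
CallTally fastPathTally()
{
  CallTally t;
  t.wire = 2;
  t.size = 2;
  // Assumption: one hasWire check per accessor call, plus the 2 direct calls.
  t.hasWire = t.wire + t.size + 2;
  assert(t.hasWire == 6);
  return t;
}

// Per-call costs in arbitrary units; hypothetical inputs, not measurements.
struct UnitCosts
{
  unsigned long value;
  unsigned long valueSize;
  unsigned long hasValue;
  unsigned long wire;
  unsigned long size;
  unsigned long hasWire;
};

// Total cost of one comparison: sum of (call count x per-call cost).
unsigned long pathCost(const CallTally& t, const UnitCosts& c)
{
  return t.value * c.value + t.valueSize * c.valueSize +
         t.hasValue * c.hasValue + t.wire * c.wire +
         t.size * c.size + t.hasWire * c.hasWire;
}
```

Under these assumptions both paths perform six presence checks per comparison. The fast path saves on the size accessors (two Block::size calls instead of four Block::value_size calls), so it only wins if that saving exceeds the extra cost of six Block::hasWire checks over six Block::hasValue checks.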

#7

Updated by Chengyu Fan over 3 years ago

Junxiao Shi wrote:

I compared callgrind_nfd_0.5.0_cs_benchmark_FindMissInsert.out with callgrind_nfd_0.5.0_cs_benchmark_FindMissInsert_with_gerrit_name-component-3262.out.
After ndn-cxx:commit:010f0868cd204f75f661acc4320803d783786213, name::Component::compare indeed takes the expected "fast path", but its overall overhead is almost the same.

In the old "slow path", each name::Component::compare invokes Block::value twice and Block::value_size four times (both indirectly calling Block::hasValue).
In the new "fast path", each name::Component::compare invokes Block::wire twice and Block::size twice (both indirectly calling Block::hasWire) and Block::hasWire twice.
Although Block::size is cheaper than Block::value_size, Block::hasWire is more expensive than Block::hasValue, so the overheads of the two code paths break even.

I should have made this clearer. The patch did change the CS behavior, but the overhead is the same. The "fast path" is not actually faster.

#8

Updated by Davide Pesavento over 2 years ago

  • Category deleted (Integration Tests)
  • Target version deleted (v0.6)
  • % Done changed from 100 to 50
