Project

General

Profile

Feature #2240

StrategyInfoHost: store multiple StrategyInfo of distinct types

Added by Junxiao Shi almost 5 years ago. Updated over 4 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
Tables
Target version:
Start date:
Due date:
% Done:

100%

Estimated time:
2.00 h

Description

Allow multiple StrategyInfo objects of distinct types to be attached to StrategyInfoHost.

StrategyInfo derived class must declare a int getTypeId() static method, which returns a unique identifier for the StrategyInfo type.
Those identifiers are assigned on StrategyInfoType wiki page.

StrategyInfoHost will store StrategyInfo objects in std::map<int, shared_ptr<StrategyInfo>> container.

Necessity:

  • Various parts of a complex strategy needs to store different types of information that are used at different occasions.
    Although it's possible to make a complex StrategyInfo type that contains everything, it's more convenient if StrategyInfoHost can allow multiple StrategyInfo of distinct types, because it allows sub modules to be shared between strategies.
  • In a composed strategy (#2000), each building block wants to store information for its own use.
    Although it's possible to compose a StrategyInfo type that contains everything, that is basically a container of building block's StrategyInfo indexed by type, which is same as what StrategyInfoHost is expected to do.
  • Measurements could be shared between strategies if they are using compatible types.
    StrategyInfo doesn't have to be cleared when changing strategy.

History

#1 Updated by Junxiao Shi almost 5 years ago

  • Status changed from New to In Progress

I'm starting now because my experiment needs this.

#2 Updated by Junxiao Shi almost 5 years ago

  • Status changed from In Progress to Code review
  • % Done changed from 0 to 100
  • Estimated time set to 2.00 h

#3 Updated by Junxiao Shi almost 5 years ago

  • Target version set to v0.3

#4 Updated by Junxiao Shi almost 5 years ago

In commit 207ca4e3467ec1877dc2021a7fd195c630fe6cf8, std::type_index is used as hashtable key.

There's concern about its "compiler and target dependent" performance.

One suggestion is to NS-3 style custom typeid, which reinvents std::type_index.

I will not make this change until a benchmark confirms a performance penalty on a supported platform.

If a new platform is added to the list of supported platforms in the future, and performance penalty due to the usage of typeid operator is confirmed on that platform, we could make the change at that time.

It won't become a widespread change at that time, because all StrategyInfo types have the same base class.

#5 Updated by Michael Sweatt almost 5 years ago

The reasons discussed before for type_id not being recommended are all true, if you want to check the limited required semantics for type_id, please see section 5.28 of the C++03, 11, or 14 standards, it has not changed. There is no requirement that on a format for the output, no rule that it be constant time for non-polymorphic types. Only a requirement that the argument by a glvalue (fine in this case) and that <typeid> by included, which does not adhere to. It does depend on RTTI which can in many cases be omitted, and thus does risk RTTI output when it otherwise not be. This does have clear performance implications.

#6 Updated by Junxiao Shi almost 5 years ago

I did a simple benchmark.

#include <ndn-cxx/util/time.hpp>
#include <unordered_map>
#include <typeindex>

struct A
{
  static const int TYPE;
};
const int A::TYPE = 1;

int
main() {
  std::unordered_map<int, int> m1;
  std::unordered_map<std::type_index, int> m2;

  auto t1 = ndn::time::system_clock::now();
  for (int i = 0; i <= 99999999; ++i) {
    m1[A::TYPE] = i;
  }
  auto t2 = ndn::time::system_clock::now();
  m1.clear();

  auto t3 = ndn::time::system_clock::now();
  for (int i = 0; i <= 99999999; ++i) {
    m2[typeid(A)] = i;
  }
  auto t4 = ndn::time::system_clock::now();
  m2.clear();

  std::cout << (t2 - t1) << std::endl;
  std::cout << (t4 - t3) << std::endl;

  return 0;
}

Results:

OS int hashtable type_index hashtable
Ubuntu 12.04 56.6ns/op 148.9ns/op
OSX 10.9 62.0ns/op 65.6ns/op

Surely there's a noticeable performance difference on Ubuntu.

However, the absolute difference is 92.3ns/op, or 1 millisecond per 10834 operations.

It doesn't worth programmer effort to optimize for this tiny difference.

#7 Updated by Davide Pesavento almost 5 years ago

On Ubuntu 14.10 with gcc-4.9.1 and -O2, I get around 9.34 ns/op and 15.10 ns/op. Are you sure you're building with optimizations turned on?

#8 Updated by Junxiao Shi almost 5 years ago

On Ubuntu 14.10 with gcc-4.9.1 and -O2, I get around 9.34 ns/op and 15.10 ns/op. Are you sure you're building with optimizations turned on?

I recall that I didn't specify -O3, so optimization isn't turned on.

Your result further confirms my claim: the absolute different is too small that it doesn't worth programmer effort to optimize.

See Time exchange rate.

#9 Updated by Alex Afanasyev almost 5 years ago

I will simply not agree of using the undefined behavior of C++ feature, even if it "about the same" difference for the compilers we currently support.

It is trivial to implement it within defined behavior without introducing any complexity to the code.


I'm actually not really seeing why do we need multiple strategy host infos. I would like to see a use case how it going to be used.

#10 Updated by Junxiao Shi almost 5 years ago

A use case is: suppose one wants to add SuppressionInfo (#1913) to an existing strategy where PIT entry has a different StrategyInfo type, it would be more convenient to allow multiple StrategyInfo of distinct types.

Surely everything can be coordinated inside the strategy, but that would increase the complexity of strategy, or force strategy to have a type-based table similar to this one.

#11 Updated by Junxiao Shi almost 5 years ago

I will simply not agree of using the undefined behavior of C++ feature, even if it "about the same" difference for the compilers we currently support.

If std::type_index isn't broken, there's no reason to reinvent this feature.

As long as correctness is guaranteed, this isn't undefined behavior.

It is trivial to implement it within defined behavior without introducing any complexity to the code.

Only 0.5-hour tasks are trivial. ns3::TypeId cannot be imported due to license conflict, and cannot be reimplemented in 0.5 hour, so this is not trivial.

#12 Updated by Davide Pesavento almost 5 years ago

Michael Sweatt wrote:

The reasons discussed before for type_id not being recommended are all true, if you want to check the limited required semantics for type_id, please see section 5.28 of the C++03, 11, or 14 standards, it has not changed. There is no requirement that on a format for the output, no rule that it be constant time for non-polymorphic types. Only a requirement that the argument by a glvalue (fine in this case) and that <typeid> by included, which does not adhere to. It does depend on RTTI which can in many cases be omitted, and thus does risk RTTI output when it otherwise not be. This does have clear performance implications.

I'm aware of what the C++ standard says regarding typeid and type_info. In fact, the standard doesn't even guarantee that different type_info objects have different hash values, therefore a ISO-compliant implementation may return the same value for every type. However I don't expect that any sane compiler does that.

You're right about the fact that <typeinfo> must be included, otherwise the program is ill-formed.

RTTI is technically not required in this case, as in our case the typeid is applied to a type-id (per the ISO std definition), thus the type is static and known at compile time. This does not mean that the result of the typeid is actually calculated at compile-time, but no runtime class hierarchy walk or vtable lookup is needed either.

Unfortunately gcc disables all usages of typeid when -fno-rtti is passed. So, if the ability to compile NFD with -fno-rtti is a goal, then yes, using typeid creates a problem. However, I don't think this is worth our time, especially at this stage of development.

#13 Updated by Davide Pesavento almost 5 years ago

Alex Afanasyev wrote:

I will simply not agree of using the undefined behavior of C++ feature, even if it "about the same" difference for the compilers we currently support.

It is not undefined behavior.

#14 Updated by Alex Afanasyev almost 5 years ago

If alternative implementation (having a static ID value for the strategy info class) was hard to implement, I could be ok with having typeid for now. If it wasn't primary processing path, I will be ok too.

However, it is so trivial that for the time we are talking here, this feature could have been implemented long time ago. I will simply not going to merge commit that uses typeid in the primary processing path.

#15 Updated by Junxiao Shi almost 5 years ago

Please prove typeid operator is bad, and justify the necessity of reinventing the wheel.

#16 Updated by Davide Pesavento over 4 years ago

While reading shared_ptr_base.h for an unrelated reason, I (somewhat surprisingly) discovered that, in the GCC implementation, one of the constructors of __shared_ptr (the one used by make_shared and allocate_shared) applies typeid on an empty struct if RTTI is available. And we already use make_shared extensively. This should confirm that using typeid on statically-known types is not a big deal.

#17 Updated by Alex Afanasyev over 4 years ago

This doesn't change my position. It is compiler dependent. Even though the primary focus right now is clang and gcc, I will not agree of using typeid in the primary path in our code. It is ok for gcc, since this code is compiler-specific.

#18 Updated by Junxiao Shi over 4 years ago

My position is still note-4: prove typeid is broken, otherwise it's good.

#19 Updated by Davide Pesavento over 4 years ago

Alex Afanasyev wrote:

This doesn't change my position. It is compiler dependent. Even though the primary focus right now is clang and gcc, I will not agree of using typeid in the primary path in our code. It is ok for gcc, since this code is compiler-specific.

And what other compilers are we realistically going to support in the near/mid term? MSVC? I'm not even sure if it's able to compile our C++11 code... typeid is the least of our problems there...

#20 Updated by Junxiao Shi over 4 years ago

In fact, typeid cannot have non-constant time.

Imagine an inefficient compiler which maintains a table of all types in an executable, and typeid iterates over this table. The complexity of typeid in this case is still constant time, because the size of such table is a compile time constant for a given executable.

#21 Updated by Davide Pesavento over 4 years ago

Junxiao Shi wrote:

Imagine an inefficient compiler which maintains a table of all types in an executable, and typeid iterates over this table. The complexity of typeid in this case is still constant time, because the size of such table is a compile time constant for a given executable.

err... what? I'd say that's linear in the number of types present in the final binary, since such a table would have to be scanned at runtime.

In any case, this is not how it's done, and it's rather pointless to make these worst-case speculations on compiler implementations. My previous comments should already have proven that using typeid on statically-known types is far from being a problem in practice.

#22 Updated by Junxiao Shi over 4 years ago

  • Description updated (diff)

20150102 conference call decides to use int as index key, because note-6 shows it's faster than type_index.

The uniqueness of integer identifiers is guaranteed by a registry.

#23 Updated by Junxiao Shi over 4 years ago

  • Status changed from Code review to In Progress
  • % Done changed from 100 to 50

I'll rework the Change according to note-22 decision.

#24 Updated by Junxiao Shi over 4 years ago

  • Description updated (diff)
  • Status changed from In Progress to Code review
  • % Done changed from 50 to 100

http://gerrit.named-data.net/1486 patchset 8 implements note-22 idea.

Unfortunately it's unfeasible to maintain backwards compatibility because getTypeId method is introduced.

#25 Updated by Junxiao Shi over 4 years ago

The container type is std::map, because the number of stored items is expected to be small (less than five), and std::map is believed to consume less memory than std::unordered_map in this case.

#26 Updated by Junxiao Shi over 4 years ago

  • Status changed from Code review to Closed

Also available in: Atom PDF