Task #1541: Investigate applicability of unordered_map for NameTree

Added by Alex Afanasyev almost 10 years ago. Updated almost 8 years ago.

Status: Abandoned
Priority: Low
Assignee:
Category: Tables
Target version:
Start date:
Due date:
% Done: 60%
Estimated time:

Description

Depending on how our hash table is used, either std::unordered_map or boost::intrusive::unordered_map may simplify the hash table implementation.

Actions #1

Updated by Junxiao Shi almost 10 years ago

  • Description updated (diff)
Actions #2

Updated by Alex Afanasyev almost 10 years ago

  • Target version changed from v0.2 to Unsupported
Actions #3

Updated by Junxiao Shi over 9 years ago

  • Subject changed from Investigate applicability of boost::unordered_map or boost::intrusive::unordered_map for NameTree to Investigate applicability of Boost unordered_map for NameTree
  • Priority changed from Normal to Low
  • Target version changed from Unsupported to v0.3
  • Start date deleted (04/25/2014)

@Haowei agrees to work on this Task in the 0.3 timeline.

Actions #4

Updated by Junxiao Shi over 9 years ago

  • Subject changed from Investigate applicability of Boost unordered_map for NameTree to Investigate applicability of unordered_map for NameTree
  • Description updated (diff)

Changed boost::unordered_map to std::unordered_map because the latter is available with C++11.

Actions #5

Updated by Haowei Yuan about 9 years ago

  • Status changed from New to In Progress
Actions #6

Updated by Haowei Yuan about 9 years ago

  • % Done changed from 0 to 40
Actions #7

Updated by Haowei Yuan about 9 years ago

What is the Table structure used in the current Content Store implementation? Is it the same as defined in http://www.boost.org/doc/libs/1_48_0/boost/unordered/detail/table.hpp?

Actions #8

Updated by Junxiao Shi about 9 years ago

The ContentStore index requires an ordered container type. Currently std::set is chosen; it was benchmarked to be faster than Ilya's skip list.

Note that ContentStore isn't attached to NameTree.
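A minimal illustration of why an ordered index is convenient here (std::string stands in for ndn::Name; the real ContentStore comparator and entry type are more involved):

#include <set>
#include <string>

int main()
{
  // Ordered index: entries are kept sorted, so the ContentStore can seek to
  // the first entry at or after a given prefix, which a hash table cannot do.
  std::set<std::string> csIndex = {"/a", "/a/b", "/a/b/c", "/b"};

  auto it = csIndex.lower_bound("/a/b");
  // *it == "/a/b"; entries under the prefix follow in sorted order
}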

Actions #9

Updated by Haowei Yuan about 9 years ago

  • % Done changed from 40 to 60

std::unordered_map supports a customized hash function, and an equal_to predicate can be provided to determine whether two keys match. As a result, it seems possible to implement the required hash table functionality with std::unordered_map. However, I have noticed two performance-related issues so far.

  • lookup(): The NameTree lookup keys are name prefix strings. To accelerate lookup, the current hash table implementation stores the full hash value in each entry, so string comparison is needed only when the hash values match.

It is not clear whether the same optimization can be supported by std::unordered_map. For instance, to use the equal_to predicate, we could define a key structure that contains both the hash value and the key. However, during a lookup the hash value needs to be computed before calling unordered_map.find(), which wastes CPU cycles because the hash value would essentially be computed twice.

The following is an example using a customized key (original link: http://stackoverflow.com/questions/17016175/c-unordered-map-using-a-custom-class-type-as-the-key)

#include <string>
#include <unordered_map>

struct Key
{
  std::string first;
  std::string second;
  int         third;

  bool operator==(const Key& other) const
  {
    return first == other.first &&
           second == other.second &&
           third == other.third;
  }
};

// std::hash must be specialized for Key to be usable as an unordered_map key;
// this specialization simply combines the hashes of the members.
namespace std {

template<>
struct hash<Key>
{
  size_t operator()(const Key& k) const
  {
    return hash<string>()(k.first) ^
           (hash<string>()(k.second) << 1) ^
           (hash<int>()(k.third) << 2);
  }
};

} // namespace std

int main()
{
  std::unordered_map<Key, std::string> m6 = {
    {{"John", "Doe", 12}, "example"},
    {{"Mary", "Sue", 21}, "another"}
  };
}
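Alternatively, a hedged sketch of the same idea without specializing std::hash: the hash functor and the equality predicate can be passed as the map's third and fourth template parameters (the Key2/KeyHash/KeyEqual names below are illustrative only):

#include <string>
#include <unordered_map>

struct Key2
{
  std::string name;
  int id;
};

// Hash functor supplied as the map's Hash template parameter.
struct KeyHash
{
  std::size_t operator()(const Key2& k) const
  {
    return std::hash<std::string>()(k.name) ^ (std::hash<int>()(k.id) << 1);
  }
};

// Equality predicate supplied as the map's KeyEqual template parameter.
struct KeyEqual
{
  bool operator()(const Key2& lhs, const Key2& rhs) const
  {
    return lhs.name == rhs.name && lhs.id == rhs.id;
  }
};

int main()
{
  std::unordered_map<Key2, std::string, KeyHash, KeyEqual> m;
  m[Key2{"alpha", 1}] = "first";
}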
  • Hash table resize()

The other benefit of storing hash values in the hash table is that, during resize(), we don't need to recompute the hash values.

The std::unordered_map rehash() function would require rehashing all the items stored in the hash table. When the table is large, rehash() may take a considerable amount of time. So far, I have not found alternative methods for rehash().

I will continue exploring ways to accelerate lookup() and resize(). Please let me know if you are aware of any alternative methods.
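A hedged aside on the resize() concern: std::unordered_map cannot skip re-hashing during a rehash, but it can be pre-sized with max_load_factor() and reserve() so that rehashing is rarely triggered in the first place:

#include <string>
#include <unordered_map>

int main()
{
  std::unordered_map<std::string, int> table;

  // Cap the load factor and pre-allocate buckets for the expected size,
  // so insertions up to that size do not trigger a full rehash.
  table.max_load_factor(0.7f);
  table.reserve(1000000);

  for (int i = 0; i < 1000000; ++i) {
    table.emplace(std::to_string(i), i);
  }
}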

Actions #10

Updated by Junxiao Shi almost 9 years ago

Haowei Yuan wrote:

during a lookup the hash value needs to be computed before calling unordered_map.find(), which wastes CPU cycles because the hash value would essentially be computed twice

I don't understand this statement.

Consider the following key type:

// Assumes Name is ndn::Name and that a std::hash<Name> specialization exists.
class Key
{
public:
  // implicit conversion from Name; computes the Name hash exactly once
  Key(const Name& name)
    : hash(std::hash<Name>()(name))
    , m_name(name)
  {
  }

  // implicit conversion to Name
  operator Name() const
  {
    return m_name;
  }

  bool operator==(const Key& other) const
  {
    // compare the stored hashes first; Name comparison runs only on a match
    return hash == other.hash && m_name == other.m_name;
  }

public:
  const size_t hash;

private:
  const Name m_name;
};

namespace std {

// Full specialization of std::hash for Key: it returns the stored hash
// and never recomputes the Name hash.
template<>
struct hash<Key>
{
  size_t
  operator()(const Key& key) const
  {
    return key.hash;
  }
};

} // namespace std

When doing a lookup, NameTree constructs a Key instance, which involves exactly one Name hash computation (one call to std::hash<Name>::operator()).

There is no further Name hash computation inside the unordered_map.

During a rehash, there is no Name hash computation at all, because std::hash<Key>::operator() never invokes the Name hash; it simply returns the stored value.
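A minimal usage sketch of this Key approach (Entry is a hypothetical NameTree entry type; Name and std::hash<Name> are assumed as above):

#include <unordered_map>
// plus the Key class sketched above

struct Entry;  // hypothetical NameTree entry type

std::unordered_map<Key, Entry*> g_table;

Entry*
findExact(const Name& name)
{
  Key key(name);                // the only Name hash computation
  auto it = g_table.find(key);  // std::hash<Key> just returns key.hash
  return it != g_table.end() ? it->second : nullptr;
}

void
doubleBuckets()
{
  // Re-bucketing re-invokes std::hash<Key>, which returns the stored value,
  // so no Name hash is recomputed.
  g_table.rehash(2 * g_table.bucket_count());
}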

Actions #11

Updated by Haowei Yuan almost 9 years ago

Thank you, Junxiao, this is very helpful.
I will try to go through the required functions with this Key structure and see if all of them can be supported.

Actions #12

Updated by Davide Pesavento over 8 years ago

  • Target version changed from v0.3 to v0.5

any news here?

Actions #13

Updated by Haowei Yuan over 8 years ago

Davide Pesavento wrote:

any news here?

Not much recently; I will post some progress soon.

Actions #14

Updated by Haowei Yuan over 8 years ago

I will write some code this week to verify that all the existing functions can be supported, and estimate how much work would be involved if we decide to take this direction.

Actions #15

Updated by Junxiao Shi almost 8 years ago

  • Status changed from In Progress to Abandoned

The 20160526 call decided to abandon this task, because there is no evidence that Haowei's hash table implementation has bugs, and switching to the STL is likely to worsen performance.
