Project

General

Profile

Actions

Feature #4843

closed

EndpointId in LinkService

Added by Junxiao Shi almost 6 years ago. Updated about 5 years ago.

Status:
Abandoned
Priority:
Normal
Category:
Faces
Target version:
-
Start date:
Due date:
% Done:

40%

Estimated time:

Description

Functions and signals on LinkService and Face classes need to have EndpointId argument.

Also, create a IncomingFaceTag that carries both FaceId and EndpointId.
In NFD primary thread (i.e. except RIB), replace all usage of IncomingFaceIdTag with IncomingFaceTag.


Related issues 2 (0 open2 closed)

Related to NFD - Feature #4283: Refactor Ethernet unicast communicationRejected

Actions
Related to NFD - Feature #4849: EndpointId in forwarding and Strategy APIAbandonedMd Ashiqur Rahman

Actions
Actions #1

Updated by Junxiao Shi almost 6 years ago

  • Blocks Feature #4281: Develop self-learning for broadcast and ad hoc wireless faces added
Actions #2

Updated by Junxiao Shi almost 6 years ago

  • Related to Feature #4283: Refactor Ethernet unicast communication added
Actions #3

Updated by Junxiao Shi almost 6 years ago

  • Subject changed from Extend IncomingFaceIdTag to IncomingFaceTag to EndpointId in LinkService
  • Description updated (diff)
Actions #4

Updated by Junxiao Shi almost 6 years ago

  • Related to Feature #4849: EndpointId in forwarding and Strategy API added
Actions #5

Updated by Davide Pesavento almost 6 years ago

Btw, we should move the Transport::EndpointId typedef to face.hpp as nfd::face::EndpointId (and possibly aliased as nfd::EndpointId, similar to FaceId) and then use EndpointId everywhere instead of uint64_t.

Actions #6

Updated by Md Ashiqur Rahman almost 6 years ago

Btw, we should move the Transport::EndpointId typedef to face.hpp as nfd::face::EndpointId (and possibly aliased as nfd::EndpointId, similar to FaceId) and then use EndpointId everywhere instead of uint64_t.

Yes, I agree. Can we have this as a separate issue so that only this change can be done over there?

Actions #7

Updated by Davide Pesavento almost 6 years ago

Separate commit, yes. I don't see the need for a separate redmine issue. This is a mechanical change.

Actions #8

Updated by Md Ashiqur Rahman almost 6 years ago

Separate commit, yes. I don't see the need for a separate redmine issue. This is a mechanical change.

Yes. That can be done. But then which issue should the change refer to in the commit message?

Actions #9

Updated by Davide Pesavento almost 6 years ago

You can refer to this issue, but I don't really care, it's unimportant.

Actions #10

Updated by Md Ashiqur Rahman almost 6 years ago

Ok. If I move out the typedef for EndpointId from transport.hpp to face.hpp then should I rename the EndpointId remoteEndpoint to uint64_t remoteEndpoint in the Packet subclass under Transport class or keep the typedef uint64_t EndpointId in the transport.hpp too?

Moving the typedef is causing me error even though I did the aliasing in face.hpp, some examples:

../daemon/face/transport.hpp:137:5: error: ‘EndpointId’ does not name a type; did you mean ‘dprintf’?
     EndpointId remoteEndpoint;
.
.
../daemon/face/lp-reassembler.hpp:100:62: error: template argument 2 is invalid
   signal::Signal<LpReassembler, Transport::EndpointId, size_t> beforeTimeout;
.
.
../daemon/face/datagram-transport.hpp:197:6: error: ‘class nfd::face::Transport::Packet’ has no member named ‘remoteEndpoint’
   tp.remoteEndpoint = makeEndpointId(m_sender);

Am I doing something wrong here? If not then I think we should keep the EndpointId in Transport too and just change the description.

Actions #11

Updated by Davide Pesavento almost 6 years ago

Is the new typedef visible in transport.hpp? if not, you can't expect it to work. Also you need to change all Transport::EndpointId to just EndpointId.
To avoid including face.hpp from transport.hpp, it's probably better to keep the EndpointId typedef in transport.hpp but outside the Transport class.

Actions #12

Updated by Md Ashiqur Rahman almost 6 years ago

Thanks for the clarification.

Actions #13

Updated by Md Ashiqur Rahman almost 6 years ago

In NFD primary thread (i.e. except RIB), replace all usage of IncomingFaceIdTag with IncomingFaceTag.

Please correct me if I'm wrong. Does this mean that in ndn-cxx/lp/tags.hpp,

typedef SimpleTag<uint64_t, 10> IncomingFaceIdTag;

becomes:

typedef SimpleTag<FaceEndpoint, 10> IncomingFaceTag;

Where FaceEndpoint is similar to #4849?

Actions #14

Updated by Junxiao Shi almost 6 years ago

FaceEndpoint itself can be extended to be used as a tag. No need for SimpleTag wrapper.

Actions #15

Updated by Davide Pesavento almost 6 years ago

Junxiao Shi wrote:

FaceEndpoint itself can be extended to be used as a tag. No need for SimpleTag wrapper.

Please elaborate, I don't understand what you want to do. NFD's Face object is meaningless outside NFD code, in library code and tools we use the face ID.

Actions #16

Updated by Md Ashiqur Rahman almost 6 years ago

Davide Pesavento wrote:

Junxiao Shi wrote:

FaceEndpoint itself can be extended to be used as a tag. No need for SimpleTag wrapper.

Please elaborate, I don't understand what you want to do. NFD's Face object is meaningless outside NFD code, in library code and tools we use the face ID.

I'm guessing the question is for me. Yes, there's no need to keep Face outside NFD. I just asked to confirm if we want the FaceId and EndpointId together in one tag or have separate tags. From what Junxiao mentioned, we can have:

typedef SimpleTag<std::pair<uint64_t, uint64_t>, 10> IncomingFaceIdEndpointTag;

But to make it better representable, I think he's suggesting to use a class similar to FaceEndpoint. If we follow this, then for the tag, the class should be FaceIdEndpoint with members (FaceId,Endpoint,TypeId).

Actions #17

Updated by Md Ashiqur Rahman almost 6 years ago

  • % Done changed from 0 to 20
Actions #18

Updated by Junxiao Shi almost 6 years ago

  • Status changed from New to In Progress

https://gerrit.named-data.net/5304 is a mechanical change.
Within GenericLinkService, reassembler and reliability both need to be aware of EndpointId.
Reassembler already handles EndpointId internally and no change is necessary.

For reliability, there are three choices:

  • Create one instance of LpReliability for each endpointId.
  • Add endpointId awareness into LpReliability type.
  • Do not enable reliability for non-zero endpointId: don't call handleOutgoing and piggyback when sending to non-zero endpointId; consequently, LpReliability would not track or attempt to retransmit these frames.
Actions #19

Updated by Md Ashiqur Rahman almost 6 years ago

We still haven't addressed the issue of why other strategies raise segmentation fault error at runtime in unit tests. I saw the logs in here: http://jenkins.named-data.net/job/NFD/OS=Ubuntu-16.04-64bit/6857/console

Not fully understanding the reason. But what I can see, seems like buggy Address is being set.


Address 0x7f1dd52ef880 is located in stack of thread T0 at offset 0 in frame

SUMMARY: AddressSanitizer: stack-use-after-return /usr/include/boost/asio/detail/thread_info_base.hpp:45 boost::asio::detail::thread_info_base::allocate(boost::asio::detail::thread_info_base*, unsigned long)
Shadow bytes around the buggy address:
0x0fe43aa55ec0: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 00 00 00 00
0x0fe43aa55ed0: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 00 00 00 00
0x0fe43aa55ee0: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 00 00 00 00
0x0fe43aa55ef0: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 00 00 00 00
0x0fe43aa55f00: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 00 00 00 00
=>0x0fe43aa55f10:[f5]f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 00 00 00 00
0x0fe43aa55f20: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
0x0fe43aa55f30: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
0x0fe43aa55f40: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
0x0fe43aa55f50: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 00 00 00 00
0x0fe43aa55f60: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 00 00 00 00

Actions #20

Updated by Davide Pesavento almost 6 years ago

Junxiao Shi wrote:

For reliability, there are three choices:

  • Create one instance of LpReliability for each endpointId.

This seems to be a good compromise for the time being, and shouldn't be too complicated to implement.

Actions #21

Updated by Md Ashiqur Rahman almost 6 years ago

Davide Pesavento wrote:

Junxiao Shi wrote:

For reliability, there are three choices:

  • Create one instance of LpReliability for each endpointId.

This seems to be a good compromise for the time being, and shouldn't be too complicated to implement.

Does this mean having something similar to a map container for m_reliability to maintain per-endpointId instance?

Actions #22

Updated by Md Ashiqur Rahman over 5 years ago

I am kind of lost here. I understand that an instance of LpReliability can be created with say an std::unordered_map. However, as the LpReliability sends out an Lp packet through the LinkService, with this line in LpReliability::onLpPacketLost(lp::Sequence txSeq) -

m_linkService->sendLpPacket(lp::Packet(newTxFrag.pkt), 0);

This requires to know the EndpointId. If that's the case then LpReliability also needs to be aware of EndpointId. Then why create an LpReliability instance per EndpointId if LpReliability eventually needs to be aware of EndpointId anyway?

Actions #23

Updated by Davide Pesavento over 5 years ago

Md Ashiqur Rahman wrote:

This requires to know the EndpointId.

Yes, each instance of LpReliability knows its own EndpointId. It's effectively a constant within each LpReliability instance.

Alternatively, GenericLinkService can supply a callback to LpReliability's constructor that LpReliability uses instead of calling sendLpPacket(). In this case, the EndpointId is already bound to the callback, so LpReliability doesn't need to know its EndpointId at all. However, this approach is slightly less performant because it involves an std::function.

If that's the case then LpReliability also needs to be aware of EndpointId.

No, that doesn't mean "being aware of EndpointId". LpReliability blindly passes the provided EndpointId along to the functions that need it, but other than that, no LpReliability logic depends on EndpointId.

Of course you're always free to choose option 2 of note-18, if you think that's easier to implement for you.

Actions #24

Updated by Md Ashiqur Rahman over 5 years ago

I think this is more like a c++ question:
I'm using this for per EndpointId instance of LpReliability in GenericLinkService:

std::unordered_map<EndpointId, LpReliability> m_reliability;

But how should I initialize an instance of LpReliability for an EndpointId? Currently, I am looking to do this:

m_reliability[endpointId] = LpReliability(m_options.reliabilityOptions, this);

However, it's definitely not going to work as LpReliability and also LinkService is noncopyable.

Other than this initialization problem, I think I have a good draft implementation for the desired behavior.

Actions #26

Updated by Md Ashiqur Rahman over 5 years ago

I tried using this:

m_reliability.emplace(endpointId, LpReliability(m_options.reliabilityOptions, this));

But getting this error:

/usr/include/c++/7/ext/new_allocator.h:136:4: error: no matching function for call to 
std::pair<const long unsigned int, nfd::face::LpReliability>::pair(long unsigned int&, nfd::face::LpReliability)
  { ::new((void *)__p) _Up(std::forward<_Args>(__args)...); }

complete log: https://gist.github.com/ashiqopu/2849352e2d1a80e8c6d6496e4567b068

Actions #27

Updated by Davide Pesavento over 5 years ago

m_reliability.emplace(std::piecewise_construct,
                      std::forward_as_tuple(endpointId),
                      std::forward_as_tuple(m_options.reliabilityOptions, this));
Actions #28

Updated by Md Ashiqur Rahman over 5 years ago

Thanks. worked. I tried the one you provided earlier and was still getting error as I was simply trying to access element as m_reliability[endpointId].Foo() which does not work. Had to use `m_reliability.at(endpointId).Foo(). Compiles ok now. I will push the patch after I fix the unit test.

Actions #29

Updated by Davide Pesavento over 5 years ago

Md Ashiqur Rahman wrote:

I tried the one you provided earlier and was still getting error

It is the exact same function, you were just using it incorrectly.

I was simply trying to access element as m_reliability[endpointId].Foo() which does not work. Had to use `m_reliability.at(endpointId).Foo().

This is a different problem and not related to emplace.

Actions #30

Updated by Davide Pesavento over 5 years ago

  • % Done changed from 20 to 40
Actions #31

Updated by Md Ashiqur Rahman over 5 years ago

One possible way to clear LpReliability instance tied to an EndpointId could be to keep the latest timestamp of the instance being used and clear after some amount of inactive time (e.g. 1s or 500ms).

Actions #32

Updated by Davide Pesavento over 5 years ago

Md Ashiqur Rahman wrote:

One possible way to clear LpReliability instance tied to an EndpointId could be to keep the latest timestamp of the instance being used and clear after some amount of inactive time (e.g. 1s or 500ms).

Why such a short timeout? It seems to me that it could easily be in the order of several seconds or even minutes...

Also, when you destruct and recreate LpReliability for a given endpoint, the sequence number will reset, right? How is the other peer going to react to that?

Actions #33

Updated by Md Ashiqur Rahman over 5 years ago

Davide Pesavento wrote:

Md Ashiqur Rahman wrote:

One possible way to clear LpReliability instance tied to an EndpointId could be to keep the latest timestamp of the instance being used and clear after some amount of inactive time (e.g. 1s or 500ms).

Why such a short timeout? It seems to me that it could easily be in the order of several seconds or even minutes...

This is just an examples. Of course the interval can be set much higher. But could this interval be related to how stable/unstable the link might be? For example, in wired networks where links are more stable, an interval in minutes makes sense. However, in a highly dynamic vehicular network, a higher interval might create/keep a large number of instances. What could be the overhead of having large number of LpReliability Instances? Any thoughts or suggestions?

Also, when you destruct and recreate LpReliability for a given endpoint, the sequence number will reset, right? How is the other peer going to react to that?

Yes. In this case, a naive choice would be to set a large interval before destroying an instance. But then the question of overhead comes again. I am not sure what to do exactly. Maybe attach some sort of signal from the Packet sender side and the peer who is responsible for sending ACKs reacts to those signals?

Actions #34

Updated by Md Ashiqur Rahman over 5 years ago

I have been trying to figure out what's causing the build to fail: http://jenkins.named-data.net/job/NFD/7083/
Looking into the logs, is it possible that std::unordered_map::erase is only removing the pointer to the list that only contains pointer to the actual <key, value> stored in the memory and not clearing the actual values right away?

I found this stackoverflow link as a reference but not sure.

I also looked at boost::unordered_map here. Can I try this instead of std::unordered_map?

Actions #35

Updated by Davide Pesavento over 5 years ago

Something I don't quite understand in the overall design: how does the link service know if a particular feature (e.g. LpReliability) applies to a given endpoint? For instance, LpReliability doesn't make sense for multi-access links. This used to be easy to check: if the underlying transport is not point-to-point, disable (or don't enable) the reliability feature. After the EndpointId refactoring, the underlying transport may use unicast or multicast depending on the EndpointId, so how can the link service know if LpReliability can be enabled for a given endpoint?

Actions #36

Updated by Md Ashiqur Rahman over 5 years ago

I'll be honest here, I don't know this either. But then again, the changes made are making everything "per endpoint" dependant decision. Maybe Junxiao can provide a better answer to this.

Actions #37

Updated by Davide Pesavento about 5 years ago

  • Target version deleted (v0.7)
Actions #38

Updated by Davide Pesavento about 5 years ago

  • Blocks deleted (Feature #4281: Develop self-learning for broadcast and ad hoc wireless faces)
Actions #39

Updated by Davide Pesavento about 5 years ago

  • Status changed from In Progress to Abandoned

No longer needed with the new design.

Actions

Also available in: Atom PDF