Project

General

Profile

Actions

Feature #4003

closed

FaceMgmt: enable link reliability

Added by Junxiao Shi almost 8 years ago. Updated over 7 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
Management
Target version:
Start date:
Due date:
% Done:

100%

Estimated time:
3.00 h

Description

In FaceMgmt, add a flag to faces/create and faces/update commands to enable/disable best-effort link reliability.


Related issues 3 (0 open3 closed)

Related to NFD - Feature #3823: Congestion Control: design Local Link Loss DetectionClosedEric Newberry

Actions
Blocked by NFD - Feature #3931: Implement NDNLP Link Reliability ProtocolClosedEric Newberry

Actions
Blocks NFD - Feature #4004: nfdc face create: reliability optionClosedEric Newberry

Actions
Actions #1

Updated by Junxiao Shi almost 8 years ago

  • Blocked by Feature #3931: Implement NDNLP Link Reliability Protocol added
Actions #2

Updated by Davide Pesavento almost 8 years ago

Should this be generalized to all NDNLP features, such as fragmentation and reassembly?

Actions #3

Updated by Junxiao Shi almost 8 years ago

Should this be generalized to all NDNLP features, such as fragmentation and reassembly?

Not until there is a use case. Link reliability needs a flag because it has not-trivial overhead (Acks) so it should be enabled only if the link is lossy.

Actions #4

Updated by Junxiao Shi almost 8 years ago

  • Blocks Feature #4004: nfdc face create: reliability option added
Actions #5

Updated by Davide Pesavento almost 8 years ago

The use case is enabling/disabling fragmentation on UDP tunnels, so that we can use NDNLP fragmentation instead of relying on IP fragmentation.

Actions #6

Updated by Junxiao Shi almost 8 years ago

The use case is enabling/disabling fragmentation on UDP tunnels

Please discuss this use case in #4005.
This issue, #4003, is specifically about enabling link reliability.

Actions #7

Updated by Eric Newberry over 7 years ago

  • Status changed from New to In Progress

Starting design work

Actions #8

Updated by Eric Newberry over 7 years ago

A ReliabilityEnabled Flag bitfield has been added to FaceMgmt for faces/create and faces/update.

Actions #9

Updated by Eric Newberry over 7 years ago

  • % Done changed from 0 to 10

A change to ndn-cxx adding the Face reliability feature flag has been pushed to Gerrit: https://gerrit.named-data.net/#/c/4020/

Actions #10

Updated by Eric Newberry over 7 years ago

  • Related to Feature #3823: Congestion Control: design Local Link Loss Detection added
Actions #11

Updated by Eric Newberry over 7 years ago

The ReliabilityEnabled flag has been renamed to LpReliabilityEnabled.

Actions #12

Updated by Eric Newberry over 7 years ago

  • Status changed from In Progress to Code review
  • % Done changed from 10 to 100
Actions #13

Updated by Junxiao Shi over 7 years ago

are we allowing to enable LpReliability on any type of face? local faces too?

I think LpReliability should only be allowed on lossy links. We could consider all links as "lossy" unless specified otherwise. Therefore, the following LocalUri schemes are not lossy: unix, tcp4, tcp6, ws.

It has nothing to do with local or non-local. A non-local tcp6 face is still not lossy.

Actions #14

Updated by Eric Newberry over 7 years ago

Only allowing reliability to be enabled on lossy transports presents a design issue: how should reliability be enabled on new faces (and validated for appropriateness)?

The current design with enabling local fields adds a parameter to ProtocolFactory::createFace, which then validates the parameter's value according to the type of the factory. However, I'm concerned that we are adding too many parameters to this function, especially if we add additional management flags. Thus I currently see two solutions (there may be more):

  • Add the parameter to createFace (which I don't want to do). This also involves needing to update all factory test cases to use the new createFace syntax.

  • Create a separate validation function in each factory to validate face flags. This function would be called before the face was created (and the face wouldn't be created if any flags failed validation). The face options would then be adjusted after the face was created, similar to how faces/update works.

Actions #15

Updated by Davide Pesavento over 7 years ago

The current design with enabling local fields adds a parameter to ProtocolFactory::createFace, which then validates the parameter's value according to the type of the factory.

Actually, the current implementation of BIT_LOCAL_FIELDS_ENABLED is a bit inconsistent. createFace() doesn't validate it (except for a sanity in the success callback), while updateFace() does the validation on its own.

Add the parameter to createFace (which I don't want to do). This also involves needing to update all factory test cases to use the new createFace syntax.

I agree that this approach doesn't scale, thus I would be mildly against it.

Create a separate validation function in each factory to validate face flags. This function would be called before the face was created (and the face wouldn't be created if any flags failed validation). The face options would then be adjusted after the face was created, similar to how faces/update works.

Acceptable. This is also similar to how faces/update handles persistency changes (see canChangePersistencyTo()). However, a major complication is that flag validation may require an arbitrary amount of additional information. In other words, whether a flag value is valid or not does not depend exclusively on the type of factory or face. E.g.: BIT_LOCAL_FIELDS_ENABLED requires checking the face scope.

There's also a third option: instead of passing each ndn::nfd::BIT_* as a separate boolean parameter, pass a single flags parameter to createFace(). This approach alone doesn't solve the validation problem in updateFace() though.

Actions #16

Updated by Davide Pesavento over 7 years ago

Junxiao Shi wrote:

It has nothing to do with local or non-local. A non-local tcp6 face is still not lossy.

I know, but I don't think we should restrict this setting based on the underlying protocol. If someone wants to enable LpReliability on a TCP link, let them do it.

The rationale for prohibiting it on local faces was that on the other end of a local face there's an application, which doesn't know anything about the LpReliability protocol.

Actions #17

Updated by Eric Newberry over 7 years ago

Davide Pesavento wrote:

Junxiao Shi wrote:

It has nothing to do with local or non-local. A non-local tcp6 face is still not lossy.

I know, but I don't think we should restrict this setting based on the underlying protocol. If someone wants to enable LpReliability on a TCP link, let them do it.

I agree with Davide on this one. If reliability is enabled on a non-lossy face, it adds a bit of unnecessary overhead, but it doesn't (shouldn't) break anything.

The rationale for prohibiting it on local faces was that on the other end of a local face there's an application, which doesn't know anything about the LpReliability protocol.

This also makes sense to me. Otherwise, we'd need to implement reliability in the ndn-cxx face as well.

Actions #18

Updated by Junxiao Shi over 7 years ago

https://gerrit.named-data.net/4029 patchset3 changes ProtocolFactory::createFace function signature with a flags field. This is a bad trend.

  • Details of management protocol are going deeper and deeper into face system.
  • A widespread change in face system is needed upon each addition to the LinkService. While flags cloud reduce such situations, some future parameters not represented in the flags still require changes.

I'd suggest defining a face creation parameters struct. This struct is part of the face system rather than management. It collects all parameters needed for face creation, and is accepted on ProtocolFactory::createFace and passed down as needed.

struct FaceCreateParams
{
  FaceUri remoteUri;
  optional<FaceUri> localUri;
  FacePersistency persistency;
  bool wantLocalFields;
  bool wantLpReliability;
};

Then, each time adding a new face feature, it may still be necessary to make changes in the Factory-Channel-Transport hierarchy of every time, but the changes are limited to the "verify parameters" function for transports that do not support the new feature.

Note that if this solution is adopted, the refactoring shall be performed in a separate commit before the addition of LpRelibility.

Actions #19

Updated by Eric Newberry over 7 years ago

Junxiao Shi wrote:

Note that if this solution is adopted, the refactoring shall be performed in a separate commit before the addition of LpRelibility.

I like this design but I really don't see the need for a separate refactoring commit, since there would be about as many changes (to as many files) as there are in the existing commit.

Actions #20

Updated by Junxiao Shi over 7 years ago

A refactoring commit is needed so as to highlighting what are the changes to enable reliability.

Actions #21

Updated by Davide Pesavento over 7 years ago

From change 4029,3:

The scope needs to be determined before it can be tested whether LpReliability can be enabled and, as seen in TcpFactory, there can be some protocol-specific tests that need to be run first. Therefore, we either need to set this parameter after creating the face (as I suggested in an earlier comment on Redmine) or we need to push this logic down into the factory.

Exactly, this is what I was trying to say. What are the pros and cons of each of the two approaches?

Actions #22

Updated by Eric Newberry over 7 years ago

Davide Pesavento wrote:

From change 4029,3:

The scope needs to be determined before it can be tested whether LpReliability can be enabled and, as seen in TcpFactory, there can be some protocol-specific tests that need to be run first. Therefore, we either need to set this parameter after creating the face (as I suggested in an earlier comment on Redmine) or we need to push this logic down into the factory.

Exactly, this is what I was trying to say. What are the pros and cons of each of the two approaches?

Let's go through them:

Keeping the system as-is, with the checks pushed down into the specific factories:

  • + Face is never considered up with these features either incorrectly enabled/disabled
  • - Complicates parameters to createFace (even if they're in a struct, you need to set the values in that struct)

Creating the face with features at their default values and enabling them after creation (and checking the scope):

  • + Can eliminate wantLocalFieldsEnabled and wantLpReliabilityEnabled from the createFace method signature
  • - If this is a critical feature, the face may be up for a brief period without this feature enabled
Actions #23

Updated by Davide Pesavento over 7 years ago

ok, so for createFace the first option seems best. What about updateFace?

Actions #24

Updated by Eric Newberry over 7 years ago

Davide Pesavento wrote:

ok, so for createFace the first option seems best. What about updateFace?

I don't really have any complaints about the existing implementation of updateFace. As it is, updateFace checks for the appropriateness of enabling local fields given the scope of the existing face, as well as the persistency (through Transport::canChangePersistencyTo). Then, if all changes check out, it sets them to the new values. Otherwise, it fails to change any properties of the face. I can't really think of a better design for this at the moment.

Actions #25

Updated by Davide Pesavento over 7 years ago

But then the logic will be duplicated, once in updateFace and once somewhere in the factory in the face creation code path. Moreover, if the "appropriateness" check is factory-specific, it cannot be done in FaceManager::updateFace.

Actions #26

Updated by Eric Newberry over 7 years ago

Davide Pesavento wrote:

But then the logic will be duplicated, once in updateFace and once somewhere in the factory in the face creation code path. Moreover, if the "appropriateness" check is factory-specific, it cannot be done in FaceManager::updateFace.

The reason we need to move the check into the factories in the first option is because the scope of the face has yet to be determined before the face is created. However, the reason it can be done in updateFace is because, at this point, the scope has already been determined during face creation. FaceManager simply needs to check the scope of the face when validating these options.

Actions #27

Updated by Davide Pesavento over 7 years ago

Eric Newberry wrote:

The reason we need to move the check into the factories in the first option is because the scope of the face has yet to be determined before the face is created. However, the reason it can be done in updateFace is because, at this point, the scope has already been determined during face creation. FaceManager simply needs to check the scope of the face when validating these options.

I understand that, and I'm not disputing it. But you're missing the point about duplication, and the fact that a future check may be factory-specific, like the persistency check.

Actions #28

Updated by Eric Newberry over 7 years ago

Davide Pesavento wrote:

Eric Newberry wrote:

The reason we need to move the check into the factories in the first option is because the scope of the face has yet to be determined before the face is created. However, the reason it can be done in updateFace is because, at this point, the scope has already been determined during face creation. FaceManager simply needs to check the scope of the face when validating these options.

I understand that, and I'm not disputing it. But you're missing the point about duplication, and the fact that a future check may be factory-specific, like the persistency check.

I see what you mean now. In this case, we should probably push this check down into the factory as well. We might even be able to factor it out into a separate function.

Actions #29

Updated by Junxiao Shi over 7 years ago

20170807 NFD call decides that there's no need to prevent LpReliability enabled on TCP tunnels. Also, there's no need to prevent LpReliability enabled on local faces. This leads to less code.

Actions #30

Updated by Davide Pesavento over 7 years ago

  • Status changed from Code review to Closed
Actions

Also available in: Atom PDF