Task #4606: TestChronoSync: All users not receiving chat messages - jndn - NDN project issue tracking system

Task #4606

closed

TestChronoSync: All users not receiving chat messages

Added by Anonymous about 7 years ago. Updated over 6 years ago.

Status:

Closed

Priority:

Normal

Assignee:

Start date:

05/03/2018

Due date:

% Done:

Estimated time:

Description

From Price Clarke:

I started testing ChronoSync with the TestChronoSync demo code and kept running into duplicate and lost messages in a "perfect" network environment. So, using a slightly modified version of TestChronoSync.java, I made a program that wraps around the 'Chat' class and simulates concurrent chat users sending a predetermined list of chat messages: https://github.com/gpwclark/chronosync-chat-simulation (I had to pull out some of the internal classes, etc., but left the actual code and callbacks the same except in a few key areas, e.g. I had to change the System.out.println() responsible for printing new chat messages into a function call). In the README.md there are directions on how you can vary the number of participants and messages. In addition, I made a mock chat class to demonstrate correctness as a control scenario. My a priori reasoning is:

1. ChronoSync does not properly inform every chatter about every other bit of data generated by every other user in the form of a named interest they can use to fetch the data.
2. Every chat user expresses interest in every bit of data every other user has generated, but the data is not delivered because it either:

   a. failed in the onInterest callback of the producer, or
   b. failed to be delivered by NFD, or
   c. failed in the onData callback of the consumer.

Or it is a unique combination of the two problems and the subproblems involving data delivery. My fear is that the issue is with 1, as the actual TestChronoSync code is very straightforward, so I'm currently working to test that. If you have any insights, or you think/know I'm wrong, I'm all ears!
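One way to tell these hypotheses apart is to record, for each (consumer, data name) pair, the furthest stage a message reached. The sketch below is only an illustration: `DeliveryTracker`, its stage names, and the key format are hypothetical and are not part of jNDN or the linked simulation. It assumes you can add a `record(...)` call at the publish site, in the sync-state callback, and in the producer's onInterest and the consumer's onData callbacks.

```java
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not part of jNDN): remembers the furthest stage each
// "consumer|dataName" pair reached, so lost messages can be grouped by stage.
class DeliveryTracker {
    enum Stage { PUBLISHED, ANNOUNCED, INTEREST_RECEIVED, DATA_SENT, DATA_RECEIVED }

    private final Map<String, Stage> furthest = new HashMap<>();

    // Call from the matching callback. A later stage replaces an earlier one;
    // an out-of-order earlier stage never overwrites a later one.
    synchronized void record(String key, Stage stage) {
        furthest.merge(key, stage, (a, b) -> a.compareTo(b) >= 0 ? a : b);
    }

    // For each stage, counts the keys that got exactly that far and no further.
    synchronized Map<Stage, Integer> summary() {
        Map<Stage, Integer> counts = new EnumMap<>(Stage.class);
        for (Stage s : Stage.values()) {
            counts.put(s, 0);
        }
        for (Stage s : furthest.values()) {
            counts.merge(s, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        DeliveryTracker tracker = new DeliveryTracker();
        tracker.record("bob|/chat/alice/1", Stage.PUBLISHED);
        tracker.record("bob|/chat/alice/1", Stage.ANNOUNCED);
        tracker.record("bob|/chat/alice/2", Stage.PUBLISHED);
        System.out.println(tracker.summary());
    }
}
```

Under these assumptions, keys stuck at PUBLISHED would point to problem 1 (sync never announced the name), keys stuck at INTEREST_RECEIVED would point to the producer's onInterest callback, and keys stuck at ANNOUNCED or DATA_SENT would point to a loss in transit or in the consumer's onData callback.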

Updated by Anonymous about 7 years ago

Hi Price. If multiple users issue new chat messages in less time than the ChronoSync system can reach a new consensus, it's possible that there could be dropped messages. This was discussed in the ChronoSync paper:
http://named-data.net/publications/chronosync/

ChronoSync works best in a system where each user is responding to another user's action (like in a chat). What is the minimum interval between messages in your example?

Updated by price clark about 7 years ago

Of course!

Currently the chat delay is

@Override
public long getChatDelayTime() {
    int range = 1000;   // number of possible delay steps
    int interval = 10;  // size of one step, in milliseconds
    // 1 <= n <= range
    int n = rand.nextInt(range) + 1;

    // (1 * interval) ms <= chatDelayTime <= (range * interval) ms
    return (long) n * interval;
}

So the range is currently 10 ms to 10000 ms. Given this distribution it's certainly possible that what you say is happening. I can certainly experiment with the interval as a delay time, or make it more like users are "responding" to others as you suggested.

When you say "if multiple users issue new chat messages in less time than the ChronoSync system can reach a new consensus", are you referring to the issues around "Handling simultaneous data generations"? If so, I feel that is an interesting future problem in the code base. As I understand it, if two users respond to the same sync interest at more or less the same time, they each disseminate sync data, which can then cause the chat room to partition into two groups that each have a different sync digest. The paper suggests using exclude filters, but that seems to be a big issue, right? Exclude filters are being deprecated: https://named-data.net/doc/NDN-packet-spec/current/changelog.html#version-0-3

I guess the real problem I'm running into is this: I started using exclude filters after reading https://named-data.net/wp-content/uploads/p68.pdf. I had the same problem as the authors: they needed to discover conferences and conference participants, and the only way to really do that client side with no a priori knowledge about state is to use some sort of broadcast route, so you can actually learn the names of the data you are interested in obtaining. But then I read that exclude filters were going to be deprecated, so I turned to ChronoSync to solve the problem I was having in my application around maintaining shared knowledge and giving clients the ability to discover named data. ChronoSync really does a great job of solving this problem, but I won't always be able to guarantee I won't run into simultaneous data generation. ChronoChat is about chatting, which requires physical human input, so sending chat messages as little as 10 ms apart is of course ludicrous.

But what if my clients are all generating data and are likely to generate it simultaneously? If exclude filters can no longer be used to separate the broadcast-style data I already know about from the broadcast-style data I desperately need, then how do I get the "unknown" data I need with a 1-1 interest-data response? There's essentially no guarantee that my interests will ever go to the "other" producer that has the broadcast-style data I want, in this case the producer of the sync digest I didn't receive after the simultaneous data generation scenario.

Anyway, ChronoSync is really cool and I'm very excited to be discussing NDN in a public forum. Thank you so much for implementing it in Java. I'm going to modify my chat delay times to be more "human" before I try to orchestrate interactivity between N threads, haha. If you have any thoughts about simultaneous data generation after the deprecation of exclude filters, please let me know. If I was way off base about what you meant by "less time than the ChronoSync system can reach a new consensus", then I look forward to hearing you discuss your thoughts in more detail!
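For illustration, a more "human" delay could simply be drawn from a range with a non-trivial lower bound. This is a minimal sketch, not the code from the linked repository: the class name `HumanChatDelay` and the 2000 ms and 10000 ms bounds used in `main` are assumptions chosen only for the example.

```java
import java.util.Random;

// Hypothetical variant of the delay shown earlier: a uniform delay in
// [minDelayMs, maxDelayMs], so messages are never just a few ms apart.
class HumanChatDelay {
    private final Random rand;
    private final long minDelayMs;
    private final long maxDelayMs;

    HumanChatDelay(Random rand, long minDelayMs, long maxDelayMs) {
        if (minDelayMs < 0 || maxDelayMs < minDelayMs) {
            throw new IllegalArgumentException("need 0 <= minDelayMs <= maxDelayMs");
        }
        this.rand = rand;
        this.minDelayMs = minDelayMs;
        this.maxDelayMs = maxDelayMs;
    }

    // minDelayMs <= result <= maxDelayMs
    long getChatDelayTime() {
        long span = maxDelayMs - minDelayMs + 1;
        // nextDouble() is in [0, 1), so the offset is in [0, span - 1].
        return minDelayMs + (long) (rand.nextDouble() * span);
    }

    public static void main(String[] args) {
        // Assumed bounds: between 2 and 10 seconds per message.
        HumanChatDelay delay = new HumanChatDelay(new Random(), 2000, 10000);
        System.out.println(delay.getChatDelayTime() + " ms");
    }
}
```

Note that a per-user minimum delay alone does not stop two different users from publishing at nearly the same moment; it only makes that less frequent.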

Updated by price clark about 7 years ago

I updated my code base to simulate a chat more in the spirit of what you suggested: to let ChronoSync "reach a consensus", simulated users now chat one at a time with a minimum delay between each message. The ChronoSync2013 class delivered on all its guarantees in this scenario. However, I'm confused as to why ChronoSync2013 requires this consensus. There are recovery procedures in place to handle the situation where a given client is given a digest with which it is unfamiliar, aren't there? So does "reach a consensus" really mean that new sync data needs to be disseminated to all ChronoSync clients before the network can handle new sync data? To what extent are recovery procedures expected to address this sort of issue? Since recovery interests don't suffer the same problem sync interests do (sync interests are always answered, recovery interests are only answered if possible), it seems to me that ChronoSync should be able to reach "eventual" consistency even if new data is published simultaneously on different clients, unless I'm missing something key or failing to comprehend some nuance in the interaction between NFD, ChronoSync, and ndn-ccl?
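To make the "one at a time with a minimum delay" scenario concrete, here is a rough sketch of a send schedule. It is an assumption about how such a simulation could be arranged, not the code in the repository: `TurnTakingSchedule`, the round-robin ordering, and the parameter names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical schedule builder: users take turns in round-robin order and
// consecutive messages (from any user) are at least minGapMs apart.
class TurnTakingSchedule {
    static final class Turn {
        final int user;          // index of the simulated user
        final int messageIndex;  // which of that user's messages to send
        final long sendAtMs;     // offset from the start of the simulation

        Turn(int user, int messageIndex, long sendAtMs) {
            this.user = user;
            this.messageIndex = messageIndex;
            this.sendAtMs = sendAtMs;
        }
    }

    static List<Turn> build(int users, int messagesPerUser, long minGapMs) {
        List<Turn> turns = new ArrayList<>();
        long sendAt = 0;
        for (int m = 0; m < messagesPerUser; m++) {
            for (int u = 0; u < users; u++) {
                turns.add(new Turn(u, m, sendAt));
                sendAt += minGapMs;  // the next speaker waits at least this long
            }
        }
        return turns;
    }

    public static void main(String[] args) {
        for (Turn t : build(3, 2, 500)) {
            System.out.println("t=" + t.sendAtMs + " ms: user " + t.user
                    + " sends message " + t.messageIndex);
        }
    }
}
```

Because only one user publishes per gap, no two clients answer the same sync interest at once, which is the situation the simultaneous data generation discussion is about.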

Updated by Anonymous about 7 years ago

The original design for conflict recovery (not in ChronoSync2013) requires the Interest Exclude filter, which is being deprecated for the new NDN packet format v0.3. Most design attention is on a newer sync protocol called state vector sync (SVS), which should solve some problems with simultaneous data generation without needing the Exclude filter. Some of the NDN team is scheduled to implement this in a C++ library this weekend. The jNDN implementation would follow. We need to see how this performs.
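As a rough illustration of why a state vector can tolerate simultaneous publishing: if each node tracks the highest sequence number it knows for every producer, two vectors can be merged by taking the per-producer maximum, so concurrent updates from different producers do not conflict. The sketch below shows only that general idea. It is not the SVS protocol or any jNDN API, and the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of a state vector: producer name -> highest
// sequence number known locally. Not the real SVS wire format or API.
class StateVector {
    private final Map<String, Long> latest = new TreeMap<>();

    long get(String producer) {
        return latest.getOrDefault(producer, 0L);
    }

    // Local publish: bump this producer's sequence number and return it.
    long publish(String producer) {
        return latest.merge(producer, 1L, Long::sum);
    }

    // Merge a remote vector by per-producer maximum. Returns, per producer,
    // the inclusive range {firstMissing, lastMissing} that should be fetched.
    Map<String, long[]> merge(Map<String, Long> remote) {
        Map<String, long[]> missing = new TreeMap<>();
        for (Map.Entry<String, Long> e : remote.entrySet()) {
            long known = get(e.getKey());
            if (e.getValue() > known) {
                missing.put(e.getKey(), new long[] { known + 1, e.getValue() });
                latest.put(e.getKey(), e.getValue());
            }
        }
        return missing;
    }

    Map<String, Long> snapshot() {
        return new TreeMap<>(latest);
    }

    public static void main(String[] args) {
        StateVector alice = new StateVector();
        StateVector bob = new StateVector();
        alice.publish("alice");  // both publish before hearing from each other
        bob.publish("bob");
        alice.merge(bob.snapshot());
        bob.merge(alice.snapshot());
        System.out.println(alice.snapshot().equals(bob.snapshot()));
    }
}
```

In this toy model both nodes end up with the same vector regardless of merge order, whereas a single digest over the whole state gives each node a different, mutually unrecognized value after simultaneous publishes.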
