Project

General

Profile

Actions

Feature #2565

closed

Implement async operations for Face.processEvents

Added by Alex Afanasyev over 9 years ago. Updated over 6 years ago.

Status:
Closed
Priority:
High
Assignee:
-
Start date:
Due date:
% Done:

90%

Estimated time:

Description

Right now, one is required to have an infinite loop with with polling of processEvents.

I believe, this can be done using java.nio


Related issues 1 (0 open1 closed)

Blocked by jndn - Task #2860: expressInterest should queue interests while async I/O is connectingClosed06/08/2015

Actions
Actions #1

Updated by Jeff Burke over 9 years ago

+1. Any thoughts on priority relative to other issues?

Actions #2

Updated by Alex Afanasyev over 9 years ago

On my list it is near the top. Infinite loops with poll and sleep operations, specifically in android case would unnecessarily drain battery, as opposed to true async operations where use processing power/battery only happens when processing is really necessary.

Actions #3

Updated by Anonymous over 9 years ago

  • Priority changed from Normal to High
Actions #4

Updated by Anonymous over 9 years ago

If the application's onInterest or onData callback throws an exception (in the nio thread), where would you expect the exception to be handled? Is it good enough for nio to catch and ignore/log it?

Actions #5

Updated by Anonymous over 9 years ago

For example, it doesn't look like ndn-cxx catches exceptions in the async I/O callback. If onInterest throws an exception, the application quits. (I tried this with the ndn-cxx producer example.)
https://github.com/named-data/ndn-cxx/blob/02a4bf3f7f2673412fc6c9001ae93779a1a013e4/src/transport/stream-transport.hpp#L221

Should the application be responsible for deciding whether to catch exceptions in the callback? If the library catches and skips exceptions in the base async I/O callback, it may remove some flexibility from the application, and the throw/catch in every I/O callback may be overhead.

Actions #6

Updated by Andrew Brown over 9 years ago

Jeff, what's your opinion on having some type of callback to handle those exceptions? That way I can avoid having a try-catch block in every single OnInterest since any type of IO could throw an IOException; I could just instantiate one handler somewhere and pass it in like registerPrefix(prefix, onInterestCallback, onRegisterFailed, onInterestCallbackFailed).

Actions #8

Updated by Anonymous over 9 years ago

Thanks for the reference in the NIO classes.

At least in Java, the application is already forced to handle I/O exceptions in the callback. E.g.:
https://github.com/named-data/jndn/blob/e3ae21611c2489ab807b0d0e4b4773adfeae2019/examples/src/net/named_data/jndn/tests/TestPublishAsyncNfd.java#L188

The Java Logger support already accepts handlers. Could we use that infrastructure?

Actions #9

Updated by Alex Afanasyev over 9 years ago

Exceptions that are not explicitly handled within callback are in most cases critical. NFD/NLSR and some other apps catch exception around processEvents (io_service::run()), but also terminate the application with error exit code.

Actions #10

Updated by Anonymous over 9 years ago

Hi Alex. For the interest timeout it is possible to schedule an event to fire at the time of expiration. But ndn-cxx has a loop which runs every 100 ms to check for expiration.
https://github.com/named-data/ndn-cxx/blob/c759a20046bce82a83c7956d3a28739b0be1f982/src/detail/face-impl.hpp#L330

What are the advantages of running a 100 ms loop instead of setting a timer to fire when the interest expires?

Actions #11

Updated by Alex Afanasyev over 9 years ago

This "loop" (actually it is event rescheduled every 100ms) is just a legacy part that is in pipeline to be replaced to just scheduled events per interest. I just didn't have time yet to finalize the implementation.

Actions #12

Updated by Anonymous over 9 years ago

OK, thanks. I'll try to use a scheduled event for the interest timeout.

Actions #13

Updated by Anonymous over 9 years ago

Hi Andrew. You say "the two NIO classes would be ...." but don't we need to use the async versions such as https://docs.oracle.com/javase/7/docs/api/java/nio/channels/AsynchronousSocketChannel.html ?

Actions #14

Updated by Andrew Brown over 9 years ago

I hadn't seen that one; I guess I thought the point of using NIO for jndn was to avoid the while(true){ face.processEvents(); Thread.sleep(10); } everywhere. But now that I think of it, both of the classes we suggested need some thread somewhere doing something similar: the AsynchronousSocketChannel (from what I can tell) would need a loop checking if the Future returned from read() is finished and the ServerSocketChannel I recommended would need a loop somewhere that checks if a SelectionKey has been triggered (I was thinking you would want to use https://docs.oracle.com/javase/7/docs/api/java/nio/channels/spi/AbstractSelectableChannel.html#register(java.nio.channels.Selector,%20int,%20java.lang.Object) and then poll for events; from what I've read this uses something fast like epoll under the hood). But both need the loop so I'm not sure how to get around having another thread running.

Actions #15

Updated by Anonymous over 9 years ago

Yes, Java nio seems to have a different approach from Python, JavaScript and Boost. In those the library provides an event loop, and when you connect you pass a callback which is repeatedly called on the event loop's thread when data is received. But in Java nio it seems that you need to set up your own event loop and call read with a CompletionHandler which is called ONCE when data is received. Then the CompletionHandler has to call read again.
https://docs.oracle.com/javase/7/docs/api/java/nio/channels/AsynchronousSocketChannel.html#read%28java.nio.ByteBuffer,%20A,%20java.nio.channels.CompletionHandler%29

For the event loop thread, it looks like the application needs to set up an ExecutorService. Do you have any experience with this?
http://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ExecutorService.html

Actions #16

Updated by Andrew Brown over 9 years ago

Yeah, I implemented my event loop with https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/Executors.html#newScheduledThreadPool(int) and a small number of threads; it is handy for other things too so I use it globally throughout the application (probably a lot like the IO service Alex talks about). If we go that route, I would prefer some way to be able to pass in a ScheduledExecutorService so that I don't have to have two thread pools for the application.

Re: AsynchronousSocketChannel vs ServerSocketChannel vs current implementation... if we are using a thread pool (or single thread) to turn the crank of face.processEvents(), do we even need NIO? Because the face.processEvents() reads bytes sequentially off the wire, there is no speed-up by making that async, right? What might be a speed-up is to dispatch the OnInterestCallback/OnData/etc. tasks to the thread pool so that the processEvents() doesn't block while the user is executing some lengthy operation.

Actions #17

Updated by Anonymous over 9 years ago

Yes, the application will provide the thread pool to the library.

About the need for NIO, the concern is not for speed-up but for CPU/power consumption. Needing a low-power device to run a loop that checks every 10 ms can use more CPU/power than relying on NIO which can be idle and only "wake up" when there is an event to process.

Even though async I/O and thread safety are theoretically separate, in practice they go together. For example, the pending interest table is manipulated in response to async I/O events and these events and other manipulation of the table need to be coordinated in a thread-safe way. The most straight-forward solution is to dispatch all operations (like expressInterest) to the same thread that handles async I/O.

Actions #18

Updated by Andrew Brown over 9 years ago

Makes sense; I wonder if there is a good way to benchmark the before and after for both speed and CPU time.

Actions #19

Updated by Anonymous over 9 years ago

My idea for real-world CPU benchmarks of the event loop is to run on a Raspberry Pi and look at uptime. Hopefully, the NIO-based code will have a lower CPU load while waiting than the "processEvents infinite loop" method. I don't have a good idea to test speed since this relies on interaction with a forwarder.

Actions #20

Updated by Anonymous over 9 years ago

Hi Andrew. Thanks for the mention of ScheduledExecutorService, but it seems that this schedules code to run on some thread in the thread pool, but you don't know which one. How can this be used to solve the concurrency problem where you want to schedule tasks to run in the same thread so that the don't conflict?

Actions #21

Updated by Andrew Brown over 9 years ago

I mean speed like how many more bytes could the async version process if the network channel is always saturated with bytes; on a 2-core machine it should be a lot faster if it can dispatch handlers to the second core and continue reading bytes.

I don't think any of the ExecutorService derivatives let you do that except maybe the fixed thread pool (and setting it to 1, which is a bit limiting). What about making the PIT thread-safe? I think it's currently an ArrayList... what about http://docs.oracle.com/javase/7/docs/api/java/util/Collections.html#synchronizedList%28java.util.List%29?

Actions #22

Updated by Anonymous over 9 years ago

The problem with synchronized is that a call to a synchronized method is effectively a blocking call (until the other thread is finished with the synchronized code). The other libraries are able to do a non-blocking dispatch to a processing thread. I feel frustrated that it seems that Java doesn't have any utilities to allow. In the Java world, is there a philosophical stance that synchronized code is the only way to go?

Actions #23

Updated by Anonymous over 9 years ago

.. or is the idea that code should use both: dispatch to "some thread" in a thread pool where that thread does a blocking call to a synchronized method?

Actions #24

Updated by Andrew Brown over 9 years ago

Ha, I hear you; the benefits of a single managed event loop are awesome, even if it does make JavaScript single-threaded. I don't know if I can speak for all of Java but I think they would say something like, "build your own event loop thread and put a queue on it; add tasks to the end of the queue and the thread polls the front". I guess that is an option but then then we wouldn't need the ExecutorService (or I guess we could, we could run the event loop thread inside the ExecutorService).

Another option is to not make the method itself synchronized but to only synchronize on the smallest common denominator; if the PIT were a LinkedList two different threads could search through the PIT and only lock the LinkedList once they find the entry to remove; since LinkedList iterator.remove() is a O(1) operation so this shouldn't be for very long.

Actions #25

Updated by Anonymous over 9 years ago

  • Assignee set to Anonymous

Hi Andrew. For starters, I'll switch the data structures like pendingInterestTable_ and interestFilterTable_ to use synchronizedList and to use a synchronized block when iterating. (This is already hard enough to get right, so we'll save even more fine-grained optimization for later.) Once that's done I'll make a Face subclass which takes a ScheduledExecutorService and overrides the methods like expressInterest to dispatch to it.

Actions #26

Updated by Andrew Brown over 9 years ago

Sounds good; are you working in a specific branch?

Actions #27

Updated by Anonymous over 9 years ago

Yes, still in progress. Branch issue/2565-async-IO
https://github.com/named-data/jndn/tree/issue/2565-async-IO

Actions #28

Updated by Anonymous over 9 years ago

Here's the commit to use synchronizedList for pendingInterestTable, etc. I tried to be careful not to call the callbacks while the list is locked.
https://github.com/named-data/jndn/commit/e08a2233d8bd415c3b67dd53ffa8d1fb92cc74ba

Actions #29

Updated by Anonymous over 9 years ago

Hi Andrew. A sanity check: In Python we have a Face subclass to dispatch pretty much every method call to the single event loop, but this is for thread-safe access to the data structures like the pending interest table. Dispatching also passes I/O operations to the event loop. In Java, however, we are using synchronizedList so we don't need to dispatch for the purpose of thread safety. So the Face subclass only need to dispatch to the ScheduledExecutorService for doing I/O operations. Is that right?

Actions #30

Updated by Andrew Brown over 9 years ago

Yes, I believe you are right: if I understand the commit correctly, you use Collections.synchronizedList so any add() operations are synchronized by that wrapper and in all cases where you iterate and remove entries you lock the whole list. That should prevent concurrency issues with the lists so only IO operations need the event loop.

A couple of thoughts (mostly oriented towards optimization so feel free to disregard for now):

  • you have the setIsRemoved() on the PIT entries; have you thought of just setting that and not removing the entry immediately, instead having one cleanup thread in the event loop do the locking and removing periodically? The other tables could use a similar idea.
  • if you can guarantee that entries are always added to the end, you could lock for less time (and speed up remove()) by using the LinkedList from #24
  • on line 477, have you thought of sending filter callbacks to the event loop? That way you can guarantee that even if the user (me) writes long-running callbacks your processEvents() doesn't lock up. The same may be true elsewhere (or is this the sole entry point to onInterestCallback?).
Actions #31

Updated by Anonymous over 9 years ago

  • you have the setIsRemoved() on the PIT entries; have you thought of just setting that and not removing the entry immediately, instead having one cleanup thread in the event loop do the locking and removing periodically? The other tables could use a similar idea.

The idea of setIsRemoved() is to make the "callLater" callback for the timeout quickly terminate. You say "having one cleanup thread in the event loop" but the idea is that with async I/O and "callLater" support for timeouts that there is no more loop doing periodic checks. It there's nothing to do (including to check for stuff to clean up), then the application is idle.

  • if you can guarantee that entries are always added to the end, you could lock for less time (and speed up remove()) by using the LinkedList from #24

Linked lists are neat but churn more heap memory by creating all the little liked list nodes. Maybe I shouldn't worry about that?

  • on line 477, have you thought of sending filter callbacks to the event loop? That way you can guarantee that even if the user (me) writes long-running callbacks your processEvents() doesn't lock up. The same may be true elsewhere (or is this the sole entry point to onInterestCallback?).

Glad you asked. I think the answer depends on the following issue. I've read the documentation for AsynchronousSocketChannel.read with CompletionHandler but I still don't see the answer to the following question. On which thread is the CompletionHandler callback called? On the thread which calls read? If not then is it called on some random thread?

Actions #32

Updated by Andrew Brown over 9 years ago

From slide 6 and slide 27 at http://openjdk.java.net/projects/nio/presentations/TS-4222.pdf: I read that to mean that we can't really know what thread will run the CompletionHandler; it "may" be the same thread as the IO reader or it may not.

And see slide 33: they don't want any blocking operations inside that CompletionHandler or it may freeze up the IO. That was why I wanted to get the onInterestCallbacks out of there; so that the CompletionHandlers could finish even if someone's onInterestCallback takes forever. We could submit() the onInterestCallbacks to the eventLoop and they would get thread time eventually.

Actions #33

Updated by Anonymous over 9 years ago

I want to keep the base class agnostic about which (if any) thread pool implementation is being used. So I don't want the interest receiving code itself to submit the callback to a thread pool. The plan is to make a subclass of Face called ThreadsafeFace which takes (a particular implementation of) a thread pool. So I propose that the override for registerPrefix can put the supplied OnInterestCalback in a wrapper which takes care of submitting it to the thread pool when called. This way, the base implementation doesn't need to change. Would that work?

Actions #34

Updated by Andrew Brown over 9 years ago

Yes, makes sense; like we talked about on the phone, some CPU usage benchmarks between the AsyncSocketChannel and the current processEvents() would be good before we do all of that and the necessary synchronizations.

Actions #35

Updated by Anonymous over 9 years ago

  • Blocked by Task #2860: expressInterest should queue interests while async I/O is connecting added
Actions #36

Updated by Anonymous over 9 years ago

I did some tests on my Raspberry Pi using "top" for the client app which sends an interest and waits for the reply. The version with the "processEvents loop" uses 3.5% of the CPU. The version with the ScheduledExecutorService thread pool uses 0.3% or 0% of the CPU (the java process drops off the "top" monitor). So yes, it does seem worth getting the ScheduledExecutorService working.

Actions #37

Updated by Anonymous over 9 years ago

  • Status changed from New to In Progress

Regarding note #33, here how I did the override for onData in expressInterest to submit it to the thread pool:
https://github.com/named-data/jndn/blob/c6cf1557e07437d8fefd8d8a332b2e53c6cce2aa/src/net/named_data/jndn/ThreadPoolFace.java#L78

Actions #38

Updated by Andrew Brown over 9 years ago

Some comments after looking at https://github.com/named-data/jndn/blob/issue/2565-async-IO:

Actions #39

Updated by Anonymous over 9 years ago

See comments inline based on our discussion.

Done. https://github.com/named-data/jndn/commit/ff790ee3e1538e3d3a1d5f281a18e809137e624e

  • ThreadPoolFace.processEvents() will call Face.processEvents(); shouldn't this be overriden to do nothing since all events will be handled asynchronously?

ThreadPoolFace has a constructor which takes a Transport which may not use async I/O, in which case the application may still need to use a processEvents loop. This is mentioned in the doc comment. https://github.com/named-data/jndn/blob/d871cb549ee79d45ced5a76ab0774a906e2893f4/src/net/named_data/jndn/ThreadPoolFace.java#L40

Agreed that it would clarify the code to split DelayedCallTable, etc into their own classes. I'll report on a later note in this issue. I don't think it would clarify to make a DescendingSynchronizedTable since this "specialized generalized" class may cause more confusion than it clears up. Note that this also applies to the Node classes in the other CCL libraries, so for consistency I should change those too.

  • Trivial note: if we do create new table classes, can we move those and their entry classes (e.g. InterestFilterEntry) into a new package as separate files?

We'll put these in a subfolder called impl.

  • Even more trivial: Logger.getLogger(...) use could/should be converted to a private static final for the class.

Done. https://github.com/named-data/jndn/commit/1056ac39a331801009a8834b2faa9e02918fcf23

Yes, it's going away in a few months. No need to pay much attention to improving that code.

OK, I removed the TODO and updated the comments. https://github.com/named-data/jndn/commit/cc4297cd2f262b547c0f812282683bcaae4eff47

The plan is to add a simple method to Transport called isAsync() for expressInterest to use. If not isAsync() then just call expressInterestHelper which should make the code a lot easier to follow for the non-async case. I'll report in another note in this thread. (Note that this applies to the other CCL libraries too.)

For the code that is not in the application's callback, if an exception is thrown then it is pretty severe, like out-of-memory. The plan is to log these exceptions and keep processing (as opposed to abandoning the thread). We will make it clear in the documentation and in the example apps that the user's callback should add its own outer try/catch. I'll report in another note in this thread. (Note that this applies to the other CCL libraries too.)

Actions #40

Updated by Anonymous about 9 years ago

See comments inline for updates.

Done. For example see the updates for your link above for removePendingInterest and removeRegisteredPrefix: https://github.com/named-data/jndn/blob/61b1232e76b388f195a2d66c3863ff5d6c0160da/src/net/named_data/jndn/Node.java#L194, https://github.com/named-data/jndn/blob/61b1232e76b388f195a2d66c3863ff5d6c0160da/src/net/named_data/jndn/Node.java#L298 . I tried moving the code around onConnectedCallbacks_, but it didn't seem to help clarify so I left it.

Done. I added Transport.isAsync and the sync portion of expressInterest is much simpler: https://github.com/named-data/jndn/blob/61b1232e76b388f195a2d66c3863ff5d6c0160da/src/net/named_data/jndn/Node.java#L111 .

  • For the code that is not in the application's callback, if an exception is thrown then it is pretty severe, like out-of-memory. The plan is to log these exceptions and keep processing (as opposed to abandoning the thread). We will make it clear in the documentation and in the example apps that the user's callback should add its own outer try/catch. I'll report in another note in this thread. (Note that this applies to the other CCL libraries too.)

Since we need to catch exceptions in callbacks throughout the library (not just in I/O), I added this as a separate Redmine task #3152.

Actions #41

Updated by Anonymous about 9 years ago

  • % Done changed from 0 to 90

The code is merged to the master branch. I'm keeping this issue open because it was originally to avoid the processEvents on Android, and we still need the Android version of AsyncTcpTransport (since Android doesn't support nio async I/O).

Actions #42

Updated by Anonymous over 8 years ago

Hi Alex. I did some searching, but I can't find the async I/O support for Android (since it doesn't have nio async I/O). Do you know where I can look?

Actions #43

Updated by Andrew Brown over 8 years ago

Jeff, I was actually just looking at this issue to try to implement some type of reconnection logic if the link goes down. Since there is no API callback the user can pass to be notified of disconnection (and there probably shouldn't be, other protocols don't connect/disconnect), I was thinking of trying to implement some type of retry mechanism down at the AsyncTcpTransport level. Have you thought of this or already started something similar?

Actions #44

Updated by Anonymous over 8 years ago

Hi Andrew. No, I haven't thought much about reconnection logic. Interesting that you suggest an async transport might handle it differently than a blocking transport. You made some comments in issue #2075 note 6. I suppose an async transport wouldn't care about processEvents.

Actions #45

Updated by Andrew Brown over 8 years ago

Perfect. I'll continue the discussion there since that issue is exactly what I am talking about.

Actions #46

Updated by Teng Liang almost 7 years ago

I have implemented an NDN application in Java using AsyncTcpTransport, and I am porting it to Android. The current maven file doesn't include the AsyncTcpTransport when generating Jar for Android. However, it seems that AsynchronousChannel is added in the new Android API https://developer.android.com/reference/java/nio/channels/AsynchronousChannel.html. Can we also update Jndn as well for Android?

Actions #47

Updated by Anonymous over 6 years ago

jNDN version 0.16 added pom-android-with-async-io.xml since asynchronous was added in API level 26 .

Actions #48

Updated by Anonymous over 6 years ago

  • Status changed from In Progress to Closed
Actions

Also available in: Atom PDF