Bug #4866

open

ndnputchunks consumes a large amount of RAM

Added by Anonymous about 5 years ago. Updated about 4 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Start date:
Due date:
% Done:

0%

Estimated time:

Description

ndnputchunks needs over 3x as much memory as the size of the content that is served.

For example:

  • 100MB file -> needs 336MB RAM
  • 300MB file -> needs 980MB RAM

The required memory is completely freed up after closing ndnputchunks, so it's not about caching inside NFD.

I wonder what explains this inefficiency?

Moreover, is it possible to have ndnputchunks store files on the disk, in order to save memory?

Actions #1

Updated by Junxiao Shi about 5 years ago

  • Tracker changed from Task to Bug

I wonder what explains this inefficiency?

I’m not surprised. ndnputchunks puts options.maxSegmentSize octets into each packet, and the default is 4400 octets. A 300MiB file needs 71494 Data packets. 980MiB memory usage means 14KiB per packet.
Each Data packet has an internal buffer of 8800 octets regardless of how much payload it actually has. It also comes with a bunch of “parsed” fields indicating its structure, such as pointers to name component boundaries. These can easily add up to 14KiB.
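The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the function name is made up for illustration, and the 300 MiB and 980 MiB figures are the ones from the description, read here as binary units.

```python
import math

MAX_SEGMENT_SIZE = 4400          # default payload octets per Data packet (from this note)
FILE_SIZE = 300 * 1024 * 1024    # 300 MiB of content
MEMORY_USED = 980 * 1024 * 1024  # 980 MiB observed by the reporter


def per_packet_footprint(file_size, memory_used, segment_size):
    """Return (number of segments, average bytes of RAM per segment)."""
    segments = math.ceil(file_size / segment_size)
    return segments, memory_used / segments


segments, per_packet = per_packet_footprint(FILE_SIZE, MEMORY_USED, MAX_SEGMENT_SIZE)
print(segments)                     # 71494
print(round(per_packet / 1024, 1))  # 14.0 (KiB per packet)
```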

If you are wondering where the “estimator” goes in this picture: it doesn't. While the majority of ndn-cxx encoding routines use an “estimator” to determine the final buffer size, KeyChain::sign(Data&) is a notable exception, because the TLV-LENGTH of SignatureValue is unknown before signing actually occurs, and re-adjusting the packet buffer size may cause copying. In this case, speed was chosen over memory usage.

Moreover, is it possible to have ndnputchunks store files on the disk, in order to save memory?

There are two designs for serving files:

  • https://github.com/remap/ndnfs-port : every packet is pre-signed with the signature stored in a database; payload stays in the file; producer assembles packets when requested. Files must reside in a FUSE mountpoint, which means it’s unusable in an OpenVZ container.
  • https://github.com/yoursunny/ndn6-tools file-server.cpp: every packet is generated and signed upon request.

ndnputchunks does not serve a file, but works with standard input stream.
It made the design choice of pre-signing every Data packet, necessitating having every packet in memory. There’s no need to manually write packets to “files” for saving memory: the kernel takes care of that much more efficiently through the swapfile mechanism.
Neither of the above two designs is possible with ndnputchunks because you cannot seek the standard input.

However, a third design needs exploration: instead of reading through the standard input and generating all packets, only read enough input to fulfill the current Interest. This is similar to how netcat pushes a file to a TCP socket.

  1. Producer has an InMemoryStoragePersistent.
  2. When an Interest requests segment X, the producer reads enough input toward this segment, and puts the packets to the IMS.
  3. Producer answers Interests from the IMS and erases used packets since they are now in forwarder cache.
  4. When input is EOF and IMS is empty, producer exits.

The expected memory usage is the consumer’s window size, but the worst case is still all segments in memory, which happens if the consumer requests the final segment first.
There are variations that can enforce a memory cap, but it would have to Nack out-of-window requests.
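The four steps above can be modelled in plain Python. This is a toy sketch, not ndn-cxx code: the class and method names are hypothetical, a dict stands in for InMemoryStoragePersistent, signing is omitted, and the input is assumed to be a buffered binary stream (so a short read only happens at end of input).

```python
import io


class LazyProducer:
    """Toy model of the third design: segment the input only as far as requested."""

    def __init__(self, stream, segment_size=4400):
        self.stream = stream
        self.segment_size = segment_size
        self.store = {}        # stands in for InMemoryStoragePersistent
        self.next_segment = 0  # number of the next segment to be read
        self.eof = False

    def _read_until(self, segment):
        # The input cannot seek, so every segment before the requested one is read too.
        while not self.eof and self.next_segment <= segment:
            chunk = self.stream.read(self.segment_size)
            if chunk:
                self.store[self.next_segment] = chunk  # a real producer would sign here
                self.next_segment += 1
            # Assumption: a buffered binary stream returns a short read only at EOF.
            if len(chunk) < self.segment_size:
                self.eof = True

    def on_interest(self, segment):
        """Return the payload for `segment`, or None if it cannot be served."""
        self._read_until(segment)
        # Erase after answering: the forwarder cache is assumed to hold it now.
        return self.store.pop(segment, None)

    def finished(self):
        return self.eof and not self.store


producer = LazyProducer(io.BytesIO(b"x" * 10000), segment_size=4400)
first = producer.on_interest(0)
buffered_after_first = len(producer.store)  # nothing is left in memory
last = producer.on_interest(2)              # skipping ahead buffers segment 1 as well
buffered_after_skip = len(producer.store)
```

Requesting segments in order keeps the store empty between Interests, while requesting the final segment first leaves every earlier segment buffered, which is the worst case described above.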

Actions #2

Updated by Davide Pesavento about 5 years ago

  • Description updated

Actions #3

Updated by Davide Pesavento about 5 years ago

There are two designs for serving files:

Clarification: there are currently two implemented designs. Several more design choices are possible, optimizing for one use case or another.

Actions #4

Updated by Anonymous about 5 years ago

Each Data packet has an internal buffer of 8800 octets regardless of how much payload it actually has.

I think this is a big problem, especially when reducing the chunk size.

I just ran an experiment with 400-byte chunks (instead of the default 4400), and as predicted, the memory requirement is through the roof.

Is it possible to reduce the internal buffer to the actual size of the data packet?
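A rough lower bound shows why smaller chunks hurt so much. The sketch below counts only the 8800-octet buffer per packet quoted from note-1 and ignores all other per-packet overhead; the function name is made up, and "100MB" is assumed to mean decimal megabytes.

```python
import math

BUFFER_PER_PACKET = 8800  # octets, the per-packet buffer size quoted from note-1


def buffer_lower_bound(content_size, segment_size):
    """Bytes held in packet buffers alone; parsed fields and other overhead excluded."""
    packets = math.ceil(content_size / segment_size)
    return packets * BUFFER_PER_PACKET


content = 100 * 10**6  # "100MB", assumed to mean decimal megabytes
for segment_size in (4400, 400):
    total = buffer_lower_bound(content, segment_size)
    print(segment_size, total, round(total / content, 1))
# 4400 200006400 2.0
# 400 2200000000 22.0
```

Under these assumptions the buffers alone go from about 2x the content size at 4400-byte chunks to about 22x at 400-byte chunks.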

Actions #5

Updated by Davide Pesavento about 5 years ago

Moreover, is it possible to have ndnputchunks store files on the disk, in order to save memory?

ndnputchunks is supposed to be a relatively simple tool, meant to be used as example and for quick experiments. I don't know if it's a good idea to add too many features to it. A dedicated tool that serves (potentially large) files from a file system or a database would be preferable in my opinion. And then there's repo-ng if you need longer-term storage.

Actions #6

Updated by Anonymous about 5 years ago

Davide Pesavento wrote:

Moreover, is it possible to have ndnputchunks store files on the disk, in order to save memory?

ndnputchunks is supposed to be a relatively simple tool, meant to be used as example and for quick experiments. I don't know if it's a good idea to add too many features to it. A dedicated tool that serves (potentially large) files from a file system or a database would be preferable in my opinion. And then there's repo-ng if you need longer-term storage.

Well, Beichuan told me the following rationale:

Catchunks/putchunks is most likely the first tool that anyone new to NDN will use. Therefore it should be easy to use and have good performance.

Let's say you want to compare catchunks performance with TCP performance in a virtual machine with a 100Mbps link for 30 seconds.

For TCP performance you can use iperf3, which needs practically no RAM and transfers about 342 megabytes of data. You can assign the virtual machine 512MB of RAM, which will easily be sufficient.

For NDN, the same experiment needs at least 2-4GB RAM (putchunks + NFD CS), or you have to increase the size of the swap file (my recent Ubuntu VM used 1GB by default). When you reduce the chunk size, you need even more RAM (or swap space).

Considering this scenario, a much better default would be to serve a file from the disk, rather than put it into RAM. Modern SSDs are a good amount faster than NFD anyways.

Actions #7

Updated by Davide Pesavento about 5 years ago

For TCP performance you can use iperf3

That's an unfair comparison. iperf is a traffic generator tool, ndnchunks is not. Have you tried ndn-traffic-generator?
And then again, TCP has no security at all. As Junxiao said, the fact that NDN Data must be signed complicates things considerably.

Modern SSDs are a good amount faster than NFD anyways.

Let's leave NFD out of this, it's an entirely separate problem.

Actions #8

Updated by Anonymous about 5 years ago

Have you tried ndn-traffic-generator?

ndn-traffic-generator only allows a fixed pipeline size (x packets per second), and thus is not very useful when comparing against TCP.

But okay, let's compare ndnchunks to something like ftp or wget. All three are used for downloading files, but only ndnchunks requires putting the whole file into memory (+ 2x extra memory overhead).

Sounds like we need your "dedicated tool that serves (potentially large) files from a file system or a database".

Actions #9

Updated by Davide Pesavento about 5 years ago

Klaus Schneider wrote:

Have you tried ndn-traffic-generator?

ndn-traffic-generator only allows a fixed pipeline size (x packets per second), and thus is not very useful when comparing against TCP.

That's true. I guess we don't really have a tool that can be compared directly...

But okay, let's compare ndnchunks to something like ftp or wget. All three are used for downloading files, but only ndnchunks requires to put the whole file into memory (+ 2x extra memory overhead).

Hold on. This issue is about the producer/server-side (ndnputchunks), while wget is client-side. So you should compare against an http or ftp server instead. And those usually serve files from a filesystem, ndnputchunks cannot do that. I could agree with adding a "-i" (or similar) option to ndnputchunks to read data from a file instead of stdin, with the caveat that if another process modifies the file while ndnputchunks is running, the result is undefined.

There is another difference between IP-based and NDN-based file transfer programs, at least on Linux: the TCP/IP stack resides in the kernel. This has major consequences on performance. For instance, it means that those programs can efficiently move bytes between a file and the network (a socket) using the sendfile(2) or splice(2) system calls, which do not require copying the data to userspace.
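For illustration, here is what the sendfile path looks like from a userspace program. This is a minimal sketch using Python's socket.sendfile(), with a temporary file and a local socket pair standing in for the served file and the TCP connection; it shows the IP-side mechanism only and says nothing about how an NDN producer would be written.

```python
import os
import socket
import tempfile
import threading

payload = os.urandom(256 * 1024)

# Put some content on disk to stand in for the file being served.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    path = tmp.name

sender, receiver = socket.socketpair()  # stands in for a TCP connection
received = bytearray()


def drain():
    # Read until the sender closes its end.
    while True:
        data = receiver.recv(65536)
        if not data:
            break
        received.extend(data)


reader = threading.Thread(target=drain)
reader.start()

with open(path, "rb") as f:
    # socket.sendfile() uses os.sendfile() where available, so the kernel moves
    # the bytes; elsewhere it falls back to an ordinary read/send loop.
    sent = sender.sendfile(f)
sender.close()
reader.join()
receiver.close()
os.unlink(path)
```

The sending side never holds the file content in a Python buffer. A userspace NDN producer has no equivalent shortcut, because each chunk has to be wrapped in a Data packet and signed before it can be sent.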

Actions #10

Updated by Anonymous about 5 years ago

  • Related to Bug #4861: ndncatchunks: improve performance on high-delay or low-quality links added

Actions #11

Updated by Davide Pesavento about 5 years ago

I fail to see how this is related to #4861

Actions #12

Updated by Anonymous about 5 years ago

Well it's very vaguely related to ndnchunks performance :)

If people look at Bug #4861 for ndnchunks issues, it doesn't hurt to get a pointer to here.

Actions #13

Updated by Davide Pesavento about 5 years ago

The relationship is too loose to warrant a "related to", otherwise we'd have so many "related to" in NFD and other larger projects that they'd become useless. Moreover, catchunks and putchunks are two different programs.

Actions #14

Updated by Davide Pesavento about 5 years ago

  • Related to deleted (Bug #4861: ndncatchunks: improve performance on high-delay or low-quality links)

Actions #15

Updated by Anonymous about 5 years ago

  • Assignee set to Ju Pan

Ju will fix this :)

Actions #16

Updated by Davide Pesavento about 5 years ago

Klaus Schneider wrote:

Ju will fix this :)

What exactly is the task here?

Actions #17

Updated by Anonymous about 5 years ago

Davide Pesavento wrote:

What exactly is the task here?

I'd say figuring that out is part of the task :)

But a possible solution would be to allow putchunks (or another tool) to serve files from disk without putting the whole file into memory. However, caching/pre-fetching some parts of the file might make sense.

Actions #18

Updated by Junxiao Shi about 5 years ago

allow another tool to serve files from disk without putting the whole file into memory

It's called a file server. See note-1 for two implementations. No need for a third one.

allow putchunks to serve files from disk

ndnputchunks deals with a stream, not a file.

caching/pre-fetching some parts of the file might make sense.

Any good Unix tool should do that. I use xzcat | dd combination all the time on huge files and they won't take up all the memory.

  1. Set a "producer window", the maximum number of packets to be buffered in ndnputchunks memory.
  2. Initially, read enough input to create packets to fill the producer window.
  3. As soon as the leftmost packet in the producer window has been retrieved, move the window to the right, reading more input to create more packets.
  4. Discard any packet no longer in the producer window.
  5. Additional Interests for out-of-window packets should be satisfied by the CS. If they reach the producer, unfortunately they must be Nacked.

With this change, ndnputchunks becomes more like a tool to do a one-time stream transfer, and cannot serve a stream persistently (think netcat rather than nginx). Setting the producer window to infinite restores the old behavior.
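The producer window can be sketched as a small simulation. This is plain Python, not ndn-cxx: the class name, the NACK sentinel, and the bookkeeping are assumptions made for illustration, and signing is left out.

```python
import io
from collections import OrderedDict


class WindowedProducer:
    """Toy model of the producer window; NACK marks a request that cannot be served."""

    NACK = object()

    def __init__(self, stream, window=4, segment_size=4400):
        self.stream = stream
        self.window = window
        self.segment_size = segment_size
        self.packets = OrderedDict()  # segment number -> payload, oldest first
        self.retrieved = set()        # in-window segments already sent out
        self.next_segment = 0
        self._fill()                  # step 2: fill the window up front

    def _fill(self):
        while len(self.packets) < self.window:
            chunk = self.stream.read(self.segment_size)
            if not chunk:
                break
            self.packets[self.next_segment] = chunk  # signing would happen here
            self.next_segment += 1

    def on_interest(self, segment):
        if segment not in self.packets:
            return self.NACK          # step 5: out-of-window request
        payload = self.packets[segment]
        self.retrieved.add(segment)
        # Steps 3 and 4: slide right while the leftmost packet has been retrieved.
        while self.packets and next(iter(self.packets)) in self.retrieved:
            leftmost, _ = self.packets.popitem(last=False)
            self.retrieved.discard(leftmost)
            self._fill()
        return payload


producer = WindowedProducer(io.BytesIO(bytes(10 * 100)), window=4, segment_size=100)
early = producer.on_interest(7)  # beyond the window: Nack
a = producer.on_interest(1)      # in window, but segment 0 is still pending
b = producer.on_interest(0)      # leftmost retrieved: window slides past 0 and 1
late = producer.on_interest(0)   # already discarded: Nack
```

In this sketch, passing window=float("inf") makes the constructor read all input up front, which corresponds to the old behavior.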

Actions #19

Updated by Anonymous about 5 years ago

Junxiao Shi wrote:

allow another tool to serve files from disk without putting the whole file into memory

It's called a file server. See note-1 for two implementations. No need for a third one.

Is the ndnfs-port compatible with catchunks?

I tried your ndn6-tools code (https://github.com/yoursunny/ndn6-tools) and it doesn't compile on Ubuntu 18.04

make
g++ -std=c++14 -Wall -Werror `pkg-config --cflags libndn-cxx` -DBOOST_LOG_DYN_LINK -o facemon facemon.cpp `pkg-config --libs libndn-cxx`
facemon.cpp: In function ‘void printInterest(const ndn::Name&, const ndn::Interest&)’:
facemon.cpp:31:8: error: ‘cout’ is not a member of ‘std’

I didn't install ndn-cxx via ppa though (as said in the readme), but via "sudo ./waf install"

Actions #20

Updated by Anonymous about 5 years ago

Junxiao Shi wrote:

ndnputchunks deals with a stream, not a file.

Yes, but so what? As I argued earlier, it's much more intuitive to have the default tool serve a file rather than a stream.

I assume >90% of ndnputchunks usage is to serve a file from the local disk.

caching/pre-fetching some parts of the file might make sense.

Any good Unix tool should do that. I use xzcat | dd combination all the time on huge files and they won't take up all the memory.

  1. Set a "producer window", the maximum number of packets to be buffered in ndnputchunks memory.
  2. Initially, read enough input to create packets to fill the producer window.
  3. As soon as the leftmost packet in the producer window has been retrieved, move the window to the right, reading more input to create more packets.
  4. Discard any packet no longer in the producer window.
  5. Additional Interests for out-of-window packets should be satisfied by the CS. If they reach the producer, unfortunately they must be Nacked.

With this change, ndnputchunks becomes more like a tool to do a one-time stream transfer, and cannot serve a stream persistently (think netcat rather than nginx). Setting the producer window to infinite restores the old behavior.

Yeah, I think the "one-time" aspect makes this solution less useful. So you have to restart putchunks for every new consumer?

Actually, I'm not sure how important the caching/pre-fetching feature is, given the current speed of SSD hard drives.

Actions #21

Updated by Davide Pesavento about 5 years ago

Klaus Schneider wrote:

Junxiao Shi wrote:
ndnputchunks deals with a stream, not a file.

Yes, but so what? As I argued earlier, it's much more intuitive to have the default tool serve a file rather than a stream.

"stream" is more general than "file on a filesystem". All files can be piped into the stdin of another program. The converse is not true: not all streams are files.

Actually, I'm not sure how important the caching/pre-fetching feature is, given the current speed of SSD hard drives.

I don't think he's referring to I/O latency, but to signing overhead (you need to repackage and re-sign the chunk when you re-read it from the disk).

Actions #22

Updated by Junxiao Shi about 5 years ago

I tried your ndn6-tools code (https://github.com/yoursunny/ndn6-tools) and it doesn't compile on Ubuntu 18.04

ndn6-tools repository follows stable PPA, which means ndn-cxx 0.6.3 at the moment. Although I no longer operate a public ndn6.tk node, I have several nodes using the stable PPA or its Debian derivative. I only test with Ubuntu 16.04 (for my OpenVZ nodes) and Debian stretch (for my physical nodes).

make file-server should work with ndn-cxx 0.6.5. It's not in make all target because it depends on metadata-object.hpp that is in ndn-cxx 0.6.5 but not 0.6.3.

Someone removed certain #include lines, and that causes missing identifier errors.
I'm waiting on ndn-cxx 0.6.5 to be released to stable PPA.
After that, I'll fix include errors and enable file-server program in make all.
For now, you can compile make file-server target only.

Actions #23

Updated by Anonymous about 5 years ago

Davide Pesavento wrote:

Klaus Schneider wrote:

Actually, I'm not sure how important the caching/pre-fetching feature is, given the current speed of SSD hard drives.

I don't think he's referring to I/O latency, but signing overhead (you need to repackage/resign the chunk when you re-read it from the disk).

Ah okay. So is there any possibility to store the signed chunks on disk (instead of memory), so you don't have to sign them on the fly?

This could improve performance over the current file-server, while still not requiring RAM linear in the file size.

For now, you can compile make file-server target only.

Okay, I tried it and it seems to work pretty well! Maybe we should package the file-server with ndn-tools and tell people to use the file-server instead of (or in addition to) ndnputchunks?

Actions #24

Updated by Davide Pesavento about 5 years ago

Klaus Schneider wrote:

Ah okay. So is there any possibility to store the signed chunks on disk (instead of memory), so you don't have to sign them on the fly?

Then why don't you use https://github.com/remap/ndnfs-port ?

Okay, I tried it and it seems to work pretty well! Maybe we should package the file-server with ndn-tools and tell people to use the file-server instead of (or in addition to) ndnputchunks?

Yes. But they serve different use cases, so definitely not "instead of" ndnputchunks.

Actions #25

Updated by Anonymous about 5 years ago

Davide Pesavento wrote:

Klaus Schneider wrote:

Ah okay. So is there any possibility to store the signed chunks on disk (instead of memory), so you don't have to sign them on the fly?

Then why don't you use https://github.com/remap/ndnfs-port ?

Does it work with ndncatchunks?

Okay, I tried it and it seems to work pretty well! Maybe we should package the file-server with ndn-tools and tell people to use the file-server instead of (or in addition to) ndnputchunks?

Yes. But they serve different use cases, so definitely not "instead of" ndnputchunks.

Yeah, my point was that the file-server (if the performance is good enough) is a much better default tool than ndnputchunks. For large files (let's say 1GB) the putchunks initial signing takes forever (minutes), and the memory overhead is unacceptable.

So it seems that the tools I wanted already exist (file-server, ndnfs), but are not well documented.

I think they should be part of ndn-tools and put into the "Getting Started" guide http://named-data.net/doc/NFD/current/INSTALL.html

We could put in a quick example of how to run file-server + catchunks, which would be much better than the current Getting Started suggestions, namely:

  • Write your own consumer/producer in ndn-cxx/ccl
  • A bunch of links to other software that's harder to set up.

The only easy-to-use example is the ndn-traffic-generator, but 1) it doesn't implement congestion control and 2) requesting a file via catchunks is a more natural scenario than generating random traffic.

P.S. Unrelated point: Why are the "Getting Started Guides" in the documentation https://named-data.net/codebase/platform/documentation/ only meant for developers? (= contributing guide + developer resources)

Shouldn't you first describe how to use NFD, before jumping into how to contribute to NFD?

Actions #26

Updated by Davide Pesavento about 5 years ago

  • Start date deleted (03/05/2019)

Klaus Schneider wrote:

Davide Pesavento wrote:

Klaus Schneider wrote:

Ah okay. So is there any possibility to store the signed chunks on disk (instead of memory), so you don't have to sign them on the fly?

Then why don't you use https://github.com/remap/ndnfs-port ?

Does it work with ndncatchunks?

I don't know. Try it :) But if it doesn't we should fix it. I'm pretty sure the ndnfs server doesn't implement a metadata responder, because that's a very recent thing, so you'd have to provide the exact version number to catchunks.
In any case, even if it doesn't work, it's still not a good reason to reinvent the wheel and add all that complexity into putchunks.

Yeah, my point was that the file-server (if the performance is good enough) is a much better default tool than ndnputchunks.

Again, you're comparing apples to oranges. And again, the two tools are optimized for different use cases, so there's a tradeoff involved.
The signing still has to happen at some point, whether you pre-sign the whole file at the beginning or sign each segment on demand. The rationale for pre-signing is to avoid any slowdowns while the producer is serving the segments. If you serve the segments on demand, there's a bunch of additional operations (reading the file chunk, encoding the Data packet, signing) that must be done on the fly, and all this overhead might slow down the producer's response rate, therefore you'll get somewhat inaccurate network-level performance measurements.

For large files (let's say 1GB) the putchunks initial signing takes forever (minutes), and the memory overhead is unacceptable.

You can use DigestSha256 "signing" to reduce the time it takes to prepare the file.
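To see why this is cheaper: a DigestSha256 "signature" is a plain SHA-256 hash rather than a public-key signature, so it protects integrity but says nothing about who produced the packet. The sketch below only hashes raw payload chunks with Python's hashlib; it is an illustration, not the ndn-cxx code path, and the function name is made up.

```python
import hashlib

SEGMENT_SIZE = 4400  # default maxSegmentSize mentioned earlier in this thread


def digest_segments(content, segment_size=SEGMENT_SIZE):
    """Yield (segment number, SHA-256 digest) for each chunk of content."""
    for number, offset in enumerate(range(0, len(content), segment_size)):
        chunk = content[offset:offset + segment_size]
        # In a real packet the digest covers the encoded name, meta info,
        # content and signature info, not just the payload as done here.
        yield number, hashlib.sha256(chunk).digest()


digests = list(digest_segments(bytes(10000)))
print(len(digests), len(digests[0][1]))  # 3 32
```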

So it seems that the tools I wanted already exist (file-server, ndnfs), but are not well documented.
[...]

It's not a secret that our documentation/tutorials aren't great. Please contribute your improvements.

Actions #27

Updated by Anonymous about 5 years ago

Davide Pesavento wrote:

Klaus Schneider wrote:

Yeah, my point was that the file-server (if the performance is good enough) is a much better default tool than ndnputchunks.

Again, you're comparing apples to oranges. And again, the two tools are optimized for different use cases, so there's a tradeoff involved.

It doesn't look like we're actually disagreeing on anything here.

Yes, I know about the trade-off, I'm just saying the file-server (late signing) is the better trade-off for large files (i.e., file size >100M). It's better to get lower throughput, than being unable to run the experiment because your VM memory is full.

Also you can combine the file-server with a local content store if you want the in-memory caching. This has the benefit that you can limit the content store size, and don't need to keep everything in memory.
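A minimal sketch of that combination, assuming a least-recently-used eviction policy: segments are prepared on demand and kept in a store with a fixed capacity. The class, the make_segment callback, and the LRU policy are illustrative assumptions, not how file-server or the NFD content store is implemented.

```python
from collections import OrderedDict


class BoundedStore:
    """LRU cache of prepared segments, a stand-in for a size-limited content store."""

    def __init__(self, capacity, make_segment):
        self.capacity = capacity
        self.make_segment = make_segment  # called on a miss (read + encode + sign)
        self.entries = OrderedDict()
        self.misses = 0

    def get(self, segment):
        if segment in self.entries:
            self.entries.move_to_end(segment)  # mark as most recently used
            return self.entries[segment]
        self.misses += 1
        packet = self.make_segment(segment)
        self.entries[segment] = packet
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)   # evict the least recently used
        return packet


content = bytes(range(256)) * 100  # 25600 bytes of stand-in file content


def make_segment(segment, size=4400):
    # Hypothetical late-signing step: here it only slices the content.
    return content[segment * size:(segment + 1) * size]


store = BoundedStore(capacity=2, make_segment=make_segment)
for segment in (0, 1, 0, 2, 1):
    store.get(segment)
print(store.misses, list(store.entries))  # 4 [2, 1]
```

Memory stays bounded by the capacity, and each miss pays the on-demand cost again, which is the throughput side of the trade-off.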

But I do agree with Junxiao's earlier point, that you can also use a swap file, to put the memory required by putchunks back on disk. However, having putchunks fill up all your memory might influence the performance of other applications on the machine?

Anyway, here are some numbers for the performance difference (100MB file, everything run on localhost):

- putchunks:   258.757585 Mbit/s
- file-server:  43.568980 Mbit/s
- CS:          313.576854 Mbit/s

The signing still has to happen at some point, whether you pre-sign the whole file at the beginning or sign each segment on demand. The rationale for pre-signing is to avoid any slowdowns while the producer is serving the segments. If you serve the segments on demand, there's a bunch of additional operations (reading the file chunk, encoding the Data packet, signing) that must be done on the fly, and all this overhead might slow down the producer's response rate, therefore you'll get somewhat inaccurate network-level performance measurements.

For large files (let's say 1GB) the putchunks initial signing takes forever (minutes), and the memory overhead is unacceptable.

You can use DigestSha256 "signing" to reduce the time it takes to prepare the file.

Following the earlier theme, this option is also completely undocumented.

So it seems that the tools I wanted already exist (file-server, ndnfs), but are not well documented.
[...]

It's not a secret that our documentation/tutorials aren't great. Please contribute your improvements.

Well I wrote my suggestions earlier. Whoever is able to change the NDN website is free to take them :)

I assume there's no way to submit the proposed change to Gerrit?

Actions #28

Updated by Davide Pesavento about 5 years ago

Klaus Schneider wrote:

But I do agree with Junxiao's earlier point, that you can also use a swap file, to put the memory required by putchunks back on disk. However, having putchunks fill up all your memory might influence the performance of other applications on the machine?

Yes, it probably will.

You can use DigestSha256 "signing" to reduce the time it takes to prepare the file.

Following the earlier theme, this option is also completely undocumented.

It is documented, you just need to know where to look :)
This is the syntax of the signing info: https://named-data.net/doc/ndn-cxx/current/doxygen/d8/dc8/classndn_1_1security_1_1SigningInfo.html#afc960f9f5da5536b958403dc7b701826

But seriously, contributions are extremely welcome as I said.

Well I wrote my suggestions earlier. Whoever is able to change the NDN website is free to take them :)

I assume there's no way to submit the proposed change to Gerrit?

Actually, many (but not all) of those documents on the website are taken from the ndn-cxx and NFD repos, for example the INSTALL file you cited earlier. So yes, you can submit changes to Gerrit for those files. The same goes for the README files in the ndn-tools repo.

Actions #29

Updated by Anonymous about 5 years ago

  • Status changed from New to In Progress
  • Assignee changed from Ju Pan to Anonymous

Actions #30

Updated by Davide Pesavento about 4 years ago

  • Subject changed from ndnputchunks: high memory requirement to ndnputchunks consumes a large amount of RAM
  • Status changed from In Progress to New
  • Assignee deleted (Anonymous)