Project

General

Profile

Actions

Task #2513

closed

Optimize multi-threaded logging using lock-free queue and separate thread

Added by Alex Afanasyev about 9 years ago. Updated almost 8 years ago.

Status:
Closed
Priority:
Normal
Assignee:
Category:
Core
Target version:
Start date:
Due date:
% Done:

100%

Estimated time:
12.00 h

Description

To ensure proper logging in a multi-threaded environment, currently we use std::lock_guard to prevent any race conditions.

This process can be optimized with the use of lock-free queue(s):

  • individual threads (NFD and RIB) write logs, line by line, into lock-free queue(s)
  • separate thread that pulls log lines from queue(s), and writes them into std::clog
|---------------|
| NFD thread    |  queue  |------------------|
|          push >>>>>>>>>>> pop              |
|---------------|         |                  |
                          |  logging thread  >> std::clog
|---------------|         |                  |
|          push >>>>>>>>>>> pop              |
| RIB thread    |  queue  |------------------|
|---------------|

A lock-free queue, such as boost::lockfree::queue or boost::lockfree:spsc_queue, from Boost.Lockfree library, should be used.
On a platform where Boost.Lockfree library is not available, a regular queue protected by std::lock_guard could be used.

The design of logging thread should be flexible about the number of log-generating threads.
In the future, when we need to add a third log-generating thread to nfd process, the logging thread should not require a major change.


Files

config.log (55.7 KB) config.log Yumin Xia, 03/06/2016 10:54 PM
20160313235309.tgz (127 KB) 20160313235309.tgz integ 2707,15 Junxiao Shi, 03/14/2016 10:40 AM
20160316142459.tgz (169 KB) 20160316142459.tgz integ master Junxiao Shi, 03/16/2016 10:47 AM
20160316131536.tgz (125 KB) 20160316131536.tgz integ 2707,16 Junxiao Shi, 03/16/2016 10:47 AM

Related issues 3 (1 open2 closed)

Related to NFD - Task #2489: Merge nrd into nfdClosedAlex Afanasyev

Actions
Blocks ndn-cxx - Feature #3553: Logging facility: bounded record queueNew

Actions
Blocked by NFD - Task #3599: Drop build support for Ubuntu 12.04ClosedDavide Pesavento04/21/2016

Actions
Actions #1

Updated by Alex Afanasyev about 9 years ago

Actions #2

Updated by Alex Afanasyev about 9 years ago

Just for background. Boost.Log uses locks to ensure thread safety. This task may not be that important.

On the other hand, it is probably a good idea to implement thread separation for log statements and actual log writing. The latter can become a serious bottleneck in some cases. I would also think that logger can be, to some extent, lossy. In other words, when log writing is bottleneck, it could be ok to skip some messages---this property may need to be configurable, turned on by default.

Actions #3

Updated by Davide Pesavento about 9 years ago

Yep we need a rate limiter in the logger. I thought I already opened a ticket about that but I can't find it.

Does boost or the std library provide lock-free data structures?

Actions #4

Updated by Davide Pesavento about 9 years ago

Davide Pesavento wrote:

Does boost or the std library provide lock-free data structures?

Boost.Lockfree has been introduced in 1.53, so that's not an option.

Actions #5

Updated by Junxiao Shi about 9 years ago

  • Subject changed from Optimize multihreaded logging using lock-free queue and separate thread to Optimize multi-threaded logging using lock-free queue and separate thread
  • Description updated (diff)

In one of my experiments (where Forwarder logging is used extensively, and NFD is compiled in DEBUG mode), commenting out the logging lock results in about 30% decrease in system "load average".

It's safe to do so in that experiment after setting loglevel=NONE on all loggers in RIB thread.

Actions #6

Updated by Junxiao Shi over 8 years ago

  • Assignee set to Yumin Xia
Actions #7

Updated by Davide Pesavento over 8 years ago

Davide Pesavento wrote:

Does boost or the std library provide lock-free data structures?

Boost.Lockfree has been introduced in 1.53, so that's not an option.

Actually, this wouldn't be a problem if we dropped support for Ubuntu 12.04, because at that point our oldest supported version would be 14.04, which contains boost 1.54.0

Actions #8

Updated by Junxiao Shi over 8 years ago

Per platform policy, we can't drop 12.04 support until Ubuntu 16.04 release, which is still 9 months away.

I'll be okay with using Boost.Lockfree when it's available, and continue having locks when it's unavailable.

It's really unnecessary to reinvent Boost.Lockfree for one platform.

Actions #9

Updated by Yumin Xia over 8 years ago

So how should I finish this task? Since Boost.lockfree is not available for boost 1.49.

Also, can you explain more about what does

separate thread that pulls log lines from both queues, and writes them into std::clog

mean?

Thanks

Actions #10

Updated by Junxiao Shi over 8 years ago

  • Description updated (diff)
  • Estimated time set to 6.00 h

20150810 conference call approves note-8.

Answer to note-9 is added to issue description.

Actions #11

Updated by Davide Pesavento over 8 years ago

A few design questions:

  • Why multiple queues? boost::lockfree::queue is a multi-writer/multi-reader queue, therefore one shared queue should be enough for all threads. On the other hand, we can use boost::lockfree::spsc_queue instead (one per writer thread), which is even wait-free (a stronger property than lock-free).
  • Are the queues fixed-capacity or do they grow dynamically in size? If dynamic size requires memory allocation, we lose lock-freedom (shouldn't be a problem in practice if allocations are rare)
  • What is contained in the queue exactly? The log messages? The problem is that boost::lockfree::queue<T> requires T to be trivially assignable and trivially destructible, which std::string is not. spsc_queue does not suffer from this limitation.

Also, I'm afraid 6 hours are too few to design and implement the solution.

Actions #12

Updated by Yumin Xia over 8 years ago

Can I summarize the work as follow?

  1. If boost::lockfree::queue are used, there are multiple log producer threads sharing one lock-free queue, and only one consumer thread.
  2. If boost::lockfree::spsc_queue are used, there are multiple threads with multiple queues, and for each queue, these is a consumer thread getting logs out of the queue.
  3. The consumer thread(s) is in charge of extracting logs from the lock-free queue, and write it to std::clog.
  4. Use boost::lockfree::queue when it is available, otherwise implement a so-called 'lock-free' queue with std::lock_guard. std::lock_guard will lock on both producer pointer and consumer pointer when writing are called.

Are the queues fixed-capacity or do they grow dynamically in size?

@Davide Pesavento, could you explain more about what does 'being trivially assignable and trivially destructible' mean? I googled these words but I can not find any informative clue? Or just tell me which book should I look up to understand 'trivially ***'

Actions #13

Updated by Junxiao Shi over 8 years ago

  • Description updated (diff)
  • Estimated time changed from 6.00 h to 129.00 h

Reply to note-12: I've updated issue description so that which queue type to use is up to assignee.

Actions #14

Updated by Davide Pesavento over 8 years ago

Yumin Xia wrote:

  1. If boost::lockfree::queue are used, there are multiple log producer threads sharing one lock-free queue, and only one consumer thread.

correct

  1. If boost::lockfree::spsc_queue are used, there are multiple threads with multiple queues, and for each queue, these is a consumer thread getting logs out of the queue.

No. In this case there is exactly one queue per producer thread, and only one consumer thread in total, which pulls log lines from all queues. (question: in what order? how do we establish a total order between log messages coming from different queues?)

  1. The consumer thread(s) is in charge of extracting logs from the lock-free queue, and write it to std::clog.

Yes, more specifically there is one and only one consumer thread.

  1. Use boost::lockfree::queue when it is available, otherwise implement a so-called 'lock-free' queue with std::lock_guard. std::lock_guard will lock on both producer pointer and consumer pointer when writing are called.

Yes, in this case use standard locking mechanisms.

@Davide Pesavento, could you explain more about what does 'being trivially assignable and trivially destructible' mean? I googled these words but I can not find any informative clue? Or just tell me which book should I look up to understand 'trivially ***'

They are standard concepts widely used across the C++ std library. See http://en.cppreference.com/w/cpp/types/is_destructible and http://en.cppreference.com/w/cpp/types/is_copy_assignable

IOW, "trivially destructible" means that the type has a trivial destructor. "Trivial destructor" is defined here: http://en.cppreference.com/w/cpp/language/destructor

"Trivially assignable" means that the type has a trivial assignment operator. Here I'm not sure if boost means copy assignment, or if either one of copy or move assignment is enough. Assuming the former, "trivial copy assignment operator" is defined here: http://en.cppreference.com/w/cpp/language/as_operator

Actions #15

Updated by Davide Pesavento over 8 years ago

Another question for which I don't have an answer at the moment: how do we handle the "empty queue" situation? IOW, how do we notify the consumer thread that more messages have been enqueued? The classic solution is using a condition variable but that involves locking, which kind of defeats the purpose of a lock-free queue... Or the consumer thread can do polling, but that increases latency and/or CPU usage...

Are we sure locking is the actual bottleneck, rather than writing log messages to disk?

Actions #16

Updated by Yumin Xia over 8 years ago

Huge thanks to @Davide.

So the questions are

  1. Are we sure locking is the actual bottleneck, rather than writing log messages to disk?
  2. in what order should customer thread pull logs from queues. And if there were no limitation on how much a customer thread could pull from one queue for once, there might be a starving issue.
Actions #17

Updated by Junxiao Shi over 8 years ago

Are we sure locking is the actual bottleneck, rather than writing log messages to disk?

Locking is one of the bottlenecks, as shown in note-5.

Writing log messages to disk is also a bottleneck on OSX, as shown in #2492 note-54.

Without an answer to #2492 note-58, I cannot predict whether this issue can help with #2492.

in what order should customer thread pull logs from queues. And if there were no limitation on how much a customer thread could pull from one queue for once, there might be a starving issue.

It's unlikely for NFD and RIB threads to produce logs faster than the logging thread to write logs, so this question is unimportant.

Any design would be acceptable.

Actions #18

Updated by Davide Pesavento over 8 years ago

Junxiao Shi wrote:

It's unlikely for NFD and RIB threads to produce logs faster than the logging thread to write logs, so this question is unimportant.

This statement contradicts the previous statement that "Writing log messages to disk is also a bottleneck on OSX". If writing can be a bottleneck it means that a thread is able to produce log messages faster than the logging thread can consume them (for an unbounded period of time). A naive solution can cause message drops from the queue that is not consumed frequently enough, or can block the thread that pushes to that queue (depending on the "full queue" behavior we choose).

Actions #19

Updated by Davide Pesavento over 8 years ago

Boost.Log supports async logging and seems otherwise a pretty flexible and featureful logging library.

Are we sure we want to reimplement that from scratch? If we really need lockless queuing (I'm not convinced yet), we should be able to just implement a new record queuing strategy and instantiate an asynchronous_sink with it instead of using the provided strategies.

Actions #20

Updated by Junxiao Shi over 8 years ago

  • Estimated time changed from 129.00 h to 12.00 h

That "129 hours" is a typo. Don't be scared.

Actions #21

Updated by Alex Afanasyev over 8 years ago

Davide, the only problem with Boost.Log is that it is available only in the new versions of Boost. If we can make it use with older versions (e.g., when not available, fetching and building as part of NFD build process), then I will be happy to use it.

The only other option is to require at least boost 1.54, with issues on older platforms (12.04) and, I suspect, some embedded devices.

Actions #22

Updated by Alex Afanasyev over 8 years ago

During Sept 3, 2015 call, we decided that additional investigation necessary before proceeding with this issue:

  • what versions of boost are available in supported distribution, and what version are available for other distributions, including OSes for embedded devices (including, but not limited to adruino, openwrt, raspberry pi, ios, android >= 1.57).

  • what is the initialization procedure of boost log and how is it compatible with the currently implemented logic

  • whether it is possible to use stand-alone latest version of Boost.Log library with an older versions of core boost libraries

Actions #23

Updated by Yumin Xia over 8 years ago

Boost.Log requir "boost/move/core.hpp", and this file is introduced into Boost1.49

So, for old platforms using Boost1.48 (debian7 & ubuntu12.04), it might not be possible to use stand-alone latest version of Boost.Log

Actions #24

Updated by Davide Pesavento over 8 years ago

Yumin Xia wrote:

Boost.Log requir "boost/move/core.hpp", and this file is introduced into Boost1.49

So, for old platforms using Boost1.48 (debian7 & ubuntu12.04), it might not be possible to use stand-alone latest version of Boost.Log

What about older versions of Boost.Log? Do they all require that header file?

Actions #25

Updated by Yumin Xia over 8 years ago

I just check the source code of Boost.Logv2 in boost1.54, (the earliest version of Boost.Logv2?), it still require that header file.

Actions #26

Updated by Davide Pesavento over 8 years ago

Maybe we're lucky... that file includes only boost/config.hpp (plus two others that are only relevant for MSVC), so maybe we can bundle a copy of boost/move/core.hpp for platforms that don't have it, use the system-installed boost/config.hpp, and patch the standalone version of Boost.Log to use the bundled boost/move/core.hpp. This fallback would be only needed until Apr 2016 anyway.

Actions #27

Updated by Yumin Xia over 8 years ago

Hi Davide,

But I guess this won't work. This "boost/move/core.hpp" file is just the one I used as an example, there are a lot of files besides this one that are required by Boost.Logv2. "boost/utility/explicit_operator_bool.hpp" is another.

Actions #28

Updated by Davide Pesavento over 8 years ago

Oh ok. From your comment it sounded like move/core.hpp was the only missing file... Never mind then.

Actions #29

Updated by Junxiao Shi over 8 years ago

note-23~27 have answered NO to note-22 question-3.

Additional investigation is needed to answer question-1 and question-2.

Actions #30

Updated by Davide Pesavento over 8 years ago

is this still on track for v0.4?

Actions #31

Updated by Yumin Xia over 8 years ago

Sorry for the delay, I was busy with graduate school application and final exams. I will continue work on this issue now.

Actions #32

Updated by Yumin Xia over 8 years ago

Before I start, there is some questions that I want to note:

Accroding to this page:

Q1:

If your application consists of more than one module (e.g. an exe and one or several dll's) that use Boost.Log, the library must be built as a shared object. If you have a single executable or a single module that works with Boost.Log, you may build the library as a static library.

so static or shared?

Q2:

The logging library uses several other Boost libraries that require building too. These are Boost.Filesystem, Boost.System, Boost.DateTime, Boost.Thread and in some configurations Boost.Regex.

In wscript, there is not Filesystem & Datetime in boost_libs, but it is required in boost.log. Is it the right way to just add it to boost_libs in wscript?

Q3:

The library requires run-time type information (RTTI) to be enabled for both the library compilation and user's code compilation.

Is RTTI disabled in NFD?

Thanks, guys.

Actions #33

Updated by Davide Pesavento over 8 years ago

Yumin Xia wrote:

Q1
[...]
so static or shared?

This refers to how to build libboost-log, not our code... so it's not really relevant. In any case, ubuntu provides (only) the shared libraries so we're good.

Q2:
In wscript, there is not Filesystem & Datetime in boost_libs, but it is required in boost.log. Is it the right way to just add it to boost_libs in wscript?

I believe so, but Alex is the waf expert.

Q3:
Is RTTI disabled in NFD?

No.

Actions #34

Updated by Alex Afanasyev over 8 years ago

Q2:
In wscript, there is not Filesystem & Datetime in boost_libs, but it is required in boost.log. Is it the right way to just add it to boost_libs in wscript?

I believe so, but Alex is the waf expert.

Yes. It is enough to just add the necessary libraries in boost_libs variable.

Actions #35

Updated by Yumin Xia over 8 years ago

After reading the current code, the order of logger initialization confused me. I guess the init process is like this:

  1. modules loggers init (with default level)
  2. onConfig(), change the level if needed.
  3. Using log macros to log.

two questions:

One is that: In the current code, onConfig() change the level of the loggers mentioned in configure file. Why can it be sure that the init code of each module would run before the onConfig() function? I mean, like if NFD_LOG_INIT("test") runs after function onConfig(), things would go wrong. Is it because that the loggers are static variables (reference), so the construction happens before main() function?

The other one: is it true that, although each module use the same nfd::g_logger as variable name in those macros, actually
they are different, because that g_logger is static, so it refers to only to the g_logger in its cpp file.(after compile, in its .o file)?

Actions #36

Updated by Davide Pesavento over 8 years ago

Yumin Xia wrote:

One is that: In the current code, onConfig() change the level of the loggers mentioned in configure file. Why can it be sure that the init code of each module would run before the onConfig() function? I mean, like if NFD_LOG_INIT("test") runs after function onConfig(), things would go wrong. Is it because that the loggers are static variables (reference), so the construction happens before main() function?

Yes.

The other one: is it true that, although each module use the same nfd::g_logger as variable name in those macros, actually
they are different, because that g_logger is static, so it refers to only to the g_logger in its cpp file.(after compile, in its .o file)?

That's correct. g_logger is declared static therefore it has internal linkage. In other words, it is only accessible from the "current" translation unit.

Actions #37

Updated by Alex Afanasyev over 8 years ago

  • Target version changed from v0.4 to v0.5
Actions #38

Updated by Yumin Xia over 8 years ago

I have a simple plan: to use boost.log with 'asynchronous_sink' backend to replace the orignial logger.

The asynchronous_sink use a dedicated thread to output log records, and there are two defects in this design:

there will be a latency between log emission and its actual processing.
This behavior may be inadequate in some contexts, such as debugging an application that is prone to crashes.

And another one is:

However, avoiding dynamic unloading is the only way to solve the problem completely.

There could not be any dynamic unloading in NFD.

As for the implementation details, boost.log will simplely be a private member of the original logger,
and it will behave like a stream. The initialization code of boost.log will be written in LoggerFactory constructor.

Actions #39

Updated by Davide Pesavento over 8 years ago

Yumin Xia wrote:

I have a simple plan: to use boost.log with 'asynchronous_sink' backend to replace the orignial logger.

And what about the platforms that don't have Boost.Log? Do we keep using the current logging system?

Actions #40

Updated by Yumin Xia over 8 years ago

yes. For boost version less than 1.55, will still compile the current logging system. But I only have a naive plan about this.

  1. in .wscript file, link boost log and dependencies if boost >= 1.55
  2. in LoggerFactory and Logger, use macro to choose to compile old design or boost log.

I will make a commit and let it be reviewed to see if it is OK.

Actions #41

Updated by Yumin Xia over 8 years ago

Hi guys,

I met a compile problem. GCC gave compiler error on this:

../daemon/face/udp-factory.cpp: In member function ‘std::shared_ptr<nfd::UdpChannel> nfd::UdpFactory::createChannel(const Endpoint&, const seconds&)’:
../daemon/face/udp-factory.cpp:113:11: error: no match for ‘operator=’ (operand types are ‘std::shared_ptr<nfd::UdpChannel>’ and ‘boost::detail::sp_if_not_array<nfd::UdpChannel>::type {aka boost::shared_ptr<nfd::UdpChannel>}’)
   channel = make_shared<UdpChannel>(endpoint, timeout);

It seems that it is using boost::make_shared instead of std::make_shared. I did not change anything in this file, so I guess it is because something wrong with my modifications in logger.hpp or logger-factory.hpp. Have you ever met this kind of problem before?

In logger.hpp, new included files are:

#include <boost/log/common.hpp>
#include <boost/log/sources/logger.hpp>

In logger-factory.hpp:

#include <boost/log/core.hpp>
#include <boost/log/expressions.hpp>
#include <boost/utility/empty_deleter.hpp>
Actions #42

Updated by Junxiao Shi over 8 years ago

Reply to note-41:

I've seen a similar error when I add #include <boost/thread.hpp>.

Strangely, this compilation error only occurs in UdpFactory::createChannel and StrategyInfoHost::getOrCreateStrategyInfo.

The fix is changing into std::make_shared in both places).

Actions #43

Updated by Yumin Xia about 8 years ago

The problem was solved by add prefix 'std::' as Junxiao replied last time. But I think this solution is too tricky, and it might raise compile errors in later development.

Another problem is how to do the unit-test. Since it is using asynchronous logging, there is a delay between logging and writing to clog, the old unit-test could not work as expected. I tried to flush the logging explicitly, but it do not work. So I change add a explicit flush before getting buffer.

Actions #44

Updated by Davide Pesavento about 8 years ago

  • Status changed from New to Code review
Actions #45

Updated by Yumin Xia about 8 years ago

Hey guys,

I have pushed a new patch set to gerrit. According to reviews, I have made changes to most of them, but some need to be discussed:

  1. where to define macro BOOST_LOG_DYN_LINK ( or BOOST_ALL_DYN_LINK). In patch set 4 I define it in the 'core/logger.hpp' file.
  2. Is it Ok to use 'namespace logging = boost::log' in cpp file?
  3. Building system failed in './waf confiigure' under Ubuntu 12.04. The error message is that log lib does not exist. However, it should not check boost::log if version is less than 1.54.
  4. After this commit, sometimes it would be necessary to explicitly use std::make_shared<>. (One thing I found might help is that, both files, in which this problem happened, do not include "common.hpp" directly in header files).
Actions #46

Updated by Junxiao Shi about 8 years ago

sometimes it would be necessary to explicitly use std::make_shared<>.
One thing I found might help is that, both files, in which this problem happened, do not include "common.hpp" directly in header files

I guess the problem occurs because something in Boost is included before common.hpp.
Try including common.hpp at the top of logger.hpp, and see whether this helps.

Actions #47

Updated by Davide Pesavento about 8 years ago

Yumin Xia wrote:

  1. where to define macro BOOST_LOG_DYN_LINK ( or BOOST_ALL_DYN_LINK). In patch set 4 I define it in the 'core/logger.hpp' file.

I think the best place would be the top-level wscript, but Alex is the build system expert.

  1. Is it Ok to use 'namespace logging = boost::log' in cpp file?

Yes, but why "logging" instead of "log"? I don't see the point in changing (and lengthening) the name...

  1. Building system failed in './waf confiigure' under Ubuntu 12.04. The error message is that log lib does not exist. However, it should not check boost::log if version is less than 1.54.

I guess conf.env.BOOST_VERSION_NUMBER is not initialized until conf.check_boost() is called.

Actions #48

Updated by Yumin Xia about 8 years ago

Junxiao Shi wrote:

sometimes it would be necessary to explicitly use std::make_shared<>.
One thing I found might help is that, both files, in which this problem happened, do not include "common.hpp" directly in header files

I guess the problem occurs because something in Boost is included before common.hpp.
Try including common.hpp at the top of logger.hpp, and see whether this helps.

There is already common.hpp at the top of logger.hpp, but it does not help.

Actions #49

Updated by Yumin Xia about 8 years ago

Davide Pesavento wrote:

Yes, but why "logging" instead of "log"? I don't see the point in changing (and lengthening) the name...

So should I just use boost::log?

I guess conf.env.BOOST_VERSION_NUMBER is not initialized until conf.check_boost() is called.

Yes, I check the .waf-tools/boost.py file, and it initialized in check_boost(). I could hack a little bit in boost.py file, but is it OK?

Actions #50

Updated by Alex Afanasyev about 8 years ago

The easiest workaround would be to add

conf.check_boost()

prior to version checking. In other words, this would check boost twice: once for just header files, the other one for header files and necessarily libraries.

Actions #51

Updated by Yumin Xia about 8 years ago

Some design decisions:

  1. Why not boost.log's severity_logger: Using severity_logger will change much more code than using a basic async logger. Since the original logger will still be there for some time, I suggest using basic logger. Besides, severity_logger does not offer more functions than the already implemented one.

  2. Using auto_flush. Accrording to http://www.boost.org/doc/libs/1_54_0/libs/log/doc/html/log/detailed/sink_backends.html , using auto_flush will degrade performance, but in case of an application crash, there is a good chance that last log records will be not be lost. I did a naive test of logging a string for 200000 times with and without auto_flush respectively, and use time to see its performance. The results are:

    with auto_flush: build/main 2> a.txt  7.19s user 3.74s system 133% cpu 8.184 total
    without : build/main 2> a.txt  6.69s user 3.74s system 123% cpu 8.420 total
    

Quesiton:

@Alex, Where could I define the BOOST_LOG_DYN_LINK in wscript? I tried to append it to conf.env.defines, but then ./waf configure failed.

Actions #52

Updated by Alex Afanasyev about 8 years ago

I would say, Boost.Log did something really really strange, complicating life of users. Static single threaded, static multithreaded, shared single threaded, shared multithreaded versions of the library use their own internal namespace. Moreover, each require proper macros defined.

I have updated boost detection tool (I also submitted a pull request to the waf upstream). I also changed to force multithreaded versions of all libraries.

Actions #53

Updated by Yumin Xia about 8 years ago

Hi Alex,

I have pulled the new patch-set, but it seems that it does not work correctly.

Checking boost includes : 1.55.0
Checking boost libs : ok
Checking for boost linkage : Could not link against boost libraries using supplied options
does not using boost_libs (This is a log I add when try..except...)

It fails to detect that boost.log is available in system.

Actions #54

Updated by Alex Afanasyev about 8 years ago

Which system you're testing it on? And how boost is installed (from packages or from source)?

Also. Show build/config.log, as it shows what exactly failed.

Actions #55

Updated by Yumin Xia about 8 years ago

The distribution version is Debian Jessie 8.1, and boost was installed from package manager apt-get.

Actions #56

Updated by Alex Afanasyev about 8 years ago

I have reproduced the problem and now at loss what to do. This is really Boost.Log problem. Specifically, the strange decision to have different naming schemes for:

  • static single-threaded version of library
  • static multi-threaded
  • dynamic single-threaded
  • dynamic multi-threaded

Static/dynamic fact is obvious. Multi/single-thrededness is normally identified by presence or absence of "-mt" suffix. This works in OSX. However, Ubuntu packaging have only one version, which I'm guessing is multithreaded, but doesn't have -mt suffix.

I can add additional logic to do extra guessing, but I don't fully understand the grounds for Boost.Log decision.

Actions #57

Updated by Davide Pesavento about 8 years ago

As far as I understand, boost dropped the -mt suffix for the multi-threaded variant in 1.40.0, but most distros kept a libboost_foo-mt.so -> libboost_foo.so symlink for backward compatibility reasons... until Debian/Ubuntu dropped the symlink sometime between 12.04 and 14.04.

So I think the general logic on Linux could be that we first look for libboost_foo-mt.so, and if that doesn't exist we fallback to libboost_foo.so, and we assume it's the multi-threaded variant (we require 1.48 so it should be... unless someone manually compiled only the single-threaded variant).

Actions #58

Updated by Davide Pesavento about 8 years ago

Note that the above applies to all boost libraries, not just to Boost.Log

Actions #59

Updated by Alex Afanasyev about 8 years ago

I'm not sure what you mean by "dropping support. There are still 4 versions installed, not just symlinks. At least on my system.

And this logic not going to work without some magic. On my system, libboost_log.so is single threaded and if attempted to link without BOOST_LOG_NO_THREADS, it fails. I'm really frustrated by this Boost.Log decision. I'm not aware of other boost libraries that do such crazy thing.

Actions #60

Updated by Davide Pesavento about 8 years ago

Are you talking about Linux or OSX? What I said applies to linux only. On OSX the situation is completely different: both single- and multi-threaded versions are built and installed separately.

Actions #61

Updated by Alex Afanasyev about 8 years ago

Yeah. I mentioned behavior on OSX, as I was observing it.

Actions #62

Updated by Junxiao Shi about 8 years ago

As of Change 2707 patchset 15, code looks mostly fine, but the Assignee needs to test its execution.

The test shall include at least the following scenarios:

  • normal NFD operation, default level set to INFO, light traffic (around 10 packets per second)
  • normal NFD operation, default level set to TRACE, moderate traffic (around 100 packets per second)
  • custom logger (it's sufficient to test the compilation with custom logger; it's unnecessary to test the operations)
  • NFD integration tests (I have triggered an automated execution and will upload the results)

The test report shall be posted under this issue.

Actions #63

Updated by Alex Afanasyev about 8 years ago

What exactly you expect to be tested? What metrics you expect to see?

Btw. I would appreciate if Yumin or somebody else point me to Boost.Log documentation where described how async logging works. Does it create thread on its own?

Is the buffer/queue infinite or finite? If the former, then we haven't really fixed much in this change :)

Actions #64

Updated by Davide Pesavento about 8 years ago

Alex Afanasyev wrote:

Btw. I would appreciate if Yumin or somebody else point me to Boost.Log documentation where described how async logging works. Does it create thread on its own?

http://www.boost.org/doc/libs/1_60_0/libs/log/doc/html/log/detailed/sink_frontends.html#log.detailed.sink_frontends.async

"All log records are passed to the backend in a dedicated thread, which makes it suitable for backends that may block for a considerable amount of time (network and other hardware device-related sinks, for example). The internal thread of the frontend is spawned on the frontend constructor and joined on its destructor (which implies that the frontend destruction may block)."

Is the buffer/queue infinite or finite? If the former, then we haven't really fixed much in this change :)

(I asked the same in note-11 but I guess we didn't discuss it)

Long story short, Boost.Log allows some compile-time customization via template arguments, see http://www.boost.org/doc/libs/1_60_0/libs/log/doc/html/log/detailed/sink_frontends.html#log.detailed.sink_frontends.async.customizing_record_queueing_strategy

The current patch set uses the default unbounded_fifo_queue. I think that's fine for this change. Changing the queuing policy should go in a separate commit.

Actions #65

Updated by Yumin Xia about 8 years ago

Alex Afanasyev wrote:

Btw. I would appreciate if Yumin or somebody else point me to Boost.Log documentation where described how async logging works. Does it create thread on its own?

Boost.log create async thread on it's own in the front-end construction function. Here in the codes, it happens in the Line 67th in logger-factory.cpp. (m_sink = boost::make_shared<Sink>(backend);)

Is the buffer/queue infinite or finite? If the former, then we haven't really fixed much in this change :)

It is now using the default unbounded_fifo_queue strategy.

Actions #66

Updated by Junxiao Shi about 8 years ago

Addition to note-62:

  • normal NFD operation, default level set to INFO, light traffic (around 10 packets per second)
    Expected: no crash; logs from each thread are on separate lines
  • normal NFD operation, default level set to TRACE, moderate traffic (around 100 packets per second)
    Expected: no crash; logs from each thread are on separate lines
  • custom logger
    Expected: code compiles with custom logger (it's unnecessary to test the operations)
  • NFD integration tests
    Expected: pass
Actions #67

Updated by Alex Afanasyev about 8 years ago

The failure of the integrated tests has nothing to do with this commit. I see that it has to do something with DNS / auto-registration.

Trying multicast discovery...
Stage 1 failed: Timeout
Sending DNS query for SRV record for _ndn._udp
ERROR: FaceUri canonization failed: Hostname resolution timed out
Actions #68

Updated by Alex Afanasyev about 8 years ago

  • Blocks Feature #3553: Logging facility: bounded record queue added
Actions #69

Updated by Alex Afanasyev about 8 years ago

Thanks Davide and Yumin for clarifying. We should definitely implement limited queue and I agree to make it a separate task (I just created an issue to track it).

Actions #71

Updated by Davide Pesavento about 8 years ago

Since this won't be merged for 0.4.1, and 0.4.2 will presumably happen after ubuntu 16.04 is released, can we remove all preprocessor conditionals and hard-depend on Boost.Log (i.e. boost >= 1.54) ?

Actions #72

Updated by Junxiao Shi about 8 years ago

  • Status changed from Code review to Feedback

Since this won't be merged for 0.4.1, and 0.4.2 will presumably happen after ubuntu 16.04 is released, can we remove all preprocessor conditionals and hard-depend on Boost.Log (i.e. boost >= 1.54) ?

I agree, although this implies the Change cannot merge until Ubuntu 16.04 release date.

Actions #73

Updated by Yumin Xia about 8 years ago

As Davide has replied on Gerrit, now() is not thread-safe. There are two ways to solve this problem:

one is to simply make now() thread-safe by drop 'static' of 'static char buffer[10 + 1 + 6 + 1];'(no overheads in terms of running, only sub more on ESP when entering the function).

The other is to write a custom attribute in boost.log.

Which one do you prefer?

Actions #74

Updated by Junxiao Shi about 8 years ago

Answer to note-73:

The first approach requires changing the return type of now() to std::string, otherwise it would return a pointer to a stack-allocated buffer which becomes invalid upon return.

Returning std::string causes a copy.

I'm unfamiliar with "custom attribute in boost.log" and will not comment on the second approach.

ndn-cxx-logging project adopts a different approach: introduce a custom type whose operator<< prints a timestamp.

This is thread-safe and does not copy the timestamp string into a std::string.
I prefer this approach, partially because I invented it.

Actions #75

Updated by Davide Pesavento about 8 years ago

Junxiao Shi wrote:

The first approach requires changing the return type of now() to std::string, otherwise it would return a pointer to a stack-allocated buffer which becomes invalid upon return.

Returning std::string causes a copy.

...and a memory allocation, which is very bad for performance, especially at that point in the code. I'm against this approach.

Actions #76

Updated by Junxiao Shi about 8 years ago

Answer to note-71:

20160414 call decides to make this commit without considering older Boost versions.

Actions #77

Updated by Junxiao Shi about 8 years ago

  • Blocked by Task #3599: Drop build support for Ubuntu 12.04 added
Actions #78

Updated by Davide Pesavento almost 8 years ago

  • Status changed from Feedback to In Progress
  • % Done changed from 0 to 80

Moving back to "in progress" then.

Actions #79

Updated by Davide Pesavento almost 8 years ago

  • Status changed from In Progress to Code review
  • % Done changed from 80 to 100
Actions #80

Updated by Yumin Xia almost 8 years ago

  • Status changed from Code review to Closed
Actions

Also available in: Atom PDF