Bug #2548
closedLogger: race condition between isEnabled and setLogLevel
100%
Description
The root issue is this:
setLogLevel
is called byLoggerFactory::onConfig
andLoggerFactory::setDefaultLevel
, and is protected byLoggerFactory::m_loggersGuard
.isEnabled
is called byNFD_LOG
, and is not protected by any mutex.
A data race can result e.g. during a call to rib::Nrd::initializeLogging
in the NRD thread, while the NFD thread is doing anything that involves printing a log message via NFD_LOG_*
.
(A related issue is that the logging subsystem is basically initialized twice...)
Updated by Joao Pereira over 9 years ago
I see 2 options here:
1 - We start using g_logMutex inside the methods setLogLevel and isEnabled
This is a more drastic measure because when we are writing a log we cannot check another log or change level.
2 - We pass by ref to the Logger class the mutex from LoggerFactory
This approach is more dangerous because we will split the locking between 2 different classes and this will make a potential deadlock if future changes are not done careful
3 - Add a mutex to the Logger class
This approach is the easiest to implement, but having a mutex per logger looks like a overkill
==========
From my point of view I believe the solution 3 is the one that a could cause less issues in a future, maybe it would bit a little more heavy in memory but it can also be faster because we will not be blocking every logger action.
Updated by Junxiao Shi over 9 years ago
I believe this issue would disappear after #2513. If so, we don't need to do anything.
Updated by Joao Pereira over 9 years ago
Not sure if it solves the problem, because the race condition is between isEnabled and setLogLevel, so even if we move the log writing into a separated thread the isEnabled is called in that thread but setLogLevel is called in all the other Threads. So the race condition remains, unless I misunderstood the change you want to implement in that issue.
Updated by Junxiao Shi over 9 years ago
Looking at note-4, I agree that this race condition won't disappear with #2513.
If we don't change anything, when a race condition happens, how bad the behavior would be?
If it's just printing / not printing a few log lines, this isn't worth a lock.
Updated by Davide Pesavento over 9 years ago
A data race, akin to undefined behavior, can result in anything, from nothing to incorrect results to program termination. Moreover, it makes it impossible to run nfd under helgrind/DRD or with ThreadSanitizer. We cannot ignore this bug.
I don't remember the details now, but maybe it's possible to fix this without introducing additional locks by reordering the init sequence of the logging subsystem...
Updated by Junxiao Shi over 9 years ago
A data race, akin to undefined behavior, can result in anything, from nothing to incorrect results to program termination.
Please explain, at CPU instruction level, how this data race on Logger::m_enabledLogLevel
can cause something worse than writing or not writing a few log lines.
Logger::m_enabledLogLevel
is a 32-bit signed integer. Load/save operation on this type is atomic on all supported platforms.
Updated by Davide Pesavento over 9 years ago
Junxiao, from your question it is clear that you have no idea what you're talking about... you're forgetting about so many things: compiler optimizations, reordering of loads/stores, memory fences, coherency on multi-processor systems, etc... Also "atomic" in this context doesn't mean what you think it means.
And even if "it's just printing / not printing a few log lines", isn't it still a bug? Why is it so hard to acknowledge it?
Updated by Junxiao Shi over 9 years ago
So, we need to explicitly use std::atomic_uint32
?
I can smell a further worsening of logging performance.
Updated by Davide Pesavento over 9 years ago
Not necessarily. It should be possible to avoid locking with a minor redesign of the logging system. We need to:
- ensure that each logger can be used at most by one thread
- keep the locking around accesses to
LoggerMap m_loggers
- when the config is read/changed and the logging is (re-)initialized, call a per-thread function that updates
m_enabledLogLevel
(this doesn't require any additional locking if the two conditions above are met)
Updated by Alex Afanasyev about 9 years ago
- Target version changed from v0.4 to v0.5
Updated by Junxiao Shi over 8 years ago
Is this still relevant after switching to Boost.Log?
Updated by Davide Pesavento over 8 years ago
Junxiao Shi wrote:
Is this still relevant after switching to Boost.Log?
Yes.
Updated by Davide Pesavento over 7 years ago
- Target version changed from v0.5 to v0.6
Updated by Davide Pesavento over 6 years ago
- Target version set to v0.7
This is rendered obsolete by #4580
Updated by Davide Pesavento over 6 years ago
- Blocked by Task #4580: Reimplement NFD logging using ndn-cxx logging facility added
Updated by Davide Pesavento over 6 years ago
- Status changed from New to Closed
- Assignee set to Davide Pesavento
- % Done changed from 0 to 100
Logger/LoggerFactory are gone in NFD commit a3148081cef9d3f5f218948e4aa135ad8cf7ca45
. Closing.