Bug #2548
Logger: race condition between isEnabled and setLogLevel
Status: Closed — 100% done
Description

The root issue is this: setLogLevel is called by LoggerFactory::onConfig and LoggerFactory::setDefaultLevel, and is protected by LoggerFactory::m_loggersGuard. isEnabled is called by NFD_LOG, and is not protected by any mutex.

A data race can result, e.g. during a call to rib::Nrd::initializeLogging in the NRD thread, while the NFD thread is doing anything that involves printing a log message via NFD_LOG_*.

(A related issue is that the logging subsystem is basically initialized twice...)
Updated by Joao Pereira almost 10 years ago
I see 3 options here:

1 - We start using g_logMutex inside the methods setLogLevel and isEnabled.
This is a more drastic measure, because while we are writing a log we cannot check another logger or change a level.

2 - We pass the mutex from LoggerFactory to the Logger class by reference.
This approach is more dangerous, because we would split the locking between two different classes, creating a potential deadlock if future changes are not made carefully.

3 - We add a mutex to the Logger class.
This approach is the easiest to implement, but having a mutex per logger looks like overkill.
==========
From my point of view, I believe solution 3 is the one that would cause the fewest issues in the future. It may be a bit heavier in memory, but it can also be faster, because we will not be blocking every logger action.
Updated by Junxiao Shi almost 10 years ago
I believe this issue would disappear after #2513. If so, we don't need to do anything.
Updated by Joao Pereira almost 10 years ago
I'm not sure it solves the problem: the race condition is between isEnabled and setLogLevel, so even if we move the log writing into a separate thread, isEnabled is called in that thread while setLogLevel is called in all the other threads. So the race condition remains, unless I misunderstood the change you want to implement in that issue.
Updated by Junxiao Shi almost 10 years ago
Looking at note-4, I agree that this race condition won't disappear with #2513.
If we don't change anything, how bad would the behavior be when a race condition happens? If it's just printing / not printing a few log lines, this isn't worth a lock.
Updated by Davide Pesavento almost 10 years ago
A data race, akin to undefined behavior, can result in anything, from nothing to incorrect results to program termination. Moreover, it makes it impossible to run nfd under helgrind/DRD or with ThreadSanitizer. We cannot ignore this bug.
I don't remember the details now, but maybe it's possible to fix this without introducing additional locks by reordering the init sequence of the logging subsystem...
Updated by Junxiao Shi almost 10 years ago
Davide Pesavento wrote:
A data race, akin to undefined behavior, can result in anything, from nothing to incorrect results to program termination.

Please explain, at the CPU instruction level, how this data race on Logger::m_enabledLogLevel can cause something worse than writing or not writing a few log lines. Logger::m_enabledLogLevel is a 32-bit signed integer; a load/store operation on this type is atomic on all supported platforms.
Updated by Davide Pesavento almost 10 years ago
Junxiao, from your question it is clear that you have no idea what you're talking about... you're forgetting about so many things: compiler optimizations, reordering of loads/stores, memory fences, coherency on multi-processor systems, etc... Also "atomic" in this context doesn't mean what you think it means.
And even if "it's just printing / not printing a few log lines", isn't it still a bug? Why is it so hard to acknowledge it?
Updated by Junxiao Shi almost 10 years ago
So, we need to explicitly use std::atomic_uint32? I can smell a further worsening of logging performance.
Updated by Davide Pesavento almost 10 years ago
Not necessarily. It should be possible to avoid locking with a minor redesign of the logging system. We need to:
- ensure that each logger can be used by at most one thread
- keep the locking around accesses to LoggerMap m_loggers
- when the config is read/changed and the logging is (re-)initialized, call a per-thread function that updates m_enabledLogLevel (this doesn't require any additional locking if the two conditions above are met)
Updated by Alex Afanasyev about 9 years ago
- Target version changed from v0.4 to v0.5
Updated by Junxiao Shi almost 9 years ago
Is this still relevant after switching to Boost.Log?
Updated by Davide Pesavento almost 9 years ago
Junxiao Shi wrote:
Is this still relevant after switching to Boost.Log?
Yes.
Updated by Davide Pesavento over 7 years ago
- Target version changed from v0.5 to v0.6
Updated by Davide Pesavento almost 7 years ago
- Target version set to v0.7
This is rendered obsolete by #4580
Updated by Davide Pesavento almost 7 years ago
- Blocked by Task #4580: Reimplement NFD logging using ndn-cxx logging facility added
Updated by Davide Pesavento almost 7 years ago
- Status changed from New to Closed
- Assignee set to Davide Pesavento
- % Done changed from 0 to 100
Logger/LoggerFactory are gone in NFD commit a3148081cef9d3f5f218948e4aa135ad8cf7ca45. Closing.