Bug #2548


Logger: race condition between isEnabled and setLogLevel

Added by Davide Pesavento about 10 years ago. Updated almost 7 years ago.

Status: Closed
Priority: Normal
Category: Core
Target version: v0.7
Start date:
Due date:
% Done: 100%
Estimated time:

Description

The root issue is this:

  • setLogLevel is called by LoggerFactory::onConfig and LoggerFactory::setDefaultLevel, and is protected by LoggerFactory::m_loggersGuard.
  • isEnabled is called by NFD_LOG, and is not protected by any mutex.

A data race can result e.g. during a call to rib::Nrd::initializeLogging in the NRD thread, while the NFD thread is doing anything that involves printing a log message via NFD_LOG_*.

(A related issue is that the logging subsystem is basically initialized twice...)
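To make the pattern concrete, here is a hypothetical, stripped-down sketch of the two call sites (the names mirror Logger/LoggerFactory, but the code is simplified and not the actual NFD source): the write happens while the factory's mutex is held, and the read happens with no synchronization at all.

// Hypothetical, simplified sketch of the racy pattern (not the actual NFD code):
// one thread writes m_enabledLogLevel while holding LoggerFactory::m_loggersGuard,
// another thread reads it from NFD_LOG with no synchronization.

#include <cstdint>

class Logger
{
public:
  void
  setLogLevel(uint32_t level)      // called from LoggerFactory, m_loggersGuard held
  {
    m_enabledLogLevel = level;     // unsynchronized write (w.r.t. the reader)
  }

  bool
  isEnabled(uint32_t level) const  // called from NFD_LOG, no mutex held
  {
    return m_enabledLogLevel >= level;  // unsynchronized read: data race
  }

private:
  uint32_t m_enabledLogLevel = 0;
};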


Related issues: 1 (0 open, 1 closed)

Blocked by NFD - Task #4580: Reimplement NFD logging using ndn-cxx logging facility (Closed) - Davide Pesavento

Actions #1

Updated by Davide Pesavento about 10 years ago

  • Description updated (diff)
Actions #2

Updated by Joao Pereira almost 10 years ago

I see 3 options here:

1 - We start using g_logMutex inside the methods setLogLevel and isEnabled

This is a more drastic measure, because while one thread is writing a log message, no other thread can check whether logging is enabled or change the log level.

2 - We pass the LoggerFactory mutex to the Logger class by reference

This approach is more dangerous because it splits the locking between 2 different classes, which creates a potential for deadlock if future changes are not made carefully.

3 - Add a mutex to the Logger class

This approach is the easiest to implement, but having a mutex per logger looks like overkill.

==========

From my point of view, solution 3 is the one that would cause the fewest issues in the future; it may be a bit heavier on memory, but it can also be faster because we would not be serializing every logger action on a single global lock.
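A minimal sketch of option 3 (hypothetical, simplified Logger, not the actual NFD code): each Logger owns its own mutex, so setLogLevel and isEnabled synchronize with each other without taking the global LoggerFactory lock.

// Hypothetical sketch of option 3: a per-logger mutex protects m_enabledLogLevel.

#include <cstdint>
#include <mutex>

class Logger
{
public:
  void
  setLogLevel(uint32_t level)
  {
    std::lock_guard<std::mutex> lock(m_mutex);
    m_enabledLogLevel = level;
  }

  bool
  isEnabled(uint32_t level) const
  {
    std::lock_guard<std::mutex> lock(m_mutex);
    return m_enabledLogLevel >= level;
  }

private:
  mutable std::mutex m_mutex;      // owned by this logger only
  uint32_t m_enabledLogLevel = 0;
};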

Actions #3

Updated by Junxiao Shi almost 10 years ago

I believe this issue would disappear after #2513. If so, we don't need to do anything.

Actions #4

Updated by Joao Pereira almost 10 years ago

Not sure if it solves the problem: the race condition is between isEnabled and setLogLevel, so even if we move the log writing into a separate thread, isEnabled would be called in that thread while setLogLevel is still called from all the other threads. So the race condition remains, unless I misunderstood the change you want to implement in that issue.

Actions #5

Updated by Junxiao Shi almost 10 years ago

Looking at note-4, I agree that this race condition won't disappear with #2513.

If we don't change anything, how bad would the behavior be when a race condition happens?

If it's just printing / not printing a few log lines, this isn't worth a lock.

Actions #6

Updated by Davide Pesavento almost 10 years ago

A data race, akin to undefined behavior, can result in anything, from nothing to incorrect results to program termination. Moreover, it makes it impossible to run nfd under helgrind/DRD or with ThreadSanitizer. We cannot ignore this bug.

I don't remember the details now, but maybe it's possible to fix this without introducing additional locks by reordering the init sequence of the logging subsystem...

Actions #7

Updated by Junxiao Shi almost 10 years ago

Davide Pesavento wrote:

A data race, akin to undefined behavior, can result in anything, from nothing to incorrect results to program termination.

Please explain, at CPU instruction level, how this data race on Logger::m_enabledLogLevel can cause something worse than writing or not writing a few log lines.

Logger::m_enabledLogLevel is a 32-bit signed integer. Load/store operations on this type are atomic on all supported platforms.

Actions #8

Updated by Davide Pesavento almost 10 years ago

Junxiao, from your question it is clear that you have no idea what you're talking about... you're forgetting about so many things: compiler optimizations, reordering of loads/stores, memory fences, coherency on multi-processor systems, etc... Also "atomic" in this context doesn't mean what you think it means.

And even if "it's just printing / not printing a few log lines", isn't it still a bug? Why is it so hard to acknowledge it?
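To make one concrete failure mode explicit (a hypothetical illustration, not NFD code): with a plain non-atomic field and no synchronization, the compiler is allowed to assume the value cannot change behind the reading thread's back and may hoist the load out of a loop, so a concurrent setLogLevel may never be observed.

// Hypothetical illustration: the compiler may read m_enabledLogLevel once before
// the loop and reuse the cached value forever, because nothing in the program
// tells it the field can change concurrently.

#include <cstdint>

struct Logger
{
  uint32_t m_enabledLogLevel = 0;

  bool
  isEnabled(uint32_t level) const { return m_enabledLogLevel >= level; }
};

void
processPackets(const Logger& logger)
{
  while (true) {
    // a concurrent setLogLevel() on another thread may never become visible here
    if (logger.isEnabled(1)) {
      // ... emit a trace-level log line ...
    }
    // ... forward a packet ...
  }
}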

Actions #9

Updated by Junxiao Shi almost 10 years ago

So, we need to explicitly use std::atomic<uint32_t>?
I can smell a further worsening of logging performance.
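For comparison, a hedged sketch of what an atomic level check could look like (hypothetical; whether relaxed ordering is enough depends on whether other data must be published together with the level): a relaxed load on the isEnabled fast path is typically no more expensive than a plain load on common architectures.

// Hypothetical sketch: storing the enabled level in a std::atomic removes the
// data race without a mutex; relaxed loads keep the fast path cheap.

#include <atomic>
#include <cstdint>

class Logger
{
public:
  void
  setLogLevel(uint32_t level)
  {
    m_enabledLogLevel.store(level, std::memory_order_relaxed);
  }

  bool
  isEnabled(uint32_t level) const
  {
    return m_enabledLogLevel.load(std::memory_order_relaxed) >= level;
  }

private:
  std::atomic<uint32_t> m_enabledLogLevel{0};
};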

Actions #10

Updated by Davide Pesavento almost 10 years ago

Not necessarily. It should be possible to avoid locking with a minor redesign of the logging system. We need to:

  • ensure that each logger can be used at most by one thread
  • keep the locking around accesses to LoggerMap m_loggers
  • when the config is read/changed and the logging is (re-)initialized, call a per-thread function that updates m_enabledLogLevel (this doesn't require any additional locking if the two conditions above are met)
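A hypothetical sketch of this thread-confined design (simplified; the marshalling mechanism, e.g. posting a function to the owner thread's event loop, is assumed and not part of the current code): the factory only locks the shared logger map, and each level change is executed on the thread that owns the logger, so m_enabledLogLevel is only ever touched from a single thread.

// Hypothetical sketch: each Logger is owned by exactly one thread; level changes
// are marshalled to that thread, so no lock or atomic is needed on the read path.

#include <cstdint>
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <utility>

class Logger
{
public:
  explicit
  Logger(std::function<void(std::function<void()>)> postToOwnerThread)
    : m_postToOwnerThread(std::move(postToOwnerThread))
  {
  }

  // may be called from any thread: the actual write runs on the owner thread
  void
  requestLogLevel(uint32_t level)
  {
    m_postToOwnerThread([this, level] { m_enabledLogLevel = level; });
  }

  // must be called only on the owning thread (the one running NFD_LOG)
  bool
  isEnabled(uint32_t level) const { return m_enabledLogLevel >= level; }

private:
  std::function<void(std::function<void()>)> m_postToOwnerThread; // e.g. posts to that thread's event loop
  uint32_t m_enabledLogLevel = 0;
};

class LoggerFactory
{
public:
  void
  onConfig(uint32_t newLevel)
  {
    std::lock_guard<std::mutex> lock(m_loggersGuard); // lock only the shared map
    for (auto& entry : m_loggers) {
      entry.second.requestLogLevel(newLevel); // each logger updates itself on its own thread
    }
  }

private:
  std::mutex m_loggersGuard;
  std::map<std::string, Logger> m_loggers;
};
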
Actions #11

Updated by Alex Afanasyev about 9 years ago

  • Target version changed from v0.4 to v0.5
Actions #12

Updated by Junxiao Shi almost 9 years ago

Is this still relevant after switching to Boost.Log?

Actions #13

Updated by Davide Pesavento almost 9 years ago

Junxiao Shi wrote:

Is this still relevant after switching to Boost.Log?

Yes.

Actions #14

Updated by Davide Pesavento over 7 years ago

  • Target version changed from v0.5 to v0.6
Actions #15

Updated by Davide Pesavento about 7 years ago

  • Target version deleted (v0.6)
Actions #16

Updated by Davide Pesavento almost 7 years ago

  • Target version set to v0.7

This is rendered obsolete by #4580.

Actions #17

Updated by Davide Pesavento almost 7 years ago

  • Blocked by Task #4580: Reimplement NFD logging using ndn-cxx logging facility added
Actions #18

Updated by Davide Pesavento almost 7 years ago

  • Status changed from New to Closed
  • Assignee set to Davide Pesavento
  • % Done changed from 0 to 100

Logger/LoggerFactory are gone in NFD commit a3148081cef9d3f5f218948e4aa135ad8cf7ca45. Closing.
