Bug #2548

Logger: race condition between isEnabled and setLogLevel

Added by Davide Pesavento about 9 years ago. Updated almost 6 years ago.

Status:
Closed
Priority:
Normal
Category:
Core
Target version:
Start date:
Due date:
% Done:
100%

Estimated time:

Description

The root issue is this:

  • setLogLevel is called by LoggerFactory::onConfig and LoggerFactory::setDefaultLevel, and is protected by LoggerFactory::m_loggersGuard.
  • isEnabled is called by NFD_LOG, and is not protected by any mutex.

A data race can occur, for example, during a call to rib::Nrd::initializeLogging in the NRD thread while the NFD thread is doing anything that involves printing a log message via NFD_LOG_*.

(A related issue is that the logging subsystem is basically initialized twice...)
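
For illustration, a minimal sketch of the racy pattern described above; the class and member names are taken from the description, everything else is assumed and heavily simplified, not the actual NFD code:

    #include <map>
    #include <mutex>
    #include <string>

    enum class LogLevel { None, Error, Warn, Info, Debug, All };

    class Logger
    {
    public:
      // Called by the NFD_LOG_* macros on every log statement, with no lock held.
      bool
      isEnabled(LogLevel level) const
      {
        return m_enabledLogLevel >= level; // unsynchronized read
      }

      // Called while LoggerFactory::m_loggersGuard is held by the configuring thread.
      void
      setLogLevel(LogLevel level)
      {
        m_enabledLogLevel = level; // this write races with the read above
      }

    private:
      LogLevel m_enabledLogLevel = LogLevel::Info;
    };

    class LoggerFactory
    {
    public:
      // Invoked from onConfig() / setDefaultLevel(); only the writers take the
      // lock, so readers in other threads still race on m_enabledLogLevel.
      void
      setDefaultLevel(LogLevel level)
      {
        std::lock_guard<std::mutex> lock(m_loggersGuard);
        for (auto& entry : m_loggers)
          entry.second.setLogLevel(level);
      }

    private:
      std::mutex m_loggersGuard;
      std::map<std::string, Logger> m_loggers;
    };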


Related issues: 1 (0 open, 1 closed)

Blocked by NFD - Task #4580: Reimplement NFD logging using ndn-cxx logging facility (Closed, Davide Pesavento)

Actions #1

Updated by Davide Pesavento about 9 years ago

  • Description updated (diff)
Actions #2

Updated by Joao Pereira almost 9 years ago

I see 3 options here:

1 - We start using g_logMutex inside the methods setLogLevel and isEnabled

This is a more drastic measure, because while we are writing a log we cannot check another logger or change a level.

2 - We pass the mutex from LoggerFactory to the Logger class by reference

This approach is more dangerous because we would split the locking between 2 different classes, which could lead to a deadlock if future changes are not made carefully.

3 - Add a mutex to the Logger class

This approach is the easiest to implement, but having a mutex per logger looks like overkill.

==========

From my point of view, I believe solution 3 is the one that could cause the fewest issues in the future; it may be a bit heavier in memory, but it can also be faster because we will not be blocking every logger action on a single global mutex.
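
A minimal sketch of what option 3 could look like (illustrative only, with assumed names, not a patch against the actual NFD code):

    #include <mutex>

    enum class LogLevel { None, Error, Warn, Info, Debug, All };

    class Logger
    {
    public:
      bool
      isEnabled(LogLevel level) const
      {
        std::lock_guard<std::mutex> lock(m_mutex); // option 3: per-logger mutex
        return m_enabledLogLevel >= level;
      }

      void
      setLogLevel(LogLevel level)
      {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_enabledLogLevel = level;
      }

    private:
      mutable std::mutex m_mutex; // mutable so that isEnabled() can remain const
      LogLevel m_enabledLogLevel = LogLevel::Info;
    };

Since the critical sections are tiny and setLogLevel is only called when the configuration changes, a per-logger mutex mostly costs memory rather than time.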

Actions #3

Updated by Junxiao Shi almost 9 years ago

I believe this issue would disappear after #2513. If so, we don't need to do anything.

Actions #4

Updated by Joao Pereira almost 9 years ago

Not sure it solves the problem: the race condition is between isEnabled and setLogLevel, so even if we move the log writing into a separate thread, isEnabled is called in that thread while setLogLevel is called in all the other threads. So the race condition remains, unless I misunderstood the change you want to implement in that issue.

Actions #5

Updated by Junxiao Shi almost 9 years ago

Looking at note-4, I agree that this race condition won't disappear with #2513.

If we don't change anything, how bad would the behavior be when the race condition happens?

If it's just printing / not printing a few log lines, this isn't worth a lock.

Actions #6

Updated by Davide Pesavento almost 9 years ago

A data race, akin to undefined behavior, can result in anything, from nothing to incorrect results to program termination. Moreover, it makes it impossible to run nfd under helgrind/DRD or with ThreadSanitizer. We cannot ignore this bug.

I don't remember the details now, but maybe it's possible to fix this without introducing additional locks by reordering the init sequence of the logging subsystem...

Actions #7

Updated by Junxiao Shi almost 9 years ago

Davide Pesavento wrote:

A data race, akin to undefined behavior, can result in anything, from nothing to incorrect results to program termination.

Please explain, at the CPU instruction level, how this data race on Logger::m_enabledLogLevel can cause something worse than writing or not writing a few log lines.

Logger::m_enabledLogLevel is a 32-bit signed integer. Load/store operations on this type are atomic on all supported platforms.

Actions #8

Updated by Davide Pesavento almost 9 years ago

Junxiao, from your question it is clear that you have no idea what you're talking about... you're forgetting about so many things: compiler optimizations, reordering of loads/stores, memory fences, coherency on multi-processor systems, etc... Also "atomic" in this context doesn't mean what you think it means.

And even if "it's just printing / not printing a few log lines", isn't it still a bug? Why is it so hard to acknowledge it?

Actions #9

Updated by Junxiao Shi almost 9 years ago

So, we need to explicitly use std::atomic_uint32_t?
I can smell a further worsening of logging performance.
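
For context, this is roughly what an atomic level member could look like; the memory-order choice below is an assumption for illustration, not a decision from this thread:

    #include <atomic>
    #include <cstdint>

    enum class LogLevel : uint32_t { None, Error, Warn, Info, Debug, All };

    class Logger
    {
    public:
      bool
      isEnabled(LogLevel level) const
      {
        // A relaxed load is enough to remove the data race on the level itself;
        // on common platforms it compiles to an ordinary load, so the cost is small.
        return m_enabledLogLevel.load(std::memory_order_relaxed) >= level;
      }

      void
      setLogLevel(LogLevel level)
      {
        m_enabledLogLevel.store(level, std::memory_order_relaxed);
      }

    private:
      std::atomic<LogLevel> m_enabledLogLevel{LogLevel::Info}; // effectively an atomic uint32_t
    };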

Actions #10

Updated by Davide Pesavento almost 9 years ago

Not necessarily. It should be possible to avoid locking with a minor redesign of the logging system. We need to:

  • ensure that each logger can be used by at most one thread
  • keep the locking around accesses to LoggerMap m_loggers
  • when the config is read/changed and the logging is (re-)initialized, call a per-thread function that updates m_enabledLogLevel (this doesn't require any additional locking if the two conditions above are met); a rough sketch of this idea is shown below
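
A rough sketch of how that could fit together, with assumed names (updateLevels, getOwner) and heavy simplification:

    #include <map>
    #include <mutex>
    #include <string>
    #include <thread>

    enum class LogLevel { None, Error, Warn, Info, Debug, All };

    class Logger
    {
    public:
      // Condition 1: a logger is only ever used by the thread that created it,
      // so reads and writes of m_enabledLogLevel need no synchronization.
      bool
      isEnabled(LogLevel level) const
      {
        return m_enabledLogLevel >= level;
      }

      void
      setLogLevel(LogLevel level)
      {
        m_enabledLogLevel = level;
      }

      std::thread::id
      getOwner() const
      {
        return m_owner;
      }

    private:
      LogLevel m_enabledLogLevel = LogLevel::Info;
      std::thread::id m_owner = std::this_thread::get_id();
    };

    class LoggerFactory
    {
    public:
      // Hypothetical per-thread update function (condition 3): each thread calls
      // this after the config is read/changed and only touches its own loggers.
      // The mutex protects the map structure itself (condition 2), not the levels.
      void
      updateLevels(LogLevel newLevel)
      {
        std::lock_guard<std::mutex> lock(m_loggersGuard);
        for (auto& entry : m_loggers) {
          if (entry.second.getOwner() == std::this_thread::get_id())
            entry.second.setLogLevel(newLevel);
        }
      }

    private:
      std::mutex m_loggersGuard;
      std::map<std::string, Logger> m_loggers;
    };
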
Actions #11

Updated by Alex Afanasyev over 8 years ago

  • Target version changed from v0.4 to v0.5
Actions #12

Updated by Junxiao Shi almost 8 years ago

Is this still relevant after switching to Boost.Log?

Actions #13

Updated by Davide Pesavento almost 8 years ago

Junxiao Shi wrote:

Is this still relevant after switching to Boost.Log?

Yes.

Actions #14

Updated by Davide Pesavento over 6 years ago

  • Target version changed from v0.5 to v0.6
Actions #15

Updated by Davide Pesavento about 6 years ago

  • Target version deleted (v0.6)
Actions #16

Updated by Davide Pesavento about 6 years ago

  • Target version set to v0.7

This is rendered obsolete by #4580.

Actions #17

Updated by Davide Pesavento about 6 years ago

  • Blocked by Task #4580: Reimplement NFD logging using ndn-cxx logging facility added
Actions #18

Updated by Davide Pesavento almost 6 years ago

  • Status changed from New to Closed
  • Assignee set to Davide Pesavento
  • % Done changed from 0 to 100

Logger/LoggerFactory are gone in NFD commit a3148081cef9d3f5f218948e4aa135ad8cf7ca45. Closing.
