Project

General

Profile

Actions

Task #2541

closed

Enhance exception throwing and fatal log with Boost Exception library

Added by Alex Afanasyev about 9 years ago. Updated almost 8 years ago.

Status:
Closed
Priority:
Normal
Category:
Core
Target version:
Start date:
Due date:
% Done:

100%

Estimated time:
4.50 h

Description

In a real world NFD deployment, it is difficult to determine cause of a fatal error caused by an exception.
That either needs code modification, or running gdb with catch throw.

Boost has partial solution to the problem: Boost Exception library.
The exception is annotated with additional information, including: source name, function name, and line number of the place of the exception throwing.

This issue includes:

  • replace all throw exceptions with BOOST_THROW_EXCEPTION macros
  • in NFD main.cpp's top-level catch, print the annotated additional information into the fatal log

Related issues 1 (0 open1 closed)

Related to ndn-cxx - Task #2997: Enhance exception throwing with Boost Exception libraryClosedSpyros Mastorakis

Actions
Actions #1

Updated by Davide Pesavento about 9 years ago

I agree that our exception handling is not ideal at the moment.

Are you proposing to just use BOOST_THROW_EXCEPTION to augment the information contained in the exception message? Or more than that?

Actions #2

Updated by Alex Afanasyev about 9 years ago

To use the macro, we need also add boost::exception as another parent class for all our Error classes. After this change, yes, just replace throw constructs with BOOST_THROW_EXCETION would be enough to add annotation. The only another small change is adding part to extract annotations from the exceptions in catch blocks.

Actions #3

Updated by Davide Pesavento about 9 years ago

Alex Afanasyev wrote:

To use the macro, we need also add boost::exception as another parent class for all our Error classes.

Where does that requirement come from? By briefly reading the documentation, deriving from boost::exception doesn't seem to be needed.

Actions #4

Updated by Alex Afanasyev about 9 years ago

I think you're right. I somehow assumed that it is required. More careful read of http://www.boost.org/doc/libs/1_57_0/libs/exception/doc/enable_error_info.html is saying that it will create a new class derived from both exceptions when needed.

So, the task then is even more simple than I thought. We just need to change raw throw statements to BOOST_THROW_EXCEPTION and we will have much better information to debug problems.

Actions #5

Updated by Junxiao Shi about 9 years ago

I'm not convinced about the necessity.

What's wrong with commenting out the catch statements in main function, and run in GDB (without catch throw)?

Also, post an example output from Boost.Exception to illustrate what additional information is collected, and how they are displayed.

Actions #6

Updated by Alex Afanasyev about 9 years ago

On the running system, you don't have gdb. The only thing we see in some of the bug reports is that "[NFD] ". If we had information about who exactly throw the exception, we could get better idea what's going on without the need of gdb. It would work of course not in all cases, but at least some of the problems could get simpler diagnostics.

I will post some examples later.

Actions #7

Updated by Junxiao Shi about 9 years ago

On the running system, you don't have gdb. The only thing we see in some of the bug reports is that "[NFD] ". If we had information about who exactly throw the exception, we could get better idea what's going on without the need of gdb. It would work of course not in all cases, but at least some of the problems could get simpler diagnostics.

Does this mean exceptions in release builds are changed as well?

In this case, please provide benchmark result to prove its performance penalty is small.

Actions #8

Updated by Alex Afanasyev about 9 years ago

We are not relying on exceptions in any of the primary processing path, there is nothing to benchmark. Anything from boost already uses this technique, actually.

Actions #9

Updated by Junxiao Shi about 9 years ago

We are not relying on exceptions in any of the primary processing path, there is nothing to benchmark.

I just checked the processing of incomplete incoming packet on stream face, and it doesn't use exceptions (I thought it needs exceptions).

Looks like I'll need to work on #2261 soon.

Actions #10

Updated by Junxiao Shi about 9 years ago

  • Category set to Core

This Task involves:

  1. add a code-style rule to require using BOOST_THROW_EXCEPTION
  2. replace all throw in existing code with BOOST_THROW_EXCEPTION
  3. require BOOST_THROW_EXCEPTION in all new code

Shall we do a mass-replacement, or replace throws in a file as it's touched?

Actions #11

Updated by Alex Afanasyev about 9 years ago

There is 4th part of this task: enable extended error reporting in relevant catch blocks.

As for the question. My opinion is that this should be a mass-replacement change.

Actions #12

Updated by Davide Pesavento about 9 years ago

Alex Afanasyev wrote:

There is 4th part of this task: enable extended error reporting in relevant catch blocks.

Can you elaborate please? I thought this change was transparent for the exception "catching" code. Isn't the additional info added to the .what() string?

As for the question. My opinion is that this should be a mass-replacement change.

Agreed.

Actions #13

Updated by Alex Afanasyev about 9 years ago

Can you elaborate please? I thought this change was transparent for the exception "catching" code. Isn't the additional info added to the .what() string?

Per my understanding, additional information is just "attached" to the exception and not exposed with .what() call. The simplest opition is to use diagnostic_information processing http://www.boost.org/doc/libs/1_57_0/libs/exception/doc/diagnostic_information.html (I think it is also possible to extract manually, not yet sure how).

Actions #14

Updated by Junxiao Shi almost 9 years ago

  • Description updated (diff)
  • Estimated time set to 4.50 h

20150629 conference call approves this issue, and determines that it's suitable for a beginner.

Actions #15

Updated by Junxiao Shi almost 9 years ago

  • Subject changed from Refactor exceptions and exception throwing to Enhance exception throwing and fatal log with Boost Exception library
  • Description updated (diff)
Actions #16

Updated by Junxiao Shi almost 9 years ago

  • Related to Task #2997: Enhance exception throwing with Boost Exception library added
Actions #17

Updated by Spyros Mastorakis almost 9 years ago

  • Status changed from New to In Progress
  • Assignee set to Spyros Mastorakis
  • % Done changed from 0 to 10
Actions #18

Updated by Spyros Mastorakis over 8 years ago

  • Status changed from In Progress to Code review
  • % Done changed from 10 to 90
Actions #19

Updated by Junxiao Shi over 8 years ago

Don't forget to:

add a code-style rule to require using BOOST_THROW_EXCEPTION

This is listed as item 1 in note-10.

Alex's note-11 implicitly confirms the necessity of this code-style rule.

Actions #20

Updated by Junxiao Shi over 8 years ago

  • Status changed from Code review to Closed
  • % Done changed from 90 to 100
Actions #21

Updated by Alex Afanasyev over 8 years ago

  • Status changed from Closed to Code review
  • % Done changed from 100 to 90

We haven't finished this issue yet.

Actions #22

Updated by Spyros Mastorakis over 8 years ago

Could you please let me know what else needs to be done to finish this issue?

Actions #23

Updated by Davide Pesavento over 8 years ago

  • % Done changed from 90 to 80

Alex doesn't like the current output from the top-level catch, and asked if we can implement something custom instead of using boost::diagnostic_information.

I think we should merge the commit converting throw into BOOST_THROW_EXCEPTION as soon as possible in order to avoid continuous painful rebases. Printing the exception info should be split into a separate commit, as we may have to experiment a bit with different output formats.

Actions #24

Updated by Spyros Mastorakis over 8 years ago

Davide Pesavento wrote:

Alex doesn't like the current output from the top-level catch, and asked if we can implement something custom instead of using boost::diagnostic_information.

Yeah, I saw that Davide. My question was basically how I should proceed with it. :)

I think we should merge the commit converting throw into BOOST_THROW_EXCEPTION as soon as possible in order to avoid continuous painful rebases. Printing the exception info should be split into a separate commit, as we may have to experiment a bit with different output formats.

I will wait for a day for other people to express any objections to this suggestion. If nothing comes up, I will proceed in this way.

Thanks!

Actions #25

Updated by Junxiao Shi over 8 years ago

  • Status changed from Code review to Feedback

This issue is not in CodeReview because no Change is waiting for review.

Actions #26

Updated by Alex Afanasyev over 8 years ago

@Spyros. Given you're assignee for the issue, can you come up with a proposal/example on how error messages can/will look like? Davide already mentioned how this can be implemented (you can also use boost documentation and take a look how diagnosic_information function is implemented in the source code)

Actions #27

Updated by Spyros Mastorakis over 8 years ago

I submitted the follow-up commit on gerrit:

http://gerrit.named-data.net/#/c/2437/

Actions #28

Updated by Spyros Mastorakis over 8 years ago

With some basic testing (threw an exception in main.cpp before runner.initialize()), the exception would be displayed as follows:

1442260026.494885 FATAL: [NFD] Exception caught:

File: ../daemon/main.cpp

Line: 287

Function: int main(int, char **)

Exception info: std::exception

Any comments on this format?

Actions #29

Updated by Alex Afanasyev over 8 years ago

I would say, the error should be more user friendly. May be something like this:

1442260026.494885 FATAL: [NFD] Error statement from what() method
1442260026.494885 FATAL: [NFD]     from file ../daemon/main.cpp:287

Actions #30

Updated by Spyros Mastorakis over 8 years ago

I feel that the error message with these spaces in the second looks ugly. I think it would be better without the spaces:

With spaces:

1442343459.626724 FATAL: [NFD] bind: Permission denied
1442343459.626808 FATAL: [NFD]     no extended exception information available

Without spaces:

1442343459.626724 FATAL: [NFD] bind: Permission denied
1442343459.626808 FATAL: [NFD] no extended exception information available
Actions #31

Updated by Alex Afanasyev over 8 years ago

Without spaces or any other indication, the two error messages do not look related.

The second line could start with some other prefix (--->, +--->,===>) and the extended information can be included in parentheses or square braces.

Actions #32

Updated by Spyros Mastorakis over 8 years ago

How about this?

1442355727.170192 FATAL: [NFD] bind: Permission denied
1442355727.170284 FATAL: [NFD] ====> no extended exception information available
Actions #33

Updated by Davide Pesavento over 8 years ago

Spyros Mastorakis wrote:

How about this?

1442355727.170192 FATAL: [NFD] bind: Permission denied
1442355727.170284 FATAL: [NFD] ====> no extended exception information available

2 things:

  • if we're using a second line for the extended info, and extended info isn't available, just don't print the second line at all instead of printing a placeholder.

  • I would not use two NFD_LOG statements. The reason is that, since there's no locking, the two lines could be interleaved with unrelated output from another thread. This might not happen in practice at the moment, but it's better to be future-proof.

Actions #34

Updated by Alex Afanasyev over 8 years ago

I would agree with future-proof argument of not having multiple log statements. The reason I was inclined to have multiple statements is to ensure the error info and extra information of the info are clearly separated (on separate lines) without multi-line log statements, as we don't have them anywhere else.

I also have no problems of not showing anything when extended info is not available. We added this only to simplify implementation. With one log statement, there is no need for additional complexity.

Which way would you prefer:

  • multi-line

    1442355727.170192 FATAL: [NFD] Error message
    ====> from foo/bar.cpp:1211
    
  • single line

    1442355727.170192 FATAL: [NFD] Error message (from foo/bar.cpp:1211)
    

With multi-line way we can potentially include other information later, if we needed it. However, it may creates issues for log processing scripts, as they would need to properly handle multi-line log statements.

The one liner doesn't have the drawback, but may not be suitable for future extensions. Though, this may not be a big deal.

What do you think?

Actions #35

Updated by Spyros Mastorakis over 8 years ago

Any thoughts on this proposal?

Actions #36

Updated by Davide Pesavento over 8 years ago

I don't have a strong preference between the two, but at this point I think I'm leaning towards the one-line version. But I'd like to see the function name added, if it doesn't require demangling or if demangling is easy to do.

Actions #37

Updated by Alex Afanasyev over 8 years ago

If one-liner, I'm not sure it is a good idea to include function name. For example, here is a potential output, which kind of becoming cryptic at the end. Exception of other places can be even more cryptic...

1442819621.405291 FATAL: [NFD] Error processing configuration file nfd.conf: no module subscribed for section "-log" [from file ../core/config-file.cpp:43 in static void nfd::ConfigFile::throwErrorOnUnknownSection(const std::string &, const std::string &, const ConfigSection &, bool)]
Actions #38

Updated by Davide Pesavento over 8 years ago

I don't see a problem with that... it looks pretty legible to me. Function signatures can look scary, but I don't see how printing it on a separate line improves anything.

Actions #39

Updated by Davide Pesavento over 8 years ago

anything else to do here?

Actions #40

Updated by Davide Pesavento over 8 years ago

  • Status changed from Feedback to Closed
  • % Done changed from 80 to 100

I guess not.

Actions #41

Updated by Junxiao Shi almost 8 years ago

  • Status changed from Closed to Feedback

I notice a catch in NfdRunner::run is not displaying extended exception information.

https://gerrit.named-data.net/2973 fixes it.

Actions #42

Updated by Junxiao Shi almost 8 years ago

  • Status changed from Feedback to Closed
Actions

Also available in: Atom PDF