Project

General

Profile

Feature #3006

Improvements of string helpers

Added by Alex Afanasyev about 5 years ago. Updated about 5 years ago.

Status:
Closed
Priority:
Normal
Category:
Utils
Target version:
Start date:
07/02/2015
Due date:
% Done:

100%

Estimated time:

Description

There are a few things that needs to be improved with string helpers (util/string-helper.hpp):

  • test coverage (currently missing)
  • add additional toHex helper that outputs to the supplied output stream
  • introduce fromHex helper to convert the supplied string to shared_ptr<const Buffer>

After enabling new helpers, some of the code in the library should make use of them, instead of using CryptoPP routines.

#1

Updated by Junxiao Shi about 5 years ago

One of the functions declared in commit:01c44e4c60d33d742192ea65a1198736784ca8c7 is:

std::ostream&
toHex(std::ostream& os, const uint8_t* array, size_t arraySize, bool isUpperCase = true);

This function can be invoked as:

uint8_t ARR[] { 0x01, 0x02, 0x03 };
toHex(os << "before", ARR, sizeof(ARR), false) << "after";
// -- or, not utilizing the return value --
os << "before";
toHex(os, ARR, sizeof(ARR), false);
os << "after";

What about allowing this invocation syntax:

uint8_t ARR[] { 0x01, 0x02, 0x03 };
os << "before" << toHex(ARR, sizeof(ARR), false) << "after";

This could be achieved with a helper class:

class StringHexHelper
{
public:
  /** \brief obtain the hexadecimal representation as a string
   */
  operator std::string() const;

  /** \brief print the hexadecimal representation to a stream
   */
  std::ostream&
  operator<<(std::ostream& os) const;

private:
  StringHexHelper(const uint8_t* buffer, size_t count, bool isUpperCase = true);

  friend StringHexHelper toHex(const uint8_t* buffer, size_t count, bool isUpperCase = true);

private:
  const uint8_t* m_buffer;
  size_t m_count;
  bool m_isUpperCase;
};
#2

Updated by Alex Afanasyev about 5 years ago

I don't think this helper is really necessary, as I cannot think about use cases. Yes, it may simplify writing some of the code, and I would not disagree adding it if somebody really needs it.

The only place (so far) where toHex is really used is inside Digest class (I separated implementation into generic toHex, since I needed to print out hex values of already computed digest in ChronoShare).

The Digest class already provides the same functionality as you are suggesting for StringHexHelper.

#3

Updated by Junxiao Shi about 5 years ago

Reply to note-2:

If you don't want to design as note-1 yet, rename toHex to printHex, and change return type to void.

  • Per rule 2.25, it's wrong to have toHex method name but return a std::ostream.
  • Returning std::ostream causes ugly code as shown in note-1, unless the return value is ignored. If the return value is to be ignored, it shouldn't be returned in the first place.
#4

Updated by Junxiao Shi about 5 years ago

Question from commit:d91269c6a3cbcb62f3510474335b5672244883f9

  // @todo AA: Decide what to do with this. I'm not sure this is the correct error model (ignoring)
  BOOST_CHECK_EQUAL(unescape(test2), "Invalid escape %ZZ (not a hex value)");
  BOOST_CHECK_EQUAL(unescape(test3), "Invalid escape %a (should be two hex symbols)");
  BOOST_CHECK_EQUAL(unescape(test4), "Invalid escape %a");

Answer:

I didn't find a rule in RFC3986 about how to decode an invalid percent-encoded string.

I tested PHP, JavaScript, and Python.
All of them are returning "%ZZ" and "%a" unchanged, same as the above test case.

So I suggest defining this as the intended behavior.

#5

Updated by Junxiao Shi about 5 years ago

fromHex function duplicates the functionality of ndn::security::transform::HexDecode.
How should we resolve this duplication?

#6

Updated by Yingdi Yu about 5 years ago

I am not aware of this issue until Junxiao pointed out. The toHex/fromHex method in this task is different from ndn::security::transform::HexDecode/HexEncode, because the second one is designed for transformation pipeline. The transform classes may be placed in the middle of a transformation pipeline. Therefore we should keep ndn::security::transform::HexDecode/HexEncode, but I am not sure about toHex/fromHex method...

#7

Updated by Alex Afanasyev about 5 years ago

to/fromHex must exist, as they are convenience functions. Their internal implementation (if somebody really wants to) can be changed to use HexDecode/Encode.

However, HexDecode/Encode does not belong to security namespace.

#8

Updated by Junxiao Shi about 5 years ago

note-5 is not talking about eliminating one of the APIs, but is talking about using one to implement the other.

In commit:7e779655a350d4b0292387f4f11025e5978ec168, fromHex is implemented with using CryptoPP.

This implementation must be replaced with ndn::security::transform::HexDecode to achieve the #2949 goal of eliminating CryptoPP.

printHex is implemented with <iostream>.

Not changing to ndn::security::transform::HexEncode is acceptable to me, because the change would incur 2 additional copies in:
(1) constructing a std::stringstream for StreamSource, and
(2) HexEncode output buffer before writing to StreamSink.

Now the question is how to resolve the conflict of fromHex's usage of CryptoPP:

  • Proceed with CryptoPP implementation, and change to HexDecode under a subtask of #2949.
  • #3009 blocks #3006.

I have no preference between these two options. Assignee needs to make a choice.

#9

Updated by Yingdi Yu about 5 years ago

Junxiao Shi wrote:

Now the question is how to resolve the conflict of fromHex's usage of CryptoPP:

  • Proceed with CryptoPP implementation, and change to HexDecode under a subtask of #2949.
  • #3009 blocks #3006.

I prefer the first one. Given #3006 is in master branch, and #3009 will not be merged into master branch in a near future (e.g., next two weeks).

#10

Updated by Junxiao Shi about 5 years ago

  • Status changed from Code review to Closed

Also available in: Atom PDF