Project

General

Profile

Bug #4097

Misaligned memory accesses in TLV decoding

Added by Davide Pesavento about 3 years ago. Updated about 3 years ago.

Status:
Closed
Priority:
Low
Assignee:
Category:
Base
Target version:
Start date:
Due date:
% Done:

100%

Estimated time:
4.50 h

Description

Build with '--with-sanitizer=undefined --with-tests' and run the unit tests. These are just a few examples.

../tests/unit-tests/data.t.cpp(123): Entering test case "DataEqualityChecks"
../src/encoding/tlv.hpp:269:66: runtime error: load of misaligned address 0x55cf8ceb89b1 for type 'const uint32_t', which requires 4 byte alignment
0x55cf8ceb89b1: note: pointer points here
 00 00 00  fe ff ff ff ff 00 00 00  70 3b e9 8c cf 55 00 00  e0 02 ea 8c cf 55 00 00  01 00 00 00 00
              ^ 

../tests/unit-tests/data.t.cpp(318): Entering test case "FullName"
../src/encoding/tlv.hpp:451:66: runtime error: load of misaligned address 0x55cf8ceca85f for type 'const uint16_t', which requires 2 byte alignment
0x55cf8ceca85f: note: pointer points here
 14 04 19 02 27  10 15 08 53 55 43 43 45  53 53 21 16 03 1b 01 00  17 20 9c 97 0c f4 5b b2  e6 2c 06
             ^ 

../tests/unit-tests/encoding/tlv.t.cpp(77): Entering test case "ReadFromBuffer"
/home/davide/ndn-cxx/src/encoding/tlv.hpp:260:66: runtime error: load of misaligned address 0x55cf8825f673 for type 'const uint16_t', which requires 2 byte alignment
0x55cf8825f673: note: pointer points here
 00  01 fc fd 00 fd fe 00 01  00 00 ff 00 00 00 01 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00
              ^ 

../tests/unit-tests/face.t.cpp(65): Entering test case "ExpressInterestData"
../src/interest.cpp:70:62: runtime error: load of misaligned address 0x55cf8ceac9e2 for type 'const uint32_t', which requires 4 byte alignment
0x55cf8ceac9e2: note: pointer points here
 00 00  0a 04 64 6d c3 30 00 00  40 c9 ea 8c cf 55 00 00  20 00 00 00 00 00 00 00  31 00 00 00 00 00
              ^ 

../tests/unit-tests/lp/cache-policy.t.cpp(59): Entering test case "DecodeUnknownPolicyError"
../src/encoding/tlv.hpp:260:66: runtime error: load of misaligned address 0x55cf8826fb31 for type 'const uint16_t', which requires 2 byte alignment
0x55cf8826fb31: note: pointer points here
 00 00 00  fd 03 34 08 fd 03 35 04  ff ff ff ff 00 00 00 00  fd 03 34 05 fd 03 35 01  01 00 00 00 00
              ^ 

Files

c4012,1_benchmark.txz (1.7 KB) c4012,1_benchmark.txz Junxiao Shi, 07/08/2017 10:36 AM

Related issues

Related to NFD - Bug #2768: Misaligned memory accesses in TLV decoding (and probably elsewhere)Rejected

Actions
Related to ndn-cxx - Feature #4172: Optimize TLV decoding for contiguous memory inputClosedJunxiao Shi

Actions
#1

Updated by Davide Pesavento about 3 years ago

  • Follows Bug #2768: Misaligned memory accesses in TLV decoding (and probably elsewhere) added
#2

Updated by Davide Pesavento about 3 years ago

  • Follows deleted (Bug #2768: Misaligned memory accesses in TLV decoding (and probably elsewhere))
#3

Updated by Davide Pesavento about 3 years ago

  • Related to Bug #2768: Misaligned memory accesses in TLV decoding (and probably elsewhere) added
#5

Updated by Junxiao Shi about 3 years ago

Build with '--with-sanitizer=undefined --with-tests' and run the unit tests.

There are linker errors. Ubuntu 16.04 32-bit.

$ clang++ -v
clang version 3.8.0-2ubuntu4 (tags/RELEASE_380/final)
Target: i686-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Found candidate GCC installation: /usr/bin/../lib/gcc/i686-linux-gnu/5.4.0
Found candidate GCC installation: /usr/bin/../lib/gcc/i686-linux-gnu/6.0.0
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/5.4.0
Found candidate GCC installation: /usr/lib/gcc/i686-linux-gnu/6.0.0
Selected GCC installation: /usr/bin/../lib/gcc/i686-linux-gnu/5.4.0
Candidate multilib: .;@m32
Selected multilib: .;@m32

static library, gold

CXX=clang++ ./waf configure --enable-static --disable-shared --with-tests --debug --without-pch --with-sanitizer=undefined

[345/358] Linking build/tests/integrated/face
/usr/bin/../lib/gcc/i686-linux-gnu/5.4.0/../../../../include/c++/5.4.0/chrono:176: error: undefined reference to '__mulodi4'
/usr/include/boost/chrono/duration.hpp:277: error: undefined reference to '__mulodi4'
/usr/include/boost/chrono/duration.hpp:277: error: undefined reference to '__mulodi4'
/usr/include/boost/chrono/duration.hpp:277: error: undefined reference to '__mulodi4'
clang: error: linker command failed with exit code 1 (use -v to see invocation)

Waf: Leaving directory `/home/ubuntu/clang-ndn-cxx/build'
Build failed
 -> task in 'face' failed (exit status 1): 
    {task 3068172300L: cxxprogram face.cpp.1.o,identity-management-fixture.cpp.2.o -> face}
['clang++', '-fuse-ld=gold', '-Wl,-O1', '-fsanitize=undefined', '-pthread', 'tests/integrated/face.cpp.1.o', 'tests/identity-management-fixture.cpp.2.o', '-o', '/home/ubuntu/clang-ndn-cxx/build/tests/integrated/face', '-Wl,-Bstatic', '-L.', '-L/usr/lib/i386-linux-gnu', '-lndn-cxx', '-Wl,-Bdynamic', '-L/usr/lib/i386-linux-gnu', '-L/usr/lib', '-L/usr/lib', '-lboost_system', '-lboost_filesystem', '-lboost_date_time', '-lboost_iostreams', '-lboost_regex', '-lboost_program_options', '-lboost_chrono', '-lboost_thread', '-lboost_log', '-lboost_log_setup', '-lboost_unit_test_framework', '-lcryptopp', '-lssl', '-lcrypto', '-lsqlite3', '-lrt', '-lpthread']

static library, ld

revert ndn-cxx:commit:37271c5b48d60b05dd4f5aa5c719cf5e3e46bf7a

CXX=clang++ ./waf configure --enable-static --disable-shared --with-tests --debug --without-pch --with-sanitizer=undefined

[354/358] Linking build/tests/integrated/scheduler-benchmark
tests/integrated/scheduler-benchmark.cpp.3.o: In function `boost::chrono::detail::duration_cast_aux<boost::chrono::duration<long long, boost::ratio<1ll, 1ll> >, boost::chrono::duration<long long, boost::ratio<1ll, 1000000000ll> >, boost::ratio<1000000000ll, 1ll>, false, true>::operator()(boost::chrono::duration<long long, boost::ratio<1ll, 1ll> > const&) const':
/usr/include/boost/chrono/duration.hpp:277: undefined reference to `__mulodi4'
tests/integrated/scheduler-benchmark.cpp.3.o: In function `boost::chrono::detail::duration_cast_aux<boost::chrono::duration<long long, boost::ratio<1ll, 1000ll> >, boost::chrono::duration<long long, boost::ratio<1ll, 1000000000ll> >, boost::ratio<1000000ll, 1ll>, false, true>::operator()(boost::chrono::duration<long long, boost::ratio<1ll, 1000ll> > const&) const':
/usr/include/boost/chrono/duration.hpp:277: undefined reference to `__mulodi4'
tests/identity-management-fixture.cpp.2.o: In function `boost::chrono::detail::duration_cast_aux<boost::chrono::duration<int, boost::ratio<86400ll, 1ll> >, boost::chrono::duration<long long, boost::ratio<1ll, 1000000000ll> >, boost::ratio<86400000000000ll, 1ll>, false, true>::operator()(boost::chrono::duration<int, boost::ratio<86400ll, 1ll> > const&) const':
/usr/include/boost/chrono/duration.hpp:277: undefined reference to `__mulodi4'
./libndn-cxx.a(name-component.cpp.2.o): In function `boost::chrono::detail::duration_cast_aux<boost::chrono::duration<long long, boost::ratio<1ll, 1000000ll> >, boost::chrono::duration<long long, boost::ratio<1ll, 1000000000ll> >, boost::ratio<1000ll, 1ll>, false, true>::operator()(boost::chrono::duration<long long, boost::ratio<1ll, 1000000ll> > const&) const':
/usr/include/boost/chrono/duration.hpp:277: undefined reference to `__mulodi4'
./libndn-cxx.a(certificate.cpp.2.o): In function `boost::chrono::detail::duration_cast_aux<boost::chrono::duration<long long, boost::ratio<1ll, 1ll> >, boost::chrono::duration<long long, boost::ratio<1ll, 1000ll> >, boost::ratio<1000ll, 1ll>, false, true>::operator()(boost::chrono::duration<long long, boost::ratio<1ll, 1ll> > const&) const':
/usr/include/boost/chrono/duration.hpp:277: undefined reference to `__mulodi4'
./libndn-cxx.a(scheduler.cpp.2.o):/usr/include/boost/date_time/time_duration.hpp:111: more undefined references to `__mulodi4' follow
clang: error: linker command failed with exit code 1 (use -v to see invocation)

Waf: Leaving directory `/home/ubuntu/clang-ndn-cxx/build'
Build failed
 -> task in 'scheduler-benchmark' failed (exit status 1): 
    {task 3068178860L: cxxprogram scheduler-benchmark.cpp.3.o,identity-management-fixture.cpp.2.o -> scheduler-benchmark}
['clang++', '-fsanitize=undefined', '-pthread', 'tests/integrated/scheduler-benchmark.cpp.3.o', 'tests/identity-management-fixture.cpp.2.o', '-o', '/home/ubuntu/clang-ndn-cxx/build/tests/integrated/scheduler-benchmark', '-Wl,-Bstatic', '-L.', '-L/usr/lib/i386-linux-gnu', '-lndn-cxx', '-Wl,-Bdynamic', '-L/usr/lib/i386-linux-gnu', '-L/usr/lib', '-L/usr/lib', '-lboost_system', '-lboost_filesystem', '-lboost_date_time', '-lboost_iostreams', '-lboost_regex', '-lboost_program_options', '-lboost_chrono', '-lboost_thread', '-lboost_log', '-lboost_log_setup', '-lboost_unit_test_framework', '-lcryptopp', '-lssl', '-lcrypto', '-lsqlite3', '-lrt', '-lpthread']

shared library, gold

CXX=clang++ ./waf configure --with-tests --debug --without-pch --with-sanitizer=undefined

[354/359] Linking build/bin/ndnsec
/usr/include/boost/date_time/time_duration.hpp:282: error: undefined reference to '__mulodi4'
/usr/include/boost/date_time/time_system_counted.hpp:42: error: undefined reference to '__mulodi4'
/usr/include/boost/date_time/time_system_counted.hpp:92: error: undefined reference to '__mulodi4'
/usr/include/boost/date_time/time_duration.hpp:111: error: undefined reference to '__mulodi4'
clang: error: linker command failed with exit code 1 (use -v to see invocation)

Waf: Leaving directory `/home/ubuntu/clang-ndn-cxx/build'
Build failed
 -> task in '../bin/ndnsec' failed (exit status 1): 
    {task 3067674540L: cxxprogram cert-dump.cpp.1.o,cert-gen.cpp.1.o,cert-install.cpp.1.o,delete.cpp.1.o,export.cpp.1.o,get-default.cpp.1.o,import.cpp.1.o,key-gen.cpp.1.o,list.cpp.1.o,main.cpp.1.o,set-default.cpp.1.o,sign-req.cpp.1.o,unlock-tpm.cpp.1.o,util.cpp.1.o -> ndnsec}
['clang++', '-fuse-ld=gold', '-Wl,-O1', '-fsanitize=undefined', '-pthread', 'tools/ndnsec/cert-dump.cpp.1.o', 'tools/ndnsec/cert-gen.cpp.1.o', 'tools/ndnsec/cert-install.cpp.1.o', 'tools/ndnsec/delete.cpp.1.o', 'tools/ndnsec/export.cpp.1.o', 'tools/ndnsec/get-default.cpp.1.o', 'tools/ndnsec/import.cpp.1.o', 'tools/ndnsec/key-gen.cpp.1.o', 'tools/ndnsec/list.cpp.1.o', 'tools/ndnsec/main.cpp.1.o', 'tools/ndnsec/set-default.cpp.1.o', 'tools/ndnsec/sign-req.cpp.1.o', 'tools/ndnsec/unlock-tpm.cpp.1.o', 'tools/ndnsec/util.cpp.1.o', '-o', '/home/ubuntu/clang-ndn-cxx/build/bin/ndnsec', '-Wl,-Bstatic', '-L.', '-L/usr/lib/i386-linux-gnu', '-Wl,-Bdynamic', '-L.', '-L/usr/lib/i386-linux-gnu', '-L/usr/lib', '-L/usr/lib', '-lndn-cxx', '-lboost_system', '-lboost_filesystem', '-lboost_date_time', '-lboost_iostreams', '-lboost_regex', '-lboost_program_options', '-lboost_chrono', '-lboost_thread', '-lboost_log', '-lboost_log_setup', '-lboost_unit_test_framework', '-lcryptopp', '-lssl', '-lcrypto', '-lsqlite3', '-lrt', '-lpthread']
#6

Updated by Davide Pesavento about 3 years ago

Just follow the instructions, why are you adding random extra options? Don't use --debug with the sanitizers, it's useless and just slows everything down.
Also why are you using clang? In any case, clang 4.0.1 works for me.

#7

Updated by Junxiao Shi about 3 years ago

  • Status changed from New to In Progress
  • Assignee set to Junxiao Shi
  • Target version set to v0.6
  • Estimated time set to 4.50 h

why are you adding random extra options?

They are the "standard" options I've been using.

  • --enable-static --disable-shared selects static instead of shared library, so that I don't have to run sudo ./waf install before running unit tests.
  • --debug allows me to use GDB.
  • --without-pch catches missing #includes.

Don't use --debug with the sanitizers, it's useless and just slows everything down.

Also why are you using clang? In any case, clang 4.0.1 works for me.

ASan is only available in clang, isn't it?


I got this working on macOS. ./waf configure --enable-static --disable-shared --with-tests --without-pch --without-osx-keychain --with-sanitizer=undefined.
The output is below (test cases without errors are deleted):

monaco:ndn-cxx-dev shijunxiao$ build/unit-tests -t Encoding/TestTlv -l test_suite
Running 9 test cases...
Entering test module "ndn-cxx Tests"
../tests/unit-tests/encoding/block-helpers.t.cpp:31: Entering test suite "Encoding"
../tests/unit-tests/encoding/tlv.t.cpp:77: Entering test case "ReadFromBuffer"
/Users/shijunxiao/code/ndn-cxx-dev/src/encoding/tlv.hpp:261:24: runtime error: load of misaligned address 0x00010ffdbce3 for type 'const uint16_t' (aka 'const unsigned short'), which requires 2 byte alignment
0x00010ffdbce3: note: pointer points here
 00  01 fc fd 00 fd fe 00 01  00 00 ff 00 00 00 01 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00
              ^ 
/Users/shijunxiao/code/ndn-cxx-dev/src/encoding/tlv.hpp:270:24: runtime error: load of misaligned address 0x00010ffdbce6 for type 'const uint32_t' (aka 'const unsigned int'), which requires 4 byte alignment
0x00010ffdbce6: note: pointer points here
 fd 00 fd fe 00 01  00 00 ff 00 00 00 01 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ff ff
             ^ 
/Users/shijunxiao/code/ndn-cxx-dev/src/encoding/tlv.hpp:279:24: runtime error: load of misaligned address 0x00010ffdbceb for type 'const uint64_t' (aka 'const unsigned long long'), which requires 8 byte alignment
0x00010ffdbceb: note: pointer points here
 01  00 00 ff 00 00 00 01 00  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  ff ff 00 00 27 62 6f
              ^ 
../tests/unit-tests/encoding/tlv.t.cpp:77: Leaving test case "ReadFromBuffer"; testing time: 1030us
../tests/unit-tests/encoding/tlv.t.cpp:40: Leaving test suite "VarNumber"; testing time: 1669us
../tests/unit-tests/encoding/tlv.t.cpp:277: Entering test suite "NonNegativeInteger"
../tests/unit-tests/encoding/tlv.t.cpp:315: Entering test case "ReadFromBuffer"
/Users/shijunxiao/code/ndn-cxx-dev/src/encoding/tlv.hpp:461:24: runtime error: load of misaligned address 0x00010ffdc22a for type 'const uint32_t' (aka 'const unsigned int'), which requires 4 byte alignment
0x00010ffdc22a: note: pointer points here
 00 01  01 01 01 01 01 01 01 01  01 01 01 01 01 01 00 00  00 00 00 00 00 00 00 00  ff ff 00 00 27 62
              ^ 
/Users/shijunxiao/code/ndn-cxx-dev/src/encoding/tlv.hpp:470:24: runtime error: load of misaligned address 0x00010ffdc22e for type 'const uint64_t' (aka 'const unsigned long long'), which requires 8 byte alignment
0x00010ffdc22e: note: pointer points here
 01 01 01 01 01 01  01 01 01 01 01 01 00 00  00 00 00 00 00 00 00 00  ff ff 00 00 27 62 6f 6f  73 74
             ^ 
../tests/unit-tests/encoding/tlv.t.cpp:315: Leaving test case "ReadFromBuffer"; testing time: 317us
../tests/unit-tests/encoding/tlv.t.cpp:277: Leaving test suite "NonNegativeInteger"; testing time: 667us
../tests/unit-tests/encoding/tlv.t.cpp:405: Entering test suite "PrintHelpers"
../tests/unit-tests/encoding/tlv.t.cpp:407: Entering test case "PrintSignatureTypeValue"
../src/encoding/tlv.cpp:30:11: runtime error: load of value 4294967295, which is not a valid value for type 'const ndn::tlv::SignatureTypeValue'
../tests/unit-tests/encoding/tlv.t.cpp:407: Leaving test case "PrintSignatureTypeValue"; testing time: 183us
../tests/unit-tests/encoding/tlv.t.cpp:405: Leaving test suite "PrintHelpers"; testing time: 207us
../tests/unit-tests/encoding/tlv.t.cpp:35: Leaving test suite "TestTlv"; testing time: 2602us
../tests/unit-tests/encoding/block-helpers.t.cpp:31: Leaving test suite "Encoding"; testing time: 2625us
Leaving test module "ndn-cxx Tests"; testing time: 2674us

*** No errors detected

I find the PrintSignatureTypeValue error concerning: there seems to be another halt-and-catch-fire bug.

#8

Updated by Davide Pesavento about 3 years ago

Junxiao Shi wrote:

  • --enable-static --disable-shared selects static instead of shared library, so that I don't have to run sudo ./waf install before running unit tests.

This is a poor reason. Use LD_LIBRARY_PATH instead.

ASan is only available in clang, isn't it?

No. It's available in gcc since 4.8, and UBSan since 4.9.

#9

Updated by Junxiao Shi about 3 years ago

https://gerrit.named-data.net/4012

Old code needs more than just InputIterator. New code requires InputIterator and nothing more, which is assured by a Boost archetype check.

Benchmark on Ubuntu 16.04 32-bit indicates new code is 50% slower with 3-octet VarNumber.
Configure line was ./waf configure --enable-static --disable-shared --with-tests.
Numbers shown are nanoseconds for 100 million iterations, so the absolute time per decode is small.

octets offset old avg old stdev new avg new stdev new/old
3 0 812482053.6 701817.6516277572 1191746895.4 1593530.9325787497 1.4668
3 1 774634608 1319320.5595839852 1185522397.8 1739824.4589390908 1.53043
3 2 782558356.2 60501227.94289818 1185481972.2 1861340.424274587 1.51488
3 3 788475709.2 948486.7703187536 1192259314 2702461.5261764596 1.51211
3 4 779499019 646236.0000595758 1186723887.4 2237346.5582245635 1.52242
3 5 773129479 360371.03450332966 1211794156 1902999.727570527 1.56739
3 6 762311734.6 2132297.9371582433 1210019451 2471185.3215690847 1.5873
3 7 788610076.8 480931.3301851939 1192905055.4 1331583.3891930315 1.51267
5 0 799881597.4 28430233.06545826 1651223595.8 1481778.2679074153 2.06434
5 1 811857209.8 35273182.40241241 1663190552.8 1417013.2396520507 2.04862
5 2 800209664.6 27915129.02644787 1649934943 1131077.2531551502 2.06188
5 3 796349938.6 1095874.877914354 1661255365.4 2476026.4300833503 2.08609
5 4 787944984 999771.4858626445 1646849213.2 1212937.0703578154 2.09006
5 5 787570458.6 1363913.150264818 1648333275.2 2273197.2348053525 2.09293
5 6 796333850.6 1415933.5906098492 1660508521.6 964842.0505838249 2.08519
5 7 788270998.2 1387687.358025827 1648317067.8 2358090.135777829 2.09105
9 0 821622216.2 33191559.53856336 2404011805.4 946031.4809153023 2.92593
9 1 804525165.8 35446554.45638711 2400989705 1880422.884851038 2.98436
9 2 821298807.2 32628174.377149355 2401157826.2 1233170.9520174807 2.92361
9 3 788152437.2 1367256.6681333832 2400973331.8 3744956.911946945 3.04633
9 4 807503746 1777432.257083656 2398303221.8 1467681.7745363605 2.97002
9 5 806602488.2 542079.9331820908 2399626818.8 1878104.450648499 2.97498
9 6 787717199.4 267770.6504357787 2400369779.8 2419781.950184913 3.04725
9 7 806359870 263729.44687975215 2400726222 1417025.392265255 2.97724
#10

Updated by Davide Pesavento about 3 years ago

Junxiao Shi wrote:

Old code needs more than just InputIterator. New code requires InputIterator and nothing more, which is assured by a Boost archetype check.

While we probably do need a version that works on input iterators, in the vast majority of cases decoding operates on memory buffers, which allow random access iterators (or even contiguous). If the decoding can be made more efficient by leveraging stronger properties offered by the iterator, then we should definitely take advantage of that.

Benchmark on Ubuntu 16.04 32-bit indicates new code is 50% slower with 3-octet VarNumber.

Have you tried std::copy and/or memcpy?

#11

Updated by Junxiao Shi about 3 years ago

  • Related to Feature #4172: Optimize TLV decoding for contiguous memory input added
#12

Updated by Junxiao Shi about 3 years ago

in the vast majority of cases decoding operates on memory buffers, which allow random access iterators (or even contiguous).

RandomAccessIterator won't help because it still only allows reading one octet at a time.
Contigous memory allows memcpy. This is to be investigated in #4172.

#13

Updated by Davide Pesavento about 3 years ago

This is to be investigated in #4172.

I disagree with this way of proceeding.
As you said yourself, the current code already requires more than an input iterator. Specifically, it assumes that all read bytes are stored in contiguous memory locations. It makes no sense to weaken this requirement and make decoding ~1.5-3x slower because of that. NFD is already slow enough. If we can take advantage of the current de facto requirement of contiguous memory, we should do it.

#14

Updated by Davide Pesavento about 3 years ago

memcpy: https://godbolt.org/g/bGJees
unaligned access: https://godbolt.org/g/6Cd49X

Both gcc and clang generate exactly the same assembly code in the two cases. This is true for 2-, 4-, and 8-byte reads, and on x86_64, arm64 (aka aarch64), ppc, and ppc64. I suppose that means the compiler knows that unaligned accesses are safe (and efficient) on those architectures, and optimizes the memcpy appropriately.

#15

Updated by Junxiao Shi about 3 years ago

  • Status changed from Code review to Closed

Benchmark is merged. Bugfix itself is in #4172.

Also available in: Atom PDF