Task #4488
Add ARM slaves to Jenkins (Closed)
100%
Description
x86 (32-bit) is becoming obsolete. Notably, Ubuntu 18.04 no longer supports x86. On the other hand, the ARM architecture is increasingly popular: Microsoft has released Windows 10 for ARM-based laptops.
This task is to add ARM slaves to Jenkins, as a first step toward adopting ARM as an officially supported platform.
Updated by Eric Newberry almost 7 years ago
Should we go for aarch32, aarch64, or both? It looks like Ubuntu only has an image for 64-bit ARM (and only 16.04 at that).
Updated by Davide Pesavento almost 7 years ago
This task requires discussion in an NFD call before any actions are taken.
Updated by Junxiao Shi almost 7 years ago
Should we go for aarch32, aarch64, or both?
I’m interested in ARMv7 (armhf) because all my toys have this CPU. Scaleway’s C1 instance is ARMv7 and they have Ubuntu images, although I prefer to use Debian instead because most of my toys don’t work well with Ubuntu.
Admittedly there is some personal agenda going on in this issue. I’m willing to sponsor 200 hours of Scaleway C1 instance fees for testing.
Updated by Nicholas Gordon over 6 years ago
WRT CI time, Memphis has been trying for some time, unsuccessfully, to get a second server online to run Jenkins instances. For whatever reason, Xen just will not work on it, probably hardware gremlins. Instead, we could use QEMU to test the ARM platform.
Regarding arm vs aarch64, while Ubuntu doesn't have an arm image according to Eric, we're all quite familiar with Raspbian images, which should do the job in that niche.
Updated by Junxiao Shi over 6 years ago
- Project changed from ndn-cxx to NFD
- Category changed from Build to Build
we could use QEMU to test the ARM platform.
QEMU is too slow. I'm aiming at a physical or virtual machine with a real ARM CPU.
we're all quite familiar with Raspbian images, which should do the job in that niche.
Yes but it's Debian.
Updated by Nicholas Gordon over 6 years ago
I suppose QEMU is slow, that's true. If we want to target it officially, wouldn't it be best to purchase a few boards? ODROID sells boards with both arm and aarch64 chips and are aimed at higher performance, while still being quite cheap.
Updated by Junxiao Shi over 6 years ago
If we want to target it officially, wouldn't it be best to purchase a few boards?
You'll have maintenance headaches with self-hosted devices. It's preferable to provision resources from the cloud (see note-5) rather than hosting them yourself.
Updated by Nicholas Gordon over 6 years ago
- Category deleted (Build)
Sure, cloud instances are very low-maintenance, but I expect that we would consume the cloud resources at a pretty high rate, especially if we are targeting multiple library versions, compilers, etc. That would cost a lot of money that I don't think the NDN project is willing to spend. Otherwise, we should agree on a pretty constrained target for arm/aarch64 to minimize the costs.
Updated by Junxiao Shi over 6 years ago
That would cost a lot of money that I don't think the NDN project is willing to spend
You can provision two C1 instances, the same as today's Jenkins slaves, so that the cost stays at six euros per month. New builds do not use new instances but have to wait, just like today.
The benefit is that you can replace a node instantly whenever something goes wrong (although you'll spend more because the "monthly" price applies to an instance only if it runs long enough consecutively).
Updated by Nicholas Gordon over 6 years ago
I don't really understand the impetus for deciding just now to use a cloud-based service. We have been doing it ourselves for quite some time without much incident, save for intermittent problems. We would be trading one headache for another: instead of having to deal with obscure or rare hardware problems, we would have to deal with cloud configuration problems.
ODROID boards, even the most powerful unit, would pay for themselves in less than a year based on your EUR 6/month figure. If we were spinning this up for the first time, I might be swayed. However, we already have Eric and, just recently, myself managing the build slaves. I fail to see how much work it could be after getting the initial kinks ironed out.
Addressing your complaint about not having Ubuntu -- there's a version of Ubuntu that's targeted for embedded, Ubuntu Core. However, I still maintain that Raspbian would be a better fit, as it has had much more time and support.
Updated by Eric Newberry over 6 years ago
Nicholas Gordon wrote:
Addressing your complaint about not having Ubuntu -- there's a version of Ubuntu that's targeted for embedded, Ubuntu Core. However, I still maintain that Raspbian would be a better fit, as it has had much more time and support.
There appears to be an official Ubuntu release for ARM (16.04 only): https://www.ubuntu.com/download/server/arm
Updated by Davide Pesavento over 6 years ago
Eric Newberry wrote:
There appears to be an official Ubuntu release for ARM (16.04 only): https://www.ubuntu.com/download/server/arm
That's only for arm64 (aka AArch64).
Updated by Eric Newberry over 6 years ago
Davide Pesavento wrote:
Eric Newberry wrote:
There appears to be an official Ubuntu release for ARM (16.04 only): https://www.ubuntu.com/download/server/arm
That's only for arm64 (aka AArch64).
Oh ok nevermind then.
Updated by Davide Pesavento over 6 years ago
Eric Newberry wrote:
Davide Pesavento wrote:
Eric Newberry wrote:
There appears to be an official Ubuntu release for ARM (16.04 only): https://www.ubuntu.com/download/server/arm
That's only for arm64 (aka AArch64).
Oh ok nevermind then.
Well, ideally we would add both arm(32) and aarch64, as the latter is becoming increasingly important in the mobile sector. So yours is still a valid suggestion.
For arm(32), both Raspbian and Ubuntu Core are good suggestions.
Updated by Eric Newberry over 6 years ago
- Status changed from New to In Progress
I've begun some tests on Raspbian to check how long it takes to build, etc. Ideally we should use Raspberry Pi 3 Model B units.
Updated by Eric Newberry over 6 years ago
Davide Pesavento wrote:
Eric Newberry wrote:
Davide Pesavento wrote:
Eric Newberry wrote:
There appears to be an official Ubuntu release for ARM (16.04 only): https://www.ubuntu.com/download/server/arm
That's only for arm64 (aka AArch64).
Oh ok nevermind then.
Well, ideally we would add both arm(32) and aarch64, as the latter is becoming increasingly important in the mobile sector. So yours is still a valid suggestion.
For arm(32), both Raspbian and Ubuntu Core are good suggestions.
It looks like Raspberry Pi 3's have 64-bit CPUs, while earlier models have 32-bit CPUs.
Updated by Junxiao Shi over 6 years ago
I've begun some tests on Raspbian to check how long it takes to build, etc. Ideally we should use Raspberry Pi 3 Model B units.
Using Pi 3 hardware and Ubuntu Server 16.04, it takes 40-50 minutes for a single release build without building or running tests. Adding the tests, I think it would be over one hour.
You can try to find out whether the bottleneck is CPU, memory, or disk. A CPU bottleneck has no solution. A memory bottleneck can be relieved by moving to the Scaleway cloud or using an Orange Pi Plus 2E, both with 2 GB of memory. A disk bottleneck can be relieved by mounting an NFS filesystem or a USB hard drive.
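A rough way to check while a build is running is sketched below; the swap heuristic is illustrative only, and a Linux /proc/meminfo is assumed:

```shell
#!/bin/sh
# Sample memory pressure while `./waf` runs in another terminal. If swap is
# in use, the build is likely memory-bound; otherwise check iowait (vmstat,
# top) to separate disk-bound from CPU-bound. Heuristic only.
check_bottleneck() {
    if [ ! -r /proc/meminfo ]; then
        echo "unknown (no /proc/meminfo)"
        return 0
    fi
    swap_used=$(awk '/SwapTotal/ {t=$2} /SwapFree/ {f=$2} END {print t-f}' /proc/meminfo)
    if [ "${swap_used:-0}" -gt 0 ]; then
        echo "memory-bound (${swap_used} kB of swap in use)"
    else
        echo "cpu- or disk-bound (check iowait with: vmstat 1)"
    fi
}

check_bottleneck
```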
It looks like Raspberry Pi 3's have 64-bit CPUs, while earlier models have 32-bit CPUs.
Hmm I’ll need some new kernels for my computer.
Updated by Davide Pesavento over 6 years ago
Eric Newberry wrote:
I've begun some tests on Raspbian to check how long it takes to build, etc. Ideally we should use Raspberry Pi 3 Model B units.
I wonder if you got some results on build times and memory consumption..?
Also, why are you saying "ideally... RPi 3"? The problem with the RPi is that it has little memory; I don't think we can use multiple build jobs (-jN) on it. And with 1 build job, we're not taking advantage of the multiple cores (and with only 1 core it will probably take too long).
Updated by Eric Newberry over 6 years ago
Davide Pesavento wrote:
Eric Newberry wrote:
I've begun some tests on Raspbian to check how long it takes to build, etc. Ideally we should use Raspberry Pi 3 Model B units.
I wonder if you got some results on build times and memory consumption..?
I stopped the build after it took 12 minutes to build 20 source files (besides the 20 or so files in tools/wrapper built at the beginning). This was done on a Pi 2 Model B running Raspbian. I don't believe I specified the -jN option, so it was likely building on all 4 cores.
Also, why are you saying "ideally... RPi 3"? The problem with the RPi is that it has little memory; I don't think we can use multiple build jobs (-jN) on it. And with 1 build job, we're not taking advantage of the multiple cores (and with only 1 core it will probably take too long).
Yes, it looks like they actually have the same amount of RAM (1 GB), and the CPU on the Pi 3 Model B is only 300 MHz faster than the one on the Pi 2 Model B.
Updated by Eric Newberry over 6 years ago
It looks like building ndn-cxx on the RPi 2 Model B in debug mode with tests (-j1) took approximately 161 minutes, including the configure step. In the middle of this, I stopped the build and, upon starting it again, it appeared to start over from the beginning, so I'm only including the second build attempt. Unit tests in non-root mode took approximately 10 minutes to run.
I'm still building NFD. I'll post the time results when I have them.
To get ./waf configure to find Boost, I had to add /usr/lib/arm-linux-gnueabihf to BOOST_LIBS in .waf-tools/boost.py.
Updated by Junxiao Shi over 6 years ago
I stopped the build and, upon starting it again, it appeared to start over from the beginning
If you did not run ./waf distclean, previously built objects are reused, and your timing is invalid.
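A sketch of one way to make such timings comparable: always distclean first so nothing is reused. The configure flags are just examples, and this assumes a waf-based checkout (ndn-cxx or NFD):

```shell
#!/bin/sh
# Time a from-scratch build so successive measurements are comparable.
# Run inside an ndn-cxx or NFD checkout; the flags are illustrative.
timed_clean_build() {
    ./waf distclean >/dev/null 2>&1 || true  # ok if never configured
    ./waf configure "$@" || return 1
    start=$(date +%s)
    ./waf || return 1
    echo "compile took $(( $(date +%s) - start ))s"
}
# e.g. timed_clean_build --with-tests --debug
```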
Please clarify: are you doing a “limited build” (single compilation) or a “full build” (debug, release, debug with unit tests)?
To get ./waf configure to find Boost, I had to add /usr/lib/arm-linux-gnueabihf to BOOST_LIBS in .waf-tools/boost.py.
Don’t. The specific strings for x86 and amd64 should be removed as well. See https://github.com/named-data/ppa-packaging/issues/15#issuecomment-269842891
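Assuming the project's boost tool accepts --boost-libs (upstream waf's boost tool does; whether .waf-tools/boost.py here exposes it is an assumption), the multiarch path could also be supplied at configure time rather than patched into the file:

```shell
#!/bin/sh
# Pass the Debian/Ubuntu multiarch Boost location at configure time instead
# of editing .waf-tools/boost.py. Assumes the boost tool accepts
# --boost-libs, as upstream waf's does.
configure_armhf() {
    ./waf configure --with-tests --debug \
        --boost-libs=/usr/lib/arm-linux-gnueabihf
}
```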
Updated by Eric Newberry over 6 years ago
Junxiao Shi wrote:
I stopped the build and, upon starting it again, it appeared to start over from the beginning
If you did not run ./waf distclean, previously built objects are reused, and your timing is invalid.
It stepped through every single file again.
Please clarify: are you doing a “limited build” (single compilation) or a “full build” (debug, release, debug with unit tests)?
Just a single compilation of debug with unit tests.
To get ./waf configure to find Boost, I had to add /usr/lib/arm-linux-gnueabihf to BOOST_LIBS in .waf-tools/boost.py.
Don’t. The specific strings for x86 and amd64 should be removed as well. See https://github.com/named-data/ppa-packaging/issues/15#issuecomment-269842891
This makes 0% difference in the timings.
Updated by Junxiao Shi over 6 years ago
To get ./waf configure to find Boost, I had to add /usr/lib/arm-linux-gnueabihf to BOOST_LIBS in .waf-tools/boost.py.
Don’t. The specific strings for x86 and amd64 should be removed as well. See https://github.com/named-data/ppa-packaging/issues/15#issuecomment-269842891
This makes 0% difference in the timings.
Yes, this has nothing to do with timing. It’s a proper method to fix the codebase for armhf and other architectures.
Updated by Nicholas Gordon over 6 years ago
Has anyone thought about just cross-compiling? I was talking to Ashlesh about setting up some kind of Pi cluster to get acceptable speeds, but then he suggested cross-compiling. Is there any specific reason we can't set up a cross-compilation environment, grab the resulting image, and store it somewhere?
For the unit tests we could spin up QEMU images just to run the tests, no compilation there.
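If booting full system images is the sticking point, user-mode QEMU (qemu-user-static with binfmt) can run individual ARM binaries on an x86 host without booting anything. A sketch, assuming the Debian/Ubuntu package layout; the sysroot path is the usual one for armhf cross packages:

```shell
#!/bin/sh
# Run an armhf test binary on an x86 host via user-mode QEMU, no VM boot.
# Assumes `apt install qemu-user-static`; skips if it is not installed.
run_arm_binary() {
    if ! command -v qemu-arm-static >/dev/null 2>&1; then
        echo "qemu-user-static not installed; skipping"
        return 0
    fi
    # -L points QEMU at the ARM sysroot for the dynamic linker and libraries
    qemu-arm-static -L /usr/arm-linux-gnueabihf "$@"
}
# e.g. run_arm_binary ./build/unit-tests
```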
I also suggested that we could set up a pi cluster to run each of the Jenkins tasks in parallel. We can't do any better within the individual task, but we could have four pis running:
- NFD with tests
- NFD without tests
- NFD with tests, statically linked
- NFD without tests, statically linked
etc. Even though the pis are a lot slower, if we could parallelize the tasks like that, it might work out to be a similar time.
Updated by Eric Newberry over 6 years ago
Nicholas Gordon wrote:
Has anyone thought about just cross-compiling? I was talking to Ashlesh about setting up some kind of pi cluster to get acceptable speeds, but then he suggested cross compiling. Is there any specific reason we can't go through the trouble of setting up the cross-compilation environment, grab that image and store it somewhere?
This would probably be acceptable.
For the unit tests we could spin up QEMU images just to run the tests, no compilation there.
I tried and tried and tried and tried to get a QEMU image of Ubuntu to boot, but was unable to get it working. So I'm not sure if this is feasible.
I also suggested that we could set up a pi cluster to run each of the Jenkins tasks in parallel. We can't do any better within the individual task, but we could have four pis running:
- NFD with tests
- NFD without tests
- NFD with tests, statically linked
- NFD without tests, statically linked
Nitpick: the configurations we build for NFD are actually:
- with tests
- with other tests
- with tests, debug, without PCH, potentially with ASAN (depending on platform)
All configurations are dynamically linked with ndn-cxx.
etc. Even though the pis are a lot slower, if we could parallelize the tasks like that, it might work out to be a similar time.
It took 171 minutes to build NFD on the Pi with tests in debug mode. Without PCH, it would likely be longer. I don't think this would be an acceptable build time, given that it currently takes about an hour to build/test NFD on Ubuntu agents (and about two on macOS agents).
Updated by Davide Pesavento over 6 years ago
Nicholas Gordon wrote:
We can't do any better within the individual task
There is always distcc...
Updated by Davide Pesavento over 6 years ago
Nicholas Gordon wrote:
Has anyone thought about just cross-compiling? I was talking to Ashlesh about setting up some kind of pi cluster to get acceptable speeds, but then he suggested cross compiling. Is there any specific reason we can't go through the trouble of setting up the cross-compilation environment, grab that image and store it somewhere?
I don't think anyone is excluding cross-compilation in principle. I was thinking about it some time ago, and I concluded that it could significantly complicate our Jenkins scripts, because the building+testing will then span multiple slaves, and I have no idea how to make that fit our current CI framework.
Updated by Junxiao Shi over 6 years ago
building+testing will then span multiple slaves, and I have no idea how to make that fit our current CI framework.
- Make the build machine a Jenkins slave.
- Pair a test machine with each build machine.
- In the build script, instead of invoking unit tests directly, install them into the test machine, and execute the tests over ssh.
- Run BOINC or a bitcoin miner on the test machines when idle, so they can pay for themselves.
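Step 3 could look roughly like the sketch below; the host name arm-test, the /tmp install path, and the unit-tests binary location are placeholders for illustration:

```shell
#!/bin/sh
# Copy a freshly cross-compiled unit-test binary to the paired ARM test
# machine and run it over ssh. "arm-test" and the /tmp path are placeholders.
run_remote_tests() {
    host="$1"; binary="$2"; shift 2
    scp "$binary" "$host:/tmp/$(basename "$binary")" || return 1
    ssh "$host" "/tmp/$(basename "$binary") $*"
}
# e.g. run_remote_tests arm-test build/unit-tests --log_level=test_suite
```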
Updated by Eric Newberry over 6 years ago
Another ARM machine has been purchased and I plan to conduct further testing on it.
Updated by Eric Newberry over 6 years ago
The new board (ODROID XU4) looks promising.
I built ndn-cxx and NFD 0.6.1 and got the following timings. I had to modify .waf-tools/boost.py to also search for Boost in /usr/lib/arm-linux-gnueabihf.
Timings for ndn-cxx:
./waf configure --with-tests --debug:
real 0m28.672s
user 0m23.970s
sys 0m4.578s
./waf -j4:
real 16m33.297s
user 58m41.969s
sys 4m52.230s
sudo ./waf install:
real 0m4.024s
user 0m0.974s
sys 0m0.862s
sudo ldconfig:
Almost instantaneous
build/unit-tests:
real 2m55.511s
user 2m39.875s
sys 0m2.545s
Timings for NFD:
./waf configure --with-tests --debug:
real 0m33.592s
user 0m28.468s
sys 0m5.021s
./waf -j4:
real 17m16.690s
user 60m45.778s
sys 5m18.359s
sudo ./waf install:
real 0m5.160s
user 0m0.899s
sys 0m1.042s
build/unit-tests-core:
real 0m0.788s
user 0m0.083s
sys 0m0.040s
build/unit-tests-daemon:
real 2m14.655s
user 0m5.420s
sys 0m4.314s
build/unit-tests-rib:
real 0m1.371s
user 0m1.110s
sys 0m0.066s
build/unit-tests-tools:
real 0m11.805s
user 0m0.981s
sys 0m0.470s
Updated by Eric Newberry over 6 years ago
Regarding the ODROID XU4 above, even though the system has 8 cores in a big.LITTLE configuration, I had to limit the build to 4 threads to avoid GCC being killed due to OOM. Other than this and the modifications to find Boost, it appears to work flawlessly.
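One way to pick -jN from available memory instead of core count is sketched below; the ~500 MB-per-job figure is a rough assumption, not a measured value:

```shell
#!/bin/sh
# Derive a -jN value from free memory rather than nproc: on 2 GB boards like
# the XU4, eight parallel g++ processes exhaust RAM before the cores do.
jobs_for_ram() {
    kb_per_job=512000  # assumed ~500 MB peak per compile job
    avail=$(awk '/MemAvailable/ {print $2}' /proc/meminfo 2>/dev/null)
    [ -z "$avail" ] && avail=$kb_per_job  # fall back to 1 job if unreadable
    jobs=$(( avail / kb_per_job ))
    [ "$jobs" -lt 1 ] && jobs=1
    echo "$jobs"
}
# e.g. ./waf -j"$(jobs_for_ram)"
```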
Updated by Davide Pesavento over 6 years ago
Is AddressSanitizer working on arm?
Eric Newberry wrote:
I had to modify .waf-tools/boost.py to also search for Boost in /usr/lib/arm-linux-gnueabihf.
Can you report a separate bug? I remember someone else reported this problem recently, but I can't find an open issue on redmine.
it appears to work flawlessly.
All tests pass? I'm amazed.
Updated by Eric Newberry over 6 years ago
Davide Pesavento wrote:
Is AddressSanitizer working on arm?
I don't see why it wouldn't, but I'll test tomorrow.
Eric Newberry wrote:
I had to modify .waf-tools/boost.py to also search for Boost in /usr/lib/arm-linux-gnueabihf.
Can you report a separate bug? I remember someone else reported this problem recently, but I can't find an open issue on redmine.
Will do.
Updated by Eric Newberry over 6 years ago
- Blocked by Task #4573: Detect architecture-specific location for Boost libs on Ubuntu added
Updated by Eric Newberry over 6 years ago
I rebuilt ndn-cxx with ASan enabled on the ODROID and ran the unit tests, encountering no issues. However, it did increase the build time, but only from 17-18 minutes to 23-24 minutes.
Updated by Nicholas Gordon over 6 years ago
I have the ODROID XU4 and I like it. I currently use it to host some old HDDs through the USB 3.0 port. Something to consider, since you said you couldn't compile with -j4: USB 3.0 has speeds fast enough to make using a good flash drive as extra RAM (swap) a reasonable idea. Just put the swapfile on the USB device, make it quite large, and you may be able to get decent build performance out of it. I don't have a USB 3.0 flash drive, or I could test on mine at home.
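The swapfile setup would look roughly like this; the mount path and 4 GB size are illustrative, and it requires root:

```shell
#!/bin/sh
# Create a swapfile on a USB 3.0 drive (mounted e.g. at /mnt/usb) so the
# compiler can overflow into it. Path and size are illustrative; needs root.
make_usb_swap() {
    swapfile="$1"   # e.g. /mnt/usb/swapfile
    size_mb="$2"    # e.g. 4096
    dd if=/dev/zero of="$swapfile" bs=1M count="$size_mb" 2>/dev/null || return 1
    chmod 600 "$swapfile"
    mkswap "$swapfile" >/dev/null 2>&1 || return 1
    swapon "$swapfile"  # add an /etc/fstab entry to persist across reboots
}
```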
Updated by Eric Newberry over 6 years ago
Nicholas Gordon wrote:
I have the odroid-xu4 and I like it. I currently use to host some old HDDs through the USB 3.0 port. Something to consider, since you said you couldn't compile with -j4, is that USB 3.0 has speeds fast enough to make using a good flash drive as extra RAM a reasonable idea. Just put the swapfile on the USB device, make it quite large, and you may be able to get decent build performance out of them. I don't have a USB 3.0 flash drive, or I could test on mine at home.
Interesting. I'll see if I can test this. It might improve compilation performance.
I was able to compile with -j4. It didn't work with the default -j value (which would be 8, due to big.LITTLE).
Updated by Eric Newberry over 6 years ago
I also tried ASan with NFD and it worked. However, I had to reduce the number of jobs to 3 to avoid running out of virtual memory. This made the build time increase to 28-29 minutes.
I also tested the privileged tests in NFD and they worked just fine.
Updated by Eric Newberry over 6 years ago
We have purchased two more ODROID XU4s and plan to connect all three to Jenkins soon.
Updated by Eric Newberry over 6 years ago
- Status changed from In Progress to Feedback
- % Done changed from 0 to 100
The ARM machines have been set up and connected to Jenkins. I've added them to all applicable projects (platform "Ubuntu-16.04-armhf"). I'm keeping this issue open for now in case any issues arise.
Updated by Eric Newberry over 6 years ago
- Status changed from Feedback to Closed
I reduced the number of build threads to the standard 2 for the ARM agents (from 3) to avoid any potential issues seen in earlier tests. I saw successful builds from both ndn-cxx and ndnSIM (also a failure on ndn-cxx, but it may have been related to too many concurrent threads), so I'm going to go ahead and close this issue.