Bug #1790
Closed
NLSR throws exception and quits if executed simultaneously on all machines in emulab environment.
0%
Description
I am running NLSR on 10 Emulab nodes. A script starts nfd first and then nlsr. If I run this script manually on each node, with a 2 to 3 second delay when moving from one node to the next, nlsr works fine. However, when I execute the script simultaneously on all nodes (using a scheduler or by broadcasting the command to all terminals), nlsr quits with the following error:
terminate called after throwing an instance of 'ndn::SecPublicInfoSqlite3::Error'
what(): Key does not exist:/ndn/caida/%C1.Router/router1/NLSR/ksk-1406145072693
At one point I also got this error:
nlsr: /usr/include/boost/smart_ptr/shared_ptr.hpp:418: boost::shared_ptr<T>::reference boost::shared_ptr<T>::operator*() const [with T = ndn::IdentityCertificate, boost::shared_ptr<T>::reference = ndn::IdentityCertificate&]: Assertion `px != 0' failed.
The home folder in Emulab is shared among all nodes, and I guess that is where the keys are stored. I am not sure, but could concurrent access to that folder be causing this error?
Updated by Alex Afanasyev over 10 years ago
A shared home folder is definitely a problem. The certificates/keys are generated and stored under $HOME/.ndn.
I would strongly recommend setting the HOME variable to a machine-specific folder (e.g., /tmp/nlsr) before running NLSR, NFD (if it is run as root, make sure you set HOME to /tmp/root-nlsr or something like that), and nrd.
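The recommendation above could be sketched as a small launcher along the following lines. This is only an illustration: the `/tmp/nlsr-<hostname>` layout and the helper name `run_with_local_home` are assumptions, and `env HOME=... command` is used as an equivalent of prefixing the command with the assignment.

```shell
#!/bin/sh
# Sketch: give each node its own HOME so that $HOME/.ndn does not live on the
# shared home folder. The directory layout below is an assumption.
NODE_NAME=$(uname -n)
LOCAL_HOME="/tmp/nlsr-${NODE_NAME}"
mkdir -p "$LOCAL_HOME"

# Hypothetical helper: run one command with HOME overridden for that process only.
run_with_local_home() {
    env HOME="$LOCAL_HOME" "$@"
}

# Check what a child process sees as HOME.
SEEN_HOME=$(run_with_local_home sh -c 'printf "%s" "$HOME"')
echo "child processes see HOME=$SEEN_HOME"

# Intended use on a node where nfd and nlsr are installed (left commented out):
# run_with_local_home nfd &
# run_with_local_home nlsr
```

Because the override is applied per command, the HOME of the calling shell and of any other scripts in the session stays unchanged.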
Updated by Alex Afanasyev over 10 years ago
Hint. For the task description and comments, you can use markdown syntax (e.g., 4 spaces for literal/code blocks). Compared with "pre" it properly escapes underscores and other special symbols.
Updated by Syed Amin over 10 years ago
Changing the HOME variable may affect other things as well. Isn't it possible to specify where to store the ".ndn" folder?
Updated by Alex Afanasyev over 10 years ago
Which other things? You can change HOME just before running the process (to affect only that process).
HOME=/test ./nlsr
Unfortunately, there is currently no other way to make this change. But even if there were, a shared HOME would pose problems for applications.
Updated by Syed Amin over 10 years ago
Changing HOME to tmp or to some local folder seems to fix the issue. I ran it a couple of times, for two to three hours, and didn't hit the issue that I reported. One word of caution that might be helpful for others: there is a significant difference between:
HOME=/test ./nlsr
and
HOME=/test; ./nlsr
I once issued the second command by mistake, which changed the environment variable for all processes of that session and messed up the other scripts.
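The difference between the two forms can be checked with a short script. This is a minimal sketch that uses `sh -c` as a stand-in for `./nlsr` and `/tmp/nlsr-demo` as a placeholder path; the second form is run inside a subshell so that the demonstration does not alter the caller's HOME.

```shell
#!/bin/sh
ORIGINAL_HOME="$HOME"

# Form 1: the assignment prefixes the command, so only that child process sees it.
CHILD_HOME=$(HOME=/tmp/nlsr-demo sh -c 'printf "%s" "$HOME"')
AFTER_PREFIX="$HOME"    # the current shell still has its original HOME

# Form 2: the semicolon makes the assignment a separate statement, so it changes
# HOME for the rest of the shell it runs in. It is wrapped in $( ... ) here only
# to keep the change from leaking into this script.
SESSION_HOME=$(HOME=/tmp/nlsr-demo; true; printf "%s" "$HOME")

echo "form 1, child process:  $CHILD_HOME"
echo "form 1, shell afterwards: $AFTER_PREFIX"
echo "form 2, shell afterwards: $SESSION_HOME"
```

With the first form the shell's HOME is the same before and after the command; with the second form every later command in that shell inherits the new value.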
Updated by A K M Mahmudul Hoque over 10 years ago
Should we not close this issue?
Updated by Syed Amin over 10 years ago
- Subject changed from NLSR throws exception and quits if executed simultaneously on all machines. to NLSR throws exception and quits if executed simultaneously on all machines in emulab environment.
Are the details about the files created in the "HOME" folder by nfd mentioned somewhere in the documentation?
I didn't know about this folder until Alex mentioned deleting a file from there (I was having trouble starting nfd and was getting authorization errors). Since then I've run into similar problems many times, and deleting ndnsec-public-info.db in the ".ndn" folder helped in most cases.
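For anyone trying the same workaround, a slightly more cautious variant is to move the file aside rather than delete it, so that it can be restored if needed. The sketch below is an assumption-laden illustration: `set_aside_public_info` is a hypothetical helper name, and the demonstration runs against a throwaway directory instead of a real `$HOME/.ndn`.

```shell
#!/bin/sh
# Hypothetical helper: rename ndnsec-public-info.db in the given directory to a
# timestamped backup instead of deleting it.
set_aside_public_info() {
    ndn_dir="$1"
    db="$ndn_dir/ndnsec-public-info.db"
    if [ -f "$db" ]; then
        mv "$db" "$db.bak.$(date +%Y%m%d%H%M%S)"
    fi
}

# Demonstration on a temporary directory with an empty placeholder file.
DEMO_DIR=$(mktemp -d)
touch "$DEMO_DIR/ndnsec-public-info.db"
set_aside_public_info "$DEMO_DIR"
ls "$DEMO_DIR"

# On a real node this would be called as: set_aside_public_info "$HOME/.ndn"
```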
Updated by Junxiao Shi about 10 years ago
- Is duplicate of Bug #2009: SecPublicInfoSqlite3 is unsafe on NFS share added