ccnd 0.7.2 forwarding strategy¶
ccnd has a hierarchical name prefix table that stores both permanent and temporary forwarding states.
Forwarding entries, similar to IP's static routing table, are manually configured or registered by local applications, and usually lasts for a long time and can be considered "permanent".
For each incoming ContentObject, ccnd remembers where it comes from (npe->src
) in the name prefix table at every node that is a prefix of this ContentObject (eg. source of /A/B/C is remembered at /, /A, /A/B, and /A/B/C nodes); this information is temporary, and expires after 8 seconds.
In many cases, multiple forwarding entries apply to the same Name, indicating that ContentObject MAY be available from multiple Faces.
Unlike IP network, having a forwarding entry to a Face does not guarantee the successful retrieval of ContentObject.
Interest is first forwarded to the "best" Face (npe->src
) of a Name Prefix entry with at least one forwarding entry (eg. if only /A and /A/B have forwarding entries, Interest to /A/B/C/D goes to npe->src
of /A/B node; in strategy_callout:CCNST_FIRST
).
If a matching ContentObject comes in predicted time, the prediction is adjusted down; otherwise, the prediction is adjusted up, and the Interest is forwarded to N other possible faces (N = MIN(2, number of unexpired downstreams); in do_propagate
).
explanation from CCNx developer Marc Mosko¶
http://www.lists.cs.ucla.edu/pipermail/nfd-dev/2015-July/001198.html
The timeout is initialized between 8 and 12 msec. If we get a response within the timeout it is decreased by 1/128th (i.e. t = .992 * t). Else it times out, it is increased by 1/8th (t = 1.125 t).
The general rational was that about every 15th packet would timeout, that’s why the decrease factor is 1/128th and the increase factor is 1/8th. After the i-th decrease, the timeout will be t = (1-1/128)i * t. After you hit the timeout, you then increase it to t = t * (1 + 1/8). You find the equilibrium solving 1 / (1 - 1/128) i = (1 + 1/8). Taking the log and rearranging gives i = 15.02, so about ever 15th packet you will cross the RTT timeout (assuming it’s constant and you started at about the right value). I have not analyzed the convergence of stability, but because the decrease is so much less than the increase I assume it will not ring too much once you cross the lower boundary and start increasing. However, if the RTT is much less than the initial value it could take a while to converge.
For example, if you have a 10,000 usec RTT and the timeout is 10,000, then the first hit will decrease to 9922 and you’ll miss. This will increase it to 11162, then decrease to 11075, …, 10001 (14th decrease), 9923 (15th decrease). Clearly 9923 is almost the same as 9922, so the next increase will keep oscillating in the same ballpark.
I think the initial value, minimum, and maximum are largely arbitrary (unlike the /128 and /8).
The minimum value of 127 usec is the maximum value less than the division factor of 1/128th. So, once you reach the floor it will not be decreased any more. One could theorize that an 8000 byte datagram at 1 Gbps takes 64 usec (each way), so having the minimum oscillate around 128 usec would be about the minimum on a fast network. But, that might be reading too much in to it. Also, it clearly does not scale for very fast networks (10G and beyond) nor for very slow networks (over 160 msec).
For example, I could have a 10G link and a 40G link. If I first tried the 10G link I would never timeout so never probe the 40G link. Likewise, if all my links are slow, say a 200msec and a 600 msec, I will timeout every time and always oscillate between them, even though trying the 600 msec every other packet is not a good idea.
The decrease / increase trick used here is pretty solid, I think, though the three constants need adjusting based on the type of network. I think some applied statistics could give you reasonable learned values that would adjust for very fast or very slow. Clearly for very fast you need to work in nanos not micros, so you will have a uint32 problem.
Updated by Junxiao Shi over 9 years ago · 2 revisions