# Repo

We currently use ndn-python-repo.

## Development Summary: NDN Repo Integration & Performance Debugging in mGuard

### Background

The original mGuard implementation relied on the bulk-insert API from ndn-python-repo. Under realistic workloads, bulk insert proved unsuitable:

* the repo crashed when batches exceeded ~100 packets,
* even after fixes, insert operations began failing around ~600 packets,
* the bulk API provided no timing visibility, making debugging impossible.

To address this, a C++ Interest-based repo publisher was implemented to mirror the Python publisher’s behavior. The long-term objective is to turn this into a standalone, reusable C++ module independent of mGuard.

---

### C++ Repo Publisher (Interest-Based Insertion)

The C++ publisher followed the notify → msg → data workflow defined by ndn-python-repo.
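
For orientation, the sketch below shows the producer side of that exchange in ndn-cxx. It is a minimal illustration only: the prefixes (`/mguard/producer`, `/repo/notify`), the command encoding, and the class name are assumptions, not the actual mGuard or ndn-python-repo names. The shape is what matters: the producer serves both the command msg and the Data under its own prefix, then sends a notify Interest so the repo pulls first the msg and then the Data.

```cpp
// Sketch only: names, encodings, and error handling are simplified assumptions;
// the authoritative protocol is defined by ndn-python-repo.
#include <ndn-cxx/encoding/block-helpers.hpp>
#include <ndn-cxx/face.hpp>
#include <ndn-cxx/security/key-chain.hpp>

#include <iostream>
#include <map>
#include <memory>
#include <string>

class RepoPublisherSketch
{
public:
  RepoPublisherSketch()
  {
    // Serve both the command "msg" and the application Data under the producer
    // prefix, so the repo can fetch them after it has been notified.
    m_face.setInterestFilter(ndn::Name("/mguard/producer"),   // assumed producer prefix
      [this] (const auto&, const ndn::Interest& interest) {
        auto it = m_store.find(interest.getName());
        if (it != m_store.end()) {
          m_face.put(*it->second);
        }
      },
      [] (const ndn::Name&, const std::string& reason) {
        std::cerr << "Prefix registration failed: " << reason << "\n";
      });
  }

  void
  insert(const ndn::Name& dataName, const std::string& payload)
  {
    // 1. Prepare and store the Data packet that the repo will eventually fetch.
    auto data = std::make_shared<ndn::Data>(dataName);
    data->setContent(ndn::encoding::makeStringBlock(ndn::tlv::Content, payload));
    m_keyChain.sign(*data);
    m_store[dataName] = data;

    // 2. Prepare and store the command "msg" describing the insertion
    //    (the real encoding is a repo command parameter block, not a URI string).
    ndn::Name msgName = ndn::Name("/mguard/producer/msg").appendTimestamp();
    auto msg = std::make_shared<ndn::Data>(msgName);
    msg->setContent(ndn::encoding::makeStringBlock(ndn::tlv::Content, dataName.toUri()));
    m_keyChain.sign(*msg);
    m_store[msgName] = msg;

    // 3. Notify the repo; it is expected to fetch the msg, then the Data.
    ndn::Interest notify(ndn::Name("/repo/notify"));  // assumed repo prefix
    notify.setMustBeFresh(true);
    notify.setInterestLifetime(ndn::time::seconds(4));
    m_face.expressInterest(notify,
      [] (const auto&, const auto&) { /* repo acknowledged the notification */ },
      [] (const auto&, const auto&) { std::cerr << "notify was nacked\n"; },
      [] (const auto&) { std::cerr << "notify timed out\n"; });
  }

  void
  run()
  {
    m_face.processEvents();  // drive the event loop until the face is shut down
  }

private:
  ndn::Face m_face;
  ndn::KeyChain m_keyChain;
  std::map<ndn::Name, std::shared_ptr<ndn::Data>> m_store;
};
```

In the real publisher, each step would also carry failure handling and retransmission; they are omitted here to keep the control flow visible.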

During integration, a key problem emerged: at ~50 Hz data rates, the repo began timing out, not because of repo limitations, but because the producer was issuing insert operations faster than repo/NFD could process them.

This highlighted that the failure was not purely repo-related: NFD scheduling, face load, and Interest-filter overhead all contributed to the observed timeouts.

To stabilize mGuard insert behavior, a QueueManager was added to the producer. It paced inserts and prevented bursts that previously caused repo crashes or inconsistent manifest publication.
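
The QueueManager implementation is not reproduced on this page; the sketch below is a minimal illustration of the pacing idea under assumed names. It keeps at most `windowSize` insertions in flight (defaulting to the ≈10-packet window recommended in the findings below) and releases the next one only when an earlier one reports completion.

```cpp
#include <cstddef>
#include <deque>
#include <functional>

// Sketch of a producer-side pacing queue: at most `m_windowSize` inserts are
// in flight at once; bursts are buffered instead of being pushed to the repo.
class QueueManagerSketch
{
public:
  // `startInsert` begins one notify/msg/data insertion and must invoke the
  // completion callback it is given when that insertion finishes or fails.
  using InsertFn = std::function<void(std::function<void()> onDone)>;

  explicit QueueManagerSketch(std::size_t windowSize = 10)  // assumed default
    : m_windowSize(windowSize)
  {
  }

  void
  enqueue(InsertFn startInsert)
  {
    m_pending.push_back(std::move(startInsert));
    tryDispatch();
  }

private:
  void
  tryDispatch()
  {
    while (m_inFlight < m_windowSize && !m_pending.empty()) {
      auto next = std::move(m_pending.front());
      m_pending.pop_front();
      ++m_inFlight;
      // When this insertion completes, free its window slot and dispatch more.
      next([this] { --m_inFlight; tryDispatch(); });
    }
  }

private:
  std::size_t m_windowSize;
  std::size_t m_inFlight = 0;
  std::deque<InsertFn> m_pending;
};
```

In the producer, each queued item would wrap one full notify/msg/data sequence, so a 50 Hz burst is buffered and drained at a rate the repo and NFD can absorb.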

---

### Repo Workflow vs. mGuard Workflow: Fundamental Mismatch

The repository’s architecture is optimized for the classical file model:

* one file name,
* multiple data segments,
* fetch pipeline: notify → msg → fetch many segments.

mGuard, however, produces the opposite workload:

* thousands of distinct names per stream,
* each name usually 1–2 segments (data + CK),
* every packet triggers its own full control-plane sequence.

As a result, the repo must execute:

* one notify per packet,
* one msg per packet,
* one data fetch per packet.

For a 5,000-packet batch:

* 5,000 notify Interests
* 5,000 msg Interests
* 5,000 fetch Interests

This creates extremely high Interest volume (roughly 15,000 control-plane Interests for a single 5,000-packet batch) and rapidly stresses both repo and NFD, particularly when multiple Interest filters and PSync traffic are also active.

This mismatch is the core reason mGuard stresses ndn-python-repo under high-rate streaming workloads.

---

### Key Findings

**1. Repo is not the primary bottleneck**

Experiments with an isolated publisher showed that repo alone can sustain serial insertions up to several thousand packets without failure. Failures in mGuard stem from repo load combined with:

* high Interest rate,
* many Interest filters,
* PSync activity,
* NFD face congestion.

**2. Sustainable throughput is ~80–100 inserts/sec**

This is the practical ceiling after which notify/msg RTTs grow and timeouts appear.

mGuard should pace producer insert operations using a window of ≈ 10 packets to stay within this limit.
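
As a rough cross-check (an illustration, not a measurement from these experiments): with W inserts outstanding and an average notify/msg round trip of R, steady-state throughput is about W / R, so a 10-packet window matches the ~80–100 inserts/sec ceiling when the round trip sits around 100–125 ms.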

**3. Repo delays > 4 seconds cause hard failures**

If the repo cannot fetch the actual data within 4 seconds of the notify/msg exchange, the insertion is aborted. This tends to affect the first 1–2 packets in a batch, before the pipeline stabilizes.
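
Because the lack of timing visibility was one of the original problems with the bulk API, it is worth instrumenting this failure mode directly. The sketch below uses assumed hook names: call `onNotifySent` when the notify for a packet goes out and `onRepoFetch` when the repo's Interest for that packet's data arrives; it logs any gap that approaches the 4-second abort limit.

```cpp
#include <chrono>
#include <iostream>
#include <map>
#include <string>

// Sketch: track, per data name, the delay between sending the notify and the
// repo's fetch of the data, and warn when it approaches the 4 s abort limit.
class InsertLatencyTracker
{
  using Clock = std::chrono::steady_clock;

public:
  void
  onNotifySent(const std::string& dataName)
  {
    m_notifySentAt[dataName] = Clock::now();
  }

  void
  onRepoFetch(const std::string& dataName)
  {
    auto it = m_notifySentAt.find(dataName);
    if (it == m_notifySentAt.end()) {
      return;  // fetch for something we did not notify about; ignore
    }

    auto elapsed = std::chrono::duration_cast<std::chrono::milliseconds>(
                     Clock::now() - it->second);
    if (elapsed >= std::chrono::milliseconds(3000)) {  // warn before the 4 s abort
      std::cerr << "Slow repo fetch for " << dataName
                << ": " << elapsed.count() << " ms\n";
    }
    m_notifySentAt.erase(it);
  }

private:
  std::map<std::string, Clock::time_point> m_notifySentAt;
};
```

Wiring these two hooks into the publisher is enough to confirm whether it is really the first 1–2 packets of a batch that approach the limit, as observed above.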