Artifact badging aims to rank the quality of submitted research artifacts and promote reproducibility. However, a badge does not necessarily expose limitations inherent in an artifact’s design or evaluation.
This work explores the current limits of artifact badging through a performance-based evaluation of the NDP artifact. We evaluate the NDP artifact beyond the level required for the Reusable badge, investigating how factors such as packet size and random-number seed affect throughput and flow completion time.
Our evaluation demonstrates that while the NDP artifact is reusable, it is not robust, and we identify architectural, implementation and evaluation limitations.