Click has significant advantages for middlebox development, including modularity, extensibility, and reprogrammability. Despite these features, Click still lacks native TCP support and offers only nonblocking I/O, which limits its applicability to middleboxes that require access to application data and blocking I/O. In this paper, we attempt to bridge this gap by introducing Click middleboxes (CliMB). CliMB provides a full-fledged modular TCP layer supporting TCP options, congestion control, both blocking and nonblocking I/O, as well as socket and zero-copy APIs to applications. As a result, any TCP network function may now be realized in Click using a modular L2-L7 design. As proof of concept, we develop a zero-copy SOCKS proxy using CliMB that shows up to 4x gains compared to an equivalent implementation using the Linux in-kernel network stack.
Network latency is critical to the success of many high-speed, low-latency applications. RFC 2544 discusses and defines a set of tests that can be used to describe the performance characteristics of a network device. However, most of the available measurement tools cannot perform all the tests described in this standard. As a novel approach, this paper proposes Metherxis, a system that can be implemented on general-purpose hardware and enables Virtualized Network Functions (VNFs) to measure network device latency with microsecond-grade accuracy. Results show that Metherxis achieves highly accurate latency measurements when compared to OFLOPS, a well-known measurement tool.
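RFC 2544's latency test timestamps a probe frame on transmission and on reception. As a back-of-the-envelope illustration only, the sketch below takes software timestamps around a UDP probe sent over the loopback interface; this is far from Metherxis's hardware-assisted measurement path, and the function name and parameters are our own.

```python
import socket
import time

def min_latency_us(trials=100):
    """Loopback UDP latency with software timestamps, in microseconds."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.bind(("127.0.0.1", 0))
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    dst = rx.getsockname()
    samples = []
    for _ in range(trials):
        t0 = time.perf_counter_ns()
        tx.sendto(b"probe", dst)
        rx.recvfrom(64)                    # blocks until the probe arrives
        samples.append((time.perf_counter_ns() - t0) / 1000.0)
    tx.close()
    rx.close()
    return min(samples)                    # min filters out scheduling noise
```

Taking the minimum over many trials is the usual way to suppress OS scheduling noise in software-timestamped measurements; microsecond-grade accuracy on real device-under-test traffic requires timestamping much closer to the wire.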
Measuring the quality of Web users' experience (WebQoE) faces the following trade-off. On the one hand, current practice is to resort to metrics, such as the document completion time (onLoad), that are simple to measure though known to be inaccurate. On the other hand, there are metrics, like Google’s SpeedIndex, that are better correlated with the actual user experience, but are quite complex to evaluate and, as such, relegated to lab experiments. In this paper, we first provide a comprehensive state of the art on the metrics and tools available for WebQoE assessment. We then apply these metrics to a representative dataset (the Alexa top-100 webpages) to better illustrate their similarities, differences, advantages, and limitations. We next introduce novel metrics, inspired by Google’s SpeedIndex, that offer a significant advantage in terms of computational complexity, while maintaining a high correlation with the SpeedIndex. These properties make our proposed metrics highly relevant and of practical use.
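For context, Google's SpeedIndex integrates visual incompleteness over time, SI = ∫ (1 − VC(t)) dt, where VC(t) is the fraction of the viewport already rendered at time t. A minimal sketch over sampled frames (the function name and the sampling format are our own, not from the paper):

```python
def speed_index(samples):
    """SpeedIndex from (time_ms, visual_completeness) samples.

    samples: list of (t, vc) pairs sorted by t, with vc in [0, 1];
    each vc is assumed to hold until the next sampled frame.
    """
    si = 0.0
    for (t0, vc), (t1, _) in zip(samples, samples[1:]):
        si += (1.0 - vc) * (t1 - t0)   # rectangle rule per frame interval
    return si

# A page that becomes fully visible at t=1000ms and ends at t=2000ms:
frames = [(0, 0.0), (1000, 1.0), (2000, 1.0)]
```

The computational burden the paper refers to lies in obtaining VC(t) itself, which normally requires filmstrip capture and image processing; the integration step above is trivial once VC(t) is known.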
Control planes for global carrier networks should be programmable and scalable. Neither traditional control planes nor new SDN-based control planes meet both of these goals. Here we propose a framework for recursive routing computations that combines the best of SDN (programmability through centralized controllers) and traditional networks (scalability through hierarchy) to achieve these two desired properties. Through simulation on graphs of up to 10,000 nodes, we evaluate our design’s ability to support a variety of unicast routing and traffic engineering solutions, while incorporating a fast failure recovery mechanism based on network virtualization.
Public review by Joseph Camp
While software-defined networks have received significant attention in recent years, the networks studied are often multiple orders of magnitude smaller than today’s global carrier networks in geographical span and nodal scale. Hence, this paper sets forth a recursive routing computation framework that balances the programmability of SDNs with the scalability of a traditional hierarchical structure. Simulations of about 10,000 nodes are used to show the viability of such an approach. Remarkably, the authors show that their recovery approach can offer “five 9s” of network repair even under a heavy link failure scenario.
Real-time media communication requires not only congestion control, but also minimization of queuing delays to provide interactivity. In this work we consider the case of real-time communication between web browsers (WebRTC) and we focus on the interplay of an end-to-end delay-based congestion control algorithm, i.e., the Google congestion control (GCC), with two delay-based AQM algorithms, namely CoDel and PIE, and two flow queuing schedulers, i.e., SFQ and FQ-CoDel. Experimental investigations show that, when only GCC flows are considered, the end-to-end algorithm is able to contain queuing delays without AQMs. Moreover, the interplay of GCC flows with PIE or CoDel leads to higher packet losses with respect to the case of a DropTail queue. In the presence of concurrent TCP traffic, PIE and CoDel reduce the queuing delays with respect to DropTail at the cost of increased packet losses. In this scenario, flow queuing schedulers offer a better solution.
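For readers unfamiliar with the AQM side of this interplay: CoDel drops a packet once queuing (sojourn) delay has stayed above a target for a full interval, then keeps dropping at intervals that shrink with the square root of the drop count until delay falls back below target. A simplified userspace sketch of that control law (class name and the millisecond bookkeeping are illustrative, not the kernel implementation):

```python
import math

class CoDelSketch:
    """Simplified CoDel drop decision; times in milliseconds."""
    TARGET, INTERVAL = 5.0, 100.0

    def __init__(self):
        self.dropping = False
        self.count = 0
        self.drop_next = 0.0
        self.first_above = None      # deadline after which drops may start

    def should_drop(self, now, sojourn):
        if sojourn < self.TARGET:            # delay acceptable: reset state
            self.first_above = None
            self.dropping = False
            return False
        if self.first_above is None:         # start the grace interval
            self.first_above = now + self.INTERVAL
            return False
        if not self.dropping and now >= self.first_above:
            self.dropping = True             # above target for a full interval
            self.count = 1
            self.drop_next = now + self.INTERVAL / math.sqrt(self.count)
            return True
        if self.dropping and now >= self.drop_next:
            self.count += 1                  # drop faster while delay persists
            self.drop_next += self.INTERVAL / math.sqrt(self.count)
            return True
        return False
```

This square-root schedule is what makes CoDel parameterless for operators, and it is also why a loss-sensitive delay-based sender like GCC can see more drops under CoDel or PIE than under a plain DropTail queue it would otherwise keep short on its own.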
Public review by Fabian Bustamante
For an increasingly important class of Internet applications – such as videoconference and personalized live streaming – high delay, rather than limited bandwidth, is the main obstacle to improved performance. A common problem that impacts this class of applications is “bufferbloat”, where excess buffering in the network causes high latency and jitter. Solutions for the persistently full buffer problem, active queue management (AQM) schemes such as the original RED, have been known for two decades. Yet, while RED is simple and effective at reducing persistent queues, it is not widely or consistently configured and enabled in routers, and is sometimes simply unavailable.
Recent focus on bufferbloat has brought a number of new AQM proposals, including PIE and CoDel, which explicitly control the queuing delay and have no knobs for operators, users, or implementers to adjust. This paper considers the interplay between some of these AQM schemes and the new end-to-end delay-based congestion control algorithm, Google Congestion Control (GCC), part of the WebRTC framework.
Two sets of reviewers agree that, while the topic is well established, there is still significant work to be done, and the authors contribute an incremental yet valuable analysis in the context of real-time communication and the increasingly popular WebRTC. The authors were encouraged to release the software used for conducting their measurements to let other researchers in the community replicate their results and explore some of the variants and alternative scenarios raised by different reviewers.
David Hauweele, Bruno Quoitin, Cristel Pelsser, Randy Bush.
The Border Gateway Protocol propagates routing information across the Internet in an incremental manner: it advertises to its peers only changes in routing. However, as early as 1998, BGP was observed announcing the same route multiple times, causing higher-than-expected router CPU load, memory usage, and convergence time.
In this paper, by performing controlled experiments, we pinpoint multiple causes of duplicates, ranging from the lack of full RIB-Outs to the discrete processing of update messages. To mitigate these duplicates, we insert a cache at the output of the routers. We test it on public BGP traces and discuss the relation of the cache performance with the existence of bursts of updates in the trace.
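The core idea of such a cache, loosely mirroring the paper's approach (not its exact design), is to remember the last attributes advertised for each prefix to each peer and suppress any update that would repeat them verbatim. A toy sketch, with hypothetical names:

```python
class DuplicateFilter:
    """Suppress BGP updates identical to the last one sent per (peer, prefix)."""

    def __init__(self):
        self.cache = {}   # (peer, prefix) -> last advertised attributes

    def advertise(self, peer, prefix, attributes):
        """Return True if the update should be sent, False if it is a duplicate."""
        key = (peer, prefix)
        if self.cache.get(key) == attributes:
            return False            # parrot-like repeat: drop it
        self.cache[key] = attributes
        return True
```

A full Adj-RIBs-Out stores every route advertised to every peer; a cache like this can be bounded (e.g., with an eviction policy) and still catch the duplicate bursts observed in the traces, which is the memory trade-off the paper examines.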
Public review by Alberto Dainotti
“What do parrots and BGP routers have in common?”
“Nothing, of course.” — you might answer the question in this paper’s title.
“Since parrots simply repeat the sounds they hear, with no understanding of their meaning.” On the contrary, BGP speakers process the messages they receive and, hopefully, understand them before talking. However, a careful check of the literature may (or may not) make you reconsider the question:
E. N. Colbert-White, M. A. Covington, D. M. Fragaszy,
“Social Context Influences the Vocalizations of a Home-Raised African Grey Parrot (Psittacus erithacus erithacus)”
Journal of Comparative Psychology, Online First Publication, March 7, 2011. doi: 10.1037/a0022097
Moving from the paper title to the content: the authors investigate the problem of redundant BGP update messages (duplicate updates) generated by BGP routers. This phenomenon would normally be prevented by the Adj-RIBs-Out, which “contains the routes for advertisement to specific peers by means of the local speaker’s UPDATE messages.” [RFC 4271]. However, the Adj-RIBs-Out is sometimes not fully implemented or disabled in order to save memory. Previous studies have shown that duplicates can reach percentages above 80% in busy times (making a BGP speaker sound, to its peer, much like a parrot) and be detrimental to operations by causing high CPU loads.
This study contributes to the problem in two ways: (i) it explains the origin of several types of duplicate occurrences; (ii) it demonstrates that a simple
cache, requiring less memory usage than the Adj-RIBs-Out, can significantly
mitigate the problem. Reviewers appreciated the novelty of the contributions but would have liked to see an exhaustive analysis and characterization of all the common causes of duplicates in real world traces. This work is only a first step in fully understanding all the dynamics involved in redundant BGP update messages.
Rayman Preet Singh, Benjamin Cassell, S. Keshav, Tim Brecht.
Networked sensors and actuators are increasingly permeating our computing devices, and provide a variety of functions for Internet of Things (IoT) devices and applications. However, this sensor data can also be used by applications to extract private information about users. Applications and users are thus in a tussle over access to private data. Tussles occur in operating systems when stakeholders with competing interests try to access shared resources such as sensor data, CPU time, or network bandwidth. Unfortunately, existing operating systems lack a principled approach for identifying, tracking, and resolving such tussles. Moreover, users typically have little control over how tussles are resolved. Controls for sensor data tussles, for example, often fail to address trade-offs between functionality and privacy. Therefore, we propose a framework to explicitly recognize and manage tussles. Using sensor data as an example resource, we investigate the design of mechanisms for detecting and resolving privacy tussles in a cyber-physical system, enabling privacy and functionality to be negotiated between users and applications. In doing so, we identify shortcomings of existing research and present directions for future work.
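One way to picture such a negotiation: the operating system grants the finest sensor granularity permitted by both the application's request and the user's policy, then degrades the data accordingly. A toy sketch of that resolution step (the granularity ladder and all names are our own illustration, not the paper's mechanism):

```python
# Coarsest to finest levels of location disclosure (hypothetical ladder).
GRANULARITY = ["none", "city", "block", "exact"]

def resolve(app_request, user_policy):
    """Grant the finest granularity acceptable to both stakeholders."""
    allowed = min(GRANULARITY.index(app_request),
                  GRANULARITY.index(user_policy))
    return GRANULARITY[allowed]

def coarsen(lat, lon, granularity):
    """Degrade a GPS fix to the granted granularity (toy rounding rules)."""
    if granularity == "none":
        return None
    decimals = {"city": 1, "block": 2, "exact": 6}[granularity]
    return (round(lat, decimals), round(lon, decimals))
```

In this picture, a restaurant-finder requesting "exact" against a user policy of "city" would receive city-level coordinates: functionality degrades gracefully instead of the all-or-nothing permission model the paper criticizes.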
Public review by Dave Choffnes
Ubiquitous Internet connectivity and sensing is quickly becoming reality. Many of us welcome this new world and its myriad applications ranging from entertainment and communication to health and education. On the other hand, this new functionality comes with an often invisible and thorny cost: exposure of private information. Historically, operating systems have focused on enabling functionality, with privacy controls being blunt, bolt-on features, if present at all. The Yelp app, for example, will use GPS coordinates to identify local businesses, but there is no easy way for the user to negotiate the use of coarser-grained location data for potentially less customized results but without the privacy cost.
In this paper, the authors propose using tussles as a way to manage the trade-offs between functionality and privacy settings that restrict it, and to provide this service at the operating system layer. Specifically, the paper identifies high-level abstractions to specify privacy and functionality requirements, techniques to resolve competing requirements, and mechanisms to enforce the resolved behavior. Instead of focusing on any specific solution, the authors survey application functionality and user privacy requirements, and suggest how they might be addressed. Rather than offering a solution to the problem, this work serves as a starting point for a conversation about how to improve OS-level support for privacy.
The reviewers agreed that the authors identified an important problem and proposed an interesting potential direction for addressing it. The case studies in the paper provide supporting evidence that the approach is viable. There were concerns that the paper raises more questions than it answers (which is typical for a position paper) and that privacy negotiations have been proposed in previous work (impacting novelty). Despite these issues, the reviewers agreed that TussleOS is an interesting topic for future work.
Zaafar Ahmed, Muhammad Hamad Alizai, Affan A. Syed.
InKeV is a network virtualization platform based on eBPF, an in-kernel execution engine recently upstreamed into the Linux kernel. InKeV’s key contribution is that it enables in-kernel programmability and configuration of virtualized network functions, allowing operators to create a distributed virtual network across all edges hosting tenant workloads. Despite the high performance demands of production environments, existing virtualization solutions have largely static in-kernel components due to the difficulty of developing and maintaining kernel modules and their years-long feature delivery time. The resulting compromise is either in the programmability of network functions that rely on the data plane, such as payload processing, or in performance, due to expensive user-/kernel-space context switching. InKeV addresses these concerns: the use of eBPF allows it to dynamically insert programmable network functions into a running kernel, requiring neither packaging a custom kernel nor hoping for acceptance into the mainline kernel. Its novel stitching feature allows complete virtual networks to be configured flexibly by creating a graph of network functions inside the kernel. Our evaluation reports on the flexibility of InKeV, as well as in-kernel implementation benefits such as low latency and an impressive flow creation rate.
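The stitching idea, independent of eBPF itself, is to express a virtual network as a graph whose nodes are network functions and whose labeled edges decide where a packet goes next. A userspace Python analogue of that graph (all names hypothetical; the real system stitches eBPF programs inside the kernel):

```python
class NFGraph:
    """Directed graph of network functions; edges keyed by (node, verdict label)."""

    def __init__(self):
        self.nodes = {}   # name -> fn(pkt) -> (pkt or None, label)
        self.edges = {}   # (name, label) -> next node name

    def add_nf(self, name, fn):
        self.nodes[name] = fn

    def stitch(self, src, label, dst):
        self.edges[(src, label)] = dst

    def run(self, entry, pkt):
        node = entry
        while node is not None:
            pkt, label = self.nodes[node](pkt)
            if pkt is None:
                return None                       # packet dropped by this NF
            node = self.edges.get((node, label))  # no edge: leave the graph
        return pkt

# Example: a firewall stitched to a NAT.
def firewall(pkt):
    return (None, "drop") if pkt["dport"] == 23 else (pkt, "pass")

def nat(pkt):
    return dict(pkt, src="10.0.0.1"), "out"       # rewrite source address

g = NFGraph()
g.add_nf("fw", firewall)
g.add_nf("nat", nat)
g.stitch("fw", "pass", "nat")
```

Because the graph is just data, reconfiguring the virtual network means adding or re-stitching edges, which is what lets InKeV reshape a tenant's data plane inside a running kernel without rebuilding anything.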
Public review by Katerina Argyraki
The ability to program the data plane — to introduce new packet-processing functionality in the network — is one of the main challenges faced by network operators today, whether in the context of datacenters or Internet service providers. A popular approach is to introduce “network functions” at the edge of the network, in general-purpose machines (not custom network equipment). To maximize performance, we would normally run these network functions inside the kernel; however, the standard ways of doing this, e.g., packaging custom kernels, are impractical. Instead, this paper proposes to leverage the extended Berkeley Packet Filter (eBPF), a way to safely introduce new functionality into the kernel, which is now part of the Linux kernel. The paper contributes a framework for managing network functions that are implemented on top of eBPF, and it shows experimentally that the proposed approach significantly outperforms the current standard approach, represented by OpenStack Neutron (which does not run entire network functions in the kernel). The reviewers appreciated the proposed solution for its simplicity and the clear demonstration of its performance benefits; in a sea of proposals for how to build programmable data planes, this one stands out for its potential for practical impact.