Category Archives: 2021

Surviving switch failures in cloud datacenters

Rachee Singh, Muqeet Mukhtar, Ashay Krishna, Aniruddha Parkhi, Jitendra Padhye, David Maltz

Abstract

Switch failures can hamper access to client services, cause link congestion and blackhole network traffic. In this study, we examine the nature of switch failures in the datacenters of a large commercial cloud provider through the lens of survival theory. We study a cohort of over 180,000 switches with a variety of hardware and software configurations and find that datacenter switches have a 98% likelihood of functioning uninterrupted for over 3 months since deployment in production. However, there is significant heterogeneity in switch survival rates with respect to their hardware and software: the switches of one vendor are twice as likely to fail compared to the others. We attribute the majority of switch failures to hardware impairments and unplanned power losses. We find that the in-house switch operating system, SONiC, boosts the survival likelihood of switches in datacenters by 1% by eliminating switch failures caused by software bugs in vendor switch OSes.
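As an illustration of the kind of survival analysis the abstract refers to, here is a minimal sketch of a Kaplan-Meier survival estimate. The abstract does not name the estimator the authors used, so the choice of Kaplan-Meier is an assumption, and the durations and failure flags below are made-up toy values, not data from the study.

```python
from collections import Counter


def kaplan_meier(durations, failed):
    """Return [(t, S(t))] at each distinct failure time.

    durations: observed time per switch (e.g. days since deployment)
    failed:    True if the switch failed at that time,
               False if it was still working when observation ended (censored)
    """
    failures = Counter(t for t, f in zip(durations, failed) if f)
    exits = Counter(durations)      # failures and censored switches leaving the cohort
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = failures.get(t, 0)
        if d:
            # Multiply by the fraction of at-risk switches that survive time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical toy cohort of 8 switches (not from the paper).
durations = [30, 45, 45, 90, 90, 90, 120, 120]
failed = [True, False, True, False, False, True, False, False]

curve = kaplan_meier(durations, failed)
print([(t, round(s, 3)) for t, s in curve])
# prints [(30, 0.875), (45, 0.75), (90, 0.6)]
```

Censored switches (those still running when observation ends) reduce the at-risk count without lowering the survival estimate, which is what lets a cohort with staggered deployment dates be analyzed together.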

Download from ACM

What do information centric networks, trusted execution environments, and digital watermarking have to do with privacy, the data economy, and their future?

Nikolaos Laoutaris, Costas Iordanou

Abstract

What if, instead of having to implement controversial user tracking techniques, Internet advertising & marketing companies asked explicitly to be granted access to user data by name and category, such as Alice→Mobility→05-11-2020? The technology for implementing this already exists, and is none other than Information Centric Networks (ICN), developed for over a decade in the framework of Next Generation Internet (NGI) initiatives. Beyond named access to personal data, ICN’s in-network storage capability can be used as a substrate for retrieving aggregated, anonymized data, or even for executing complex analytics within the network, with no personal data leaking outside. In this opinion article we discuss how ICNs, trusted execution environments, and digital watermarking can be combined to build a personal data overlay inter-network in which users will be able to control who gets access to their personal data, know where each copy of said data is, negotiate payments in exchange for data, and even claim ownership and establish accountability for data leakages due to malfunctions or malice. Of course, coming up with concrete designs for achieving all of the above will require a huge effort from a dedicated community willing to change how personal data are handled on the Internet. Our hope is that this opinion article can plant some initial seeds in this direction.
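To make the idea of named, explicitly granted access concrete, the following is a toy sketch of a store that resolves hierarchical names such as Alice→Mobility→05-11-2020 and checks a grant before returning data. Everything here is hypothetical: the class `PersonalDataStore`, its methods, and the requester name `AdCo` are illustrative assumptions and do not correspond to any ICN implementation or to a design from the article.

```python
from dataclasses import dataclass, field


@dataclass
class PersonalDataStore:
    """Hypothetical in-memory store keyed by hierarchical names."""
    records: dict = field(default_factory=dict)  # name tuple -> payload
    grants: set = field(default_factory=set)     # (requester, name prefix)

    def publish(self, name, payload):
        self.records[tuple(name)] = payload

    def grant(self, requester, prefix):
        # A grant on a prefix covers every name that starts with it.
        self.grants.add((requester, tuple(prefix)))

    def fetch(self, requester, name):
        name = tuple(name)
        allowed = any(
            who == requester and name[:len(prefix)] == prefix
            for who, prefix in self.grants
        )
        if not allowed:
            raise PermissionError(
                f"{requester} has no grant covering {'/'.join(name)}"
            )
        return self.records[name]


store = PersonalDataStore()
store.publish(("Alice", "Mobility", "05-11-2020"), "mobility trace")
store.publish(("Alice", "Health", "05-11-2020"), "health record")

# Alice grants the (hypothetical) company access to her mobility data only.
store.grant("AdCo", ("Alice", "Mobility"))

granted = store.fetch("AdCo", ("Alice", "Mobility", "05-11-2020"))
```

A request for a name outside the granted prefix, such as Alice's health data, raises `PermissionError` instead of returning anything.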

Download from ACM

Italian operators’ response to the COVID-19 pandemic

Massimo Candela, Antonio Prado

Abstract

Since the beginning of the COVID-19 pandemic, governments have introduced several social restrictions. As of 18 March 2020, more than 250 million people were in lockdown in Europe. This drastically increased the number of online activities. Due to this unprecedented situation, some concerns arose about whether the Internet could sustain the increased usage.

Italy was severely hit by the first wave of the pandemic and various regions underwent a lockdown before the main country-wide one. The Italian network operators started sharing information about improvements carried out on the network and new measures adopted to support the increase in Internet usage. In this report, by means of a questionnaire, we collect information and provide a quantitative overview of the actions undertaken by network operators in Italy. The attitude of Italian operators was synergic and proactive in supporting the changed market conditions caused by the public health emergency.

Download from ACM

The case for model-driven interpretability of delay-based congestion control protocols

Muhammad Khan, Yasir Zaki, Shiva R. Iyer, Talal Ahmad, Thomas Poetsch, Jay Chen, Anirudh Sivaraman, Lakshmi Subramanian

Abstract

Analyzing and interpreting the exact behavior of new delay-based congestion control protocols with complex non-linear control loops is exceptionally difficult in highly variable networks such as cellular networks. This paper proposes a Model-Driven Interpretability (MDI) congestion control framework, which derives a model version of a delay-based protocol by simplifying a congestion control protocol’s response into a guided random walk over a two-dimensional Markov model. We demonstrate the case for the MDI framework by using MDI to analyze and interpret the behavior of two delay-based protocols over cellular channels: Verus and Copa. Our results show a successful approximation of throughput and delay characteristics of the protocols’ model versions across variable network conditions. The learned model of a protocol provides key insights into an algorithm’s convergence properties.
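The notion of a random walk over a two-dimensional Markov model can be sketched as follows. This is only an illustration of the general technique: the two state dimensions (a delay level and a sending-window level), the grid size, and the transition probabilities are all assumptions chosen for the example, not the MDI model or anything learned from Verus or Copa.

```python
import random
from collections import Counter


def build_transitions(levels=3):
    """Hypothetical 2D chain over (delay_level, window_level) states.

    Assumed bias: at low delay the walk tends to move up (larger window,
    more delay); at high delay it tends to move down.
    """
    top = levels - 1
    table = {}
    for d in range(levels):
        for w in range(levels):
            up = (min(d + 1, top), min(w + 1, top))
            down = (max(d - 1, 0), max(w - 1, 0))
            p_up = 0.8 - 0.6 * d / top  # 0.8 at lowest delay, 0.2 at highest
            row = {}
            row[up] = row.get(up, 0.0) + p_up
            row[down] = row.get(down, 0.0) + (1.0 - p_up)
            table[(d, w)] = row
    return table


def guided_walk(transitions, start, steps, rng):
    """Sample a path by repeatedly drawing the next state from the current row."""
    state = start
    path = [state]
    for _ in range(steps):
        moves, weights = zip(*transitions[state].items())
        state = rng.choices(moves, weights=weights, k=1)[0]
        path.append(state)
    return path


table = build_transitions()
path = guided_walk(table, (0, 0), 1000, random.Random(7))

# Fraction of time spent in each state, a rough view of where the walk settles.
occupancy = {s: n / len(path) for s, n in Counter(path).items()}
```

Looking at `occupancy` is one simple way to see which region of the state space the walk spends most of its time in.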

Download from ACM

Experience-driven research on programmable networks

Hyojoon Kim, Xiaoqi Chen, Jack T Brassil, Jennifer Rexford

Abstract

Many promising networking research ideas in programmable networks never see the light of day. Yet, deploying research prototypes in production networks can help validate research ideas, improve them with faster feedback, uncover new research questions, and also ease the subsequent transition to practice. In this paper, we show how researchers can run and validate their research ideas in their own backyards—on their production campus networks—and we have seen that such a demonstrator can expedite the deployment of a research idea in practice to solve real network operation problems. We present P4Campus, a proof-of-concept that encompasses tools, an infrastructure design, strategies, and best practices—both technical and non-technical—that can help researchers run experiments against their programmable network idea in their own network. We use network tapping devices, packet brokers, and commodity programmable switches to enable running experiments to evaluate research ideas on a production campus network. We present several compelling data-plane applications as use cases that run on our campus and solve production network problems. By sharing our experiences and open-sourcing our P4 apps [28], we hope to encourage similar efforts on other campuses.

Download from ACM

The January 2021 issue

This January 2021 issue contains three technical papers as well as two editorial notes.

The first technical paper, Distrinet: a Mininet Implementation for the Cloud, by Giuseppe Di Lena and his colleagues, proposes Distrinet, a distributed implementation of Mininet over multiple hosts, based on LXD/LXC, Ansible, and VXLAN tunnels. Distrinet is compatible with Mininet programs, is generic, and can deploy experiments on Linux clusters as well as on the Amazon EC2 cloud platform. Given how popular Mininet is for SDN evaluation, this contribution potentially provides a lot of value to our research community.

The second technical paper, Experience-Driven Research on Programmable Networks, by Hyojoon Kim and colleagues, presents a proof-of-concept to help researchers run experiments against their programmable network idea, in their own network. The authors present several data-plane applications as use cases that run on their campus and solve production network problems. While not fully reproducible, this paper is a good step towards encouraging similar efforts in our community.

Our third paper, The Case for Model-Driven Interpretability of Delay-based Congestion Control Protocols, by Muhammad Khan and his colleagues, presents a study of different delay-based congestion control algorithms for TCP. The proposed framework is flexible and can model delay-based protocols by simplifying a congestion control protocol’s response into a guided random walk over a two-dimensional Markov model. The model is evaluated against actual traces collected in 3G/4G networks and gives intuition about the regime in which the congestion control loop spends most of its time.

Then, we have two editorial notes. The first one, Italian Operators’ Response to the COVID-19 Pandemic, by Massimo Candela and Antonio Prado, reports on the actions undertaken by network operators in Italy in response to COVID-19. The second editorial note, What do Information Centric Networks, Trusted Execution Environments, and Digital Watermarking have to do with Privacy, the Data Economy, and their future?, by Nikolaos Laoutaris and Costas Iordanou, discusses how ICNs, trusted execution environments, and digital watermarking can be combined to build a personal data overlay inter-network that has a plethora of desirable properties for end-users.

I hope that you will enjoy reading this new issue and welcome comments and suggestions on CCR Online (https://ccronline.sigcomm.org) or by email at ccr-editor at sigcomm.org.

Distrinet: a Mininet Implementation for the Cloud

Giuseppe Di Lena, Andrea Tomassilli, Damien Saucez, Frédéric Giroire, Thierry Turletti, Chidung Lac

Abstract

Networks have become complex systems that combine various concepts, techniques, and technologies. As a consequence, modelling or simulating them is now extremely complicated, and researchers massively resort to prototyping techniques. Mininet is the most popular tool when it comes to evaluating SDN proposals. Mininet can emulate SDN networks on a single computer but shows its limitations with resource-intensive experiments, as the emulating host may become overloaded. To tackle this issue, we propose Distrinet, a distributed implementation of Mininet over multiple hosts, based on LXD/LXC, Ansible, and VXLAN tunnels. Distrinet uses the same API as Mininet, meaning that it is compatible with Mininet programs. It is generic and can deploy experiments on Linux clusters (e.g., Grid’5000), as well as on the Amazon EC2 cloud platform.

Download from ACM