Author Archives: Steve Uhlig

M-Lab: user initiated internet data for the research community

Phillipa Gill, Christophe Diot, Lai Yi Ohlsen, Matt Mathis, Stephen Soltesz

Abstract

Measurement Lab (M-Lab) is an open, distributed server platform on which researchers have deployed measurement tools. Its mission is to measure the Internet, save the data and make it universally accessible and useful. This paper serves as an update on the M-Lab platform 10+ years after its initial introduction to the research community [5]. Here, we detail the current state of the M-Lab distributed platform, highlight existing measurements/data available on the platform, and describe opportunities for further engagement between the networking research community and the platform.

Download from ACM

Roadmap for edge AI: a Dagstuhl perspective

Aaron Yi Ding, Ella Peltonen, Tobias Meuser, Atakan Aral, Christian Becker, Schahram Dustdar, Thomas Hiessl, Dieter Kranzlmüller, Madhusanka Liyanage, Setareh Maghsudi, Nitinder Mohan, Jörg Ott, Jan S. Rellermeyer, Stefan Schulte, Henning Schulzrinne, Gürkan Solmaz, Sasu Tarkoma, Blesson Varghese, Lars Wolf

Abstract

Based on the collective input of Dagstuhl Seminar (21342), this paper presents a comprehensive discussion on AI methods and capabilities in the context of edge computing, referred to as Edge AI. In a nutshell, we envision Edge AI to provide adaptation for data-driven applications, enhance network and radio access, and allow the creation, optimisation, and deployment of distributed AI/ML pipelines with given quality of experience, trust, security and privacy targets. The Edge AI community investigates novel ML methods for the edge computing environment, spanning multiple sub-fields of computer science, engineering and ICT. The goal is to share an envisioned roadmap that can bring together key actors and enablers to further advance the domain of Edge AI.

Download from ACM

Towards client-side active measurements without application control

Palak Goenka, Kyriakos Zarifis, Arpit Gupta, Matt Calder

Abstract

Monitoring performance and availability is critical to operating successful content distribution networks. Internet measurements provide the data needed for traffic engineering, alerting, and network diagnostics. While there are significant benefits to performing end-user active measurements, these capabilities are limited to a small number of content providers with application control. In this work, we present a solution to the long-standing problem of issuing active measurements from clients without requiring application control (e.g., the ability to inject JavaScript into the content served). Our approach uses server-side programmable features of the Network Error Logging specification that allow a CDN to induce a browser connection to an HTTPS server of the CDN’s choosing without application control.
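As a rough illustration of the mechanism described above, the sketch below builds the two response headers a server could attach to steer a browser's NEL reports towards a collector of its choosing. The field names follow the W3C Network Error Logging and Reporting API drafts; the function name, group name, collector URL and default values are hypothetical and are not taken from the paper.

```python
import json


def nel_headers(collector_url, group="nel-probe", max_age=3600,
                success_fraction=1.0, failure_fraction=1.0):
    """Build Report-To and NEL response headers pointing at collector_url.

    Illustrative only: the group name and defaults are assumptions.
    """
    # The abstract describes inducing a connection to an HTTPS server,
    # so this sketch rejects any other scheme.
    if not collector_url.startswith("https://"):
        raise ValueError("collector must be an HTTPS URL")

    # Report-To declares a named endpoint group and where its reports go.
    report_to = {
        "group": group,
        "max_age": max_age,
        "endpoints": [{"url": collector_url}],
    }
    # NEL tells the browser which group to use and what share of
    # successful and failed requests to report.
    nel = {
        "report_to": group,
        "max_age": max_age,
        "success_fraction": success_fraction,
        "failure_fraction": failure_fraction,
    }
    return {"Report-To": json.dumps(report_to), "NEL": json.dumps(nel)}


headers = nel_headers("https://probe-1.cdn.example/report")
for name, value in headers.items():
    print(f"{name}: {value}")
```

Changing `collector_url` between responses is what would let the server pick the measurement target; how the paper derives RTT and availability from the resulting uploads is not covered by this sketch.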

Download from ACM

Towards retina-quality VR video streaming: 15ms could save you 80% of your bandwidth

Luke Hsiao, Brooke Krajancich, Philip Levis, Gordon Wetzstein, Keith Winstein

Abstract

Virtual reality systems today cannot yet stream immersive, retina-quality virtual reality video over a network. One of the greatest challenges to this goal is the sheer data rate required to transmit retina-quality video frames at high resolutions and frame rates. Recent work has leveraged the decay of visual acuity in human perception in novel gaze-contingent video compression techniques. In this paper, we show that reducing the motion-to-photon latency of a system itself is a key method for improving the compression ratio of gaze-contingent compression. Our key finding is that a client and streaming server system with sub-15ms latency can achieve 5x better compression than traditional techniques while also using simpler software algorithms than previous work.
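The "80%" in the title and the "5x" in the abstract can be reconciled with a line of arithmetic, assuming that "5x better compression" means the stream needs one fifth of the bits at comparable visual quality. That reading is an assumption, and the function name below is hypothetical.

```python
def bandwidth_saving(compression_gain):
    """Fraction of bandwidth saved when the stream shrinks by compression_gain."""
    if compression_gain <= 0:
        raise ValueError("compression gain must be positive")
    # A stream that is 1/gain of its original size saves 1 - 1/gain.
    return 1.0 - 1.0 / compression_gain


print(f"{bandwidth_saving(5):.0%}")
```

Under that assumption, a 5x gain corresponds to an 80% saving.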

Download from ACM

The January 2022 issue

This January 2022 issue contains three technical papers and four editorial notes.

The first technical paper, Zeph & Iris Map the Internet – A resilient reinforcement learning approach to distributed IP route tracing, by Matthieu Gouel and colleagues, proposes to improve topology discovery by optimizing the use of existing probing resources. This can be done by intelligently allocating probing directives to vantage points. The system is based on the inter-working of two components: Iris, which takes care of the route tracing, and Zeph, which coordinates Iris’s measurements. The results in the paper show that Zeph, in combination with Iris, is able to facilitate fast topology measurements from geographically distributed vantage points.

The second technical paper, Towards Retina-Quality VR Video Streaming: 15ms Could Save You 80% of Your Bandwidth, by Luke Hsiao and colleagues, investigates how to provide retina-quality video streaming in virtual reality (VR). The paper studies the impact on a VR system of the motion-to-photon latency (the time between a change in the viewer’s gaze and the resulting change in the display’s pixels). This metric is paramount for VR systems since it impacts video compression. The paper shows, experimentally, that a client and streaming server system with sub-15 ms end-to-end motion-to-photon latency benefits from 5x better video compression than in the presence of larger latencies. The paper also shows how to build such a low-latency system in both hardware and software.

The third technical paper, Towards client-side active measurements without application control, by Palak Goenka and colleagues, proposes to harness Network Error Logging (NEL) to enable active client-side measurements (RTT and connection availability) by dynamically modifying the HTTPS endpoint where NEL reports should be uploaded. NEL is a W3C specification that defines how web servers can receive reports from a browser about the performance and failures of web requests. The techniques used in the paper enable active client-side measurements in the browser without requiring JavaScript code injection, which is the current and more invasive state-of-the-art solution.

Finally, we have four editorial notes. Roadmap for Edge AI: A Dagstuhl Perspective, by Aaron Yi Ding and his colleagues, based on the collective input of Dagstuhl Seminar (21342), presents a comprehensive discussion on AI methods and capabilities in the context of edge computing, referred to as Edge AI. Then, M-Lab: User initiated Internet data for the research community, by Phillipa Gill and her colleagues, presents Measurement Lab (M-Lab), an open, distributed server platform on which researchers have deployed measurement tools. Important Concepts in Data Communications, by Craig Partridge, presents one perspective on which concepts or ideas have proven to be enduring in the evolution of data communications. Finally, Answering Three Questions About Networking Research, by Jennifer Rexford and Scott Shenker, presents the first of a series of answers to three questions that were posed to panelists during HotNets’21: how they pick their own research topics, what areas they would like to see more research on, and how they evaluate conference papers.

I hope that you will enjoy reading this new issue and welcome comments and suggestions on CCR Online (https://ccronline.sigcomm.org) or by email at ccr-editor at sigcomm.org.

Zeph & Iris map the internet: A resilient reinforcement learning approach to distributed IP route tracing

Matthieu Gouel, Kevin Vermeulen, Maxime Mouchet, Justin P. Rohrer, Olivier Fourmaux, Timur Friedman

Abstract

We describe a new system for distributed tracing at the IP level of the routes that packets take through the IPv4 internet. Our Zeph algorithm coordinates route tracing efforts across agents at multiple vantage points, assigning to each agent a number of /24 destination prefixes in proportion to its probing budget and chosen according to a reinforcement learning heuristic that aims to maximize the number of multipath links discovered. Zeph runs on top of Iris, our fault-tolerant system for orchestrating internet measurements across distributed agents of heterogeneous probing capacities. Iris is built around third party free open source software and modern containerization technology, thereby presenting a new model for assembling a resilient and maintainable internet measurement architecture. We show that carefully choosing which destinations to probe from which vantage point matters for optimizing topology discovery and that a system can learn which assignment will maximize the overall discovery based on previous measurements. After 10 cycles of probing, Zeph is capable of discovering 2.4M nodes and 10M links in a cycle of 6 hours, when deployed on 5 Iris agents. This is at least 2 times more nodes and 5 times more links than other production systems for the same number of prefixes probed.
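To make the allocation idea concrete, here is a minimal sketch of budget-proportional prefix assignment with a simple exploit/explore split. It is not the authors' Zeph heuristic: the epsilon-greedy rule stands in for the reinforcement learning step, and the function name, parameters and data layout are all hypothetical.

```python
import random


def allocate(prefixes, budgets, past_links, total, epsilon=0.2, seed=0):
    """Assign /24 prefixes to agents in proportion to their probing budgets.

    prefixes:   list of candidate /24 prefixes
    budgets:    {agent: relative probing budget}
    past_links: {agent: {prefix: links discovered in earlier cycles}}
    total:      total number of (agent, prefix) assignments to hand out
    """
    rng = random.Random(seed)
    budget_sum = sum(budgets.values())
    plan = {}
    for agent, budget in budgets.items():
        # Quota is proportional to the agent's share of the total budget.
        quota = min(len(prefixes), total * budget // budget_sum)
        n_explore = int(quota * epsilon)
        n_exploit = quota - n_explore
        # Exploit: prefixes that yielded the most links for this agent so far.
        scores = past_links.get(agent, {})
        ranked = sorted(prefixes, key=lambda p: scores.get(p, 0), reverse=True)
        exploit = ranked[:n_exploit]
        # Explore: a random sample from the remaining prefixes.
        explore = rng.sample(ranked[n_exploit:], n_explore)
        plan[agent] = exploit + explore
    return plan


prefixes = [f"10.0.{i}.0/24" for i in range(20)]
plan = allocate(prefixes, {"a": 1, "b": 3},
                {"a": {"10.0.5.0/24": 7}}, total=12)
print({agent: len(assigned) for agent, assigned in plan.items()})
```

In this toy run agent "b" has three times the budget of agent "a" and so receives three times as many prefixes, while the prefix that previously paid off for "a" is kept in its exploit set.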

Download from ACM

Data-driven networking research: models for academic collaboration with industry (a Google point of view)

Jeffrey C. Mogul, Priya Mahadevan, Christophe Diot, John Wilkes, Phillipa Gill, Amin Vahdat

Abstract

We in Google’s various networking teams would like to increase our collaborations with academic researchers related to data-driven networking research. There are some significant constraints on our ability to directly share data, which are not always widely understood in the academic community; this document provides a brief summary. We describe some models which can work – primarily, interns and visiting scientists working temporarily as employees, which simplifies the handling of some confidentiality and privacy issues. We describe some specific areas where we would welcome proposals to work within those models.

Download from ACM

An educational toolkit for teaching cloud computing

Cosimo Anglano, Massimo Canonico, Marco Guazzone

Abstract

In an educational context, experimenting with a real cloud computing platform is very important to let students understand the core concepts, methodologies and technologies of cloud computing. However, API heterogeneity of cloud providers complicates the experimentation by forcing students to focus on the use of different APIs, and by hindering the joint use of different platforms. In this paper, we present EasyCloud, a toolkit enabling the easy and effective use of different cloud platforms. In particular, we describe its features, architecture, scalability, and use in our cloud computing courses, as well as the pedagogical insights we learnt over the years.

Download from ACM

Machine learning-based analysis of COVID-19 pandemic impact on US research networks

Mariam Kiran, Scott Campbell, Fatema Bannat Wala, Nick Buraglio, Inder Monga

Abstract

This study explores how fallout from the changing public health policy around COVID-19 has changed how researchers access and process their science experiments. Using a combination of techniques from statistical analysis and machine learning, we conduct a retrospective analysis of historical network data for a period around the stay-at-home orders that took place in March 2020. Our analysis takes data from the entire ESnet infrastructure to explore DOE high-performance computing (HPC) resources at OLCF, ALCF, and NERSC, as well as User sites such as PNNL and JLAB. We look at detecting and quantifying changes in site activity using a combination of t-Distributed Stochastic Neighbor Embedding (t-SNE) and decision tree analysis. Our findings bring insights into the working patterns and impact on data volume movements, particularly during late-night hours and weekends.

Download from ACM

REDACT: refraction networking from the data center

Arjun Devraj, Liang Wang, Jennifer Rexford

Abstract

Refraction networking is a promising censorship circumvention technique in which a participating router along the path to an innocuous destination deflects traffic to a covert site that is otherwise blocked by the censor. However, refraction networking faces major practical challenges due to performance issues and various attacks (e.g., routing-around-the-decoy and fingerprinting). Given that many sites are now hosted in the cloud, data centers offer an advantageous setting to implement refraction networking due to the physical proximity and similarity of hosted sites. We propose REDACT, a novel class of refraction networking solutions where the decoy router is a border router of a multi-tenant data center and the decoy and covert sites are tenants within the same data center. We highlight one specific example REDACT protocol, which leverages TLS session resumption to address the performance and implementation challenges in prior refraction networking protocols. REDACT also offers scope for other designs with different realistic use cases and assumptions.

Download from ACM