Category Archives: CCR April 2024

Towards Re-Architecting Today’s Internet for Survivability: NSF Workshop Report

Fabián E. Bustamante, John Doyle, Walter Willinger, Marwan Fayed, David L. Alderson, Steven Low, Stefan Savage, Henning Schulzrinne

Abstract

On November 28–29, 2023, Northwestern University hosted a workshop titled “Towards Re-architecting Today’s Internet for Survivability” in Evanston, Illinois, US. The goal of the workshop was to bring together a group of national and international experts to sketch and start implementing a transformative research agenda for solving one of our community’s most challenging yet important tasks: the re-architecting of tomorrow’s Internet for “survivability”, ensuring that the network is able to fulfill its mission even in the presence of large-scale catastrophic events. This report provides a necessarily brief overview of two full days of active discussions.

Download from ACM

On Sample Selection for Continual Learning: A Video Streaming Case Study

Alexander Dietmüller, Romain Jacob, Laurent Vanbever

Abstract

Machine learning (ML) is a powerful tool to model the complexity of communication networks. As networks evolve, we cannot simply train once and deploy. Retraining models, known as continual learning, is necessary. Yet, to date, there is no established methodology to answer the key questions: With which samples to retrain? When should we retrain?
We address these questions with the sample selection system Memento, which maintains a training set with the “most useful” samples to maximize sample space coverage. Memento particularly benefits rare patterns—the notoriously long “tail” in networking—and allows assessing rationally when retraining may help, i.e., when the coverage changes.
We deployed Memento on Puffer, the live-TV streaming project, and achieved a 14% reduction of stall time, 3.5× the improvement of random sample selection. Memento is model-agnostic and can be applied beyond video streaming.
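To make the idea of coverage-driven sample selection concrete, here is a minimal sketch. It is not Memento's actual algorithm, which the abstract does not detail. It assumes a single numeric feature per sample, fixed-width bins as a crude stand-in for density estimation, and a hypothetical `CoverageBuffer` class invented for illustration. When the buffer is full, it evicts from the most crowded bin, so rare samples in the tail tend to survive.

```python
import random
from collections import defaultdict


class CoverageBuffer:
    """Hypothetical fixed-capacity training set that evicts from dense regions."""

    def __init__(self, capacity, bin_width, seed=0):
        self.capacity = capacity
        self.bin_width = bin_width      # assumption: fixed-width 1-D bins
        self.bins = defaultdict(list)   # bin index -> samples in that bin
        self.size = 0
        self.rng = random.Random(seed)

    def add(self, value):
        """Insert a sample, then evict one if the buffer is over capacity."""
        self.bins[int(value // self.bin_width)].append(value)
        self.size += 1
        if self.size > self.capacity:
            self._evict()

    def _evict(self):
        # Drop a random sample from the most crowded bin.
        densest = max(self.bins, key=lambda b: len(self.bins[b]))
        samples = self.bins[densest]
        samples.pop(self.rng.randrange(len(samples)))
        self.size -= 1
        if not samples:
            del self.bins[densest]

    def samples(self):
        return [v for group in self.bins.values() for v in group]

    def occupied_bins(self):
        """Crude coverage measure: number of bins holding at least one sample."""
        return len(self.bins)


# Usage: many common samples in [0, 1) and five rare ones near 9.
stream_rng = random.Random(42)
buf = CoverageBuffer(capacity=20, bin_width=1.0)
for _ in range(1000):
    buf.add(stream_rng.random())
for i in range(5):
    buf.add(9.0 + 0.1 * i)

rare_kept = sum(1 for v in buf.samples() if v >= 9.0)
print(len(buf.samples()), rare_kept, buf.occupied_bins())
```

In this sketch, a change in `occupied_bins()` could serve as a simple signal that coverage has shifted and retraining may help. A random-eviction buffer of the same size would be expected to keep almost none of the rare samples.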

Download from ACM

The April 2024 issue

This April 2024 issue contains two technical papers and one editorial note.

The first technical paper, This Is a Local Domain: On Amassing Country-Code Top-Level Domains from Public Data, by Raffaele Sommese and colleagues, presents a measurement study that investigates ccTLD coverage using public data sources. Domain lists such as Alexa and Tranco are crucial tools for performing representative Web censuses. However, these lists often overlook domains under country-code top-level domains (ccTLDs), resulting in biased census outcomes. The authors demonstrate that data from Certificate Transparency (CT) logs and Common Crawl provide robust ccTLD coverage and can serve as reliable proxies for Web censuses. The authors also plan to release ccTLD domain names to the community as part of an expansion of the daily OpenINTEL measurement.

The second technical paper, On Sample Selection for Continual Learning: A Video Streaming Case Study, by Alexander Dietmüller and colleagues, proposes Memento, a prototype system that aims to improve sample selection, especially in the tail of the traffic distribution. An already sizable and increasing body of work focuses on using ML models to capture the inherent complexity of communication networks. Yet as network traffic evolves, ML approaches must contend with concept drift, which causes a model to eventually underperform, so models need retraining. When to retrain and on what data remain difficult problems. This paper focuses on precisely those open issues.

Finally, the editorial note, Towards Re-Architecting Today’s Internet for Survivability: NSF Workshop Report, by Fabián E. Bustamante and colleagues, reports on a workshop that took place on November 28–29, 2023, at Northwestern University. The goal of the workshop was to bring together a group of national and international experts to sketch and start implementing a transformative research agenda for solving one of our community’s most challenging yet important tasks: the re-architecting of tomorrow’s Internet for “survivability”, ensuring that the network is able to fulfill its mission even in the presence of large-scale catastrophic events.

I hope that you will enjoy reading this new issue, and I welcome comments and suggestions on CCR Online (https://ccronline.sigcomm.org) or by email at ccr-editor at sigcomm.org.

This Is a Local Domain: On Amassing Country-Code Top-Level Domains from Public Data

Raffaele Sommese, Roland van Rijswijk-Deij, Mattijs Jonker

Abstract

Domain lists are a key ingredient for representative censuses of the Web. Unfortunately, such censuses typically lack a view on domains under country-code top-level domains (ccTLDs). This introduces unwanted bias: many countries have a rich local Web that remains hidden if their ccTLDs are not considered. The reason ccTLDs are rarely considered is that gaining access – if possible at all – is often laborious. To tackle this, we ask: what can we learn about ccTLDs from public sources? We extract domain names under ccTLDs from 6 years of public data from Certificate Transparency logs and Common Crawl. We compare this against ground truth for 19 ccTLDs for which we have the full DNS zone. We find that public data covers 43%–80% of these ccTLDs, and that coverage grows over time. By also comparing port scan data, we then show that these public sources reveal a significant part of the Web presence under a ccTLD. We conclude that in the absence of full access to ccTLDs, domain names learned from public sources can be a good proxy when performing Web censuses.
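
The coverage comparison described above can be sketched as a set intersection: the share of names in the ground-truth zone that also appear in the public data. This is only an illustration, not the authors' pipeline. The function names and the example domains are hypothetical, and the sketch assumes both inputs have already been reduced to registered domain names under a single ccTLD.

```python
def normalize(name: str) -> str:
    """Lowercase a domain name and drop any trailing root dot."""
    return name.strip().lower().rstrip(".")


def cctld_coverage(public_names, zone_names) -> float:
    """Fraction of ground-truth zone names that also appear in public data."""
    zone = {normalize(n) for n in zone_names}
    if not zone:
        raise ValueError("ground-truth zone is empty")
    public = {normalize(n) for n in public_names}
    return len(public & zone) / len(zone)


# Hypothetical data: a four-name zone and three names seen in public sources.
zone = ["example.nl.", "shop.nl", "museum.nl", "blog.nl"]
public = ["Example.NL", "shop.nl", "unknown.nl"]

coverage = cctld_coverage(public, zone)
print(f"{coverage:.0%}")  # prints "50%"
```

Names seen in public data but absent from the zone (here `unknown.nl`) do not count toward coverage. They may be expired or never-registered names, which a real study would need to handle separately.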

Download from ACM