Framework, Models and Controlled Experiments of Network Troubleshooting

Abstract:

Growing network complexity mandates automated tools and methodologies for troubleshooting. In this paper, we follow a crowd-sourcing trend and argue for the need to deploy measurement probes at the edge of the network. Such probes can be under the control of either the users (e.g., end-user devices) or the ISP (e.g., home gateways), which raises an interesting tradeoff.

Our first contribution is the definition of a framework for network troubleshooting and its implementation as open source software named NetProbes. In data mining terms, depending on the amount of information available to the probes (e.g., ISP topology), we formalize the network troubleshooting task as either a clustering or a classification problem. In networking terms, these algorithms allow end-users to assess the severity of the network performance degradation and ISPs to precisely identify the faulty link, respectively. We solve both problems with an algorithm that achieves perfect classification under the assumption of a strategic selection of probes (e.g., assisted by an ISP), and we assess its performance degradation under a naive random selection. Our algorithm is generic, as it is agnostic to the network performance metrics; scalable, as it requires firing only a few measurement events and simple processing; flexible, as the clustering and classification stages are pipelined, so that the execution naturally adapts to the information available at the vantage point where the probe is deployed; and reliable, as it produces results that match the expectations of simple analytical models.
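The abstract does not spell out the algorithm, so the following is only a minimal sketch of how a pipelined clustering and classification stage could look, not the NetProbes implementation. It assumes a single faulty link, a largest-gap split of the measured values as the clustering step, and a path-intersection rule as the classification step. The function names, the `min_ratio` threshold, the toy topology and the delay values are all hypothetical.

```python
from statistics import mean

def cluster_paths(delays, min_ratio=1.5):
    """Clustering stage (no topology needed): split probes into healthy and
    degraded sets at the largest gap in the sorted measurements.
    `min_ratio` is an assumed threshold, not a value from the paper."""
    ordered = sorted(delays, key=delays.get)
    if len(ordered) < 2:
        return set(ordered), set()
    gaps = [delays[b] - delays[a] for a, b in zip(ordered, ordered[1:])]
    cut = gaps.index(max(gaps)) + 1
    healthy, degraded = ordered[:cut], ordered[cut:]
    # If the two groups are not clearly separated, report no degradation.
    if mean(delays[p] for p in degraded) < min_ratio * mean(delays[p] for p in healthy):
        return set(ordered), set()
    return set(healthy), set(degraded)

def classify_link(paths, healthy, degraded):
    """Classification stage (topology known, single failure assumed): keep the
    links shared by every degraded path and absent from every healthy path."""
    if not degraded:
        return set()
    suspects = set.intersection(*(set(paths[p]) for p in degraded))
    for p in healthy:
        suspects -= set(paths[p])
    return suspects

# Hypothetical tree topology: probes a, b sit behind link "agg1"; c, d behind "agg2".
PATHS = {
    "a": ["acc_a", "agg1", "core"],
    "b": ["acc_b", "agg1", "core"],
    "c": ["acc_c", "agg2", "core"],
    "d": ["acc_d", "agg2", "core"],
}
DELAYS = {"a": 52.0, "b": 55.0, "c": 11.0, "d": 10.0}  # made-up delays in ms

healthy, degraded = cluster_paths(DELAYS)
faulty = classify_link(PATHS, healthy, degraded)

# With fewer probes (here only a and c), the faulty link is no longer unique.
few = {p: DELAYS[p] for p in ("a", "c")}
ambiguous = classify_link(PATHS, *cluster_paths(few))
```

In this toy setting, measuring from all four probes isolates a single suspect link, whereas measuring from only two leaves two candidates. This mirrors, in a very simplified way, the dependence of accuracy on probe selection that the abstract mentions. Because the clustering step only sorts scalar values, the same sketch applies to any metric for which larger means worse.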

Our second contribution is a careful evaluation of the framework. Previous work on network troubleshooting has so far tackled the problem with either more theoretical or more practical approaches: inherently, evaluation methodologies lack either realism or control. In this paper, we counter this problem by conducting controlled experiments with a rigorous and reproducible methodology that contrasts the expectations yielded by analytical models with the experimental results gathered by running our NetProbes software in the Mininet emulator. As an integral part of our methodology, we perform a thorough calibration of the measurement tools employed by NetProbes to measure two example metrics of interest, namely delay and bandwidth: we show this step to be crucial, as otherwise significant biases in the measurement techniques could lead to a wrong assessment of algorithmic performance. Although our NetProbes software is far from being a carrier-grade solution for network troubleshooting (since it considers neither multiple simultaneous measurements nor multiple failures, and given that we experiment with a limited number of metrics), our controlled study allows us to make several interesting observations that help in designing such an automated troubleshooting system.

Document type:
Journal article
Elsevier Computer Networks, 2016, 107, pp.36-54
Contributor: Admin Télécom Paristech
Submitted on: Tuesday, October 18, 2016 - 12:48:05
Last modified on: Wednesday, March 21, 2018 - 18:57:46


  • HAL Id: hal-01383255, version 1


François Espinet, Diana Joumblatt, D. Rossi. Framework, Models and Controlled Experiments of Network Troubleshooting. Elsevier Computer Networks, 2016, 107, pp.36-54. 〈hal-01383255〉


