Toward online testing of federated and heterogeneous distributed systems

Marco Canini, Vojin Jovanovic, Daniele Venzano, Boris Spasojevic, Olivier Crameri, Dejan Kostic

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

3 Scopus citations

Abstract

Making distributed systems reliable is notoriously difficult. It is even more difficult to achieve high reliability for federated and heterogeneous systems, i.e., those that are operated by multiple administrative entities and have numerous inter-operable implementations. A prime example of such a system is the Internet's inter-domain routing, today based on BGP. We argue that system reliability should be improved by proactively identifying potential faults using an online testing functionality. We propose DiCE, an approach that continuously and automatically explores the system behavior, to check whether the system deviates from its desired behavior. DiCE orchestrates the exploration of relevant system behaviors by subjecting system nodes to many possible inputs that exercise node actions. DiCE starts exploring from current, live system state, and operates in isolation from the deployed system. We describe our experience in integrating DiCE with an open-source BGP router. We evaluate the prototype's ability to quickly detect origin misconfiguration, a recurring operator mistake that causes Internet-wide outages. We also quantify DiCE's overhead and find it to have marginal impact on system performance.
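
The exploration loop described in the abstract can be illustrated with a minimal sketch. It is not DiCE's implementation: the toy routing table, the `EXPECTED_ORIGIN` map, and every function name below are hypothetical, chosen only to show the idea of starting from a copy of live state, applying candidate inputs in isolation, and checking a desired property (here, that a prefix keeps its expected origin AS).

```python
import copy

# Hypothetical toy model: none of these names come from DiCE itself.
# Assumed ground truth: which AS is expected to originate each prefix.
EXPECTED_ORIGIN = {"10.0.0.0/8": 65001, "192.0.2.0/24": 65002}


def apply_announcement(table, announcement):
    """Toy node action: install the announced origin AS for a prefix."""
    prefix, origin_as = announcement
    table[prefix] = origin_as


def violates_origin_property(table):
    """Return the prefixes whose installed origin differs from the expected one."""
    return sorted(
        prefix
        for prefix, origin_as in table.items()
        if prefix in EXPECTED_ORIGIN and EXPECTED_ORIGIN[prefix] != origin_as
    )


def explore(live_table, candidate_inputs):
    """Apply each input to a private copy of the live state and collect violations."""
    findings = []
    for announcement in candidate_inputs:
        # Isolation: work on a deep copy so the live table is never mutated.
        snapshot = copy.deepcopy(live_table)
        apply_announcement(snapshot, announcement)
        bad_prefixes = violates_origin_property(snapshot)
        if bad_prefixes:
            findings.append((announcement, bad_prefixes))
    return findings


live = {"10.0.0.0/8": 65001, "192.0.2.0/24": 65002}
inputs = [
    ("10.0.0.0/8", 65001),       # consistent with the expected origin
    ("192.0.2.0/24", 65999),     # wrong origin AS for a known prefix
    ("198.51.100.0/24", 65003),  # prefix with no expectation recorded
]
findings = explore(live, inputs)
print(findings)
```

In this toy run only the second announcement is flagged, and `live` is left unchanged because each input is applied to a copy. A real system would need to snapshot far richer state and generate inputs systematically rather than from a fixed list.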

Original language: English (US)
Title of host publication: Proceedings of the 2011 USENIX Annual Technical Conference, USENIX ATC 2011
Publisher: USENIX Association
Pages: 241-246
Number of pages: 6
ISBN (Electronic): 9781931971850
State: Published - 2019
Event: 2011 USENIX Annual Technical Conference, USENIX ATC 2011 - Portland, United States
Duration: Jun 15 2011 - Jun 17 2011

Publication series

Name: Proceedings of the 2011 USENIX Annual Technical Conference, USENIX ATC 2011

Conference

Conference: 2011 USENIX Annual Technical Conference, USENIX ATC 2011
Country/Territory: United States
City: Portland
Period: 06/15/11 - 06/17/11

ASJC Scopus subject areas

  • General Computer Science
