L
Título: New Systems and Algorithms for Scalable Fault Tolerance
Autores: Sen, Siddhartha
Fecha: 2013-05-21
2013-05-21
2013
Publicador: Princeton, NJ : Princeton University
Fuente: Ver documento
Tipo: Academic dissertations (Ph.D.)
Tema: balanced trees
Byzantine fault tolerance
database access methods
expander graphs
join-leave attacks
partial broadcast
Computer science
Applied mathematics
Descripción: Users of Internet services are increasingly intolerant of delays and outages, while demanding a consistent online experience. A website that is down or misbehaving is reported within seconds, often with an embarrassing screenshot that spreads through the news like wildfire. Among these failures, the most notorious are the ones that manifest arbitrary behavior, such as returning the wrong content to users or accidentally deleting their data. Unfortunately, protecting against such failures---whether due to misconfigurations, bugs, or even malice---is prohibitively expensive, because most existing solutions do not scale beyond a single server's performance. As a result, these solutions are not used for customer-facing services, where scalability is required to cope with large user populations. This thesis describes new systems and algorithms for tolerating arbitrary failures in Internet services, inspired by real-world debacles. Unlike prior work, our solutions are highly scalable. Our approach integrates theoretical innovations into the later stages of system design, giving robust guarantees that are also practical. We begin with a real failure that occurred in the indexing technique used by a certain database provider, and explain theoretically why the technique failed. We remedy the technique by introducing a new class of tree data structures, called relaxed trees, with provably good properties. Our analysis of relaxed trees makes use of exponential potential functions. Then, we describe a general system for tolerating arbitrary failures, called Prophecy, that delivers scalable performance on read-mostly workloads. With a modest trust assumption, Prophecy is practical for modern Internet services, as our evaluation confirms. Finally, we devise two techniques to scale this fault tolerance to very large-scale systems and general workloads. The first is an algorithm for securely composing many small replica groups, subject to an adversary that can coordinate faulty nodes across the groups dynamically. The second is a technique for improving the fault tolerance within each replica group, by adding small, trusted broadcast channels that mitigate the impact of faulty nodes.
Idioma: Inglés
Artículos similares:
Engineering solutions for a carbon-constrained world por Celia, M. A.,Nordbotten, J. M.
Impact of capillary forces on large-scale migration of CO2 por Nordbotten, Jan M.,Dahle, Helge K.
Impact of geological heterogeneity on early-stage CO2 plume migration por Ashraf, Meisam,Lie, Knut-Andreas,Nilsen, Halvor M.,Nordbotten, Jan M.,Skorstad, Arne
A model-oriented benchmark problem for CO2 storage por Dahle, Helge K.,Eigestad, Geir T.,Nordbotten, Jan M.,Pruess, K.
CO2 trapping in sloping aqiufers: High resolution numerical simulations por Elenius, Maria,Tchelepi, Hamdi,Johannsen, Klaus
Report from CO2 storage workshop por Dahle, Helge K.,Lien, Martha,Nordbotten, Jan M.,Lie, Knut-Andreas,Braathen, Alvar,Helmig, Rainer,Class, Holger,Celia, Michael A.
Summary of Princeton Workshop on Geological Storage of CO2 por Celia, Michael A.,Nordbotten, Jan M.,Bachu, Stefan,Kavetski, Dmitri,Gasda, Sarah
10