Quality of Availability for Widely Distributed and Replicated Content Stores
Key: On04-1
Author: Giwon On
Date: June 2004
Kind: @phdthesis
Abstract: </SPAN> <BR> </P> <P>As the Internet continues to experience a tremendous expansion in number of users and services today, the importance of satisfying service availability becomes recently one of the most critical factors for the success of "Internet services" such as the World Wide Web, peer-to-peer file sharing and digital audio/video streaming. In this thesis, we take an availability-centric view on end-to-end service Quality of Service (QoS) and investigate service availability for large-scale content distribution and replication systems in wide-area internetworks such as the Internet. The main focus is on issues of providing availability guarantees, i.e., delivering target level of service availability with guarantees for all individual users of services running on top of the above systems. Specifically, the thesis aims at the development of concept and framework for technically realizing the availability-centred QoS approach, on the one hand, and of decision strategies for content replication in order to provide availability guarantee for the large-scale Internet-based services, on the other hand. In particular, we tackle the replica selection and placement problem and study the effectiveness of realistic replication strategies on the achieved service availability in the Internet. There are several contributions that this thesis makes. The first contribution is the concept of Quality of Availability (QoA) which enables one to treat availability as a controllable and observable QoS parameter. On the basis of this concept, three refinements of the existing availability definition have been investigated (decoupled, differentiated and fine-grained) in order to enable a more quantitative specification and evaluation of the service availability. For a technical realization of the QoA concept, we have developed a QoA framework in which users can specify their service requirements in terms of QoA and the requirements are mapped into the low-level replication specification. Additionally, the QoA framework offers QoA metrics and mechanisms to control the QoA guarantee. The second contribution is an evaluation of the replication techniques used to resolve the replica placement problem for different content distribution and replication system models. We have tackled the replica placement problem and investigated specifically the effects of the degree of replication, the granularity and location of replicas on overall service availability achieved by the selected placements. We divided the placement problem into two sub-problems, (1) the static full server replica placement in content delivery network (CDN)-like replicated systems and (2) the dynamic partial data replica placement in peer-to-peer (P2P) systems. For solving the static full server replica placement problem, we first modelled the content distribution and replication system as a stochastic graph and then developed several ranking-based heuristics and an exact state enumeration algorithm for improving and guaranteeing service availability of the system. By simulating the behaviour of the proposed algorithms, we found that the location of replicas is a relevant factor for the service availability. Even though the QoA improvement could be achieved by increasing replica numbers, replicas&#039; placement and their dependability affected the QoA more significantly. In the case of the dynamic replica placement problem in P2P-based content distribution and replication systems, we took the intermittent connectivity of peers of the system explicitly into account. We modelled the system as a dynamic stochastic graph where each node (peer) goes up and down according to its assigned up-time probability. We considered different time scales for replica creation so that replicas could be created either proactively at the service initialization phase, or on-the-fly during the service running phase. We re-used the placement heuristics used for solving the static replica placement problem and modified them slightly so that they are fully decentralized, assume no global information about the system condition or network topology, and work in a cooperative way to replicate content on-the-fly. To quantitatively study the effectiveness of the used heuristics, we developed an event-driven simulation framework which captures the data access model as well as peers&#039; dynamic behaviour. The simulation results indicated that the cooperative placement heuristics offer, in general, better QoA than non-cooperative local placement approach, while all of them induce approximately the same amount of replacement cost in terms of the number of replacements. The third contribution is a new approach for determining replica placements in large-scale content distribution and replication systems, which always provides an availability guarantee for all service requesting entities in a content distribution and replication system. For the first time, we suggested an admission control-based replica placement scheme that solves the static replica placement problem in a polynomial time, with respect to the size of the input system. Through a simulative study, we have experimentally evaluated the admission control-based algorithm with random and power-law graphs of different sizes. Our results show that the placement achieved by the admission control-based algorithm always provides a QoA guarantee, and that the upper bound of the replication degree (i.e., replica numbers) required for delivering QoA guarantees is about 27% and 37% of the total nodes in the random and power-law graph, respectively. As far as we know this is the first time that the admission control concept has been integrated with a replica placement scheme that solves the problem in a polynomial time while offering service availability guarantees.

The documents distributed by this server have been provided by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, not withstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.