FREE ELECTRONIC LIBRARY - Abstracts, online materials

Pages:     | 1 |   ...   | 7 | 8 || 10 | 11 |   ...   | 18 |

«Efficient, scalable consistency for highly fault-tolerant storage GARTH GOODSON August 2004 CMU-CS-04-111 Dept. of Electrical and Computer ...»

-- [ Page 9 ] --

The use of the collision-resistant hash cryptographic primitive protects against Byzantine entities. Byzantine storage-nodes cannot undetectably corrupt values written to them, because the hash of the object’s value is in the timestamp. Byzantine clients cannot perform poisonous writes [Martin et al. 2002]. In a poisonous write, a Byzantine client writes different values to different storage-nodes with the same timestamp. Storage-nodes validate the value sent in a CW request with the value verifier. Since in the R/CW protocol client’s transmit full replicas, as opposed to erasure coded fragments in the R/W protocol, storage-node validation prevents poisonous writes. As well, validation ensures that a Byzantine client cannot make a correct storage-node appear Byzantine (i.e., only Byzan· 64 Efficient, scalable consistency for highly fault-tolerant storage tine storage-nodes return values that cannot be validated by the client).

4.2.2 Authenticators To ensure R/CW semantics, the conditioned-on relationship between the value in the CW request and the candidate in the object history set must be maintained. This conditionedon relationship is validated by a storage-node before it executes a CW request. For a storage-node to validate the conditioned-on relationship, it must “know” that the replica histories in the object history set are indeed those returned by other storage-nodes.

Generally, digital signatures are used for such purposes. However, digital signatures are computationally expensive to compute and verify. Keyed cryptographic hash functions can be evaluated approximately three orders of magnitude faster than digital signatures.

To make Byzantine fault-tolerant agreement efficient, Castro and Liskov used authenticators in lieu of digital signatures in BFT [Castro and Liskov 1998b]. Authenticators are vectors of keyed hashes: the ith element in the vector is used to prove the authenticity of the message to entity i. To enable authenticators, all pairs of entities that need to prove message authenticity to one another must share distinct secret keys. Authenticators are not as strong a primitive as signatures: any entity can verify an entire signature, whereas only entities in the vector of keyed hashes can validate the authenticator (and, at that, only its own entry in the authenticator).

Authenticators are used by storage-nodes in the R/CW protocol to “sign” replica histories returned in read responses. If authenticator validation fails, the storage-node cannot tell if a Byzantine client corrupted a valid replica history, or if a Byzantine storage-node constructed an invalid replica history, since the object history set is constructed by the client. Invalid authenticators are discussed in Section 4.6.2.

During repair, authenticators allow clients with read-only access to an object to perform repair on the object. In many systems, there is an asymmetry between write privileges and read privileges. Authenticators provide sufficient proof to storage-nodes that the object history set is valid and thus that some candidate is repairable. Consequently, storage-nodes can execute “repair” CW requests from read-only clients.


4.3 Protocol 65

–  –  –

This section pseudo-code for classifying candidates, performing conditional write operations, and validating CW requests. As well, read operations and storage-node actions are discussed in detail.

4.3.1 Read operation Figure 4.3 shows the pseudo-code for the read operation. The read operation begins by issuing read requests to the set of N storage-nodes. Given the asynchronous nature of the protocol, and the crash-recovery failure model for storage-nodes, no more than N −t read responses are collected.

In response to a read request, the storage-node returns its replica history. A client can explicitly request a specific version of the object (based on candidate classification), and

the storage-node returns its value. This functionality is used to implement read witnesses:

only one storage-node need return the value of its object replica, the replica histories of other storage-nodes act as witnesses [Pˆ ris 1986] that validate the correctness of the value a (through the object’s value hash).

The client uses the returned replica histories to construct the ObjectHistorySet (cf.

line 100). Recall, each entry in a replica history is a logical timestamp, conditioned-on logical timestamp, value tuple. For simplicity of presentation, data corresponding to each entry in the replica history is returned. However, in practice, only the data value associated with the latest logical timestamp is returned. Optimistically, the latest timestamp is usually classified as complete and this data is sufficient, otherwise an extra read round-trip is required to fetch the appropriate data version (as in the R/W protocol).

A storage-node also attaches an authenticator to the replica history it returns. When the client constructs the object history set, ObjectHistorySet, each replica history in the set has an authenticator. The client can cache the object history set and authenticators and use them in a future CW request.

The pseudo-code for the CW DO READ operation is very similar to the code shown · 66 Efficient, scalable consistency for highly fault-tolerant storage

READ() :

100: ObjectHistorySet := DO READ() 101: Candidate, Status := CLASSIFY(ObjectHistorySet) 102: if (Status = CLASSIFIED COMPLETE) then 103: return (SUCCESS, Candidate.LT, Candidate.Data ) 104: else 105: /∗ Status = CLASSIFIED REPAIRABLE ∗/ 106: return (CONDITIONAL WRITE(Candidate, Candidate.LT conditioned, Candidate.Data, ObjectHistorySet)) 107: end if


/ 200: ResponseSet := 0 201: repeat 202: for all S ∈ {S1,..., SN } \ ResponseSet.S do 203: SEND(S, READ REQUEST) 204: end for 205: if (POLL FOR RESPONSE() = TRUE) then 206: S, ReplicaHistory := RECEIVE READ RESPONSE() 207: if ((S ∈ ResponseSet.S) AND VALIDATE(ReplicaHistory) = SUCCESS)) then / 208: ObjectHistorySet[S] := ReplicaHistory 209: ResponseSet := ResponseSet ∪ S 210: end if 211: end if 212: until (|ResponseSet| = N − t) 213: return (ObjectHistorySet)

CLASSIFY(ObjectHistorySet) :

300: Candidate, Count := SELECT CS(ObjectHistorySet, ⊥) 301: loop 302: if (Count ≥ COMPLETE) then 303: return ( Candidate, CLASSIFIED COMPLETE ) 304: else if (Count ≥ INCOMPLETE) then 305: return ( Candidate, CLASSIFIED REPAIRABLE ) 306: end if 307: /∗ Incomplete candidate: find new candidate and loop again. ∗/ 308: Candidate, Count := SELECT CS(ObjectHistorySet, Candidate.LT) 309: end loop

Figure 4.3: Client read and classification pseudo-code.

in the R/W protocol chapter (Figure 3.6). First, the DO READ function discards any responses that cannot be validated (cf. line 207). Since data is assumed to be replicated, the value verifier (stored in the timestamp) is simply the hash over the data. Second, the ObjectHistorySet is constructed from the set of storage-node responses (cf. line 208).

The client then performs classification on the object history set (cf. line 101). The pseudo-code for classification is also shown in Figure 4.3. The function iteratively identifies and classifies potential candidates until a valid complete or repairable candidate is found. The function SELECT CS, on line 300, identifies the potential candidate with the highest timestamp, and sets Candidate accordingly. The value Count returned from ·

4.3 Protocol 67 SELECT CS is the count of the number of storage-nodes in the ObjectHistorySet that host the potential candidate. Note, only candidates that can be validated with the value verifier in the timestamp are chosen; as well, barrier-writes are ignored.

Once a potential candidate has been chosen, it is classified as either complete, repairable, or incomplete. To classify a potential candidate, Count is compared with the constants COMPLETE and INCOMPLETE. The derivation of these constants are described in Section 4.4.

If the potential candidate is classified as complete, the read operation returns the value associated with the candidate. If the potential candidate is classified as incomplete, SELECT CS is called again, but with the potential candidate’s timestamp (cf. line 308).

A new potential candidate, with a lower timestamp, is identified. Candidate classification begins again.

If the potential candidate is classified as repairable, the client performs repair. It does so by issuing a CW operation (shown in Figure 4.5) with the value of the repairable candidate, see line 106 in Figure 4.3. If the repairable candidate contains the latest timestamp observed by the client (i.e., its LT = MAX[ObjectHistorySet]), then the client can attempt repair by completing the operation at repairable’s logical time. Otherwise, a barrier is needed to block any competing incomplete operations from completing. When a barrier is needed, a new (higher) logical timestamp is generated. The condition on timestamp (LT conditioned ) is set to the timestamp of the latest complete write. The latest complete write is the CW conditioned on by the repairable candidate. If the repair operation fails, then the read operation aborts and must be retried. An example of a repair that requires a barrier-write is shown in Figure 4.4.

4.3.2 CW operation

The pseudo-code for the CW operation is shown in Figure 4.5. First, the logical timestamp for the CW operation, LT, is constructed. The major part of LT is determined by incrementing the major part of the largest timestamp in the ObjectHistorySet. The veriers are determined by taking the hash of the object history set and the object value for · 68 Efficient, scalable consistency for highly fault-tolerant storage

–  –  –

Figure 4.4: Example of repair requiring a barrier-write.

For this setup: N = 4, COMPLETE = 3, INCOMPLETE = 2. The object history is shown at 4 storage-nodes during the progression of a read operation. The subscript to each version’s timestamp corresponds to the timestamp on which the version is conditioned. First, the client classifies LT 2 as repairable (although it is actually complete). However, an incomplete operation exists at LT 3. To block this operation from completing, a barrier is written at LT 4. Finally, repair is attempted at LT = 5, LT conditioned = 1.

Note, that the condition on time is the same as the repairable candidate’s.

the CW operation. It is important to note that every logical timestamp generated is at least 1 larger than the latest complete write that currently exists in the system.

Before the CW operation is issued with the value, the object history set is checked to see if a barrier is needed (cf. line 407). Barriers ensure that multiple CW operations conditioned-on the same candidate do not complete. A barrier is needed if any of the replica histories in the object history set contain timestamps that are greater than that of the candidate’s timestamp (and are not themselves barriers). If such a timestamp exists, then there may be another CW operation concurrent to this CW operation. If the barrier completes, then the concurrent CW operation cannot complete.

To create a barrier, the client performs a DO WRITE with the ⊥ value, a ⊥ verifier, and LT.Barrier set to TRUE (called a barrier-write). This allows the CW operation to be executed at storage-nodes that host its barrier. The DO WRITE function issues CW requests to all storage-nodes. The object history set is updated with each response DO WRITE receives, line 508. As in DO READ, the replica history is first validated. However, one should note that even if the operation failed, the replica history is returned. This allows the client ·

4.3 Protocol 69

CONDITIONAL WRITE(Candidate, LT conditioned, Data, ObjectHistorySet) :

400: /∗ Construct the logical timestamp for the CW operation. ∗/ 401: LT latest := MAX TIMESTAMP(ObjectHistorySet) 402: LT.Time := LT latest.Time + 1 403: LT.Verifier OHS := HASH(ObjectHistorySet) 404: LT.Verifier Data := HASH(Data) 405: LT.Barrier := FALSE 406: /∗ If necessary, write a barrier. ∗/ 407: if (LT latest Candidate.LT) then 408: LT barrier := LT 409: LT barrier.Verifier Data := ⊥ 410: LT barrier.Barrier := TRUE 411: Count, ObjectHistorySet := DO WRITE(LT barrier, LT conditioned, ObjectHistorySet, ⊥) 412: if (Count COMPLETE) then 413: return (ABORT, ⊥, 0 ) 414: else 415: /∗ Re-classify based on returned object history set∗/ 416: Candidate new, Status := CLASSIFY(ObjectHistorySet) 417: if (Candidate new = Candidate) then 418: /∗ Abort if classification yields different result ∗/ 419: return (ABORT, ⊥, 0 ) 420: end if 421: end if 422: end if 423: /∗ Perform the CW operation. ∗/ 424: Count, ObjectHistorySet := DO WRITE(LT, LT conditioned, ObjectHistorySet, Data) 425: if (Count COMPLETE) then 426: return (ABORT, ⊥, 0 ) 427: end if 428: return (SUCCESS, Data, LT )

–  –  –

to retry a write without performing a read to obtain the latest object history. Once N − t responses have been received (line 515), DO WRITE returns.

Once the barrier-write has completed successfully it is necessary to perform classi· 70 Efficient, scalable consistency for highly fault-tolerant storage

–  –  –

Figure 4.6: Example of a barrier-write.

For this setup: N = 4, COMPLETE = 3, INCOMPLETE = 2.

The object history is shown at 4 storage-nodes during the progression of a CW operation. First, the client classifies LT 2 as incomplete. Next, it attempts to write a barrier at LT = 3, however the barrier-write is concurrent with the completion of 2. Once the barrier-write completes, the client re-classifies LT 2 as complete. The CW operation is aborted and retried at LT = 4, LT conditioned = 2.

Pages:     | 1 |   ...   | 7 | 8 || 10 | 11 |   ...   | 18 |

Similar works:

«The Permissibility of Prerogative Grounded Incentives in Liberal Egalitarianism Dr Alan Thomas _ By/Par Department of Philosophy University of Kent, UK a.p.Thomas@kent.ac.uk This paper evaluates a criticism of the claim that within a fully just liberal egalitarian society it would be permissible to pay special incentives to its more productive members. The envisaged basis for these payments is that the incentives are grounded on reasonable, agentcentred, ethical prerogatives on the part of the...»

«Know Thyself and Your Followers Leadership Advance Online– Issue XVII, Summer 2009 by J. Alan Marshall and Corné J. Bekker Whether credited to the Oracle of Delphi or to Greek philosophers such as Plato et al., the importance of knowing oneself has been emphasized over the ages (Classics, 2008). Mirvis (1982) applied this axiom to research endeavors by proposing that in order to have any hope of meaningful and ethical research, researchers must adhere to this admonition with the variation,...»

«FREEDOM OF CHOICE: A PRAGMATIC ARGUMENT FOR THE PERMISSIBILITY OF ASSISTED SUICIDE BY Michael Joseph Accavitti Jr. Thesis Submitted to the Faculty of the Graduate School of Vanderbilt University in partial fulfillment of the requirements for the degree of MASTER OF ARTS in Philosophy August, 2011 Nashville, Tennessee Approved: Professor John Lachs Professor Michael Hodges Professor Mark Bliton ii To my Father and Mother ii The focus of this thesis will be an arguement for the permissibility of...»

«UNDERSTANDING AND TREATING EYE DISEASES: MECHANICAL CHARACTERIZATION AND PHOTOCHEMICAL MODIFICATION OF THE CORNEA AND SCLERA Thesis by Matthew Sanford Mattson In Partial Fulfillment of the Requirements for the degree of Doctor of Philosophy CALIFORNIA INSTITUTE OF TECHNOLOGY Pasadena, California (Defended May 7, 2008) ii © 2008 Matthew Sanford Mattson All Rights Reserved iii ACKNOWLEDGEMENTS I would like to thank my mom and dad for all their support. Really without them, I never would have...»

«Journal of Philosophy of Life Vol.1, No.2 (July 2011):59-73 [Discussion Paper] How Should We Think About “Enhancement”? Beyond the Proponents vs. Opponents Scheme Masahiro Inaga Abstract This paper outlines a new scheme for bioethical arguments of “enhancement” by introducing the concept of “form of life.” Many authors have argued about the morality of enhancement technologies, but their conflict seems unsolvable because there is no common ground to discuss the morality of...»

«In Vivo Measurement and Visualization of Pelvic Position and Orientation and Changes in Soft Tissue Shape and Thickness with Respect to Changes in Seating Surface Shape Thomas G. Ault CMU-RI-TR-05-56 November 9th, 2005 Carnegie Mellon University Pittsburgh, Pennsylvania 15213 Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in the field of Robotics Thesis Committee: Mel Siegel, Robotics Institute (Chair) David Brienza, University of Pittsburgh Stephen...»

«A Mathematical Model of Divine Infinity Prof. Eric Steinhart, Department of Philosophy, William Paterson University, Wayne NJ 07470. Email: esteinhart1@nyc.rr.com, steinharte@wpunj.edu. Published in 2009 in Theology and Science 7 (3), 261 – 274. ABSTRACT: Mathematics is obviously important in the sciences. And so it is likely to be equally important in any effort that aims to understand God in a scientifically significant way or that aims to clarify the relations between science and theology....»

«Ca r d if f U N I VI R S I T Y I *R I I Y S Ci ( ) 1 Ca' r o w BINDING SERVICES Tel +44 (0)29 2087 4949 Fax.+44 (0)29 2037 1921 E-Mail Bindery@Cardiff.ac.uk SUBJECTIVE ETHNOLINGUISTIC VITALITY OF WELSH IN THE CHUBUT PROVINCE. ARGENTINA IAN JOHNSON Centre for Language and Communication Research English, Communication and Philosophy University o f Wales, Cardiff January 2007 Ca r d if f ESRC UNIVERSITY ECONOMIC & SOC IA L PRIFYSGOL RESEARCH CAERDYg CO UN Cl L UMI Number: U584924 All rights...»

«PARALLEL VOLUME RENDERING OF IRREGULAR GRIDS A Dissertation Presented by Jos Cludio Teixeira e Silva Junior ea to the graduate school in partial fulfillment of the requirements for the Degree of Doctor of Philosophy in Computer Science State University of New York at Stony Brook December 1996 Copyright 1996 c by Jos Cludio Teixeira e Silva Junior ea State University of New York at Stony Brook The Graduate School Jos Cludio Teixeira e Silva Junior ea We, the dissertation committee for the...»

«„The Continent of Murder: Disability and the Nazi „Euthanasia‟ Programme in the Euthanasia Debates of Britain and the United States, 1945-Present‟. Emmeline Burdett UCL Submitted for the Degree of Doctor of Philosophy. I, Emmeline Burdett, confirm that the work presented in this thesis is my own. Where information has been derived from other sources, I confirm that this has been indicated in the thesis. Abstract This thesis considers the impact that ideas about disability and disabled...»

«Man and World 24:461-470, 1991. 9 1991 Kluwer Academic Publishers. Printed in the Netherlands. The existential meaning of the art of theatre in Kierkegaard's philosophy AVI SAGI Department of Philosophy, Bar-llan University, Ramat Gan, Israel In this paper, I will consider Kierkegaard's approach to the art of theatre: Does he view the art of theatre as existentially relevant to concrete existence or does he see the art per se, as well as the attendance at theatrical performances, as reflecting...»

«This is an accepted manuscript of an article published in Philosophical Explorations on 30 April, 2014. Available online: http://www.tandfonline.com/doi/full/10.1080/13869795.2014.910309 This manuscript contains typographical errors corrected in the published version. Con-Reasons and the Causal Theory of Action Jonathan D. Payton Department of Philosophy, University of Toronto jonathan.payton@mail.utoronto.ca A con-reason is a reason which plays a role in motivating and explaining an agent’s...»

<<  HOME   |    CONTACTS
2017 www.abstract.dislib.info - Abstracts, online materials

Materials of this site are available for review, all rights belong to their respective owners.
If you do not agree with the fact that your material is placed on this site, please, email us, we will within 1-2 business days delete him.