FREE ELECTRONIC LIBRARY - Abstracts, online materials

Pages:     | 1 |   ...   | 3 | 4 || 6 | 7 |   ...   | 18 |

«Efficient, scalable consistency for highly fault-tolerant storage GARTH GOODSON August 2004 CMU-CS-04-111 Dept. of Electrical and Computer ...»

-- [ Page 5 ] --

3.2.2 Data-fragment integrity Byzantine storage-nodes can corrupt their data-fragments. As such, it must be possible to detect and mask up to b storage-node integrity faults. Cross checksums [Gong 1989] enable read operations to detect corrupt data-fragments. A cryptographic hash of each data-fragment is computed, and the set of N hashes are concatenated to form the cross · 28 Efficient, scalable consistency for highly fault-tolerant storage Figure 3.2: Example of a “poisonous write” by a Byzantine client. In this example datafragment 5 has been corrupted by a Byzantine client. Decoding different sets of fragments (i.e., 1 and 3 vs. 3 and 5) lead to data-item values that are not equivalent. Therefore, it is necessary to protect good clients from observing differing data values written to the same timestamp.

checksum of the data-item. The cross checksum is stored with each data-fragment (i.e., it is replicated N times), enabling corrupted data-fragments to be detected at read time.

Note, the N 2 space overhead is small relative to the data size, given reasonable data sizes (e.g., there is an 8.3% overhead for N = 17, m = 5, and a 16 KB block). An example of generating a cross checksum is shown in Figure 3.1.

3.2.3 Write operation integrity

–  –  –

Validating timestamps To ensure that only a single set of data-fragments can be written at any logical time, the hash of the cross checksum is placed in the low order bits of the logical timestamp. Note, the hash is used for space-efficiency; instead, the entire cross checksum could be placed in the low bits of the timestamp.

On a write, each storage-node verifies its data-fragment against the corresponding hash in the cross checksum. The storage-node also verifies that the cross checksum matches the low-order bits of the validating timestamp. A correct storage-node only executes write requests for which both checks pass. Thus, a Byzantine client cannot make a correct storage-node appear Byzantine—only Byzantine storage-nodes can return unverifiable responses.

Validated cross checksums

Combining storage-node verification with validating timestamps ensures that the datafragments being considered by any read operation were not fabricated by Byzantine storage-nodes. To ensure that the client that performed the write operation acted correctly, the cross checksum must be validated by the reader. For the reader to validate the cross checksum, all N data-fragments are required. Given any m data-fragments, the reader can generate the full set of N data-fragments a correct client should have written. The reader can then compute the “correct” cross checksum from the generated data-fragments. If the generated cross checksum does not match the validated cross checksum, then a Byzantine client performed the write operation. The example in Figure 3.3 shows the steps necessary to perform the validation described above.

Only a single-valued write operation can generate a cross checksum that can be validated. Instead of using validated cross checksums, our protocol could use Verifiable Secret Sharing [Chor et al. 1985; Feldman 1987]. Verifiable Secret Sharing enables storagenodes to validate that the client acted correctly on each write request (instead of validating the data-item on each read operation).

· 30 Efficient, scalable consistency for highly fault-tolerant storage Figure 3.3: Verification of a validating cross checksum at read time. This example shows the necessary steps to validate a set of 3 fragments at read time using a validated cross checksum.

First, the full set of N data-fragments must be regenerated. Second, the cross checksum is computed and validated against the cross checksum that were read. Third, the hash of the cross checksum is taken and validated against the hash in the timestamp.

3.2.4 Authentication

Clients and storage-nodes must be able to validate the authenticity of messages. RPC requests and responses are authenticated using HMACs (i.e., clients and storage-nodes have pair-wise shared secrets). Thus, the channels between clients and storage-nodes are authenticated. We assume some infrastructure is in place to distribute shared secrets—our implementation supports an existing Kerberos [Steiner et al. 1988] infrastructure.

3.3 Protocol This section describes our Byzantine fault-tolerant consistency protocol that efficiently supports erasure-coded data-items by taking advantage of versioning storage-nodes. It describes, in detail, the protocol in pseudo-code form.

3.3.1 Overview

–  –  –

the greatest timestamp they host, and then incrementing the greatest response. In order to verify the integrity of the data, a hash performed over the data-fragment is appended to the logical timestamp. Two clients may arrive at the same logical timestamp only if they are both writing the same data concurrently to each other. In this case the write requests generated look identical to each other.

To perform a read operation, clients issue read requests to a subset of storage-nodes.

Once at least a read quorum of storage-nodes reply, the client identifies the candidate— the response with the greatest logical timestamp. The set of read responses that share the timestamp of the candidate comprise the candidate set. The read operation classies the candidate as complete, repairable, or incomplete. If the candidate is classified as complete, the data-fragments, timestamp, and return value are validated. If validation is successful, the value of the candidate is returned and the read operation is complete;

otherwise, the candidate is reclassified as incomplete. If the candidate is classified as repairable, it is repaired by writing data-fragments back to the original set of storage-nodes (note, in [Malkhi and Reiter 1998b], repair, for replicas, is referred to as “write-back”).

Prior to performing repair, data-fragments are validated in the same manner as for a complete candidate. If the candidate is classified as incomplete, the candidate is discarded, previous data-fragment versions are requested, and classification begins anew. Classification is performed according to a set of constraints dependent upon the failure model (see Section 3.4). All candidates fall into one of the three classifications, even those corresponding to concurrent or failed write operations.

3.3.2 Pseudo-code

The pseudo-code for the protocol is shown in Figures 3.5 and 3.6. The symbol LT denotes logical time and LTc denotes the logical time of the candidate. The set {D1,..., DN } denotes the N data-fragments; likewise, {S1,..., SN } denotes the set of N storage-nodes.

In the pseudo-code, the binary operator ‘|’ denotes string concatenation. Simplicity and clarity in the presentation of the pseudo-code was chosen over obvious optimizations that are in the actual implementation.

· 32 Efficient, scalable consistency for highly fault-tolerant storage

–  –  –

Storage-node interface Storage-nodes offer interfaces to write a data-fragment at a specific logical time; to query the greatest logical time of a hosted data-fragment; to read the hosted data-fragment with the greatest logical time; and to read the hosted data-fragment with the greatest logical time at or before some logical time. Each write request a storage-node executes creates a new version of the data-fragment (indexed by its logical timestamp) at the storage-node (i.e., the storage-node performs comprehensive versioning).

All stored data is initialized to ⊥ at time 0, and has a cross checksum of ⊥. The zero time, 0, and the null value, ⊥, are well known values which the clients understand. The storage-node pseudo-code is shown in Figure 3.4. The History which contains the version history for the data-item is kept in stable storage such that it persists during a crash and subsequent recovery. Storage-nodes validate write requests before executing them (to protect against Byzantine clients). This is performed by the function VALIDATE WRITE called by RECEIVE WRITE REQUEST. The value returned by RECEIVE READ LATEST and RECEIVE READ PREVIOUS, Latest, is guaranteed to be unique, since timestamps are unique (i.e., two distinct write operations cannot have the same timestamp).


3.3 Protocol 33

–  –  –

Write operation The WRITE operation, shown in Figure 3.5 consists of determining the greatest logical timestamp, constructing write requests, and issuing the requests to the storage-nodes.

First, a timestamp greater than, or equal to, that of the latest complete write must be determined. Collecting N −t responses, on line 211 of READ TIMESTAMP, ensures that the response set intersects a complete write at a correct storage-node. Since the environment is asynchronous, a client can wait for no more than N − t responses. Fewer than N − t responses are actually required to observe the timestamp of the latest complete write, since a single correct response is sufficient; in fact, this bound is t + b + 1.

Next, the ENCODE function, on line 101 of WRITE, encodes the data-item into N datafragments. The data-fragments are used to construct a cross checksum from the concatenation of the hash of each data-fragment (line 102). The function MAKE TIMESTAMP, called on line 104, generates a logical timestamp to be used for the current write oper· 34 Efficient, scalable consistency for highly fault-tolerant storage ation. This is done by incrementing the high order bits of the greatest observed logical timestamp from the ResponseSet (i.e., LT.TIME) and appending the Verifier. The Verifier is just the hash of the cross checksum.

Finally, write requests are issued to all storage-nodes. Each storage-node is sent a specific data-fragment, the logical timestamp, and the cross checksum. A storage-node validates the cross checksum with the verifier and validates the data-fragment with the cross checksum before executing a write request (i.e., storage-nodes call VALIDATE WRITE listed in their pseudo-code). The write operation returns to the issuing client once N − t WRITE RESPONSE messages are received (line 511 of DO WRITE).

Read operation

The read operation, shown in Figure 3.6, iteratively identifies and classifies candidates, until a repairable or complete candidate is found. Once a repairable or complete candidate is found, the read operation validates its correctness and returns the data. Note that the read operation returns a timestamp, value pair; in practice, a client only makes use of the value returned.

The read operation begins by issuing READ LATEST commands to all storage-nodes (via the DO READ function). Each storage-node responds with the data-fragment, logical timestamp, and cross checksum corresponding to the greatest timestamp it has executed.

The integrity of each response is individually validated through the VALIDATE function, called on line 207 of DO READ. This function checks the cross checksum against the Verifier found in the logical timestamp and the data-fragment against the appropriate hash in the cross checksum.

A second type of validation is performed on read responses (also on line 207). For responses to READ PREV commands, the logical timestamp is checked to ensure it is strictly less than the timestamp specified in the command. This check ensures that improper responses from Byzantine storage-nodes are not included in the response set.

Since, in an asynchronous system, slow storage-nodes cannot be differentiated from crashed storage-nodes, only N − t read responses can be collected (line 211 of DO READ).


3.3 Protocol 35

–  –  –

Since correct storage-nodes perform the same validation before executing write requests, the only responses that can fail the client’s validation are those from Byzantine storagenodes. For every discarded Byzantine storage-node response, an additional response can be awaited.

After sufficient responses have been received, a candidate for classification is chosen.

The function SELECT CS, called on line 102 of READ, determines the candidate timestamp, denoted LTc, which is the greatest timestamp found in the response set. All data-fragments that share LTc are identified and returned as the candidate set. At this point, the candidate set contains a set of validated data-fragments that share a common cross checksum and logical timestamp.

Once a candidate has been chosen, it is classified as either complete, repairable, or incomplete based on the size of the CandidateSet. The rules for classifying a candidate as INCOMPLETE or COMPLETE are given in the following subsection. If the candidate is classi· 36 Efficient, scalable consistency for highly fault-tolerant storage fied as incomplete, a READ PREV message is sent to each storage-node with its timestamp.

Candidate classification begins again with the new response set.

If the candidate is classified as either complete or repairable, the candidate set contains sufficient data-fragments written by the client to decode the original data-item. To validate the observed write’s integrity, the candidate set is used to generate a new set of datafragments (line 105 of READ). A validated cross checksum, CCvalid, is computed from the newly generated data-fragments. The validated cross checksum is compared to the cross checksum of the candidate set (line 107 of READ). If the check fails, the candidate was written by a Byzantine client; the candidate is reclassified as incomplete and the read operation continues. If the check succeeds, the candidate was written by a correct client and the read enters its final phase. Note that this check either succeeds or fails for all correct clients regardless of which storage-nodes are represented within the candidate set.

If necessary, repair is performed: write requests are issued with the generated datafragments, the validated cross checksum, and the logical timestamp (line 109 of READ).

Storage-nodes not currently hosting the write execute the write at the given logical time;

those already hosting the write are safe to ignore it. Finally, the function DECODE, on line 113 of READ, decodes m data-fragments, returning the data-item.

Pages:     | 1 |   ...   | 3 | 4 || 6 | 7 |   ...   | 18 |

Similar works:

«SPRING 2011 UNDERGRADUATE COURSE DESCRIPTIONS GSC 27999-01 (CRN 23724) Gender Studies Gateway Course For all Majors & Minors No Hours/No Credits Co-Requisite Course for Pre-approval Registration All Gender Studies Majors and Minors are pre-approved for this Gateway Course once they have finalized meeting procedures with the Gender Studies Academic Advisor. Every Gender Studies Major and Minor MUST REGISTER FOR THIS COURSE ONCE A SEMESTER in order to obtain pre-approved permission to register...»

«Реpublic of Serbiа OFFICE OF THE WAR CRIMES PROSECUTOR Belgrade, 9 November 2007 DISTRICT COURT IN BELGRADE War Crimes Chamber Belgrade Pursuant to my authority under articles 46 §2 (3) and 266 of the Criminal Procedure Act re articles 3 and 4 §2 of the Law on Organisation and Competence of State Authorities in War Crimes Procedure, I hereby bring the INDICTMENT Against Ilija JURIŠIĆ, from Tuzla, married and father of two majors; trained schoolteacher and philosophy graduate, now retired...»

«Copyrighted Material CHAPTER I Progress and the Enlightenment’s Two Conflicting Ways of Improving the World T hat notions concerning “progress,” “improvement of society,” and what one now-forgotten radicalminded novelist of the 1790s termed the “amelioration of the state of mankind” were central to the Enlightenment is scarcely surprising.1 Four out of six of the Enlighten­ ment’s philosophical founding figures—Descartes, Hobbes, Spinoza, and Bayle—held that most...»

«Rowlands De filosoof en de wolf 05-02-2009 09:08 Pagina 1 de filosoof en de wolf Rowlands De filosoof en de wolf 05-02-2009 09:08 Pagina 2 Rowlands De filosoof en de wolf 05-02-2009 09:08 Pagina 3 De filosoof en de wolf Lessen over liefde, dood en geluk mark rowlands Vertaling Pon Ruiter de bezige bij amsterdam Rowlands De filosoof en de wolf 05-02-2009 09:08 Pagina 4 Copyright © 2008 Mark Rowlands Copyright Nederlandse vertaling © 2009 Pon Ruiter Oorspronkelijke titel The Philosopher and the...»

«Investigating the role of Brettanomyces and Dekkera during winemaking by Adriaan Oelofse Dissertation presented for the the degree of Doctor of Philosophy (Science) at Stellenbosch University Institute for Wine Biotechnology, Faculty of AgriSciences Promoter: Prof. M. du Toit Co-promoters: Prof. I.S. Pretorius Prof. A. Lonvaud-Funel December 2008 Investigating the role of Brettanomyces and Dekkera during winemaking by Adriaan Oelofse Dissertation presented for the the degree of Doctor of...»

«Dominican School of Philosophy and Theology Theology of Sacraments: ST-3067, Fall 2015 DSPT Room 1, Fridays 12:40 – 3:30 p.m. Fr. Robert Christian, OP (DSPT), available by appointment: 510-596-1807; rfchristianop@gmail.com Course Description: This course will introduce students to systematic theological reflection on the sacraments in general and on each of the seven sacraments. The focus of the course will be on the Catholic tradition, but the contributions of other traditions will also be...»

«International Strategic Partnerships page 1 INTERNATIONAL STRATEGIC PARTNERSHIPS by Tony Baker, Managing Partner, Market Access Worldwide Trade is not a zero sum game. Trade breeds more trade. If you win, then your winning helps me win too. This is true for individuals, for companies, and for countries. Partnering with other companies, in other countries, fits well with this philosophy. Especially for small and medium-sized companies, it is very desirable to develop overseas by using...»

«KidZone Day Care Centre PARENT INFORMATION PACKAGE January 2013 DIRECTOR: Deborah Lukings 1175 Hyde Park Road London, Ontario N6H 5K6 4736676 E-mail: kidzonedaycareinc@gmail.com TABLE OF CONTENTS PAGE MESSAGE FROM THE DIRECTOR.. 1 MISSION CORPORATE STRUCTURE PROGRAM PHILOSOPHY PROGRAM DESCRIPTION Ages of Children Spaces Available Teacher/Child Ratios Staffing Requirements Volunteer/Student Policy  7 Days of Operation Professional Development Days. 7...»

«Journal of Reformed Theology 7 (2013) 181-203 brill.com/jrt Simply Impossible: A Case against Divine Simplicity R. T. Mullins University of Notre Dame, South Bend, IN, USA rtmullins@gmail.com Abstract Within contemporary philosophical theology the doctrine of divine simplicity has regained attention.1 The pertinent literature has increased by several new defenses of the doctrine.2 One of the more surprising, and troubling, aspects of the contemporary defenses amongst Christian philosophers and...»

«HIPPOCAMPAL OVERSHADOWING: EXPLORING THE UNDERLYING MECHANISMS TINE LANDEHAGEN GULBRANDSEN Bachelor of Science Kinesiology, University of Lethbridge, 2009 A Thesis Submitted to the School of Graduate Studies Of the University of Lethbridge in Partial Fulfilment of the Requirements of the Degree DOCTOR OF PHILOSOPHY Department of Neuroscience University of Lethbridge LETHBRIDGE, ALBERTA, CANADA © Tine Landehagen Gulbrandsen, 2015 HIPPOCAMPAL OVERSHADOWING: EXPLORING THE UNDERLYING MECHANISMS...»

«1 Caspar Hare June 2009 Forthcoming in Philosophical Perspectives Perfectly Balanced Interests1 One major challenge in moral theory has been to account for some intuitively striking moral differences between decision problems that involve conflicts of interest, and decision problems that do not. These differences come out clearly in rescue cases. Consider: One Person, One Island While idly steaming through an almost-deserted South Sea archipelago, I receive a distress call. Agatha has recently...»

«Bruna Grassi de Miranda DILMA ROUSSEFF da Ditadura Militar ao Facebook A Construção da Imagem do Indivíduo na Política Dissertação de Mestrado em Comunicação e Jornalismo, orientada pela Doutora Isabel Maria Ribeiro Ferin Cunha, apresentada ao Departamento de Filosofia, Comunicação e Informação da Faculdade de Letras da Universidade de Coimbra Faculdade de Letras DILMA ROUSSEFF da Ditadura Militar ao Facebook A Construção da Imagem do Indivíduo na Política Ficha Técnica: Tipo...»

<<  HOME   |    CONTACTS
2017 www.abstract.dislib.info - Abstracts, online materials

Materials of this site are available for review, all rights belong to their respective owners.
If you do not agree with the fact that your material is placed on this site, please, email us, we will within 1-2 business days delete him.