FreshPatents.com Logo FreshPatents.com icons
Monitor Keywords Patent Organizer File a Provisional Patent Browse Inventors Browse Industry Browse Agents

3

views for this patent on FreshPatents.com
updated 05/17/13


Inventor Store

    Free Services  

  • MONITOR KEYWORDS
  • Enter keywords & we'll notify you when a new patent matches your request (weekly update).

  • ORGANIZER
  • Save & organize patents so you can view them later.

  • RSS rss
  • Create custom RSS feeds. Track keywords without receiving email.

  • ARCHIVE
  • View the last few months of your Keyword emails.

  • COMPANY PATENTS
  • Patents sorted by company.

Enhanced reliability using deterministic multiprocessing-based synchronized replication   

pdficondownload pdfimage preview


Abstract: A hardware and/or software facility for executing a multithreaded program is described. The facility causes each of a plurality of machines to execute the multithreaded program deterministically, such that the deterministic execution of the multithreaded program is replaced across the plurality of machines. The facility detects a problem in the execution of the multithreaded Program by one of the plurality of machines. In response, the facility adjusts the execution of the multithreaded program by at least one of the machines of the plurality. ...


Inventors: Luis Ceze, Peter J. Godman, Mark H. Oskin
USPTO Applicaton #: #20110283262 - Class: 717128 (USPTO) - 11/17/11 - Class 717 
Related Terms: Hardware   Program   Software   
view organizer monitor keywords


The Patent Description & Claims data below is from USPTO Patent Application 20110283262, Enhanced reliability using deterministic multiprocessing-based synchronized replication.

pdficondownload pdf

BACKGROUND

Multiprocessing is a mode of operation in which two or more processing units each carry out one or more processes (programs or sets of instructions) in tandem. The objective of a multiprocessing system is to increase processing speed. Typically, this is accomplished by each processing unit operating on a different set of instructions or on different threads of the same process. A process may execute one or more threads. Each thread has it own processor context, including its own program context. Traditionally, for an application to take advantage of the benefits of multiprocessing, a software developer must write the application to be multithreaded. As used herein, a multithreaded application refers to a program capable of running two or more threads simultaneously.

On a multiprocessor or multi-core system (collectively referred to herein as a “multiprocessing system”), two or more of the threads of a multithreaded application may be able to execute at the same time, with each processor or core running a particular thread. It is common for threads of a multithreaded application to share resources during concurrent execution, such as, for example, memory. As used herein, concurrent execution refers to the simultaneous execution of two or more threads of a multithreaded application. A consequence of concurrent execution is that two or more threads of a multithreaded application may read and/or update the same shared resource. For example, one thread may modify a value of a shared memory location while another thread executes a sequence of operations that depend on the value stored in the shared memory location.

Under the traditional software development model, software developers spend a substantial amount of time identifying and attempting to correctly synchronize parallel threads within their multithreaded applications. For example, a developer may explicitly use locks, semaphores, barriers, or other synchronization mechanisms to control access to a shared resource. When a thread accesses the shared resource, the synchronization mechanism prevents other threads from accessing the resource by suspending those threads until the resource becomes available. Software developers who explicitly implement synchronization mechanisms also typically spend a substantial amount of time debugging their synchronization code. However, software defects (referred to as “bugs”) resulting from synchronization errors typically manifest themselves transiently (i.e., a bug may appear only on a particular sequence or sequences of interleaved thread operations). As a result, defective software might execute correctly hundreds of times before a subtle synchronization bug appears.

It is difficult to develop software for multiprocessing systems because of the nondeterministic behavior created by the various interleaving of threads on such systems. An interleaving refers to an order of thread operations that may include interaction between threads. The number of possible interleavings between threads significantly increases as the number of threads increase. Consequently, multithreaded applications present additional challenges in terms of error detection and modeling program behavior. For example, given the same input to a multithreaded application, a multiprocessing system will interleave thread operations nondeterministically, thereby producing different output each time the multithreaded application is executed. FIG. 1 is a high-level diagram showing an example of two possible thread interleavings in a multithreaded application executed on a multiprocessing system. As illustrated, the application includes at least two threads: thread 1 and thread 2. When the application is invoked, at some point in time, thread 1 executes an operation settings the value of variable A to one (A=1) followed by an operation settings the value of variable B to the value of variable A (B=A), and thread 2 executes an operation settings the value of variable B to zero (B=0) followed by an operation settings the value of variable A to the value of variable B (A=B). As illustrated, the operations of thread 1 and thread 2 are interleaved nondeterministically, thereby producing different output each time the application is invoked. That is, during the first illustrated invocation, the interleaving of operations resulted in variables A and B each being set to zero, while during the second illustrated invocation, the interleaving of operations resulted in variables A and B each being set to one.

Non-determinism in multithreaded execution may arise from small changes in the execution environment, such as, for example, other processes executing simultaneously, differences in the operating system resource allocation, the state of caches, translation lookaside buffers (“TLBs”), buses, interrupts, and other microarchitectural structures. As a result, developing a multithreaded application is significantly more difficult than developing a single-threaded application.

Conventionally, efforts in addressing this problem have focused on deterministically replaying multithreaded execution based on a previously generated log file. However, deterministic replay systems suffer substantial performance degradation as a result of the overhead associated with maintaining the replay log file. Moreover, with deterministic replay, a software developer does not have control over how the interleaving of threads is performed. As a result, synchronization bugs resulting from particular interleavings of operations may not be identified (and, more importantly, corrected) before the software is deployed to a customer. Non-determinism further complicates the software development process in that non-determinism makes it hard to assess test coverage. Good coverage requires both a wide range of program inputs and a wide range of possible thread interleavings.

BRIEF DESCRIPTION OF THE DRAWINGS

One or more embodiments of the facility are illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like references indicate similar elements and in which:

FIG. 1 is a high-level diagram showing an example of two possible thread interleavings in a multithreaded program.

FIG. 2 is a flow diagram of a deterministic serialization process performed by the facility in one or more embodiments.

FIG. 3 is a flow diagram of a deterministic selective serialization process performed by the facility in one or more embodiments.

FIG. 4 is a high-level block diagram showing an example architecture of a computing system on which the facility executes in one or more embodiments.

FIG. 5 is a high-level block diagram showing various functional elements of a deterministic multiprocessing layer in one or more embodiments.

FIG. 6 is a high-level block diagram showing a data structure used by the facility to make multiprocessor code deterministic in one or more embodiments.

FIG. 7 is a high-level diagram showing an example of creating and deterministically executing threads in one or more embodiments.

FIG. 8 is a high-level diagram showing an example of utilizing a transactional memory system to make multiprocessor code deterministic in one or more embodiments.

FIG. 9 is a flow diagram showing a process performed by the facility to augment an application in one or more embodiments.

FIG. 10 is a flow diagram showing a process performed by the facility to parse a block in one or more embodiments.

FIG. 11 is an example of a control flow graph of an augmented function of a multithread application in one or more embodiments.

FIG. 12 is a flow diagram showing a deterministic multiprocessing initialization function in one or more embodiments.

FIG. 13 is a flow diagram showing a deterministic multiprocessing commit function in one or more embodiments.

FIG. 14 is a network diagram showing a typical network environment in which the facility performs deterministic multiprocessing-based synchronized replication.

FIG. 15 is a flow diagram showing steps typically performed by the facility as part of deterministic multiprocessing-based synchronized replication.

DETAILED DESCRIPTION

Conventional systems, such as deterministic replay systems, do not adequately resolve the problems associated with the nondeterministic behavior in the development of multithreaded applications. Additionally, no existing systems reduce or attempt to resolve the problems associated with nondeterministic behavior in the deployment of multithreaded applications. Accordingly, a hardware and/or software facility for deterministic multiprocessing of multithreaded applications (“the facility”) has been developed. As used herein, the term deterministic multiprocessing refers to a technique by which given the same input to a multithreaded application, the same output is produced by the multithreaded application. The facility simplifies the process of developing multithreaded applications, for example, by freeing developers from the burden of synchronizing thread accesses to shared resources. Additionally, the facility improves the reliability of such multithreaded applications when they are deployed, for example, by enabling developers to reproduce bugs and rigorously test various thread interleavings.

In some embodiments, the facility divides execution of a multithreaded application into sets of a finite, deterministic number of operations (each set is referred to herein as a “quantum”). When identifying quanta, the facility may distinguish between operations that can be performed concurrently, such as communication-free thread operations, and operations that are to be performed in a deterministic order, such as inter-thread communications, system calls, and so on. Each quantum identified by the facility is then performed in a deterministic order. By controlling the order in which quanta are executed by threads of a multithreaded application, the facility enables the multithreaded application to behave deterministically. That is, given the same input, threads of the multithreaded application interleave their operations deterministically, thereby providing the same output.

In some embodiments, the facility serializes execution of a multithreaded application. That is, the facility may control the global interleaving of all thread operations. For example, this may be accomplished by establishing a memory access token that is passed in a deterministic order between threads. A thread may be referred to as “holding” the token when the value of the token matches the identifier of that thread. When the value of the token does not match the identifier of a thread, its execution is suspended until the value of the token matches the identifier of the thread. When the value of the token matches the identifier of a thread, the thread performs a finite, deterministic number of operations (i.e., a quantum) before the token is passed to the next thread. The token may be passed to the next thread, for example, by advancing the value of the token to correspond to the identifier of the next thread in the deterministic order.

In some embodiments, the facility uses deterministic multiprocessing to synchronize the execution of a multi-threaded program on two or more separate peer machines. As is discussed in detail below, doing so provides an extra measure of reliability and/or performance in many cases relative to executing the same multi-threaded program on a single system.

FIG. 2 is a flow diagram of a deterministic serialization process 200 performed by the facility in one or more embodiments. For example, the deterministic serialization process 200 may be performed while a multithreaded application executes on a multiprocessing system. While the multithreaded application executes, the facility loops through steps 205-215 for each thread. In step 205, if the facility determines that the value of the token matches the identifier of a thread, then the facility continues to step 210, else the facility loops back to step 205. That is, the facility suspends execution of the thread until the value of the token matches the identifier of that thread. In step 210, the facility allows the thread whose identifier matches the token to execute a finite, deterministic number of operations (i.e., a quantum), then the facility continues to step 215. In step 215, the facility sets the value of the token to equal the identifier of the next thread in the deterministic order, then the facility continues to step 205. It is noted that the facility may continue looping through the serialization process 200 until the application exits.

Those skilled in the art will appreciate that the steps shown in FIG. 2 and in each of the following flow diagrams may be altered in a variety of ways. For example, the order of certain steps may be rearranged; certain sub-steps may be performed in parallel; certain shown steps may be omitted; or other steps may be included; etc.

In some embodiments, the facility selectively serializes execution of a multithreaded application. That is, the facility may control the interleaving of certain thread operations (referred to herein as “controlled operations”), while other thread operations are performed concurrently. For example, the facility may control the interleaving of operations that involve communication between two or more threads. Inter-thread communication occurs when a thread reads data that is privately held by another thread, or when a thread writes to shared data, thereby privatizing it. In some embodiments, when a thread attempts to read data that is regarded as privately held by another thread, the thread suspends its execution until the value of the token matches its identifier. Similarly, in some embodiments, when a thread attempts to write to data that is shared or regarded as privately held by another thread, it suspends its execution until the value of the token matches its identifier and all other threads reach a deterministic point in their execution (e.g., complete execution of a quantum). As a result, the facility ensures that all threads observe the change in state of the data (from shared to privately held by the thread) at a deterministic point in their execution.

In some embodiments, to detect inter-thread communication, the facility maintains a shared-memory data structure that includes sharing information for each memory location in the address space of the multithreaded application. For example, such information may indicate that a memory location is shared, private, etc. It is noted that sharing may occur at different levels, such as the operation-level, instruction-level, page-level, and so on. In some embodiments, a thread may access its own privately held data or read shared data without holding the token. However, to write to shared data or read data that is held as private by another thread, the thread waits until it holds the token and all other threads are blocked (i.e., are also waiting for the token). When a thread reads a memory location that is regarded as private, the shared-memory data structure is updated to indicate that the read memory location is to be regarded as shared. When a thread writes to a memory location, the shared-memory data structure is updated to indicate that the memory location is to be regarded as privately held by that thread. Similarly, when a thread reads a memory location that has not been previously accessed by another thread, the shared-memory data structure is updated to indicate that the memory location is to be regarded as privately held by that thread.

FIG. 3 is a flow diagram of a deterministic selective serialization process 300 performed by the facility in one or more embodiments. For example, the selective serialization process 300 may be performed when a thread or processor attempts to perform a controlled operation, such as memory operations, system calls, etc. In step 305, if the facility determines that the operation is a system call (e.g., an I/O operation, etc.), then facility continues to step 325, else the facility continues to step 310. In step 310, if the facility determines that the operation accesses memory that is not privately held by the thread, then the facility continues to step 315, else the facility continues to step 355. In step 315, if the facility determines that the operation accesses shared memory, then the facility continues to step 320, else the facility continues to step 325. In step 320, if the facility determines that the operation is a store operation, then the facility continues to step 325, else the facility continues to step 355. In step 325, if the facility determines that the value of the token matches the identifier of the thread, then the facility continues to step 330, else the facility loops back to step 325. That is, the facility suspends execution of the selected thread until the value of the token matches the identifier of the selected thread. In step 330, if the facility determines that all threads of the multithreaded application are suspended (or blocked), then the facility continues to step 335, else the facility loops back to step 330. By waiting for all threads to be suspended before the thread holding the token may execute, the facility ensures that, at a deterministic point in their execution, all threads observe any state change that results from execution of the operation. In step 335, if the facility determines that the operation is a system call, then the facility continues to step 355, else the facility continues to step 340. In step 340, if the facility determines that the operation is a store operation, then the facility continues to step 345, else the facility continues to step 350. In step 345, the facility updates the shared memory data structure to indicate that the memory location affected by the operation is to be regarded as privately held by the thread, then the facility continues to step 355. In step 350, the facility the updates the shared memory data structure to indicate that the memory location accessed by the operation is to be regarded as shared, then the facility continues to step 355. In step 355, the facility allows the thread to proceed with the operation, then the facility returns.

In some embodiments, the facility operates together with a transactional memory system to serialize or selectively serialize execution of a multithreaded application. For example, the facility may use the transactional memory system to detect inter-thread communication that would violate the deterministic ordering of memory operations. That is, the transactional memory system may be used instead of, or in addition to, the shared-memory data structure. It is noted that the transactional memory system may be a hardware transactional memory (HTM) system, a software transactional memory (STM) system, or a hybrid hardware-software transactional memory system (HS-TM). When operating together with a transactional memory system, the facility encapsulates each quantum executed by a thread within a transaction. By encapsulating each quantum within a transaction, the threads appear to execute atomically and in isolation. As a result, transactions may be executed concurrently, and then committed according to a deterministic order. A transaction is typically not committed if the transaction includes an inter-thread communication that would violate the deterministic ordering (referred to herein as a “conflict”). When a conflict exists, the transaction is aborted and restarted.

In some embodiments, the facility includes a quantum builder component and a deterministic multiprocessing (“DMP”) component. The quantum builder component is used to divide execution of a multithreaded application into quanta (i.e., sets of a finite, deterministic number of operations). In some embodiments, the quantum builder component distinguishes between operations that may be performed concurrently, such as communication-free thread operations, and operations that are to be performed in a deterministic order (e.g., controlled operations), such as inter-thread communications, system calls, and so on. The DMP component ensures that each quantum is performed according to a deterministic order. In some embodiments, when the token is advanced to a thread that is blocked (e.g. waiting for a lock held by another thread), the facility passes the token to the next thread, thereby avoiding livelock resulting from blocking synchronization primitives that a developer included within the multithreaded code. For example, if thread 1 holds a lock that thread 2 requires to proceed at the time that the token is passed to thread 2, then the token is passed to the next thread (e.g., thread 3), and so on. Because the token is passed in a deterministic order, and because each thread executes a quantum (or passes the token), the quanta are interleaved deterministically, thereby producing the same output each time the code is executed with the same input and preventing livelock.

The quantum builder component and DMP component may be implemented in hardware, software, or a combination of hardware and software. For example, the quantum builder component may be implemented by counting instructions as they retire and placing a quantum boundary when the predetermined quantum size is reached. To serialize execution, the DMP component may be implemented as a token that is passed between processors at a quantum boundary in a deterministic order. As another example, to selectively serialize execution, the quantum builder component may monitor memory accesses to determine whether an access involves inter-thread communication (e.g., access to shared data, etc.). For example, in one embodiment, the quantum builder uses a cache line state maintained by a MESI (“Modify, Exclusive Share, Invalidate”) cache coherence protocol to implement a sharing table. A cache line in an exclusive or modified state is regarded as privately held by a processor, and can be freely read or written by its owner thread without holding the token. Similarly, a cache line in a shared state may be freely read by its owner thread without holding the token. The processor may write to a cache line in a shared state when all threads are at a deterministic point in their execution (e.g., when all processors are blocked) and when the processor acquires the deterministic token. In such embodiments, each processor broadcasts when it is blocked and/or unblocked. It is noted that the state of entries in the sharing table corresponding to lines that are not cached by any processor may be kept in memory and managed by a memory controller, and that the state of such entries may be transferred when cache misses are serviced. In some embodiments, the quantum builder and DMP components operate together with a transactional memory (TM) system, such as a hardware transactional memory (HTM) system, to specify a specific transaction commit order—the deterministic commit order of quanta encapsulated inside transactions. In such embodiments, the TM system commits a transaction when the processor holds the token and, after the transaction is committed, the token is passed to the next processor in the deterministic order. It in noted that, in some embodiments, the hardware may support multiple tokens, thereby allowing multiple deterministic processes to execute at the same time, each process specifying a token that is passed between processors.

In some embodiments, the facility may be implemented using a compiler or a binary rewriting infrastructure. For example, the quantum builder component may use a compiler to build quanta by inserting synchronization code within multithreaded application code to track operations in the control-flow-graph (“CFG”) generated by the complier. It is noted that quanta need not be of uniform size as long as the size is deterministic. Such synchronization code may be inserted, for example, at the beginning and end of function calls, and at the tail end of CFG back edges. The inserted code tracks quantum size and when the target size has been reached, it calls back to the DMP component. For example, to serialize execution such embodiments, the DMP component may implement the token as a queuing lock that is passed between threads in a deterministic order. As another example, to selectively serialize execution, the quantum builder component may use the compiler to insert code such that load and store operations result in a callback to the DMP component. In some embodiments, the DMP component operates together with a transactional memory system, such as software transactional memory (STM) system, and/or implements a sharing table.

In some embodiments, to control the interleaving of operations performed by threads, the facility may augment source code, an intermediate representation of source code, or an executable. For example, the facility may augment multithreaded application code by inserting one or more deterministic multiprocessing (“DMP”) functions or data structures into the application code. As another example, the inserted DMP functions may call back to a runtime system, such as that provided by the DMP component, which maintains one or more data structures (e.g., a shared memory data structure). When the augmented code is executed by a multiprocessing system, the inserted DMP functions and data structures are then used to control the order in which operations are performed, such as memory and I/O operations, system calls, and so on. By controlling the order in which threads perform such operations, the facility enables the multithreaded application to behave deterministically (referred to herein as an “augmented application”). That is, given the same input, threads of an augmented application may interleave some or all of their operations deterministically, thereby providing the same output. Those skilled in the art will appreciate that the facility may be extended to control other thread operations.

In some embodiments, the facility is implemented as a compiler module that augments multithreaded application code by inserting functions provided by a DMP library, which enforce deterministic execution of quanta performed by threads of the augmented application. In some embodiments, after the code is augmented, a compiler re-optimizes the code, such as, for example, inlining all calls to the DMP library. Those skilled in the art will appreciate that the compiler may perform other optimizations to the augmented code not specifically described herein.

In some embodiments, the facility includes a DMP data structure, referred to herein as a “thread data structure,” the details of which are discussed in greater detail below in connection with FIG. 6. However, it is noted that any number of DMP data structures may be included. It is further noted that the thread data structure may represent multiple DMP data structures. In some embodiments, the thread data structure stores a thread identifier (“ID”) corresponding to each thread that is created by the augmented application during execution. For example, the thread data structure may include an array, linked list, a queue or other data structure of thread IDs (referred to herein as a “thread container”).

In some embodiments, the thread data structure includes a token that may be used to control the order of quantum execution. For example, in some embodiments, prior to executing a quantum, a thread determines whether the current value of the token matches the ID of the thread. When the ID of a thread matches current value of the token, the thread may execute the quantum. Otherwise, the thread waits to execute the quantum until the current value of the token matches its identifier.

In some embodiments, the order in which threads are created corresponds to the order in which the threads are deterministically executed. For example, as each thread is created, the thread\'s corresponding thread ID may be sequentially stored in the thread container (e.g., a thread ID of 1 for the first-created thread; a thread ID of 2 for the second-created thread; etc.). As operations are executed, the threads may invoke certain DMP functions that operate to advance the value of the token by sequentially looping through the thread IDs stored in the thread container based on the sequence in which the thread IDs were stored (beginning with the first thread ID). It is noted that, when a thread exits, the thread\'s corresponding ID is typically removed from the thread container.

In some embodiments, the thread data structure stores a value corresponding to a finite, deterministic number (i.e., quantum) of controlled operations or blocks that may be executed by a thread whose thread ID matches the current value of the token before the token is advanced. This number of controlled operations or blocks is referred to herein as the “commit block size.” The commit block size may range from one to N controlled operations or blocks. Those skilled in the art will appreciate that there are performance tradeoffs associated both large and small commit block sizes. For example, when the commit block size is too small, the performance of the augmented application will suffer as a result of the overhead associated with context switches between threads. As another example, when the commit block size is too large, the performance of the augmented application will suffer because many or all threads may be forced to wait for the thread whose thread ID matches the token (and every thread whose thread ID precedes its thread ID) to exit or actually execute the number of controlled operations specified by commit block size. In at least one embodiment, the commit block size is equal to one thousand (10,000).

In some embodiment, the commit block size is configurable. For example, the commit block size may be configured by a software developer to programmatically manipulate and test the various thread interleavings of an augmented application. As another example, the commit block size may be automatically configured based on the maximum number of threads that may be created by the augmented application and/or the number of processor or cores of the multiprocessing system on which the augmented application executes. Those skilled in the art will appreciate that a variety of techniques may be used to count the number of controlled operations performed by a thread. For example, in some embodiments, the thread data structure includes a value corresponding to the number of controlled operations that have been performed by a thread whose thread ID matches the current token ID. Each time the thread performs a controlled operation, the number of controlled operations in incremented, and the compared to the commit block size. If the number of controlled operation equals the commit block size, then the token is advanced to the next thread ID, and the number of controlled operations is reset to zero.

By augmenting a multithreaded application to control the ordering of certain thread operations (such as, e.g., controlled thread operations), the development process is substantially simplified. For example, the facility can be used by a software developer to directly manipulate thread interleavings of a multithreaded application, thereby allowing for substantially better test coverage of the multithreaded application. A developer may manipulate the interleavings of controlled thread operations, for example, by modifying the commit block size. As another example, a developer may manipulate the interleavings of controlled thread operations by modifying the ordering of thread IDs stored in the thread container. In some embodiments, the facility enables a software developer to mark code as being inserted for augmentation purposes, such that the inserted code will not affect quantum building.

In some embodiments, a multithreaded application is deployed in its augmented form. By deploying a multithreaded application in its augmented form, the reliability of the application is substantially increased because, for example, the execution of the augmented application “in the field” (i.e., by a customer) will more closely resemble in-house testing of the application. Additionally, if the augmented application were to crash or experience a synchronization bug, a software developer may quickly resolve the defect by collecting meaningful crash information from the customer. That is, when deployed in its augmented form, the actions performed by the customer that preceded the crash are meaningful because they allow the software developer to easily reproduce the crash. As a result, the software developer can resolve the defect substantially faster than if the crash or synchronization bug were associated with an unknown interleaving of threads. Accordingly, the facility improves both the development and deployment of multithreaded applications.

In some embodiments, the computing system on which a multithreaded application is developed, and/or on which the multithreaded application is deployed, includes a transactional memory (“TM”) system for controlling access to shared memory. The transactional memory system may be a hardware transactional memory (“HTM”), a software transactional memory (“STM”) system, or a hybrid hardware-software (HS-TM) system. Both TM systems are known in the art. A STM system provides a programming abstraction through which a thread atomically performs a sequence of operations, some of which may involve one or more shared resources (e.g., memory), without locking or waiting for a shared resource to be freed.

Conventional TM systems are “optimistic” in the sense that a thread completes modifications to shared memory without regard for what other threads might be doing. This is accomplished, for example, by maintaining a log for each thread of a multithreaded application and, for each transaction, each thread sequentially record its operations in its corresponding log. For example, a log may include a number of memory locations and values that a thread reads and/or writes during a transaction. At the end of the transaction, if no other thread has concurrently accessed the same shared memory locations, the thread actually performs the sequence of operations (this is commonly referred to as a “commit”). However, if another thread has concurrently accessed one or more of the same memory locations, then the transaction is aborted and restarted. That is, in conventional TM systems, transactions execute concurrently so long as a shared resource is not accessed by more than one thread during the same transaction.

There are a number of disadvantages associated with conventional TM systems. For example, although conventional TM systems somewhat simplify development by allowing developers to declare certain operations or certain sequences of operations as atomic, conventional TM systems do not provide deterministic multiprocessing of multithreaded applications. Additionally, conventional TM systems do not allow software developers to specify or manipulate the interleavings of threads in a multithreaded application. As a result, conventional TM systems also suffer from latent synchronization bugs. Also, compared with HTM systems, STM systems suffer a performance hit as a result of the overhead associated with maintaining a log and the time spent committing transactions.

In some embodiments, the facility controls the order of execution of certain thread operations of a multithreaded application that uses a transactional memory system to control access to shared resources, such as a HTM, STM, or HS-TM system. That is, the facility may control the order in which threads begin and/or commit transactions in a transactional memory system. In some embodiments, the facility augments an application programming interface (“API”) provided by a STM system. As one example, the facility may augment the functions of the STM API provided in Table 1 below. It will be appreciated by those skilled in the art that, although some embodiments of the facility are described with reference to the STM API provided in Table 1, the facility may operate on various transactional memory systems.

TABLE 1 void begins a new transaction performed by a STMBeginTransaction( ): thread value STMRead(*addr): records information in a log about the operation type, address, and/or current value of the shared memory location void records information in a log about the STMWrite(*addr, value): operation type, address, and/or current value of the shared memory location as a result of the operation bool determines, based on a thread\'s log, whether STMValidTransaction ( ): another thread has concurrently accessed one or more of the same shared resources void aborts a transaction performed by a thread STMAbortTransaction( ): bool commits a transaction performed by a thread STMCommitTransaction( ):

In some embodiments, a software developer manually specifies atomic blocks within a multithreaded application. For example, a software developer may include the following atomic block:

  atomic {  a = b + c; }

Following compilation, the above example atomic block would be replaced by the following pseudo code:

  STM_Begin_Transaction( ); try {   var_1 = STMRead(*b);   var_2 = STMRead(*c);   STMWrite(*a, var_1 + var_2);   bool transaction_valid = STMValidTransaction( );   if (!STMValidTransaction( )) {    STMAbortTransaction( );   }   else if (STMValidTransaction( )) {    bool transaction_commited = STMCommitTransaction( );    if (!transaction_commited) {     throw transaction_failed_to_commit;    }   }

Download full PDF for full patent description/claims.




You can also Monitor Keywords and Search for tracking patents relating to this Enhanced reliability using deterministic multiprocessing-based synchronized replication patent application.
###
monitor keywords

Other recent patent applications listed under the agent :



Keyword Monitor How KEYWORD MONITOR works... a FREE service from FreshPatents
1. Sign up (takes 30 seconds). 2. Fill in the keywords to be monitored.
3. Each week you receive an email with patent applications related to your keywords.  
Start now! - Receive info on patent apps like Enhanced reliability using deterministic multiprocessing-based synchronized replication or other areas of interest.
###


Previous Patent Application:
Quality assurance tools for use with source code and a semantic model
Next Patent Application:
Conditional dynamic instrumentation of software in a specified transaction context
Industry Class:
Data processing: software development, installation, and management

###

FreshPatents.com Support - Terms & Conditions
Thank you for viewing the Enhanced reliability using deterministic multiprocessing-based synchronized replication patent info.
- - - AAPL - Apple, BA - Boeing, GOOG - Google, IBM, JBL - Jabil, KO - Coca Cola, MOT - Motorla

Results in 1.10766 seconds


Other interesting Freshpatents.com categories:
Novartis , Pfizer , Philips , Procter & Gamble , g2