Sunday, July 18, 2010

Why worry about thread safety?

Given the structure of the JVM, local variables, method parameters, and return values are inherently "thread-safe." But instance variables and class variables will only be thread-safe if you design your class appropriately.

-- Bill Venners
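
A minimal sketch of the distinction in the quote above. HitCounter and HitCounterDemo are hypothetical names invented for illustration, not code from the quoted article. The locals in sumTo() live on the calling thread's stack and need no locking, while the hits field is shared and is only safe because every access goes through a synchronized method.

```java
// Hypothetical class for illustration; not taken from the quoted article.
class HitCounter {
    // Instance variable: visible to every thread that holds a reference
    // to this object, so access has to be coordinated.
    private int hits = 0;

    // synchronized makes the read-modify-write of hits atomic.
    synchronized void recordHit() {
        hits++;
    }

    synchronized int getHits() {
        return hits;
    }

    // Parameters and locals (n, total, i) exist per call on the calling
    // thread's stack, so no other thread can observe or change them.
    static int sumTo(int n) {
        int total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }
}

class HitCounterDemo {
    // Starts the given number of threads, each recording hitsPerThread hits,
    // waits for all of them, and returns the final count.
    static int runThreads(HitCounter counter, int threads, int hitsPerThread) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < hitsPerThread; i++) {
                    counter.recordHit();
                }
            });
            workers[t].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.getHits();
    }

    public static void main(String[] args) {
        System.out.println(runThreads(new HitCounter(), 4, 10_000)); // prints 40000
        System.out.println(HitCounter.sumTo(100));                   // prints 5050
    }
}
```

If synchronized is removed from recordHit(), the final count can come out lower than 40000, because hits++ is a separate read, add, and write that two threads can interleave.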

Preventing Memory Locks on a Multi-CPU Machine

Problem
   Multithreaded apps create new objects at the same time
   New objects are always created in the EDEN space
   During object creation, memory is locked
   On a multi CPU machine (threads run concurrently) there can be contention

Solution
    Allow each thread to have a private piece of the EDEN space
Thread Local Allocation Buffer
    -XX:+UseTLAB
    -XX:TLABSize=
    -XX:+ResizeTLAB
    (On by default on multi CPU machines and newer JDK)
Analyse TLAB usage
    -XX:+PrintTLAB
JDK 1.5 and higher (GC ergonomics)
    Dynamic sizing algorithm, tuned to each thread


INSIDE THE JAVA VIRTUAL MACHINE / Filip Hanik
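
To see what the TLAB flags listed above are set to in a running JVM, something like the following sketch may help. It assumes a HotSpot-based JVM that exposes com.sun.management.HotSpotDiagnosticMXBean. TlabFlagReport is a hypothetical class name, and on other JVMs, or for flags a given JDK does not know, the helper just reports "unavailable".

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper for illustration; assumes a HotSpot-based JVM.
class TlabFlagReport {
    // Returns the current value of a HotSpot -XX option as a string,
    // or "unavailable" if the JVM does not expose that option.
    static String flagValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            return bean.getVMOption(name).getValue();
        } catch (RuntimeException e) {
            // Unknown option name, or no HotSpot diagnostic bean on this JVM.
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        for (String name : new String[] {"UseTLAB", "ResizeTLAB", "TLABSize"}) {
            System.out.println(name + " = " + flagValue(name));
        }
    }
}
```

The printed values depend on the JDK version and on the command line used to start the JVM, so no expected output is shown here.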

Sun recommended JVM Settings

GC Settings
   -XX:+UseConcMarkSweepGC
   -XX:+CMSIncrementalMode
   -XX:+CMSIncrementalPacing
   -XX:CMSIncrementalDutyCycleMin=0
   -XX:CMSIncrementalDutyCycle=10
   -XX:+UseParNewGC
   -XX:+CMSPermGenSweepingEnabled
To analyze what is going on
   -XX:+PrintGCDetails
   -XX:+PrintGCTimeStamps
   -XX:+TraceClassUnloading
                                   
Minor Notes
-XX:+UseParallelGC is not the same as -XX:+UseParNewGC
-XX:ParallelGCThreads=
    Use with ParallelGC setting
If you have 4 cpus and 1 JVM
    Set value to 4
If you have 4 cpus and 2 JVM
    Set value to 2
If you have 4 cpus and 6 JVM
    Set value to 2


INSIDE THE JAVA VIRTUAL MACHINE / Filip Hanik

Thread Safety Rules

Safety Rules

Sometimes, a set of data is interdependent. For example, we might have two fields corresponding to a street address and a zip code. Changing an address might involve changing both of these fields. If the zip code is changed without a corresponding change to the street address, the data may be inconsistent or incoherent. Such a set of operations, which must be done as a unit -- i.e., either all of the operations are executed or none are -- in order to ensure consistency of the data, is called a transaction. The property of "doing all or none" is called atomicity. A system in which all transactions are atomic is transaction-safe.
The following rule suffices to ensure that your system is transaction-safe:
    All (potentially changeable) shared data is accessed only through the synchronized methods of a single object; no interdependent piece can be accessed independently.
Note that this means that shared data cannot be returned by these methods for access by other methods. If shared data is to be returned, a (non-shared) copy must be made. Further, if interdependent values are to be returned (i.e., a portion of the shared data is to be used by other methods), all of the relevant values must be returned in a single transaction.
For example, the address and zip code of the previous example should not be returned by two separate method calls if they are to be assumed consistent.
public class AddressData {
    private String streetAddress;
    private String zipCode;

    public AddressData( String streetAddress, String zipCode ) {
        this.setAddress( streetAddress, zipCode );
        // ...
    }

    public synchronized void setAddress( String streetAddress,
                                         String zipCode ) {
        // validity checks
        // ...
        // set fields
        this.streetAddress = streetAddress;
        this.zipCode = zipCode;
    }

    public synchronized String getStreetAddress() { // problematic!
        return this.streetAddress;
    }

    public synchronized String getZipCode() { // problematic!
        return this.zipCode;
    }
}
If this class definition were used, e.g. for
printMailingLabel( address.getStreetAddress(),
                   address.getZipCode() );
it would in principle be possible to get an inconsistent address. For example, another thread could call address.setAddress() between the calls to address.getStreetAddress() and address.getZipCode(). In this case, getStreetAddress() would return the old street address, while getZipCode() would return the new zip code.
Instead, getStreetAddress() and getZipCode() should be replaced by a single synchronized method which returns a copy of the fields of the AddressData object:
public synchronized SimpleAddressData getAddress() {
    return new SimpleAddressData( this.streetAddress,
                                  this.zipCode );
}
The SimpleAddressData class can contain just public streetAddress and zipCode fields, without accessors. It is being used solely to return the two objects at the same time.

Introduction to Interactive Programming by Lynn Andrea Stein
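
Putting the pieces of the excerpt together, a runnable version might look like the sketch below. The final modifier on the SimpleAddressData fields, the null check in setAddress(), and the MailingLabelDemo class with its formatLabel() helper are assumptions added for illustration and are not from the book.

```java
// Plain holder with public fields and no accessors, as the text describes.
// Marking the fields final is an added assumption so a snapshot cannot change.
class SimpleAddressData {
    public final String streetAddress;
    public final String zipCode;

    SimpleAddressData(String streetAddress, String zipCode) {
        this.streetAddress = streetAddress;
        this.zipCode = zipCode;
    }
}

class AddressData {
    private String streetAddress;
    private String zipCode;

    AddressData(String streetAddress, String zipCode) {
        setAddress(streetAddress, zipCode);
    }

    // Both fields change under one lock, so no reader sees a half-updated address.
    public synchronized void setAddress(String streetAddress, String zipCode) {
        // Placeholder validity check (an assumption; the original elides these).
        if (streetAddress == null || zipCode == null) {
            throw new IllegalArgumentException("address parts must not be null");
        }
        this.streetAddress = streetAddress;
        this.zipCode = zipCode;
    }

    // Returns both interdependent values in a single transaction, as a copy.
    public synchronized SimpleAddressData getAddress() {
        return new SimpleAddressData(this.streetAddress, this.zipCode);
    }
}

// Hypothetical caller standing in for printMailingLabel.
class MailingLabelDemo {
    static String formatLabel(AddressData address) {
        SimpleAddressData snapshot = address.getAddress(); // one consistent read
        return snapshot.streetAddress + ", " + snapshot.zipCode;
    }

    public static void main(String[] args) {
        AddressData address = new AddressData("1 Main St", "02139");
        System.out.println(formatLabel(address)); // prints 1 Main St, 02139
    }
}
```

Because String is immutable, copying the two references into the holder is enough here. If the shared fields were mutable objects, the copy would have to be a deep one.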

Reentrant Synchronization

Recall that a thread cannot acquire a lock owned by another thread. But a thread can acquire a lock that it already owns. Allowing a thread to acquire the same lock more than once enables reentrant synchronization. This describes a situation where synchronized code, directly or indirectly, invokes a method that also contains synchronized code, and both sets of code use the same lock. Without reentrant synchronization, synchronized code would have to take many additional precautions to avoid having a thread cause itself to block.

Intrinsic Locks and Synchronization
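
A small sketch of what reentrancy means in practice. ReentrantAccount and ReentrantDemo are hypothetical names, not from the tutorial. depositTwice() already holds the object's intrinsic lock when it calls deposit(), which is synchronized on the same object. The thread is allowed to acquire the lock again, so the nested call proceeds instead of blocking on itself.

```java
// Hypothetical class for illustration; not taken from the quoted tutorial.
class ReentrantAccount {
    private int balance = 0;

    synchronized void deposit(int amount) {
        balance += amount;
    }

    // Holds the lock on this object, then calls deposit(), which needs the
    // same lock. The calling thread already owns it, so it re-enters.
    synchronized void depositTwice(int amount) {
        deposit(amount);
        deposit(amount);
    }

    synchronized int getBalance() {
        return balance;
    }
}

class ReentrantDemo {
    public static void main(String[] args) {
        ReentrantAccount account = new ReentrantAccount();
        account.depositTwice(10);
        System.out.println(account.getBalance()); // prints 20
    }
}
```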