Changes between Version 7 and Version 8 of AmbraCache

Author:
pradeep (IP: 24.5.254.109)
Timestamp:
04/13/09 14:51:44 (1 year ago)
Comment:

Update for 0.9.3

  • AmbraCache

    v7 v8  
    11== Ambra Cache Concepts == 
    22 
    3  * populate() bypasses txn local {key, value} maps and writes directly 
    4  * read() does not populate the txn local {key, value} map (READ_COMMITTED txn isolation) 
    5  * put(), remove(), removeAll() update the txn local {key, value} map only 
    6  * read() looks in the txn local {key, value} map first. A key removed by remove() or removeAll() in the local txn reads as 'null' 
    7  * commit() reflects the txn local {key, value} onto the underlying cache (Ehcache) 
    8  * beforeCommit() acquires a lock and afterCommit() updates cache and releases lock. 
    9     * Ensures 'serializable' txn isolation for cache updates.  
    10     * committed cached data is immediately visible to all txns locally (READ_COMMITTED txn isolation) 
    11     * ehcache-peers will see stale data till peer synchs are complete 
     3See [http://www.topazproject.org/trac/wiki/AmbraCache?version=7] for 0.9.2 and prior versions 
    124 
    13 === What is the difference between put() and populate (the get() with lookup() callback) === 
     5See r7599 for the changes from 0.9.2 
    146 
    15 populate is essentially reflecting a value in the database, whereas 
    16 put() is reflecting an update to the database. It is important to 
    17 follow this distinction to ensure entries are not stale in the cache. 
    18 put() operations are accumulated and applied at commit() time. Whereas 
    19 populate is applied immediately. Therefore it is '''important''' to ensure 
    20 that populate is not used for data that is the result of updates  
    21 that are likely to be rolled-back. 
     7The goal of the Ambra cache management abstraction is to ensure that updates to the 
     8cache from within write txns are committed to the backing cache (e.g. EhCache) 
     9only at transaction commit. Until then, these changes are held in a transaction-local 
     10{key, value} map.  
    2211 
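The buffering described above can be sketched roughly as follows. This is a minimal illustration, not the Ambra implementation: the class and method names (`TxnBufferedCache`, `Txn`, `committedValue`) are hypothetical, a `ConcurrentHashMap` stands in for the backing cache, and having `get()` consult the transaction-local map first is an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of a cache whose writes are buffered per transaction. */
class TxnBufferedCache {
    // Stand-in for the backing cache (e.g. Ehcache); shared by all transactions.
    private final Map<String, Object> backing = new ConcurrentHashMap<>();

    // Marker recorded in the local map when a key is removed inside a transaction.
    private static final Object REMOVED = new Object();

    /** One instance per write transaction; holds the txn-local {key, value} map. */
    final class Txn {
        private final Map<String, Object> local = new HashMap<>();

        void put(String key, Object value) {
            local.put(key, Objects.requireNonNull(value));
        }

        void remove(String key) {
            local.put(key, REMOVED);
        }

        // Assumption: reads see this transaction's own pending changes first.
        Object get(String key) {
            Object v = local.get(key);
            if (v == REMOVED) {
                return null;
            }
            return v != null ? v : backing.get(key);
        }

        // Apply the buffered changes to the backing cache at commit.
        void commit() {
            for (Map.Entry<String, Object> e : local.entrySet()) {
                if (e.getValue() == REMOVED) {
                    backing.remove(e.getKey());
                } else {
                    backing.put(e.getKey(), e.getValue());
                }
            }
            local.clear();
        }

        // Discard the buffered changes; the backing cache is never touched.
        void rollback() {
            local.clear();
        }
    }

    Txn begin() {
        return new Txn();
    }

    // What other transactions would see in the backing cache.
    Object committedValue(String key) {
        return backing.get(key);
    }
}
```

In this sketch a `put()` followed by `rollback()` leaves the backing map untouched, which is the property the abstraction is after: changes from a rolled-back write transaction never reach the shared cache.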
    23 === Why READ_COMMITTED isolation level? === 
    24  
    25 get() always loads from the underlying ehcache. It does not keep a  
    26 copy in the local txn scoped cache. This essentially means the get()  
    27 operation is at a READ_COMMITTED txn isolation level and repeated 
    28 get() calls may return different values within a single txn.  
    29  
    30 This is sort of a primitive. For application logic that requires 
    31 repeated read guarantees, it can maintain a local cache of its own. 
    32  
    33 For example, the second-level cache for OTM operations is only 
    34 looked up if the first-level cache (ie. the Session) itself does 
    35 not contain the object or a refresh() is requested. So the primitive 
    36 READ_COMMITTED get() is required. 
    37  
    38 === Why the lock for beforeCompletion() in !CacheManager? === 
    39  
    40 The txn scoped put() and remove() are all accumulated in  
    41 a local map and the changes are propagated to the cache 
    42 at transaction commit. We cannot commit the cache changes 
    43 at beforeCompletion() since a prepare()/commit() operation 
    44 could fail. This leaves all changes to be at afterCompletion(). 
    45  
    46 The problem at afterCompletion() is that the txn in mulgara 
    47 is already committed and what we update could very well be 
    48 stale data by the time we come around to writing it to the cache. 
    49  
    50 So the strategy chosen is to acquire a lock at beforeCompletion() 
    51 and hold it till the cache is updated in afterCompletion(). This is 
    52 effectively serializing the cache updates. Certainly it serializes 
    53 updates locally. Peer updates are serialized as long as the  
    54 updates only cause cache invalidation (for ehcache, 
    55 replicateUpdatesViaCopy = false, replicatePuts = false).  
    56  
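The v7 locking strategy described above might look roughly like this. It is a sketch under stated assumptions, not the actual !CacheManager code: the names (`LockedCommitSketch`, `TxnCallbacks`, `lockWaitMillis`) are hypothetical, both callbacks are assumed to run on the same thread, and failing the transaction when the lock wait times out is an assumption the text above does not confirm.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch: hold a lock from beforeCompletion() to afterCompletion(). */
class LockedCommitSketch {
    // Stand-in for the backing cache (e.g. Ehcache).
    private final Map<String, String> backing = new ConcurrentHashMap<>();
    // Serializes local cache updates across committing transactions.
    private final ReentrantLock updateLock = new ReentrantLock();
    // Configurable lock wait time (hypothetical parameter).
    private final long lockWaitMillis;

    LockedCommitSketch(long lockWaitMillis) {
        this.lockWaitMillis = lockWaitMillis;
    }

    /** Per-transaction state plus the two completion callbacks. */
    final class TxnCallbacks {
        private final Map<String, String> pending = new HashMap<>();
        private boolean locked;

        void put(String key, String value) {
            pending.put(key, Objects.requireNonNull(value));
        }

        // Acquire the lock before the store commits; do not touch the cache yet,
        // because prepare()/commit() could still fail.
        void beforeCompletion() {
            try {
                locked = updateLock.tryLock(lockWaitMillis, TimeUnit.MILLISECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted waiting for cache lock", e);
            }
            if (!locked) {
                // Assumption: a timeout fails the transaction.
                throw new IllegalStateException("timed out waiting for cache lock");
            }
        }

        // Apply the pending changes only if the transaction committed,
        // then release the lock in every case.
        void afterCompletion(boolean committed) {
            try {
                if (committed) {
                    backing.putAll(pending);
                }
            } finally {
                pending.clear();
                if (locked) {
                    locked = false;
                    updateLock.unlock();
                }
            }
        }
    }

    TxnCallbacks newTxn() {
        return new TxnCallbacks();
    }

    String read(String key) {
        return backing.get(key);
    }

    boolean isUpdateLocked() {
        return updateLock.isLocked();
    }
}
```

Because the lock is held across the store's commit, no other local transaction can write to the cache between the commit and the cache update. A shorter `lockWaitMillis` trades that guarantee for less waiting during lengthy commits.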
    57 '''Important:''' Currently ambra caches are all set with  
    58 replicateUpdatesViaCopy = true, replicatePuts = true. This can  
    59 cause an older version of the data in the peer to overwrite the  
    60 latest change. In effect, Ambra caches are tuned more for performance 
    61 than for coherency. In such a setup the 
    62 update lock acquisition may not be that important either. So an 
    63 admin may decide to reduce the lock wait time to allow more 
    64 concurrent transactions during lengthy commit() operations 
    65 such as an article ingest to a blob-store like Fedora. 
     12Note that this goal is slightly different from that of 0.9 Ambra: there was an additional  
     13goal there to reduce queries to Mulgara at whatever cost, and therefore there was the  
     14populate() concept that was different from a put(). The changes in 0.9.3 take advantage of Ambra's  
     15extensive use of read-only transactions and relax the need for a cache populate(). So,  
     16once Mulgara allows multiple concurrent write transactions, it is possible for multiple threads  
     17to all run the same Mulgara query to populate the cache. However, that is not a concern for  
     18now, and therefore eliminating populate() vastly simplifies the cache management and avoids 
     19us having to serialize transaction commits. 
    6620 
    6721== Ambra Cache UML Diagrams ==