| 3 | | * populate() bypasses the txn local {key, value} map and writes directly to the underlying cache |
|---|
| 4 | | * read() does not populate the txn local {key, value} map (READ_COMMITTED txn isolation) |
|---|
| 5 | | * put(), remove() and removeAll() update the txn local {key, value} map only |
|---|
| 6 | | * read() looks in the txn local {key, value} map first. A key removed by remove() or removeAll() in the local txn reads as 'null' |
|---|
| 7 | | * commit() applies the txn local {key, value} map to the underlying cache (Ehcache) |
|---|
| 8 | | * beforeCommit() acquires a lock and afterCommit() updates cache and releases lock. |
|---|
| 9 | | * Ensures 'serializable' txn isolation for cache updates. |
|---|
| 10 | | * committed cached data is immediately visible to all txns locally (READ_COMMITTED txn isolation) |
|---|
| 11 | | * ehcache peers will see stale data until peer syncs are complete |
|---|
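The rules in the list above can be sketched in a few lines of Java. This is a minimal illustration only, not Ambra's code: the class `TxnLocalCacheSketch`, its method names, and the use of a plain `Map` as a stand-in for Ehcache are all assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of one transaction's view of a shared cache. */
class TxnLocalCacheSketch {
    // Stand-in for the backing cache (Ehcache in Ambra); shared by all txns.
    private final Map<String, String> backing;
    // Txn local {key, value} map; a null value marks a key removed in this txn.
    private final Map<String, String> local = new HashMap<>();

    TxnLocalCacheSketch(Map<String, String> backing) {
        this.backing = backing;
    }

    /** Bypasses the txn local map and writes directly to the backing cache. */
    void populate(String key, String value) {
        backing.put(key, value);
    }

    /** Recorded locally; invisible to other txns until commit(). */
    void put(String key, String value) {
        local.put(key, value);
    }

    /** Recorded locally as a null marker. */
    void remove(String key) {
        local.put(key, null);
    }

    /** Local map first; otherwise read through without keeping a copy. */
    String get(String key) {
        if (local.containsKey(key)) {
            return local.get(key); // null if removed in this txn
        }
        return backing.get(key);   // READ_COMMITTED: nothing is cached locally
    }

    /** Applies the accumulated local changes to the backing cache. */
    void commit() {
        for (Map.Entry<String, String> e : local.entrySet()) {
            if (e.getValue() == null) {
                backing.remove(e.getKey());
            } else {
                backing.put(e.getKey(), e.getValue());
            }
        }
        local.clear();
    }

    /** Discards the local changes; anything written by populate() stays. */
    void rollback() {
        local.clear();
    }
}
```

Note that rollback() cannot undo a populate(), which is why populate() must only be used for values that are already committed in the database.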
| | 3 | See [http://www.topazproject.org/trac/wiki/AmbraCache?version=7] for 0.9.2 and prior versions |
|---|
| 15 | | populate() essentially reflects a value already in the database, whereas |
|---|
| 16 | | put() reflects an update to the database. It is important to |
|---|
| 17 | | follow this distinction to ensure entries are not stale in the cache. |
|---|
| 18 | | put() operations are accumulated and applied at commit() time, whereas |
|---|
| 19 | | populate() is applied immediately. Therefore it is '''important''' to ensure |
|---|
| 20 | | that populate is not used for data that is the result of updates |
|---|
| 21 | | that are likely to be rolled-back. |
|---|
| | 7 | The goal of the Ambra cache management abstraction is to ensure that updates to the |
|---|
| | 8 | cache from within write txns are committed to the backing cache (e.g. EhCache) |
|---|
| | 9 | only at transaction commit. Until then these changes are held in a transaction local |
|---|
| | 10 | {key, value} map. |
|---|
| 23 | | === Why READ_COMMITTED isolation level? === |
|---|
| 24 | | |
|---|
| 25 | | get() always loads from the underlying ehcache. It does not keep a |
|---|
| 26 | | copy in the local txn scoped cache. This essentially means the get() |
|---|
| 27 | | operation is at a READ_COMMITTED txn isolation level and repeated |
|---|
| 28 | | get() calls may return different values within a single txn. |
|---|
| 29 | | |
|---|
| 30 | | This get() is meant as a primitive. Application logic that requires |
|---|
| 31 | | repeatable-read guarantees can maintain a local cache of its own. |
|---|
| 32 | | |
|---|
| 33 | | For example, the second-level cache for OTM operations is only |
|---|
| 34 | | looked up if the first-level cache (ie. the Session) itself does |
|---|
| 35 | | not contain the object or a refresh() is requested. So the primitive |
|---|
| 36 | | READ_COMMITTED get() is required. |
|---|
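As a rough illustration of a caller keeping "a local cache of its own" on top of a READ_COMMITTED get(), the sketch below remembers the first value seen for each key. The class `RepeatableReadView` and its `refresh()` method are hypothetical names chosen for this example; they are not part of Ambra or OTM.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical caller-side wrapper giving repeatable reads within one txn. */
class RepeatableReadView {
    // The underlying READ_COMMITTED lookup, e.g. a method reference to get().
    private final Function<String, String> committedGet;
    // First value observed for each key during this txn (may be null).
    private final Map<String, String> seen = new HashMap<>();

    RepeatableReadView(Function<String, String> committedGet) {
        this.committedGet = committedGet;
    }

    /** Returns the same value on every call until refresh() is requested. */
    String get(String key) {
        if (!seen.containsKey(key)) {
            seen.put(key, committedGet.apply(key));
        }
        return seen.get(key);
    }

    /** Forgets the remembered value so the next get() reads through again. */
    void refresh(String key) {
        seen.remove(key);
    }
}
```

This mirrors the first-level/second-level arrangement described above: the wrapper plays the role of the Session, and the shared cache is consulted only on a miss or after a refresh.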
| 37 | | |
|---|
| 38 | | === Why the lock for beforeCompletion() in !CacheManager? === |
|---|
| 39 | | |
|---|
| 40 | | The txn scoped put() and remove() are all accumulated in |
|---|
| 41 | | a local map and the changes are propagated to the cache |
|---|
| 42 | | at transaction commit. We cannot commit the cache changes |
|---|
| 43 | | at beforeCompletion() since a prepare()/commit() operation |
|---|
| 44 | | could fail. So all changes must be applied at afterCompletion(). |
|---|
| 45 | | |
|---|
| 46 | | The problem at afterCompletion() is that the txn in mulgara |
|---|
| 47 | | is already committed and what we update could very well be |
|---|
| 48 | | stale data by the time we come around to writing it to the cache. |
|---|
| 49 | | |
|---|
| 50 | | So the strategy chosen is to acquire a lock at beforeCompletion() |
|---|
| 51 | | and hold it till the cache is updated in afterCompletion(). This is |
|---|
| 52 | | effectively serializing the cache updates. Certainly it serializes |
|---|
| 53 | | updates locally. Peer updates are serialized as long as the |
|---|
| 54 | | updates only cause cache invalidation (for ehcache, |
|---|
| 55 | | replicateUpdatesViaCopy = false, replicatePuts = false). |
|---|
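The lock strategy described above can be sketched as follows. This is an illustration under stated assumptions, not the !CacheManager implementation: the class `CommitLockSketch`, the `lockWaitMillis` parameter, and the choice to proceed without the lock when the wait times out are all hypothetical. A `Semaphore` is used rather than a `ReentrantLock` so that the release need not come from the acquiring thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: hold a lock from beforeCompletion() to afterCompletion(). */
class CommitLockSketch {
    // One permit shared by all txns in this JVM; serializes local cache updates.
    private static final Semaphore UPDATE_LOCK = new Semaphore(1);

    private final Map<String, String> backing;                    // stand-in for Ehcache
    private final Map<String, String> pending = new HashMap<>();  // txn local changes
    private final long lockWaitMillis;                            // assumed tunable wait time
    private boolean holdsLock;

    CommitLockSketch(Map<String, String> backing, long lockWaitMillis) {
        this.backing = backing;
        this.lockWaitMillis = lockWaitMillis;
    }

    void put(String key, String value) {
        pending.put(key, value);
    }

    /** Called before the database commit: try to take the update lock. */
    void beforeCompletion() {
        try {
            // Assumption: on timeout the txn proceeds without the lock.
            holdsLock = UPDATE_LOCK.tryAcquire(lockWaitMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            holdsLock = false;
        }
    }

    /** Called after the database commit or rollback: update cache, release lock. */
    void afterCompletion(boolean committed) {
        try {
            if (committed) {
                backing.putAll(pending);
            }
        } finally {
            pending.clear();
            if (holdsLock) {
                UPDATE_LOCK.release();
                holdsLock = false;
            }
        }
    }

    boolean holdsLock() {
        return holdsLock;
    }
}
```

A shorter `lockWaitMillis` lets more transactions proceed during a lengthy commit, at the cost of weaker ordering of cache updates.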
| 56 | | |
|---|
| 57 | | '''Important:''' Currently ambra caches are all set with |
|---|
| 58 | | replicateUpdatesViaCopy = true, replicatePuts = true. This can |
|---|
| 59 | | cause an older version of the data in the peer to overwrite the |
|---|
| 60 | | latest change. In other words, Ambra caches are tuned more for performance |
|---|
| 61 | | than for coherency. In such a setup the |
|---|
| 62 | | update lock acquisition may not be that important either. So an |
|---|
| 63 | | admin may decide to reduce the lock wait time to allow more |
|---|
| 64 | | concurrent transactions during lengthy commit() operations |
|---|
| 65 | | such as an article ingest to a blob-store like Fedora. |
|---|
| | 12 | Note that this goal is slightly different from the 0.9 ambra - there was an additional |
|---|
| | 13 | goal there to reduce queries to mulgara at whatever cost and therefore there was the |
|---|
| | 14 | populate() concept that was different from a put(). Changes in 0.9.3 take advantage of Ambra's |
|---|
| | 15 | extensive use of read-only transactions and relax the need for a cache populate(). So |
|---|
| | 16 | once Mulgara allows multiple concurrent write transactions, it is possible for multiple threads |
|---|
| | 17 | to all run the same Mulgara query to populate the cache. However, that is not a concern for |
|---|
| | 18 | now, and eliminating populate() therefore vastly simplifies the cache management and avoids |
|---|
| | 19 | us having to serialize transaction commits. |
|---|