4.1.4 In Memory Format
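IMap has a configurable in-memory storage format. By default, Hazelcast stores data in memory in serialized binary form. Sometimes, however, it is more efficient to store entries in their object form, especially for local processing such as queries and entry processing. By setting in-memory-format in the map configuration, you decide how the data is stored in memory. The available options are: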
BINARY (default): This is the default option. The data will be stored in serialized binary format. You can use this option if you mostly perform regular map operations like put and get.
OBJECT: The data will be stored in deserialized form. This configuration is good for maps where entry processing and queries form the majority of all operations and the objects are complex, so the serialization cost is correspondingly high. By storing objects, entry processing will not incur the deserialization cost.
NOTE: If a value is stored in OBJECT format, a change on a returned value does not affect the stored instance. In this case, the returned instance is not the actual one but a clone. Therefore, changes made on an object after it is returned will not be reflected in the actual stored data. Similarly, when a value is written to a map and the value is stored in OBJECT format, the stored value will be a copy of the put value, so changes made on the object after it is stored will not be reflected in the actual stored data.
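The copy semantics in the note above can be illustrated with a small JDK-only model. This is a sketch under stated assumptions, not Hazelcast code: the class CopyingStoreSketch and its nested Person type are hypothetical, and the copying is done with plain Java serialization on write and on read, which is only one possible way to obtain such clones.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical JDK-only model of a store that copies values on put and on get.
// It is NOT Hazelcast code; it only mimics the copy semantics from the note.
final class CopyingStoreSketch {

    // Simple mutable value type used for the demonstration.
    static final class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        String name;
        Person(String name) { this.name = name; }
    }

    private final Map<Long, byte[]> data = new HashMap<>();

    // Serialize on write: the store keeps bytes, not the caller's instance.
    void put(Long key, Person value) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
            out.flush();
            data.put(key, bytes.toByteArray());
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Deserialize on read: every call returns a fresh copy, or null if absent.
    Person get(Long key) {
        byte[] bytes = data.get(key);
        if (bytes == null) {
            return null;
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return (Person) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        CopyingStoreSketch store = new CopyingStoreSketch();
        Person original = new Person("Alice");
        store.put(1L, original);

        original.name = "changed after put";   // does not reach the store
        Person returned = store.get(1L);
        returned.name = "changed after get";   // does not reach the store either

        System.out.println(store.get(1L).name);
    }
}
```

To change a stored value in a store with these semantics, the caller has to put the modified object back explicitly.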
4.1.5 Map Persistence
NOTE: The data store needs to be a centralized system that is accessible from all Hazelcast nodes. Persisting to the local file system is not supported.
    import static java.lang.String.format;

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.Collection;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Set;

    import com.hazelcast.core.MapStore; // package name as of Hazelcast 3.x

    public class PersonMapStore implements MapStore<Long, Person> {
        private final Connection con;

        // Opens the connection and creates the table on first use.
        public PersonMapStore() {
            try {
                con = DriverManager.getConnection("jdbc:hsqldb:mydatabase", "SA", "");
                con.createStatement().executeUpdate(
                        "create table if not exists person (id bigint, name varchar(45))");
            } catch (SQLException e) {
                throw new RuntimeException(e);
            }
        }

        public synchronized void delete(Long key) {
            System.out.println("Delete:" + key);
            try {
                con.createStatement().executeUpdate(
                        format("delete from person where id = %s", key));
            } catch (SQLException e) {
                throw new RuntimeException(e);
            }
        }

        public synchronized void store(Long key, Person value) {
            try {
                con.createStatement().executeUpdate(
                        format("insert into person values(%s,'%s')", key, value.name));
            } catch (SQLException e) {
                throw new RuntimeException(e);
            }
        }

        public synchronized void storeAll(Map<Long, Person> map) {
            for (Map.Entry<Long, Person> entry : map.entrySet()) {
                store(entry.getKey(), entry.getValue());
            }
        }

        public synchronized void deleteAll(Collection<Long> keys) {
            for (Long key : keys) {
                delete(key);
            }
        }

        public synchronized Person load(Long key) {
            try {
                ResultSet resultSet = con.createStatement().executeQuery(
                        format("select name from person where id =%s", key));
                try {
                    if (!resultSet.next()) {
                        return null;
                    }
                    String name = resultSet.getString(1);
                    return new Person(name);
                } finally {
                    resultSet.close();
                }
            } catch (SQLException e) {
                throw new RuntimeException(e);
            }
        }

        public synchronized Map<Long, Person> loadAll(Collection<Long> keys) {
            Map<Long, Person> result = new HashMap<Long, Person>();
            for (Long key : keys) {
                result.put(key, load(key));
            }
            return result;
        }

        // Returning null here means that no keys are pre-loaded.
        public Set<Long> loadAllKeys() {
            return null;
        }
    }
NOTE: The loading process is performed on a thread other than the partition threads, using an ExecutorService.
Hazelcast supports read-through, write-through and write-behind persistence modes, which are explained in the subsections below.
In write-through mode, when the map.put(key,value) call returns:
- MapStore.store(key,value) is successfully called so the entry is persisted.
- In-Memory entry is updated.
- In-Memory backup copies are successfully created on other JVMs (if backup-count is greater than 0).
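The ordering in the list above (persist first, then update memory) can be sketched with a small JDK-only model. The class WriteThroughSketch and its BiConsumer-based store are hypothetical stand-ins, not Hazelcast APIs, and backups are left out. That a failing store leaves memory untouched is a consequence of the ordering assumed in this sketch, not a statement about Hazelcast's error handling.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;

// Hypothetical JDK-only model of write-through; not Hazelcast's implementation.
final class WriteThroughSketch {

    // Stand-in for the in-memory map entries.
    final Map<Long, String> memory = new HashMap<>();

    // Plays the role of MapStore.store(key, value).
    private final BiConsumer<Long, String> store;

    WriteThroughSketch(BiConsumer<Long, String> store) {
        this.store = store;
    }

    // Persist synchronously, then update memory. If the store call throws,
    // the exception propagates and the in-memory entry is left unchanged.
    void put(Long key, String value) {
        store.accept(key, value);
        memory.put(key, value);
    }

    public static void main(String[] args) {
        Map<Long, String> dataStore = new HashMap<>();   // stand-in for a database
        WriteThroughSketch map = new WriteThroughSketch(dataStore::put);
        map.put(1L, "Alice");
        System.out.println("data store and memory agree: " + dataStore.equals(map.memory));
    }
}
```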
NOTE: In write-behind mode, by default Hazelcast coalesces updates on a specific key, i.e. it applies only the last update on it. However, you can set MapStoreConfig#setWriteCoalescing to false to store all updates performed on a key to the data store.
NOTE: When you set MapStoreConfig#setWriteCoalescing to false, after the per-node maximum write-behind queue capacity is reached, subsequent put operations will fail with ReachedMaxSizeException. This exception is thrown to prevent uncontrolled growth of the write-behind queues. You can set the per-node maximum capacity with GroupProperty#MAP_WRITE_BEHIND_QUEUE_CAPACITY.
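The capacity check described in the note can be pictured with a JDK-only model of a queue that keeps every update and rejects new ones once it is full. Everything here is hypothetical: the class BoundedWriteBehindQueueSketch is not a Hazelcast type, and IllegalStateException merely stands in for ReachedMaxSizeException.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical JDK-only model of a non-coalescing write-behind queue with a capacity.
final class BoundedWriteBehindQueueSketch {

    // One pending update; every update is kept, even for the same key.
    static final class Update {
        final Long key;
        final String value;
        Update(Long key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    private final Deque<Update> pending = new ArrayDeque<>();
    private final int capacity;

    BoundedWriteBehindQueueSketch(int capacity) {
        this.capacity = capacity;
    }

    // Rejects the update once the queue is full. IllegalStateException stands in
    // for the ReachedMaxSizeException mentioned in the note.
    void offer(Long key, String value) {
        if (pending.size() >= capacity) {
            throw new IllegalStateException("write-behind queue capacity reached: " + capacity);
        }
        pending.addLast(new Update(key, value));
    }

    // Drains all pending updates in arrival order, as a flush to the data store would.
    List<Update> drain() {
        List<Update> drained = new ArrayList<>(pending);
        pending.clear();
        return drained;
    }

    public static void main(String[] args) {
        BoundedWriteBehindQueueSketch queue = new BoundedWriteBehindQueueSketch(2);
        queue.offer(1L, "v1");
        queue.offer(1L, "v2");           // same key, kept as a separate update
        try {
            queue.offer(1L, "v3");       // exceeds the capacity of 2
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println("flushed updates: " + queue.drain().size());
    }
}
```

In this model the queue only accepts new updates again after a drain, which is why an unbounded backlog cannot build up.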
In write-behind mode, when the map.put(key,value) call returns, you can be sure that:
- In-Memory entry is updated.
- In-Memory backup copies are successfully created on other JVMs (if backup-count is greater than 0).
- The entry is marked as dirty so that after write-delay-seconds, it can be persisted with a MapStore.store(key,value) call.
Hazelcast calls MapStore.storeAll(map) and MapStore.deleteAll(collection) to perform all write operations in a single call.
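Coalescing and batched writes in write-behind mode can be sketched with a JDK-only model. The class CoalescingWriteBehindSketch is hypothetical and not Hazelcast code. Its flush() method is called by hand here, whereas a real write-behind store would be flushed after write-delay-seconds, and the recorded storeAllCalls list stands in for calls to MapStore.storeAll(map).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical JDK-only model of coalescing write-behind; not Hazelcast's implementation.
final class CoalescingWriteBehindSketch {

    // Stand-in for the in-memory map entries, updated immediately on put.
    final Map<Long, String> memory = new HashMap<>();

    // Dirty entries waiting to be persisted, keyed so that updates coalesce.
    private final Map<Long, String> dirty = new LinkedHashMap<>();

    // Records each batch that would be handed to MapStore.storeAll(map).
    final List<Map<Long, String>> storeAllCalls = new ArrayList<>();

    // Updates memory right away and marks the entry as dirty.
    // A later update to the same key overwrites the earlier pending one.
    void put(Long key, String value) {
        memory.put(key, value);
        dirty.put(key, value);
    }

    // Persists all dirty entries as one batch and clears the dirty set.
    void flush() {
        if (dirty.isEmpty()) {
            return;
        }
        storeAllCalls.add(new LinkedHashMap<>(dirty));
        dirty.clear();
    }

    public static void main(String[] args) {
        CoalescingWriteBehindSketch map = new CoalescingWriteBehindSketch();
        map.put(1L, "first");
        map.put(1L, "second");   // coalesced with the previous update
        map.put(2L, "other");
        map.flush();
        System.out.println("batches: " + map.storeAllCalls.size()
                + ", entries in batch: " + map.storeAllCalls.get(0).size());
    }
}
```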
NOTE: If a map entry is marked as dirty, i.e. it is waiting to be persisted to the MapStore in a write-behind scenario, the eviction process forces the entry to be stored. This way, you have control over the number of entries waiting to be stored, so that a possible OutOfMemory exception can be prevented.
NOTE: MapStore or MapLoader implementations should not use Hazelcast Map/Queue/MultiMap/List/Set operations. Your implementation should only work with your data store. Otherwise, you may get into deadlock situations.
    <hazelcast>
      ...
      <map name="default">
        ...
        <map-store enabled="true">
          <!--
            Name of the class implementing MapLoader and/or MapStore.
            The class should implement at least one of these interfaces and
            contain a no-argument constructor. Note that inner classes
            are not supported.
          -->
          <class-name>com.hazelcast.examples.DummyStore</class-name>
          <!--
            Number of seconds to delay the call to MapStore.store(key, value).
            If the value is zero, then it is write-through, so
            MapStore.store(key, value) will be called as soon as the entry
            is updated. Otherwise it is write-behind, so updates will be stored
            after the write-delay-seconds value by calling MapStore.storeAll(map).
            Default value is 0.
          -->
          <write-delay-seconds>60</write-delay-seconds>
          <!--
            Used to create batch chunks when writing to the map store.
            In default mode, all entries will be tried to persist in one go.
            To create batch chunks, the minimum meaningful value for
            write-batch-size is 2. For values smaller than 2, it works
            as in default mode.
          -->
          <write-batch-size>1000</write-batch-size>
        </map-store>
      </map>
    </hazelcast>
MapStoreFactory and MapLoaderLifecycleSupport Interfaces
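As is well known, a configuration can be applied to more than one map using wildcards (please see Using Wildcard), meaning that the configuration is shared among the maps. But MapStore does not know which entries to store when one configuration is applied to multiple maps. To overcome this, Hazelcast provides the MapStoreFactory interface.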
    Config config = new Config();
    MapConfig mapConfig = config.getMapConfig( "*" );
    MapStoreConfig mapStoreConfig = mapConfig.getMapStoreConfig();
    mapStoreConfig.setFactoryImplementation( new MapStoreFactory<Object, Object>() {
        @Override
        public MapLoader<Object, Object> newMapStore( String mapName, Properties properties ) {
            return null;
        }
    });
    public interface MapLoaderLifecycleSupport {

        /**
         * Initializes this MapLoader implementation. Hazelcast will call
         * this method when the map is first used on the
         * HazelcastInstance. Implementation can
         * initialize required resources for the implementing
         * mapLoader such as reading a config file and/or creating
         * database connection.
         */
        void init( HazelcastInstance hazelcastInstance, Properties properties, String mapName );

        /**
         * Hazelcast will call this method before shutting down.
         * This method can be overridden to cleanup the resources
         * held by this map loader implementation, such as closing the
         * database connections etc.
         */
        void destroy();
    }
Initialization on startup
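The MapLoader.loadAllKeys API is used to pre-populate the in-memory map when the map is first touched/used. If MapLoader.loadAllKeys returns null, nothing will be loaded. Your MapLoader.loadAllKeys implementation can return all of the keys or only some of them; for instance, you may choose to return only the hot keys. This is also the fastest way of pre-populating the map, since Hazelcast optimizes the loading process by having each node load the entries it owns.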
The initialization flow is as follows:
- When getMap() is first called from any node, initialization will start depending on the value of InitialLoadMode. If it is set as EAGER, initialization starts. If it is set as LAZY, initialization does not actually start, but data is loaded each time a partition loading is completed.
- Hazelcast will call MapLoader.loadAllKeys() to get all your keys on each node.
- Each node will figure out the list of keys it owns.
- Each node will load all its owned keys by calling MapLoader.loadAll(keys).
- Each node puts its owned entries into the map by calling IMap.putTransient(key,value).
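The per-node part of this flow can be sketched with a JDK-only simulation. This is a simplified model, not Hazelcast code: the class InitialLoadSketch is hypothetical, and key ownership is assumed to be hashCode modulo the number of nodes, which only illustrates the idea that each node loads just the keys it owns.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical JDK-only simulation of the initial load flow; not Hazelcast's implementation.
final class InitialLoadSketch {

    private final Map<Integer, String> dataStore;   // stand-in for the external store
    private final int nodeCount;

    InitialLoadSketch(Map<Integer, String> dataStore, int nodeCount) {
        this.dataStore = dataStore;
        this.nodeCount = nodeCount;
    }

    // Plays the role of MapLoader.loadAllKeys().
    List<Integer> loadAllKeys() {
        return new ArrayList<>(dataStore.keySet());
    }

    // Plays the role of MapLoader.loadAll(keys).
    Map<Integer, String> loadAll(Collection<Integer> keys) {
        Map<Integer, String> result = new HashMap<>();
        for (Integer key : keys) {
            result.put(key, dataStore.get(key));
        }
        return result;
    }

    // ASSUMPTION: ownership is hashCode modulo node count, for illustration only.
    int ownerOf(Integer key) {
        return Math.floorMod(key.hashCode(), nodeCount);
    }

    // One node's work: get all keys, keep the owned ones, load them, store them locally.
    Map<Integer, String> initializeNode(int nodeId) {
        List<Integer> owned = new ArrayList<>();
        for (Integer key : loadAllKeys()) {
            if (ownerOf(key) == nodeId) {
                owned.add(key);
            }
        }
        // Putting the loaded entries into the local map mirrors the putTransient step.
        return new HashMap<>(loadAll(owned));
    }

    public static void main(String[] args) {
        Map<Integer, String> store = new HashMap<>();
        for (int i = 0; i < 10; i++) {
            store.put(i, "value-" + i);
        }
        InitialLoadSketch cluster = new InitialLoadSketch(store, 3);
        for (int node = 0; node < 3; node++) {
            System.out.println("node " + node + " owns " + cluster.initializeNode(node).size() + " entries");
        }
    }
}
```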
NOTE: If the load mode is LAZY and the clear() method is called (which triggers MapStore.deleteAll()), Hazelcast will remove ONLY the loaded entries from your map and data store. Since not all of the data has been loaded in this case (LAZY mode), please note that there may still be entries left in your data store.
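The effect described in this note can be reproduced with a tiny JDK-only model. The class LazyClearSketch is hypothetical and not Hazelcast code, and loading is reduced to an explicit per-key load(key) call, so the sketch only shows why entries that were never loaded survive a clear.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical JDK-only model of clear() on a partially loaded map; not Hazelcast code.
final class LazyClearSketch {

    // Stand-in for the external data store.
    final Map<Integer, String> dataStore = new HashMap<>();

    // Entries that have been loaded into memory so far.
    final Map<Integer, String> loaded = new HashMap<>();

    // Loads a single entry into memory if it exists in the data store.
    void load(Integer key) {
        String value = dataStore.get(key);
        if (value != null) {
            loaded.put(key, value);
        }
    }

    // Clears the map: only the loaded keys are passed on to deleteAll.
    void clear() {
        deleteAll(new ArrayList<>(loaded.keySet()));
        loaded.clear();
    }

    // Plays the role of MapStore.deleteAll(keys).
    private void deleteAll(Collection<Integer> keys) {
        for (Integer key : keys) {
            dataStore.remove(key);
        }
    }

    public static void main(String[] args) {
        LazyClearSketch map = new LazyClearSketch();
        for (int i = 1; i <= 5; i++) {
            map.dataStore.put(i, "value-" + i);
        }
        map.load(1);
        map.load(2);
        map.clear();
        System.out.println("entries left in data store: " + map.dataStore.keySet());
    }
}
```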
Forcing All Keys To Be Loaded
    public class LoadAll {

        // createNewConfig and populateMap are helper methods that are not shown here.
        public static void main(String[] args) {
            final int numberOfEntriesToAdd = 1000;
            final String mapName = LoadAll.class.getCanonicalName();
            final Config config = createNewConfig(mapName);
            final HazelcastInstance node = Hazelcast.newHazelcastInstance(config);
            final IMap<Integer, Integer> map = node.getMap(mapName);

            populateMap(map, numberOfEntriesToAdd);
            System.out.printf("# Map store has %d elements\n", numberOfEntriesToAdd);

            // Remove the entries from memory, then force them to be loaded again.
            map.evictAll();
            System.out.printf("# After evictAll map size\t: %d\n", map.size());

            map.loadAll(true);
            System.out.printf("# After loadAll map size\t: %d\n", map.size());
        }
    }
Post Processing Map Store
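    // MapStore is an interface, so it is implemented rather than extended.
    class ProcessingStore implements MapStore<Integer, Employee>, PostProcessingMapStore {

        @Override
        public void store( Integer key, Employee employee ) {
            EmployeeId id = saveEmployee();
            employee.setId( id.getId() );
        }

        // The remaining MapStore methods are omitted here.
    }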
4.1.6 Near Cache
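Map entries in Hazelcast are partitioned across the cluster. Suppose you read the key k many times and k is owned by another member of the cluster. Each map.get(k) will then be a remote operation, which means a lot of network traffic. If you have a read-mostly map, you should consider creating a near cache for it, so that reads become much faster and consume less network traffic. These benefits do not come for free. When using near cache, you should consider the following issues: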
- JVM will have to hold extra cached data so it will increase the memory consumption.
- If invalidation is turned on and entries are updated frequently, then invalidations will be costly.
- Near cache breaks the strong consistency guarantees; you might be reading stale data.
To reiterate, near cache is a good choice for maps with a heavy read load. Here is a near cache configuration example for a map:
    <hazelcast>
      ...
      <map name="my-read-mostly-map">
        ...
        <near-cache>
          <!--
            Maximum size of the near cache. When max size is reached,
            cache is evicted based on the policy defined.
            Any integer between 0 and Integer.MAX_VALUE. 0 means
            Integer.MAX_VALUE. Default is 0.
          -->
          <max-size>5000</max-size>
          <!--
            Maximum number of seconds for each entry to stay in the near cache.
            Entries that are older than <time-to-live-seconds> will get
            automatically evicted from the near cache.
            Any integer between 0 and Integer.MAX_VALUE. 0 means infinite.
            Default is 0.
          -->
          <time-to-live-seconds>0</time-to-live-seconds>
          <!--
            Maximum number of seconds each entry can stay in the near cache
            as untouched (not-read). Entries that are not read (touched) more
            than <max-idle-seconds> value will get removed from the near cache.
            Any integer between 0 and Integer.MAX_VALUE. 0 means
            Integer.MAX_VALUE. Default is 0.
          -->
          <max-idle-seconds>60</max-idle-seconds>
          <!--
            Valid values are:
            NONE (no extra eviction, <time-to-live-seconds> may still apply),
            LRU (Least Recently Used),
            LFU (Least Frequently Used).
            NONE is the default.
            Regardless of the eviction policy used, <time-to-live-seconds>
            will still apply.
          -->
          <eviction-policy>LRU</eviction-policy>
          <!--
            Should the cached entries get evicted if the entries are changed
            (updated or removed). true or false. Default is true.
          -->
          <invalidate-on-change>true</invalidate-on-change>
          <!--
            You may want also local entries to be cached.
            This is useful when in memory format for near cache is different
            than the map's one. By default it is disabled.
          -->
          <cache-local-entries>false</cache-local-entries>
        </near-cache>
      </map>
    </hazelcast>
NOTE: Programmatically, near cache configuration is done by using the class NearCacheConfig. This class is used both in nodes and in clients. To create a near cache in a client (native Java client), use the method addNearCacheConfig in the class ClientConfig (please see Java Client section). Please note that near cache configuration is specific to the node or client itself: a map in a node may not have near cache configured while the same map in a client may have.
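The trade-offs of a near cache (fewer remote reads, extra local memory, invalidation on change) can be sketched with a JDK-only model. The class NearCacheSketch is hypothetical and not Hazelcast code: a local HashMap stands in for the owning member, the remoteReads counter stands in for network trips, and the bounded LRU cache is built on LinkedHashMap in access order.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical JDK-only model of a near cache; not Hazelcast's implementation.
final class NearCacheSketch {

    // Stand-in for the entries held by the owning member.
    private final Map<String, String> remote = new HashMap<>();

    // Local LRU cache bounded by maxSize.
    private final Map<String, String> nearCache;

    // Counts reads that had to go to the "remote" owner.
    int remoteReads;

    NearCacheSketch(final int maxSize) {
        // accessOrder = true makes the eldest entry the least recently used one.
        this.nearCache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxSize;
            }
        };
    }

    // Serves from the near cache when possible, otherwise reads remotely and caches.
    String get(String key) {
        String cached = nearCache.get(key);
        if (cached != null) {
            return cached;
        }
        remoteReads++;
        String value = remote.get(key);
        if (value != null) {
            nearCache.put(key, value);
        }
        return value;
    }

    // Writes to the owner and invalidates the cached copy (invalidate-on-change).
    void put(String key, String value) {
        remote.put(key, value);
        nearCache.remove(key);
    }

    public static void main(String[] args) {
        NearCacheSketch map = new NearCacheSketch(100);
        map.put("k", "v1");
        map.get("k");
        map.get("k");   // served from the near cache
        System.out.println("remote reads after two gets: " + map.remoteReads);
    }
}
```

In this single-process model the cache is invalidated synchronously with the write. In a cluster, invalidations reach other members with some delay, which is where the stale reads mentioned above come from.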