Overview
Proto DataStore stores data as instances of a custom data type. This implementation requires you to define a schema using protocol buffers, but in return it guarantees type safety.
Compared with Preferences DataStore, Proto DataStore is more work to set up — the data must be defined ahead of time in proto syntax — but the payoff is guaranteed type safety.
The type safety here means that reads and writes always go through the data types declared in the .proto file: the schema is fixed up front, and the entire read/write path is generated and encapsulated, so there is no hand-written (de)serialization left to get wrong. To be fair, Preferences DataStore can also be used in a type-safe way, but there the safety is enforced by convention: for each property or object you typically write a pair of typed read/write helpers and only ever call those, so type-mismatch errors are rare rather than impossible.
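To make the contrast concrete, here is a minimal sketch of that convention-based approach with Preferences DataStore (this assumes the androidx.datastore:datastore-preferences artifact; the names settingsPrefs and COUNTER are made up for the example):

```kotlin
import android.content.Context
import androidx.datastore.preferences.core.edit
import androidx.datastore.preferences.core.intPreferencesKey
import androidx.datastore.preferences.preferencesDataStore
import kotlinx.coroutines.flow.Flow
import kotlinx.coroutines.flow.map

val Context.settingsPrefs by preferencesDataStore(name = "settings")

// The Int type lives only in this hand-written key; nothing stops another
// key with the same name but a different type from being declared elsewhere.
private val COUNTER = intPreferencesKey("counter")

fun Context.counterFlow(): Flow<Int> =
    settingsPrefs.data.map { prefs -> prefs[COUNTER] ?: 0 }

suspend fun Context.incrementCounter() {
    settingsPrefs.edit { prefs -> prefs[COUNTER] = (prefs[COUNTER] ?: 0) + 1 }
}
```

As long as every caller goes through counterFlow and incrementCounter, the types stay consistent — but the compiler does not enforce that discipline, which is exactly what Proto DataStore's generated classes do enforce.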
-
Configuration
Proto DataStore relies on the protocol buffer Gradle plugin to generate the corresponding entity classes from your .proto files. These classes exist only as build-time artifacts rather than hand-written user code, and generating them automatically removes another source of human error.
Note: the versions below are known to work together; I used Android Studio 4.2.1. On a newer or older version, pick matching versions accordingly. I pieced this configuration together from the Google Developers China documentation and various blog posts — I could not find a complete official setup.
-
In the project-level build.gradle:
```groovy
buildscript {
    ext.kotlin_version = "1.4.21"
    // These repositories resolve the dependencies declared in this buildscript block.
    repositories {
        jcenter()
        google()
        mavenCentral()
    }
    dependencies {
        classpath "com.android.tools.build:gradle:4.0.2"
        classpath "org.jetbrains.kotlin:kotlin-gradle-plugin:$kotlin_version"
        classpath 'com.google.protobuf:protobuf-gradle-plugin:0.8.12'

        // NOTE: Do not place your application dependencies here; they belong
        // in the individual module build.gradle files
    }
}
```
-
In the module-level build.gradle:
```groovy
apply plugin: 'com.google.protobuf'

protobuf {
    protoc {
        artifact = "com.google.protobuf:protoc:3.10.0"
    }

    // Generates the java Protobuf-lite code for the Protobufs in this project. See
    // https://github.com/google/protobuf-gradle-plugin#customizing-protobuf-compilation
    // for more information.
    generateProtoTasks {
        all().each { task ->
            task.builtins {
                java {
                    option 'lite'
                }
            }
        }
    }
}

dependencies {
    implementation "androidx.datastore:datastore:1.0.0"
    implementation "com.google.protobuf:protobuf-javalite:3.10.0"
}
```
-
Sync the project to download the dependencies and plugin, then define the .proto file:
syntax = "proto3"; option java_package = "com.mph.review.bean.plain"; option java_multiple_files = true; message Demo2 { int32 aa = 1; string bb = 2; }
Note that .proto files must live under src/main/proto.
-
Finally, Rebuild the project. The generated classes appear under build > generated > source > proto > debug (or release) > the package given by java_package. For example, the proto file above generates:
(Screenshot of the generated classes, e.g. Demo2 and Demo2OrBuilder, omitted.)
Usage
-
Define your own androidx.datastore.core.Serializer:
```kotlin
import androidx.datastore.core.Serializer
import com.google.protobuf.InvalidProtocolBufferException
import java.io.InputStream
import java.io.OutputStream

class MyDemo2Serializer(override val defaultValue: Demo2) : Serializer<Demo2> {

    override suspend fun readFrom(input: InputStream): Demo2 {
        try {
            return Demo2.parseFrom(input)
        } catch (e: InvalidProtocolBufferException) {
            // In practice you would usually wrap this in a CorruptionException
            // so that a ReplaceFileCorruptionHandler gets a chance to recover.
            throw e
        }
    }

    override suspend fun writeTo(t: Demo2, output: OutputStream) {
        t.writeTo(output)
    }
}
```

This simply delegates to the generated Demo2 class for parsing and serialization.
-
Create the DataStore handle for the proto:
```kotlin
val Context.demo2Proto by dataStore("demo2.proto", MyDemo2Serializer(Demo2.getDefaultInstance()))
```
We create it with Kotlin's dataStore property delegate, passing the proto file name and the Serializer defined above.
-
Reading and writing
```kotlin
suspend fun incrementCounterByProto() {
    demo2Proto.updateData { currentSettings ->
        currentSettings.toBuilder()
            .setAa(currentSettings.aa + 1)
            .setBb(currentSettings.bb + "New")
            .build()
    }
}

private fun testProtoDataStore() {
    val aaFlow: Flow<Int> = demo2Proto.data.map { settings ->
        // The aa property is generated from the proto schema.
        settings.aa
    }
}
```
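For completeness, here is a hypothetical call site (DemoActivity is an illustrative name; lifecycleScope comes from the AndroidX lifecycle-runtime-ktx artifact):

```kotlin
import androidx.appcompat.app.AppCompatActivity
import androidx.lifecycle.lifecycleScope
import kotlinx.coroutines.flow.collect
import kotlinx.coroutines.flow.first
import kotlinx.coroutines.launch

class DemoActivity : AppCompatActivity() {

    override fun onStart() {
        super.onStart()
        // Continuous observation: the flow re-emits whenever updateData
        // persists a change to demo2.proto.
        lifecycleScope.launch {
            demo2Proto.data.collect { demo2 ->
                println("aa=${demo2.aa} bb=${demo2.bb}")
            }
        }
    }

    // One-off snapshot: data.first() suspends until a single value is read,
    // which the DataStore docs recommend over caching the flow yourself.
    private suspend fun currentAa(): Int = demo2Proto.data.first().aa
}
```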
-
Source code analysis
First, look at how the delegate is constructed:
```kotlin
public fun <T> dataStore(
    fileName: String,
    serializer: Serializer<T>,
    corruptionHandler: ReplaceFileCorruptionHandler<T>? = null,
    produceMigrations: (Context) -> List<DataMigration<T>> = { listOf() },
    scope: CoroutineScope = CoroutineScope(Dispatchers.IO + SupervisorJob())
): ReadOnlyProperty<Context, DataStore<T>> {
    return DataStoreSingletonDelegate(
        fileName,
        serializer,
        corruptionHandler,
        produceMigrations,
        scope
    )
}
```

This returns a DataStoreSingletonDelegate, which implements ReadOnlyProperty:
```kotlin
public fun interface ReadOnlyProperty<in T, out V> {
    /**
     * Returns the value of the property for the given object.
     * @param thisRef the object for which the value is requested.
     * @param property the metadata for the property.
     * @return the property value.
     */
    public operator fun getValue(thisRef: T, property: KProperty<*>): V
}
```

A delegate must provide this operator getValue function; it is what every read of the delegated property compiles down to.
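If property delegation with `by` is new to you, here is a toy ReadOnlyProperty, unrelated to DataStore (the names are made up for illustration):

```kotlin
import kotlin.properties.ReadOnlyProperty
import kotlin.reflect.KProperty

// A made-up delegate: every read of the property is routed through getValue.
class Bracketed(private val raw: String) : ReadOnlyProperty<Any?, String> {
    override fun getValue(thisRef: Any?, property: KProperty<*>): String = "[$raw]"
}

val greeting by Bracketed("hello")

fun main() {
    println(greeting) // prints [hello]
}
```

DataStoreSingletonDelegate's getValue override is shown next: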
@GuardedBy("lock") @Volatile private var INSTANCE: DataStore
? = null /** * Gets the instance of the DataStore. * * @param thisRef must be an instance of [Context] * @param property not used */ override fun getValue(thisRef: Context, property: KProperty<*>): DataStore { return INSTANCE ?: synchronized(lock) { if (INSTANCE == null) { val applicationContext = thisRef.applicationContext INSTANCE = DataStoreFactory.create( serializer = serializer, produceFile = { applicationContext.dataStoreFile(fileName) }, corruptionHandler = corruptionHandler, migrations = produceMigrations(applicationContext), scope = scope ) } INSTANCE!! } } 当使用DataStoreSingletonDelegate的时候其实就是使用INSTANCE,也就是DataStore,DataStore接口有一个属性和一个方法,分别用于读取和写入操作,具体的读取和写入逻辑交给不同的子类实现:
```kotlin
public interface DataStore<T> {
    /**
     * Provides efficient, cached (when possible) access to the latest durably persisted state.
     * The flow will always either emit a value or throw an exception encountered when attempting
     * to read from disk. If an exception is encountered, collecting again will attempt to read the
     * data again.
     *
     * Do not layer a cache on top of this API: it will be impossible to guarantee consistency.
     * Instead, use data.first() to access a single snapshot.
     *
     * @return a flow representing the current state of the data
     * @throws IOException when an exception is encountered when reading data
     */
    public val data: Flow<T>

    /**
     * Updates the data transactionally in an atomic read-modify-write operation. All operations
     * are serialized, and the transform itself is a coroutine so it can perform heavy work
     * such as RPCs.
     *
     * The coroutine completes when the data has been persisted durably to disk (after which
     * [data] will reflect the update). If the transform or write to disk fails, the
     * transaction is aborted and an exception is thrown.
     *
     * @return the snapshot returned by the transform
     * @throws IOException when an exception is encountered when writing data to disk
     * @throws Exception when thrown by the transform function
     */
    public suspend fun updateData(transform: suspend (t: T) -> T): T
}
```

applicationContext.dataStoreFile(fileName) creates the File used to persist the data:
```kotlin
public fun Context.dataStoreFile(fileName: String): File =
    File(applicationContext.filesDir, "datastore/$fileName")
```
DataStoreFactory.create returns a SingleProcessDataStore instance:
```kotlin
public fun <T> create(
    serializer: Serializer<T>,
    corruptionHandler: ReplaceFileCorruptionHandler<T>? = null,
    migrations: List<DataMigration<T>> = listOf(),
    scope: CoroutineScope = CoroutineScope(Dispatchers.IO + SupervisorJob()),
    produceFile: () -> File
): DataStore<T> =
    SingleProcessDataStore(
        produceFile = produceFile,
        serializer = serializer,
        corruptionHandler = corruptionHandler ?: NoOpCorruptionHandler(),
        initTasksList = listOf(DataMigrationInitializer.getInitializer(migrations)),
        scope = scope
    )
```

Let's start with reading. As we saw, a read returns a Flow; here is SingleProcessDataStore's override of the data property:
```kotlin
override val data: Flow<T> = flow {
    val currentDownStreamFlowState = downstreamFlow.value

    if (currentDownStreamFlowState !is Data) {
        // We need to send a read request because we don't have data yet.
        actor.offer(Message.Read(currentDownStreamFlowState))
    }

    emitAll(
        downstreamFlow.dropWhile {
            if (currentDownStreamFlowState is Data ||
                currentDownStreamFlowState is Final
            ) {
                // We don't need to drop any Data or Final values.
                false
            } else {
                // we need to drop the last seen state since it was either an exception or
                // wasn't yet initialized. Since we sent a message to actor, we *will* see a
                // new value.
                it === currentDownStreamFlowState
            }
        }.map {
            when (it) {
                is ReadException -> throw it.readException
                is Final -> throw it.finalException
                is Data -> it.value
                is UnInitialized -> error(
                    "This is a bug in DataStore. Please file a bug at: " +
                        "https://issuetracker.google.com/issues/new?" +
                        "component=907884&template=1466542"
                )
            }
        }
    )
}
```

downstreamFlow.value holds the most recently produced state; both reads and writes update it. If the current state is not Data (i.e., there is no readable value yet), actor.offer is called to request a read.
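The when above matches on four states. Pieced together from the excerpts in this article — a paraphrase for orientation, not the verbatim AndroidX source — they look roughly like this:

```kotlin
// Reconstructed sketch of SingleProcessDataStore's internal states,
// inferred from the code shown in this walkthrough.
private sealed class State<T>
private object UnInitialized : State<Any>()                               // nothing read yet
private class Data<T>(val value: T, val hashCode: Int) : State<T>()       // last read/written value
private class ReadException<T>(val readException: Throwable) : State<T>() // last read attempt failed
private class Final<T>(val finalException: Throwable) : State<T>()        // scope cancelled, terminal
```

Now let's see what actor is: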
```kotlin
private val actor = SimpleActor<Message<T>>(
    scope = scope,
    onComplete = {
        it?.let {
            downstreamFlow.value = Final(it)
        }
        // We expect it to always be non-null but we will leave the alternative as a no-op
        // just in case.
        synchronized(activeFilesLock) {
            activeFiles.remove(file.absolutePath)
        }
    },
    onUndeliveredElement = { msg, ex ->
        if (msg is Message.Update) {
            // TODO(rohitsat): should we instead use scope.ensureActive() to get the original
            //  cancellation cause? Should we instead have something like
            //  UndeliveredElementException?
            msg.ack.completeExceptionally(
                ex ?: CancellationException(
                    "DataStore scope was cancelled before updateData could complete"
                )
            )
        }
    }
) { msg ->
    when (msg) {
        is Message.Read -> {
            handleRead(msg)
        }
        is Message.Update -> {
            handleUpdate(msg)
        }
    }
}
```

SimpleActor's constructor looks like this:
```kotlin
internal class SimpleActor<T>(
    /**
     * The scope in which to consume messages.
     */
    private val scope: CoroutineScope,
    /**
     * Function that will be called when scope is cancelled. Should *not* throw exceptions.
     */
    onComplete: (Throwable?) -> Unit,
    /**
     * Function that will be called for each element when the scope is cancelled. Should *not*
     * throw exceptions.
     */
    onUndeliveredElement: (T, Throwable?) -> Unit,
    /**
     * Function that will be called once for each message.
     *
     * Must *not* throw an exception (other than CancellationException if scope is cancelled).
     */
    private val consumeMessage: suspend (T) -> Unit
) {
    ...
```

So the trailing lambda above (the `msg -> ...` part) becomes SimpleActor's consumeMessage property. Now look at its offer method:
```kotlin
fun offer(msg: T) {
    // should never return false bc the channel capacity is unlimited
    check(
        messageQueue.trySend(msg)
            .onClosed { throw it ?: ClosedSendChannelException("Channel was closed normally") }
            .isSuccess
    )

    // If the number of remaining messages was 0, there is no active consumer, since it quits
    // consuming once remaining messages hits 0. We must kick off a new consumer.
    if (remainingMessages.getAndIncrement() == 0) {
        scope.launch {
            // We shouldn't have started a new consumer unless there are remaining messages...
            check(remainingMessages.get() > 0)

            do {
                // We don't want to try to consume a new message unless we are still active.
                // If ensureActive throws, the scope is no longer active, so it doesn't
                // matter that we have remaining messages.
                scope.ensureActive()

                consumeMessage(messageQueue.receive())
            } while (remainingMessages.decrementAndGet() != 0)
        }
    }
}
```
Here consumeMessage is invoked with messages taken from messageQueue — the queue that trySend (inside the check above) filled with the Message.Read(currentDownStreamFlowState) we passed to actor.offer. Note also that all consumption happens inside scope.launch, where scope defaults to CoroutineScope(Dispatchers.IO + SupervisorJob()), so reads run asynchronously without touching the main thread.
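The queue-plus-counter trick is worth isolating. Below is a minimal, self-contained sketch of the same idea (MiniActor is a made-up name, not DataStore source): whichever caller bumps the counter off zero launches the single consumer coroutine, so messages are handled strictly one at a time without keeping a loop alive while the queue is empty.

```kotlin
import java.util.concurrent.atomic.AtomicInteger
import kotlinx.coroutines.*
import kotlinx.coroutines.channels.Channel

class MiniActor<T>(
    private val scope: CoroutineScope,
    private val process: suspend (T) -> Unit
) {
    private val queue = Channel<T>(capacity = Channel.UNLIMITED)
    private val remaining = AtomicInteger(0)

    fun offer(msg: T) {
        check(queue.trySend(msg).isSuccess) // unlimited capacity: cannot fail while open
        // Only the caller that moves the counter off zero launches a consumer,
        // so at most one coroutine is ever draining the queue.
        if (remaining.getAndIncrement() == 0) {
            scope.launch {
                do {
                    process(queue.receive())
                } while (remaining.decrementAndGet() != 0)
            }
        }
    }
}

fun main() = runBlocking {
    val actor = MiniActor<Int>(this) { println("handled $it") }
    repeat(3) { actor.offer(it) } // prints handled 0, 1, 2 — serialized, in order
    // runBlocking waits for the consumer coroutine launched in this scope.
}
```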
Back at the actor's construction: since our message is a Message.Read, it is dispatched to handleRead:
```kotlin
private suspend fun handleRead(read: Message.Read<T>) {
    when (val currentState = downstreamFlow.value) {
        is Data -> {
            // We already have data so just return...
        }
        is ReadException -> {
            if (currentState === read.lastState) {
                readAndInitOrPropagateFailure()
            }

            // Someone else beat us but also failed. The collector has already
            // been signalled so we don't need to do anything.
        }

        UnInitialized -> {
            readAndInitOrPropagateFailure()
        }
        is Final -> error("Can't read in final state.") // won't happen
    }
}
```

The identity check verifies that the failure we saw when the read was requested is still the current one; in the normal case the two references point at the same object. So whenever the state is neither Data nor Final (Final is the terminal state set once the DataStore's scope is cancelled), readAndInitOrPropagateFailure runs:
```kotlin
private suspend fun readAndInitOrPropagateFailure() {
    try {
        readAndInit()
    } catch (throwable: Throwable) {
        downstreamFlow.value = ReadException(throwable)
    }
}
```
which in turn calls readAndInit:
```kotlin
private suspend fun readAndInit() {
    // This should only be called if we don't already have cached data.
    check(downstreamFlow.value == UnInitialized || downstreamFlow.value is ReadException)

    val updateLock = Mutex()
    var initData = readDataOrHandleCorruption()

    var initializationComplete: Boolean = false

    // TODO(b/151635324): Consider using Context Element to throw an error on re-entrance.
    val api = object : InitializerApi<T> {
        override suspend fun updateData(transform: suspend (t: T) -> T): T {
            return updateLock.withLock() {
                if (initializationComplete) {
                    throw IllegalStateException(
                        "InitializerApi.updateData should not be " +
                            "called after initialization is complete."
                    )
                }

                val newData = transform(initData)
                if (newData != initData) {
                    writeData(newData)
                    initData = newData
                }

                initData
            }
        }
    }

    initTasks?.forEach { it(api) }
    initTasks = null // Init tasks have run successfully, we don't need them anymore.

    updateLock.withLock {
        initializationComplete = true
    }

    downstreamFlow.value = Data(initData, initData.hashCode())
}
```

readDataOrHandleCorruption reads the file and deserializes it into an instance of the entity class:
```kotlin
private suspend fun readDataOrHandleCorruption(): T {
    try {
        return readData()
    } catch (ex: CorruptionException) {
        val newData: T = corruptionHandler.handleCorruption(ex)

        try {
            writeData(newData)
        } catch (writeEx: IOException) {
            // If we fail to write the handled data, add the new exception as a suppressed
            // exception.
            ex.addSuppressed(writeEx)
            throw ex
        }

        // If we reach this point, we've successfully replaced the data on disk with newData.
        return newData
    }
}
```
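This is where the corruptionHandler parameter of dataStore comes into play: when readData throws a CorruptionException (which is why readFrom should wrap parse failures in one, as noted earlier), the handler produces replacement data that is immediately written back to disk. A hypothetical wiring (safeDemo2 is an illustrative name):

```kotlin
import android.content.Context
import androidx.datastore.dataStore
import androidx.datastore.core.handlers.ReplaceFileCorruptionHandler

val Context.safeDemo2 by dataStore(
    fileName = "demo2.proto",
    serializer = MyDemo2Serializer(Demo2.getDefaultInstance()),
    // On a corrupt file, fall back to defaults instead of failing every read.
    corruptionHandler = ReplaceFileCorruptionHandler { Demo2.getDefaultInstance() }
)
```

readData itself is straightforward: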
```kotlin
private suspend fun readData(): T {
    try {
        FileInputStream(file).use { stream ->
            return serializer.readFrom(stream)
        }
    } catch (ex: FileNotFoundException) {
        if (file.exists()) {
            throw ex
        }
        return serializer.defaultValue
    }
}
```
Here a FileInputStream is built from the file and handed to the Serializer we supplied earlier — MyDemo2Serializer — via its readFrom method. Recall that readFrom calls the generated Demo2's parseFrom:
```java
public static com.mph.review.bean.plain.Demo2 parseFrom(java.io.InputStream input)
    throws java.io.IOException {
  return com.google.protobuf.GeneratedMessageLite.parseFrom(
      DEFAULT_INSTANCE, input);
}
```
DEFAULT_INSTANCE is a Demo2 object created in a static initializer:
```java
private static final com.mph.review.bean.plain.Demo2 DEFAULT_INSTANCE;

static {
  Demo2 defaultInstance = new Demo2();
  // New instances are implicitly immutable so no need to make
  // immutable.
  DEFAULT_INSTANCE = defaultInstance;
  com.google.protobuf.GeneratedMessageLite.registerDefaultInstance(
      Demo2.class, defaultInstance);
}
```
Now look at GeneratedMessageLite's parseFrom:
```java
protected static <T extends GeneratedMessageLite<T, ?>> T parseFrom(
    T defaultInstance, InputStream input) throws InvalidProtocolBufferException {
  return checkMessageInitialized(
      parsePartialFrom(
          defaultInstance,
          CodedInputStream.newInstance(input),
          ExtensionRegistryLite.getEmptyRegistry()));
}
```

parsePartialFrom looks like this:
```java
static <T extends GeneratedMessageLite<T, ?>> T parsePartialFrom(
    T instance,
    CodedInputStream input,
    ExtensionRegistryLite extensionRegistry)
    throws InvalidProtocolBufferException {
  @SuppressWarnings("unchecked") // Guaranteed by protoc
  T result = (T) instance.dynamicMethod(MethodToInvoke.NEW_MUTABLE_INSTANCE);
  try {
    // TODO(yilunchong): Try to make input with type CodedInpuStream.ArrayDecoder use
    // fast path.
    Schema<T> schema = Protobuf.getInstance().schemaFor(result);
    schema.mergeFrom(result, CodedInputStreamReader.forCodedInput(input), extensionRegistry);
    schema.makeImmutable(result);
  } catch (IOException e) {
    if (e.getCause() instanceof InvalidProtocolBufferException) {
      throw (InvalidProtocolBufferException) e.getCause();
    }
    throw new InvalidProtocolBufferException(e.getMessage()).setUnfinishedMessage(result);
  } catch (RuntimeException e) {
    if (e.getCause() instanceof InvalidProtocolBufferException) {
      throw (InvalidProtocolBufferException) e.getCause();
    }
    throw e;
  }
  return result;
}
```

Since instance is a Demo2, look at Demo2's dynamicMethod:
```java
protected final java.lang.Object dynamicMethod(
    com.google.protobuf.GeneratedMessageLite.MethodToInvoke method,
    java.lang.Object arg0, java.lang.Object arg1) {
  switch (method) {
    case NEW_MUTABLE_INSTANCE: {
      return new com.mph.review.bean.plain.Demo2();
    }
    case NEW_BUILDER: {
      return new Builder();
    }
    ...
  }
}
```
This returns a brand-new Demo2. Since DEFAULT_INSTANCE is already an instance, that may look redundant, but DEFAULT_INSTANCE is static and shared — other call sites may hold it — so it must never be mutated. Once result is obtained, the rest of parsePartialFrom reads the persisted bytes from the input stream and populates result with them.
At this point readDataOrHandleCorruption is done and we hold the persisted data. Back in readAndInit, what follows is some optional processing; let's walk through it.
First an InitializerApi object named api is built, and every function in initTasks is run against it. Note that initTasks is consumed exactly once, on the very first read. Why? Because these tasks exist to migrate legacy data — they are a one-time hook for bringing old data up to date, as we'll see below.
initTasks is just initTasksList, which DataStoreFactory passed when constructing SingleProcessDataStore:
```kotlin
initTasksList = listOf(DataMigrationInitializer.getInitializer(migrations)),
```
migrations comes from DataStoreFactory.create and can be supplied via the produceMigrations parameter when we call dataStore; it is a list of DataMigration, an interface:
```kotlin
public interface DataMigration<T> {
    public suspend fun shouldMigrate(currentData: T): Boolean
    public suspend fun migrate(currentData: T): T
    public suspend fun cleanUp()
}
```

Now look at what DataMigrationInitializer.getInitializer returns:
```kotlin
fun <T> getInitializer(
    migrations: List<DataMigration<T>>
): suspend (api: InitializerApi<T>) -> Unit = { api ->
    runMigrations(migrations, api)
}
```

It returns a suspend function, so the `it` in the earlier initTasks forEach is this lambda, and it calls runMigrations:
```kotlin
private suspend fun <T> runMigrations(
    migrations: List<DataMigration<T>>,
    api: InitializerApi<T>
) {
    val cleanUps = mutableListOf<suspend () -> Unit>()

    api.updateData { startingData ->
        migrations.fold(startingData) { data, migration ->
            if (migration.shouldMigrate(data)) {
                cleanUps.add { migration.cleanUp() }
                migration.migrate(data)
            } else {
                data
            }
        }
    }

    var cleanUpFailure: Throwable? = null

    cleanUps.forEach { cleanUp ->
        try {
            cleanUp()
        } catch (exception: Throwable) {
            if (cleanUpFailure == null) {
                cleanUpFailure = exception
            } else {
                cleanUpFailure!!.addSuppressed(exception)
            }
        }
    }

    // If we encountered a failure on cleanup, throw it.
    cleanUpFailure?.let { throw it }
}
```

The api here is the InitializerApi built earlier. Its updateData is called with the `startingData -> ...` lambda as transform, which is then invoked with the initData we just read. Reading the code with that in mind: startingData is initData, i.e., the Demo2 loaded from disk. migrations.fold walks every DataMigration in order; each one is asked shouldMigrate, and if it answers true its migrate runs and the migrated data is passed on, otherwise the data passes through untouched. Because this is a fold, each DataMigration operates on the output of the previous one. Afterwards, cleanUp is invoked on every migration that ran; override it if you need teardown logic there.
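As a concrete illustration, here is a hypothetical DataMigration that back-fills a default bb the first time the store is read, wired in through the produceMigrations parameter (demo2Migration and migratedDemo2 are made-up names):

```kotlin
import android.content.Context
import androidx.datastore.dataStore
import androidx.datastore.core.DataMigration

val demo2Migration = object : DataMigration<Demo2> {
    // Run only while bb has never been populated.
    override suspend fun shouldMigrate(currentData: Demo2): Boolean =
        currentData.bb.isEmpty()

    override suspend fun migrate(currentData: Demo2): Demo2 =
        currentData.toBuilder().setBb("default").build()

    // Nothing to tear down in this sketch.
    override suspend fun cleanUp() {}
}

val Context.migratedDemo2 by dataStore(
    fileName = "demo2.proto",
    serializer = MyDemo2Serializer(Demo2.getDefaultInstance()),
    produceMigrations = { listOf(demo2Migration) }
)
```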
Back in readAndInit, the on-demand migration of legacy data is now complete, and downstreamFlow.value is finally assigned a Data wrapping the Demo2 we read together with its hashCode.
Now return to where SingleProcessDataStore's data property is built — the flow block — and continue on to emitAll: that part mainly drops stale states and rethrows errors. At this point we merely hold a Flow that will emit the data; collecting it is a separate step we'll leave aside here.
Now for writing.
A write goes through DataStore's updateData; here is SingleProcessDataStore's implementation:
```kotlin
override suspend fun updateData(transform: suspend (t: T) -> T): T {
    /**
     * The states here are the same as the states for reads. Additionally we send an ack that
     * the actor *must* respond to (even if it is cancelled).
     */
    val ack = CompletableDeferred<T>()
    val currentDownStreamFlowState = downstreamFlow.value

    val updateMsg =
        Message.Update(transform, ack, currentDownStreamFlowState, coroutineContext)

    actor.offer(updateMsg)

    return ack.await()
}
```

So it also calls actor.offer, only now with a Message.Update(transform, ack, currentDownStreamFlowState, coroutineContext), where transform is the user code that produces the new data:
```kotlin
{ currentSettings ->
    currentSettings.toBuilder()
        .setAa(currentSettings.aa + 1)
        .setBb(currentSettings.bb + "New")
        .build()
}
```
Following the same path as before, this time the message is routed handleUpdate() -> transformAndWrite() -> writeData():
```kotlin
internal suspend fun writeData(newData: T) {
    file.createParentDirectories()

    val scratchFile = File(file.absolutePath + SCRATCH_SUFFIX)
    try {
        FileOutputStream(scratchFile).use { stream ->
            serializer.writeTo(newData, UncloseableOutputStream(stream))
            stream.fd.sync()
            // TODO(b/151635324): fsync the directory, otherwise a badly timed crash could
            //  result in reverting to a previous state.
        }

        if (!scratchFile.renameTo(file)) {
            throw IOException(
                "Unable to rename $scratchFile." +
                    "This likely means that there are multiple instances of DataStore " +
                    "for this file. Ensure that you are only creating a single instance of " +
                    "datastore for this file."
            )
        }
    } catch (ex: IOException) {
        if (scratchFile.exists()) {
            scratchFile.delete() // Swallow failure to delete
        }
        throw ex
    }
}
```
Note how the bytes land in a scratch file first and are only then atomically renamed over the real file, so a badly timed crash cannot leave a half-written store behind. And just as with reading, the data itself goes through the serializer — this time via its writeTo method:
```kotlin
override suspend fun writeTo(t: Demo2, output: OutputStream) {
    t.writeTo(output)
}
```