本文描述了Android Content Provider实现时,sqlite数据库出现lock问题的大致原因和解决思路;
虽然最终可以归结为线程同步和资源竞争问题, 但是对于trouble shooting却并非如此简单.
[Symptom]
From the logcat:
08-22 22:32:30.499 475 1692 W SQLiteDatabase: database lock has not been available for 30 sec. Current Owner of the lock is 25. Continuing to wait in thread: 30
08-22 22:32:32.850 475 1836 W SQLiteDatabase: database lock has not been available for 30 sec. Current Owner of the lock is 25. Continuing to wait in thread: 31
08-22 22:32:44.000 475 1059 E SqliteCursor.cpp: Bailing on database busy retry
08-22 22:32:44.000 475 1059 E SQLiteQuery: exception: database is locked: retrycount exceeded; query: SELECT storage_id FROM files WHERE (_data=?)
08-22 22:32:44.000 475 1059 E DatabaseUtils: Writing exception to parcel
08-22 22:32:44.000 475 1059 E DatabaseUtils: android.database.sqlite.SQLiteDatabaseLockedException: database is locked: retrycount exceeded
08-22 22:32:44.000 475 1059 E DatabaseUtils: at android.database.sqlite.SQLiteQuery.nativeFillWindow(Native Method)
08-22 22:32:44.000 475 1059 E DatabaseUtils: at android.database.sqlite.SQLiteQuery.fillWindow(SQLiteQuery.java:86)
08-22 22:32:44.000 475 1059 E DatabaseUtils: at android.database.sqlite.SQLiteCursor.fillWindow(SQLiteCursor.java:164)
08-22 22:32:44.000 475 1059 E DatabaseUtils: at android.database.sqlite.SQLiteCursor.getCount(SQLiteCursor.java:156)
08-22 22:32:44.000 475 1059 E DatabaseUtils: at android.database.CursorToBulkCursorAdaptor.count(CursorToBulkCursorAdaptor.java:184)
08-22 22:32:44.000 475 1059 E DatabaseUtils: at android.content.ContentProviderNative.onTransact(ContentProviderNative.java:117)
08-22 22:32:44.000 475 1059 E DatabaseUtils: at android.os.Binder.execTransact(Binder.java:338)
08-22 22:32:44.000 475 1059 E DatabaseUtils: at dalvik.system.NativeStart.run(Native Method)
..........................................
08-22 22:33:36.420 475 1836 E SqliteCursor.cpp: Bailing on database busy retry
08-22 22:33:36.420 475 1836 E SQLiteQuery: exception: database is locked: retrycount exceeded; query: SELECT _id FROM files WHERE (_data=?)
08-22 22:33:36.430 475 1836 E DatabaseUtils: Writing exception to parcel
08-22 22:33:36.430 475 1836 E DatabaseUtils: android.database.sqlite.SQLiteDatabaseLockedException: database is locked: retrycount exceeded
08-22 22:33:36.430 475 1836 E DatabaseUtils: at android.database.sqlite.SQLiteQuery.nativeFillWindow(Native Method)
08-22 22:33:36.430 475 1836 E DatabaseUtils: at android.database.sqlite.SQLiteQuery.fillWindow(SQLiteQuery.java:86)
08-22 22:33:36.430 475 1836 E DatabaseUtils: at android.database.sqlite.SQLiteCursor.fillWindow(SQLiteCursor.java:164)
08-22 22:33:36.430 475 1836 E DatabaseUtils: at android.database.sqlite.SQLiteCursor.getCount(SQLiteCursor.java:156)
08-22 22:33:36.430 475 1836 E DatabaseUtils: at android.database.CursorToBulkCursorAdaptor.count(CursorToBulkCursorAdaptor.java:184)
08-22 22:33:36.430 475 1836 E DatabaseUtils: at android.content.ContentProviderNative.onTransact(ContentProviderNative.java:117)
08-22 22:33:36.430 475 1836 E DatabaseUtils: at android.os.Binder.execTransact(Binder.java:338)
08-22 22:33:36.430 475 1836 E DatabaseUtils: at dalvik.system.NativeStart.run(Native Method)
..........................................
08-22 22:34:06.639 475 1059 W SQLiteDatabase: database lock has not been available for 30 sec. Current Owner of the lock is 30. Continuing to wait in thread: 25
08-22 22:34:28.630 475 1692 E SqliteCursor.cpp: Bailing on database busy retry
08-22 22:34:28.630 475 1692 E SQLiteQuery: exception: database is locked: retrycount exceeded; query: SELECT _id FROM images
08-22 22:34:28.640 475 1692 E MediaScanner: SQLException in MediaScanner.scan()
08-22 22:34:28.640 475 1692 E MediaScanner: android.database.sqlite.SQLiteDatabaseLockedException: database is locked: retrycount exceeded
08-22 22:34:28.640 475 1692 E MediaScanner: at android.database.sqlite.SQLiteQuery.nativeFillWindow(Native Method)
08-22 22:34:28.640 475 1692 E MediaScanner: at android.database.sqlite.SQLiteQuery.fillWindow(SQLiteQuery.java:86)
08-22 22:34:28.640 475 1692 E MediaScanner: at android.database.sqlite.SQLiteCursor.fillWindow(SQLiteCursor.java:164)
08-22 22:34:28.640 475 1692 E MediaScanner: at android.database.sqlite.SQLiteCursor.getCount(SQLiteCursor.java:156)
08-22 22:34:28.640 475 1692 E MediaScanner: at android.media.MediaScanner.prescan(MediaScanner.java:1132)
08-22 22:34:28.640 475 1692 E MediaScanner: at android.media.MediaScanner.scanDirectories(MediaScanner.java:1315)
08-22 22:34:28.640 475 1692 E MediaScanner: at com.android.providers.media.MediaScannerService.scan(MediaScannerService.java:111)
08-22 22:34:28.640 475 1692 E MediaScanner: at com.android.providers.media.MediaScannerService.access$300(MediaScannerService.java:49)
08-22 22:34:28.640 475 1692 E MediaScanner: at com.android.providers.media.MediaScannerService$ServiceHandler.handleMessage(MediaScannerService.java:271)
08-22 22:34:28.640 475 1692 E MediaScanner: at android.os.Handler.dispatchMessage(Handler.java:99)
08-22 22:34:28.640 475 1692 E MediaScanner: at android.os.Looper.loop(Looper.java:137)
08-22 22:34:28.640 475 1692 E MediaScanner: at com.android.providers.media.MediaScannerService.run(MediaScannerService.java:193)
08-22 22:34:28.640 475 1692 E MediaScanner: at java.lang.Thread.run(Thread.java:856)
08-22 22:34:28.640 475 1692 D MediaProvider: [MediaScannerService] broadcast ACTION_MEDIA_SCANNER_FINISHED with file:///mnt/sdcard
..........................................
08-22 22:34:58.719 475 1692 W SQLiteDatabase: database lock has not been available for 30 sec. Current Owner of the lock is 25. Continuing to wait in thread: 30
08-22 22:34:58.969 475 1836 W SQLiteDatabase: database lock has not been available for 30 sec. Current Owner of the lock is 25. Continuing to wait in thread: 31
..................
08-22 22:35:22.369 475 1059 E SqliteCursor.cpp: Bailing on database busy retry
08-22 22:35:22.369 475 1059 E SQLiteQuery: exception: database is locked: retrycount exceeded; query: SELECT _id FROM files WHERE (_data=?)
08-22 22:35:22.369 475 1059 E DatabaseUtils: Writing exception to parcel
08-22 22:35:22.369 475 1059 E DatabaseUtils: android.database.sqlite.SQLiteDatabaseLockedException: database is locked: retrycount exceeded
08-22 22:35:22.369 475 1059 E DatabaseUtils: at android.database.sqlite.SQLiteQuery.nativeFillWindow(Native Method)
08-22 22:35:22.369 475 1059 E DatabaseUtils: at android.database.sqlite.SQLiteQuery.fillWindow(SQLiteQuery.java:86)
08-22 22:35:22.369 475 1059 E DatabaseUtils: at android.database.sqlite.SQLiteCursor.fillWindow(SQLiteCursor.java:164)
08-22 22:35:22.369 475 1059 E DatabaseUtils: at android.database.sqlite.SQLiteCursor.getCount(SQLiteCursor.java:156)
08-22 22:35:22.369 475 1059 E DatabaseUtils: at android.database.CursorToBulkCursorAdaptor.count(CursorToBulkCursorAdaptor.java:184)
08-22 22:35:22.369 475 1059 E DatabaseUtils: at android.content.ContentProviderNative.onTransact(ContentProviderNative.java:117)
08-22 22:35:22.369 475 1059 E DatabaseUtils: at android.os.Binder.execTransact(Binder.java:338)
08-22 22:35:22.369 475 1059 E DatabaseUtils: at dalvik.system.NativeStart.run(Native Method)
..........................................
08-22 22:35:52.380 475 1692 W SQLiteDatabase: database lock has not been available for 30 sec. Current Owner of the lock is 31. Continuing to wait in thread: 30
08-22 22:35:52.949 475 486 W SQLiteDatabase: database lock has not been available for 30 sec. Current Owner of the lock is 31. Continuing to wait in thread: 19
..........................................
08-22 22:36:15.209 475 1836 E SqliteCursor.cpp: Bailing on database busy retry
08-22 22:36:15.209 475 1836 E SQLiteQuery: exception: database is locked: retrycount exceeded; query: SELECT _id, date_modified, _data, artist_id FROM audio ORDER BY _id
08-22 22:36:15.209 475 1836 E DatabaseUtils: Writing exception to parcel
08-22 22:36:15.209 475 1836 E DatabaseUtils: android.database.sqlite.SQLiteDatabaseLockedException: database is locked: retrycount exceeded
08-22 22:36:15.209 475 1836 E DatabaseUtils: at android.database.sqlite.SQLiteQuery.nativeFillWindow(Native Method)
08-22 22:36:15.209 475 1836 E DatabaseUtils: at android.database.sqlite.SQLiteQuery.fillWindow(SQLiteQuery.java:86)
08-22 22:36:15.209 475 1836 E DatabaseUtils: at android.database.sqlite.SQLiteCursor.fillWindow(SQLiteCursor.java:164)
08-22 22:36:15.209 475 1836 E DatabaseUtils: at android.database.sqlite.SQLiteCursor.getCount(SQLiteCursor.java:156)
08-22 22:36:15.209 475 1836 E DatabaseUtils: at android.database.CursorToBulkCursorAdaptor.count(CursorToBulkCursorAdaptor.java:184)
08-22 22:36:15.209 475 1836 E DatabaseUtils: at android.content.ContentProviderNative.onTransact(ContentProviderNative.java:117)
08-22 22:36:15.209 475 1836 E DatabaseUtils: at android.os.Binder.execTransact(Binder.java:338)
08-22 22:36:15.209 475 1836 E DatabaseUtils: at dalvik.system.NativeStart.run(Native Method)
..........................................
08-22 22:36:15.229 2036 2049 W dalvikvm: threadid=11: thread exiting with uncaught exception (group=0x40adb9f0)
08-22 22:36:15.229 2036 2049 E test : Exception
08-22 22:36:15.229 2036 2049 E AndroidRuntime: FATAL EXCEPTION: pool-1-thread-1
08-22 22:36:15.229 2036 2049 E AndroidRuntime: android.database.sqlite.SQLiteException: database is locked: retrycount exceeded
08-22 22:36:15.229 2036 2049 E AndroidRuntime: at android.database.DatabaseUtils.readExceptionFromParcel(DatabaseUtils.java:180)
08-22 22:36:15.229 2036 2049 E AndroidRuntime: at android.database.DatabaseUtils.readExceptionFromParcel(DatabaseUtils.java:136)
08-22 22:36:15.229 2036 2049 E AndroidRuntime: at android.content.ContentProviderProxy.query(ContentProviderNative.java:358)
08-22 22:36:15.229 2036 2049 E AndroidRuntime: at android.content.ContentResolver.query(ContentResolver.java:311)
08-22 22:36:15.229 2036 2049 E AndroidRuntime: at com.xxxx.mediaextra.provider.MediaExtraUpdater.getAudioInfoListFromMediaDb(MediaExtraUpdater.java:629)
08-22 22:36:15.229 2036 2049 E AndroidRuntime: at com.xxxx.mediaextra.provider.MediaExtraUpdater.access$500(MediaExtraUpdater.java:61)
08-22 22:36:15.229 2036 2049 E AndroidRuntime: at com.xxxx.mediaextra.provider.MediaExtraUpdater$2.run(MediaExtraUpdater.java:200)
08-22 22:36:15.229 2036 2049 E AndroidRuntime: at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1076)
08-22 22:36:15.229 2036 2049 E AndroidRuntime: at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:569)
08-22 22:36:15.229 2036 2049 E AndroidRuntime: at java.lang.Thread.run(Thread.java:856)
08-22 22:36:15.249 258 593 W ActivityManager: sendCrashtoMTBF!!
[Reason]
1. The Provider-use side
The App with pid 2049 uses ContentResolver proxy to query Audio Info and gets an exception string in the reply parcel from the Provider-Impl side.
The following line output by pid 2049
08-22 22:36:15.229 2036 2049 E AndroidRuntime: android.database.sqlite.SQLiteException: database is locked: retrycount exceeded, is the EXCEPTION interpretion string.
The call realtion throw the exception.
query(...) => readExceptionFromParcel(String reply) => readExceptionFromParcel(Parcel reply, String msg, int code)
326 public Cursor ContentResolver::query(Uri url, String[] projection, String selection,
327 String[] selectionArgs, String sortOrder) throws RemoteException {
328 BulkCursorToCursorAdaptor adaptor = new BulkCursorToCursorAdaptor();
..............
353 data.writeString(sortOrder);
354 data.writeStrongBinder(adaptor.getObserver().asBinder());
355
356 mRemote.transact(IContentProvider.QUERY_TRANSACTION, data, reply, 0);
357
358 DatabaseUtils.readExceptionFromParcel(reply);
359
..............
}
163 private static final void DatabaseUtils::readExceptionFromParcel(Parcel reply, String msg, int code) {
164 switch (code) {
165 case 2:
166 throw new IllegalArgumentException(msg);
167 case 3:
168 throw new UnsupportedOperationException(msg);
169 case 4:
170 throw new SQLiteAbortException(msg);
171 case 5:
172 throw new SQLiteConstraintException(msg);
173 case 6:
174 throw new SQLiteDatabaseCorruptException(msg);
175 case 7:
176 throw new SQLiteFullException(msg);
177 case 8:
178 throw new SQLiteDiskIOException(msg);
179 case 9:
180 throw new SQLiteException(msg);
181 default:
182 reply.readException(code, msg);
183 }
184 }
2. The Provider-Implemention side.
The Binder Thread #x series are 1692, 1836, 1059 and 486. Worker thread with a small pid value 486 may be one main worker thread.
When one thread wants to modify the database, it acquires and holds a lock on the SQLiteDatabase.
The first two-line like log is printed in SQLiteDatabase::[email protected], when the binder worker threads do while-try to get lock on the SQLiteDatabase::mLock.
407 private static final long LOCK_WAIT_PERIOD = 30L;
408 private void SQLiteDatabase::lock(String sql, boolean forced) {
409 // make sure this method is NOT being called from a 'synchronized' method
410 if (Thread.holdsLock(this)) {
411 Log.w(TAG, "don't lock() while in a synchronized method");
412 }
413 verifyDbIsOpen();
414 if (!forced && !mLockingEnabled) return;
415 boolean done = false;
416 long timeStart = SystemClock.uptimeMillis();
417 while (!done) {
418 try {
419 // try to acquire the lock within 30sec
420 done = mLock.tryLock(LOCK_WAIT_PERIOD, TimeUnit.SECONDS);
421 if (!done) {
422 // lock not acquired in NSec. print a message and stacktrace saying the lock
423 // has not been available for 30sec.
424 Log.w(TAG, "database lock has not been available for " + LOCK_WAIT_PERIOD +
425 " sec. Current Owner of the lock is " + mLock.getOwnerDescription() +
426 ". Continuing to wait in thread: " + Thread.currentThread().getId());
427 }
428 } catch (InterruptedException e) {
429 // ignore the interruption
430 }
431 }
432 if (SQLiteDebug.DEBUG_LOCK_TIME_TRACKING) {
433 if (mLock.getHoldCount() == 1) {
434 // Use elapsed real-time since the CPU may sleep when waiting for IO
435 mLockAcquiredWallTime = SystemClock.elapsedRealtime();
436 mLockAcquiredThreadTime = Debug.threadCpuTimeNanos();
437 }
438 }
439 if (sql != null) {
440 if (ENABLE_DB_SAMPLE) {
441 logTimeStat(sql, timeStart, GET_LOCK_LOG_PREFIX);
442 }
443 }
444 }
The claim of Bailing on database busy is output in nativeFillWindow@android_database_SQLiteQuery.cpp
If one process tries to access a sqlite file that is locked by another process, the default behavior is to return SQLITE_BUSY.
And if the table in the database is locked by another process, SQLITE_LOCKED is returned.
22 public static final int SQLITE_BUSY = 5; /* The database file is locked */
23 public static final int SQLITE_LOCKED = 6; /* A table in the database is locked */
The code of nativeFillWindow@android_database_SQLiteQuery.cpp list as follows.
38 static jint nativeFillWindow(JNIEnv* env, jclass clazz, jint databasePtr,
39 jint statementPtr, jint windowPtr, jint startPos, jint offsetParam) {
40 sqlite3* database = reinterpret_cast<sqlite3*>(databasePtr);
41 sqlite3_stmt* statement = reinterpret_cast<sqlite3_stmt*>(statementPtr);
42 CursorWindow* window = reinterpret_cast<CursorWindow*>(windowPtr);
59 // We assume numRows is initially 0.
60 LOG_WINDOW("Window: numRows = %d, size = %d, freeSpace = %d",
61 window->getNumRows(), window->size(), window->freeSpace());
62
63 int numColumns = sqlite3_column_count(statement);
64 status_t status = window->setNumColumns(numColumns);
65 if (status) {
66 LOGE("Failed to change column count from %d to %d", window->getNumColumns(), numColumns);
67 jniThrowException(env, "java/lang/IllegalStateException", "numColumns mismatch");
68 return 0;
69 }
70
71 int retryCount = 0;
72 int totalRows = 0;
73 int addedRows = 0;
74 bool windowFull = false;
75 bool gotException = false;
76 const bool countAllRows = (startPos == 0); // when startPos is 0, we count all rows
77 while (!gotException && (!windowFull || countAllRows)) {
78 int err = sqlite3_step(statement);
79 if (err == SQLITE_ROW) {
80 LOG_WINDOW("Stepped statement %p to row %d", statement, totalRows);
} else if (err == SQLITE_DONE) {
182 // All rows processed, bail
183 LOG_WINDOW("Processed all rows");
184 break;
185 } else if (err == SQLITE_LOCKED || err == SQLITE_BUSY) {
186 // The table is locked, retry
187 LOG_WINDOW("Database locked, retrying");
188 if (retryCount > 50) {
189 LOGE("Bailing on database busy retry");
190 throw_sqlite3_exception(env, database, "retrycount exceeded");
191 gotException = true;
192 } else {
193 // Sleep to give the thread holding the lock a chance to finish
194 usleep(1000);
195 retryCount++;
196 }
197 } else {
198 throw_sqlite3_exception(env, database);
199 gotException = true;
200 }
201 }
202
203 LOG_WINDOW("Resetting statement %p after fetching %d rows and adding %d rows"
204 "to the window in %d bytes",
205 statement, totalRows, addedRows, window->size() - window->freeSpace());
206 sqlite3_reset(statement);
207
208 // Report the total number of rows on request.
209 if (startPos > totalRows) {
210 LOGE("startPos %d > actual rows %d", startPos, totalRows);
211 }
212 return countAllRows ? totalRows : 0;
213}
When the binder worker thread succeed in acquiring the SQLiteDatabase::mLock in while-try loop, it uses jni nativeFillWindow() via sqlite api to query sqlite and fill the CursorWindow;
If the table or the file is locked by another process, it makes a 50-times try. If retrycount exceeded, the thread Bails on database busy and exception string is passed to the client.
[Conclusion]
So, the reason is that the SQLiteDatabase high-level lock doesn't well protect the database access, and threads get into sqlite-level lock problem.
It is a sqlite access competition issue.
Why the thread has acquired SQLiteDatabase::mLock BUT encounters to sqlite-level's lcok problem.
There may be three reasons,
a. The threads use different SQLiteDatabase instance(should be singleton), so not the sole SQLiteDatabase::mLock.
b. Some thread forgets closing the sqlite and releasing sqlite lock when it leaves.
c. Some thread is in a long time wasted sqlite batch transaction.
For one table or file-db, it is recommended to use one SQLiteDatabase instance which uses its lock to protect the db.
And pay attention to the comment "// make sure this method is NOT being called from a 'synchronized' method" in SQLiteDatabase::lock().
Pls check com.xxxx.ppppp.provider.MediaExtraUpdater's code;
For further investigating, com.xxxx.metadatacleanup source code is needed.