Migrate to multiplatform SQLite library #6202

Open
FloEdelmann wants to merge 16 commits into master from sqlite-multiplatform

Conversation

@FloEdelmann
Member

Attempt to fix #5417.

As mentioned in #6119, the multiplatform Jetpack SQLite library is quite bare-bones, so a lot of stuff that was previously handled by the Android SQLite library now has to be implemented on our side.

@FloEdelmann FloEdelmann requested a review from westnordost April 8, 2025 20:28
@FloEdelmann FloEdelmann moved this to In Progress in iOS Port Apr 8, 2025
@FloEdelmann FloEdelmann self-assigned this Apr 8, 2025
@westnordost
Member

westnordost commented Apr 9, 2025 via email

@westnordost
Member

westnordost commented Apr 10, 2025 via email

Comment on lines +92 to +105
val statement = databaseConnection.prepareInsert(table, columnNames.toList(), conflictAlgorithm)
val result = ArrayList<Long>()
transaction {
    for (values in valuesList) {
        require(values.size == columnNames.size)
        statement.bindAll(values)
        val rowId = statement.executeInsert()
        result.add(rowId)
        statement.clearBindings()
        statement.reset()
    }
    statement.close()
}
return result
Member

@westnordost westnordost Apr 15, 2025

By the way, this can potentially be optimized (quite a bit?).

SQLite allows a syntax like this:

INSERT INTO my_table (column_a, column_b, column_c)
VALUES (1,2,'a'), (2,4,'b'), (3,1,'c'), (4,5,'d')

to insert 4 rows. I.e. no need for a transaction, no need for repeating the first line x times.

To be honest, I thought I had done this long ago, but apparently not.

https://sqlite.org/lang_insert.html

Member

@westnordost westnordost Apr 16, 2025

(But before blindly doing a premature optimization, it might be worth confirming in a unit test or similar that this would actually be faster. Maybe binding up to tens of thousands of parameters to a prepared statement with tens of thousands of placeholders comes with implications or limitations of its own. Also, it needn't be part of this PR.)
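
For reference, a minimal sketch of how such a batched insert statement could be assembled. The helper names `buildMultiRowInsert` and `batchRows` are hypothetical and not part of this PR. The default of 999 bound parameters per statement matches SQLite's default `SQLITE_MAX_VARIABLE_NUMBER` before version 3.32.0 (32766 since then); the limit of the library actually bundled would need to be checked.

```kotlin
/** Builds "INSERT INTO t (a, b) VALUES (?, ?), (?, ?)" with one placeholder group per row. */
fun buildMultiRowInsert(table: String, columnNames: List<String>, rowCount: Int): String {
    require(columnNames.isNotEmpty() && rowCount > 0)
    // One "(?, ?, ...)" group with a placeholder per column
    val row = columnNames.joinToString(", ", "(", ")") { "?" }
    val rows = List(rowCount) { row }.joinToString(", ")
    return "INSERT INTO $table (${columnNames.joinToString(", ")}) VALUES $rows"
}

/**
 * Splits rows into batches so that rows * columns never exceeds maxVariables.
 * maxVariables = 999 is an assumption, see the note above.
 */
fun <T> batchRows(
    rows: List<List<T>>,
    columnCount: Int,
    maxVariables: Int = 999
): List<List<List<T>>> {
    require(columnCount in 1..maxVariables)
    return rows.chunked(maxVariables / columnCount)
}
```

Each batch would then get its own prepared statement (or reuse one per distinct batch size), which is part of what a benchmark against the per-row loop would have to account for.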

@westnordost westnordost added the iOS necessary for iOS port label Apr 21, 2025
@westnordost westnordost mentioned this pull request May 4, 2025
Helium314 added a commit to Helium314/SCEE that referenced this pull request May 30, 2025
to make sure it doesn't conflict with streetcomplete#6202
performance boost mostly relevant for old phones that can't run maplibre 11
@westnordost
Member

Any news here? Or rather, do you intend to follow this through?

I imagine some transactions are not really necessary; strictly speaking, we only need transactions for data integrity. Otherwise, when piecing together data, a consistent result could maybe be guaranteed on the *Controller layer.
Generally, though, I believe the integrity of tables that use a foreign key to link to a key in another table should be guaranteed on the database layer. I am not sure whether queries that involve foreign keys to non-existing rows in another table (e.g. UNION on something or whatever) would already go haywire (i.e. cause an exception).

In case the above is not viable, anything that needs transactions to guarantee database integrity should go into one DAO, so for strongly interconnected tables, we have one DAO that actually manages multiple tables. Plus, of course, any transactions that involve multiple queries would probably need to be rewritten with Database.rawQuery (or Da.exec).

@westnordost westnordost moved this from In Progress to Todo in iOS Port Mar 3, 2026
@FloEdelmann FloEdelmann force-pushed the sqlite-multiplatform branch from f73aab6 to 2444d09 Compare March 16, 2026 22:56
@FloEdelmann
Member Author

FloEdelmann commented Mar 16, 2026

I finally got around to looking into this PR, and I think I figured it out 🙂

Executing BEGIN TRANSACTION; on its own is fine and works as expected (a transaction is started), but multiple concurrent database accesses can bring the transactions out of sync. The old Android database layer linearized concurrent access; I have now rebuilt something similar with ReentrantLock (to allow nested transactions).
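
A minimal sketch of the locking approach described above, assuming a JVM `ReentrantLock` and a hypothetical `execSql` callback standing in for the real connection. The class `SerializedDatabase` and its members are illustrative only and are not the code in this PR.

```kotlin
import java.util.concurrent.locks.ReentrantLock
import kotlin.concurrent.withLock

/** Serializes all access to one connection; execSql is a stand-in for the real driver call. */
class SerializedDatabase(private val execSql: (String) -> Unit) {
    private val lock = ReentrantLock()
    private var transactionDepth = 0 // only touched while holding the lock

    /** A single statement waits for any transaction running on another thread. */
    fun exec(sql: String) {
        lock.withLock { execSql(sql) }
    }

    /** Nested calls on the same thread re-enter the lock; only the outermost call issues SQL. */
    fun <T> transaction(block: () -> T): T = lock.withLock {
        val outermost = transactionDepth == 0
        if (outermost) execSql("BEGIN TRANSACTION")
        transactionDepth++
        var success = false
        try {
            val result = block()
            success = true
            result
        } finally {
            transactionDepth--
            if (outermost) execSql(if (success) "COMMIT" else "ROLLBACK")
        }
    }
}
```

Because the lock is reentrant, a DAO method that opens a transaction can be called from within another transaction on the same thread without deadlocking, while statements from other threads are held back until the outermost transaction has finished.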

I rebased and force-pushed to make reviewing easier; the first commit is basically the one you already reviewed, just based upon the current master branch (i.e. DB instantiation in AndroidModule instead of DbModule). The other commits are individual fixes.

@FloEdelmann FloEdelmann marked this pull request as ready for review March 16, 2026 23:09
@FloEdelmann FloEdelmann requested a review from westnordost March 16, 2026 23:09
Member

@westnordost westnordost left a comment

Oh?

Hm. I had already assumed that it would be necessary to either check whether we need transactions spanning multiple queries at all, or to put any transactions we might need into one SQL statement.

Let me try to wrap my head around this.

So if BEGIN TRANSACTION; executes just fine and opens a transaction, and we then, in code, submit more queries to SQLite like this:

db.startTransaction()
db.query(…)
db.query(…)
db.endTransaction()

then some other part of the program code might also call db.query() from a different thread and thus cause that query to be lumped together with the transaction that was started by a completely unrelated DAO (as the database runs concurrently to the program code, of course). Worse yet, it is entirely possible that another DAO explicitly starts another transaction, and at that point at the latest, this should result in an error. (You already encountered this error, IIRC.) That resulting behavior is what you describe as "transactions out of sync", right?

The obvious solution for me would be to do everything for one transaction in one SQL statement

db.query("""
BEGIN TRANSACTION;
…
…
COMMIT;
""")

And I would have expected the Android SQLite database layer to do that under the hood. But you checked this, and it actually does something else? Does it truly linearize all access to the database? That sounds... uh, that has horrible consequences for performance: any write operation will block any read operation? (Or already did in Android's SQLite database layer?)

Does the preparing of the statements also have to be in the synchronized section? (Or, for that matter, everything except statement.execute*)


Notes/thoughts to myself (can refactor later):

  • rename Database interface to DatabaseConnection

  • rename StreetCompleteDatabase to SQLiteDatabaseConnection (or similar); the upgrade + create logic should move out of this class (a SQLiteDatabaseConnection should be the result of a database (connection) initialization operation, e.g. DatabaseInitializer.initialize(sqliteConnection) : DatabaseConnection)

private var transactionDepth = 0

init {
    databaseConnection.execSQL("PRAGMA journal_mode=WAL")
Member

@westnordost westnordost Mar 17, 2026

A comment as to why this is necessary is necessary. Related page from the documentation:

https://www.sqlite.org/isolation.html

I read a bit about this.

Apparently, all intermediate changes made within one database transaction are actually visible to all readers on the same database connection. We use just one database connection. So the very thing that (we hoped) a transaction should do, namely guarantee data consistency at all times (e.g. there are no elements without geometry), is actually not guaranteed currently, unless this is prevented by the Android database driver in one way or another.

Some blog article claims that turning WAL mode on hides intermediate changes made within a transaction from readers on the same database connection, but the documentation does not support this. (On the contrary, it even says that when an UPDATE runs in parallel to a SELECT, the data returned by the SELECT may contain old data, new data or both, i.e. it is undefined.)

Do you have more info on this matter?

(This is kind of important. Because if transactions already don't do what we hoped they do, and it did not lead to obvious problems, maybe we don't need them at all, then. Or, at least, in places where we need to make guarantees about data consistency, e.g. every element has a geometry, guarantee it programmatically after the query is done)
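
As a rough sketch of what such a programmatic guarantee could look like: after both queries have returned, only elements that actually have a geometry are kept. The types `ElementKey`, `Element` and `Geometry` below are simplified stand-ins defined only for this example, and `pairWithGeometry` is a hypothetical helper, not existing app code.

```kotlin
// Simplified stand-ins for illustration, not the app's real model classes
data class ElementKey(val type: String, val id: Long)
data class Element(val key: ElementKey)
data class Geometry(val key: ElementKey)

/** Drops every element for which no geometry was returned, so callers never see one without. */
fun pairWithGeometry(
    elements: List<Element>,
    geometries: List<Geometry>
): List<Pair<Element, Geometry>> {
    val geometryByKey = geometries.associateBy { it.key }
    return elements.mapNotNull { element ->
        geometryByKey[element.key]?.let { geometry -> element to geometry }
    }
}
```

This only hides inconsistent rows from the caller; it does not repair them in the database.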

Member

@mnalis mnalis Mar 18, 2026

> We use just one database connection

Note that while I do not have Android experience specifically, generally with SQL databases if you use different (i.e. parallel) contexts / threads / etc. you should use different (multiple) database handles for each of them if you intend for their transactions to be separate.

ref: https://sqlite.org/isolation.html

Member

Actually, for proper encapsulation, we'd need to use one SQLite connection for every operation. E.g. getting OSM data by bounding box must be done with one SQLite connection, writing all those OSM elements and their geometries just downloaded to the database must be done with another SQLite connection. Otherwise, the getting of the bbox data may include partly updated and partly not-updated data.

However, there are a few question marks here:

  • The SQLite connections would need to be managed by the *Controller classes, because one operation might include updating several tables at once (see e.g. MapDataController), and these updates should be in the same transaction / same connection.

  • What would be the performance overhead for creating and closing a new sqlite connection for each operation? Is it greater than serializing all access to the database?

  • Using the standard Android SQLite API, which is of course also used under the hood at least when using androidx.sqlite:sqlite-bundled, is it actually possible to use more than one SQLite connection? The documentation sounds like SQLiteOpenHelper#getWritableDatabase() will always return the same connection

Member

And this is why the next step is to find out what is actually the status quo. I.e. what exactly does the current Android SQLite API do (because it works and doesn't create obvious problems). We then have the option to mirror that behavior, or do something else, in the hope of improving performance through parallelization.


override fun insert(
    table: String,
    values: Collection<Pair<String, Any?>>,
Member

(If we have to pick apart keys and values anyway, we might just as well make the interface consistent with insertMany and have columns and values as separate parameters. Same for update.)

Labels

iOS necessary for iOS port

Projects

Status: Todo

Development

Successfully merging this pull request may close these issues.

Implement Database interface on iOS

3 participants