Multitab tests #9212

Draft
jcoglan wants to merge 5 commits into apache:master from neighbourhoodie:multitab-tests

@jcoglan jcoglan commented Mar 27, 2026

Overview

This PR adds a testing setup for multi-tab situations, and includes a test that
reproduces a reported bug (see linked issues and PRs) when the indexeddb
adapter operates in multi-tab settings.

I have a theory of why multi-tab interactions are broken on indexeddb, and I
include a couple of possible fixes for it. The theory goes like this:

  • When you open a PouchDB instance on this adapter in tab A, it calls
    indexedDB.open(dbname, v1) where v1 is the database version. In PouchDB
    the version is basically the current unix time.

  • When another tab B opens the same database it calls
    indexedDB.open(dbname, v2) where v2 > v1. This triggers onversionchange in
    tab A since another client has opened the database at a higher version.

  • PouchDB handles onversionchange by doing indexedDB.open(dbname) with no
    version, i.e. it opens the database at whatever version is already open. This
    is to prevent multiple tabs interacting with the database from thrashing
    open() calls as they keep invalidating each other's connections with higher
    version numbers. (A comment in the code refers to this as an "infinite loop"
    which I think is inaccurate; calling open() only happens if you interact
    with the database, so doesn't passively cause an infinite loop, but it does
    cause thrashing if multiple tabs interact with the database.)

  • Aside: within a single-tab context, IndexedDB connections are cached by
    name, so only one connection per database is needed. This problem therefore
    does not happen in single-tab contexts, even if multiple PouchDB instances
    are active.

  • The other ingredient of this problem is that the adapter caches a "metadata"
    object containing doc_count and seq in memory when the database is first
    opened. This data is stored in the meta object store and only updated by the
    bulkDocs() function. All other operations read this value from the in-memory
    cache. The seq value must be up to date as each doc update in the database
    must have a unique seq value, and this is enforced by a unique index in
    IndexedDB. The error that occurs in the multi-tab case is IndexedDB reporting
    a violation of this constraint.

Therefore, the following sequence of events is possible:

  1. Tab A calls indexedDB.open(name, v1)
  2. Tab B calls indexedDB.open(name, v2) where v2 > v1
  3. This causes tab A to call indexedDB.open(name)

The last open() does not trigger onversionchange in tab B, and so there are
now two active connections to the same IndexedDB database. And both of these
PouchDB instances have an in-memory cache of the { doc_count, seq } metadata.
When one of these clients updates a doc, the cached metadata in the other tab
becomes stale, and this causes a seq conflict when it next tries to update a
doc. It also causes the result of info() and other methods to be incorrect.
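
The stale-cache failure described above can be reproduced with a small in-memory model. This is only a sketch: `FakeDocStore` and `FakeTab` are hypothetical stand-ins for an IndexedDB object store with a unique seq index and for a PouchDB instance in one tab, not the adapter's real code.

```javascript
// Simulation only: stands in for an object store with a unique index on seq
// plus the persisted "meta" record. Not PouchDB or IndexedDB code.
class FakeDocStore {
  constructor() {
    this.bySeq = new Map();               // seq -> doc, models the unique index
    this.meta = { doc_count: 0, seq: 0 }; // models the persisted metadata
  }

  put(seq, doc) {
    if (this.bySeq.has(seq)) {
      throw new Error(`ConstraintError: seq ${seq} already exists`);
    }
    this.bySeq.set(seq, doc);
  }
}

// Hypothetical model of one tab: it copies the metadata once, when it opens.
class FakeTab {
  constructor(store) {
    this.store = store;
    this.cachedMeta = { ...store.meta }; // read once, never refreshed
  }

  write(doc) {
    const seq = this.cachedMeta.seq + 1; // next seq is derived from the cache
    this.store.put(seq, doc);
    this.cachedMeta.seq = seq;
    this.cachedMeta.doc_count += 1;
    this.store.meta = { ...this.cachedMeta }; // persist the updated metadata
    return seq;
  }
}

const store = new FakeDocStore();
const tabA = new FakeTab(store);
const tabB = new FakeTab(store); // both tabs now cache { doc_count: 0, seq: 0 }

tabA.write({ _id: 'a' }); // succeeds with seq 1; tab B's cache is now stale

let conflict = null;
try {
  tabB.write({ _id: 'b' }); // tab B also picks seq 1
} catch (err) {
  conflict = err.message;
}
console.log(conflict); // logs "ConstraintError: seq 1 already exists"
```

In this model tab B's cached `doc_count` is also wrong after tab A's write, which corresponds to info() returning incorrect results.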

Why does PouchDB use versions in this way?

  • IndexedDB puts an integer version on all databases.
  • You're only allowed to change the definitions of object stores and indexes
    when the database is first opened, and only when it is opened with a higher
    version than it was last opened at.
  • PouchDB uses native IndexedDB indexes to represent Mango indexes so it needs
    to make sure these are up to date when the database is opened.
  • Since you can't read anything from the database before opening it, PouchDB
    uses the current time as the version to give itself the opportunity to define
    any necessary indexes.
  • When you call createIndex(), this also bumps the database's version,
    triggering onversionchange in other tabs and forcing them to re-open the
    database with the right indexes defined.
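
As a rough illustration of the versioning scheme above, the following sketch opens a database at a timestamp version and defines any missing indexes during the upgrade. It is not the adapter's implementation: the helper name, the `storeName` and `indexNames` parameters, and the `id` key path are assumptions made for the example. `idbFactory` is expected to be the browser's `indexedDB` object.

```javascript
// Sketch only: helper name, parameters and key path are hypothetical.
function openAtTimestampVersion(idbFactory, dbName, storeName, indexNames, now = Date.now) {
  return new Promise((resolve, reject) => {
    // Passing a fresh, ever-increasing version makes `upgradeneeded` fire,
    // which is the only place object stores and indexes may be changed.
    const request = idbFactory.open(dbName, now());

    request.onupgradeneeded = (event) => {
      const db = event.target.result;
      const store = db.objectStoreNames.contains(storeName)
        ? event.target.transaction.objectStore(storeName)
        : db.createObjectStore(storeName, { keyPath: 'id' });

      for (const name of indexNames) {
        if (!store.indexNames.contains(name)) {
          store.createIndex(name, name); // index keyed on the field of the same name
        }
      }
    };

    request.onsuccess = (event) => {
      const db = event.target.result;
      // Another connection asked for a higher version. Closing lets its
      // upgrade proceed; re-opening afterwards is not shown here.
      db.onversionchange = () => db.close();
      resolve(db);
    };

    request.onerror = () => reject(request.error);
  });
}
```

In a browser this might be called as `openAtTimestampVersion(indexedDB, 'mydb', 'docs', ['name'])`, where the database, store and index names are placeholders.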

Possible fixes I am considering:

  1. Always give the version when re-opening the database, i.e. remove the
    mechanism that calls open() without a version number when it's triggered
    via onversionchange, and always pass in the current time instead. This
    changes nothing for single-tab contexts; they can continue to cache
    metadata. However, it means any interaction with the database will cause
    onversionchange in other tabs and force them to call open() and reload
    the metadata again. So in multi-tab contexts, this causes open() thrashing.

  2. Stop caching metadata. If multiple connections can coexist without
    triggering onversionchange on each other, then caching metadata in memory
    is simply wrong as there's no way to detect that it's stale. (You could
    handle conflicts on seq when updating docs but this still allows info()
    to return stale data.) This means metadata would always be re-read from the
    meta store for all interactions, even for single tabs, which could have
    performance implications.

I have not been able to see any measurable difference in the performance tests
as a result of removing metadata caching. To me, option 1 seems strictly worse
than 2 in the multi-tab case, since re-opening the connection also entails
re-reading the metadata, so in effect it removes metadata caching when open()
thrashing occurs. Option 2's possible performance downside in single-tab
situations does not seem measurable so I'm inclined to prefer this solution.
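
Option 2 can be illustrated with a small in-memory model in which every operation re-reads the persisted metadata instead of trusting a per-tab copy. This is a simulation under stated assumptions, not the code in this PR: `SharedStore`, `writeDoc` and `info` are hypothetical names.

```javascript
// Simulation only: models a store with a unique seq index and a persisted
// metadata record shared by every connection.
class SharedStore {
  constructor() {
    this.bySeq = new Map();
    this.meta = { doc_count: 0, seq: 0 };
  }
}

// Hypothetical writer that never keeps doc_count or seq between calls.
function writeDoc(store, doc) {
  // In real IndexedDB this read and the writes below would share one
  // readwrite transaction, and readwrite transactions with overlapping
  // scope run one after another, so the read cannot go stale mid-write.
  const meta = { ...store.meta };
  const seq = meta.seq + 1;
  if (store.bySeq.has(seq)) {
    throw new Error(`ConstraintError: seq ${seq} already exists`);
  }
  store.bySeq.set(seq, doc);
  store.meta = { doc_count: meta.doc_count + 1, seq };
  return seq;
}

// Hypothetical info(): also reads the persisted metadata on every call.
function info(store) {
  return { doc_count: store.meta.doc_count, update_seq: store.meta.seq };
}

const shared = new SharedStore();
const seqFromTabA = writeDoc(shared, { _id: 'a' }); // as if issued by tab A
const seqFromTabB = writeDoc(shared, { _id: 'b' }); // as if issued by tab B
console.log(seqFromTabA, seqFromTabB, info(shared));
// logs: 1 2 { doc_count: 2, update_seq: 2 }
```

The cost is one extra read of the meta record per operation, which is the possible single-tab overhead discussed above.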

Implementations of both solutions are available in the last 2 commits, so
whichever one we decide to go with, the other can be removed from the history.

I have learned one thing from implementing solution 2: bulkDocs() needs the
metadata in memory, because it uses idb_attachment_format to decide things
inside preProcessAttachments(), which is async and so cannot be done inside an
IndexedDB transaction. We pre-process attachments and then start a transaction
to write the docs and updated metadata.

I have solved this by keeping the metadata returned by setup() in a var called
cachedMeta and passing this in to bulkDocs(), while everything else looks
the metadata up dynamically. bulkDocs() also loads it dynamically inside the
transaction to get up to date doc_count and seq. This makes the distinction
between metadata and cachedMeta a bit confusing and not very revealing of
why any of this is being done.

To fix this, we could consider a migration to separate "static" metadata like
the UUID and idb_attachment_format from "dynamic" things like doc_count and
seq which change on doc writes. The former could be cached in memory on
open() without causing race conditions, while the latter needs to be read from
IDB all the time to make sure it's not stale.
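
One way to picture that separation is the sketch below. It is only an illustration of the idea, not a proposed migration: `splitMeta` is a hypothetical helper, the `uuid` key name is assumed, and the sample values are made up.

```javascript
// Keys that change on every doc write and therefore must not be cached.
const DYNAMIC_KEYS = new Set(['doc_count', 'seq']);

// Hypothetical helper: splits one metadata record into a cacheable part
// and a part that has to be re-read from IndexedDB each time.
function splitMeta(meta) {
  const staticMeta = {};
  const dynamicMeta = {};
  for (const [key, value] of Object.entries(meta)) {
    (DYNAMIC_KEYS.has(key) ? dynamicMeta : staticMeta)[key] = value;
  }
  // Freezing guards the cached part against accidental mutation.
  return { staticMeta: Object.freeze(staticMeta), dynamicMeta };
}

const { staticMeta, dynamicMeta } = splitMeta({
  uuid: 'example-uuid',            // assumed key name, made-up value
  idb_attachment_format: 'binary', // example value
  doc_count: 3,
  seq: 7
});

console.log(staticMeta);  // safe to keep in memory after open()
console.log(dynamicMeta); // snapshot only; goes stale after any write
```

With such a split, bulkDocs() could take idb_attachment_format from the cached static part before its transaction starts, and read doc_count and seq inside the transaction.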

Testing recommendations

Related Issues or Pull Requests

Checklist

  • I am not a bot
  • This is my own work, I did not use AI, LLM's or similar technology for code or docs generation
  • Code is written and works correctly
  • Changes are covered by tests
  • Documentation changes were made in the docs folder

@jcoglan jcoglan force-pushed the multitab-tests branch 5 times, most recently from 6c9a6e6 to b85a439 Compare March 27, 2026 16:45
@jcoglan jcoglan force-pushed the multitab-tests branch 4 times, most recently from e6d05b1 to 50a59f6 Compare March 27, 2026 17:17
@alxndrsn alxndrsn left a comment

This looks great! 👍

Not sure that the changes to README.md are relevant.

  "emit": "readonly",
- "PouchDB": "readonly"
+ "PouchDB": "readonly",
+ "__pouch__": "readonly"

Contributor

Could window.__pouch__ be used instead of adding an extra constant here?

Contributor Author

Omitting window. from the test code was a readability win to me but I don't have a strong preference either way. Happy to change this if it's a big issue.

TESTING.md Outdated
> **Always set the `COUCH_HOST` env var to a proper CouchDB for the integration tests!**
> [!WARNING]
> VERY IMPORTANT **Always set the `COUCH_HOST` env var to a proper CouchDB for
> the integration tests!**

Contributor

Are these changes related to the new tests?

Member

81b0f1d is a separate whitespace commit

Contributor Author

This commit only changes whitespace in TESTING.md; my editor hard-wraps text/markdown files so they're readable in terminal/editors, and these formatting changes are in their own commit, separate from any content changes.

Member

to clarify, I am generally fine with cleanup commits that make things easier as long as they are separate.

@alxndrsn do you prefer these to be separate PRs?

return page.evaluate(fn);
}

_getPage(id) {

Contributor

"page" means "tab"? Maybe helpful to use consistent language?

Member

page is the playwright API language and it is consistent with that. Of course we can change it, but not sure if worth it?

Contributor Author

I was unsure about this. Everyone discussing these issues uses "tab" but Playwright uses "page". Strictly speaking, a "tab" is a particular UI for operating multiple pages, but it's not the only one. Colloquially I think "multi-tab" communicates "several pages operating concurrently" better than "multi-page" which to me sounds like a multi-step sequential workflow.
