N+1 query problem on api/sync #938

Closed
opened 2026-02-04 23:23:10 +03:00 by OVERLORD · 19 comments

Originally created by @raphaelcoutu on GitHub (Feb 18, 2021).

Subject of the issue

Syncing/loading ciphers with a remote PostgreSQL database can be really slow (~50-60 seconds) due to a very large number of SQL queries. The problem doesn't seem to affect users on SQLite or Docker/local SQL databases, since those have no/minimal round-trip latency. The browser extension could simply fail to sync.
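
Back-of-the-envelope: ~1000 sequential queries × 30-50 ms of round-trip latency each comes to roughly 30-50 seconds, which lines up with the observed sync times.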

Your environment

  • Install method: docker image
  • Clients used: Web or Browser extension
  • Bitwarden_rs version: v1.19.0
  • Web-vault version: v2.18.1
  • Running within Docker: true
  • Internet access: true
  • Uses a proxy: false
  • DNS Check: true
  • Time Check: true
  • Domain Configuration Check: true
  • HTTPS Check: false
  • Database type: PostgreSQL
  • Version of mysql/postgresql: 12.6

I have about 300 ciphers in my Bitwarden vault. My remote PostgreSQL server is about 30-50 ms away from my local Docker Bitwarden_rs setup (for testing purposes).

Steps to reproduce

  1. Create a Docker container with bitwarden_rs
  2. Import a Bitwarden vault (exported from the official self-hosted server) into bitwarden_rs
  3. Load the vault home page

I used different setups to identify the problem. I have no problem with local SQLite or a local Docker PostgreSQL.

I finally found the problem using this PostgreSQL logging configuration in postgresql.conf. A lot of queries were logged for a single sync (2400 lines, so I estimate about 1000 queries):

```
log_statement = 'all'
log_directory = 'log'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
logging_collector = on
```

Docker-compose used:

```yaml
version: '3'
services:
  # postgres:
  #   image: postgres:12
  #   restart: always
  #   volumes:
  #     - pgdata:/var/lib/postgresql/data
  #   environment:
  #     - POSTGRES_PASSWORD=admin
  #     - TZ=America/Montreal
  bitwarden:
    image: bitwardenrs/server
    restart: always
    volumes:
      - ./bw-data:/data
    ports:
      - 8080:80
      #- 3012:3012
    environment:
      - DOMAIN=http://localhost:8080
      - DATABASE_URL=postgresql://bitwardenrs:<<password>>@<<host>>:5432/bitwardenrs
      # (testing local) - DATABASE_URL=postgresql://postgres:admin@postgres
      - ENABLE_DB_WAL=false
      - SIGNUPS_ALLOWED=true
      #- WEBSOCKET_ENABLED=true
      - ADMIN_TOKEN=<<token>>
      - LOG_FILE=/data/bitwarden.log
      - LOG_LEVEL=warn
      - DATABASE_MAX_CONNS=25
volumes:
  pgdata:
```

Expected behaviour

Should load faster (up to ~5-10 seconds would be acceptable).

Actual behaviour

Really slow. (~1 min)

Relevant code blocks

(I don't usually code in Rust, but I found the place where we might start looking.) Is there a way to use query joins instead of making many requests?

https://github.com/dani-garcia/bitwarden_rs/blob/d69be7d03a0369faf1f6be6ed2cb908ec6b7a253/src/api/core/ciphers.rs#L86-L106

  • On api/core/ciphers.rs L95, each collection would generate a query for details.

https://github.com/dani-garcia/bitwarden_rs/blob/d69be7d03a0369faf1f6be6ed2cb908ec6b7a253/src/db/models/collection.rs#L60-L67

https://github.com/dani-garcia/bitwarden_rs/blob/d69be7d03a0369faf1f6be6ed2cb908ec6b7a253/src/db/models/collection.rs#L227-L247

  • On api/core/ciphers.rs L104, each cipher would generate a query for attachments, etc.

https://github.com/dani-garcia/bitwarden_rs/blob/d69be7d03a0369faf1f6be6ed2cb908ec6b7a253/src/db/models/cipher.rs#L82-L85
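
As a rough illustration of what batching could look like (this is not vaultwarden's actual code; the table and struct below are simplified, hypothetical versions of the real ones), diesel's `eq_any` can fetch all attachments for a vault in a single `SELECT ... WHERE cipher_uuid IN (...)` and group the rows in memory, instead of issuing one query per cipher:

```rust
// Hypothetical sketch only -- not vaultwarden's actual schema or code.
// Replaces N per-cipher attachment lookups with a single
// `SELECT ... WHERE cipher_uuid IN (...)`, grouped in memory afterwards.
#[macro_use]
extern crate diesel;

use std::collections::HashMap;

use diesel::prelude::*;
use diesel::sqlite::SqliteConnection;

table! {
    attachments (id) {
        id -> Text,
        cipher_uuid -> Text,
        file_name -> Text,
    }
}

#[derive(Queryable)]
pub struct Attachment {
    pub id: String,
    pub cipher_uuid: String,
    pub file_name: String,
}

/// One round trip for the whole vault instead of one per cipher.
pub fn attachments_by_cipher(
    conn: &SqliteConnection,
    cipher_uuids: &[String],
) -> QueryResult<HashMap<String, Vec<Attachment>>> {
    let rows = attachments::table
        .filter(attachments::cipher_uuid.eq_any(cipher_uuids))
        .load::<Attachment>(conn)?;

    // Group rows by cipher so the JSON-building code can do cheap
    // HashMap lookups instead of issuing further queries.
    let mut grouped: HashMap<String, Vec<Attachment>> = HashMap::new();
    for row in rows {
        grouped.entry(row.cipher_uuid.clone()).or_default().push(row);
    }
    Ok(grouped)
}
```

The same batching pattern (or a join) could presumably apply to the per-collection permission lookups as well.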

OVERLORD added the enhancement, low priority labels 2026-02-04 23:23:10 +03:00

@jjlin commented on GitHub (Feb 19, 2021):

You're welcome to investigate if there's a clean way to reduce the number of queries, but otherwise it's a relatively niche use case that I don't think the main contributors are willing to spend much time on.

@raphaelcoutu commented on GitHub (Feb 19, 2021):

Yes, sure, I'll try to investigate this on my own! I just wanted to flag this problem so people who are used to coding in Rust can help me find a cleaner way of doing these requests!

@dislazy commented on GitHub (May 18, 2021):

I also encountered the same problem. Could we consider introducing a caching mechanism?

@jjlin commented on GitHub (May 18, 2021):

Feel free to propose something, but I suspect that there won't be a clean solution to this that isn't effectively a rewrite of the project.

@BlackDex commented on GitHub (May 18, 2021):

Also, keep in mind that some people run multiple vaultwarden instances against one database, which makes internal caching a bit of a pain. You would then need to reach for tools like Redis or Memcached, which I doubt would be beneficial to the project.

@attie commented on GitHub (Nov 3, 2021):

I'm not sure if I'm seeing this effect too, or if it's unrelated... I've just migrated from the official Bitwarden server to Vaultwarden (docker, v1.23.0, SQLite database).

I have around 400 items in my vault, and the syncs are noticeably slower than before...

I will try to dig further when I get the chance.

@BlackDex commented on GitHub (Nov 3, 2021):

Keep in mind that it also depends on what kind of hardware it is running on, the network connection, etc. The Bitwarden Cloud server is probably running on multiple servers, and if not, it is probably large enough to handle a lot of load. Its network connection is probably big enough, too.

But any help with improving speed in any way would be great.
The best thing to check is the response time from the server, for example via the browser's developer console.
To be fair, make sure testing is always done in a clean incognito/private window, so that no previous cache is used, which would make the results look different.

@attie commented on GitHub (Nov 3, 2021):

Ah, apologies, these are both self-hosted: Bitwarden in Docker in a VM, Vaultwarden on native Docker (no VM), both on the same physical hardware.

@attie commented on GitHub (Nov 10, 2021):

Apologies for the noise here; it seems the performance issue I was seeing is fully addressed by moving from SQLite to MariaDB (not totally unexpected).

More info

Test setup:

  • NUC (i7 8559U), 32GB RAM, NVMe local storage, NFS remote storage (Gigabit network, ZFS)
  • ~400 items in my vault / 207KiB compact JSON
  • Querying /api/ciphers, measuring TTFB (~1.15 MiB payload)

Tests:

  • Vaultwarden, Docker, SQLite, remote storage, ~3.65s
  • Vaultwarden, Docker, local MariaDB service, local storage, ~200ms
  • Vaultwarden, Docker, local MariaDB service, remote storage, ~225ms

@bendem commented on GitHub (Dec 2, 2021):

I've investigated this a bit with some test data (200 collections, 600 ciphers), here is a rough estimate:

  • collection query: 1
    • to_json_details
      • is_writable_by_user: 200
        • find_by_user_and_org: 0-200
      • hide_passwords_for_user: 200
        • find_by_user_and_org: 0-200
  • cipher query: 1
    • to_json:
      • Attachment::find_by_cipher: 600
      • get_access_restrictions: 0-600
        • is_in_full_access_org: 0-600
      • get_collections: 600

That's between 1062 and 2122 SQL queries to prepare the full JSON blob, and that's not counting the queries for Folders, Policies, and Sends. The sync request takes about 6 s to load.

@lifeofguenter commented on GitHub (Jan 7, 2022):

@raphaelcoutu, @BlackDex: 30-50 ms of latency to the database is not a good thing in any case; usually you would want to keep it below 10 ms. However, as @bendem discovered, issuing 1-2k SQL queries for a single API call is not ideal either, and will definitely cause issues on larger collections no matter the hardware or connectivity. This is definitely not a niche issue, unless the majority of users have fewer than 50 passwords.

This can be avoided by making use of JOINs or SELECT .. IN (1, 2, 3, ...) or even just a SELECT by some parent-id.
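
For example, with diesel a single left join could replace the per-cipher attachment queries. Here is a minimal, self-contained sketch; the table and column names are hypothetical, not vaultwarden's actual schema:

```rust
// Hypothetical sketch: one LEFT JOIN replaces N per-cipher attachment
// queries. Table/column names are illustrative only.
#[macro_use]
extern crate diesel;

use diesel::prelude::*;
use diesel::sqlite::SqliteConnection;

table! {
    ciphers (uuid) {
        uuid -> Text,
        name -> Text,
    }
}

table! {
    attachments (id) {
        id -> Text,
        cipher_uuid -> Text,
        file_name -> Text,
    }
}

// Tell diesel how the two tables relate so they can be joined.
joinable!(attachments -> ciphers (cipher_uuid));
allow_tables_to_appear_in_same_query!(ciphers, attachments);

/// Every cipher with its (optional) attachment rows, in one round trip.
/// Rows load as ((uuid, name), Option<(id, cipher_uuid, file_name)>).
pub fn ciphers_with_attachments(
    conn: &SqliteConnection,
) -> QueryResult<Vec<((String, String), Option<(String, String, String)>)>> {
    ciphers::table.left_join(attachments::table).load(conn)
}
```

Mapping the joined rows back onto the existing nested Rust structs is the fiddly part, of course.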

@BlackDex commented on GitHub (Jan 7, 2022):

I can take a better look at the queries, but JOINs aren't that easy with all the different tables and Rust structs, etc.

@bendem commented on GitHub (Feb 1, 2022):

We indeed need an optimised code path for that route. The danger lies in duplicating all the (already quite complex) code that ensures users only see what they are authorised to see. That duplication would mean there is a chance that part of the code gets updated and the optimised query does not (or the other way around).

@bendem commented on GitHub (Feb 1, 2022):

I just tried looking into this, but I can't for the life of me figure out how to add diesel_logger (https://docs.rs/diesel_logger/latest/diesel_logger/). Where does DbConn get transformed into an actual connection? It's just macros over macros, and I'm not a Rust dev.

@BlackDex commented on GitHub (Feb 1, 2022):

It's relatively simple, actually (see the diff below). I didn't have time myself to look into how to optimize this, unfortunately.

Also, don't forget to set the log level to DEBUG, or you won't see the output.

```diff
diff --git a/Cargo.toml b/Cargo.toml
index 46a7ca0..0432bf5 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -65,6 +65,7 @@ fern = { version = "0.6.0", features = ["syslog-4"] }
 # A safe, extensible ORM and Query builder
 diesel = { version = "1.4.8", features = [ "chrono", "r2d2"] }
 diesel_migrations = "1.4.0"
+diesel_logger = "0.1.1"
 
 # Bundled SQLite
 libsqlite3-sys = { version = "0.22.2", features = ["bundled"], optional = true }
diff --git a/src/db/mod.rs b/src/db/mod.rs
index bcbb7ce..586f4f5 100644
--- a/src/db/mod.rs
+++ b/src/db/mod.rs
@@ -72,9 +72,9 @@ macro_rules! generate_connections {
 }
 
 generate_connections! {
-    sqlite: diesel::sqlite::SqliteConnection,
-    mysql: diesel::mysql::MysqlConnection,
-    postgresql: diesel::pg::PgConnection
+    sqlite: diesel_logger::LoggingConnection<diesel::sqlite::SqliteConnection>,
+    mysql: diesel_logger::LoggingConnection<diesel::mysql::MysqlConnection>,
+    postgresql: diesel_logger::LoggingConnection<diesel::pg::PgConnection>
 }
 
 impl DbConnType {
```

@bendem commented on GitHub (Feb 8, 2022):

Thanks, I'm sure it's easy if you know how macros work, that's why I asked :)

@BlackDex commented on GitHub (May 11, 2022):

Solved via #2429

@jtsang4 commented on GitHub (May 12, 2022):

@BlackDex Amazing work! The user experience is improved so much!

@bendem commented on GitHub (May 12, 2022):

Very nice, can't wait for the next release!
