Malformed UTF-8 characters after upgrade #4344

Closed
opened 2026-02-05 08:36:57 +03:00 by OVERLORD · 1 comment
Owner

Originally created by @fxthomas on GitHub (Dec 1, 2023).

Attempted Debugging

  • I have read the debugging page

Searched GitHub Issues

  • I have searched GitHub for the issue.

Describe the Scenario

Hello,

I realize this might be difficult to reproduce, but I'm hoping for some pointers to where things might go wrong so I can investigate.

I'm having issues after a recent system/BookStack update in the last month. The problem is that I didn't notice it right away, and I now have no idea what could have changed. I had been running BookStack just fine since last year with zero issues, and I haven't touched its configuration since the installation.

Basically, this is a new page with some nice UTF-8 characters:

image

This is what happens when saving:

image

This is what happens when clicking "Edit" again:

image

The body text is apparently saved incorrectly in the database (with different values for the HTML and plain text versions!), but the title itself is OK:

MariaDB [bookstack]> select name, html, text from pages order by pages.created_at desc limit 1;
+-----------+-----------------------------------------------------------------------------+--------------------------+
| name      | html                                                                        | text                     |
+-----------+-----------------------------------------------------------------------------+--------------------------+
| Æ Œ ★     | <p id="bkmrk-test-hello-%C3%86">Æ Œ â
</p>
<p id="bkmrk-%C2%A0"></p>      | à ŠâÂ

           |
+-----------+-----------------------------------------------------------------------------+--------------------------+
1 row in set (0.001 sec)
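
One possible mechanism, stated as an assumption rather than anything confirmed in this thread: the garbled values resemble what appears when UTF-8 bytes are decoded as Latin-1 (ISO-8859-1) somewhere between the application and the database. The Python sketch below reproduces that pattern for the title string. The function name `misdecode_as_latin1` is hypothetical, and none of this is BookStack code.

```python
# Sketch: reproduce mojibake by decoding UTF-8 bytes as Latin-1.
# Assumption: the corruption seen above follows this pattern; it is not a confirmed diagnosis.

def misdecode_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then wrongly decode those bytes as Latin-1."""
    return text.encode("utf-8").decode("latin-1")


original = "Æ Œ ★"
garbled = misdecode_as_latin1(original)

# Each multi-byte character turns into several Latin-1 characters,
# some of them invisible C1 control characters (U+0080 to U+009F).
print(repr(garbled))

# The damage is reversible as long as no bytes were dropped along the way.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired == original)
```

The last byte of ★ becomes U+0085 (NEL), which some terminals treat as a line break. That may be why the table row above is split across several lines.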

Previous pages that were not modified have correct UTF-8 characters, both in the db and when viewed in the web interface.

Any idea of what's happening and where I could have messed up?

Exact BookStack Version

v23.10.4

Log Content

No response

Hosting Environment

PHP 8.2 on Arch Linux, up to date at the time of writing
MariaDB 11.2.2-MariaDB, utf8mb4_unicode_ci collation / character set

OVERLORD added the 🐕 Support label 2026-02-05 08:36:57 +03:00
Author
Owner

@ssddanbrown commented on GitHub (Dec 1, 2023):

Hi @fxthomas,
Please see my comment here: https://github.com/BookStackApp/BookStack/issues/4701#issuecomment-1835906884

I'm going to close this as a duplicate of #4701. Feel free to comment on that issue if needed.

Reference: starred/BookStack#4344