Cyrillic is displayed as garbage on old and new pages #4342

Closed
opened 2026-02-05 08:35:51 +03:00 by OVERLORD · 11 comments

Originally created by @gentoo-root on GitHub (Nov 29, 2023).

Describe the Bug

A while ago, everything worked properly. Now all pages with Cyrillic text display garbage, as if the wrong encoding were being used:

image: https://github.com/BookStackApp/BookStack/assets/862550/0bae21a8-72af-4384-b941-827a114da485

Only the page titles are OK. The previews in the page list are OK for old pages, but not OK for the new ones.

I checked that the database still uses utf8mb4. Old pages are readable when SELECTed directly with SQL (but not from BookStack), while new pages are already stored in the database as garbage like this.

Arch Linux (both server and client)
MariaDB 11.1.3
BookStack 23.10.4
Firefox 119.0

Steps to Reproduce

  1. Open any old page with Cyrillic text.

OR:

  1. Create a new page and put Cyrillic text there.

Expected Behaviour

The text should be readable.

Screenshots or Additional Context

No response

Browser Details

Firefox 119.0 (64-bit, Arch Linux)

Exact BookStack Version

v23.10.4

OVERLORD added the 🐛 Bug label 2026-02-05 08:35:51 +03:00

@Preycon commented on GitHub (Dec 1, 2023):

Hello, the developer helped me on Discord. The culprit is libxml; on Arch the error is fixed by downgrading to a previous version. In my case it was:

sudo pacman -U libxml2-2.11.5-1-x86_64.pkg.tar.zst

With this, everything is working again. In case you still see some random garbage glyphs, clear the cache and edit/save the affected previews.


@ssddanbrown commented on GitHub (Dec 1, 2023):

Copying my message(s) from Discord:

Okay, I've spent a good few hours playing in an Arch environment, testing things and attempting to validate my thoughts by building PHP and libxml from source.
I've been unsuccessful in validating things though, due to my inexperience with Arch and with building these libs, and the time consumed by the slow speed of my VM.

That said, I'm pretty sure it's due to an upstream change in libxml, related to this thread:
https://gitlab.gnome.org/GNOME/libxml2/-/issues/637
The change was made in 2.12, which entered Arch a couple of weeks back.

It's now reverted as per that thread, but it might be a while before that makes it into a release and trickles down to Arch and your PHP.

Okay, I built an older libxml package and installed it via pacman, that seemed to fix the issue on my test system.
To save some work, if using an x86_64 system, I've uploaded the built package here:
https://user.fm/files/v2-ac3a5382574b64895fa53c171a7440bf/libxml2-2.11.5-1-x86_64.pkg.tar.zst
Download that, then run pacman -U libxml2-2.11.5-1-x86_64.pkg.tar.zst to install it and downgrade libxml.

DO AT YOUR OWN RISK though. I'm not being malicious but I have very little experience & knowledge with Arch, plus it's always dangerous to trust internet strangers.

If you prefer to go to the Arch sources, there's a PKGBUILD for the repo and tag linked here:
https://gitlab.archlinux.org/archlinux/packaging/packages/libxml2/-/tree/2.11.5-1?ref_type=tags


@fxthomas commented on GitHub (Dec 1, 2023):

Thank you so much for the pointer and for your quick tests, I can confirm this fixes everything on my system!

Arch has an archive of previous package versions. I used libxml2-2.11.5-1-x86_64 (https://archive.archlinux.org/repos/2023/11/15/core/os/x86_64/libxml2-2.11.5-1-x86_64.pkg.tar.zst) and its signature (https://archive.archlinux.org/repos/2023/11/15/core/os/x86_64/libxml2-2.11.5-1-x86_64.pkg.tar.zst.sig).
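
For anyone who wants to script the same downgrade, here is a minimal sketch. It assumes the archive URL layout seen in the links above; the function names `archive_pkg_url` and `downgrade_libxml2` are made up for illustration, and nothing is downloaded or installed until you call `downgrade_libxml2` yourself as root.

```shell
# Hypothetical helper: build an Arch Linux Archive URL for a package file.
# The layout (repos/YYYY/MM/DD/<repo>/os/<arch>/<file>) is taken from the links above.
archive_pkg_url() {
    # $1 = date as YYYY/MM/DD, $2 = repo, $3 = arch, $4 = package file name
    printf 'https://archive.archlinux.org/repos/%s/%s/os/%s/%s\n' "$1" "$2" "$3" "$4"
}

# Hypothetical helper: download the package and its detached signature,
# verify the signature against the pacman keyring, then install the package.
# Each step only runs if the previous one succeeded.
downgrade_libxml2() {
    pkg='libxml2-2.11.5-1-x86_64.pkg.tar.zst'
    url=$(archive_pkg_url 2023/11/15 core x86_64 "$pkg")
    curl -fLO "$url" &&
    curl -fLO "$url.sig" &&
    pacman-key --verify "$pkg.sig" &&
    pacman -U "$pkg"
}

# Usage (as root): downgrade_libxml2
```

Note that the next full system upgrade will pull the newer libxml2 back in unless the package is held, for example with `IgnorePkg = libxml2` in /etc/pacman.conf. That hold should be removed again once a fixed release reaches the repos.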

@gentoo-root commented on GitHub (Dec 2, 2023):

Thanks a lot for the workaround! I confirm that downgrading libxml to 2.11.5 worked for me too. I used my local package cache to get the old version.

What are the next steps? Is it a new bug in libxml that they need to fix, or is it some misuse of libxml APIs that needs to be fixed on BookStack side?


@fxthomas commented on GitHub (Dec 2, 2023):

> What are the next steps? Is it a new bug in libxml that they need to fix, or is it some misuse of libxml APIs that needs to be fixed on BookStack side?

My understanding is that the libxml developers are rolling back the change in an upcoming point release until they have a better implementation, so there is nothing to fix on the BookStack side.


@ssddanbrown commented on GitHub (Dec 2, 2023):

@gentoo-root this is something that has already been reverted in libxml, as per this thread (https://gitlab.gnome.org/GNOME/libxml2/-/issues/637). The way BookStack uses libxml could potentially be considered hacky, but it's a pretty common method in PHP to force UTF-8 parsing, and I'm not sure of a better option (I recently reviewed this and doubled down on the approach in use here, due to other PHP deprecations).

@gentoo-root commented on GitHub (Dec 2, 2023):

Understood; waiting for the next update of libxml then. Thank you for stepping in and debugging it!


@ssddanbrown commented on GitHub (Dec 2, 2023):

Happy I could help and find the cause.
I'll therefore close this off but if anything changes and potential action is needed on the BookStack side of things feel free to comment still.


@C0rn3j commented on GitHub (Dec 25, 2023):

So supposedly 2.12.2 should have brought the old behavior back.

I am on 2.12.3 on Arch Linux, but still experiencing this or something akin to it.

image: https://github.com/BookStackApp/BookStack/assets/1641362/07b7a49b-baab-4b3f-a150-68130ca4c01a

This is not Cyrillic but the Czech alphabet; it is supposed to say Bábovka.
https://rys.pw/books/recipes/page/recipes#bkmrk-b%C3%83%C2%A1bovka

I've rebuilt BookStack with the new libraries (which also clears cache) but this did not help the situation.

Did the Cyrillic issue go away with 2.12.2+ for people in this thread, meaning I am experiencing something else?
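
The `%C3%83%C2%A1` in the anchor above is the byte pattern you get when the UTF-8 bytes of "á" are read as ISO-8859-1 and then encoded to UTF-8 a second time. A small sketch to reproduce it, assuming `iconv` and `od` are available; the function names `double_encode` and `to_hex` are made up for illustration, and this only shows the byte pattern, not where in the stack the wrong decoding happens:

```shell
# Hypothetical helper: treat stdin as ISO-8859-1 and re-encode it as UTF-8.
# Feeding it text that is already UTF-8 produces double-encoded garbage.
double_encode() {
    iconv -f ISO-8859-1 -t UTF-8
}

# Hypothetical helper: print stdin as a single lowercase hex string.
to_hex() {
    od -An -tx1 -v | tr -d ' \n'
}

# "á" is the two bytes C3 A1 in UTF-8, written here as octal escapes so the
# example does not depend on the encoding of the script file itself.
printf '\303\241' | double_encode | to_hex
echo
```

The four bytes printed (c3 83 c2 a1) match the percent-encoded sequence in the anchor, which is consistent with UTF-8 content being decoded as a single-byte encoding somewhere along the way.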


@ssddanbrown commented on GitHub (Dec 25, 2023):

@C0rn3j Seems to be working for me, but having trouble getting into a non-working state to re-test the change from non-working to working.

My output, and running of BookStack for testing:

```
[root@bsarchtest bookstack]# php -i | grep libxml
libxml Version => 2.12.3
libxml
libxml2 Version => 2.12.3
[root@bsarchtest bookstack]# php artisan serve --host=0.0.0.0 --port=80
```
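
To run the same check on another machine, here is a rough sketch. It assumes `php -i` prints a `libxml2 Version => X.Y.Z` line as in the output above, and it treats 2.12.0 and 2.12.1 as the suspect range purely on the basis of this thread (the change landed in 2.12 and the old behaviour was reportedly restored in 2.12.2). The function names are made up for illustration.

```shell
# Hypothetical helper: read `php -i` style text on stdin and print the
# libxml2 version PHP reports (the first "libxml2 Version => ..." line).
php_libxml_version() {
    sed -n 's/^libxml2 Version => //p' | head -n 1
}

# Hypothetical helper: succeed if the given version falls in the range this
# thread associates with the regression. The range is an assumption drawn
# from the comments here, not from the libxml2 changelog.
libxml_in_suspect_range() {
    case "$1" in
        2.12.0|2.12.1) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   v=$(php -i | php_libxml_version)
#   if libxml_in_suspect_range "$v"; then echo "libxml2 $v may be affected"; fi
```

A version outside that range, such as the 2.12.3 shown above, does not prove the page content is fine; pages saved while an affected version was installed may still need to be re-saved.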

@C0rn3j commented on GitHub (Dec 25, 2023):

> Seems to be working for me, but having trouble getting into a non-working state to re-test the change from non-working to working.

If getting the environment to run was the problem (rather than actually triggering the bug within it), you should be able to downgrade the entire system to, say, 2023-11-14 (2.12 shipped on Nov 16) using the Arch archive (https://wiki.archlinux.org/title/Arch_Linux_Archive#How_to_restore_all_packages_to_a_specific_date).

It would then be best to install (https://wiki.archlinux.org/title/Arch_User_Repository#Installing_and_upgrading_packages) BookStack from AUR/bookstack (https://aur.archlinux.org/packages/bookstack), which I maintain, as that is probably how the majority of Arch users run BookStack.

I will open a new issue then, as it seems I shouldn't be suffering from this one.
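
For the date-based downgrade mentioned above, a minimal sketch of the mirrorlist entry involved. It assumes the per-date repository layout of the Arch Linux Archive; the function name `archive_mirror_line` is made up for illustration.

```shell
# Hypothetical helper: print a mirrorlist line that points pacman at the
# Arch Linux Archive snapshot for a given date (YYYY/MM/DD).
# $repo and $arch are left literal on purpose; pacman expands them itself.
archive_mirror_line() {
    printf 'Server=https://archive.archlinux.org/repos/%s/$repo/os/$arch\n' "$1"
}

# Snapshot from two days before 2.12 reached the repos, per the comment above.
archive_mirror_line 2023/11/14
```

On a throwaway test system, that line would become the only entry in /etc/pacman.d/mirrorlist (after backing up the original), followed by `pacman -Syyuu` to move every package to the snapshot, as described on the wiki page linked above.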
Reference: starred/BookStack#4342