mirror of
https://github.com/BookStackApp/BookStack.git
synced 2026-02-05 00:29:48 +03:00
Search Results are completely Irrelevant #2318
Closed · opened 2026-02-05 03:38:37 +03:00 by OVERLORD · 8 comments
Originally created by @vampirismtrueblood on GitHub (Jul 8, 2021).
Describe the bug
When searching for anything, even with the EXACT string, BookStack still displays completely wrong results. I will post screenshots of both my MediaWiki and BookStack to help show the difference between the algorithms.
Running version: BookStack v21.05.3 container (Yes, I added the fix that wraps the term like % $searchterm %; it makes absolutely no difference with or without it.)
My BookStack and MediaWiki are in absolute sync; both have the exact same 1790+ articles.
I ran php artisan bookstack:regenerate-search both BEFORE and AFTER applying % $searchterm %
Bookstack will only give correct results if:
Steps To Reproduce
Steps to reproduce the behavior:
Expected behavior
Get Pages with top matching score as first results
Screenshots

Test ONE
Bookstack (completely irrelevant results)
Mediawiki (Perfectly accurate)

Test TWO (Notice the sequence of keywords used vs the actual page name)

Bookstack (Completely irrelevant results)
Mediawiki (Perfectly accurate again)

Test THREE (Exact Title Match)

Bookstack (Completely irrelevant results)
Mediawiki (Perfect match)

Your Configuration (please complete the following information):
BookStack v21.05.3 container
Additional context
@vampirismtrueblood commented on GitHub (Jul 8, 2021):
If you would, please take a peek at MediaWiki's search algorithms. I love BookStack; your visual editor is state of the art with some CSS tweaks and goes far beyond MediaWiki's, but the search functionality is equally important. I'm happy to run tests anytime and give feedback as quickly as possible.
@vampirismtrueblood commented on GitHub (Jul 8, 2021):
So I couldn't wait for someone to get back to me. I looked into the code and realized it uses score weights based on the page's title, description and type (book, chapter, shelf, etc.).
I finally got the search to behave accurately, like MediaWiki, and here's my solution:
On the DB:
use bookstack
delete from search_terms;
Edit the file app/Entities/Tools/SearchIndex.php:
vim app/Entities/Tools/SearchIndex.php
Change the value from 5 to 200 on lines 34 and 52,
so that it looks like this in both the public and the private function:
$nameTerms = $this->generateTermArrayFromText($entity->name, 200 * $entity->searchFactor);
Finally, run this command to regenerate the index with more logical weights:
php artisan bookstack:regenerate-search
I'm sure there are better ways to do it, but this was the fastest and required the fewest changes. Hopefully this helps someone too.
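To illustrate why raising the title multiplier changes the ranking, here is a minimal sketch in Python rather than BookStack's PHP. It is a toy model, not the actual SearchIndex code: the function names, the sample pages, and the assumption that each body term scores 1 per occurrence are all hypothetical. Only the multipliers 5 and 200 come from the workaround above.

```python
from collections import Counter

def term_scores(text: str, weight: float) -> Counter:
    """Count each lowercased word in text and scale the counts by weight."""
    counts = Counter(text.lower().split())
    return Counter({term: n * weight for term, n in counts.items()})

def build_index(pages: dict, title_weight: float) -> dict:
    """Map each page title to its per-term scores (title terms are boosted)."""
    index = {}
    for title, body in pages.items():
        scores = term_scores(title, title_weight)
        # Counter.update adds to existing values instead of replacing them.
        # Assumption: body terms score a flat 1 per occurrence.
        scores.update(term_scores(body, 1))
        index[title] = scores
    return index

def search(index: dict, query: str) -> list:
    """Return matching page titles, highest total score first."""
    terms = query.lower().split()
    totals = {title: sum(scores[t] for t in terms) for title, scores in index.items()}
    matches = [title for title, total in totals.items() if total > 0]
    return sorted(matches, key=lambda title: totals[title], reverse=True)

# Hypothetical sample data: one page with an exact title match and one page
# that only mentions the query terms many times in its body.
PAGES = {
    "VPN Setup": "Install the client and connect.",
    "Office Network Notes": " ".join(["vpn"] * 10 + ["setup"] * 3),
}

low = search(build_index(PAGES, title_weight=5), "vpn setup")
high = search(build_index(PAGES, title_weight=200), "vpn setup")
print(low)   # the body-heavy page ranks first
print(high)  # the exact title match ranks first
```

In this model, with a multiplier of 5 the exact title match scores 10 while the page that repeats the terms in its body scores 13, so the title match drops to second place. With 200 the title match scores 400 and wins easily.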
@abulgatz commented on GitHub (Aug 6, 2021):
This greatly improves the search! @vampirismtrueblood can you explain what weights this changes and how you figured this out?
@bensulli commented on GitHub (Oct 12, 2021):
This has become an increasing issue as our wiki has grown with our company. We have a lot of pages now, but the search never seems to find or prioritize the results we expected. There are times where I'll search for exactly the title of a page and it won't show up in the results (or at least not in the first page).
Happy to provide any details to @ssddanbrown directly if it'll help, but can't post an example here due to confidentiality.
@ssddanbrown commented on GitHub (Oct 13, 2021):
Just to confirm my view on this, I'm well aware the search system needs some attention. Over the last few years I've spent some time attempting revamps of the system or exploring alternative options but failed each time. This was to address a wider set of issues (Such as translation handling). I've realised it would be more worthwhile to just improve upon what we have though so I do plan on soon spending some time during these next few months on improving a range of search elements.
@bensulli commented on GitHub (Oct 13, 2021):
Awesome to hear! Please don't hesitate to reach out if I can lend a hand with testing (not much of a coder, unfortunately, but I can deploy and test it against our real-world data). I've brought BookStack to three different organizations now because I sing its praises at every new job :)
@vampirismtrueblood commented on GitHub (Nov 13, 2021):
I looked into the search class as well as the DB, found the link between them and went from there. It's what I'm currently using. It's still not quite on par with MediaWiki, but it's very usable with the workaround I mentioned above (it's quick and short).
@ssddanbrown commented on GitHub (Nov 13, 2021):
I've now made a range of changes to the search indexing & scoring system as part of PR #3043.
Part of this was adjusting up the title score although I have not upped this as drastically as above (From 5 to 40, instead of the 200 above). I'd want to tread a bit more carefully while being cautious of how the changes will interact with other changes.
Some other bits in the PR that specifically address result scoring/ranking:
Hopefully the combination of these changes will make a significant difference. They will all be part of the next feature release, so I will close this issue off.
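As a rough way to compare the title multipliers mentioned in this thread (5, 40 and 200), the sketch below computes how many times a term must appear in a page body before that page outscores a single title match. It assumes, hypothetically, that each body occurrence scores a flat 1 and that scores simply add up. The function name is made up for illustration and BookStack's real scoring may differ.

```python
def body_hits_to_outrank_title(title_multiplier: int, body_score: int = 1) -> int:
    """Smallest number of body occurrences whose summed score exceeds one title hit."""
    return title_multiplier // body_score + 1

# Multipliers taken from the thread: the original value, the value adopted
# in the PR, and the value used in the workaround.
for multiplier in (5, 40, 200):
    needed = body_hits_to_outrank_title(multiplier)
    print(f"title x{multiplier}: body needs {needed} occurrences to outrank a title match")
```

Under these assumptions, a long page can overtake an exact title match fairly easily at 5, much less easily at 40, and practically never at 200, which is one way to read the more cautious middle value.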