mirror of
https://github.com/BookStackApp/BookStack.git
synced 2026-02-05 00:29:48 +03:00
Feature request: Pandoc integration #2043
Closed
opened 2026-02-05 02:42:02 +03:00 by OVERLORD
·
10 comments
🔨 Feature Request
Reference: starred/BookStack#2043
Originally created by @maggie44 on GitHub (Jan 16, 2021).
Hi @ssddanbrown,
I was thinking of Pandoc integration as an optional module. It would add some efficiencies to the various exports by keeping the assets separate as discussed above (and potentially resolve some other outstanding issues), and would also provide a bunch of additional options, such as EPUB (#1949), Word doc, video export support (#883; #2412) and more.
Here are a few shortcuts to try it out:
apt-get install pandoc or brew install pandoc should do the trick (if installing in a docker container, may need to install build-essential and/or curl).
test.md
Execute the command:
pandoc test.md -o example2.html --extract-media ./assets
More info relating to this originally discussed in: https://github.com/BookStackApp/BookStack/issues/2412
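For anyone wanting to script the command above rather than type it, here is a minimal sketch in Python. It only wraps the same pandoc call (test.md in, example2.html out, media extracted to ./assets). The function names build_pandoc_command and convert are made up for illustration and are not part of BookStack or Pandoc.

```python
import shutil
import subprocess


def build_pandoc_command(source: str, output: str, media_dir: str) -> list[str]:
    """Return the argument list for the pandoc call shown above."""
    return ["pandoc", source, "-o", output, "--extract-media", media_dir]


def convert(source: str, output: str, media_dir: str = "./assets") -> bool:
    """Run pandoc if it is installed; return True when the conversion ran."""
    if shutil.which("pandoc") is None:
        # pandoc is not on PATH, so there is nothing to run.
        return False
    # check=True raises CalledProcessError if pandoc exits with an error.
    subprocess.run(build_pandoc_command(source, output, media_dir), check=True)
    return True
```

Calling convert("test.md", "example2.html") should leave the HTML file next to an assets directory holding any extracted media, assuming pandoc is installed and test.md exists.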
@maggie44 commented on GitHub (Jan 16, 2021):
@ssddanbrown in response to the last comment over in https://github.com/BookStackApp/BookStack/issues/2412, indeed, these ostensibly simple things often get more complex very quickly.
In terms of workflow, after giving it some thought, perhaps a similar integration to WKHTMLTOPDF. The user installs Pandoc manually, using the Pandoc docs for their environment (apt-get install pandoc, for example, on Ubuntu), then adds a PANDOC=True variable to the .env file so that BookStack doesn't have any responsibility for the Pandoc install.
When PANDOC=True there could be some new fields in the export dropdown menu: EPUB; HTML Archive (or something more logically named instead of HTML Archive).
Hopefully then passing the same content being pulled for the current export features to Pandoc on the system locally, followed by a return of the output to download.
By using the same method as WKHTMLTOPDF, it isn't as mission-critical to maintain and allows for some dev experimentation. Similarly, I'd only add EPUB and HTML Archive rather than replace the current PDF and HTML export processes, as I'm certainly not confident enough in it to recommend that off the bat.
I realise a lot of this is preaching to the choir, but seems you have plenty of tickets and things on your plate, so figure the more thought/detail given to a feature request and the use case considered before making the request the better.
Big thanks for the work on this, it is going to become quite a central part of our EdTech COVID response work.
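To make the proposed opt-in concrete, here is a rough sketch in Python of the flag check described in this comment. Everything in it is hypothetical: BookStack has no PANDOC setting, the format labels are illustrative, and BookStack itself is written in PHP, so this only models the logic (flag set and pandoc found on the system means the extra menu entries appear).

```python
import os
import shutil

# Illustrative labels only; these are not BookStack's real export menu entries.
BUILT_IN_EXPORT_FORMATS = ["HTML", "PDF"]
PANDOC_EXPORT_FORMATS = ["EPUB", "HTML Archive"]


def pandoc_enabled(env, which=shutil.which) -> bool:
    """True when the hypothetical PANDOC flag is set and pandoc is on PATH."""
    flag = env.get("PANDOC", "").strip().lower()
    return flag in {"true", "1", "yes"} and which("pandoc") is not None


def export_menu(env=os.environ, which=shutil.which) -> list[str]:
    """Return the export options, adding the Pandoc ones only when enabled."""
    formats = list(BUILT_IN_EXPORT_FORMATS)
    if pandoc_enabled(env, which):
        formats += PANDOC_EXPORT_FORMATS
    return formats
```

Checking for the binary as well as the flag keeps the install entirely the user's responsibility: if pandoc is missing, the menu simply stays as it is today.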
@maggie44 commented on GitHub (Jan 24, 2021):
After further thought, how about simplifying this down to allowing the original Markdown that BookStack uses to be exported? If included in the API, this would allow us to use third-party processing of exported data (like Pandoc) without the extra support burden.
@ssddanbrown commented on GitHub (Jan 24, 2021):
Hi @maggie0002,
If you're using the Markdown editor to edit pages, the pages API should already provide the stored Markdown content (pages.show endpoint).
@maggie44 commented on GitHub (Jan 25, 2021):
Whoops, sorry, I thought it defaulted to Markdown. I meant an API endpoint to export the WYSIWYG content as is, rather than converting it first to HTML or PDF. I don't see that in the API docs.
@ssddanbrown commented on GitHub (Jan 26, 2021):
That (pages => read) endpoint should give you the HTML that's used when viewing a page. This is pretty much the same as the HTML loaded in the WYSIWYG editor but with a pass to remove some potentially dangerous elements.
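As a rough sketch of reading that endpoint from a script, the Python below builds the request and pulls out the two content fields. It assumes the API path /api/pages/{id}, an Authorization header of the form "Token id:secret", and html and markdown fields in the response, based on my reading of the BookStack API docs; check your instance's API docs before relying on any of these. The function names are made up for illustration.

```python
import json
import urllib.request


def build_page_request(base_url: str, page_id: int,
                       token_id: str, token_secret: str) -> urllib.request.Request:
    """Build a GET request for a single page (assumed path and header format)."""
    url = f"{base_url.rstrip('/')}/api/pages/{page_id}"
    headers = {"Authorization": f"Token {token_id}:{token_secret}"}
    return urllib.request.Request(url, headers=headers)


def fetch_page_content(request: urllib.request.Request) -> dict:
    """Send the request and return the page's content fields (assumed names)."""
    with urllib.request.urlopen(request) as response:
        page = json.load(response)
    # "markdown" is expected to be empty for pages written in the WYSIWYG editor.
    return {"html": page.get("html", ""), "markdown": page.get("markdown", "")}
```

The html value could then be written to a file and handed to Pandoc or another converter outside BookStack.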
@maggie44 commented on GitHub (Jan 26, 2021):
Helpful, and interesting, thanks. My understanding then is that the only difference is that the export -> html function takes the same HTML seen in the pages -> read endpoint and passes it to a processor that embeds pictures etc. into a single HTML file. The API output comes without headers, which presumably is what the HTML processor takes care of (among other things).
Will experiment with that endpoint and report back anything useful.
@maggie44 commented on GitHub (Jan 26, 2021):
Didn't get very far. Turns out the HTML the API pipes out is missing headings, CSS and all the formatting, so it would be a lot of work to go from there to something usable.
Is there a way to access the HTML used by the exporter but with the original HREF to the images and/or video rather than the embedded images? It would be a fairly simple (in theory) mirror of that page to then get it with exported content. Wget for example has a --mirror option I could experiment with as a light-weight solution.
@ssddanbrown commented on GitHub (Jan 27, 2021):
No way to get that directly, although the main content HTML is what you'd get out of the API; the export just wraps it up in a template with some extra styles. The export uses this template, with these export styles.
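The wrapping step described here can be approximated outside BookStack. The Python sketch below takes the content HTML from the API and places it in a full document with a style block, leaving image and video URLs untouched. The template and styles are stand-ins written for this example, not BookStack's actual export template or export styles, and wrap_for_export is a hypothetical name.

```python
from html import escape

# Stand-in styles for illustration; not BookStack's real export stylesheet.
EXPORT_STYLES = "body { font-family: sans-serif; max-width: 840px; margin: 0 auto; }"


def wrap_for_export(title: str, content_html: str, styles: str = EXPORT_STYLES) -> str:
    """Wrap a page's content HTML in a minimal standalone HTML document."""
    return (
        "<!doctype html>\n"
        '<html lang="en">\n'
        "<head>\n"
        '<meta charset="utf-8">\n'
        f"<title>{escape(title)}</title>\n"
        f"<style>{styles}</style>\n"
        "</head>\n"
        "<body>\n"
        f"<h1>{escape(title)}</h1>\n"
        # The content is inserted as-is, so original image/video links are kept.
        f"{content_html}\n"
        "</body>\n"
        "</html>\n"
    )
```

Because the content is not modified, media stays as links to the original URLs rather than being embedded, which is the form asked about earlier in the thread.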
@maggie44 commented on GitHub (May 27, 2021):
Having given it some more thought, how would you feel about Pandoc as an optional exporter, similar to how wkhtmltopdf is currently integrated? This wrapper is proving useful: https://github.com/ueberdosis/pandoc
Would also help resolve some other issues that I don't think we will find a way around:
linuxserver/docker-bookstack#80
#2459
@ssddanbrown commented on GitHub (May 31, 2021):
Hi @maggie0002,
Sorry for my lack of response.
To be honest, I'd not be very keen. Supporting both of the existing PDF export options has already proved a lot more challenging than hoped and has consumed a lot of my time in the various requests & issues that have come from it. The range of conversion formats that Pandoc would open up would worry me, and I think it's optimistic to expect it to solve more issues than it creates as an alternative PDF generator, especially since I believe Pandoc will use WKHTMLtoPDF by default anyway for HTML to PDF conversions.