Integrate external web content as „offline bookmark“ #1523

Closed
opened 2026-02-05 01:08:32 +03:00 by OVERLORD · 1 comment
Owner

Originally created by @koseduhemak on GitHub (Feb 8, 2020).

Describe the feature you'd like
It would be nice to be able to provide a URL, e.g. to a blog post, which then gets parsed so that the main text and images are extracted and embedded into a page in BookStack.
I am thinking of something like Selenium plus a clever algorithm to detect a blog post's content and images. As many (especially tech-related) blogs and websites are SEO-optimized, they often use semantic markup (e.g. <article/>) to structure their page content.
If that's too complicated, the user could be given the ability to interactively select the desired content to import from a web page into the BookStack page.
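
As a rough illustration of the semantic-markup idea, here is a minimal sketch in Python using only the standard library. It assumes the page HTML has already been downloaded and that the post is wrapped in an <article> element; the names `ArticleExtractor` and `extract_article` are made up for this example and are not part of BookStack.

```python
from html.parser import HTMLParser


class ArticleExtractor(HTMLParser):
    """Collects text and image URLs found inside <article> elements.

    Hypothetical helper for illustration only; not BookStack code.
    """

    def __init__(self):
        super().__init__()
        self.depth = 0        # nesting depth of currently open <article> tags
        self.text_parts = []  # text fragments seen inside an article
        self.images = []      # src attributes of <img> tags inside an article

    def handle_starttag(self, tag, attrs):
        if tag == "article":
            self.depth += 1
        elif tag == "img" and self.depth > 0:
            src = dict(attrs).get("src")
            if src:
                self.images.append(src)

    def handle_endtag(self, tag):
        if tag == "article" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that appears inside an <article>.
        if self.depth > 0 and data.strip():
            self.text_parts.append(data.strip())


def extract_article(html):
    """Return the text and image URLs inside any <article> in the given HTML."""
    parser = ArticleExtractor()
    parser.feed(html)
    parser.close()
    return {"text": " ".join(parser.text_parts), "images": parser.images}


if __name__ == "__main__":
    sample = (
        "<html><body><nav>Menu</nav>"
        "<article><h1>Title</h1><p>Hello <b>world</b></p>"
        '<img src="a.png"></article>'
        "<footer>Foot</footer></body></html>"
    )
    print(extract_article(sample))
```

Pages without an <article> element would return empty results here, and content rendered by JavaScript would still need something like a headless browser, so this only covers the simplest case.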

Describe the benefits this feature would bring to BookStack users
I often come across interesting topics while surfing the web and want to „bookmark“ the content somehow. However, if you have multiple bookmarks for the same topic, you can't really relate the content or individual paragraphs of one blog post to other blog posts or content you found on other pages. There is also no unified user experience when browsing the bookmarked content.
I think this could greatly improve how users leverage public information: they could integrate it into their own personal wiki, enrich it with additional information, and connect it with the other structured pieces of knowledge in their BookStack shelves.

To clarify what kind of functionality I would appreciate:
https://embed.ly/extract

This service lets you extract the main content of a page automatically.

Author
Owner

@ssddanbrown commented on GitHub (Mar 3, 2020):

Thanks for offering this idea @koseduhemak.

I think it will be tricky to parse the content out of a page, and I'd prefer not to rely on external services where possible. It would also be a feature that would likely need ongoing support as new edge cases arise. Therefore, I don't think this is something we'd look to integrate into the core platform anytime soon, so I'm going to close this off as out-of-scope.

Maybe someone could create a browser extension that would take content from a page and insert it into an active BookStack editor page in another tab?


Reference: starred/BookStack#1523