Pages not rendering base64 blob images #2152

Closed
opened 2026-02-05 03:07:31 +03:00 by OVERLORD · 12 comments

Originally created by @awarre on GitHub (Mar 11, 2021).

Describe the bug
Pages not rendering images after pasting images into WYSIWYG editor when original source is a base64 image blob.

Steps To Reproduce
Steps to reproduce the behavior:

  1. Create a New Page
  2. Copy data from a web site including an image that is a base64 data blob. (I used the pixel graphic from http://www.techerator.com/2011/12/how-to-embed-images-directly-into-your-html/ for this test)
  3. Paste the data into the WYSIWYG editor
  4. The text and image will appear on the Editing page.
  5. Save the page
  6. The image will not appear on the page. The image tag is completely stripped.
  7. Edit the page again.
  8. The image again renders on the Editing page, and was preserved during the save.

Expected Behavior
The image should render both on the editor, and on the actual page.

More info
This is very close to working. The image does get uploaded, and can be accessed directly and viewed through the editor. The path generated is in the format of: blob:https://example.com/aabb1234-1234-aabb-bbcc-ffffeeeeffff. As stated above, this is the expected uploaded image and can be accessed directly.

It appears the page does not generate the <img> element at all. If it did, everything should work as expected.

Your Configuration (please complete the following information):

  • Exact BookStack Version (Found in settings): 0.31.7
  • PHP Version: 7.3.19
  • Hosting Method (Nginx/Apache/Docker): IIS
OVERLORD added the 🔍 Pending Validation label 2026-02-05 03:07:31 +03:00

@ssddanbrown commented on GitHub (Mar 12, 2021):

Thanks for reporting @awarre with the provided detail.

Just tried this in the WYSIWYG and markdown editors on my local instance using the same image and in both instances the image data uploaded into BookStack as expected. On Firefox/Fedora.

Could you confirm the browser in use when this occurs?


@awarre commented on GitHub (Mar 12, 2021):

Using the latest build of Chrome on Windows 10. I just tested with Edge and see the same (though that's not much of a difference).

I can replicate this issue on the BookStack Demo site as well. It doesn't look like a browser rendering issue, because the <img> element is entirely missing from the page.

<p id="bkmrk-cool%2C-that%E2%80%99s-eight-b">cool, that’s eight-bit me! And no linked image required!</p>
<p id="bkmrk-"></p>
<p id="bkmrk-if-you%E2%80%99re-interested">If you’re interested in trying this on your own without a pre-made base64 conversion tool, pay special attention to</p>

Where the empty <p> tags are, there should be the image.


@awarre commented on GitHub (Mar 12, 2021):

The Editor page shows the code below, and the image is correctly rendered.


<textarea id="html-editor" name="html" rows="5">&lt;p id=&quot;bkmrk-cool%2C-that%E2%80%99s-eight-b&quot;&gt;cool, that’s eight-bit me! And no linked image required!&lt;/p&gt;
&lt;p id=&quot;bkmrk-&quot;&gt;&lt;img class=&quot;aligncenter&quot; src=&quot;&quot; alt=&quot;beastie.png&quot;&gt;&lt;/p&gt;
&lt;p id=&quot;bkmrk-if-you%E2%80%99re-interested&quot;&gt;If you’re interested in trying this on your own without a pre-made base64 conversion tool, pay special attention to&lt;/p&gt;</textarea>



@awarre commented on GitHub (Mar 12, 2021):

Verified the same results with the latest release of Firefox on Windows as well, using the BookStack Demo site, with the image copied and pasted from the website listed above.

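[image: https://user-images.githubusercontent.com/30052685/110950046-5cd3d000-8311-11eb-91e3-4748a90f3c5b.png]
[image: https://user-images.githubusercontent.com/30052685/110950062-61988400-8311-11eb-913a-a2a2c3371076.png]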


@awarre commented on GitHub (Mar 12, 2021):

Okay, sorry for so many updates, just trying to get a minimal process to reproduce this. You do not need to copy and paste the image itself to reproduce the problem.

  1. Create a new page.
  2. Click the Source code button.
  3. Paste the following:
    <img src=" NCAMAAAAsYgRbAAAAGXRFWHRTb2Z0d2FyZQBBZG9iZSBJbWFnZVJlYWR5c cllPAAAABJQTFRF3NSmzMewPxIG//ncJEJsldTou1jHgAAAARBJREFUeNrs2EEK gCAQBVDLuv+V20dENbMY831wKz4Y/VHb/5RGQ0NDQ0NDQ0NDQ0NDQ0NDQ 0NDQ0NDQ0NDQ0NDQ0NDQ0NDQ0PzMWtyaGhoaGhoaGhoaGhoaGhoxtb0QGho aGhoaGhoaGhoaGhoaMbRLEvv50VTQ9OTQ5OpyZ01GpM2g0bfmDQaL7S+ofFC6x v3ZpxJiywakzbvd9r3RWPS9I2+MWk0+kbf0Hih9Y17U0nTHibrDDQ0NDQ0NDQ0 NDQ0NDQ0NTXbRSL/AK72o6GhoaGhoRlL8951vwsNDQ0NDQ1NDc0WyHtDTEhD Q0NDQ0NTS5MdGhoaGhoaGhoaGhoaGhoaGhoaGhoaGposzSHAAErMwwQ2HwRQ AAAAAElFTkSuQmCC" alt="beastie.png">
  4. OK

Result

  1. Image displays on Editor
  2. Image does not display on page

@awarre commented on GitHub (Mar 19, 2021):

@ssddanbrown

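The source of the problem is PageContent.php, line 348 (https://github.com/BookStackApp/BookStack/blob/1420f239fceac1e59182cd50a5dc9d39b614ffa4/app/Entities/Tools/PageContent.php#L348).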

        // Remove data or JavaScript iFrames
        $badIframes = $xPath->query('//*[contains(@src, \'data:\')] | //*[contains(@src, \'javascript:\')] | //*[@srcdoc]');
        foreach ($badIframes as $badIframe) {
            $badIframe->parentNode->removeChild($badIframe);
        }

It looks like the intent of the code was to remove iframes, but the query matches <img> elements with data: as well.

I've verified the content is correctly saved to the database, but the rendering process is sanitizing it and preventing the images from showing.
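
The behaviour described above can be reproduced outside BookStack with a minimal sketch. It assumes only PHP's bundled DOM extension. The sample HTML, the `AAAA` payload and the `/uploads/ok.png` path are made up for illustration and are not taken from the issue.

```php
<?php
// Minimal reproduction of the filter quoted above, run outside BookStack.
// The sample HTML below is hypothetical test data.
$html = '<p id="a"><img src="data:image/png;base64,AAAA" alt="beastie.png"></p>'
      . '<p id="b"><img src="/uploads/ok.png" alt="ok"></p>';

$doc = new DOMDocument();
libxml_use_internal_errors(true); // keep parser warnings about fragments quiet
$doc->loadHTML($html);
$xPath = new DOMXPath($doc);

// Same query as in the quoted code: "*" matches any element, not only <iframe>.
$bad = $xPath->query('//*[contains(@src, \'data:\')] | //*[contains(@src, \'javascript:\')] | //*[@srcdoc]');
foreach ($bad as $node) {
    $node->parentNode->removeChild($node);
}

// Inspect which <img> elements survived the filter.
$remaining = $doc->getElementsByTagName('img');
echo $remaining->length, "\n";
echo $remaining->item(0)->getAttribute('src'), "\n";
```

Under these assumptions only the image with the ordinary path should remain, and the first paragraph is left empty, which matches the empty `<p id="bkmrk-">` seen on the rendered page.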


@awarre commented on GitHub (Mar 20, 2021):

I am not familiar with PHP's XPath syntax, but maybe this line would meet the security concerns while also allowing blob images?

$badIframes = $xPath->query('//*[contains(@src, \'data:\') and not(contains(@src, \'data:image\'))] | //*[contains(@src, \'javascript:\')] | //*[@srcdoc]');
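
To compare the two queries side by side, here is a hedged sketch. `filterHtml()` is a hypothetical helper written for this illustration, not a BookStack function, and the sample markup is invented.

```php
<?php
// Hypothetical helper: apply an XPath removal query and return the <body> HTML.
function filterHtml(string $html, string $query): string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    $xPath = new DOMXPath($doc);
    foreach ($xPath->query($query) as $node) {
        $node->parentNode->removeChild($node);
    }
    $out = '';
    foreach ($doc->getElementsByTagName('body')->item(0)->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

// Query currently in PageContent.php, and the variant proposed above.
$original = '//*[contains(@src, \'data:\')] | //*[contains(@src, \'javascript:\')] | //*[@srcdoc]';
$proposed = '//*[contains(@src, \'data:\') and not(contains(@src, \'data:image\'))] | //*[contains(@src, \'javascript:\')] | //*[@srcdoc]';

// Made-up sample: one base64 image and one data: iframe.
$sample = '<p><img src="data:image/png;base64,AAAA" alt="x"></p>'
        . '<iframe src="data:text/html;base64,AAAA"></iframe>';

$withOriginal = filterHtml($sample, $original);
$withProposed = filterHtml($sample, $proposed);

// strpos() is used rather than str_contains() so the sketch also runs on PHP 7.
var_dump(strpos($withOriginal, '<img') !== false);    // is the image kept by the original query?
var_dump(strpos($withProposed, '<img') !== false);    // is the image kept by the proposed query?
var_dump(strpos($withProposed, '<iframe') !== false); // is the iframe kept by the proposed query?
```

One thing to keep in mind is that the added condition looks only at the attribute value, so any element whose `src` contains `data:image` would pass, not just `<img>`.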

@ssddanbrown commented on GitHub (Mar 20, 2021):

Thanks for digging into this and offering a pull request @awarre.

To be honest though, I don't really want to be handling base64 image data or blob types within the content; these should be stored as images. Even if we merged your changes, I don't think it would solve your original case, since they enter the editor as a blob, not base64.

Ideally we'd need to properly handle pasting with mixed content that includes images but it might touch upon the difficulties of #449 unless things have improved since I last checked. Content pasting formats seem to be a constant moving target.


@awarre commented on GitHub (Mar 20, 2021):

I've confirmed in my test environment that it did resolve the issue. The database is storing the following data in the html field:

<p id="bkmrk-"><img src=" NCAMAAAAsYgRbAAAAGXRFWHRTb2Z0d2FyZQBBZG9iZSBJbWFnZVJlYWR5c cllPAAAABJQTFRF3NSmzMewPxIG//ncJEJsldTou1jHgAAAARBJREFUeNrs2EEK gCAQBVDLuv+V20dENbMY831wKz4Y/VHb/5RGQ0NDQ0NDQ0NDQ0NDQ0NDQ 0NDQ0NDQ0NDQ0NDQ0NDQ0NDQ0PzMWtyaGhoaGhoaGhoaGhoaGhoxtb0QGho aGhoaGhoaGhoaGhoaMbRLEvv50VTQ9OTQ5OpyZ01GpM2g0bfmDQaL7S+ofFC6x v3ZpxJiywakzbvd9r3RWPS9I2+MWk0+kbf0Hih9Y17U0nTHibrDDQ0NDQ0NDQ0 NDQ0NDQ0NTXbRSL/AK72o6GhoaGhoRlL8951vwsNDQ0NDQ1NDc0WyHtDTEhD Q0NDQ0NTS5MdGhoaGhoaGhoaGhoaGhoaGhoaGhoaGposzSHAAErMwwQ2HwRQ AAAAAElFTkSuQmCC" alt="beastie.png"></p>

Obviously your call if you don't like this solution. I was just trying to find the cause and a solution that works to resolve my problem with rendering the data as it was being stored.


@ssddanbrown commented on GitHub (Mar 20, 2021):

@awarre Ah, I get you now, the browser converting the blobs on send.

I'm going to close the PR, as I don't want to get into storing base64 images within the database, since that has had some impacts (mainly due to large content sizes) when it's leaked in before. But we'll leave this open with the intention of ideally reading out the blobs and storing them in the system.

For web images like this, they should be handled properly if you right-click > "Copy Image" and then paste into BookStack. It's only when a large amount of content gets mixed in that complications arise.
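
The idea of reading embedded image data out of the content and storing it as a file could look roughly like the sketch below. This is an illustration only and not how BookStack implements it. `extractDataImages()`, the `/uploads` URL base and the temp-directory target are all hypothetical, and the sample payload decodes to the text "hello" rather than to a real PNG.

```php
<?php
// Sketch only: a hypothetical helper that does not use BookStack's image storage.
// It writes each base64 data: image to $dir and points the <img> at $urlBase.
function extractDataImages(string $html, string $dir, string $urlBase): string
{
    $doc = new DOMDocument();
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    $xPath = new DOMXPath($doc);
    $n = 0;
    foreach ($xPath->query('//img[starts-with(@src, "data:image/")]') as $img) {
        // Expect "data:image/<type>;base64,<payload>"; skip anything else.
        if (!preg_match('#^data:image/(png|jpeg|gif);base64,(.+)$#s', $img->getAttribute('src'), $m)) {
            continue;
        }
        // Strip whitespace, then decode strictly so invalid input is rejected.
        $bytes = base64_decode(preg_replace('/\s+/', '', $m[2]), true);
        if ($bytes === false) {
            continue;
        }
        $name = 'embedded-' . (++$n) . '.' . $m[1];
        file_put_contents($dir . '/' . $name, $bytes);
        $img->setAttribute('src', $urlBase . '/' . $name);
    }
    $out = '';
    foreach ($doc->getElementsByTagName('body')->item(0)->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

// Usage with made-up data: "aGVsbG8=" is base64 for "hello", not a real image.
$dir = sys_get_temp_dir();
$html = '<p><img src="data:image/png;base64,aGVsbG8=" alt="beastie.png"></p>';
$result = extractDataImages($html, $dir, '/uploads');
echo $result, "\n";
```

Restricting the accepted types to a short allow-list keeps the sketch from writing arbitrary payloads to disk. A real implementation would also need to validate the decoded bytes and handle naming collisions.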


@awarre commented on GitHub (Mar 20, 2021):

Got it. This is a bit of an artificial scenario I created to narrow down the source of a problem I encountered in the real world.

My issue involved trying to import data from another knowledge management system into BookStack, whether exported to CSV/JSON via their API, taken from raw database data, or manually copied and pasted as an entire document from the rendered web page. When another system supports base64 image storage, I see no clear method to get that content into BookStack without a lot of manual effort.

Manually copying and pasting individual images from thousands of documents won't be very practical, so I guess I'll just have to find an alternative solution.


@ssddanbrown commented on GitHub (Jun 2, 2021):

Closing as per #2700; this will be part of the next patch release.

Reference: starred/BookStack#2152