Extract base64 strings from rich text and collect them into an array

Problem description

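I have rich text in a request that can contain multiple images as base64 strings. I need to collect all the images together with their corresponding file names. So far I have tried the code below, and it works for a single image. How can I improve this code, or is there a more efficient way to do this? I don't know much about regular expressions, so I haven't tried them. Any help is appreciated.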

In the example below, I assume the rich text contains only one image.

// StringUtils here is presumably org.apache.commons.lang3.StringUtils (Apache Commons Lang).
public static void main(String[] args) {
        String richText = "<div class=\"se-component se-image-container __se__float-center\" contenteditable=\"false\"><figure style=\"margin: auto; width: 300px;\"><img src=\"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAABaAAAANNCAYAAABhqtmuAAAAAXNSR0IArs4c6QAAQABJREFUeAHs3QmcXtP5B/AniyAkEWuIkA0Va621BUG1CLV1sasqitr3/m21Vamllmq1VdqqorbaRYh9X0utScQWJbEFSUX+99yY6WRMZiYyZ7x53+/1ycz73vfe557zvWdmPn5z5rwdIuLy4p+NAAECBAgQIECAAAECBAgQIECAAAECBAi0qUDHNq2mGAECBAgQIECAAAECBAgQIECAAAECBAgQ+FxAAG0oECBAgAABAgQIECBAgAABAgQIECBAgEAWAQF0FlZFCRAgQIAAAQIECBAgQIAAAQIECBAgQEAAbQwQIECAAAECBAgQIECAAAECBAgQIECAQBYBAXQWVkUJECBAgAABAgQIECBAgAABAgQIECBAQABtDBAgQIAAAQIECBAgQIAAAQIECBAgQIBAFgEBdBZWRQkQIECAAAECBAgQIECAAAECBAgQIEBAAG0MECBAgAABAgQIECBAgAABAgQIECBAgEAWAQF0FlZFCRAgQIAAAQIECBAgQIAAAQIECBAgQEAAbQwQIECAAAECBAgQIECAAAECBAgQIECAQBYBAXQWVkUJECBAgAABAgQIECBAgAABAgQIECBAQABtDBAgQIAAAQIECBAgQIAAAQIECBAgQIBAFgEBdBZWRQkQIECAAAECBAgQIECAAAECBAgQIEBAAG0MECBAgAABAgQIECBAgAABAgQIECBAgEAWAQF0FlZFCRAgQIAAAQIECBAgQIAAAQIECBAgQEAAbQwQIECAAAECBAgQIECAAAECBAgQIECAQBYBAXQWVkUJECBAgAABAgQIECBAgAABAgQIECBAQABtDBAgQIAAAQIECBAgQIAAAQIECBAgQIBAFgEBdBZWRQkQIECAAAECBAgQIECAAAECBAgQIEBAAG0MECBAgAABAgQIEngAAAAASUVORK5CYII=\" alt=\"\" data-rotate=\"0\" data-proportion=\"true\" data-align=\"center\" data-size=\"300px,300px\" data-index=\"0\" data-file-name=\"Upload Question.png\" data-file-size=\"38747\" data-origin=\"300px,300px\" style=\"width: 300px; height: 300px;\"></figure>";
        String[] texts = StringUtils.substringsBetween(richText, "<img", "</figure>");
        for (String td : texts) {
            String fileName = StringUtils.substringBetween(td, "data-file-name=\"", "\"");
            System.out.println("fileName:" + fileName); //prints fileName:Upload Question.png
            String base64 = StringUtils.substringBetween(td, ",", "\""); 
            System.out.println(base64);// prints //iVBORw0KGgoAAAANSUhEUgAABaAAAANNCAYAAABhqtmuAAAAAXNSR0IArs4c6QAAQABJREFUeAHs3QmcXtP5B/AniyAkEWuIkA0Va621BUG1CLV1sasqitr3/m21Vamllmq1VdqqorbaRYh9X0utScQWJbEFSUX+99yY6WRMZiYyZ7x53+/1ycz73vfe557zvWdmPn5z5rwdIuLy4p+NAAECBAgQIECAAAECBAgQIECAAAECBAi0qUDHNq2mGAECBAgQIECAAAECBAgQIECAAAECBAgQ+FxAAG0oECBAgAABAgQIECBAgAABAgQIECBAgEAWAQF0FlZFCRAgQIAAAQIECBAgQIAAAQIECBAgQEAAbQwQIECAAAECBAgQIECAAAECBAgQIECAQBYBAXQWVkUJECBAgAABAgQIECBAgAABAgQIECBAQABtDBAgQIAAAQIECBAgQIAAAQIECBAgQIBAFgEBdBZWRQkQIECAAAECBAgQIECAAAECBAgQIEBAAG0MECBAgAABAgQIECBAgAABAgQIECBAgEAWAQF0FlZFCRAgQIAAAQIECBAgQIAAAQIECBAgQEAAbQwQIECAAAECBAgQIECAAAECBAgQIECAQBYBAXQWVkUJECBAgAABAgQIECBAgAABAgQIECBAQABtDBAgQIAAAQIECBAgQIAAAQIECBAgQIBAFgEBdBZWRQkQIECAAAECBAgQIECAAAECBAgQIEBAAG0MECBAgAABAgQIECBAgAABAgQIECBAgEAWAQF0FlZFCRAgQIAAAQIECBAgQIAAAQIECBAgQEAAbQwQIECAAAECBAgQIECAAAECBAgQIECAQBYBAXQWVkUJECBAgAABAgQIECBAgAABAgQIECBAQABtDBAgQIAAAQIECBAgQIAAAQIECBAgQIBAFgEBdBZWRQkQIECAAAECBAgQIECAAAECBAgQIEBAAG0MECBAgAABAgQIEngAAAAASUVORK5CYII=
        }
    }

Tags: java, regex, image, richtext

Solution


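In JavaScript, the regular expression /(?<=base64,).+?(?=\")/gi can solve your problem. It matches everything after "base64," up to the next double quote, and the g flag makes it find every occurrence rather than only the first.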
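Since the question is about Java, here is a minimal sketch of the same idea using java.util.regex, extended to also pick up the file name of each image. It is one possible approach, not a definitive implementation. The class name Base64ImageExtractor, the holder class EmbeddedImage, and the method extract are hypothetical names chosen for illustration. The sketch assumes that attribute values are double-quoted and contain no ">" character, as in the sample markup from the question; for arbitrary HTML, a real HTML parser would likely be more robust than regular expressions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Base64ImageExtractor {

    /** Hypothetical holder for one embedded image (names are illustrative). */
    static final class EmbeddedImage {
        final String fileName; // value of data-file-name, or null if the attribute is absent
        final String base64;   // payload that follows "base64," in the src attribute

        EmbeddedImage(String fileName, String base64) {
            this.fileName = fileName;
            this.base64 = base64;
        }
    }

    // Matches one whole <img ...> tag (assumes no '>' inside attribute values).
    private static final Pattern IMG_TAG =
            Pattern.compile("<img\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    // Captures the payload of a base64 data URI in a double-quoted src attribute.
    private static final Pattern SRC_BASE64 =
            Pattern.compile("\\ssrc=\"data:[^\"]*?;base64,([^\"]+)\"", Pattern.CASE_INSENSITIVE);

    // Captures the value of a double-quoted data-file-name attribute.
    private static final Pattern FILE_NAME =
            Pattern.compile("\\sdata-file-name=\"([^\"]*)\"", Pattern.CASE_INSENSITIVE);

    /** Collects every base64-encoded image found in the given rich text. */
    static List<EmbeddedImage> extract(String richText) {
        List<EmbeddedImage> images = new ArrayList<>();
        Matcher tag = IMG_TAG.matcher(richText);
        while (tag.find()) {
            String imgTag = tag.group();
            Matcher src = SRC_BASE64.matcher(imgTag);
            if (!src.find()) {
                continue; // src is not a base64 data URI (for example a normal URL)
            }
            Matcher name = FILE_NAME.matcher(imgTag);
            String fileName = name.find() ? name.group(1) : null;
            images.add(new EmbeddedImage(fileName, src.group(1)));
        }
        return images;
    }

    public static void main(String[] args) {
        // Shortened, made-up payloads stand in for real image data.
        String richText =
                "<figure><img src=\"data:image/png;base64,AAAA\" data-file-name=\"a.png\"></figure>"
              + "<figure><img src=\"data:image/jpeg;base64,BBBB\" data-file-name=\"b.jpg\"></figure>";
        for (EmbeddedImage image : extract(richText)) {
            System.out.println(image.fileName + " -> " + image.base64);
        }
    }
}
```

Matching each img tag first and then searching inside it keeps every file name paired with its own base64 payload, which a single global match over the whole string would not guarantee. It also avoids relying on the first comma in the tag, as the original substringBetween(td, ",", "\"") call does.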

