Can I reduce PDFBox memory usage in my code, or should I increase the Java heap space?

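Problem description

My PDF is not very large, only four pages. Can I optimize my code to use less memory? Can I do some cleanup after each loop iteration?

My code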

import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;

import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.plugins.jpeg.JPEGImageWriteParam;
import javax.imageio.stream.FileImageOutputStream;

import org.apache.pdfbox.io.MemoryUsageSetting;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;
import org.apache.pdfbox.pdmodel.common.PDRectangle;
import org.apache.pdfbox.rendering.ImageType;
import org.apache.pdfbox.rendering.PDFRenderer;

import net.coobird.thumbnailator.Thumbnails;

public class PdfToImage {
    public static void main(String[] args) throws IOException {
        File file = new File("C:/myPath/printfile.pdf");
        PDDocument document = PDDocument.load(file, MemoryUsageSetting.setupTempFileOnly());
        PDFRenderer renderer = new PDFRenderer(document);

        try {
            int pageNumber = 0;
            for (PDPage page : document.getPages()) {
                float trimboxWidth = page.getTrimBox().getWidth();
                float trimboxHeight = page.getTrimBox().getHeight();

                float mediaboxWidth = page.getMediaBox().getWidth();
                float mediaboxHeight = page.getMediaBox().getHeight();

                float getY = (mediaboxHeight - trimboxHeight) / 2;
                float getX = (mediaboxWidth - trimboxWidth) / 2;

                PDRectangle rectangle = new PDRectangle(getX, getY, trimboxWidth, trimboxHeight);

                // Crop the page
                page.setCropBox(rectangle);

                // Rendering an image from the PDF document
                BufferedImage image = renderer.renderImageWithDPI(pageNumber, 96, ImageType.RGB);

                // Resize the image
                BufferedImage thumbnail =
                        Thumbnails.of(image)
                                .height(700)
                                .asBufferedImage();

                // Writing the image to a file
                JPEGImageWriteParam jpegParams = new JPEGImageWriteParam(null);
                jpegParams.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
                jpegParams.setCompressionQuality(1f);

                final ImageWriter writer = ImageIO.getImageWritersByFormatName("jpg").next();
                // specifies where the jpg image has to be written
                writer.setOutput(new FileImageOutputStream(
                        new File("C:/myPath/myimage" + pageNumber + ".jpg")));

                writer.write(null, new IIOImage(thumbnail, null, null), jpegParams);

                System.out.println("Used Memory: "
                        + (Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory()));
                pageNumber++;
            }
        } finally {
            if (document != null) {
                document.close();
            }
        }
    }
}

Error message on the third loop iteration

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at java.util.HashMap.replacementTreeNode(Unknown Source)
    at java.util.HashMap.treeifyBin(Unknown Source)
    at java.util.HashMap.putVal(Unknown Source)
    at java.util.HashMap.put(Unknown Source)
    at org.apache.pdfbox.pdmodel.graphics.shading.TriangleBasedShadingContext.calcPixelTable(TriangleBasedShadingContext.java:121)
    at org.apache.pdfbox.pdmodel.graphics.shading.PatchMeshesShadingContext.calcPixelTable(PatchMeshesShadingContext.java:280)
    at org.apache.pdfbox.pdmodel.graphics.shading.TriangleBasedShadingContext.createPixelTable(TriangleBasedShadingContext.java:80)
    at org.apache.pdfbox.pdmodel.graphics.shading.PatchMeshesShadingContext.<init>(PatchMeshesShadingContext.java:71)
    at org.apache.pdfbox.pdmodel.graphics.shading.Type7ShadingContext.<init>(Type7ShadingContext.java:46)
    at org.apache.pdfbox.pdmodel.graphics.shading.Type7ShadingPaint.createContext(Type7ShadingPaint.java:63)
    at sun.java2d.pipe.AlphaPaintPipe.startSequence(Unknown Source)
    at sun.java2d.pipe.AAShapePipe.renderTiles(Unknown Source)
    at sun.java2d.pipe.AAShapePipe.renderPath(Unknown Source)
    at sun.java2d.pipe.AAShapePipe.fill(Unknown Source)
    at sun.java2d.pipe.PixelToParallelogramConverter.fill(Unknown Source)
    at sun.java2d.pipe.ValidatePipe.fill(Unknown Source)
    at sun.java2d.SunGraphics2D.fill(Unknown Source)
    at org.apache.pdfbox.rendering.PageDrawer.shadingFill(PageDrawer.java:1388)
    at org.apache.pdfbox.contentstream.operator.graphics.ShadingFill.process(ShadingFill.java:42)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processOperator(PDFStreamEngine.java:932)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:510)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:484)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:156)
    at org.apache.pdfbox.rendering.PageDrawer.drawPage(PageDrawer.java:271)
    at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:321)
    at org.apache.pdfbox.rendering.PDFRenderer.renderImage(PDFRenderer.java:243)
    at org.apache.pdfbox.rendering.PDFRenderer.renderImageWithDPI(PDFRenderer.java:229)
    at PdfToImage.main(PdfToImage.java:47)

Edit: I found that if I change

//Rendering an image from the PDF document
BufferedImage image = renderer.renderImageWithDPI(pageNumber, 96, ImageType.RGB);

to

//Rendering an image from the PDF document
BufferedImage image = renderer.renderImageWithDPI(pageNumber, 50, ImageType.RGB);

it works fine and does not crash. But this is not a real solution; lowering the resolution is a poor workaround.
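The difference between 50 and 96 DPI can be made concrete with a rough estimate. The sketch below assumes a hypothetical A4 page (595 x 842 points, since the real page size is not given in the question) and 4 bytes per pixel for an RGB raster. It only estimates the output image; PDFBox's internal structures, such as the shading pixel table that appears in the stack trace, are not modeled, although they plausibly grow with the rendered area as well.

```java
class RasterSizeEstimate {
    /** PDF user space defines 72 points per inch. */
    static final float POINTS_PER_INCH = 72f;

    /** Number of pixels in a page of the given size (in points) rendered at the given DPI. */
    static long pixelCount(float widthPt, float heightPt, float dpi) {
        long widthPx = Math.round(widthPt / POINTS_PER_INCH * dpi);
        long heightPx = Math.round(heightPt / POINTS_PER_INCH * dpi);
        return widthPx * heightPx;
    }

    /** Rough raster size, assuming 4 bytes per pixel (an assumption, not a measured value). */
    static long estimatedBytes(float widthPt, float heightPt, float dpi) {
        return pixelCount(widthPt, heightPt, dpi) * 4L;
    }

    public static void main(String[] args) {
        // Hypothetical A4 page; replace with the real MediaBox or TrimBox size.
        float widthPt = 595f;
        float heightPt = 842f;
        for (float dpi : new float[] {50f, 96f}) {
            System.out.printf("%.0f DPI: %d pixels, about %d KiB%n",
                    dpi,
                    pixelCount(widthPt, heightPt, dpi),
                    estimatedBytes(widthPt, heightPt, dpi) / 1024);
        }
    }
}
```

Because both dimensions scale with the DPI, the pixel count grows with its square: going from 50 to 96 DPI multiplies it by roughly (96/50)^2, a little under four.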

Tags: java, pdfbox, heap-memory

Solution
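On the per-iteration cleanup asked about in the question: the loop never closes the FileImageOutputStream and never disposes the ImageWriter, so each page leaves those resources behind. A minimal sketch of a helper that releases them follows; the class name JpegPageWriter and the method writeJpeg are hypothetical, and only standard javax.imageio calls are used. This may not be enough on its own, because the stack trace shows the heap running out inside PDFBox's shading code during rendering, so a larger heap (for example via the JVM's -Xmx option) could still be required.

```java
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.util.Iterator;

import javax.imageio.IIOImage;
import javax.imageio.ImageIO;
import javax.imageio.ImageWriteParam;
import javax.imageio.ImageWriter;
import javax.imageio.stream.FileImageOutputStream;

class JpegPageWriter {
    /** Writes one image as JPEG, then closes the stream and disposes the writer. */
    static void writeJpeg(BufferedImage image, File target, float quality) throws IOException {
        Iterator<ImageWriter> writers = ImageIO.getImageWritersByFormatName("jpg");
        if (!writers.hasNext()) {
            throw new IOException("No JPEG ImageWriter available");
        }
        ImageWriter writer = writers.next();
        // try-with-resources closes the output stream even if write() throws
        try (FileImageOutputStream out = new FileImageOutputStream(target)) {
            ImageWriteParam params = writer.getDefaultWriteParam();
            params.setCompressionMode(ImageWriteParam.MODE_EXPLICIT);
            params.setCompressionQuality(quality);
            writer.setOutput(out);
            writer.write(null, new IIOImage(image, null, null), params);
        } finally {
            // release the writer's internal buffers once the page is written
            writer.dispose();
        }
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a rendered page; the real code would pass the thumbnail here.
        BufferedImage page = new BufferedImage(100, 100, BufferedImage.TYPE_INT_RGB);
        File target = File.createTempFile("myimage", ".jpg");
        target.deleteOnExit();
        writeJpeg(page, target, 1f);
        page.flush(); // hint that cached image resources can be released
        System.out.println("Wrote " + target.length() + " bytes");
    }
}
```

In the original loop, the call would replace the block that builds jpegParams and the writer. Calling flush() on both image and thumbnail at the end of each iteration, and keeping them scoped inside the loop body, lets the garbage collector reclaim them before the next page is rendered.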

