Java heap space OutOfMemoryError when reading CSV files

Problem description

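I have 700 CSV files, each about 5-7 MB in size. I am using Spring Boot. All I need to do is read these 700 CSV files. Whenever a new file is added to the directory, the filesUpdatedOrAdded() method of FileWatcherJob.java is called. It performs a few checks and then essentially just tries to read the files.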

FileWatcherJob.java

public class FileWatcherJob implements DirectoryScanListener {

    private final static Logger logger = LoggerFactory.getLogger(FileWatcherJob.class);

    public static final String LISTENER_NAME = "DirScanListenerName";

    private static boolean fileFound = true;

    private ReadFile readFile = new ReadFile();

    public void filesUpdatedOrAdded(File[] files) {
        if (fileFound) {

            System.out.println("------------- I am doing it again-------------");
            for (File file : files) {
                logger.info("File Found : {}", file.getName());
            }
            logger.info("ALL THE FILES ARE AVAILABLE NOW");

            if (!readFile.getFileAStored()) {
                readFile.readAllFiles("D:\\FileToRead\\fileA.csv");
            }

            if (!readFile.getFileBStored()) {
                readFile.readAllFiles("D:\\FileToRead\\fileB.csv");
            }

            // Read miscellaneous files, including File A and File B
            if (readFile.getFileAStored() && readFile.getFileBStored()) {
                readFile.readAllFiles("D:\\FileToRead\\");
            }
            fileFound = false;
            logger.info("-------------- I am done -----------------");
        }
    }
}

ReadFile.java

public class ReadFile {

    private static final Logger LOGGER = LoggerFactory.getLogger(ReadFile.class);


    private Map<Path, List<String>> fileA = new HashMap<>();
    private Map<Path, List<String>> fileB = new HashMap<>();
    private Boolean fileAStored = false;
    private Boolean fileBStored = false;

    private Map<Path, List<String>> miscallenousFiles = new HashMap<>();

    public Boolean getFileAStored() {
        return fileAStored;
    }

    public Boolean getFileBStored() {
        return fileBStored;
    }

    public void readAllFiles(String path) {

        try (Stream<Path> paths = Files.walk(Paths.get(path)).collect(toList()).parallelStream()
        ){
            paths.forEach(filePath -> {
                //LOGGER.info("CHECK IF FILE IS REGULAR");
                if (filePath.toFile().exists()) {

                    String fileName = filePath.getFileName().toString();
                    try {
                        LOGGER.info("START LOADING THE CONTENT OF FILE " + fileName);
                        List<String> loadedFile = readContent(filePath);
                        storeAandBFiles(fileName, filePath, loadedFile);
                    } catch (Exception e) {
                        LOGGER.info("ERROR WHILE READING THE CONTENT OF FILE");
                        LOGGER.error(e.getMessage());
                    }
                }
            });
        } catch (IOException e) {
            LOGGER.info("ERROR WHILE READING THE FILES IN PARALLEL");
            LOGGER.error(e.getMessage());
        }
    }


    private List<String> readContent(Path filePath) throws IOException {
        //LOGGER.info("START READING THE FILE, LINE BY LINE");
        return Files.readAllLines(filePath, StandardCharsets.ISO_8859_1);
    }

    private void storeAandBFiles(String fileName, Path filePath, List<String> loadedFile) {
        //LOGGER.info("START STORING THE FILE");
        if (fileName.contains("fileA") && !fileAStored) {
            fileA.put(filePath.getFileName(), loadedFile);
            fileAStored = true;
        }

        if (fileName.contains("fileB") && !fileBStored) {
            fileB.put(filePath.getFileName(), loadedFile);
            fileBStored = true;
        }

    }
}

However, I keep getting the following error:

Job group1.FileScanJobName threw an unhandled Exception:

java.lang.OutOfMemoryError: Java heap space

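I don't understand what the problem is. Can anyone help? One odd thing, which I suspect is related to the cause: even when no new file has been added to the directory, the watcher somehow still reports that it found new files in the directory!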

Tags: java, spring-boot, heap-memory

Solution
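The post does not establish the exact cause of the OutOfMemoryError, so what follows is only a sketch of one common way to reduce heap pressure, not a confirmed fix. The question's readAllFiles() loads every file completely with Files.readAllLines() from a parallel stream, so several whole files are held in memory at the same time. Assuming each line can be handled on its own, the files can instead be read sequentially and lazily with Files.lines(). The class name StreamingCsvReader, the method processAll, and the lineHandler parameter are hypothetical names introduced for illustration. Files.walk() also returns the starting directory itself, which is why the sketch filters for regular files.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.function.Consumer;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper (not part of the original post): reads CSV files one at a
// time and hands each line to a callback instead of keeping whole files in memory.
public class StreamingCsvReader {

    /**
     * Walks dir, passes every line of every regular *.csv file to lineHandler,
     * and returns the number of files that were read.
     */
    public static int processAll(Path dir, Consumer<String> lineHandler) throws IOException {
        List<Path> csvFiles;
        // Files.walk returns a stream backed by open directory handles, so close it.
        try (Stream<Path> walk = Files.walk(dir)) {
            csvFiles = walk
                    .filter(Files::isRegularFile) // skips directories, including dir itself
                    .filter(p -> p.getFileName().toString().endsWith(".csv"))
                    .collect(Collectors.toList());
        }

        // Sequential on purpose: only one file is open at any moment.
        for (Path file : csvFiles) {
            // Files.lines reads lazily, so only a small buffer of the file is on the heap.
            try (Stream<String> lines = Files.lines(file, StandardCharsets.ISO_8859_1)) {
                lines.forEach(lineHandler);
            }
        }
        return csvFiles.size();
    }

    public static void main(String[] args) throws IOException {
        // Directory to scan; defaults to the current directory if none is given.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        long[] lineCount = {0};
        int files = processAll(dir, line -> lineCount[0]++);
        System.out.println("Read " + lineCount[0] + " lines from " + files + " files");
    }
}
```

If the complete contents of fileA and fileB really must stay in memory, as the maps in ReadFile suggest, streaming alone will not remove that cost. In that case, raising the maximum heap size with the JVM option -Xmx may also be needed.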

