Efficient code for saving data to an ArrayList

Problem description

My console prints the year tags from an XML file like this:

2020
2019
1997
2017
2019
2017 (...)

From this data, I want to save each distinct year in an ArrayList, for example:

Years found on file: 2020 , 2019 , 1997 , 2017

I have tried many things, but none of them seem to work. I am trying to find a solution with the following code:

public class Publications {
    public static void main(String[] args) throws IOException {
        File file = new File("dblp-2020-04-01.xml");
        FileInputStream fileStream = new FileInputStream(file);
        InputStreamReader input = new InputStreamReader(fileStream);
        BufferedReader reader = new BufferedReader(input);
        String line;
        ArrayList<String> publicationsList = new ArrayList<String>();
        int i = 0;
        while ((line = reader.readLine()) != null) {
            Publications publ = new Publications();
            Pattern pattern = Pattern.compile("<year>(.+?)</year>", Pattern.DOTALL);
            Matcher matcher = pattern.matcher(line);
            if (matcher.find()) {
                String year = matcher.group(1);
                if (publicationsList.size() == 0) {
                    publicationsList.add(year);
                }else{
                    for(String publications1 : publicationsList){
                        if(!(publications1.contains(year))){
                            publicationsList.add(year);
                        }
                    }
                }
            }
        }
        //READING TEST
        for (String publications1 : publicationsList){
            System.out.println(publications1);
        }
    }
}

Error:

Exception in thread "main" java.util.ConcurrentModificationException
    at java.base/java.util.ArrayList$Itr.checkForComodification(ArrayList.java:1042)
    at java.base/java.util.ArrayList$Itr.next(ArrayList.java:996)
    at Publications.main(Publications.java:26)

Tags: java, xml, string, arraylist

Solution


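The exception is thrown because the for-each loop adds to publicationsList while it is still iterating over that same list. Replace the ArrayList with a LinkedHashSet: duplicates are then ignored automatically, and the insertion order of the values is still preserved.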
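To illustrate that behavior in isolation, here is a minimal sketch using the sample years from the question. The class name DistinctYearsDemo and the helper distinctInOrder are made up for this example and do not appear in the original code.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class DistinctYearsDemo {
    // Hypothetical helper: keeps the first occurrence of each value, in first-seen order.
    public static Set<String> distinctInOrder(List<String> values) {
        return new LinkedHashSet<>(values);
    }

    public static void main(String[] args) {
        // The years printed by the console in the question, duplicates included.
        List<String> years = Arrays.asList("2020", "2019", "1997", "2017", "2019", "2017");
        // Prints: Years found on file: 2020 , 2019 , 1997 , 2017
        System.out.println("Years found on file: " + String.join(" , ", distinctInOrder(years)));
    }
}
```

Set.add simply returns false for a value that is already present, so no manual loop over the collection is needed and nothing is modified during iteration.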

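Also, this is 2020, so you should be using the NIO.2 API and the try-with-resources statement, both of which were added in Java 7 back in 2011. That also fixes the problem that you never close the file stream.

Your code should look like this: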

Set<String> publicationYears = new LinkedHashSet<>();
try (BufferedReader reader = Files.newBufferedReader(Paths.get("dblp-2020-04-01.xml"))) {
    Pattern pattern = Pattern.compile("<year>(.+?)</year>", Pattern.DOTALL);
    for (String line; (line = reader.readLine()) != null; ) {
        Matcher matcher = pattern.matcher(line);
        if (matcher.find()) {
            String year = matcher.group(1);
            publicationYears.add(year);
        }
    }
}
//READING TEST
for (String year : publicationYears){
    System.out.println(year);
}
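If an actual ArrayList is still required at the end, as the question asks, the set can be copied into one once reading is done. The following is only a sketch: the class name RegexYearsDemo, the method distinctYears, and the in-memory sample text are assumptions chosen so the example runs without the real file.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexYearsDemo {
    // Compiled once and reused for every line.
    private static final Pattern YEAR = Pattern.compile("<year>(.+?)</year>");

    // Hypothetical helper: collects distinct years in first-seen order, then copies them to a list.
    public static List<String> distinctYears(BufferedReader reader) throws IOException {
        Set<String> seen = new LinkedHashSet<>();
        for (String line; (line = reader.readLine()) != null; ) {
            Matcher matcher = YEAR.matcher(line);
            if (matcher.find()) {
                seen.add(matcher.group(1));
            }
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample input standing in for the real XML file.
        String sample = "<year>2020</year>\n<year>2019</year>\n<title>x</title>\n<year>2020</year>\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            System.out.println(distinctYears(reader)); // prints [2020, 2019]
        }
    }
}
```

Like the code above it, this only picks up the first year tag on each line, which is one more reason to prefer a real XML parser.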

Of course, since you are reading an XML file, it would be better to use an XML parser, such as StAX:

Set<String> publicationYears = new LinkedHashSet<>();
try (InputStream in = Files.newInputStream(Paths.get("dblp-2020-04-01.xml"))) {
    XMLStreamReader xml = XMLInputFactory.newFactory().createXMLStreamReader(in);
    while (xml.hasNext()) {
        xml.next();
        if (xml.getEventType() == XMLStreamConstants.START_ELEMENT) {
            if (xml.getLocalName().equals("year")) {
                String year = xml.getElementText();
                publicationYears.add(year);
            }
        }
    }
}
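For trying the StAX approach without the real file, here is a self-contained sketch that parses a small in-memory document. The class name StaxYearsDemo, the method distinctYears, and the sample XML (including its dblp and article wrapper elements) are assumptions made for illustration only.

```java
import java.io.StringReader;
import java.util.LinkedHashSet;
import java.util.Set;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxYearsDemo {
    // Hypothetical helper: returns the distinct <year> values in document order.
    public static Set<String> distinctYears(String xmlText) throws XMLStreamException {
        Set<String> years = new LinkedHashSet<>();
        XMLStreamReader xml = XMLInputFactory.newFactory()
                .createXMLStreamReader(new StringReader(xmlText));
        try {
            while (xml.hasNext()) {
                // next() returns the type of the event the reader has just moved to.
                if (xml.next() == XMLStreamConstants.START_ELEMENT
                        && "year".equals(xml.getLocalName())) {
                    years.add(xml.getElementText());
                }
            }
        } finally {
            xml.close(); // releases parser resources; does not close the underlying source
        }
        return years;
    }

    public static void main(String[] args) throws XMLStreamException {
        // Made-up sample document; the real file is far larger.
        String sample = "<dblp>"
                + "<article><year>2020</year></article>"
                + "<article><year>2019</year></article>"
                + "<article><year>2020</year></article>"
                + "</dblp>";
        System.out.println(distinctYears(sample)); // prints [2020, 2019]
    }
}
```

Unlike the regex version, this does not depend on how the tags are spread across lines.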
