How to find the most frequent word in a text?

Problem description

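I have a problem. If I have input like "thanks thanks thanks car", the output will be "thanks". If a word starts with a capital letter, the program prints it in lowercase. What can I add to my solution to fix this?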

 import java.io.BufferedReader;
 import java.io.IOException;
 import java.io.InputStreamReader;
 import java.util.HashMap;
 import java.util.Map;

 public class Main {
     public static void main(String[] args) throws IOException {
         String line;
         Map<String, Integer> frequency = new HashMap<>();
         BufferedReader reader = new BufferedReader(new InputStreamReader(System.in));
         while ((line = reader.readLine()) != null) {
             line = line.trim();
             if (!line.isEmpty()) {
                 // Split on runs of non-word characters (spaces, punctuation).
                 String[] words = line.split("\\W+");
                 for (String word : words) {
                     if (word.isEmpty()) {
                         continue; // split can yield a leading empty token
                     }
                     // Lowercasing makes the count case-insensitive, but it is
                     // also why the result is printed in lowercase.
                     String processed = word.toLowerCase();

                     if (frequency.containsKey(processed)) {
                         frequency.put(processed, frequency.get(processed) + 1);
                     } else {
                         frequency.put(processed, 1);
                     }
                 }
             }
         }
         int mostFrequentlyUsed = 0;
         String theWord = null;

         for (String word : frequency.keySet()) {
             int theVal = frequency.get(word);
             if (theVal > mostFrequentlyUsed) {
                 mostFrequentlyUsed = theVal;
                 theWord = word;
             } else if (theVal == mostFrequentlyUsed && word.length() < theWord.length()) {
                 // On a tie, prefer the shorter word.
                 theWord = word;
             }
         }
         // println instead of printf: the word must not be treated as a format string.
         System.out.println(theWord);
     }
 }

Tags: java, arrays, string, oop, bufferedreader

Solution


To make the code print the most frequent word as it was typed, rather than in lowercase, change the following line:

String processed = word.toLowerCase();

Change it to:

String processed = word;

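Note, however, that containsKey() compares keys case-sensitively, so after this change "Thanks" and "thanks" are no longer treated as the same word and are counted separately.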
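
If the count should stay case-insensitive while the output keeps the original spelling, one possible approach is to count under a lowercase key and remember the first spelling seen for each key. The following is only a minimal sketch under that assumption: the class name MostFrequentWord, the helper mostFrequent, and the choice to print the first-seen spelling are all hypothetical and not part of the original code. It keeps the question's tie rule (the shorter word wins) and takes the text as a String instead of reading from System.in.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

class MostFrequentWord {
    // Hypothetical helper: returns the most frequent word in the text,
    // compared case-insensitively, spelled as it first appeared.
    // Returns null if the text contains no words.
    static String mostFrequent(String text) {
        Map<String, Integer> counts = new LinkedHashMap<>();   // lowercase key -> count
        Map<String, String> firstSeen = new LinkedHashMap<>(); // lowercase key -> original spelling
        for (String word : text.split("\\W+")) {
            if (word.isEmpty()) {
                continue; // split can yield an empty token
            }
            String key = word.toLowerCase(Locale.ROOT);
            counts.merge(key, 1, Integer::sum);
            firstSeen.putIfAbsent(key, word);
        }

        String best = null;
        int bestCount = 0;
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            String key = entry.getKey();
            int count = entry.getValue();
            // Higher count wins; on a tie, the shorter word wins.
            if (count > bestCount
                    || (count == bestCount && key.length() < best.length())) {
                best = key;
                bestCount = count;
            }
        }
        return best == null ? null : firstSeen.get(best);
    }

    public static void main(String[] args) {
        // prints "Thanks"
        System.out.println(mostFrequent("Thanks thanks THANKS car"));
    }
}
```

With this sketch, "Thanks", "thanks" and "THANKS" share one counter, and the printed form is whichever spelling came first in the input. Printing the most common spelling instead would need an extra per-spelling count.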

