Creating an index from a text file

Problem description

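I have a text file from which I need to build an index. I believe I need a method that updates the line numbers and the word count in my WordCount class, but I'm having trouble working out how to do that. I know the method should be void, since it only updates and does not return a value, but I'm stuck on what to write. I've left a comment in the tester class where I think this method should go. My tester, circular list, and WordCount classes are provided below. Thanks for any help with this.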

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class Tester
{
    public static final int WordsPerLine = 10;

    public static void main(String[] args) throws FileNotFoundException
    {
        //build then output hash table
        HashTable ht = new HashTable();
        System.out.println(ht.toString());

        String word; //read from input file
        WordCount wordToFind;  //search for this in the bst
        WordCount wordInTree;  //found in the bst

        //create generic BST, of WordCount here
        BSTree<WordCount> t = new BSTree<WordCount>();

        //want to read word at a time from input file
        Scanner wordsIn = new Scanner(new File("Hamlet.txt"));
        wordsIn.useDelimiter("[^A-Za-z']+");

        int wordCount = 0;
        int lineNum = 1;
        System.out.printf("%3d:  ", lineNum);
        while (wordsIn.hasNext()) {
            word = wordsIn.next();
            ++wordCount;
            System.out.print(word + " ");
            word = word.toLowerCase();
            
            wordToFind = new WordCount(word);
            wordInTree = t.find(wordToFind);
            if (wordInTree != null) {
                //I need to have a method here that update word count and line number
            }

            if (wordCount % WordsPerLine == 0) {
                ++lineNum;
                System.out.printf("\n%3d:  ", lineNum);
            }
        }
        //EOF
        System.out.println();

        //print bst in alpha order
        System.out.println(t.toString());
    }
}

public class WordCount implements Comparable<WordCount>
{
    protected String word;
    protected int count;
    protected CircularList lineNums;

    //required for class to compile
    public int compareTo(WordCount other)
    {
        return word.compareTo(other.word);
    }

    public WordCount()
    {
        word = "";
        count = 0;
        lineNums= new CircularList();
    }

    public WordCount(String w)
    {
        word = w;
        count = 0;
        lineNums= new CircularList();
    }
    

  

    public String toString()
    {
        //lineNums is a CircularList, not an int, so it needs %s rather than %d
        return String.format("%-12s %3d %s", word, count, lineNums);
    }
}

public class CircularList
{
    private Item list;

    public CircularList()
    {
        list = null;
    }

    public boolean isEmpty()
    {
        return list == null;
    }

    public void append(int x)
    {
        Item r = new Item(x);
        if (isEmpty()) {
            r.next = r;
        }
        else {
            r.next = list.next;
            list.next = r;
        }
        list = r;
    }

    public int nextLine(int x)
    {
        Item r= new Item(x);
        if (!isEmpty()) {
            r = list.next;
            while (r != list) {
                r = r.next;
            }
            //append last item
        }
        return r.info;
    }
    

    public String toString()
    {
        StringBuilder s = new StringBuilder("");

        if (!isEmpty()) {
            Item r = list.next;
            while (r != list) {
                s.append(r.info + ", ");
                r = r.next;
            }
            //append last item
            s.append(r.info);
        }
        return s.toString();
    }
}

Tags: java, data-structures

Solution


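I suggest you focus for now on getting WordCount objects into the BST. Then you will have something that prints out. To put WordCount objects into the BST, I suggest something like this, in pseudocode:

    create a new WordCount object to find: set it to the word, with a count of 1,
        create a new CircularList object for lineNums, and add lineNum to it using CL::append()
    try to find it in the tree
    if the word is found in the tree
        //I need to have a method here that update word count and line number
    else
        add the new WordCount object to the tree

Once some data has been put into the tree, it will be printed by the t.toString() call after the wordsIn.hasNext() while loop.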
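The find-or-add steps above, together with the void update method the question asks about, could look roughly like the sketch below. It rests on several assumptions, because the question shows neither BSTree nor Item: a java.util.TreeMap keyed by the word stands in for BSTree<WordCount>, a List<Integer> stands in for CircularList, and the names IndexSketch, buildIndex, and addOccurrence are hypothetical. Recording each line number only once per word is also a choice made for this sketch, not something the question specifies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;
import java.util.TreeMap;

class IndexSketch {
    // Builds the index. TreeMap is a stand-in for the question's BSTree<WordCount>.
    static TreeMap<String, WordCount> buildIndex(String text, int wordsPerLine) {
        TreeMap<String, WordCount> tree = new TreeMap<>();
        Scanner wordsIn = new Scanner(text);
        wordsIn.useDelimiter("[^A-Za-z']+");

        int wordCount = 0;
        int lineNum = 1;
        while (wordsIn.hasNext()) {
            String word = wordsIn.next().toLowerCase();
            ++wordCount;

            WordCount wordInTree = tree.get(word);   // try to find it in the tree
            if (wordInTree != null) {
                wordInTree.addOccurrence(lineNum);   // found: update in place
            } else {
                tree.put(word, new WordCount(word, lineNum)); // not found: add it
            }

            // Same line-counting rule as the tester: a new "line" every N words.
            if (wordCount % wordsPerLine == 0) {
                ++lineNum;
            }
        }
        return tree;
    }

    public static void main(String[] args) {
        // Entries come back in alphabetical order because TreeMap is sorted.
        for (WordCount wc : buildIndex("To be, or not to be", 3).values()) {
            System.out.println(wc);
        }
    }
}

// Simplified stand-in for the question's WordCount class.
class WordCount {
    final String word;
    int count;
    final List<Integer> lineNums = new ArrayList<>(); // stand-in for CircularList

    // A word is created on its first sighting, so count starts at 1.
    WordCount(String w, int lineNum) {
        word = w;
        count = 1;
        lineNums.add(lineNum);
    }

    // Hypothetical update method: void, because it only mutates this object.
    void addOccurrence(int lineNum) {
        ++count;
        // Line numbers arrive in ascending order, so checking the last one
        // is enough to avoid listing the same line twice.
        if (lineNums.get(lineNums.size() - 1) != lineNum) {
            lineNums.add(lineNum);
        }
    }

    @Override
    public String toString() {
        return String.format("%-12s %3d  %s", word, count, lineNums);
    }
}
```

With the real classes, the same two branches would presumably call t.find(wordToFind) and whatever insert method BSTree provides, and addOccurrence would call lineNums.append(lineNum) on the CircularList instead of List.add.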

