Original question: http://stackoverflow.com/questions/29178258/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
Using HashSet to store a text file and read from it
Asked by guitar138
I've seen a lot of great resources regarding HashSets, but nothing that helps me with this particular problem. I'm taking an algorithms class on generics, and this assignment requires a txt file to be read into the system using Scanner (which is done) and loaded into a HashSet, so that I can query it with user input and find the number of occurrences of a word. I have the method for returning words, and I have most of the HashSet and file reader code done. But I'm completely stuck on how to store the whole txt file as one HashSet. I couldn't get it to work by calling crime.add, and I tried several other things. Am I missing an easier way to implement this method? Thanks
Edit: assignment instructions - Program 1 (70 points) Load a java.util.HashSet with the words from the novel “Crime and Punishment”, by Theodore Dostoevsky (text file available on Blackboard with this assignment). Prompt the user to enter a word and report whether or not that word appears in the novel.
Edit: Ok, I have all of this written and it runs but it is not finding words that are definitely in the txt file, so somewhere I went wrong adding the file into the hashSet. Any ideas? I've tried with array list, different String implementations and I just don't know where to turn. Thanks for any helpful info.
import java.awt.List;
import java.io.File;
import java.io.FileNotFoundException;
import java.util.HashSet;
import java.util.Scanner;
import java.util.Set;

public class CandPHashSet {
    public static void main(String[] args) throws FileNotFoundException {
        Scanner file = new Scanner(new File("crime_and_punishment.txt")).useDelimiter("[?a-zA-Z]+");
        Scanner input = new Scanner(System.in);
        Set<String> crime = new HashSet<String>();
        while (file.hasNext()) {
            String line = file.nextLine();
            //String[] words = line.split("[?a-zA-Z]+");
            for (String word : line.split("[?a-zA-Z]+")) {
                crime.add(line);
            }
        }
        String search;
        System.out.println("Enter a word to search for: ");
        search = input.next();
        if (crime.contains(input)) {
            System.out.println("Yes");
        } else {
            System.out.println("No");
        }
    }
}
Answered by candied_orange
You are posting conflicting requirements.
"find the number of occurrences"
is not the same as
"report whether or not that word appears in the novel."
HashSet works fine for the second one, but not for the first.
Be very careful when reading requirements. Five extra minutes reading them can save you five extra hours writing code.
To follow the instructions, what you need to do is add one word at a time to your hash set. Reading one word at a time already has an answer here
Whenever I'm unsure what container to use I look at this:
Answered by Anderson Vieira
It looks like you don't need to count the word occurrences. You just need to split the input file string into individual words and store them in a HashSet<String>. Then you should use the contains() method to check whether a word given by the user is present in the set.
There are a couple of problems in your code that you should check:
The way you use useDelimiter() on the Scanner is not correct. You probably don't want to specify a delimiter at all, so that the default, whitespace, is used. With whitespace as the scanner delimiter, the input is already split into words, so there is no need to read the file line by line.
You use crime.contains(input) to look for the user-provided word. But input is a Scanner, not a String. You want to use crime.contains(search).
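To see why the delimiter matters, here is a small sketch that tokenizes the same text twice: once with the pattern from the question and once with the default. The class name DelimiterDemo, the helper tokens(), and the sample sentence are made up for illustration; they are not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

public class DelimiterDemo {
    // Collect every token the Scanner yields for the given text.
    // A null pattern means "keep the default whitespace delimiter".
    static List<String> tokens(String text, String delimiterPattern) {
        Scanner scanner = new Scanner(text);
        if (delimiterPattern != null) {
            scanner.useDelimiter(delimiterPattern);
        }
        List<String> result = new ArrayList<>();
        while (scanner.hasNext()) {
            result.add(scanner.next());
        }
        scanner.close();
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample sentence, not a quotation from the novel.
        String sample = "He was in debt, hopelessly.";
        // Pattern from the question: runs of letters are the delimiters.
        System.out.println(tokens(sample, "[?a-zA-Z]+"));
        // Default delimiter: whitespace.
        System.out.println(tokens(sample, null));
    }
}
```

With the pattern from the question, runs of letters are treated as the separators, so none of the resulting tokens contains a letter and no actual word can end up in the set. With the default delimiter the tokens are the whitespace-separated words, with punctuation still attached (for example "debt,").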
The revised code would look somewhat like this:
// Read the file using whitespace as a delimiter (default)
// so that the input will be split into words
Scanner file = new Scanner(new File("crime_and_punishment.txt"));
Set<String> crime = new HashSet<>();

// For each word in the input
while (file.hasNext()) {
    // Convert the word to lower case, trim it and insert into the set
    // In this step, you will probably want to remove punctuation marks
    crime.add(file.next().trim().toLowerCase());
}

System.out.println("Enter a word to search for: ");
Scanner input = new Scanner(System.in);
// Also convert the input to lowercase
String search = input.next().toLowerCase();

// Check if the set contains the search string
if (crime.contains(search)) {
    System.out.println("Yes");
} else {
    System.out.println("No");
}
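The comment about removing punctuation marks could be handled along these lines. This is only one possible sketch, assuming that stripping non-letter characters from both ends of each token is good enough for the assignment; the names WordSetSketch, normalize() and loadWords() are hypothetical, and main() reads from a made-up string so it runs without the text file.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Scanner;
import java.util.Set;

public class WordSetSketch {
    // Lower-case a token and strip non-letter characters from both ends,
    // so "debt," becomes "debt" while the apostrophe in "don't" is kept.
    static String normalize(String token) {
        return token.toLowerCase(Locale.ROOT).replaceAll("^[^a-z]+|[^a-z]+$", "");
    }

    // Read whitespace-separated tokens and collect the normalized words.
    static Set<String> loadWords(Scanner source) {
        Set<String> words = new HashSet<>();
        while (source.hasNext()) {
            String word = normalize(source.next());
            if (!word.isEmpty()) { // skip tokens that were only punctuation
                words.add(word);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // For the real file, pass new Scanner(new File("crime_and_punishment.txt")) instead.
        Set<String> words = loadWords(new Scanner("\"Well!\" he said. Don't go -- please, GO."));
        System.out.println(words.contains(normalize("Well")) ? "Yes" : "No");
    }
}
```

Applying the same normalize() to the user's input keeps the lookup consistent with what was stored. Note that the pattern only recognizes the ASCII letters a to z, which is an assumption about the text.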
Answered by user207421
You can't do that with a HashSet. You will just lose the duplicates. You can count the duplicates as you add them, but then you need somewhere to put the counts.
A Map<String, Integer> is required.
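If the number of occurrences really is what is wanted, a Map<String, Integer> could be filled roughly as follows. This is a minimal sketch rather than the answerer's own code: the names WordCountSketch and countWords() are hypothetical, and main() counts words in a made-up string instead of the novel.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Scanner;

public class WordCountSketch {
    // Count how often each whitespace-separated, lower-cased token occurs.
    static Map<String, Integer> countWords(Scanner source) {
        Map<String, Integer> counts = new HashMap<>();
        while (source.hasNext()) {
            String word = source.next().toLowerCase(Locale.ROOT);
            // Insert 1 for a new word, otherwise add 1 to the stored count.
            counts.merge(word, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample text; for the real file use new Scanner(new File(...)).
        Map<String, Integer> counts = countWords(new Scanner("the axe and the door and the stairs"));
        // getOrDefault returns 0 for words that never occurred.
        System.out.println(counts.getOrDefault("the", 0));
        System.out.println(counts.getOrDefault("window", 0));
    }
}
```

The map also answers the yes/no question from the assignment, since counts.containsKey(word) plays the role of contains() on the set.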