Note: this page is a translated copy of a popular Stack Overflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same license, cite the original URL and author information, and attribute it to the original authors (not me). Original Stack Overflow URL: http://stackoverflow.com/questions/32553371/

Time: 2020-11-02 20:23:10  Source: igfitidea

Read specific data from a .txt file JAVA

Tags: java, file

Asked by Andrei Olar

I have a problem. I'm trying to read a large .txt file, but I don't need every piece of data that's inside.

My .txt file looks something like this:

8000000 abcdefg hijklmn word word letter

I only need, let's say, the number and the first two text fields, "abcdefg" and "hijklmn", and I want to write them to another file. I don't know how to read and write just the data that I need.

Here is my code so far:

    BufferedReader br = new BufferedReader(new FileReader("position2.txt"));
    BufferedWriter bw = new BufferedWriter(new FileWriter("position.txt"));
    String line;

    while ((line = br.readLine())!= null){
        if(line.isEmpty() || line.trim().equals("") || line.trim().equals("\n")){
            continue;
        }else{
            //bw.write(line + "\n");
            String[] data = line.split(" ");
            bw.write(data[0] + " " + data[1] + " " + data[2] + "\n");
        }

    }

    br.close();
    bw.close();

}

Can you give me some suggestions? Thanks in advance.

UPDATE: My .txt files are a bit weird. The code above works great when there is only a single " " between the values. My files can have a \t, several spaces, or a \t and some spaces between the words. How can I proceed now?

Accepted answer by Andreas

Depending on the complexity of your data, you have a few options.

If the lines are simple space-separated values as shown, the simplest approach is to split the text and write the values you want to keep to the new file:

try (BufferedReader br = new BufferedReader(new FileReader("text.txt"));
     BufferedWriter bw = new BufferedWriter(new FileWriter("data.txt"))) {
    String line;
    while ((line = br.readLine()) != null) {
        String[] values = line.split(" ");
        if (values.length >= 3)
            bw.write(values[0] + ' ' + values[1] + ' ' + values[2] + '\n');
    }
}
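
For the case described in the question's update, where fields may be separated by tabs or runs of spaces, one possible adjustment is to trim the line and split on the regular expression `\\s+` instead of a single space. The following is only a sketch and is not part of the original answer: the class name `WhitespaceSplit` and the helper `firstThree` are hypothetical, and the file names are reused from the snippet above.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

// Hypothetical helper class, not from the original answer.
class WhitespaceSplit {

    // Returns the first three whitespace-separated fields joined by single
    // spaces, or null if the line has fewer than three fields.
    static String firstThree(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return null;
        }
        // "\\s+" matches one or more spaces, tabs, or other whitespace characters
        String[] values = trimmed.split("\\s+");
        if (values.length < 3) {
            return null;
        }
        return values[0] + " " + values[1] + " " + values[2];
    }

    public static void main(String[] args) throws IOException {
        // File names are assumptions carried over from the answer's example
        try (BufferedReader br = new BufferedReader(new FileReader("text.txt"));
             BufferedWriter bw = new BufferedWriter(new FileWriter("data.txt"))) {
            String line;
            while ((line = br.readLine()) != null) {
                String fields = firstThree(line);
                if (fields != null) {
                    bw.write(fields);
                    bw.newLine();
                }
            }
        }
    }
}
```

Trimming first avoids an empty leading element when a line starts with whitespace, and lines with fewer than three fields are skipped rather than causing an exception.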

If the values might be more complex, you could use a regular expression:

Pattern p = Pattern.compile("^(\\d+ \\w+ \\w+)");
try (BufferedReader br = new BufferedReader(new FileReader("text.txt"));
     BufferedWriter bw = new BufferedWriter(new FileWriter("data.txt"))) {
    String line;
    while ((line = br.readLine()) != null) {
        Matcher m = p.matcher(line);
        if (m.find())
            bw.write(m.group(1) + '\n');
    }
}

This ensures that the first value is digits only, and the second and third values are word characters only (a-z A-Z _ 0-9).
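
If the separators can be tabs or several spaces, as in the question's update, the same idea can probably be adapted by matching `\\s+` between the fields and capturing each field in its own group. This is a sketch under that assumption, not the original answer's code; the class `RegexExtract` and the method `extract` are hypothetical names.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, not from the original answer.
class RegexExtract {

    // Optional leading whitespace, then a number and two words, each pair
    // separated by one or more whitespace characters (spaces or tabs).
    private static final Pattern P =
            Pattern.compile("^\\s*(\\d+)\\s+(\\w+)\\s+(\\w+)");

    // Returns the three fields joined by single spaces, or null if the
    // line does not start with the expected number-word-word layout.
    static String extract(String line) {
        Matcher m = P.matcher(line);
        if (!m.find()) {
            return null;
        }
        return m.group(1) + " " + m.group(2) + " " + m.group(3);
    }

    public static void main(String[] args) {
        // A tab and a run of spaces between the fields
        System.out.println(extract("8000000\tabcdefg   hijklmn word word letter"));
        // The first field is not a number, so nothing is extracted
        System.out.println(extract("not-a-number abcdefg hijklmn"));
    }
}
```

Capturing the fields separately lets the output use a single space regardless of how the input was delimited.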

Answer by Jorge Z

Assuming all lines of your text file follow the structure you described, you could do this (replace FILE_PATH with your actual file path):

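import java.io.File;
import java.io.FileNotFoundException;
import java.io.PrintWriter;
import java.util.Scanner;

// The class name is arbitrary; it is only here so the snippet compiles.
public class Main {
    public static void main(String[] args) {
        // try-with-resources closes both files even if an exception is thrown
        try (Scanner reader = new Scanner(new File("FILE_PATH/myfile.txt"));
             PrintWriter writer = new PrintWriter(new File("FILE_PATH/myfile2.txt"))) {
            while (reader.hasNextLine()) {
                String line = reader.nextLine();
                String[] tokens = line.split(" ");
                // Skip blank or short lines instead of failing with
                // ArrayIndexOutOfBoundsException
                if (tokens.length >= 3) {
                    writer.println(tokens[0] + ", " + tokens[1] + ", " + tokens[2]);
                }
            }
        } catch (FileNotFoundException ex) {
            System.out.println("Error: " + ex.getMessage());
        }
    }
}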

You'll get something like: word0, word1, word2

Answer by smttsp

If your files are really huge (above 50-100 MB, maybe GBs), and you are sure that the first word is a number and that you need the two words after it, I would suggest reading one line at a time and iterating through that string. Stop when you find the third space.

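// Assumes br is a BufferedReader opened on the input file, as in the question
String str = br.readLine();
int num_spaces = 0, cnt = 0;
// Start with empty strings so that += does not prepend "null"
String[] arr = {"", "", ""};
// Stop at the third space or at the end of the line, whichever comes first
while (num_spaces < 3 && cnt < str.length()) {
    char c = str.charAt(cnt);
    if (c == ' ') {
        num_spaces++;
    }
    else {
        arr[num_spaces] += c;
    }
    cnt++;
}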
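
The same "stop at the third space" idea can also be sketched with `String.indexOf`, which avoids building the fields character by character. This is an illustration only: the class `PrefixScan` and the method `beforeThirdSpace` are hypothetical names, and the code assumes the fields are separated by single spaces, as this answer does.

```java
// Hypothetical helper class, not from the original answer.
class PrefixScan {

    // Returns the part of the line before the third space, or the whole
    // line if it contains fewer than three spaces.
    static String beforeThirdSpace(String line) {
        int pos = -1;
        for (int i = 0; i < 3; i++) {
            // Search for the next space after the previous one
            pos = line.indexOf(' ', pos + 1);
            if (pos < 0) {
                return line;
            }
        }
        return line.substring(0, pos);
    }

    public static void main(String[] args) {
        System.out.println(beforeThirdSpace("8000000 abcdefg hijklmn word word letter"));
    }
}
```

The scan never looks past the third space, so the rest of the line is not examined or split.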

If your data is only a couple of MB, or has a lot of numbers inside, there is no need to worry about iterating char by char. Just read line by line, split each line, and then check the words, as already mentioned.

Answer by QuakeCore

else {
     String[] res = line.split(" ");
     bw.write(res[0] + " " + res[1] + " " + res[2] + "\n"); // the first three words...
}