Java: count characters, words, and lines in a file

Notice: this page reproduces a popular StackOverflow question and its answers under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/18274391/

Time: 2020-08-11 23:56:41  Source: igfitidea

count characters, words and lines in file

Tags: java, count, output

Asked by nazar_art

This should count the number of lines, words, and characters in a file.

But it doesn't work. The output shows only 0.

Code:

public static void main(String[] args) throws IOException {
    int ch;
    boolean prev = true;        
    //counters
    int charsCount = 0;
    int wordsCount = 0;
    int linesCount = 0;

    Scanner in = null;
    File selectedFile = null;
    JFileChooser chooser = new JFileChooser();
    // choose file 
    if (chooser.showOpenDialog(null) == JFileChooser.APPROVE_OPTION) {
        selectedFile = chooser.getSelectedFile();
        in = new Scanner(selectedFile);         
    }

    // count the characters of the file till the end
    while(in.hasNext()) {
        ch = in.next().charAt(0);
        if (ch != ' ') ++charsCount;
        if (!prev && ch == ' ') ++wordsCount;
        // don't count if previous char is space
        if (ch == ' ') 
            prev = true;
        else 
            prev = false;

        if (ch == '\n') ++linesCount;
    }

    //display the count of characters, words, and lines
    charsCount -= linesCount * 2;
    wordsCount += linesCount;
    System.out.println("# of chars: " + charsCount);
    System.out.println("# of words: " + wordsCount);
    System.out.println("# of lines: " + linesCount);

    in.close();
}

I can't understand what's going on. Any suggestions?

Accepted answer by boxed__l

A different approach, using strings to find the line, word, and character counts:

// needs: java.io.File, java.io.IOException, java.util.Scanner, javax.swing.JFileChooser
public static void main(String[] args) throws IOException {
    // counters
    int charsCount = 0;
    int wordsCount = 0;
    int linesCount = 0;

    JFileChooser chooser = new JFileChooser();
    // choose file; stop if the dialog is cancelled
    if (chooser.showOpenDialog(null) != JFileChooser.APPROVE_OPTION) {
        return;
    }
    File selectedFile = chooser.getSelectedFile();
    Scanner in = new Scanner(selectedFile);

    // hasNextLine() also sees blank lines, which hasNext() would skip at the end
    while (in.hasNextLine()) {
        String tmpStr = in.nextLine();
        String trimmed = tmpStr.trim();
        if (!trimmed.isEmpty()) {
            // characters: everything except whitespace
            charsCount += trimmed.replaceAll("\\s+", "").length();
            // words: runs of non-whitespace characters
            wordsCount += trimmed.split("\\s+").length;
        }
        ++linesCount;
    }

    // display the count of characters, words, and lines
    System.out.println("# of chars: " + charsCount);
    System.out.println("# of words: " + wordsCount);
    System.out.println("# of lines: " + linesCount);

    in.close();
}



Note:

For other encodings, use new Scanner(selectedFile, "###"); in place of new Scanner(selectedFile);

### is the name of the character set needed. The original answer links to further reading and a wiki page on character sets.
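A minimal sketch of the encoding note above, assuming a temporary file written as UTF-8 purely for the demo. The class name CharsetScannerDemo and the helper firstLine are hypothetical and not part of the answer; the only API relied on is the Scanner(File, String charsetName) constructor.

```java
import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.Scanner;

// Hypothetical demo class, not part of the original answer.
class CharsetScannerDemo {

    // Reads the first line of a file, decoding it with the named character set.
    static String firstLine(File file, String charsetName) throws IOException {
        try (Scanner in = new Scanner(file, charsetName)) {
            return in.hasNextLine() ? in.nextLine() : "";
        }
    }

    public static void main(String[] args) throws IOException {
        // Assumption: a throwaway file containing two non-ASCII letters, saved as UTF-8.
        File file = File.createTempFile("charset-demo", ".txt");
        file.deleteOnExit();
        Files.write(file.toPath(), "na\u00efve caf\u00e9\n".getBytes(StandardCharsets.UTF_8));

        // Decoded with the matching charset, the line has 10 characters.
        System.out.println(firstLine(file, "UTF-8").length());
    }
}
```

If the charset name does not match the file's real encoding, multi-byte characters may be decoded as several characters each, which would inflate the character count.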

Answer by hthserhs

Your code looks at only the first character of each default token (word) in the file.

When you do ch = in.next().charAt(0), it gets you the first character of a token (word), and the scanner moves forward to the next token (skipping the rest of that token).
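A small sketch of that behavior, assuming an in-memory string instead of a chosen file. The class FirstCharDemo and the method firstCharsOfTokens are hypothetical names used only for illustration; the loop body is the same call the question uses.

```java
import java.util.Scanner;

// Hypothetical demo class: shows which characters the question's loop ever examines.
class FirstCharDemo {

    // Collects the first character of each whitespace-delimited token,
    // exactly what ch = in.next().charAt(0) yields on every iteration.
    static String firstCharsOfTokens(String text) {
        StringBuilder seen = new StringBuilder();
        Scanner in = new Scanner(text);
        while (in.hasNext()) {
            seen.append(in.next().charAt(0));
        }
        in.close();
        return seen.toString();
    }

    public static void main(String[] args) {
        // Prints "ttghah": one character per word, never ' ' or '\n'.
        System.out.println(firstCharsOfTokens("test text goes here\nand here\n"));
    }
}
```

Since ch is never a space or a newline, the conditions that increment wordsCount and linesCount in the question are never true, so both stay at 0, and charsCount only grows by one per token.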

Answer by Josh M

You could store every line in a List<String> named lines, and then linesCount = lines.size().

Calculating charsCount:

for(final String line : lines)
    charsCount += line.length();

Calculating wordsCount:

for(final String line : lines)
    wordsCount += line.split(" +").length;

It would probably be wise to combine these calculations in one loop rather than doing them separately.
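One way the combined loop might look, as a sketch rather than the answer's own code. The class name CombinedCount and the int[] return shape are assumptions for illustration. The guard for empty lines is also an addition: "".split(" +") returns a one-element array, so without it every blank line would count as a word.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of the original answer.
class CombinedCount {

    // Returns {lines, words, chars}, computed in a single pass over the lines.
    static int[] count(List<String> lines) {
        int wordsCount = 0;
        int charsCount = 0;
        for (final String line : lines) {
            charsCount += line.length();      // includes spaces, as in the answer
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {         // assumption: blank lines contain no words
                wordsCount += trimmed.split(" +").length;
            }
        }
        return new int[] { lines.size(), wordsCount, charsCount };
    }

    public static void main(String[] args) {
        int[] c = count(Arrays.asList("test text goes here", "and here"));
        // Prints "2 lines, 6 words, 27 chars"
        System.out.println(c[0] + " lines, " + c[1] + " words, " + c[2] + " chars");
    }
}
```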

Answer by Jean Logeart

Use Scanner methods:

int lines = 0;
int words = 0;
int chars = 0;
while(in.hasNextLine()) {
    lines++;
    Scanner lineScanner = new Scanner(in.nextLine());
    lineScanner.useDelimiter(" ");
    while(lineScanner.hasNext()) {
        words++;
        chars += lineScanner.next().length();
    }
}

Answer by JNL

It looks like everyone is suggesting an alternative.

The flaw in your logic is that you are not looping through all the characters of a line. Since in.next() returns a whole token, you only look at the first character of each token:

 ch = in.next().charAt(0);

Also, what does the 2 in charsCount -= linesCount * 2; represent?

You might also want to include a try-catch block when accessing a file.

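try {
    in = new Scanner(selectedFile);
} catch (FileNotFoundException e) {
    // report the problem instead of silently ignoring it
    System.err.println("File not found: " + selectedFile);
}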

Answer by Michael McGarrah

You have a couple of issues here.

First, the test for the end of a line is going to cause problems, since the end of a line is often not denoted by a single character. Read http://en.wikipedia.org/wiki/End-of-line for more detail on this issue.

Second, the whitespace character between words can be more than just the ASCII 32 (space) value. Consider tabs as one case. You more than likely want to check Character.isWhitespace().

You could also solve the end-of-line issue with the two scanners described in "How to check the end of line using Scanner?"
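To illustrate the two points about line endings and whitespace, here is a character-by-character sketch. It assumes the text is already in memory as a String, and the class name WhitespaceCount is hypothetical. It treats "\n", "\r\n", and a lone "\r" as one line end each, and uses Character.isWhitespace() so tabs separate words too.

```java
// Hypothetical helper class, shown only to illustrate the two issues above.
class WhitespaceCount {

    // Returns {lines, words, nonWhitespaceChars} for the given text.
    static int[] count(String text) {
        int lines = 0, words = 0, chars = 0;
        boolean inWord = false;
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (Character.isWhitespace(ch)) {
                inWord = false;
                // The '\r' of a "\r\n" pair is skipped so the pair counts once.
                boolean crBeforeLf = ch == '\r'
                        && i + 1 < text.length() && text.charAt(i + 1) == '\n';
                if ((ch == '\n' || ch == '\r') && !crBeforeLf) {
                    lines++;
                }
            } else {
                chars++;
                if (!inWord) {   // first character of a new word
                    words++;
                    inWord = true;
                }
            }
        }
        // A final line without a terminator still counts as a line.
        if (text.length() > 0) {
            char last = text.charAt(text.length() - 1);
            if (last != '\n' && last != '\r') {
                lines++;
            }
        }
        return new int[] { lines, words, chars };
    }

    public static void main(String[] args) {
        int[] c = count("test text goes here\r\nand\there\r\n");
        // Prints "2 6 23"
        System.out.println(c[0] + " " + c[1] + " " + c[2]);
    }
}
```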

Here is a quick hack on the code you provided, along with input and output.

import java.io.*;
import java.util.Scanner;
import javax.swing.JFileChooser;

public final class TextApp {

    public static void main(String[] args) throws IOException {
        // counters
        int charsCount = 0;
        int wordsCount = 0;
        int linesCount = 0;

        Scanner fileScanner = null;
        File selectedFile = null;
        JFileChooser chooser = new JFileChooser();
        // choose file
        if (chooser.showOpenDialog(null) == JFileChooser.APPROVE_OPTION) {
            selectedFile = chooser.getSelectedFile();
            fileScanner = new Scanner(selectedFile);
        }

        while (fileScanner.hasNextLine()) {
            linesCount++;
            String line = fileScanner.nextLine();
            Scanner lineScanner = new Scanner(line);
            // count the words on this line and the characters in each word
            while (lineScanner.hasNext()) {
                wordsCount++;
                String word = lineScanner.next();
                charsCount += word.length();
            }
            lineScanner.close();
        }

        // display the count of characters, words, and lines
        System.out.println("# of chars: " + charsCount);
        System.out.println("# of words: " + wordsCount);
        System.out.println("# of lines: " + linesCount);

        fileScanner.close();
    }
}

Here is the test file input:

$ cat ../test.txt 
test text goes here
and here

Here is the output:

$ javac TextApp.java
$ java TextApp 
# of chars: 23
# of words: 6
# of lines: 2
$ wc test.txt 
 2  6 29 test.txt

The difference between the two character counts (23 versus 29 from wc) comes from not counting whitespace characters, which appears to be what you were trying to do in the original code.
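The arithmetic can be checked with a short sketch, assuming the test file is plain ASCII with a single "\n" after each line (which matches the 29 reported by wc). The class name CharCountDifference is hypothetical.

```java
// Hypothetical demo class: reproduces the 29 vs. 23 difference for the test input.
class CharCountDifference {

    // Every character, including spaces and line terminators.
    static int total(String text) {
        return text.length();
    }

    // Only the characters that are not whitespace.
    static int nonWhitespace(String text) {
        return text.replaceAll("\\s+", "").length();
    }

    public static void main(String[] args) {
        // Assumed contents of test.txt: two lines, each ending in '\n'.
        String text = "test text goes here\nand here\n";
        System.out.println("all chars:      " + total(text));          // 29
        System.out.println("non-whitespace: " + nonWhitespace(text));  // 23
    }
}
```

The 6 missing characters are the 4 spaces between words plus the 2 newlines.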

I hope that helps out.

Answer by Shell Scott

Maybe my code will help you. Everything works correctly.

import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.Scanner;
import java.util.StringTokenizer;

public class LineWordChar {
    public static void main(String[] args) throws IOException {
        // Read the whole text file into one string
        // (next() throws NoSuchElementException if the file is empty)
        Scanner scanner = new Scanner(new File("way to your file"), "UTF-8");
        String text = scanner.useDelimiter("\\A").next();
        scanner.close();

        int linesi = 0;
        int words = 0;
        int chars = 0;

        // Add 1 to linesi for every line that readLine() returns
        BufferedReader bf = new BufferedReader(new FileReader("way to your file"));
        while (bf.readLine() != null) {
            linesi++;
        }
        bf.close();

        // The tokenizer splits the big string "text" into words, which we count
        StringTokenizer st = new StringTokenizer(text);
        while (st.hasMoreTokens()) {
            String s = st.nextToken();
            words++;
            // Add the number of characters in this word
            chars += s.length();
        }
        System.out.println("Number of lines: " + linesi);
        System.out.println("Number of words: " + words);
        System.out.print("Number of chars: " + chars);
    }
}

Answer by Nandkishor Periwal

import java.io.File;
import java.io.FileNotFoundException;
import java.util.ArrayList;
import java.util.Scanner;
import java.util.StringTokenizer;

public class WordCount {

    /**
     * Prints the word count, character count and sentence count
     * of a single line of file.txt.
     *
     * @throws FileNotFoundException if file.txt does not exist
     */
    public static void main(String[] args) throws FileNotFoundException {
        int lineNumber = 2; // the line to inspect, as you want
        ArrayList<Integer> list = new ArrayList<Integer>();

        File f = new File("file.txt");
        Scanner sc = new Scanner(f);
        int totalLines = 0;
        int totalWords = 0;
        int totalChars = 0;
        int totalSentences = 0;
        while (sc.hasNextLine()) {
            totalLines++;
            if (totalLines == lineNumber) {
                String line = sc.nextLine();
                totalChars += line.length();
                totalWords += new StringTokenizer(line, " ,").countTokens(); // line.split("\\s").length;
                totalSentences += line.split("\\.").length;
                break;
            }
            // skip lines before the requested one
            sc.nextLine();
        }
        sc.close();

        list.add(totalChars);
        list.add(totalWords);
        list.add(totalSentences);
        System.out.println(lineNumber + ";" + totalWords + ";" + totalChars + ";" + totalSentences);
    }
}