Iterating over each line of a string in PHP

Question

Asked by Topher Fangio

I have a form that allows the user to either upload a text file or copy/paste the contents of the file into a textarea. I can easily differentiate between the two and put whichever one they entered into a string variable, but where do I go from there?

I need to iterate over each line of the string (preferably not worrying about newlines on different machines), make sure that it has exactly one token (no spaces, tabs, commas, etc.), sanitize the data, then generate an SQL query based off of all of the lines.

I'm a fairly good programmer, so I know the general idea about how to do it, but it's been so long since I worked with PHP that I feel I am searching for the wrong things and thus coming up with useless information. The key problem I'm having is that I want to read the contents of the string line-by-line. If it were a file, it would be easy.

I'm mostly looking for useful PHP functions, not an algorithm for how to do it. Any suggestions?

Answer 1

Answered by Kyril

Use preg_split on the variable containing the text, and iterate over the returned array:

foreach(preg_split("/((\r?\n)|(\r\n?))/", $subject) as $line){
    // do stuff with $line
}
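
As a rough sketch of how this split could feed the "exactly one token per line" check from the question: the helper name singleTokenLines and the sample input are assumptions made for illustration, and the rule used here (reject any line containing whitespace or a comma) is only one possible reading of "one token".

```php
<?php
// Hypothetical helper: returns the lines that consist of exactly one token.
function singleTokenLines(string $subject): array
{
    $tokens = [];
    // Split on CRLF, lone CR, or lone LF.
    foreach (preg_split('/\r\n|\r|\n/', $subject) as $line) {
        $line = trim($line);
        if ($line === '') {
            continue; // skip blank lines
        }
        if (preg_match('/[\s,]/', $line)) {
            continue; // more than one token: contains whitespace or a comma
        }
        $tokens[] = $line;
    }
    return $tokens;
}

$tokens = singleTokenLines("alpha\r\nbeta gamma\n\ndelta\rep,silon");
print_r($tokens); // alpha, delta
```

Whether invalid lines should be skipped silently or reported back to the user depends on the form; the sketch simply drops them.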

Answer 2

Answered by Erwin Wessels

I would like to propose a significantly faster (and more memory-efficient) alternative: strtok rather than preg_split.

$separator = "\r\n";
$line = strtok($subject, $separator);

while ($line !== false) {
    # do something with $line
    $line = strtok( $separator );
}

Testing the performance, I iterated 100 times over a test file with 17 thousand lines: preg_split took 27.7 seconds, whereas strtok took 1.4 seconds.

Note that although $separator is defined as "\r\n", strtok will split on either character and, as of PHP 4.1.0, skip empty lines/tokens.

See the strtok manual entry: http://php.net/strtok
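
To make the empty-line behavior concrete, here is a small sketch comparing strtok with a regex split on the same input. The sample string and the variable names are made up for illustration.

```php
<?php
$subject = "first\r\n\r\nsecond\n\n\nthird";

// strtok treats "\r" and "\n" as individual delimiter characters
// and never returns an empty token.
$viaStrtok = [];
$line = strtok($subject, "\r\n");
while ($line !== false) {
    $viaStrtok[] = $line;
    $line = strtok("\r\n");
}

// A regex split on the same input keeps the blank lines.
$viaPregSplit = preg_split('/\r\n|\r|\n/', $subject);

print_r($viaStrtok);              // first, second, third
echo count($viaPregSplit), "\n";  // 6
```

If blank lines carry meaning in the input (for example as record separators), strtok is therefore not a drop-in replacement for preg_split.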

Answer 3

Answered by FerCa

If you need to handle newlines in different systems, you can use the PHP predefined constant PHP_EOL (http://php.net/manual/en/reserved.constants.php) together with explode to avoid the overhead of the regular expression engine.

$lines = explode(PHP_EOL, $subject);
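
One caveat worth keeping in mind: PHP_EOL is the newline sequence of the platform PHP itself runs on, so text pasted from a browser or uploaded from another operating system may use a different one. A minimal sketch that still avoids the regex engine, assuming it is acceptable to normalize the input first (splitLines is a hypothetical helper name):

```php
<?php
// Hypothetical helper: normalize CRLF and lone CR to LF, then split on LF.
function splitLines(string $subject): array
{
    // Order matters: replace "\r\n" before "\r" so CRLF becomes a single LF.
    $normalized = str_replace(["\r\n", "\r"], "\n", $subject);
    return explode("\n", $normalized);
}

$lines = splitLines("one\r\ntwo\rthree\nfour");
print_r($lines); // one, two, three, four
```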

Answer 4

Answered by pguardiario

It's overly-complicated and ugly but in my opinion this is the way to go:

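$fp = fopen("php://memory", 'r+');
fputs($fp, $data);
rewind($fp);
// Compare against false explicitly so a final line such as "0" is not skipped
while (($line = fgets($fp)) !== false) {
  // deal with $line (it still includes its trailing newline)
}
fclose($fp);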

Answer 5

Answered by CodeAngry

foreach(preg_split('~[\r\n]+~', $text) as $line){
    if(empty($line) or ctype_space($line)) continue; // skip only spaces
    // if(!strlen($line = trim($line))) continue; // or trim by force and skip empty
    // $line is trimmed and nice here so use it
}

^ This is how you break lines properly, in a cross-platform way, with a regular expression :)

Answer 6

Answered by Absolute?ER?

Potential memory issues with `strtok`:

One of the suggested solutions uses strtok but does not point out a potential memory issue (even though it claims to be memory efficient). According to the manual for strtok:

Note that only the first call to strtok uses the string argument. Every subsequent call to strtok only needs the token to use, as it keeps track of where it is in the current string.

It does this by loading the file into memory. If you're using large files, you need to flush them once you're done looping through the file.

<?php
function process($str) {
    $line = strtok($str, PHP_EOL);

    while ($line !== false) {
        /* do something with $line here... */

        // get the next line; strtok returns false when none are left
        $line = strtok(PHP_EOL);
    }

    // the bit that frees up memory
    strtok('', '');
}

If you're only concerned with physical files (e.g. data mining):

According to the manual, for the file upload part you can use the file function:

 //Create the array
 $lines = file( $some_file );

 foreach ( $lines as $line ) {
   //do something here.
 }
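
By default each element returned by file keeps its trailing newline. The sketch below uses the FILE_IGNORE_NEW_LINES and FILE_SKIP_EMPTY_LINES flags to get clean, non-empty lines; the temporary file and its contents are assumptions made only so the example is self-contained.

```php
<?php
// Create a throwaway file to read back (illustrative data only).
$path = tempnam(sys_get_temp_dir(), 'lines');
file_put_contents($path, "alpha\n\nbeta\ngamma\n");

// Strip the trailing newline from each element and drop empty lines.
$lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
unlink($path);

print_r($lines); // alpha, beta, gamma
```

Note that file reads the whole file into an array at once, so for very large files a loop over fgets may be the better fit.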

Answer 7

Answered by Joe Kiley

Kyril's answer is best considering you need to be able to handle newlines on different machines.

"I'm mostly looking for useful PHP functions, not an algorithm for how to do it. Any suggestions?"

I use these a lot:

explode() can be used to split a string into an array, given a single delimiter.
implode() is explode's counterpart, to go from an array back to a string.
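
A quick illustration of the pair, using made-up sample data:

```php
<?php
$csv = "alpha,beta,gamma";

// Split on a single, fixed delimiter.
$parts = explode(",", $csv);

// Join the pieces back together with a different separator.
$joined = implode(" | ", $parts);

print_r($parts);     // alpha, beta, gamma
echo $joined, "\n";  // alpha | beta | gamma
```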
