php - How to import a huge CSV file with 200,000 rows into MySQL (asynchronous and fast)?

Declaration: this page is a translation of a popular StackOverflow question and its answers, provided under the CC BY-SA 4.0 license. You are free to use or share it, but you must do so under the same CC BY-SA license and attribute it to the original authors (not me): StackOverflow. Original question: http://stackoverflow.com/questions/32504778/

How to import a huge CSV file with 200,000 rows to MySQL (asynchronous and fast)?

Tags: php, mysql, ajax, csv, asynchronous

Asked by Mead Umandal

I have to write a PHP script that will import data from a given CSV file into a MySQL database. The given CSV file can contain up to 200,000 rows. I tried the following, but problems arose:

  1. LOAD DATA LOCAL INFILE: I cannot use the LOAD DATA LOCAL INFILE statement because I wanted to do some validations first BEFORE uploading the rows; also, our DB admin doesn't want me to use that statement, and I don't know why.
  2. FOR LOOP: Inserting line by line inside a FOR loop takes too much time, resulting in a connection timeout.

Now, I am thinking of a solution: split the CSV file into smaller chunks, then insert them asynchronously. I am already done with the splitting of the CSV, but I currently have no idea how to insert into my database asynchronously in a quick and safe way. I heard that I would be using Ajax here.

Any solution you can recommend? Thanks a lot in advance!

Answered by Mead Umandal

Thanks to everyone who gave answers to this question. I have discovered a solution! Just wanted to share it, in case someone needs to create a PHP script that will import a huge CSV file into a MySQL database (asynchronously and fast!). I have tested my code with 400,000 rows and the import is done in seconds. I believe it would work with larger files; you just have to modify the maximum upload file size.

In this example, I will be importing a CSV file that contains two columns (name, contact_number) into a MySQL table that contains the same columns.

Your CSV file should look like this:

Ana, 0906123489
John, 0908989199
Peter, 0908298392
...

So, here's the solution.

First, create your table:

CREATE TABLE `testdb`.`table_test` (
  `id` INT NOT NULL AUTO_INCREMENT,
  `name` VARCHAR(100) NOT NULL,
  `contact_number` VARCHAR(100) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE = InnoDB;

Second, I have 4 PHP files. All you have to do is place them into a single folder. The PHP files are as follows:

index.php

<form action="upload.php" method="post" enctype="multipart/form-data">
    <input type="file" name="csv" value="" />
    <input type="submit" name="submit" value="Save" />
</form>

connect.php

<?php
//modify your connections here
$servername = "localhost";
$username = "root";
$password = "";
$dbname = "testDB";
$conn = new mysqli($servername, $username, $password, $dbname);
if ($conn->connect_error) {
    die("Connection failed: " . $conn->connect_error);
} 
?>

senddata.php

<?php
include('connect.php');
//path of the chunk file created by upload.php
$data = $_POST['file'];
$handle = fopen($data, "r");
if ($handle) {
    //instead of executing queries one by one,
    //prepare 1 SQL query that will insert all values from the batch
    $sql = "INSERT INTO table_test(name,contact_number) VALUES ";
    while (($line = fgets($handle)) !== false) {
        $sql .= "($line),";
    }
    //drop the trailing comma before executing
    $sql = rtrim($sql, ",");
    if ($conn->query($sql) !== TRUE) {
        //handle the error here if needed ($conn->error)
    }
    fclose($handle);
}
//unlink the CSV chunk once it is imported, to clear the directory
unlink($data);
?>
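
One caveat worth noting: senddata.php splices each CSV line verbatim into the SQL string, so it only works because upload.php writes every line pre-quoted ('$min', '$points'), and a value containing a quote would still break the query. As a minimal, hedged sketch (assuming the chunk files instead contained plain, unquoted name,number pairs), each value could be escaped before building the batch:

<?php
include('connect.php');
$data = $_POST['file'];
//this sketch assumes the chunk file holds plain CSV (name,number per line),
//not the pre-quoted lines written by upload.php above
if (($handle = fopen($data, "r")) !== FALSE) {
    $values = array();
    while (($row = fgetcsv($handle)) !== FALSE) {
        //escape each field so quotes inside the data cannot break the query
        $name = $conn->real_escape_string(trim($row[0]));
        $number = $conn->real_escape_string(trim($row[1]));
        $values[] = "('$name', '$number')";
    }
    fclose($handle);
    if (!empty($values)) {
        $conn->query("INSERT INTO table_test(name,contact_number) VALUES " . implode(",", $values));
    }
}
unlink($data);
?>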

upload.php

<script src="https://cdnjs.cloudflare.com/ajax/libs/jquery/1.11.1/jquery.js"></script>
<script>
//declaration of the function that will insert data into the database
function senddata(filename){
    var file = filename;
    $.ajax({
        type: "POST",
        url: "senddata.php",
        data: {file},
        async: true,
        success: function(html){
            $("#result").html(html);
        }
    });
}
</script>
<?php
$csv = array();
$batchsize = 1000; //split the huge CSV file by 1,000 rows; you can modify this based on your needs
if($_FILES['csv']['error'] == 0){
    $name = $_FILES['csv']['name'];
    //end() requires a variable, so store the result of explode() first
    $parts = explode('.', $_FILES['csv']['name']);
    $ext = strtolower(end($parts));
    $tmpName = $_FILES['csv']['tmp_name'];
    if($ext === 'csv'){ //check if the uploaded file is of CSV format
        if(($handle = fopen($tmpName, 'r')) !== FALSE) {
            set_time_limit(0);
            $row = 0;
            while(($data = fgetcsv($handle)) !== FALSE) {
                //splitting of the CSV file: start a new chunk file every $batchsize rows
                if ($row % $batchsize == 0):
                    $file = fopen("minpoints$row.csv","w");
                endif;
                $csv[$row]['col1'] = $data[0];
                $csv[$row]['col2'] = $data[1];
                $min = $data[0];
                $points = $data[1];
                $json = "'$min', '$points'";
                fwrite($file, $json.PHP_EOL);
                //sending the split CSV files, batch by batch...
                if ($row % $batchsize == 0):
                    echo "<script> senddata('minpoints$row.csv'); </script>";
                endif;
                $row++;
            }
            fclose($file);
            fclose($handle);
        }
    }
    else
    {
        echo "Only CSV files are allowed.";
    }
    //alert once done
    echo "<script> alert('CSV imported!') </script>";
}
?>

That's it! You now have a pure PHP script that can import a huge number of rows in seconds! :) (Thanks to my partner who taught me and gave me the idea of how to use Ajax.)

Answered by flaschenpost

The main slowness comes from sending every single line as its own request. I would suggest sending the query with every 1000 or 500 rows, in the same format used by mysqldump --opt, so build a long string in the way:

 insert into datatable (name, prename, comment)
   values ('wurst', 'hans', 'someone')
   , ('bush', 'george', 'otherone')
   , ...
   ;

You should check how long your queries are allowed to be, or, if the MySQL server is in your control, you could extend the maximal query length.
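
The server-side limit involved here is max_allowed_packet, which caps the size of a single statement; you can inspect it, and raise it if you administer the server (the 64 MB value below is just an example):

SHOW VARIABLES LIKE 'max_allowed_packet';
-- raise it for new connections (needs admin rights);
-- to make it permanent, set it in my.cnf under [mysqld]
SET GLOBAL max_allowed_packet = 67108864;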

If this still takes too long (I mean, 200K rows is not much at all), then you could try to improve the CSV reading.

Splitting into those chunks is a bit of work, but you could write a small chunk class for this, so adding the rows gets a bit easier.

The usage of this class could look like:

$chunk->prepare("insert into datatable (name, prename, comment) values");
$chunk->setSize(1000);

foreach ($row...){
   if($query = $chunk->addRow(...)){
       callUpdate($query);
   }
}
if($query = $chunk->clear()){
  callUpdate($query);
}
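
The chunk class itself is not shown in the answer; here is a minimal sketch of what it could look like, with the method names taken from the usage above and the internals assumed (values are expected to be already escaped):

<?php
//hypothetical implementation of the chunk class used above
class Chunk
{
    private $prefix = '';
    private $size = 1000;
    private $rows = array();

    public function prepare($prefix) { $this->prefix = $prefix; }
    public function setSize($size) { $this->size = $size; }

    //buffers one row; returns a complete INSERT statement
    //once $size rows are collected, otherwise null
    public function addRow(array $values)
    {
        $this->rows[] = "('" . implode("', '", $values) . "')";
        return (count($this->rows) >= $this->size) ? $this->clear() : null;
    }

    //flushes the remaining (partial) batch, or returns null if empty
    public function clear()
    {
        if (empty($this->rows)) {
            return null;
        }
        $sql = $this->prefix . ' ' . implode(', ', $this->rows) . ';';
        $this->rows = array();
        return $sql;
    }
}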

Answered by Julio Soares

I would still use LOAD DATA LOCAL INFILE into a temporary table, use MySQL to validate, filter, clean, etc. with all the data in the DB, and then populate the destination table with the ready-to-go records.

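A hedged sketch of that approach in SQL (the staging table, file path, and validation rule are all made-up examples):

-- stage the raw file into a temporary table first
CREATE TEMPORARY TABLE staging_test (
    name VARCHAR(100),
    contact_number VARCHAR(100)
);

LOAD DATA LOCAL INFILE '/path/to/file.csv'
INTO TABLE staging_test
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';

-- validate/clean inside MySQL, then copy only the good rows
INSERT INTO table_test (name, contact_number)
SELECT TRIM(name), TRIM(contact_number)
FROM staging_test
WHERE contact_number REGEXP '^[0-9 ]+$';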

Answered by SatanicGeek

You can use fgetcsv() with PHP.

Here is an example:

// Open the file with PHP (read mode; the original used 'w', which would
// truncate the file)
$oFile = fopen('PATH_TO_FILE', 'r');

// $oLink is your mysqli connection resource
$oLink = mysqli_connect('localhost', 'user', 'password', 'database');

// Browse your csv line per line: fgetcsv() returns one row per call
while (($aRow = fgetcsv($oFile)) !== false) {

    $sReqInsertData = ' INSERT
                        INTO
                            TABLENAME
                        SET
                            FIELD1 = "'.$aRow[0].'",
                            FIELD2 = "'.$aRow[1].'",
                            FIELD3 = "'.$aRow[2].'",
                            FIELD4 = "'.$aRow[3].'",
                            FIELD5 = "'.$aRow[4].'",
                            FIELD6 = "'.$aRow[5].'",
                            FIELD7 = "'.$aRow[6].'",
                            FIELD8 = "'.$aRow[7].'"';

    // Execute your sql with mysqli_query (it needs the connection link)
    mysqli_query($oLink, $sReqInsertData);
}

// Close your file
fclose($oFile);
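
Since this example builds the SQL by string concatenation, a value containing a quote would break the query. A hedged variant of the same loop using a prepared statement (connection details, table, and field names are placeholders, and only two fields are shown for brevity):

<?php
// Placeholder connection; reuse your own mysqli link instead
$oLink = mysqli_connect('localhost', 'user', 'password', 'database');
$oFile = fopen('PATH_TO_FILE', 'r');

// Prepare once, then bind and execute per row (faster and injection-safe)
$oStmt = mysqli_prepare($oLink, 'INSERT INTO TABLENAME (FIELD1, FIELD2) VALUES (?, ?)');
mysqli_stmt_bind_param($oStmt, 'ss', $sField1, $sField2);

while (($aRow = fgetcsv($oFile)) !== false) {
    // bind_param binds by reference, so assigning here is enough
    $sField1 = $aRow[0];
    $sField2 = $aRow[1];
    mysqli_stmt_execute($oStmt);
}

mysqli_stmt_close($oStmt);
fclose($oFile);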