Python: How to insert a pandas DataFrame into a database via MySQLdb?

Question

Asked by Stefan

I can connect to my local MySQL database from Python, and I can create tables, select from them, and insert individual rows.

My question is: can I directly instruct MySQLdb to take an entire dataframe and insert it into an existing table, or do I need to iterate over the rows?

In either case, what would the Python script look like for a very simple table with an ID and two data columns, and a matching dataframe?

Answer 1

Accepted answer by Andy Hayden

Update:

There is now a to_sql method, which is the preferred way to do this, rather than write_frame:

df.to_sql(con=con, name='table_name_for_df', if_exists='replace', flavor='mysql')

Also note: the syntax may change in pandas 0.14...

You can set up the connection with MySQLdb:

from pandas.io import sql
import MySQLdb

con = MySQLdb.connect()  # may need to add some other options to connect

Setting the flavor of write_frame to 'mysql' means you can write to MySQL:

sql.write_frame(df, con=con, name='table_name_for_df', 
                if_exists='replace', flavor='mysql')

The argument if_exists tells pandas what to do if the table already exists:

if_exists: {'fail', 'replace', 'append'}, default 'fail'
     fail: If table exists, do nothing.
     replace: If table exists, drop it, recreate it, and insert data.
     append: If table exists, insert data. Create if does not exist.
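
The three if_exists modes can be tried without a MySQL server. The following is a minimal sketch that uses an in-memory SQLite database as a stand-in (an assumption made only so the example is self-contained); the table name demo, the sample columns, and the row_count helper are hypothetical. Note that in recent pandas releases 'fail' raises a ValueError when the table exists, rather than silently doing nothing.

```python
import sqlite3

import pandas as pd

# In-memory SQLite stands in for MySQL here (assumption, for a runnable demo).
con = sqlite3.connect(":memory:")
df = pd.DataFrame({"id": [1, 2, 3], "a": [0.1, 0.2, 0.3], "b": [1.0, 2.0, 3.0]})


def row_count(connection, table):
    """Hypothetical helper: number of rows currently in `table`."""
    return connection.execute("SELECT COUNT(*) FROM {}".format(table)).fetchone()[0]


# Table does not exist yet, so the default mode ('fail') simply creates it.
df.to_sql(name="demo", con=con, index=False)
count_created = row_count(con, "demo")

# 'append' adds the rows to the existing table.
df.to_sql(name="demo", con=con, index=False, if_exists="append")
count_appended = row_count(con, "demo")

# 'replace' drops the table, recreates it, and inserts the rows.
df.to_sql(name="demo", con=con, index=False, if_exists="replace")
count_replaced = row_count(con, "demo")

# 'fail' on an existing table raises ValueError in recent pandas versions.
try:
    df.to_sql(name="demo", con=con, index=False, if_exists="fail")
    fail_raised = False
except ValueError:
    fail_raised = True
```

Passing index=False keeps the DataFrame index out of the table, which is usually what you want when the frame already carries its own id column.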

Although the write_frame docs currently suggest it only works with sqlite, MySQL appears to be supported, and in fact there is quite a bit of MySQL testing in the codebase.

Answer 2

Answer by waitingkuo

You might output your DataFrame as a CSV file and then use mysqlimport to import the CSV into your MySQL database.
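
A minimal sketch of the CSV route, assuming the target table already exists in MySQL with columns in the same order as the DataFrame. The table name, sample data, and database name db_name are hypothetical. mysqlimport derives the table name from the file name, so the file is named after the table.

```python
import os
import tempfile

import pandas as pd

df = pd.DataFrame({"id": [1, 2], "feature1": [0.5, 0.25], "feature2": [1.5, 2.5]})

# mysqlimport strips the extension and uses the rest as the table name.
table_name = "table_name_for_df"  # hypothetical table name
csv_path = os.path.join(tempfile.mkdtemp(), table_name + ".csv")

# No header row and no index column: the file should contain data rows only.
df.to_csv(csv_path, index=False, header=False)

with open(csv_path) as fh:
    csv_text = fh.read()

# Then, from a shell (db_name is a placeholder for your database):
#   mysqlimport --local --fields-terminated-by=, db_name /path/to/table_name_for_df.csv
```

This avoids pandas' SQL layer entirely, at the cost of an intermediate file and a table that must be created beforehand.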

EDIT

It seems pandas's built-in sql util provides a write_frame function, but it only works with sqlite.

I found something useful; you might try this

Answer 3

Answer by Alex_L

The to_sql method works for me.

However, keep in mind that it looks like the 'mysql' flavor is going to be deprecated in favor of SQLAlchemy:

FutureWarning: The 'mysql' flavor with DBAPI connection is deprecated and will be removed in future versions. MySQL will be further supported with SQLAlchemy connectables.
  chunksize=chunksize, dtype=dtype)

Answer 4

Answer by Martin Thoma

Python 2 + 3

Prerequisites

Pandas
MySQL server
sqlalchemy
pymysql: pure python mysql client

Code

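from sqlalchemy import create_engine

# Build the connection URL; user, password and database name are placeholders.
engine = create_engine("mysql+pymysql://{user}:{pw}@localhost/{db}"
                       .format(user="root",
                               pw="your_password",
                               db="pandas"))
# df is an existing pandas DataFrame; 'replace' drops and recreates the table.
df.to_sql(con=engine, name='table_name', if_exists='replace')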
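
One caveat with building the URL by string formatting is that a password containing characters such as @ or / can break the URL. A hedged sketch of an alternative, assuming SQLAlchemy 1.4 or newer, is to let URL.create do the escaping. The password below is a made-up example, and no connection is opened here.

```python
from sqlalchemy.engine.url import URL, make_url

# URL.create escapes special characters in each component.
url = URL.create(
    drivername="mysql+pymysql",
    username="root",
    password="p@ss/word",  # hypothetical password with URL-special characters
    host="localhost",
    database="pandas",
)

# Render with the password visible to inspect the escaping, then parse it back.
rendered = url.render_as_string(hide_password=False)
round_tripped = make_url(rendered)

# With pymysql installed, the URL object can be passed straight to
# create_engine:
#   engine = create_engine(url)
```

The resulting engine would be used with df.to_sql exactly as in the code above.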

Answer 5

Answer by Rafael Valero

You can do it by using pymysql:

For example, suppose you have a MySQL server with the following user, password, host and port, and you want to write to the database 'data_2', whether or not it already exists.

import pymysql
user = 'root'
passw = 'my-secret-pw-for-mysql-12ud'
host =  '172.17.0.2'
port = 3306
database = 'data_2'

If you already have the database created:

conn = pymysql.connect(host=host,
                       port=port,
                       user=user, 
                       passwd=passw,  
                       db=database,
                       charset='utf8')

data.to_sql(name=database, con=conn, if_exists = 'replace', index=False, flavor = 'mysql')

If you have NOT created the database yet (this also works when the database already exists):

conn = pymysql.connect(host=host, port=port, user=user, passwd=passw)

conn.cursor().execute("CREATE DATABASE IF NOT EXISTS {0} ".format(database))
conn = pymysql.connect(host=host,
                       port=port,
                       user=user, 
                       passwd=passw,  
                       db=database,
                       charset='utf8')

data.to_sql(name=database, con=conn, if_exists = 'replace', index=False, flavor = 'mysql')

Answer 6

Answer by Franck Dernoncourt

Andy Hayden mentioned the correct function (to_sql). In this answer, I'll give a complete example, which I tested with Python 3.5 but which should also work with Python 2.7 (and other Python 3.x versions):

First, let's create the dataframe:

# Create dataframe
import pandas as pd
import numpy as np

np.random.seed(0)
number_of_samples = 10
frame = pd.DataFrame({
    'feature1': np.random.random(number_of_samples),
    'feature2': np.random.random(number_of_samples),
    'class':    np.random.binomial(2, 0.1, size=number_of_samples),
    },columns=['feature1','feature2','class'])

print(frame)

Which gives:

   feature1  feature2  class
0  0.548814  0.791725      1
1  0.715189  0.528895      0
2  0.602763  0.568045      0
3  0.544883  0.925597      0
4  0.423655  0.071036      0
5  0.645894  0.087129      0
6  0.437587  0.020218      0
7  0.891773  0.832620      1
8  0.963663  0.778157      0
9  0.383442  0.870012      0

To import this dataframe into a MySQL table:

# Import dataframe into MySQL
import sqlalchemy
database_username = 'ENTER USERNAME'
database_password = 'ENTER USERNAME PASSWORD'
database_ip       = 'ENTER DATABASE IP'
database_name     = 'ENTER DATABASE NAME'
database_connection = sqlalchemy.create_engine('mysql+mysqlconnector://{0}:{1}@{2}/{3}'.
                                               format(database_username, database_password, 
                                                      database_ip, database_name))
frame.to_sql(con=database_connection, name='table_name_for_df', if_exists='replace')

One catch is that MySQLdb doesn't work with Python 3.x. So instead we use mysqlconnector, which may be installed as follows:

pip install mysql-connector==2.1.4  # version avoids Protobuf error

Note that to_sql creates the table as well as the columns if they do not already exist in the database.
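
If the table already exists with a schema you want to keep under your own control, the rows can also be handed to the DB-API executemany call in one go instead of looping over single inserts, which is the other option the question asks about. The sketch below uses an in-memory SQLite database so it runs without a server (an assumption). The table and column names are hypothetical, and with MySQLdb or pymysql the placeholders would be %s rather than ?.

```python
import sqlite3

import pandas as pd

con = sqlite3.connect(":memory:")  # stand-in for a MySQL connection (assumption)

# An existing table with an ID and two data columns, as in the question.
con.execute("CREATE TABLE measurements (id INTEGER PRIMARY KEY, a REAL, b REAL)")

df = pd.DataFrame({"id": [1, 2, 3], "a": [0.1, 0.2, 0.3], "b": [1.0, 2.0, 3.0]})

# Convert each row to a tuple of plain Python values for the driver.
rows = [
    (int(row_id), float(a), float(b))
    for row_id, a, b in df.itertuples(index=False, name=None)
]

# One call inserts every row; sqlite3 uses '?' placeholders, MySQL drivers '%s'.
con.executemany("INSERT INTO measurements (id, a, b) VALUES (?, ?, ?)", rows)
con.commit()

inserted = con.execute("SELECT id, a, b FROM measurements ORDER BY id").fetchall()
```

Compared with to_sql, this leaves table creation and column types entirely to you, so an explicit primary key or index survives the load.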

Answer 7

Answer by s.katz

df.to_sql(name="owner", con=db_connection, schema='aws', if_exists='replace', index=True, index_label='id')
