Original question: http://stackoverflow.com/questions/16476413/
Warning: this content is provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must attribute it to the original authors (not me): Stack Overflow
How to insert pandas dataframe via mysqldb into database?
Asked by Stefan
I can connect to my local MySQL database from Python, and I can create tables, select from them, and insert individual rows.
My question is: can I directly instruct MySQLdb to take an entire dataframe and insert it into an existing table, or do I need to iterate over the rows?
In either case, what would the Python script look like for a very simple table with an ID and two data columns, and a matching dataframe?
Accepted answer by Andy Hayden
Update:
There is now a to_sql method, which is the preferred way to do this, rather than write_frame:
df.to_sql(con=con, name='table_name_for_df', if_exists='replace', flavor='mysql')
Also note: the syntax may change in pandas 0.14...
You can set up the connection with MySQLdb:
from pandas.io import sql
import MySQLdb
con = MySQLdb.connect() # may need to add some other options to connect
Setting the flavor of write_frame to 'mysql' means you can write to MySQL:
sql.write_frame(df, con=con, name='table_name_for_df',
                if_exists='replace', flavor='mysql')
The argument if_exists tells pandas what to do if the table already exists:
if_exists: {'fail', 'replace', 'append'}, default 'fail'
- fail: If table exists, do nothing.
- replace: If table exists, drop it, recreate it, and insert data.
- append: If table exists, insert data. Create if does not exist.
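The three if_exists modes can be tried without a MySQL server. The following is only a sketch under stated assumptions: it uses the standard library's sqlite3 connection as a stand-in for MySQL (pandas accepts it directly in to_sql), and the table name demo and the sample columns are made up for illustration. Note that in current pandas releases 'fail' raises a ValueError when the table exists, rather than silently doing nothing as the older write_frame description above says.

```python
import sqlite3

import pandas as pd

# Assumption: an in-memory SQLite database stands in for MySQL so the
# sketch runs anywhere; with MySQL you would pass a SQLAlchemy engine.
con = sqlite3.connect(":memory:")
df = pd.DataFrame({"id": [1, 2], "a": [0.1, 0.2], "b": [10, 20]})

# First write: the table does not exist yet, so it is created.
df.to_sql(name="demo", con=con, index=False)

# 'fail' (the default): current pandas raises ValueError if the table exists.
try:
    df.to_sql(name="demo", con=con, index=False, if_exists="fail")
    fail_raised = False
except ValueError:
    fail_raised = True

# 'append': rows are added to the existing table.
df.to_sql(name="demo", con=con, index=False, if_exists="append")
rows_after_append = con.execute("SELECT COUNT(*) FROM demo").fetchone()[0]

# 'replace': the table is dropped, recreated, and filled again.
df.to_sql(name="demo", con=con, index=False, if_exists="replace")
rows_after_replace = con.execute("SELECT COUNT(*) FROM demo").fetchone()[0]

print(fail_raised, rows_after_append, rows_after_replace)
```

For an existing table with a fixed schema, 'append' is usually the mode you want, since 'replace' drops the table along with any keys or indexes defined on it.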
Although the write_frame docs currently suggest it only works on sqlite, MySQL appears to be supported, and in fact there is quite a bit of MySQL testing in the codebase.
Answer by waitingkuo
You might output your DataFrame as a CSV file and then use mysqlimport to import the CSV into MySQL.
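A rough sketch of the CSV route, under some assumptions: the dataframe contents, the temporary directory, and the database name in the comment are placeholders, and the file is named after the target table because mysqlimport derives the table name from the file name. The mysqlimport command is shown only as a comment, since it needs a running server and suitable credentials.

```python
import os
import tempfile

import pandas as pd

# Hypothetical dataframe matching a table with an ID and two data columns.
df = pd.DataFrame({"id": [1, 2], "a": [0.5, 1.5], "b": [3, 4]})

# mysqlimport uses the file's base name as the table name.
csv_path = os.path.join(tempfile.mkdtemp(), "table_name_for_df.csv")

# Skip the index and the header so the file contains data rows only.
df.to_csv(csv_path, index=False, header=False)

with open(csv_path) as f:
    csv_text = f.read()
print(csv_text)

# Then, from a shell (assumed invocation, adjust to your setup):
#   mysqlimport --local --fields-terminated-by=, database_name /path/to/table_name_for_df.csv
```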
EDIT
It seems pandas's built-in sql util provides a write_frame function, but it only works with sqlite.
I found something useful, you might try this
Answer by Alex_L
The to_sql method works for me.
However, keep in mind that it looks like it's going to be deprecated in favor of SQLAlchemy:
FutureWarning: The 'mysql' flavor with DBAPI connection is deprecated and will be removed in future versions. MySQL will be further supported with SQLAlchemy connectables. chunksize=chunksize, dtype=dtype)
Answer by Martin Thoma
Python 2 + 3
Prerequisites
- Pandas
- MySQL server
- sqlalchemy
- pymysql: pure python mysql client
Code
from pandas.io import sql
from sqlalchemy import create_engine

engine = create_engine("mysql+pymysql://{user}:{pw}@localhost/{db}"
                       .format(user="root",
                               pw="your_password",
                               db="pandas"))
df.to_sql(con=engine, name='table_name', if_exists='replace')
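One thing to watch with a connection URL built by str.format: a password containing characters such as @ or / makes the URL ambiguous. A minimal sketch of one way around this, using the standard library's urllib.parse.quote_plus to percent-encode the credentials. The helper name mysql_url and the sample password are hypothetical, not part of pandas or SQLAlchemy.

```python
from urllib.parse import quote_plus


def mysql_url(user, password, host, database, driver="pymysql"):
    """Build a SQLAlchemy-style MySQL URL with percent-encoded credentials.

    Hypothetical helper for illustration only.
    """
    return "mysql+{driver}://{user}:{pw}@{host}/{db}".format(
        driver=driver,
        user=quote_plus(user),
        pw=quote_plus(password),  # '@' and '/' would otherwise break parsing
        host=host,
        db=database,
    )


url = mysql_url("root", "p@ss/word", "localhost", "pandas")
print(url)
```

The resulting string can be passed to create_engine in place of the formatted literal.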
Answer by Rafael Valero
You can do it by using pymysql:
For example, suppose you have a MySQL server with the following user, password, host and port, and you want to write to the database 'data_2', whether or not it already exists.
import pymysql
user = 'root'
passw = 'my-secret-pw-for-mysql-12ud'
host = '172.17.0.2'
port = 3306
database = 'data_2'
If you already have the database created:
conn = pymysql.connect(host=host,
                       port=port,
                       user=user,
                       passwd=passw,
                       db=database,
                       charset='utf8')
data.to_sql(name=database, con=conn, if_exists='replace', index=False, flavor='mysql')
If you have NOT created the database yet (this also works when the database already exists):
conn = pymysql.connect(host=host, port=port, user=user, passwd=passw)
conn.cursor().execute("CREATE DATABASE IF NOT EXISTS {0} ".format(database))
conn = pymysql.connect(host=host,
                       port=port,
                       user=user,
                       passwd=passw,
                       db=database,
                       charset='utf8')
data.to_sql(name=database, con=conn, if_exists='replace', index=False, flavor='mysql')
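If you only have a raw DB-API connection (as with pymysql or MySQLdb) and want to insert into an existing table without to_sql, the rows can be sent in one batch with executemany instead of a Python-level loop of single inserts. This is a sketch under assumptions: sqlite3 stands in for the MySQL driver so it runs without a server, and the table layout and sample values are invented. With pymysql or MySQLdb the placeholder is %s rather than ?.

```python
import sqlite3

import pandas as pd

# Hypothetical dataframe: an ID plus two data columns.
data = pd.DataFrame({"id": [1, 2, 3],
                     "x": [0.5, 1.5, 2.5],
                     "y": ["a", "b", "c"]})

# Assumption: in-memory SQLite instead of MySQL, so the sketch is runnable.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data_2 (id INTEGER PRIMARY KEY, x REAL, y TEXT)")

# Convert the dataframe to a list of plain tuples, one per row.
rows = list(data.itertuples(index=False, name=None))

# One parameterized statement, executed for every row.
# sqlite3 uses '?' placeholders; pymysql and MySQLdb use '%s'.
conn.executemany("INSERT INTO data_2 (id, x, y) VALUES (?, ?, ?)", rows)
conn.commit()

inserted = conn.execute("SELECT id, x, y FROM data_2 ORDER BY id").fetchall()
print(inserted)
```

Parameterized statements also avoid building SQL strings from data values by hand.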
Answer by Franck Dernoncourt
Andy Hayden mentioned the correct function (to_sql). In this answer, I'll give a complete example, which I tested with Python 3.5 but should also work for Python 2.7 (and Python 3.x):
First, let's create the dataframe:
# Create dataframe
import pandas as pd
import numpy as np
np.random.seed(0)
number_of_samples = 10
frame = pd.DataFrame({
    'feature1': np.random.random(number_of_samples),
    'feature2': np.random.random(number_of_samples),
    'class': np.random.binomial(2, 0.1, size=number_of_samples),
}, columns=['feature1', 'feature2', 'class'])
print(frame)
Which gives:
   feature1  feature2  class
0  0.548814  0.791725      1
1  0.715189  0.528895      0
2  0.602763  0.568045      0
3  0.544883  0.925597      0
4  0.423655  0.071036      0
5  0.645894  0.087129      0
6  0.437587  0.020218      0
7  0.891773  0.832620      1
8  0.963663  0.778157      0
9  0.383442  0.870012      0
To import this dataframe into a MySQL table:
# Import dataframe into MySQL
import sqlalchemy
database_username = 'ENTER USERNAME'
database_password = 'ENTER USERNAME PASSWORD'
database_ip = 'ENTER DATABASE IP'
database_name = 'ENTER DATABASE NAME'
database_connection = sqlalchemy.create_engine('mysql+mysqlconnector://{0}:{1}@{2}/{3}'.
                                               format(database_username, database_password,
                                                      database_ip, database_name))
frame.to_sql(con=database_connection, name='table_name_for_df', if_exists='replace')
One catch is that MySQLdb doesn't work with Python 3.x. So instead we use mysqlconnector, which may be installed as follows:
pip install mysql-connector==2.1.4 # version avoids Protobuf error
Note that to_sql creates the table as well as the columns if they do not already exist in the database.
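To check what to_sql actually created, the table can be read back with pd.read_sql. The sketch below is an illustration under assumptions: it swaps the MySQL engine for an in-memory sqlite3 connection so it runs without a server, and otherwise reuses the dataframe from this answer. With MySQL you would pass the same SQLAlchemy engine to both calls.

```python
import sqlite3

import numpy as np
import pandas as pd

# Same dataframe as in the answer.
np.random.seed(0)
number_of_samples = 10
frame = pd.DataFrame({
    'feature1': np.random.random(number_of_samples),
    'feature2': np.random.random(number_of_samples),
    'class': np.random.binomial(2, 0.1, size=number_of_samples),
}, columns=['feature1', 'feature2', 'class'])

# Assumption: SQLite in memory replaces the MySQL engine for this sketch.
con = sqlite3.connect(":memory:")

# The table does not exist yet; to_sql creates it along with its columns.
frame.to_sql(name='table_name_for_df', con=con, if_exists='replace')

# Read the table back; the dataframe index was stored as a column named 'index'.
round_trip = pd.read_sql("SELECT * FROM table_name_for_df", con)
print(round_trip.columns.tolist())
```

Pass index=False to to_sql if the extra index column is not wanted.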
Answer by s.katz
df.to_sql(name="owner", con=db_connection, schema='aws', if_exists='replace', index=True, index_label='id')
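The effect of index=True together with index_label can be seen in a small sketch. The assumptions here: an in-memory sqlite3 connection replaces db_connection, the schema argument is left out because it refers to a MySQL database name, and the sample dataframe is invented.

```python
import sqlite3

import pandas as pd

# Hypothetical dataframe; its default integer index becomes the 'id' column.
df = pd.DataFrame({"name": ["alice", "bob"]})

# Assumption: SQLite in memory stands in for the MySQL connection.
db_connection = sqlite3.connect(":memory:")

df.to_sql(name="owner", con=db_connection, if_exists="replace",
          index=True, index_label="id")

# Inspect the created table: column names and stored rows.
columns = [row[1] for row in db_connection.execute("PRAGMA table_info(owner)")]
stored = db_connection.execute("SELECT id, name FROM owner ORDER BY id").fetchall()
print(columns, stored)
```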


