pandas: How to convert dbf to csv in Python?

Question

Asked by Stefano Potter

I have a folder with a bunch of dbf files that I would like to convert to csv. I tried using code that just changes the extension from .dbf to .csv, and these files open fine in Excel, but when I open them in pandas they look like this:

                                                s\t?
0                                                NaN
1            1       176 1.58400000000e+005-3.385...

This is not what I want, and those characters don't appear in the real file.
How should I read in the dbf file correctly?

Answer 1

Accepted answer by Andy Hayden

Looking online, there are a few options:

With simpledbf:

from simpledbf import Dbf5

dbf = Dbf5('fake_file_name.dbf')
df = dbf.to_dataframe()

Tweaked from the gist:

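import pandas as pd
import pysal as ps  # in PySAL 2.x the equivalent opener is libpysal.io.open

def dbf2DF(dbfile, upper=True):
    "Read dbf file and return pandas DataFrame"
    # ps.open returns a table object exposing .header and .by_col
    with ps.open(dbfile) as db:
        df = pd.DataFrame({col: db.by_col(col) for col in db.header})
        if upper:
            # Upper-case the frame's own column names so labels stay aligned
            df.columns = [col.upper() for col in df.columns]
        return df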

Answer 2

Answered by Ethan Furman

Using my dbf library you could do something like:

import sys
import dbf
for arg in sys.argv[1:]:
    dbf.export(arg)

which will create a .csv file of the same name as each dbf file. If you put that code into a script named dbf2csv.py you could then call it as

python dbf2csv.py dbfname dbf2name dbf3name ...

Answer 3

Answered by Yang Qi

Here is my solution that I've been using for years. I have a solution for Python 2.7 and one for Python 3.5 (probably also 3.6).

Python 2.7:

import csv
from dbfpy import dbf

def dbf_to_csv(out_table):#Input a dbf, output a csv
    csv_fn = out_table[:-4]+ ".csv" #Set the table as .csv format
    with open(csv_fn,'wb') as csvfile: #Create a csv file and write contents from dbf
        in_db = dbf.Dbf(out_table)
        out_csv = csv.writer(csvfile)
        names = []
        for field in in_db.header.fields: #Write headers
            names.append(field.name)
        out_csv.writerow(names)
        for rec in in_db: #Write records
            out_csv.writerow(rec.fieldData)
        in_db.close()
    return csv_fn

Python 3.5:

import csv
from dbfread import DBF

def dbf_to_csv(dbf_table_pth):#Input a dbf, output a csv, same name, same path, except extension
    csv_fn = dbf_table_pth[:-4]+ ".csv" #Set the csv file name
    table = DBF(dbf_table_pth)# table variable is a DBF object
    with open(csv_fn, 'w', newline = '') as f:# create a csv file, fill it with dbf content
        writer = csv.writer(f)
        writer.writerow(table.field_names)# write the column name
        for record in table:# write the rows
            writer.writerow(list(record.values()))
    return csv_fn# return the csv name

You can install dbfpy and dbfread with pip.
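
Since the question is about a whole folder of dbf files, the per-file function can be applied in a loop. The following is only a minimal sketch: `convert_folder` and its `convert_one` parameter are hypothetical names introduced here, and `convert_one` is assumed to be any callable that takes a .dbf path and returns the path of the CSV it wrote (for example, one of the `dbf_to_csv` functions above).

```python
from pathlib import Path


def convert_folder(folder, convert_one):
    """Run convert_one on every .dbf file in folder and collect the results.

    convert_one is assumed to accept a .dbf path (str) and return the
    path of the CSV file it produced.
    """
    results = []
    for path in sorted(Path(folder).iterdir()):
        # Match the extension case-insensitively (.dbf, .DBF, ...)
        if path.is_file() and path.suffix.lower() == ".dbf":
            results.append(convert_one(str(path)))
    return results
```

It would be called as `convert_folder('path/to/folder', dbf_to_csv)`. Subfolders are not searched in this sketch.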

Answer 4

Answered by Alessandro Trinca Tornidor

EDIT#2:

It's possible to read a dbf file line by line, without converting it to csv, with dbfread (install it with pip install dbfread):

>>> from dbfread import DBF
>>> for row in DBF('southamerica_adm0.dbf'):
...     print row
... 
OrderedDict([(u'COUNTRY', u'ARGENTINA')])
OrderedDict([(u'COUNTRY', u'BOLIVIA')])
OrderedDict([(u'COUNTRY', u'BRASIL')])
OrderedDict([(u'COUNTRY', u'CHILE')])
OrderedDict([(u'COUNTRY', u'COLOMBIA')])
OrderedDict([(u'COUNTRY', u'ECUADOR')])
OrderedDict([(u'COUNTRY', u'GUYANA')])
OrderedDict([(u'COUNTRY', u'GUYANE')])
OrderedDict([(u'COUNTRY', u'PARAGUAY')])
OrderedDict([(u'COUNTRY', u'PERU')])
OrderedDict([(u'COUNTRY', u'SURINAME')])
OrderedDict([(u'COUNTRY', u'U.K.')])
OrderedDict([(u'COUNTRY', u'URUGUAY')])
OrderedDict([(u'COUNTRY', u'VENEZUELA')])
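
Because each row comes back as a mapping from field name to value, the rows can be handed to the standard library's `csv.DictWriter` if a CSV file is wanted after all. This is a sketch rather than a tested recipe for every dbf file: `records_to_csv` is a hypothetical helper name, it uses Python 3 syntax, and it only assumes that `records` is an iterable of dict-like rows.

```python
import csv


def records_to_csv(records, field_names, csv_path):
    """Write an iterable of dict-like rows to csv_path with a header row."""
    # newline="" lets the csv module control line endings itself
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=field_names)
        writer.writeheader()
        for record in records:
            writer.writerow(record)
    return csv_path
```

With dbfread this might be used as `table = DBF('southamerica_adm0.dbf')` followed by `records_to_csv(table, table.field_names, 'southamerica_adm0.csv')`.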

My updated references:

official project site: http://pandas.pydata.org

official documentation: http://pandas-docs.github.io/pandas-docs-travis/

dbfread: https://pypi.python.org/pypi/dbfread/2.0.6

geopandas: http://geopandas.org/

shp and dbf with geopandas: https://gis.stackexchange.com/questions/129414/only-read-specific-attribute-columns-of-a-shapefile-with-geopandas-fiona
