python: dynamic database with a scripting language

Question

Asked by tmoisan

I have a set of .csv files that I want to process. It would be far easier to process them with SQL queries. I wonder if there is some way to load a .csv file and query it with SQL from a scripting language like Python or Ruby. Loading it with something similar to ActiveRecord would be awesome.

The problem is that I don't want to have to run a database somewhere before running my script. I shouldn't need any additional installations beyond the scripting language and some modules.

My question is which language and what modules should I use for this task. I looked around and can't find anything that suits my need. Is it even possible?

Answer 1

Answered by nosklo

There's sqlite3, included with Python. With it you can create a database (in memory), add rows to it, and perform SQL queries.
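
As a minimal sketch of the sqlite3-only route, the following loads CSV text into an in-memory table and runs a query against it. The `load_csv` helper and the table name `foo` are made up for illustration, and the sample data is inlined only so the snippet runs on its own; every column is stored as text.

```python
import csv
import io
import sqlite3

# Sample data inlined so the sketch is self-contained; in practice this
# would be open('foo.csv', newline='').
CSV_TEXT = """name,age,nickname
nosklo,32,nosklo
Afila Tun,32,afilatun
Foo Bar,33,baz
"""


def load_csv(conn, table_name, csv_file):
    """Create table_name from the CSV header and insert every row as text.

    Hypothetical helper, not part of sqlite3.
    """
    reader = csv.reader(csv_file)
    header = next(reader)  # assume the first line holds the column names
    # Identifiers cannot be bound as parameters, so quote them by hand.
    columns = ", ".join('"{}" TEXT'.format(name.replace('"', '""')) for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute('CREATE TABLE "{}" ({})'.format(table_name, columns))
    conn.executemany(
        'INSERT INTO "{}" VALUES ({})'.format(table_name, placeholders), reader
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # nothing is written to disk
load_csv(conn, "foo", io.StringIO(CSV_TEXT))

rows = conn.execute(
    "SELECT name, nickname FROM foo WHERE age = ? ORDER BY name", ("32",)
).fetchall()
print(rows)  # [('Afila Tun', 'afilatun'), ('nosklo', 'nosklo')]
```

Values are bound with `?` placeholders rather than formatted into the SQL string, which keeps quoting problems out of the data path.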

If you want neat ActiveRecord-like functionality, you should add an external ORM, like sqlalchemy. That's a separate download, though.

Quick example using sqlalchemy:

# Written against SQLAlchemy 1.4 or newer
import csv

from sqlalchemy import create_engine, Column, String, Integer, MetaData, Table
from sqlalchemy.orm import Session, registry

CSV_FILE = 'foo.csv'
engine = create_engine('sqlite://') # memory-only database

table = None
metadata = MetaData()
with open(CSV_FILE, newline='') as f, engine.begin() as conn:
    # assume first line is header
    cf = csv.DictReader(f, delimiter=',')
    for row in cf:
        if table is None:
            # create the table, one String column per CSV field
            table = Table('foo', metadata,
                Column('id', Integer, primary_key=True),
                *(Column(rowname, String()) for rowname in row.keys()))
            table.create(conn)
        # insert data into the table
        conn.execute(table.insert().values(**row))

# map a plain class onto the table so rows come back as objects
class CsvTable(object): pass
registry().map_imperatively(CsvTable, table)
session = Session(engine)

Now you can query the database, filtering by any field, etc.

Suppose you run the code above on this csv:

name,age,nickname
nosklo,32,nosklo
Afila Tun,32,afilatun
Foo Bar,33,baz

That will create and populate a table in memory with fields name, age, nickname. You can then query the table:

for r in session.query(CsvTable).filter(CsvTable.age == '32'):
    print(r.name, r.age, r.nickname)

That will automatically create and run a SELECT query and return the correct rows.

Another advantage of using sqlalchemy is that, if you decide to use another, more powerful database in the future, you can do so practically without changing the code.

Answer 2

Answered by lexu

Use a DB in a library, like SQLite. There are Python and Ruby versions.

Load your CSV into a table; there might be modules/libraries to help you here too. Then SQL away.

Answer 3

Answered by dawg

Have you looked at Perl with Text::CSV and DBI? There are many modules on CPAN to do exactly this. Here is an example (from HERE):

#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# Connect to the database, (the directory containing our csv file(s))

my $dbh = DBI->connect("DBI:CSV:f_dir=.;csv_eol=\n;");

# Associate our csv file with the table name 'prospects'

$dbh->{'csv_tables'}->{'prospects'} = { 'file' => 'prospects.csv'};

# Output the name and contact field from each row

my $sth = $dbh->prepare("SELECT * FROM prospects WHERE name LIKE 'G%'");
$sth->execute();
while (my $row = $sth->fetchrow_hashref) {
     print("name = ", $row->{'Name'}, "  contact = ", $row->{'Contact'}. "\n");
}
$sth->finish();

name = Glenhuntly Pharmacy  contact = Paul
name = Gilmour's Shoes  contact = Ringo

Just type perldoc DBI and perldoc Text::CSV at the command prompt for more.

Answer 4

Answered by Justin Ethier

You could use either scripting language to parse the CSV file and store the data into SQLite, which just uses a single file for storage. From there you have it in a database and can run queries against it.
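
A rough sketch of the single-file approach in Python, assuming a two-column CSV: the rows are written to a SQLite file once, and a later connection queries them without touching the CSV again. The file name `people.db`, the table layout and the inlined sample data are all invented for the example.

```python
import csv
import io
import os
import sqlite3
import tempfile

# Hypothetical sample data; replace with open('people.csv', newline='').
CSV_TEXT = "name,age\nnosklo,32\nAfila Tun,32\nFoo Bar,33\n"

# A throwaway location for the database file.
db_path = os.path.join(tempfile.mkdtemp(), "people.db")

conn = sqlite3.connect(db_path)  # creates the file if it does not exist
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
# DictReader yields one mapping per row, which matches the named placeholders.
conn.executemany(
    "INSERT INTO people (name, age) VALUES (:name, :age)",
    csv.DictReader(io.StringIO(CSV_TEXT)),
)
conn.commit()
conn.close()

# A later run (or another script) only needs the path to query the data again.
conn = sqlite3.connect(db_path)
counts = conn.execute(
    "SELECT age, COUNT(*) FROM people GROUP BY age ORDER BY age"
).fetchall()
conn.close()
print(counts)  # [(32, 2), (33, 1)]
```

Declaring `age` as INTEGER lets SQLite convert the CSV strings to numbers on insert, so numeric comparisons and grouping behave as expected.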

Alternatively, on Windows you can set up an ODBC data source that points at a CSV file. But it may be difficult to automate this.

Answer 5

Answered by Azeem.Butt

CSV files are not databases--they have no indices--and any SQL simulation you imposed upon them would amount to little more than searching through the entire thing over and over again.
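
To illustrate the point about indices, here is a small sketch that asks SQLite for its query plan before and after an index exists. It assumes the CSV rows have already been copied into a table, as other answers suggest; the table, the index name `idx_people_age` and the `plan` helper are hypothetical. The exact wording of the plan text differs between SQLite versions, so none is shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [("nosklo", "32"), ("Afila Tun", "32"), ("Foo Bar", "33")],
)

QUERY = "SELECT name FROM people WHERE age = '32'"


def plan(connection, sql):
    """Return the textual steps of SQLite's query plan for sql."""
    # The last column of each EXPLAIN QUERY PLAN row is the description.
    return [row[-1] for row in connection.execute("EXPLAIN QUERY PLAN " + sql)]


before = plan(conn, QUERY)  # without an index: a scan of the whole table
conn.execute("CREATE INDEX idx_people_age ON people (age)")
after = plan(conn, QUERY)   # with the index: a search through idx_people_age

print(before)
print(after)
```

Without the index every query reads every row, which is the same repeated full search the answer describes for raw CSV files.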

Answer 6

Answered by trash80

I used nosklo's solution (thanks!), but my CSV already had a primary key column (passed in as pk_col) in the header line (the first line of the CSV). So I thought I'd share my modification, which uses a conditional expression.

table = Table(tablename, metadata,
    *(Column(rowname, Integer, primary_key=True) if rowname == pk_col
      else Column(rowname, String())
      for rowname in row.keys()))
table.create()

Answer 7

Answered by anijhaw

PHP FlatfileDB, available here, is a very good option if you are building a web app.
