How to read a CSV file from a URL with Python?

Question

Asked by mongotop

When I run curl against an API link such as http://example.com/passkey=wedsmdjsjmdd:

curl 'http://example.com/passkey=wedsmdjsjmdd'

I get the employee data back in CSV format, like:

"Steve","421","0","421","2","","","","","","","","","421","0","421","2"

How can I parse this using Python?

I tried:


import csv 
cr = csv.reader(open('http://example.com/passkey=wedsmdjsjmdd',"rb"))
for row in cr:
    print row

But it didn't work, and I got an error:

http://example.com/passkey=wedsmdjsjmdd No such file or directory:

Thanks!


Answer 1

Accepted answer by eandersson

open only reads local files, so you need to replace it with urllib.urlopen or urllib2.urlopen (both are Python 2 modules).

For example:

import csv
import urllib2

url = 'http://winterolympicsmedals.com/medals.csv'
response = urllib2.urlopen(url)
cr = csv.reader(response)

for row in cr:
    print row

This would print each of the following rows as a list of fields:

Year,City,Sport,Discipline,NOC,Event,Event gender,Medal
1924,Chamonix,Skating,Figure skating,AUT,individual,M,Silver
1924,Chamonix,Skating,Figure skating,AUT,individual,W,Gold
...
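
The code above targets Python 2. On Python 3, urllib2 was merged into urllib.request, and csv.reader expects text rather than bytes. Below is a minimal sketch of the same idea, assuming the response is UTF-8 encoded. The helper names iter_csv_rows and read_csv_from_url are illustrative and not part of the original answer; the demonstration at the bottom uses in-memory bytes in place of a real HTTP response.

```python
import csv
import io
import urllib.request


def iter_csv_rows(binary_stream, encoding="utf-8"):
    """Yield CSV rows (lists of str) from a binary file-like object."""
    # csv.reader needs text in Python 3, so wrap the byte stream in a decoder.
    # newline="" leaves line-ending handling to the csv module.
    text_stream = io.TextIOWrapper(binary_stream, encoding=encoding, newline="")
    for row in csv.reader(text_stream):
        yield row


def read_csv_from_url(url, encoding="utf-8"):
    """Fetch url and return all CSV rows as a list (hypothetical helper)."""
    with urllib.request.urlopen(url) as response:
        return list(iter_csv_rows(response, encoding))


# Offline demonstration: BytesIO stands in for the HTTP response body.
sample = io.BytesIO(b'"Steve","421","0"\n"Ann","7",""\n')
rows = list(iter_csv_rows(sample))
print(rows)
```

With a real endpoint, read_csv_from_url('http://winterolympicsmedals.com/medals.csv') would take the place of the in-memory sample.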

Answer 2

Answer by Kathirmani Sukumar

Using pandas, it is very simple to read a CSV file directly from a URL:

import pandas as pd
data = pd.read_csv('https://example.com/passkey=wedsmdjsjmdd')

This reads your data into a DataFrame, a tabular structure that is easy to process.
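
One caveat: the sample in the question has no header row, and by default pd.read_csv treats the first line as column names. A small sketch of one way around that, passing header=None so the first record stays in the data. The function name read_headerless_csv and the shortened sample line are assumptions for illustration; the sample is read from memory here, but a URL string can be passed in the same way.

```python
import io

import pandas as pd

# Shortened stand-in for the employee record shown in the question.
SAMPLE = '"Steve","421","0","421","2",""\n'


def read_headerless_csv(source):
    """Read CSV from a URL, path, or file-like object that has no header row."""
    # header=None makes pandas number the columns 0..N-1 instead of
    # consuming the first record as column names.
    return pd.read_csv(source, header=None)


df = read_headerless_csv(io.StringIO(SAMPLE))
print(df.shape)
```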

Answer 3

Answer by Rodo

You could do it with the requests module as well:

import csv
import requests

url = 'http://winterolympicsmedals.com/medals.csv'
r = requests.get(url)
text = r.iter_lines()  # iterator over the lines of the response body
reader = csv.reader(text, delimiter=',')

Answer 4

Answer by The Aelfinn

When downloading a large file, the following may be more efficient:

import requests
from contextlib import closing
import csv

url = "http://download-and-process-csv-efficiently/python.csv"

with closing(requests.get(url, stream=True)) as r:
    reader = csv.reader(r.iter_lines(), delimiter=',', quotechar='"')
    for row in reader:
        # Handle each row here...
        print row

By setting stream=True in the GET request, when we pass r.iter_lines() to csv.reader(), we are passing a generator to csv.reader(). This lets csv.reader() lazily iterate over each line in the response with for row in reader.

This avoids loading the entire file into memory before processing starts, which drastically reduces memory overhead for large files.
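
On Python 3, r.iter_lines() yields bytes by default, while csv.reader expects text, so the lines need decoding on the way through. A minimal sketch of the same streaming approach, assuming the body is UTF-8 encoded. The names decode_lines and stream_csv_rows are illustrative, and the demonstration feeds in a fake iterator of byte lines rather than a live response.

```python
import csv


def decode_lines(byte_lines, encoding="utf-8"):
    """Lazily decode an iterable of byte strings into text lines."""
    for line in byte_lines:
        yield line.decode(encoding)


def stream_csv_rows(url, encoding="utf-8"):
    """Yield CSV rows from url one at a time (hypothetical helper)."""
    # Third-party dependency; imported here so decode_lines can be used
    # and tested without requests installed.
    import requests

    # stream=True defers downloading the body until it is iterated.
    with requests.get(url, stream=True) as r:
        r.raise_for_status()
        yield from csv.reader(decode_lines(r.iter_lines(), encoding))


# Offline demonstration: an iterator of byte lines stands in for r.iter_lines().
fake_lines = iter([b'"Steve","421"', b'"Ann","7"'])
rows = list(csv.reader(decode_lines(fake_lines)))
print(rows)
```

Because both helpers are generators, each row is decoded and parsed only when the consuming loop asks for it.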

Answer 5

Answer by user2458922

import pandas as pd
url = 'https://raw.githubusercontent.com/juliencohensolal/BankMarketing/master/rawData/bank-additional-full.csv'
data = pd.read_csv(url, sep=";")  # use sep="," for comma-separated files
data.describe()

Answer 6

Answer by Adeshina Otayo

What you were trying to do with the curl command was download the file to your local hard drive. To do that, however, you need to specify an output path with -o:

curl http://example.com/passkey=wedsmdjsjmdd -o ./example.csv

You can then open the downloaded file in Python:

import csv

cr = csv.reader(open('./example.csv', "r"))
for row in cr:
    print row
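
For Python 3, the read-from-disk step might look like the sketch below. It assumes the downloaded file is UTF-8 encoded; the function name read_local_csv is illustrative, and a temporary file stands in for the output of the curl command.

```python
import csv
import os
import tempfile


def read_local_csv(path, encoding="utf-8"):
    """Return all rows of a CSV file on disk as lists of strings."""
    # newline="" lets the csv module handle line endings itself;
    # the with block closes the file once reading is done.
    with open(path, newline="", encoding=encoding) as f:
        return list(csv.reader(f))


# Offline demonstration: write a small file in place of the curl download.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.csv")
    with open(path, "w", newline="", encoding="utf-8") as f:
        f.write('"Steve","421","0"\n')
    rows = read_local_csv(path)

print(rows)
```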
