Using Loop to Create Excel Sheets with Dataframe Pandas
Notice: this page is a translation of a popular StackOverflow question, provided under the CC BY-SA 4.0 license. You are free to use and share it, but you must follow the same CC BY-SA license, cite the original URL and author information, and attribute it to the original authors (not me): StackOverflow
Original URL: http://stackoverflow.com/questions/41215508/
Asked by mynameisgooch
I'm working on this function that scrapes a website for fantasy football information and writes it to an Excel file. Ultimately, I want to have information for each week on a separate sheet in the Excel workbook.
The code as posted below works perfectly until I want to write it to the Excel workbook. The workbook ends up having just week 17 data. It seems that the pandas ExcelWriter overwrites the sheet every time when I really want it to add a sheet every time.
I couldn't find anything online about creating sheets with a loop in the pandas ExcelWriter, so I'm not entirely sure if it can be done the way I want it.
import bs4 as bs
import urllib.request
import pandas as pd
from pandas import ExcelWriter

for week in range(1,18):
    #IGNORE MOST OF THIS STUFF BELOW BECAUSE IT WORKS AS IS
    source = urllib.request.urlopen('http://fftoday.com/stats/playerstats.php?Season=2015&GameWeek='+str(week)+'&PosID=10&LeagueID=1').read()
    soup = bs.BeautifulSoup(source, 'lxml')
    table = soup.find('table', width='100%', border='0', cellpadding='2', cellspacing='1')
    table_rows = table.find_all('tr')
    player_data = {}
    for tr in table_rows:
        td = tr.find_all('td')
        row = [i.text for i in td]
        if row != ['\xa0 ', 'Passing', 'Rushing', 'Fantasy'] and row != ['Player\nSort First: \n\n\n\xa0\xa0\n\t\tLast: \n\n\n', 'Team\n\n\n\n', 'G\n\n\n\n', 'Comp\n\n', 'Att\n\n', 'Yard\n\n', 'TD\n\n', 'INT\n\n', 'Att\n\n', 'Yard\n\n', 'TD\n\n', 'FPts\n\n\n\n', 'FPts/G\n\n\n\n']:
            names = str(row[0]).encode('utf-8')
            names = str(names)[:-1].split()[1:]
            names[0:] = [' '.join(names[0:])]
            row[0] = names[0]
            player_data[str(row[0])] = row[1:]
    df_qb = pd.DataFrame.from_dict(player_data)
    df_qb = df_qb.transpose()
    df_qb.columns = ['Team', 'Games', 'Completions', 'Att', 'Yards',
                     'TD', 'INT', 'Rush Att', 'Rush Yards', 'Rush TD',
                     'Fantasy Points', 'Fantasy Points per Game']
    #EVERY THING WORKS JUST FINE UNTIL RIGHT HERE
    writer = ExcelWriter('Weekly Fantasy Football Data.xlsx')
    df_qb.to_excel(writer, 'Week ' + str(week))
    writer.save()
Answered by Ajeet Ganga
All you need to do is move the writer declaration outside of the loop, so the workbook is created once and each iteration adds a sheet to it instead of starting a new file.
e.g.
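import numpy as np
import pandas as pd
from pandas import ExcelWriter
a = np.arange(5)
# Create the writer once, before the loop, so every sheet goes into the same workbook
w = ExcelWriter('e.xlsx')
for i in a:
    df = pd.DataFrame({'a': np.random.randint(1, 100, 10)})
    df.to_excel(w, sheet_name='sheet ' + str(i))
# Write the file once, after the loop (close() replaces save(), which pandas 2.0 removed)
w.close()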
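The same idea can also be written with the writer as a context manager, which saves and closes the workbook when the block exits. This is only a minimal sketch, not the asker's actual scraper: the helper name `write_weekly_sheets`, the file name `weekly_demo.xlsx`, and the random placeholder data are all made up for illustration, and it assumes an Excel engine such as openpyxl is installed.

```python
import numpy as np
import pandas as pd


def write_weekly_sheets(frames, path):
    """Write each DataFrame in `frames` (dict of sheet name -> DataFrame) to its own sheet.

    Hypothetical helper for illustration; not part of pandas.
    """
    # One writer for the whole workbook; the with-block saves and closes it on exit.
    with pd.ExcelWriter(path) as writer:
        for sheet_name, frame in frames.items():
            frame.to_excel(writer, sheet_name=sheet_name)


if __name__ == "__main__":
    # Placeholder data standing in for the scraped weekly tables (assumption).
    rng = np.random.default_rng(0)
    frames = {
        f"Week {week}": pd.DataFrame({"Fantasy Points": rng.integers(1, 100, 5)})
        for week in range(1, 18)
    }
    write_weekly_sheets(frames, "weekly_demo.xlsx")
```

In the question's code, the scraping part would stay inside the `for week in range(1,18)` loop and only the writer would move out, so that each `df_qb` is written to the one shared writer under its own `'Week N'` sheet name.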