pandas 如何从数据帧创建键:列名和值的字典:python 列中的唯一值
声明:本页面是StackOverFlow热门问题的中英对照翻译,遵循CC BY-SA 4.0协议,如果您需要使用它,必须同样遵循CC BY-SA许可,注明原文地址和作者信息,同时你必须将它归于原作者(不是我):StackOverFlow
原文地址: http://stackoverflow.com/questions/44105375/
Warning: these are provided under cc-by-sa 4.0 license. You are free to use/share it, But you must attribute it to the original authors (not me):
StackOverFlow
How to create a dictionary of key : column_name and value : unique values in column in python from a dataframe
提问by Shuvayan Das
I am trying to create a dictionary of key:value pairs where key is the column name of a dataframe and value will be a list containing all the unique values in that column.Ultimately I want to be able to filter out the key_value pairs from the dict based on conditions. This is what I have been able to do so far:
我正在尝试创建一个 key:value 对字典,其中 key 是数据框的列名,value 将是一个包含该列中所有唯一值的列表。最终我希望能够从dict 基于条件。到目前为止,这是我能够做的:
for col in col_list[1:]:
_list = []
_list.append(footwear_data[col].unique())
list_name = ''.join([str(col),'_list'])
product_list = ['shoe','footwear']
color_list = []
size_list = []
Here product,color,size are all column names and the dict keys should be named accordingly like color_list etc. Ultimately I will need to access each key:value_list in the dictionary. Expected output:
这里产品、颜色、大小都是列名,字典键应该相应地命名为 color_list 等。最终我需要访问字典中的每个键:value_list。预期输出:
KEY VALUE
color_list : ["red","blue","black"]
size_list: ["9","XL","32","10 inches"]
Can someone please help me regarding this?A snapshot of the data is attached.
采纳答案by Chiheb Nexus
With a DataFramelike this:
有了DataFrame这样的:
import pandas as pd
df = pd.DataFrame([["Women", "Slip on", 7, "Black", "Clarks"], ["Women", "Slip on", 8, "Brown", "Clarcks"], ["Women", "Slip on", 7, "Blue", "Clarks"]], columns= ["Category", "Sub Category", "Size", "Color", "Brand"])
print(df)
Output:
输出:
Category Sub Category Size Color Brand
0 Women Slip on 7 Black Clarks
1 Women Slip on 8 Brown Clarcks
2 Women Slip on 7 Blue Clarks
You can convert your DataFrame into dict and create your new dict when mapping the the columns of the DataFrame, like this example:
您可以将 DataFrame 转换为 dict 并在映射 DataFrame 的列时创建新的 dict,如下例所示:
new_dict = {"color_list": list(df["Color"]), "size_list": list(df["Size"])}
# OR:
#new_dict = {"color_list": [k for k in df["Color"]], "size_list": [k for k in df["Size"]]}
print(new_dict)
Output:
输出:
{'color_list': ['Black', 'Brown', 'Blue'], 'size_list': [7, 8, 7]}
In order to have a unique values, you can use setlike this example:
为了有一个唯一的值,你可以set像这个例子一样使用:
new_dict = {"color_list": list(set(df["Color"])), "size_list": list(set(df["Size"]))}
print(new_dict)
Output:
输出:
{'color_list': ['Brown', 'Blue', 'Black'], 'size_list': [8, 7]}
Or, like what @Ami Tavory said in his answer, in order to have the whole unique keys and values from your DataFrame, you can simply do this:
或者,就像@Ami Tavory 在他的回答中所说的那样,为了从您的 DataFrame 中获得整个唯一的键和值,您可以简单地执行以下操作:
new_dict = {k:list(df[k].unique()) for k in df.columns}
print(new_dict)
Output:
输出:
{'Brand': ['Clarks', 'Clarcks'],
'Category': ['Women'],
'Color': ['Black', 'Brown', 'Blue'],
'Size': [7, 8],
'Sub Category': ['Slip on']}
回答by Ami Tavory
I am trying to create a dictionary of key:value pairs where key is the column name of a dataframe and value will be a list containing all the unique values in that column.
我正在尝试创建一个 key:value 对的字典,其中 key 是数据框的列名,value 将是一个包含该列中所有唯一值的列表。
You could use a simple dictionary comprehensionfor that.
你可以使用一个简单的字典理解来做到这一点。
Say you start with
说你开始
import pandas as pd
df = pd.DataFrame({'a': [1, 2, 1], 'b': [1, 4, 5]})
Then the following comprehension solves it:
那么下面的理解就解决了:
>>> {c: list(df[c].unique()) for c in df.columns}
{'a': [1, 2], 'b': [1, 4, 5]}
回答by arnold
If I understand your question correctly, you may need setinstead of list. Probably at this piece of code, you might add setto get the unique values of the given list.
如果我正确理解您的问题,您可能需要set而不是列表。可能在这段代码中,您可能会添加set以获取给定列表的唯一值。
for col in col_list[1:]:
_list = []
_list.append(footwear_data[col].unique())
list_name = ''.join([str(col),'_list'])
list_name = set(list_name)
Sample of usage
使用示例
>>> a_list = [7, 8, 7, 9, 10, 9]
>>> set(a_list)
{8, 9, 10, 7}
回答by Waqar
Here how i did it let me know if it helps
在这里我是怎么做的让我知道它是否有帮助
import pandas as pd
df = pd.read_csv("/path/to/csv/file")
colList = list(df)
dic = {}
for x in colList:
_list = []
_list.append(list(set(list(df[x]))))
list_name = ''.join([str(x), '_list'])
dic[str(x)+"_list"] = _list
print dic
Output:
输出:
{'Color_list': [['Blue', 'Orange', 'Black', 'Red']], 'Size_list': [['9', '8', '10 inches', 'XL', '7']], 'Brand_list': [['Clarks']], 'Sub_list': [['SO', 'FOR']], 'Category_list': [['M', 'W']]}
MyCsv File
MyCsv 文件
Category,Sub,Size,Color,Brand
W,SO,7,Blue,Clarks
W,SO,7,Blue,Clarks
W,SO,7,Black,Clarks
W,SO,8,Orange,Clarks
W,FOR,8,Red,Clarks
M,FOR,9,Black,Clarks
M,FOR,10 inches,Blue,Clarks
M,FOR,XL,Blue,Clarks

