Table of Contents
- I. What is pandas?
- II. json --> DataFrame
  - 1. Using pandas directly
    - 1.1 orient='split': columns, index, data
    - 1.2 orient='index': convert by index
    - 1.3 orient='records'
    - 1.4 orient='columns'
  - 2. json_normalize
  - 3. json --> DataFrame
    - 1. The transmitted data is a list
- Summary
I. What is pandas?
pandas is a tool built on NumPy and created for data analysis tasks. DataFrame is its primary data structure.
II. json --> DataFrame
1. Using pandas directly
1.1 orient='split': columns, index, data
Example: {"columns":["name","values","describe"], "index":[0,1,2],
"data":[["aa",123,"sgsggfsgsdfh"],["bb",135,"shsdhdghdgh\u3002"],["cc",146,"sjglasgs"]]}
df --> json: data_json = data.to_json(orient='split')
json --> df: df1 = pd.read_json(data_json, orient='split')
1.2 orient='index': convert by index
Example: {"0":{"name":"aa","values":123,"describe":"sgsggfsgsdfh"},
"1":{"name":"bb","values":135,"describe":"shsdhdghdgh\u3002"},
"2":{"name":"cc","values":146,"describe":"sjglasgs"}}
df --> json: data_json2 = data.to_json(orient='index')
json --> df: df2 = pd.read_json(data_json2, orient='index')
1.3 orient='records'
Example: [{"name":"aa","values":123,"describe":"sgsggfsgsdfh"},
{"name":"bb","values":135,"describe":"shsdhdghdgh\u3002"},
{"name":"cc","values":146,"describe":"sjglasgs"}]
df --> json: data_json3 = data.to_json(orient='records')
json --> df: df3 = pd.read_json(data_json3, orient='records')
1.4 orient='columns'
Example: {"name":{"0":"aa","1":"bb","2":"cc"},
"values":{"0":123,"1":135,"2":146},
"describe":{"0":"sgsggfsgsdfh","1":"shsdhdghdgh\u3002","2":"sjglasgs"}}
df --> json: data_json4 = data.to_json(orient='columns')
json --> df: df4 = pd.read_json(data_json4, orient='columns')
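To compare the four orient values side by side, the following is a minimal sketch that serializes the same DataFrame with each orient and reads it back. The sample frame mirrors the one used in this article. The round_trips dict and the StringIO wrapper are additions for illustration: recent pandas versions deprecate passing a literal JSON string to read_json, so the string is wrapped in a file-like object here.

```python
from io import StringIO

import pandas as pd

data = pd.DataFrame({
    "name": ["aa", "bb", "cc"],
    "values": [123, 135, 146],
    "describe": ["sgsggfsgsdfh", "shsdhdghdgh。", "sjglasgs"],
})

round_trips = {}
for orient in ("split", "index", "records", "columns"):
    # Serialize with the given layout
    text = data.to_json(orient=orient)
    # Read it back with the matching orient
    restored = pd.read_json(StringIO(text), orient=orient)
    # Compare cell values in the original column order
    restored_values = restored[list(data.columns)].to_numpy().tolist()
    round_trips[orient] = restored_values == data.to_numpy().tolist()

print(round_trips)
```

The orient passed to read_json has to match the one used by to_json; otherwise pandas interprets the outer keys differently and the frame comes back with a different shape.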
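Note: JSON text always uses double quotes. If single quotes are involved, str.replace can swap the quote characters, but keep in mind that single-quoted text is not valid JSON:
data_json5 = data_json4.replace('"', "'")  # double quotes --> single quotes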
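When the incoming text is single-quoted (for example the repr of a Python dict), a blind replace can corrupt values that contain apostrophes. The sketch below shows one alternative, not the replace approach from the note: it assumes the text is a valid Python literal and parses it with ast.literal_eval, then re-serializes it with json.dumps. The sample string and variable names are made up for illustration.

```python
import ast
import json

import pandas as pd

# Hypothetical single-quoted payload in the 'columns' layout
single_quoted = "{'name': {'0': 'aa', '1': 'bb'}, 'values': {'0': 123, '1': 135}}"

# json.loads rejects it because JSON strings must use double quotes
try:
    json.loads(single_quoted)
    is_valid_json = True
except json.JSONDecodeError:
    is_valid_json = False

# literal_eval only evaluates Python literals, so it is safe for untrusted text
parsed = ast.literal_eval(single_quoted)

# Re-serialize as proper double-quoted JSON
valid_json = json.dumps(parsed)

# The parsed dict can go straight into a DataFrame (outer keys become columns)
df = pd.DataFrame(parsed)
print(df)
```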
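2. json_normalize
The code is as follows:
import json
import pandas as pd
# pd.json_normalize (pandas >= 1.0) replaces the deprecated pandas.io.json.json_normalize
data = '{"a":"value1","b":"value1"}'
json.loads(data)  # parse the JSON string
pd.json_normalize(json.loads(data))  # convert the parsed JSON into a DataFrame
# Reading from a file:
with open("test.json", 'r') as f:
    temp = json.load(f)
temp_df = pd.json_normalize(temp)
print(temp_df.T)  # transposed for display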
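json_normalize is most useful when the JSON is nested. The sketch below uses a made-up payload (the id, user, and scores fields are assumptions, not data from this article) to show two common cases: flattening nested objects into dotted column names, and expanding a nested list into rows with record_path and meta.

```python
import pandas as pd

# Hypothetical nested payload: each record has a nested object and a nested list
payload = [
    {
        "id": 1,
        "user": {"name": "aa", "city": "x"},
        "scores": [
            {"subject": "math", "score": 90},
            {"subject": "art", "score": 85},
        ],
    },
    {
        "id": 2,
        "user": {"name": "bb", "city": "y"},
        "scores": [{"subject": "math", "score": 70}],
    },
]

# Nested objects are flattened into dotted column names such as "user.name";
# the "scores" list is left as a single column of lists
flat = pd.json_normalize(payload)

# record_path expands each element of "scores" into its own row;
# meta copies the listed parent fields onto every expanded row
scores = pd.json_normalize(
    payload,
    record_path="scores",
    meta=["id", ["user", "name"]],
)

print(flat)
print(scores)
```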
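Recommended for json/dict --> DataFrame:
df = pd.DataFrame(dict0)  # the dict needs a fixed format, e.g. dict0 = {'a':[1,2,3,4],'b':['a','b','c','d']}
df = pd.DataFrame(json0)  # parsed JSON in the records, columns, or index layout also works
Note that pd.DataFrame treats the outer keys of a dict as columns, so data in the index layout comes out transposed. The split layout is not understood by the constructor and should be read with pd.read_json(..., orient='split').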
Code example 1:
import pandas as pd
# dict --> df
dict0 = {'a':[1,2,3,4],'b':['a','b','c','d']}
df = pd.DataFrame(dict0)
df
# parsed JSON (index layout) --> df; the outer keys become columns
json0 = {"0":{"name":"aa","values":123,"describe":"sgsggfsgsdfh"},
"1":{"name":"bb","values":135,"describe":"shsdhdghdgh\u3002"},
"2":{"name":"cc","values":146,"describe":"sjglasgs"}}
df1 = pd.DataFrame(json0)
df1
# Output of df1:
          0             1             2
name      aa            bb            cc
values    123           135           146
describe  sgsggfsgsdfh  shsdhdghdgh。  sjglasgs
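The output above is transposed because the constructor maps outer keys to columns. As a small sketch of one way around this, DataFrame.from_dict with orient='index' treats the outer keys as row labels instead. The variable names by_columns and by_rows are chosen here for illustration.

```python
import pandas as pd

json0 = {
    "0": {"name": "aa", "values": 123, "describe": "sgsggfsgsdfh"},
    "1": {"name": "bb", "values": 135, "describe": "shsdhdghdgh\u3002"},
    "2": {"name": "cc", "values": 146, "describe": "sjglasgs"},
}

# Default constructor: outer keys ("0", "1", "2") become columns
by_columns = pd.DataFrame(json0)

# from_dict with orient="index": outer keys become the row index
by_rows = pd.DataFrame.from_dict(json0, orient="index")

print(by_columns)
print(by_rows)
```

Calling by_columns.T gives a similar shape, but every column of a transposed mixed frame ends up with object dtype, whereas from_dict builds each column separately.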
Code example 2:
import pandas as pd
data = pd.DataFrame({'name':['aa','bb','cc'],'values':[123,135,146],'describe':['sgsggfsgsdfh','shsdhdghdgh。','sjglasgs']})
# Method 1: orient='split'
data_json = data.to_json(orient='split')
print(data_json)
df = pd.read_json(data_json, orient='split')
print(df)
# Output:
{"columns":["name","values","describe"],"index":[0,1,2],"data":[["aa",123,"sgsggfsgsdfh"],["bb",135,"shsdhdghdgh\u3002"],["cc",146,"sjglasgs"]]}
  name  values  describe
0 aa    123     sgsggfsgsdfh
1 bb    135     shsdhdghdgh。
2 cc    146     sjglasgs
Code example 3:
# Method 2: orient='index'
data_json2 = data.to_json(orient='index')
print(data_json2)
df2 = pd.read_json(data_json2, orient='index')
df2
# Output (the column order may differ between pandas versions):
{"0":{"name":"aa","values":123,"describe":"sgsggfsgsdfh"},"1":{"name":"bb","values":135,"describe":"shsdhdghdgh\u3002"},"2":{"name":"cc","values":146,"describe":"sjglasgs"}}
  describe      name  values
0 sgsggfsgsdfh  aa    123
1 shsdhdghdgh。  bb    135
2 sjglasgs      cc    146
Code example 4:
# Method 3: orient='records'
data_json3 = data.to_json(orient='records')
print(data_json3)
df3 = pd.read_json(data_json3, orient='records')
df3
# Output:
[{"name":"aa","values":123,"describe":"sgsggfsgsdfh"},{"name":"bb","values":135,"describe":"shsdhdghdgh\u3002"},{"name":"cc","values":146,"describe":"sjglasgs"}]
  name  values  describe
0 aa    123     sgsggfsgsdfh
1 bb    135     shsdhdghdgh。
2 cc    146     sjglasgs
Code example 5:
# Method 4: orient='columns', then swap the quote characters
data_json4 = data.to_json(orient='columns')
print(data_json4)
data_json5 = data_json4.replace('"', "'")  # double quotes --> single quotes (no longer valid JSON)
data_json5
# Output:
{"name":{"0":"aa","1":"bb","2":"cc"},"values":{"0":123,"1":135,"2":146},"describe":{"0":"sgsggfsgsdfh","1":"shsdhdghdgh\u3002","2":"sjglasgs"}}
"{'name':{'0':'aa','1':'bb','2':'cc'},'values':{'0':123,'1':135,'2':146},'describe':{'0':'sgsggfsgsdfh','1':'shsdhdghdgh\\u3002','2':'sjglasgs'}}"
3. json --> DataFrame
1. The transmitted data is a list
Code example 1:
import pandas as pd
strtext='[{"ttery":"min","issue":"20130801-3391","code":"8,4,5,2,9","code1":"297734529","code2":null,"time":1013395466000},\
{"ttery":"min","issue":"20130801-3390","code":"7,8,2,1,2","code1":"298058212","code2":null,"time":1013395406000},\
{"ttery":"min","issue":"20130801-3389","code":"5,9,1,2,9","code1":"298329129","code2":null,"time":1013395346000},\
{"ttery":"min","issue":"20130801-3388","code":"3,8,7,3,3","code1":"298588733","code2":null,"time":1013395286000},\
{"ttery":"min","issue":"20130801-3387","code":"0,8,5,2,7","code1":"298818527","code2":null,"time":1013395226000}]'
# A JSON list of objects is the 'records' layout
df = pd.read_json(strtext, orient='records')
json_columns = '{"index":[1,2,3],"columns":["a","b","c"],"data":[[11,12,13],[13,14,15],[15,16,17]]}'
# json_columns = '{"columns":["name","values","describe"],"index":[0,1,2],"data":[["aa",123,"sgsggfsgsdfh"],["bb",135,"shsdhdghdgh\u3002"],["cc",146,"sjglasgs"]]}'
# The 'split' layout is read into a separate frame so that df is not overwritten
df_split = pd.read_json(json_columns, orient='split')
# Output of df:
  ttery  issue          code       code1      code2  time
0 min    20130801-3391  8,4,5,2,9  297734529  NaN    1013395466000
1 min    20130801-3390  7,8,2,1,2  298058212  NaN    1013395406000
2 min    20130801-3389  5,9,1,2,9  298329129  NaN    1013395346000
3 min    20130801-3388  3,8,7,3,3  298588733  NaN    1013395286000
4 min    20130801-3387  0,8,5,2,7  298818527  NaN    1013395226000
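A list of JSON objects can also be parsed with json.loads and handed to the DataFrame constructor, which keeps the values exactly as parsed. The sketch below uses the first two records from the example above. Treating the time field as Unix epoch milliseconds and splitting code into a list of digits are assumptions made for illustration, as is the digits column name.

```python
import json

import pandas as pd

strtext = (
    '[{"ttery":"min","issue":"20130801-3391","code":"8,4,5,2,9",'
    '"code1":"297734529","code2":null,"time":1013395466000},'
    '{"ttery":"min","issue":"20130801-3390","code":"7,8,2,1,2",'
    '"code1":"298058212","code2":null,"time":1013395406000}]'
)

# json.loads returns a list of dicts; JSON null becomes None
records = json.loads(strtext)
df = pd.DataFrame(records)

# Assumption: "time" holds Unix epoch milliseconds
df["time"] = pd.to_datetime(df["time"], unit="ms")

# Split the comma-separated "code" string into a list of integers
df["digits"] = df["code"].str.split(",").apply(lambda parts: [int(p) for p in parts])

print(df.dtypes)
print(df)
```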
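Code example 2:
import pandas as pd
import json
# pd.json_normalize (pandas >= 1.0) replaces the deprecated pandas.io.json.json_normalize
data = '{"a":"value1","b":"value1"}'
json.loads(data)  # parse the JSON string into a dict
pd.json_normalize(json.loads(data))  # one-row DataFrame with columns a and b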
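Summary
In data analysis, the pandas DataFrame object covers most needs concisely, but real-world data rarely arrives in such a tidy form; it mostly comes as JSON or dict. The methods above convert such data directly into a DataFrame for processing. JSON is also the more expressive format for transmitting data, which makes converting between JSON and DataFrame especially practical.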
Article URL: https://blog.csdn.net/qq_33624802/article/details/110437953