pandas basics: converting between JSON and DataFrame

2022-07-26

Contents

  • I. What is pandas?
  • II. JSON --> DataFrame
    • 1. Using pandas directly
        • 1.1 orient='split': columns, index, data
        • 1.2 orient='index': keyed by index
        • 1.3 orient='records'
        • 1.4 orient='columns'
    • 2. json_normalize
    • 3. JSON --> DataFrame
        • 1. The transmitted data is a list
  • Summary

I. What is pandas?

pandas is a data analysis library built on top of NumPy. The DataFrame is its primary data structure.

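II. JSON --> DataFrame

1. Using pandas directly

Note: since pandas 2.1, passing a JSON string directly to pd.read_json is deprecated; wrap the string in io.StringIO (from io import StringIO), as the read_json call in 1.1 does.

1.1 orient='split': columns, index, data

Example: {"columns":["name","values","describe"], "index":[0,1,2],
"data":[["aa",123,"sgsggfsgsdfh"],["bb",135,"shsdhdghdgh\u3002"],["cc",146,"sjglasgs"]]}

df --> json: data_json = data.to_json(orient='split')

json --> df: df1 = pd.read_json(StringIO(data_json), orient='split')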

1.2 orient='index': keyed by index

Example: {"0":{"name":"aa","values":123,"describe":"sgsggfsgsdfh"},
"1":{"name":"bb","values":135,"describe":"shsdhdghdgh\u3002"},
"2":{"name":"cc","values":146,"describe":"sjglasgs"}}

df --> json: data_json2 = data.to_json(orient='index')

json --> df: df2 = pd.read_json(StringIO(data_json2), orient='index')

1.3 orient='records'

Example: [{"name":"aa","values":123,"describe":"sgsggfsgsdfh"},
{"name":"bb","values":135,"describe":"shsdhdghdgh\u3002"},
{"name":"cc","values":146,"describe":"sjglasgs"}]

df --> json: data_json3 = data.to_json(orient='records')

json --> df: df3 = pd.read_json(StringIO(data_json3), orient='records')
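
One practical difference between the orients described so far is whether the row labels survive a round trip. The sketch below illustrates this with a hypothetical frame (the name labeled and the labels row_a and row_b are made up for the example): 'records' writes one object per row and has no place for the labels, while 'index' and 'split' store them in the JSON text.

```python
from io import StringIO

import pandas as pd

# Hypothetical frame with string row labels instead of the default 0, 1, 2.
labeled = pd.DataFrame(
    {"name": ["aa", "bb"], "values": [123, 135]},
    index=["row_a", "row_b"],
)

# 'records' writes one object per row and drops the row labels.
from_records = pd.read_json(
    StringIO(labeled.to_json(orient="records")), orient="records"
)

# 'index' and 'split' both keep the row labels in the JSON text.
from_index = pd.read_json(
    StringIO(labeled.to_json(orient="index")), orient="index"
)
from_split = pd.read_json(
    StringIO(labeled.to_json(orient="split")), orient="split"
)

print(list(from_records.index))
print(list(from_index.index))
print(list(from_split.index))
```

With this frame, the first print should show the default integer labels and the other two the original string labels, so 'records' is best kept for data whose index carries no meaning.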

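1.4 orient='columns'

Example: {"name":{"0":"aa","1":"bb","2":"cc"},
"values":{"0":123,"1":135,"2":146},
"describe":{"0":"sgsggfsgsdfh","1":"shsdhdghdgh\u3002","2":"sjglasgs"}}

df --> json: data_json4 = data.to_json(orient='columns')

json --> df: df4 = pd.read_json(StringIO(data_json4), orient='columns')

Note: JSON strings always use double quotes. If a single-quoted version is needed, convert it with replace: data_json5 = data_json4.replace('"', "'")  # double quotes to single quotes. The result is no longer valid JSON, so it cannot be passed back to pd.read_json.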
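
To make the note about quotes concrete, here is a minimal sketch (the variable names are illustrative, not from the original article). It shows that the single-quoted text is rejected by the JSON parser, and that for simple data like this it can still be read back as a Python dict literal with ast.literal_eval.

```python
import ast
import json

import pandas as pd

data = pd.DataFrame({"name": ["aa", "bb"], "values": [123, 135]})
double_quoted = data.to_json(orient="columns")
single_quoted = double_quoted.replace('"', "'")

# The single-quoted text is not valid JSON, so json.loads rejects it.
try:
    json.loads(single_quoted)
    parsed_as_json = True
except json.JSONDecodeError:
    parsed_as_json = False

# For this data it is a valid Python dict literal, so ast.literal_eval
# can turn it back into a dict that pd.DataFrame accepts.
as_dict = ast.literal_eval(single_quoted)
restored = pd.DataFrame(as_dict)

print(parsed_as_json)
print(restored)
```

This only works when the data contains no null, true, or false values and no quote characters inside strings, since those are not valid Python literals after the replacement. Keeping the double-quoted JSON is the safer choice.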

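2. json_normalize

The code is as follows:

import json
import pandas as pd

# pd.json_normalize (pandas 1.0+) replaces the deprecated
# pandas.io.json.json_normalize import
data = '{"a":"value1","b":"value1"}'
parsed = json.loads(data)  # parse the JSON string into a dict
pd.json_normalize(parsed)  # convert the parsed JSON into a DataFrame


# Reading from a file:
with open("test.json", 'r') as f:
    temp = json.load(f)
    temp_df = pd.json_normalize(temp)
    print(temp_df.T)  # the data needs to be transposed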
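
The flat dict above does not show what json_normalize is mainly useful for, which is flattening nested JSON. The following sketch uses hypothetical nested records (the "info" key is made up for illustration) and shows how nested keys become dotted column names.

```python
import pandas as pd

# Hypothetical nested records: each one carries a nested "info" object.
nested = [
    {"name": "aa", "info": {"values": 123, "describe": "sgsggfsgsdfh"}},
    {"name": "bb", "info": {"values": 135, "describe": "shsdhdghdgh"}},
]

# Nested keys are flattened into dotted column names such as "info.values".
flat = pd.json_normalize(nested)
print(sorted(flat.columns))

# The sep argument changes the separator used in the flattened names.
flat_underscore = pd.json_normalize(nested, sep="_")
print(sorted(flat_underscore.columns))
```

Passing the same nested list to pd.DataFrame would instead leave each "info" dict as a single cell, which is why json_normalize is the better fit for nested payloads.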

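Recommended for json/dict --> DataFrame: df = pd.DataFrame(dict0), where the dict has a fixed shape such as dict0 = {'a':[1,2,3,4],'b':['a','b','c','d']}. A parsed JSON object can be passed the same way, df = pd.DataFrame(json0). This works directly for the 'records' and 'columns' layouts above; an 'index' layout comes out transposed, and a 'split' layout is not interpreted as columns/index/data.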

Code example 1:

import pandas as pd

# dict --> df
dict0 = {'a':[1,2,3,4],'b':['a','b','c','d']}
df = pd.DataFrame(dict0)
df
# parsed JSON in the 'index' layout --> df (the outer keys become columns)
json0 = {"0":{"name":"aa","values":123,"describe":"sgsggfsgsdfh"},
      "1":{"name":"bb","values":135,"describe":"shsdhdghdgh\u3002"},
      "2":{"name":"cc","values":146,"describe":"sjglasgs"}}
df1 = pd.DataFrame(json0)
df1
0	1	2
name	aa	bb	cc
values	123	135	146
describe	sgsggfsgsdfh	shsdhdghdgh。	sjglasgs
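
The transposed result above is what pd.DataFrame produces for the 'index' layout. As a sketch of how to get the expected orientation without going through pd.read_json, the example below uses DataFrame.from_dict for the 'index' layout and keyword unpacking for the 'split' layout. The shortened JSON strings are illustrative.

```python
import json

import pandas as pd

index_json = '{"0":{"name":"aa","values":123},"1":{"name":"bb","values":135}}'
split_json = (
    '{"columns":["name","values"],"index":[0,1],'
    '"data":[["aa",123],["bb",135]]}'
)

# 'index' layout: tell pandas that the outer keys are row labels.
df_index = pd.DataFrame.from_dict(json.loads(index_json), orient="index")

# 'split' layout: its keys (columns, index, data) match the DataFrame
# constructor's keyword arguments, so the parsed dict can be unpacked.
df_split = pd.DataFrame(**json.loads(split_json))

print(df_index)
print(df_split)
```

Unlike pd.read_json, this route does not convert the string keys "0" and "1" into integers, so the row labels of df_index stay strings.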

Code example 2:

from io import StringIO
import pandas as pd

data = pd.DataFrame({'name':['aa','bb','cc'],'values':[123,135,146],'describe':['sgsggfsgsdfh','shsdhdghdgh。','sjglasgs']})

# Method 1: orient='split'
data_json = data.to_json(orient='split')
print(data_json)
df = pd.read_json(StringIO(data_json), orient='split')
df
# Result:
{"columns":["name","values","describe"],"index":[0,1,2],"data":[["aa",123,"sgsggfsgsdfh"],["bb",135,"shsdhdghdgh\u3002"],["cc",146,"sjglasgs"]]}
name	values	describe
0	aa	123	sgsggfsgsdfh
1	bb	135	shsdhdghdgh。
2	cc	146	sjglasgs

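Code example 3:

from io import StringIO

# Method 2: orient='index'
data_json2 = data.to_json(orient='index')
print(data_json2)
df2 = pd.read_json(StringIO(data_json2), orient='index')
df2
# Result (column order may differ between pandas versions)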
{"0":{"name":"aa","values":123,"describe":"sgsggfsgsdfh"},"1":{"name":"bb","values":135,"describe":"shsdhdghdgh\u3002"},"2":{"name":"cc","values":146,"describe":"sjglasgs"}}
describe	name	values
0	sgsggfsgsdfh	aa	123
1	shsdhdghdgh。	bb	135
2	sjglasgs	cc	146

Code example 4:

from io import StringIO

# Method 3: orient='records'
data_json3 = data.to_json(orient='records')
print(data_json3)
df3 = pd.read_json(StringIO(data_json3), orient='records')
df3
# Result
[{"name":"aa","values":123,"describe":"sgsggfsgsdfh"},{"name":"bb","values":135,"describe":"shsdhdghdgh\u3002"},{"name":"cc","values":146,"describe":"sjglasgs"}]
name	values	describe
0	aa	123	sgsggfsgsdfh
1	bb	135	shsdhdghdgh。
2	cc	146	sjglasgs

Code example 5:

# Method 4: orient='columns'
data_json4 = data.to_json(orient='columns')
print(data_json4)
data_json5 = data_json4.replace('"', "'")  # double quotes to single quotes
data_json5
# Result
{"name":{"0":"aa","1":"bb","2":"cc"},"values":{"0":123,"1":135,"2":146},"describe":{"0":"sgsggfsgsdfh","1":"shsdhdghdgh\u3002","2":"sjglasgs"}}
"{'name':{'0':'aa','1':'bb','2':'cc'},'values':{'0':123,'1':135,'2':146},'describe':{'0':'sgsggfsgsdfh','1':'shsdhdghdgh\\u3002','2':'sjglasgs'}}"

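As a compact check of the four orients covered in code examples 2 to 5, the sketch below runs the same round trip for each one and compares the cell values. The dictionary name round_trips is just an illustrative helper, and the comparison deliberately ignores dtypes and column order.

```python
from io import StringIO

import pandas as pd

data = pd.DataFrame({
    "name": ["aa", "bb", "cc"],
    "values": [123, 135, 146],
    "describe": ["sgsggfsgsdfh", "shsdhdghdgh。", "sjglasgs"],
})

round_trips = {}
for orient in ["split", "index", "records", "columns"]:
    text = data.to_json(orient=orient)
    restored = pd.read_json(StringIO(text), orient=orient)
    # Compare cell values only; dtypes and column order are not checked.
    same = restored[list(data.columns)].values.tolist() == data.values.tolist()
    round_trips[orient] = same

print(round_trips)
```

For this small frame every orient should restore the same values, so the choice between them mostly comes down to which JSON shape the receiving side expects.
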
3. JSON --> DataFrame

1. The transmitted data is a list

Code example 1:

from io import StringIO
import pandas as pd
strtext='[{"ttery":"min","issue":"20130801-3391","code":"8,4,5,2,9","code1":"297734529","code2":null,"time":1013395466000},\
{"ttery":"min","issue":"20130801-3390","code":"7,8,2,1,2","code1":"298058212","code2":null,"time":1013395406000},\
{"ttery":"min","issue":"20130801-3389","code":"5,9,1,2,9","code1":"298329129","code2":null,"time":1013395346000},\
{"ttery":"min","issue":"20130801-3388","code":"3,8,7,3,3","code1":"298588733","code2":null,"time":1013395286000},\
{"ttery":"min","issue":"20130801-3387","code":"0,8,5,2,7","code1":"298818527","code2":null,"time":1013395226000}]'

# a JSON list of objects maps to orient='records'
df = pd.read_json(StringIO(strtext), orient='records')
json_columns = '{"index":[1,2,3],"columns":["a","b","c"],"data":[[11,12,13],[13,14,15],[15,16,17]]}'
# json_columns = '{"columns":["name","values","describe"],"index":[0,1,2],"data":[["aa",123,"sgsggfsgsdfh"],["bb",135,"shsdhdghdgh\u3002"],["cc",146,"sjglasgs"]]}'
# stored under a separate name so that df above is not overwritten
df_split = pd.read_json(StringIO(json_columns), orient='split')

# Result (df)
	ttery	issue	code	code1	code2	time
0	min	20130801-3391	8,4,5,2,9	297734529	NaN	1013395466000
1	min	20130801-3390	7,8,2,1,2	298058212	NaN	1013395406000
2	min	20130801-3389	5,9,1,2,9	298329129	NaN	1013395346000
3	min	20130801-3388	3,8,7,3,3	298588733	NaN	1013395286000
4	min	20130801-3387	0,8,5,2,7	298818527	NaN	1013395226000
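
The result above still holds the time column as a raw integer and the code column as a comma-separated string. The sketch below shows one possible way to post-process them, assuming that time is milliseconds since the Unix epoch (the original article does not state the unit) and using made-up column names code_0 to code_4.

```python
from io import StringIO

import pandas as pd

# Two records in the same shape as strtext above.
sample = (
    '[{"issue":"20130801-3391","code":"8,4,5,2,9","code2":null,"time":1013395466000},'
    '{"issue":"20130801-3390","code":"7,8,2,1,2","code2":null,"time":1013395406000}]'
)
df = pd.read_json(StringIO(sample), orient="records", convert_dates=False)

# Assumption: "time" holds milliseconds since the Unix epoch.
df["time"] = pd.to_datetime(df["time"], unit="ms")

# Split the comma-separated "code" string into one integer column per digit.
digits = df["code"].str.split(",", expand=True).astype(int)
digits.columns = [f"code_{i}" for i in range(digits.shape[1])]

result = pd.concat([df.drop(columns="code"), digits], axis=1)
print(result.dtypes)
```

The null values in code2 are read as missing values, which matches the NaN column in the result table above.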

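Code example 2:

import pandas as pd
import json

# pd.json_normalize replaces the deprecated pandas.io.json.json_normalize import
data = '{"a":"value1","b":"value1"}'
parsed = json.loads(data)  # parse the JSON string into a dict
pd.json_normalize(parsed)  # one-row DataFrame with columns a and b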


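Summary

In data analysis, the pandas DataFrame covers most needs concisely, but real-world data rarely arrives in that shape; it most often comes as JSON or as a dict. The methods above convert such data directly into a DataFrame for processing. JSON is the more convenient format for transmitting data, so being able to convert between JSON and DataFrame in both directions is useful in practice.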

Original article: https://blog.csdn.net/qq_33624802/article/details/110437953
