Contents
XML example
Method 1: Using cElementTree
Method 2: Using read_xml()
Method 3: Using pd.json_normalize()
XML example
xml = '''<?xml version='1.0' encoding='utf-8'?>
<data>
<row>
<shape>square</shape>
<degrees>360</degrees>
<sides>4.0</sides>
</row>
<row>
<shape>circle</shape>
<degrees>360</degrees>
</row>
<row>
<shape>triangle</shape>
<degrees>180</degrees>
<sides>3.0</sides>
</row>
</data>'''
Method 1: Using cElementTree
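# cElementTree was removed in Python 3.9; ElementTree uses the C accelerator automatically
from xml.etree import ElementTree as ET
import pandas as pd

# Parse the XML string
root = ET.fromstring(xml)
# To parse an XML file instead:
# tree = ET.ElementTree(file="text.xml")
# root = tree.getroot()
data = []
for child in root:  # each <row> element
    row_values = []
    for son in child:  # each field of the row, in document order
        row_values.append(son.text)
    data.append(row_values)
# Values are assigned to columns purely by position
df = pd.DataFrame(data, columns=['shape', 'degrees', 'sides'])
print(df)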
Output:
      shape  degrees  sides
0    square      360    4.0
1    circle      360    NaN
2  triangle      180    3.0
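If shape, degrees, and sides do not appear in the same order in every row, reading values by position like this is error-prone.
For example, if the last row is reordered as degrees, shape, sides, the output becomes:
    shape   degrees  sides
0  square       360    4.0
1  circle       360   None
2     180  triangle    3.0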
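One way to avoid the ordering problem described above is to look each field up by tag name instead of by position. The following is a minimal sketch under that assumption, not the article's original method; the names xml_text, columns, and records are illustrative, and the sample data is the article's XML with the last row reordered.

```python
from xml.etree import ElementTree as ET

# The article's sample data, with the last row reordered (degrees before shape).
xml_text = """<data>
  <row><shape>square</shape><degrees>360</degrees><sides>4.0</sides></row>
  <row><shape>circle</shape><degrees>360</degrees></row>
  <row><degrees>180</degrees><shape>triangle</shape><sides>3.0</sides></row>
</data>"""

columns = ["shape", "degrees", "sides"]
root = ET.fromstring(xml_text)

records = []
for row in root:
    # findtext() finds the child by tag name and returns None when it is absent,
    # so neither element order nor a missing element shifts values between columns.
    records.append({col: row.findtext(col) for col in columns})

for record in records:
    print(record)
```

Passing records to pd.DataFrame(records) should then give one column per key, with the reordered row landing in the correct columns.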
Method 2: Using read_xml()
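from io import StringIO
import pandas as pd
# read_xml() needs pandas 1.3+; passing a literal XML string is deprecated since pandas 2.1, so wrap it in StringIO
df = pd.read_xml(StringIO(xml))
print(df)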
Output:
      shape  degrees  sides
0    square      360    4.0
1    circle      360    NaN
2  triangle      180    3.0
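By default read_xml() relies on the lxml package. As a hedged sketch, the variant below selects the standard-library parser with parser="etree" and names the row elements explicitly with xpath; the variable name xml_text and the choice of "./row" are illustrative assumptions, not part of the original article. It also shows that read_xml() infers numeric types, unlike the manual ElementTree loop, which yields strings.

```python
from io import StringIO

import pandas as pd

xml_text = """<?xml version='1.0' encoding='utf-8'?>
<data>
  <row><shape>square</shape><degrees>360</degrees><sides>4.0</sides></row>
  <row><shape>circle</shape><degrees>360</degrees></row>
  <row><shape>triangle</shape><degrees>180</degrees><sides>3.0</sides></row>
</data>"""

# parser="etree" uses xml.etree.ElementTree, so lxml does not need to be installed.
# xpath="./row" selects the <row> children of the root as the table rows.
df = pd.read_xml(StringIO(xml_text), xpath="./row", parser="etree")

print(df)
print(df.dtypes)  # degrees and sides are parsed as numbers, not strings
```

The etree parser supports only a limited XPath subset, so more complex expressions may still require lxml.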
Method 3: Using pd.json_normalize()
- Convert the XML into a JSON-like structure (nested dicts and lists)
- Load it into a DataFrame with pd.json_normalize()
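# cElementTree was removed in Python 3.9; use ElementTree instead
from xml.etree import ElementTree as ET
import pandas as pd


def fun1(root):
    """Recursively convert the children of an element into a nested dict."""
    dic1 = dict()
    for child in root:
        # len(child) counts sub-elements; truth-testing an element is deprecated in Python 3.12
        if len(child) > 0:  # child has a further level
            dic2 = fun1(child)  # recursive call
            value = dic1.get(child.tag)  # existing list, or None if the tag is new
            if value:  # tag seen before: append to its list
                value.append(dic2)
            else:  # first occurrence: start a new list
                dic1[child.tag] = [dic2]
        else:
            dic1[child.tag] = child.text  # leaf element: store its text
    return dic1


if __name__ == '__main__':
    root = ET.fromstring(xml)
    dic1 = fun1(root)
    # dic1['row'] is a list with one dict per <row> element
    df = pd.json_normalize(dic1['row'])
    print(df)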
Output:
      shape  degrees  sides
0    square      360    4.0
1    circle      360    NaN
2  triangle      180    3.0
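To make the first step of this method more concrete, the sketch below prints the intermediate JSON-like structure for a shortened version of the sample XML. element_to_dict is a hypothetical helper written for this illustration; it follows the same recursive idea as fun1 but is not the article's exact code.

```python
import json
from xml.etree import ElementTree as ET


def element_to_dict(element):
    """Hypothetical helper: convert an element's children into a nested dict."""
    result = {}
    for child in element:
        if len(child) > 0:
            # Children with sub-elements are collected into a list under their tag name.
            result.setdefault(child.tag, []).append(element_to_dict(child))
        else:
            # Leaf elements are stored as plain text.
            result[child.tag] = child.text
    return result


xml_text = """<data>
  <row><shape>square</shape><degrees>360</degrees><sides>4.0</sides></row>
  <row><shape>circle</shape><degrees>360</degrees></row>
</data>"""

nested = element_to_dict(ET.fromstring(xml_text))
print(json.dumps(nested, indent=2))
```

The list stored under "row" is what gets handed to pd.json_normalize(); the second dict has no "sides" key, which is why that cell shows up as NaN in the resulting DataFrame.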