Channel: How to extract nested values from XML with namespaces? - Stack Overflow

How to extract nested values from XML with namespaces?


I'm trying to extract some data from the following XML file.

<?xml version="1.0" encoding="utf-8"?>
<go-home-1:GOHOMEV1 xmlns:go-home-1="https://sample.com/GO-HOME-V1">
  <HOMEV1FileHeader>
    <FileCreationTimestamp>2020-02-15T08:29:22+01:00</FileCreationTimestamp>
    <FileType>AB716</FileType>
    <SGO>YIFG</SGO>
  </HOMEV1FileHeader>
  <OI>
    <ON>YIFG4</ON>
    <CI>HYU</CI>
    <NL>
      <NT>
        <GOCode>HYU34</GOCode>
        <NTName>HYUFFT - 11</NTName>
        <NTData>
          <RIS>
            <RI>
              <EDC>2020-01-18</EDC>
              <E4NS>
                <GNS>
                  <RD>
                    <NR>
                      <CC>9012</CC>
                      <NDC>411</NDC>
                      <SRng>
                        <SRngStart>000</SRngStart>
                        <SRngStop>999</SRngStop>
                      </SRng>
                    </NR>
                  </RD>
                  <RD>
                    <NR>
                      <CC>834</CC>
                      <NDC>101</NDC>
                      <SRng>
                        <SRngStart>150</SRngStart>
                        <SRngStop>295</SRngStop>
                      </SRng>
                    </NR>
                  </RD>
                </GNS>
              </E4NS>
              <E2NS>
                <MCC>111</MCC>
                <MNC>222</MNC>
              </E2NS>
              <E2G>
                <MGT_CC>9012</MGT_CC>
                <MGT_NC>4113</MGT_NC>
              </E2G>
            </RI>
          </RIS>
        </NTData>
      </NT>
    </NL>
  </OI>
</go-home-1:GOHOMEV1>

My expected output is shown below, with SGO as the first field.

[Image: expected output]

My attempt is below (taking ideas from "Getting all children of a node using xml.etree.ElementTree"), but I get either an error (for sgo = root.find()...) or an empty list (for A = root.findall()...), and I'm stuck. Thanks for any help.

import xml.etree.ElementTree as ET
import glob, os

filename = "file.xml"
namespaces = {"go-home-1": "https://sample.com/GO-HOME-V1"}
root = ET.parse(filename).getroot()

# For this sgo = root.find()... I get ERROR << AttributeError: 'NoneType' object has no attribute 'text'>>
sgo = root.find("go-home-1:HOMEV1FileHeader/"
                "go-home-1:SGO", namespaces).text

### For below I'm getting empty list A = [] and I don't know why.
A = root.findall("go-home-1:OI/go-home-1:NL/go-home-1:NT[1]/go-home-1:NTData/go-home-1:RIS/go-home-1:RI/go-home-1:E4NS/"
                 "go-home-1:GNS/"
                 "go-home-1:RD/"
                 "go-home-1:NR", namespaces)

for item1 in A:
    Result = [sgo]
    cc = item1.find("go-home-1:CC", namespaces).text
    ndc = item1.find("go-home-1:NDC", namespaces).text
    Result.append(cc)
    Result.append(ndc)
    B = item1.findall("go-home-1:OI/go-home-1:NL/go-home-1:NT[1]/go-home-1:NTData/go-home-1:RIS/go-home-1:RI/go-home-1:E4NS/"
                      "go-home-1:GNS/"
                      "go-home-1:RD/"
                      "go-home-1:NR/"
                      "go-home-1:SRng", namespaces)
    for item2 in B:
        RngStart = item2.find("go-home-1:SRngStart", namespaces).text
        RngStop = item2.find("go-home-1:SRngStop", namespaces).text
        Result.append(RngStart)
        Result.append(RngStop)
    print(Result)
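
A likely cause of both symptoms: in the XML above, only the root element carries the go-home-1 prefix, and no default namespace is declared, so every child element (HOMEV1FileHeader, SGO, OI, ...) is in no namespace at all. Paths that prefix each step with go-home-1: therefore match nothing, which gives None from find() and an empty list from findall(). The inner lookup also repeats the full path from the root although it starts at an NR element. Below is a minimal sketch with unprefixed paths. The helper name extract_rows, the trimmed inline copy of the XML, and the row layout (SGO, CC, NDC, SRngStart, SRngStop) are assumptions made for illustration, since the expected-output image is not available here.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the XML from the question (only the fields used below).
XML = (
    '<go-home-1:GOHOMEV1 xmlns:go-home-1="https://sample.com/GO-HOME-V1">'
    '<HOMEV1FileHeader><SGO>YIFG</SGO></HOMEV1FileHeader>'
    '<OI><NL><NT><NTData><RIS><RI><E4NS><GNS>'
    '<RD><NR><CC>9012</CC><NDC>411</NDC>'
    '<SRng><SRngStart>000</SRngStart><SRngStop>999</SRngStop></SRng></NR></RD>'
    '<RD><NR><CC>834</CC><NDC>101</NDC>'
    '<SRng><SRngStart>150</SRngStart><SRngStop>295</SRngStop></SRng></NR></RD>'
    '</GNS></E4NS></RI></RIS></NTData></NT></NL></OI>'
    '</go-home-1:GOHOMEV1>'
)


def extract_rows(root):
    """Return one [SGO, CC, NDC, SRngStart, SRngStop, ...] list per NR element.

    Hypothetical helper: the child elements are not namespaced, so the
    paths carry no prefix and no namespace mapping is passed.
    """
    sgo = root.findtext("HOMEV1FileHeader/SGO")
    rows = []
    for nr in root.findall("OI/NL/NT[1]/NTData/RIS/RI/E4NS/GNS/RD/NR"):
        row = [sgo, nr.findtext("CC"), nr.findtext("NDC")]
        # SRng is a direct child of NR, so the path is relative to nr.
        for rng in nr.findall("SRng"):
            row.append(rng.findtext("SRngStart"))
            row.append(rng.findtext("SRngStop"))
        rows.append(row)
    return rows


root = ET.fromstring(XML)
rows = extract_rows(root)
for row in rows:
    print(row)
```

With the file from the question, ET.parse(filename).getroot() could be passed to extract_rows in place of the inline string. Only the root tag itself would need the namespace, for example when checking root.tag.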
