Python - Organizing XML data into dictionaries

I'm trying to organize data from an XML file into a dictionary format. The data will be used to run Monte Carlo simulations.

Here is an example of what a couple of entries in the XML look like:

    <retirement>
        <item>
            <low>-0.34</low>
            <high>-0.32</high>
            <freq>0.0294117647058824</freq>
            <variable>stock</variable>
            <type>historic</type>
        </item>
        <item>
            <low>-0.32</low>
            <high>-0.29</high>
            <freq>0</freq>
            <variable>stock</variable>
            <type>historic</type>
        </item>
    </retirement>

My current data sets have only two variables, and the type can be one of three or possibly four discrete types. Hard coding two variables isn't a problem, but I want to start working with data that has many more variables and automate the process. My goal is to automatically import the XML data into a dictionary so that I can manipulate it further later without having to hard code the array titles and variables.

Here is what I have:

    # Import XML parser
    import xml.etree.ElementTree as ET

    # Parse XML directly from the file path
    tree = ET.parse('xmlfile')

    # Create iterable item list
    Items = tree.findall('item')

    # Create master dictionary
    masterdictionary = {}

    # Assign variables to dictionary
    for Item in Items:
        thiskey = Item.find('variable').text
        if thiskey in masterdictionary == False:
            masterdictionary[thiskey] = []
        else:
            pass

    thislist = masterdictionary[thiskey]
    newdatapoint = datapoint(float(Item.find('low').text), float(Item.find('high').text), float(Item.find('freq').text))
    thissublist.append(newdatapoint)

I'm getting a KeyError at thislist = masterdictionary[thiskey]

I am also trying to create a class to deal with the other elements of the XML:

    # Define a class for each data point that contains low, hi and freq attributes
    class datapoint:
     def __init__(self, low, high, freq):
      self.low = low
      self.high = high
      self.freq = freq

I would then be able to check a value like:

    masterdictionary['stock'][0].freq

Any and all help is appreciated.


Thanks, John. The indentation issues were sloppiness on my part. It's my first time posting on Stack and I didn't copy/paste it right. The part after the else: is in fact indented to be part of the for loop, and the class is indented with 4 spaces in my code; it was just bad posting here. I'll keep the capitalization convention in mind. Your suggestion indeed worked, and these commands now print the expected keys and value:

    print masterdictionary.keys()
    print masterdictionary['stock'][0].low


    ['inflation', 'stock']
    -0.34

Those are indeed my two variables, and the value matches the XML listed at the top.

Update 2

Well, I thought I had figured this one out, but I was careless again and it turns out I hadn't quite fixed the issue. The previous solution ended up writing all of the data to both dictionary keys, so I have two equal lists of all the data assigned to two different dictionary keys. The idea is to have distinct sets of data from the XML assigned to the matching dictionary key. Here is the current code:

    # Import XML parser
    import xml.etree.ElementTree as ET

    # Parse XML directly from the file path
    tree = ET.parse(xml file)

    # Create iterable item list
    items = tree.findall('item')

    # Create class for historic variables
    class datapoint:
        def __init__(self, low, high, freq):
            self.low = low
            self.high = high
            self.freq = freq

    # Create master dictionary and variable list for historic variables
    masterdictionary = {}
    thislist = []

    # Loop to assign variables as dictionary keys and associate their values with them
    for item in items:
        thiskey = item.find('variable').text

        masterdictionary[thiskey] = thislist
        if thiskey not in masterdictionary:
            masterdictionary[thiskey] = []
        newdatapoint = datapoint(float(item.find('low').text), float(item.find('high').text), float(item.find('freq').text))
        thislist.append(newdatapoint)

When I input:

    print masterdictionary['stock'][5].low
    print masterdictionary['inflation'][5].low
    print len(masterdictionary['stock'])
    print len(masterdictionary['inflation'])

the results are identical for both keys ('stock' and 'inflation'):

    -.22
    -.22
    56
    56

There are 27 items with the stock tag in the XML file and 29 tagged inflation. How can I make each list assigned to a dictionary key pull only its particular data in the loop?
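The identical results are consistent with both keys being bound to one and the same list object: in the loop above, masterdictionary[thiskey] = thislist runs before the membership check, so the check never succeeds and every data point lands in the single thislist. A minimal sketch of that effect, assuming Python 3 and using made-up keys and values instead of the real XML (the names shared and master are illustrative only):

```python
# Hypothetical values; the point is object identity, not the data.
shared = []
master = {}

for key, value in [('stock', -0.34), ('inflation', 0.01), ('stock', -0.32)]:
    master[key] = shared   # every key is bound to the same list object
    shared.append(value)   # so every value ends up visible under every key

print(master['stock'] is master['inflation'])           # True
print(len(master['stock']), len(master['inflation']))   # 3 3
```

The total of 56 entries under each key (27 + 29) fits this picture: there is only one list, and both keys point at it.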

Update 3

It seems to work with two loops, but I have no idea how, or why it won't work in one single loop. I managed this accidentally:

    # Import XML parser
    import xml.etree.ElementTree as ET

    # Parse XML directly from the file path
    tree = ET.parse(xml file)

    # Create iterable item list
    items = tree.findall('item')

    # Create class for historic variables
    class datapoint:
        def __init__(self, low, high, freq):
            self.low = low
            self.high = high
            self.freq = freq

    # Create master dictionary and variable list for historic variables
    masterdictionary = {}

    # Loop to assign variables as dictionary keys and associate their values with them
    for item in items:
        thiskey = item.find('variable').text
        thislist = []
        masterdictionary[thiskey] = thislist

    for item in items:
        thiskey = item.find('variable').text
        newdatapoint = datapoint(float(item.find('low').text), float(item.find('high').text), float(item.find('freq').text))
        masterdictionary[thiskey].append(newdatapoint)

I have tried a large number of permutations to make this happen in one single loop, but with no luck. I can get either all of the data listed under both keys as identical arrays of all the data (not very helpful), or the data sorted correctly into two distinct arrays for the two keys but with only the last single data entry (the loop overwrites the array each time, leaving only one entry in it).
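A single loop can work if each key gets its own new list only the first time it is seen, and each data point is appended to the list looked up by that key. A minimal sketch of one way to do this, assuming Python 3 and collections.defaultdict; the SAMPLE_XML string (including its invented inflation entry), the DataPoint spelling, and the group_items helper are made up for illustration and are not from the original post:

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample data; the inflation values are invented for illustration.
SAMPLE_XML = """<retirement>
    <item><low>-0.34</low><high>-0.32</high><freq>0.0294117647058824</freq>
          <variable>stock</variable><type>historic</type></item>
    <item><low>-0.32</low><high>-0.29</high><freq>0</freq>
          <variable>stock</variable><type>historic</type></item>
    <item><low>0.01</low><high>0.02</high><freq>0.5</freq>
          <variable>inflation</variable><type>historic</type></item>
</retirement>"""


class DataPoint:
    """Holds the low, high and freq values of one <item>."""

    def __init__(self, low, high, freq):
        self.low = low
        self.high = high
        self.freq = freq


def group_items(root):
    """Group <item> elements by their <variable> text in one pass."""
    grouped = defaultdict(list)  # a missing key gets its own fresh list
    for item in root.findall('item'):
        key = item.find('variable').text
        grouped[key].append(DataPoint(
            float(item.find('low').text),
            float(item.find('high').text),
            float(item.find('freq').text),
        ))
    return dict(grouped)


master = group_items(ET.fromstring(SAMPLE_XML))
print(sorted(master))          # ['inflation', 'stock']
print(len(master['stock']))    # 2
print(master['stock'][0].low)  # -0.34
```

The same effect is available with a plain dictionary by testing if key not in master: before assigning an empty list, or with master.setdefault(key, []).append(...); the important part is that a new list is created per key and never reassigned on later iterations.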

You have a serious indentation problem after the (unnecessary) else: pass. Fix that and try again. Does the problem occur with your sample input data? With other data? The first time around the loop? What is the value of thiskey that is causing the problem [hint: it's reported in the KeyError error message]? What are the contents of masterdictionary just before the error happens [hint: sprinkle a few print statements around your code]?
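Both hints can be tried in isolation. A small sketch, assuming Python 3 and a made-up key, showing that the missing key travels with the KeyError and that a print just before the lookup reveals the state of the dictionary (master stands in for masterdictionary here):

```python
master = {}     # stands in for masterdictionary; still empty at lookup time
key = 'stock'   # hypothetical key

# Print the state just before the lookup that may fail.
print('about to look up', repr(key), 'in', master)

try:
    master[key]
except KeyError as err:
    missing = err.args[0]  # the key that was not found
    print('KeyError for key:', repr(missing))
```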

Other remarks not relevant to your problem:

Instead of if thiskey in masterdictionary == False: consider using if thiskey not in masterdictionary: ... Comparisons against True or False are always redundant and/or a bit of a "code smell".

Python convention is to reserve names with an initial capital letter (like Item) for classes.

Using 1 space per indentation level makes code almost illegible and is severely deprecated. Always use 4 (unless you have a good reason, though I've never heard of one).

Update: I was wrong. thiskey in masterdictionary == False is worse than I thought; because in is a relational operator, chained evaluation is used (like a <= b < c), so you have (thiskey in masterdictionary) and (masterdictionary == False), which will always evaluate to False, and thus the dictionary is never updated. The fix is as I suggested: use if thiskey not in masterdictionary:

It also looks like thislist (initialised but not used) should be thissublist (used but not initialised).
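The chained evaluation can be checked directly. A short sketch, assuming Python 3 and a made-up key, comparing the chained form with its expanded equivalent and with the intended test (the variable names are illustrative only):

```python
master = {}
key = 'stock'

# Chained comparison: evaluated as (key in master) and (master == False)
chained = key in master == False
explicit = (key in master) and (master == False)

# What was actually meant
intended = key not in master

print(chained, explicit, intended)  # False False True
```

If the key is absent, the first half is False; if it is present, the dictionary still does not compare equal to False, so the chained form is False either way.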

