Hi, I am fairly new to Python and not sure how to proceed. I have a large dataset that contains both parent and child information. For example, suppose we have various items and their components, and those components have components (children) of their own: how do we build a tree structure from that? Here is an example of the data:
I was wondering how I can get it into a tree structure. So the output would be:
Tree structure for car
and it will also return the one for airplane, similar to the one for car.
I know that the common attribute for this would be the parent number/child number, but I am not sure how to go about this in Python.
Use a class to encode the structure:
class TreeNode:
    def __init__(self, number, name):
        self.number = number
        self.name = name
        self.children = []

    def addChild(self, child):
        self.children.append(child)
One example of how to use it:
car = TreeNode(1111, "car")
engine = TreeNode(3333, "engine")
car.addChild(engine)
Note: The number attribute doesn't have to be an int (e.g. 1111 for car); it can just as well be a string of the integer (i.e. "1111").
To actually get something resembling your desired output, we'll need to serialize the root object into nested dictionaries:
class TreeNode:
    def __init__(self, number, name):
        self.number = number
        self.name = name
        self.children = []

    def addChild(self, child):
        self.children.append(child)

    def serialize(self):
        # Map each child's name to its own serialized subtree.
        s = {}
        for child in self.children:
            s[child.name] = child.serialize()
        return s
Now, we can get something resembling your desired output by using json.dumps:
import json

dummy = TreeNode(None, None)  # think of this as the root/table
car = TreeNode(1111, "car")
dummy.addChild(car)
engine = TreeNode(3333, "engine")
car.addChild(engine)
fan = TreeNode(4312, "fan")
engine.addChild(fan)
print(json.dumps(dummy.serialize(), indent=4))
prints:
{
    "car": {
        "engine": {
            "fan": {}
        }
    }
}
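Since your data comes as a table of parent/child numbers rather than hand-built nodes, here is a rough sketch of how the tree could be assembled from rows. I don't know your exact column layout, so the (number, name, parent_number) tuple format, the build_forest helper, and the airplane/wing rows are all assumptions made for illustration. It also assumes every number is unique and every item has at most one parent.

```python
import json


class TreeNode:
    def __init__(self, number, name):
        self.number = number
        self.name = name
        self.children = []

    def addChild(self, child):
        self.children.append(child)

    def serialize(self):
        return {child.name: child.serialize() for child in self.children}


def build_forest(rows):
    """Hypothetical helper: rows are (number, name, parent_number) tuples."""
    # First pass: create one node per row, indexed by its number.
    nodes = {number: TreeNode(number, name) for number, name, _ in rows}
    dummy = TreeNode(None, None)  # root that holds every top-level item
    # Second pass: attach each node to its parent. Rows whose parent is
    # None (or not present in the data) go directly under the dummy root.
    for number, _, parent_number in rows:
        parent = nodes.get(parent_number, dummy)
        parent.addChild(nodes[number])
    return dummy


# Assumed sample data; only car/engine/fan come from the example above.
rows = [
    (1111, "car", None),
    (3333, "engine", 1111),
    (4312, "fan", 3333),
    (2222, "airplane", None),
    (5555, "wing", 2222),
]

forest = build_forest(rows)
print(json.dumps(forest.serialize(), indent=4))
```

Doing it in two passes means the rows don't need to be sorted so that parents come before children. If you only want one tree (say, just car), you can serialize that single node instead of the dummy root.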