High-dimensional data structure in Python

2024/9/25 11:12:18

What is the best way to store and analyze high-dimensional data in Python? I like the pandas DataFrame and Panel, where I can easily manipulate the axes. Now I have a hypercube (dim >= 4) of data. I have been thinking of things like a dict of Panels, or tuples as Panel entries. I wonder if there is a high-dimensional Panel-like structure in Python.

Update 20/05/16: Thanks very much for all the answers. I have tried MultiIndex and xarray, but I am not able to comment on either of them. For my problem I will try to use an ndarray instead, as I found the labels are not essential and I can save them separately.

Update 16/09/16: I ended up using MultiIndex. Manipulating it is pretty tricky at first, but I have gotten used to it by now.


MultiIndex is the most useful option for higher-dimensional data, as explained in the docs and this SO answer, because it allows you to work with any number of dimensions in a DataFrame environment.
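As a rough illustration of the idea, here is a minimal sketch that flattens a 4-D ndarray into a DataFrame with a three-level MultiIndex. The axis names (`year`, `region`, `product`) and the column names (`price`, `volume`) are made up for the example and are not from the question; the data is just a counter so the positions are easy to follow.

```python
import numpy as np
import pandas as pd

# Hypothetical axis labels for a 4-D cube (illustrative only).
years = [2015, 2016]
regions = ["north", "south"]
products = ["a", "b", "c"]
metrics = ["price", "volume"]

# Dummy data with shape (year, region, product, metric).
cube = np.arange(24, dtype=float).reshape(2, 2, 3, 2)

# The first three axes become index levels, the last axis becomes columns.
# from_product varies the last level fastest, matching NumPy's C-order reshape.
index = pd.MultiIndex.from_product(
    [years, regions, products], names=["year", "region", "product"]
)
df = pd.DataFrame(cube.reshape(-1, len(metrics)), index=index, columns=metrics)

# Select a single cell by its full label tuple.
cell = df.loc[(2016, "south", "b"), "price"]

# Take a cross-section along one level (drops that level from the index).
north = df.xs("north", level="region")

# Aggregate over all levels except one.
volume_by_year = df.groupby(level="year")["volume"].sum()

# Pivot one index level into columns to get a wide 2-D view.
wide = df["price"].unstack("product")

# Recover the original ndarray when labels are no longer needed.
back = df.to_numpy().reshape(cube.shape)
```

The trade-off is that every operation goes through level names rather than axis positions, which takes some getting used to, but the labels travel with the data instead of being stored separately.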

In addition to Panel, there is also Panel4D, currently at an experimental stage. Given the advantages of MultiIndex, I wouldn't recommend using either it or the three-dimensional version. I don't think these data structures have gained much traction in comparison, and they will indeed be phased out.

