Nested Json to pandas DataFrame with specific format

2024/11/19 21:43:50

I need to format the contents of a Json file in a certain format in a pandas DataFrame so that I can run pandassql to transform the data and run it through a scoring model.

file = C:\scoring_model\json.js (contents of 'file' are below)

{
"response":{"version":"1.1","token":"dsfgf","body":{"customer":{"customer_id":"1234567","verified":"true"},"contact":{"email":"[email protected]","mobile_number":"0123456789"},"personal":{"gender": "m","title":"Dr.","last_name":"Muster","first_name":"Max","family_status":"single","dob":"1985-12-23",}}}

I need the dataframe to look like this (obviously all values on same row, tried to format it best as possible for this question):

version | token | customer_id | verified | email      | mobile_number | gender |
1.1     | dsfgf | 1234567     | true     | [email protected] | 0123456789    | m      |title | last_name | first_name |family_status | dob
Dr.   | Muster    | Max        | single       | 23.12.1985

I have looked at all the other questions on this topic, have tried various ways to load Json file into pandas

with open(r'C:\scoring_model\json.js', 'r') as f:c = pd.read_json(f.read())with open(r'C:\scoring_model\json.js', 'r') as f:c = f.readlines()

tried pd.Panel() in this solution Python Pandas: How to split a sorted dictionary in a column of a dataframe with dataframe results from [yo = f.readlines()]. I thought about trying to split contents of each cell based on ("") and find a way to put the split contents into different columns but no luck so far.

Answer

If you load in the entire json as a dict (or list) e.g. using json.load, you can use json_normalize:

In [11]: d = {"response": {"body": {"contact": {"email": "[email protected]", "mobile_number": "0123456789"}, "personal": {"last_name": "Muster", "gender": "m", "first_name": "Max", "dob": "1985-12-23", "family_status": "single", "title": "Dr."}, "customer": {"verified": "true", "customer_id": "1234567"}}, "token": "dsfgf", "version": "1.1"}}In [12]: df = pd.json_normalize(d)In [13]: df.columns = df.columns.map(lambda x: x.split(".")[-1])In [14]: df
Out[14]:email mobile_number customer_id verified         dob family_status first_name gender last_name title  token version
0  [email protected]    0123456789     1234567     true  1985-12-23        single        Max      m    Muster   Dr.  dsfgf     1.1
https://en.xdnf.cn/q/26398.html

Related Q&A

Iterating over dictionary items(), values(), keys() in Python 3

If I understand correctly, in Python 2, iter(d.keys()) was the same as d.iterkeys(). But now, d.keys() is a view, which is in between the list and the iterator. Whats the difference between a view and …

Is there a method that tells my program to quit?

For the "q" (quit) option in my program menu, I have the following code:elif choice == "q":print()That worked all right until I put it in an infinite loop, which kept printing blank…

Hiding Axis Labels

Im trying to hide the axis labels on the first subplot at 211. Id like to label the figure, not just a subplot (reference: "Isub Event Characteristics"). How can I control font properties lik…

Why does Python preemptively hang when trying to calculate a very large number?

Ive asked this question before about killing a process that uses too much memory, and Ive got most of a solution worked out.However, there is one problem: calculating massive numbers seems to be untouc…

Django, name parameter in urlpatterns

Im following a tutorial where my urlpatterns are:urlpatterns = patterns(,url(r^passwords/$, PasswordListView.as_view(), name=passwords_api_root),url(r^passwords/(?P<id>[0-9]+)$, PasswordInstance…

The most Pythonic way of checking if a value in a dictionary is defined/has zero length

Say I have a dictionary, and I want to check if a key is mapped to a nonempty value. One way of doing this would be the len function:mydict = {"key" : "value", "emptykey"…

What does the --pre option in pip signify?

I saw on this page that pip install neo4j-doc-manager --pre was used. What does the --pre flag mean?

Tracing and Returning a Path in Depth First Search

So I have a problem that I want to use depth first search to solve, returning the first path that DFS finds. Here is my (incomplete) DFS function:start = problem.getStartState()stack = Stack()visited =…

Pandas OHLC aggregation on OHLC data

I understand that OHLC re-sampling of time series data in Pandas, using one column of data, will work perfectly, for example on the following dataframe:>>df ctime openbid 1443654000 1.1170…

Python plotting libraries [closed]

Closed. This question is seeking recommendations for books, tools, software libraries, and more. It does not meet Stack Overflow guidelines. It is not currently accepting answers.We don’t allow questi…