I'm trying to do some simple JSON parsing using Python 3's built-in json module, and from reading a bunch of other questions on SO and googling, it seems this is supposed to be pretty straightforward. However, I think I'm getting a string returned instead of the expected dictionary.
Firstly, here is the JSON I am trying to get values from. It's just some output from Twitter's API:
[{'in_reply_to_status_id_str': None, 'in_reply_to_screen_name': None, 'retweeted': False, 'in_reply_to_status_id': None, 'contributors': None, 'favorite_count': 0, 'in_reply_to_user_id': None, 'coordinates': None, 'source': '<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>', 'geo': None, 'retweet_count': 0, 'text': 'Tweeting a url \nhttp://t.co/QDVYv6bV90', 'created_at': 'Mon Sep 01 19:36:25 +0000 2014', 'entities': {'symbols': [], 'user_mentions': [], 'urls': [{'expanded_url': 'http://www.isthereanappthat.com', 'display_url': 'isthereanappthat.com', 'url': 'http://t.co/QDVYv6bV90', 'indices': [16, 38]}], 'hashtags': []}, 'id_str': '506526005943865344', 'in_reply_to_user_id_str': None, 'truncated': False, 'favorited': False, 'lang': 'en', 'possibly_sensitive': False, 'id': 506526005943865344, 'user': {'profile_text_color': '333333', 'time_zone': None, 'entities': {'description': {'urls': []}}, 'url': None, 'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme1/bg.png', 'profile_background_image_url_https': 'https://abs.twimg.com/images/themes/theme1/bg.png', 'protected': False, 'default_profile_image': True, 'utc_offset': None, 'default_profile': True, 'screen_name': 'KickzWatch', 'follow_request_sent': False, 'following': False, 'profile_background_color': 'C0DEED', 'notifications': False, 'description': '', 'profile_sidebar_border_color': 'C0DEED', 'geo_enabled': False, 'verified': False, 'friends_count': 40, 'created_at': 'Mon Sep 01 16:29:18 +0000 2014', 'is_translator': False, 'profile_sidebar_fill_color': 'DDEEF6', 'statuses_count': 4, 'location': '', 'id_str': '2784389341', 'followers_count': 4, 'favourites_count': 0, 'contributors_enabled': False, 'is_translation_enabled': False, 'lang': 'en', 'profile_image_url': 'http://abs.twimg.com/sticky/default_profile_images/default_profile_6_normal.png', 'profile_image_url_https': 'https://abs.twimg.com/sticky/default_profile_images/default_profile_6_normal.png', 'id': 
2784389341, 'profile_use_background_image': True, 'listed_count': 0, 'profile_background_tile': False, 'name': 'Maktub Destiny', 'profile_link_color': '0084B4'}, 'place': None}]
I assigned this String to a variable named json_string like so:
json_string = json.dumps(output)
jason = json.loads(json_string)
Then, when I try to get a specific key from the "jason" dictionary:
print(jason['hashtags'])
I'm getting an error:
TypeError: string indices must be integers
I want to be able to convert the JSON output to a dictionary, then use a jason[key_name] call to get values using specified keys. Is there something obvious that I'm missing here?
This is my first time working with Python, after coming from Java. I absolutely love the language and think it's very powerful. So, any help on this would be greatly appreciated!
OK, first you should pretty-print your object so that you can read it:
>>> from pprint import pprint
>>> output = [{'in_reply_to_status_id_str': None, 'in_reply_to_screen_name': None, 'retweeted': False, 'in_reply_to_status_id': None, 'contributors': None, 'favorite_count': 0, 'in_reply_to_user_id': None, 'coordinates': None, 'source': '<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>', 'geo': None, 'retweet_count': 0, 'text': 'Tweeting a url \nhttp://t.co/QDVYv6bV90', 'created_at': 'Mon Sep 01 19:36:25 +0000 2014', 'entities': {'symbols': [], 'user_mentions': [], 'urls': [{'expanded_url': 'http://www.isthereanappthat.com', 'display_url': 'isthereanappthat.com', 'url': 'http://t.co/QDVYv6bV90', 'indices': [16, 38]}], 'hashtags': []}, 'id_str': '506526005943865344', 'in_reply_to_user_id_str': None, 'truncated': False, 'favorited': False, 'lang': 'en', 'possibly_sensitive': False, 'id': 506526005943865344, 'user': {'profile_text_color': '333333', 'time_zone': None, 'entities': {'description': {'urls': []}}, 'url': None, 'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme1/bg.png', 'profile_background_image_url_https': 'https://abs.twimg.com/images/themes/theme1/bg.png', 'protected': False, 'default_profile_image': True, 'utc_offset': None, 'default_profile': True, 'screen_name': 'KickzWatch', 'follow_request_sent': False, 'following': False, 'profile_background_color': 'C0DEED', 'notifications': False, 'description': '', 'profile_sidebar_border_color': 'C0DEED', 'geo_enabled': False, 'verified': False, 'friends_count': 40, 'created_at': 'Mon Sep 01 16:29:18 +0000 2014', 'is_translator': False, 'profile_sidebar_fill_color': 'DDEEF6', 'statuses_count': 4, 'location': '', 'id_str': '2784389341', 'followers_count': 4, 'favourites_count': 0, 'contributors_enabled': False, 'is_translation_enabled': False, 'lang': 'en', 'profile_image_url': 'http://abs.twimg.com/sticky/default_profile_images/default_profile_6_normal.png', 'profile_image_url_https': 'https://abs.twimg.com/sticky/default_profile_images/default_profile_6_normal.png', 
'id': 2784389341, 'profile_use_background_image': True, 'listed_count': 0, 'profile_background_tile': False, 'name': 'Maktub Destiny', 'profile_link_color': '0084B4'}, 'place': None}]
>>> pprint(output)
[{'contributors': None,
  'coordinates': None,
  'created_at': 'Mon Sep 01 19:36:25 +0000 2014',
  'entities': {'hashtags': [],
               'symbols': [],
               'urls': [{'display_url': 'isthereanappthat.com',
                         'expanded_url': 'http://www.isthereanappthat.com',
                         'indices': [16, 38],
                         'url': 'http://t.co/QDVYv6bV90'}],
               'user_mentions': []},
  'favorite_count': 0,
  'favorited': False,
  'geo': None,
  'id': 506526005943865344,
  'id_str': '506526005943865344',
  'in_reply_to_screen_name': None,
  'in_reply_to_status_id': None,
  'in_reply_to_status_id_str': None,
  'in_reply_to_user_id': None,
  'in_reply_to_user_id_str': None,
  'lang': 'en',
  'place': None,
  'possibly_sensitive': False,
  'retweet_count': 0,
  'retweeted': False,
  'source': '<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>',
  'text': 'Tweeting a url \nhttp://t.co/QDVYv6bV90',
  'truncated': False,
  'user': {'contributors_enabled': False,
           'created_at': 'Mon Sep 01 16:29:18 +0000 2014',
           'default_profile': True,
           'default_profile_image': True,
           'description': '',
           'entities': {'description': {'urls': []}},
           'favourites_count': 0,
           'follow_request_sent': False,
           'followers_count': 4,
           'following': False,
           'friends_count': 40,
           'geo_enabled': False,
           'id': 2784389341,
           'id_str': '2784389341',
           'is_translation_enabled': False,
           'is_translator': False,
           'lang': 'en',
           'listed_count': 0,
           'location': '',
           'name': 'Maktub Destiny',
           'notifications': False,
           'profile_background_color': 'C0DEED',
           'profile_background_image_url': 'http://abs.twimg.com/images/themes/theme1/bg.png',
           'profile_background_image_url_https': 'https://abs.twimg.com/images/themes/theme1/bg.png',
           'profile_background_tile': False,
           'profile_image_url': 'http://abs.twimg.com/sticky/default_profile_images/default_profile_6_normal.png',
           'profile_image_url_https': 'https://abs.twimg.com/sticky/default_profile_images/default_profile_6_normal.png',
           'profile_link_color': '0084B4',
           'profile_sidebar_border_color': 'C0DEED',
           'profile_sidebar_fill_color': 'DDEEF6',
           'profile_text_color': '333333',
           'profile_use_background_image': True,
           'protected': False,
           'screen_name': 'KickzWatch',
           'statuses_count': 4,
           'time_zone': None,
           'url': None,
           'utc_offset': None,
           'verified': False}}]
From looking at this you can see that output is a list which contains a single dict. To access this you need:
>>> first_elem = output[0]
You will also see that the hashtags key in first_elem is contained in a second-level dict under the key entities:
>>> entities = first_elem['entities']
>>> pprint(entities)
{'hashtags': [],
 'symbols': [],
 'urls': [{'display_url': 'isthereanappthat.com',
           'expanded_url': 'http://www.isthereanappthat.com',
           'indices': [16, 38],
           'url': 'http://t.co/QDVYv6bV90'}],
 'user_mentions': []}
Now you are able to access hashtags:
>>> entities['hashtags']
[]
Which just happens to be the empty list.
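If the API returns more than one tweet, the same list-then-dict walk can be wrapped in a small helper. This is only a sketch: collect_hashtags and the trimmed-down sample data are hypothetical names made up for illustration, and .get() is used so that a tweet without an entities key yields an empty list instead of raising KeyError.

```python
def collect_hashtags(tweets):
    """Return one list of hashtag entries per tweet (empty if none)."""
    # Each tweet is a dict; hashtags live one level down, under 'entities'.
    return [tweet.get("entities", {}).get("hashtags", []) for tweet in tweets]


# Hypothetical, trimmed-down data with the same shape as the output above.
sample = [
    {"text": "Tweeting a url", "entities": {"hashtags": [], "urls": []}},
    {"text": "No entities key at all"},
]

print(collect_hashtags(sample))  # [[], []]
```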
To convert to JSON, note the comment:
>>> import json
>>> # Make sure output is the list object not a string representing the object
>>> json_string = json.dumps(output)
>>> jason = json.loads(json_string)
>>> jason[0]['entities']['hashtags']
[]
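I think your problem is that you made output a string before you called json.dumps on it. In that case json.dumps encodes a JSON string, and json.loads simply gives that string back rather than a list of dicts, which is why indexing it with a key raises the TypeError.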
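A minimal sketch of the difference, assuming the object was stringified with str() somewhere along the way (the question does not show how output was built, so that part is a guess, and tweets is a hypothetical stand-in for the real response):

```python
import json

# Hypothetical stand-in for the Twitter response: a list holding one dict.
tweets = [{"entities": {"hashtags": []}}]

# Case 1: the object is already a string when it reaches json.dumps.
as_text = str(tweets)
round_tripped = json.loads(json.dumps(as_text))
print(type(round_tripped).__name__)  # str

try:
    round_tripped["hashtags"]
except TypeError as exc:
    # Same kind of error as in the question.
    print("TypeError:", exc)

# Case 2: the real list is serialized, so json.loads rebuilds the list.
restored = json.loads(json.dumps(tweets))
print(type(restored).__name__)  # list
print(restored[0]["entities"]["hashtags"])  # []
```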
And @Dan's answer is correct: this is not valid JSON (it uses single quotes, None and False). It is, however, a valid Python literal, a list containing a dict, and I'm assuming that you got it from Twitter using Python and then printed it.
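If all you have left is the printed text, one possible way back to a real object is ast.literal_eval from the standard library, which safely evaluates Python literals. The sketch below uses a hypothetical, shortened copy of the printed output; it shows that json.loads rejects the text while ast.literal_eval accepts it, after which json.dumps produces genuine JSON.

```python
import ast
import json

# Hypothetical, shortened copy of the printed output: Python repr, not JSON.
printed = "[{'retweeted': False, 'geo': None, 'entities': {'hashtags': []}}]"

try:
    json.loads(printed)
    is_json = True
except json.JSONDecodeError:
    # Single quotes, False and None are not JSON syntax.
    is_json = False
print(is_json)  # False

# Parse the text as a Python literal instead.
tweets = ast.literal_eval(printed)
print(tweets[0]["entities"]["hashtags"])  # []

# Serializing the real object now yields valid JSON (double quotes, false, null).
valid_json = json.dumps(tweets)
print(valid_json)
```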