Question 1

I have to parse a text file that contains different kinds of data. The most challenging part is a line that contains three different JSON objects (as strings) with other data between them. I have to separate the JSON data from the rest. The good thing is that every JSON object starts with a name. The issue I'm having with regex is isolating the first JSON string from the others so that I can parse it with json. Here is my solution (it works), but I bet there is something better... I'm not good at regex yet.

import re

# Return the index of the '}' that closes the first JSON object in text.
def get_json_first_end(text):
    ind_ret = 0
    ind1 = 0  # current brace nesting depth
    for i, v in enumerate(text):
        if v == '{':
            ind1 = ind1 + 1
        if v == '}':
            ind1 = ind1 - 1
            if ind1 == 0:
                ind_ret = i
                break
    return ind_ret

# Return the string that contains the JSON object following json_name.
def get_json_str(line, json_name):
    js_str = ''
    if re.match('(.*)' + json_name + '(.*)', line):
        # Remove all spurious data before the JSON object
        data = re.sub('(.*)' + json_name, '', line)
        ind1 = data.find('{')
        ind2 = data.rfind('}')
        # ind3 is relative to the slice that starts at ind1
        ind3 = get_json_first_end(data[ind1:ind2+1])
        js_str = data[ind1:ind1 + ind3 + 1]
    return js_str

If I don't call get_json_first_end, ind2 can be wrong when there are multiple JSON strings on the same line. get_json_str returns a string containing the JSON object I want, and I can parse it with json without issues. My question is: is there a better way to do this? get_json_first_end seems quite ugly. Thanks

Update: here is an example line:

ConfigJSON ["CFG","VAR","1","[unused bit 2]","[unused bit 3]","[unused bit 4]","[unused bit 5]"] 2062195231AppTitle "Fsdn" 3737063363Bits ["RESET","QUICK","KILL","[unused bit 2]","[unused bit 3]","[unused bit 4]","[unused bit 5]"] 0837383711CRC 33628 0665393097ForceBits {"Auxiliary":[{"index":18,"name":"AUX1.INPUT"},{"index":19,"name":"AUX2.INPUT"}],"Network":[{"index":72,"name":"INPUT.1"}],"Physical":[]}

Answer

Your string is in a custom format. It may be possible to do this with a regex, but I tried a simple loop instead. You need to find an opening bracket, [ or {, and then its corresponding closing bracket, ] or }.

>>> import json
>>> string = '["CFG","VAR","1","[unused bit 2]","[unused bit 3]","[unused bit 4]","[unused bit 5]"] 2062195231AppTitle "Fsdn" 3737063363Bits ["RESET","QUICK","KILL","[unused bit 2]","[unused bit 3]","[unused bit 4]","[unused bit 5]"] 0837383711CRC 33628 0665393097ForceBits {"Auxiliary":[{"index":18,"name":"AUX1.INPUT"},{"index":19,"name":"AUX2.INPUT"}],"Network":[{"index":72,"name":"INPUT.1"}],"Physical":[]}'
>>> def getjson(string):
...     square = ['[', ']']
...     curly = ['{', '}']
...     count = 0
...     json_list = []
...     character = ''
...     complement_character = ''
...     start = 0
...     end = 0
...     for i in range(len(string)):
...         if not character:
...             # look for the opening bracket of the next JSON value
...             if string[i] == square[0]:
...                 character = square[0]
...                 complement_character = square[1]
...                 start = i
...                 count += 1
...             elif string[i] == curly[0]:
...                 character = curly[0]
...                 complement_character = curly[1]
...                 start = i
...                 count += 1
...         else:
...             # when character [ or { is found find corresponding ] or } using count.
...             if string[i] == character:
...                 count += 1
...             elif string[i] == complement_character:
...                 count -= 1
...         if count == 0 and character:
...             # the brackets are balanced again: one complete JSON value
...             character = ''
...             complement_character = ''
...             end = i + 1
...             json_list.append(json.loads(string[start:end]))
...     return json_list
...
>>> print getjson(string)
[[u'CFG', u'VAR', u'1', u'[unused bit 2]', u'[unused bit 3]', u'[unused bit 4]', u'[unused bit 5]'], [u'RESET', u'QUICK', u'KILL', u'[unused bit 2]', u'[unused bit 3]', u'[unused bit 4]', u'[unused bit 5]'], {u'Physical': [], u'Auxiliary': [{u'index': 18, u'name': u'AUX1.INPUT'}, {u'index': 19, u'name': u'AUX2.INPUT'}], u'Network': [{u'index': 72, u'name': u'INPUT.1'}]}]
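As an alternative to counting brackets by hand, the standard json module can find the end of each value itself. The following is only a sketch, not the method used above: it relies on json.JSONDecoder.raw_decode, which parses one JSON value starting at a given index and returns the value together with the index just past it. The function name extract_json_values and the shortened sample line are made up for illustration.

```python
import json


def extract_json_values(line):
    """Return every JSON array or object embedded in line, in order.

    Hypothetical helper: the name and the skip-on-error behaviour are
    assumptions for this sketch, not part of the original code.
    """
    decoder = json.JSONDecoder()
    values = []
    pos = 0
    while pos < len(line):
        # Find the next character that can open an array or an object.
        candidates = [i for i in (line.find('[', pos), line.find('{', pos))
                      if i != -1]
        if not candidates:
            break
        start = min(candidates)
        try:
            # raw_decode parses a single JSON value beginning at start and
            # returns (value, index just past the value).
            value, end = decoder.raw_decode(line, start)
        except ValueError:
            # The bracket did not start valid JSON; skip it and keep scanning.
            pos = start + 1
            continue
        values.append(value)
        pos = end
    return values


# Shortened, made-up sample in the same shape as the line in the question.
sample = ('ConfigJSON ["CFG","[unused bit 2]"] 2062195231AppTitle "Fsdn" '
          '0665393097ForceBits {"Network":[{"index":72,"name":"INPUT.1"}],'
          '"Physical":[]}')
parsed = extract_json_values(sample)
```

Because the JSON parser itself decides where each value ends, brackets that appear inside quoted strings (such as "[unused bit 2]") are not counted, even if they happen to be unbalanced.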

Python 2.7 Isolate multiple JSON objects in a string
