I read a CSV file and use the usaddress library to parse an address field. How do I write the resulting OrderedDicts to another CSV file?

    import usaddress
    import csv

    with open('output.csv') as csvfile:
        reader = csv.DictReader(csvfile)
        for row in reader:
            addr = row['Case Parties Address']
            data = usaddress.tag(addr)
            print(data)

(OrderedDict([('AddressNumber', u'4167'), ('StreetNamePreType', u'Highway'), ('StreetName', u'319'), ('StreetNamePostDirectional', u'E'), ('PlaceName', u'Conway'), ('StateName', u'SC'), ('ZipCode', u'29526-5446')]), 'Street Address')
See this GitHub issue for a solution:

    import csvkit
    import usaddress

    # expected format in input.csv: first column 'id', second column 'address'
    with open('input.csv', 'rU') as f:
        reader = csvkit.DictReader(f)
        all_rows = []
        for row in reader:
            try:
                parsed_addr = usaddress.tag(row['address'])
                row_dict = parsed_addr[0]
            except:
                row_dict = {'error':'True'}
            row_dict['id'] = row['id']
            all_rows.append(row_dict)

    field_list = ['id', 'AddressNumber', 'AddressNumberPrefix', 'AddressNumberSuffix',
                  'BuildingName', 'CornerOf', 'IntersectionSeparator', 'LandmarkName',
                  'NotAddress', 'OccupancyType', 'OccupancyIdentifier', 'PlaceName',
                  'Recipient', 'StateName', 'StreetName', 'StreetNamePreDirectional',
                  'StreetNamePreModifier', 'StreetNamePreType', 'StreetNamePostDirectional',
                  'StreetNamePostModifier', 'StreetNamePostType', 'SubaddressIdentifier',
                  'SubaddressType', 'USPSBoxGroupID', 'USPSBoxGroupType', 'USPSBoxID',
                  'USPSBoxType', 'ZipCode', 'error']

    with open('output.csv', 'wb') as outfile:
        writer = csvkit.DictWriter(outfile, field_list)
        writer.writeheader()
        writer.writerows(all_rows)

some notes:
- Because each tagged address can have a different set of keys, define the output columns using all possible keys. This isn't a problem, because all the possible usaddress labels are known in advance.
- The usaddress tag method raises an error (usaddress.RepeatedLabelError) if it is unable to concatenate address tokens in an intuitive way. These errors should be captured in the output, which the code above does by writing an 'error' column for rows that fail.
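
The code above is written for Python 2 (the 'rU' and 'wb' file modes). On Python 3, the same two ideas (a fixed column list, plus an 'error' column for addresses that fail to tag) can be sketched with only the standard-library csv module. This is a rough sketch, not the approach from the GitHub issue itself: the names FIELD_LIST, tag_rows and write_rows are made up for illustration, the column list is copied from the code above, and the tagger is passed in as an argument so that usaddress.tag can be swapped for a stub when testing.

```python
import csv

# Column list copied from the answer above: 'id', the usaddress labels, 'error'.
FIELD_LIST = [
    'id', 'AddressNumber', 'AddressNumberPrefix', 'AddressNumberSuffix',
    'BuildingName', 'CornerOf', 'IntersectionSeparator', 'LandmarkName',
    'NotAddress', 'OccupancyType', 'OccupancyIdentifier', 'PlaceName',
    'Recipient', 'StateName', 'StreetName', 'StreetNamePreDirectional',
    'StreetNamePreModifier', 'StreetNamePreType', 'StreetNamePostDirectional',
    'StreetNamePostModifier', 'StreetNamePostType', 'SubaddressIdentifier',
    'SubaddressType', 'USPSBoxGroupID', 'USPSBoxGroupType', 'USPSBoxID',
    'USPSBoxType', 'ZipCode', 'error',
]


def tag_rows(rows, tag):
    """Yield one flat dict per input row (hypothetical helper).

    `rows` is an iterable of dicts with 'id' and 'address' keys;
    `tag` is a callable such as usaddress.tag that returns
    (OrderedDict of label -> value, address type).
    """
    for row in rows:
        try:
            tagged, _address_type = tag(row['address'])
            out = dict(tagged)
        except Exception:
            # Broad on purpose so the sketch does not depend on the import;
            # with usaddress you would likely narrow this to
            # usaddress.RepeatedLabelError.
            out = {'error': 'True'}
        out['id'] = row['id']
        yield out


def write_rows(rows, outfile, fieldnames=FIELD_LIST):
    """Write the dicts to an open text file; missing keys become empty cells."""
    writer = csv.DictWriter(outfile, fieldnames=fieldnames, restval='')
    writer.writeheader()
    writer.writerows(rows)


# Possible usage (assumes usaddress is installed and input.csv has
# 'id' and 'address' columns):
#
#   import usaddress
#   with open('input.csv', newline='') as f, \
#           open('output.csv', 'w', newline='') as out:
#       write_rows(tag_rows(csv.DictReader(f), usaddress.tag), out)
```

Opening both files with newline='' is what the csv module documentation recommends on Python 3, and it replaces the 'rU' and 'wb' modes used in the Python 2 version.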