Question 1

I have a CSV file that looks like this:

HA-MASTER,CategoryID
38231-S04-A00,14
39790-S10-A03,14
38231-S04-A00,15
39790-S10-A03,15
38231-S04-A00,16
39790-S10-A03,16
38231-S04-A00,17
39790-S10-A03,17
38231-S04-A00,18
39790-S10-A03,18
38231-S04-A00,19
39795-ST7-000,75
57019-SN7-000,75
38251-SV4-911,75
57119-SN7-003,75
57017-SV4-A02,75
39795-ST7-000,76
57019-SN7-000,76
38251-SV4-911,76
57119-SN7-003,76
57017-SV4-A02,76

What I would like to do is reformat this data so that there is only one line for each CategoryID, for example:

14,38231-S04-A00,39790-S10-A03
76,39795-ST7-000,57019-SN7-000,38251-SV4-911,57119-SN7-003,57017-SV4-A02

I have not found a way to do this programmatically in Excel, and I have over 100,000 lines. Is there a way to do something like this using Python's csv reader and writer?

Answer

Yes there is a way:

import csv

def addRowToDict(row):
    # group part numbers under their CategoryID (second column)
    key = row[1]
    if key in myDict:
        # append values if entry already exists
        myDict[key].append(row[0])
    else:
        # create entry: CategoryID first, then the first part number
        myDict[key] = [row[1], row[0]]

myDict = dict()
inFile = 'C:/Users/xxx/Desktop/pythons/test.csv'
outFile = 'C:/Users/xxx/Desktop/pythons/testOut.csv'

with open(inFile, 'r') as f:
    reader = csv.reader(f)
    ignore = True
    for row in reader:
        if ignore:
            # ignore first row (the header)
            ignore = False
        else:
            # add entry to dict
            addRowToDict(row)

# newline='' (Python 3) avoids blank lines between rows on Windows
with open(outFile, 'w', newline='') as f:
    writer = csv.writer(f)
    # write everything to file; values() replaces the Python 2-only itervalues()
    writer.writerows(myDict.values())

Just edit inFile and outFile to point to your own files.
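The same grouping can also be sketched without a global dictionary, using collections.defaultdict. This is only an illustrative variant, assuming Python 3.7 or later (so that dictionaries keep insertion order); the function names group_by_category and regroup_csv and the in-memory sample are made up for the example and are not part of the answer above.

```python
import csv
import io
from collections import defaultdict


def group_by_category(rows):
    """Map each CategoryID (column 2) to the part numbers (column 1) seen with it."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[1]].append(row[0])
    return groups


def regroup_csv(in_stream, out_stream):
    """Read 'HA-MASTER,CategoryID' rows and write one line per CategoryID."""
    reader = csv.reader(in_stream)
    next(reader, None)  # skip the header row
    # blank lines come back as empty lists, so filter them out
    groups = group_by_category(row for row in reader if row)
    writer = csv.writer(out_stream, lineterminator="\n")
    for category, parts in groups.items():
        # CategoryID first, then every part number in input order
        writer.writerow([category] + parts)


# Small in-memory sample taken from the first rows of the question's data
sample = io.StringIO(
    "HA-MASTER,CategoryID\n"
    "38231-S04-A00,14\n"
    "39790-S10-A03,14\n"
    "38231-S04-A00,15\n"
)
result = io.StringIO()
regroup_csv(sample, result)
print(result.getvalue())
```

For real files, the two streams would presumably be opened with open(inFile, newline='') and open(outFile, 'w', newline=''), as the csv module documentation recommends.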

Python CSV writer
