Question 1

This is the file: https://drive.google.com/file/d/0B5v-nJeoVouHc25wTGdqaDV1WW8/view?usp=sharing

As you can see, there are duplicates in the first column, but if I were to combine the duplicate rows, no data would get overridden in the other columns. Is there any way I can combine the rows with duplicate values in the first column?

For example, turn "1,A,A,," and "1,,,T,T" into "1,A,A,T,T".

Question 2

Plain Python:

import csvreader = csv.Reader(open('combined.csv'))
result = {}for row in reader:idx = row[0]values = row[1:]if idx in result:result[idx] = [result[idx][i] or v for i, v in enumerate(values)]else:result[idx] = values

How this magic works:

iterate over rows in the CSV file
for every record, we check if there was a record with the same index before
if this is the first time we see this index, just copy the row values
if this is a duplicate, assign row values to empty cells.

The last step is done via or trick: None or value will return value. value or anything will return value. So, result[idx][i] or v will return existing value if it is not empty, or row value.

To output this without loosing the duplicated rows, we need to keep index, then iterate and output corresponding result entries:

indices = []
for row in reader:# ...indices.append(idx)writer = csv.writer(open('outfile.csv', 'w'))
for idx in indices:writer.writerow([idx] + result[idx])

How can I merge CSV rows that have the same value in the first cell?

Related Q&A

i usually get this error : ValueError: invalid literal for int() with base 10

How to Draw a triangle shape in python? [closed]

DataFrame from list of string dicts with array() values

Need Help Making Buttons to perform for loops when you input a number

Combining multiple conditional expressions in a list comprehension

Arduino Live Serial Plotting with a MatplotlibAnimation gets slow

Hide lines on tree view - openerp 7

Python Append dataframe generated in nested loops

bError could not find or load main class caused by java.lang.classnotfoundation error

Invalid value after matching string using regex [duplicate]