Question 1

I have the following data set:

column1HL111
PG3939HL11
HL339PG
RC--HL--PG

I am attempting to write a function that does the following:

Loop through each row of column1
Pull only the alphabet and put into an array
If the array has "HL" in it, remove it from the array UNLESS HL is the only word in the array.
Take the first word in the array and output results.

So for the above example, my array (step2) would look like this:

[HL]
[PG,HL]
[HL,PG]
[RC,HL,PG]

and my desired final output (step4) would look like this:

desired_columnHL
PG
PG
RC

I have the code for step 2, and it seems to work fine

df['array_column'] = (df.column1.str.extractall('([A-Z]+)').unstack().values.tolist())

But I don't know how to get from here to my final output (step4).

Question 2

You may achieve what you need by replacing all non-letters first, then extracting pairs of letters and then applying some custom logic to extract the necessary value from the array:

>>> df['array_column'].str.replace('[^A-Z]+', '').str.findall('([A-Z]{2})').apply(lambda d: [''] if len(d) == 0 else d).apply(lambda x: 'HL' if len(x) == 1 and x[0] == 'HL' else [m for m in x if m != 'HL'][0])
0    HL
1    PG
2    PG
3    RC
Name: array_column, dtype: object
>>>

Details

.replace('[^A-Z]+', '') - remove all chars other the uppercase letters
.str.findall('([A-Z]{2})') - extract pairs of letters
.apply(lambda d: [''] if len(d) == 0 else d) will add an empty item if there is no regex match in the previous step
.apply(lambda x: 'HL' if len(x) == 1 and x[0] == 'HL' else [m for m in x if m != 'HL'][0]) - custom logic: if the list length is 1 and it is equal to HL, keep it, else remove all HL and get the first element

Loop and arrays of strings in python

Related Q&A

2 Dendrograms + Heatmap from condensed correlationmatrix with scipy

Iterator example from Dive Into Python 3

Getting a 500 Internal Server Error using render_template and Flask [duplicate]

Bokeh use of Column Data Source and Box_Select

How Does a Pyqtgraph Export for Three Subplots Look Like?

Use class variables as instance vars?

Get amount from django-paypal

Python get file regardless of upper or lower

how to save h5py arrays with different sizes?

Cannot allocate memory on Popen commands