Python Regex - checking for a capital letter with a lowercase after

2024/11/14 15:35:52

I am trying to check for a capital letter that has a lowercase letter coming directly after it. The trick is that there will be a bunch of garbage capital letters and numbers coming directly before it. For example:

AASKH317298DIUANFProgramming is fun

As you can see, there is a bunch of stuff we don't need directly before the phrase we do need, Programming is fun.

I am trying to do this with a regex by replacing the unwanted prefix of each string with '', since the original string does not have to be kept.

re.sub(r'^[A-Z0-9]*', '', string)

The problem with this code is that it leaves us with rogramming is fun, as the P is a capital letter.
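
To make the problem concrete, here is a small reproduction using the example string from above. It only shows what the original pattern matches; nothing here is a fix yet.

```python
import re

string = 'AASKH317298DIUANFProgramming is fun'

# [A-Z0-9]* is greedy: it consumes every leading capital letter or digit,
# including the 'P' that belongs to 'Programming'.
match = re.match(r'^[A-Z0-9]*', string)
print(match.group())                       # AASKH317298DIUANFP

# Removing that match therefore also removes the 'P'.
print(re.sub(r'^[A-Z0-9]*', '', string))   # rogramming is fun
```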

How would I go about checking whether the next letter is lowercase, so that the capital before it (the P in Programming) is left untouched?

Answer

Use a negative look-ahead:

re.sub(r'^[A-Z0-9]*(?![a-z])', '', string)

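This matches the leading run of uppercase characters and digits, provided the run is not directly followed by a lowercase character. [A-Z0-9]* first consumes the P as well, the look-ahead then fails because an r follows, and the engine backtracks one character, which leaves the P out of the match.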

Demo:

>>> import re
>>> string = 'AASKH317298DIUANFProgramming is fun'
>>> re.sub(r'^[A-Z0-9]*(?![a-z])', '', string)
'Programming is fun'
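
A few edge cases are worth checking before relying on this. The sketch below wraps the pattern in a helper (the name strip_prefix is hypothetical, not from the question) and also tries a positive look-ahead variant. That variant is an alternative of my own rather than part of the answer above: it strips the prefix only when a capital followed by a lowercase letter is actually found.

```python
import re

# Pattern from the answer: a leading run of capitals/digits that must not
# be directly followed by a lowercase letter.
NEGATIVE = re.compile(r'^[A-Z0-9]*(?![a-z])')

# Alternative (an assumption, not from the answer): the run must be
# followed by a capital letter and then a lowercase letter.
POSITIVE = re.compile(r'^[A-Z0-9]*(?=[A-Z][a-z])')

def strip_prefix(text, pattern=NEGATIVE):
    """Remove the leading run matched by `pattern` (hypothetical helper)."""
    return pattern.sub('', text)

print(strip_prefix('AASKH317298DIUANFProgramming is fun'))  # Programming is fun
print(strip_prefix('Programming is fun'))                   # Programming is fun
print(strip_prefix('ABC123hello'))                          # 3hello
print(strip_prefix('ABC123hello', POSITIVE))                # ABC123hello
```

Note the third case: with the negative look-ahead, a digit sitting directly before lowercase text is kept, because the engine backtracks one character whether it is a letter or a digit. If that matters for your data, the positive look-ahead variant leaves such strings unchanged instead.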
