Question 1

I have a string which has both Arabic and English sentences. What I want is to extract Arabic Sentences only.

my_string="""
What is the reason
ذَلِكَ الْكِتَابُ لَا رَيْبَ فِيهِ هُدًى لِلْمُتَّقِينَ
behind this?
ذَلِكَ الْكِتَابُ لَا رَيْبَ فِيهِ هُدًى لِلْمُتَّقِينَ
"""

This Link shows that the Unicode range for Arabic letters is 0600-06FF.

So, very basic attempt came to my mind is:

import re
print re.findall(r'[\u0600-\u06FF]+',my_string)

But, this fails miserably as it returns the following list.

['What', 'is', 'the', 'reason', 'behind', 'this?']

As you can see, this is exactly opposite of what I want. What I am missing here?

N.B.

I know I can match the Arabic letters by using inverse matching like below:

print re.findall(r'[^a-zA-Z\s0-9]+',my_string)

But, I don't want that.

Question 2

You can use re.sub to replace ascii characters with empty string.

>>> my_string="""
... What is the reason
... ذَلِكَ الْكِتَابُ لَا رَيْبَ فِيهِ هُدًى لِلْمُتَّقِينَ
... behind this?
... ذَلِكَ الْكِتَابُ لَا رَيْبَ فِيهِ هُدًى لِلْمُتَّقِينَ
... """
>>> print(re.sub(r'[a-zA-Z?]', '', my_string).strip())
ذَلِكَ الْكِتَابُ لَا رَيْبَ فِيهِ هُدًى لِلْمُتَّقِينَذَلِكَ الْكِتَابُ لَا رَيْبَ فِيهِ هُدًى لِلْمُتَّقِينَ

Your regex didn't work because you are using Python 2 and your string is str you need to convert my_string to unicode for it to work. However it did perfectly work on Python3.x

>>> print "".join(re.findall(ur'[\u0600-\u06FF]', unicode(my_string, "utf-8"), re.UNICODE))
ذَلِكَالْكِتَابُلَارَيْبَفِيهِهُدًىلِلْمُتَّقِينَذَلِكَالْكِتَابُلَارَيْبَفِيهِهُدًىلِلْمُتَّقِينَ

How to retrieve only arabic texts from a string using regular expression?

Related Q&A

Formatted output in OpenOffice/Microsoft Word with Python

Issue in calling Python code from Java (without using jython)

AttributeError: tuple object has no attribute dim, when feeding input to Pytorch LSTM network

Python - Idiom to check if string is empty, print default

Does WordNet have levels? (NLP)

Merge two DataFrames based on columns and values of a specific column with Pandas in Python 3.x

Use range as a key value in a dictionary, most efficient way?

How to replace all instances of a sub-sequence in a list in Python?

How to initialise a 2D array in Python?

numpy: broadcast multiplication over one common axis of two 2d arrays