Python email.message_from_string() parse problems

2024/10/11 7:33:22

My setup uses fetchmail to pull emails from Gmail; procmail processes them and passes each one to a Python script.

When I use email.message_from_string(), the resulting object is not parsed as an email object. get_payload() returns the header/body/payload text of the email as a single text blob.

This is the text it returns:

From [email protected]  Sat Aug 17 19:20:44 2013
>From example  Sat Aug 17 19:20:44 2013
MIME-Version: 1.0
Received: from ie-in-f109.1e100.net [74.125.142.109] by VirtualBox with IMAP (fetchmail-6.3.21) for <example@localhost> (single-drop); Sat, 17 Aug 2013 19:20:44 -0700 (PDT)
Received: by 10.70.131.110 with HTTP; Sat, 17 Aug 2013 19:20:42 -0700 (PDT)
Date: Sat, 17 Aug 2013 19:20:42 -0700
Delivered-To: [email protected]
Message-ID: <CAAsp4m0GBeVg80-ryFgNvNNAj_QPguzbX3DqvMSx-xSGZM18Pw@mail.gmail.com>
Subject: test 19:20
From: example <[email protected]>
To: example <[email protected]>
Content-Type: multipart/alternative; boundary=001a1133435474449004e42f7861

--001a1133435474449004e42f7861
Content-Type: text/plain; charset=ISO-8859-1

19:20

--001a1133435474449004e42f7861
Content-Type: text/html; charset=ISO-8859-1

<div dir="ltr">19:20</div>

--001a1133435474449004e42f7861--

My code:

import sys
import email

full_msg = sys.stdin.read()
msg = email.message_from_string(full_msg)
msg['to']          # returns None
msg.get_payload()  # returns the text above

What am I missing to get Python to properly interpret the email?

I see from these questions that I may not be getting the proper email headers somewhere along the line, but I cannot confirm. That ">" on line 2 is not a typo: it's in the text.

Answer

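Whether or not the ">" is "in the text", it is the problem. The line ">From example ..." is not a valid header line (it has no "Name:" prefix), so the parser stops reading headers there and treats everything after it as the body. After removing this character: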

>python test.py <input.txt
example <[email protected]>
[<email.message.Message instance at 0x02810288>,<email.message.Message instance at 0x02810058>]

So the error is not in parsing the message, but in the ">" character somehow corrupting your email text.
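If the input cannot be fixed upstream, one possible workaround is to drop the stray line before parsing. The following is a minimal sketch, not a definitive fix: the helper name parse_procmail_message and the SAMPLE message (including its addresses and boundary) are made up for illustration, and it assumes the ">From " line is mbox-style quoting that ended up in the header block.

```python
import email

# Hypothetical sample modelled on the message in the question.
SAMPLE = (
    "From sender@example.com  Sat Aug 17 19:20:44 2013\n"
    ">From example  Sat Aug 17 19:20:44 2013\n"
    "MIME-Version: 1.0\n"
    "Subject: test 19:20\n"
    "To: example <recipient@example.com>\n"
    "Content-Type: multipart/alternative; boundary=BOUNDARY\n"
    "\n"
    "--BOUNDARY\n"
    "Content-Type: text/plain; charset=ISO-8859-1\n"
    "\n"
    "19:20\n"
    "\n"
    "--BOUNDARY\n"
    "Content-Type: text/html; charset=ISO-8859-1\n"
    "\n"
    "<div dir=\"ltr\">19:20</div>\n"
    "\n"
    "--BOUNDARY--\n"
)


def parse_procmail_message(raw_text):
    """Parse raw_text after dropping '>From ' lines from the header block.

    Hypothetical helper: only lines before the first blank line are
    filtered, so quoted '>From ' lines in the body are left untouched.
    """
    cleaned = []
    in_headers = True
    for line in raw_text.splitlines(True):  # keep line endings
        if in_headers:
            if line.strip() == "":
                in_headers = False          # blank line ends the headers
            elif line.startswith(">From "):
                continue                    # skip the stray quoted line
        cleaned.append(line)
    return email.message_from_string("".join(cleaned))


broken = email.message_from_string(SAMPLE)
print(broken["To"])          # None: headers were cut off at ">From "

fixed = parse_procmail_message(SAMPLE)
print(fixed["To"])           # example <recipient@example.com>
print(fixed.is_multipart())  # True
```

In the original script this would replace the direct email.message_from_string(full_msg) call. Finding out why the ">From " line is written into the headers in the first place is still the better fix.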

https://en.xdnf.cn/q/118351.html
