Parse Email

I’m working on an IMAP class and would like to display incoming emails. What is the best way to parse and display them?

I should add that I know how to download the emails via IMAP; it’s the parsing and displaying that leaves me confused and not knowing where to start.

Try to get a hold of Joe Strout’s Zymail classes.

Emails have a nice tree structure which is, in theory, easy to parse. In practice every sending mail client does it differently, and some mail clients lie about the data they send. So emails are really fun to parse.

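To get started with email parsing, look at the boundary, which is supposed to be declared in the Content-Type header of a multipart message but may or may not actually be there. Also look at the Content-Transfer-Encoding header, which describes how each email part needs to be decoded.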
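For anyone who wants to see the boundary and the transfer encoding in action, here is a minimal sketch in Python using its standard `email` package (not the Xojo classes discussed in this thread). The sample message `RAW`, the addresses in it, and the helper name `describe_parts` are made up for illustration.

```python
from email import message_from_bytes
from email.policy import default

# Hand-written sample: one multipart container with two leaf parts.
RAW = (
    b"From: sender@example.com\r\n"
    b"To: rcpt@example.com\r\n"
    b"Subject: Demo\r\n"
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/mixed; boundary="XYZ"\r\n'
    b"\r\n"
    b"--XYZ\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"Caf=C3=A9 menu\r\n"
    b"--XYZ\r\n"
    b"Content-Type: application/octet-stream\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b'Content-Disposition: attachment; filename="a.bin"\r\n'
    b"\r\n"
    b"AAEC\r\n"
    b"--XYZ--\r\n"
)


def describe_parts(raw):
    """Walk the MIME tree and return (content_type, detail, content) rows.

    For a multipart container, detail is the boundary and content is None.
    For a leaf part, detail is the declared transfer encoding and content
    is the decoded body (str for text parts, bytes for binary parts).
    """
    msg = message_from_bytes(raw, policy=default)
    rows = []
    for part in msg.walk():
        if part.is_multipart():
            rows.append((part.get_content_type(), part.get_boundary(), None))
            continue
        # Fall back to "7bit" when the header is absent.
        cte = str(part.get("Content-Transfer-Encoding", "7bit")).lower()
        # get_content() undoes base64 / quoted-printable for us.
        rows.append((part.get_content_type(), cte, part.get_content()))
    return rows


if __name__ == "__main__":
    for row in describe_parts(RAW):
        print(row)
```

The walk visits the container first and then each leaf, so the tree structure falls out of the iteration order. Real-world mail will be messier than this sample.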

Thank you for your responses. I checked out the Zymail project. It lists the relevant RFCs:

[quote]RFC 2822 Internet Message Format

RFC 2045 MIME Part One: Format of Internet Message Bodies Updated by RFC 2231
RFC 2046 MIME Part Two: Media Types
RFC 2047 MIME Part Three: Message Header Extensions for Non-ASCII Text Updated by RFC 2231
RFC 2183 The Content-Disposition Header Field Updated by RFC 2231

RFC 2231 MIME Parameter Value and Encoded Word Extensions: Character Sets,…
RFC 2387 The MIME Multipart/Related Content-type
RFC 3462 The MIME Multipart/Report Content-type
RFC 2111 Content-ID and Message-ID Uniform Resource Locators
RFC 2632 S/MIME Version 3 Certificate Handling
RFC 2633 S/MIME Version 3 Message Specification[/quote]
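As a small illustration of what one of these covers: RFC 2047 is the reason a subject line can arrive looking like `=?utf-8?B?...?=`. The sketch below decodes such headers with Python's standard `email.header` module; the helper name `decode_mime_header` and the sample strings are assumptions for the example, not anything taken from Zymail.

```python
from email.header import decode_header, make_header


def decode_mime_header(value):
    """Turn an RFC 2047 encoded header value into readable text."""
    # decode_header() splits the value into (data, charset) chunks;
    # make_header() joins them back into one Header object.
    return str(make_header(decode_header(value)))


if __name__ == "__main__":
    print(decode_mime_header("=?utf-8?B?Q2Fmw6k=?="))       # base64 form
    print(decode_mime_header("=?iso-8859-1?q?p=F6stal?="))  # Q form
    print(decode_mime_header("Plain subject"))              # untouched
```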

Am I correct that our built-in tools EmailMessage, EmailHeaders, and EmailAttachment aren’t designed to parse according to these RFCs?

Nope. Every time I think I’ve seen everything I get another interesting email. The book I learned email with is still good: http://www.amazon.com/Internet-Email-Protocols-Developers-Guide/dp/0201432889

RFCs are “guidelines”, or at least that seems to be the way a lot of developers have treated them.

SMTP and POP are a great example: if you follow the RFCs strictly, you find a pile of mail servers that are not “compliant”, but if you don’t support them, your code simply won’t work with a great many servers. Getting ISPs etc. to update to something RFC-compliant is a lost cause.

Email is much the same: the RFCs will tell you what’s supposed to be in there, but it’s not always the only thing you’ll find :)
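One way to cope with that is a parser that records problems instead of giving up. As a rough sketch, Python's standard `email` parser behaves this way: by default it does not raise on malformed input but collects "defects" on the message object. The broken sample `BROKEN` and the helper name `parse_leniently` are invented for this example.

```python
from email import message_from_bytes
from email.policy import default

# Claims to be multipart but declares no boundary and contains no parts.
BROKEN = (
    b"From: sender@example.com\r\n"
    b"Subject: Broken\r\n"
    b"Content-Type: multipart/mixed\r\n"
    b"\r\n"
    b"Just some text, no parts at all.\r\n"
)


def parse_leniently(raw):
    """Parse without raising; return (defect class names, fallback body)."""
    msg = message_from_bytes(raw, policy=default)
    defect_names = [type(d).__name__ for d in msg.defects]
    # If the multipart promise was broken, the body is kept as plain text.
    body = None if msg.is_multipart() else msg.get_payload()
    return defect_names, body


if __name__ == "__main__":
    names, body = parse_leniently(BROKEN)
    print(names)
    print(body)
```

Showing the fallback text and logging the defects is usually friendlier to the user than refusing to display the message.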