E-Mail file attachment using MIME

Published on: February 8, 2001
Last Updated: February 8, 2001

The classes presented in this article extend those described in “SMTP MFC Classes” to handle messages using the Multipurpose Internet Mail Extensions (MIME) protocol.

MIME is an e-mail message protocol (described in RFCs 1521 and 1522, which have since been superseded by RFCs 2045 through 2049) that allows messages to incorporate non-text content while still remaining compliant with RFC 822.

This non-text content could be embedded audio or video, an “external” message (meaning the content is not stored in the message itself, but at an external location), or one or more binary file attachments.

Not all of these content types are supported by the classes in this article, but you can easily extend them to do so.

How MIME Works

RFC 822 describes a generic message format, and specifies some standard header information. It says almost nothing about the body of the messages, except that it is a chunk of ASCII text.

MIME takes advantage of that by superimposing a format for the body text, and it also defines some header lines of its own.

The neat thing is that since the body is still just a chunk of text, it doesn’t break any rules established by RFC 822.

This means that any RFC 822-compliant transport system can handle MIME messages without modification.

The transport system expects a message to have a chunk of 7-bit text, and MIME messages happily oblige.

The transport system neither knows, nor has reason to care, that lines 50 through 700 of the text chunk actually represent an executable file.

For that matter, it doesn’t care that the chunk of text in a non-MIME message represents the English sentence “Attack at dawn”.

As far as a mail server transporting the message is concerned, it’s all just a bunch of 7-bit characters to send up the chain.

Of course, the receiving clients of the message do care because that chunk of text has to be converted back into a binary file.

They can do this only if they know about the format of the message, i.e., only if they are MIME-compliant mail readers.

MIME Headers

MIME defines five additional header lines that inform the receiving client about how the body should be interpreted.

Header                      Meaning
Content-Type                Specifies the type of data contained in the message.
Content-Transfer-Encoding   Specifies how the data is converted into 7-bit text.
MIME-Version                Indicates the MIME compliance level to which the message is encoded.
Content-ID                  Uniquely identifies the body. This is used for splitting the contents of large messages into smaller messages.
Content-Description         Ignored by MIME applications. Gives a human reader an indication of the content.

The Content-Type header tells decoders what kind of data is contained in the body, giving both a broad data type (such as “image”), and a specific type (such as “GIF”) separated by a forward slash (“/”).

Following the type information, the Content-Type header may include additional parameters in name=value pairs, each separated by a semi-colon (“;”). What these additional parameters are depends on the type of data.

Everyone, and their mothers, and their mothers’ dogs, are proposing new content types (along with their parameters) all the time, so you’ll have to go swimming in an ocean of standards documents to discover them all.

Example:

Content-Type: text/plain; charset="iso-8859-1"
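To make the type/subtype and name=value layout concrete, here is a minimal sketch of splitting a Content-Type value into its pieces. The ContentType struct and the ParseContentType(), Trim(), and Lower() helpers are hypothetical names invented for this illustration and are not part of the classes in this article. The sketch also assumes that quoted values contain no semi-colons or escaped quotes, which a full parser would have to handle.

```cpp
#include <cctype>
#include <map>
#include <string>

// Hypothetical holder for a parsed Content-Type value (illustration only).
struct ContentType {
    std::string type;                           // broad type, e.g. "text"
    std::string subtype;                        // specific type, e.g. "plain"
    std::map<std::string, std::string> params;  // keyed by lower-cased parameter name
};

// Remove leading and trailing whitespace.
static std::string Trim(const std::string& s) {
    const char* ws = " \t\r\n";
    const std::string::size_type b = s.find_first_not_of(ws);
    if (b == std::string::npos) return "";
    const std::string::size_type e = s.find_last_not_of(ws);
    return s.substr(b, e - b + 1);
}

// Type names and parameter names are case-insensitive, so normalize them.
static std::string Lower(std::string s) {
    for (char& c : s) c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    return s;
}

// Parse the text that follows "Content-Type:".
// Assumption: quoted values contain no ';' and no escaped quotes.
ContentType ParseContentType(const std::string& value) {
    const std::string::size_type npos = std::string::npos;
    ContentType ct;

    // Everything before the first ';' is "type/subtype".
    std::string::size_type pos = value.find(';');
    const std::string media = Trim(value.substr(0, pos));
    const std::string::size_type slash = media.find('/');
    ct.type = Lower(media.substr(0, slash));
    if (slash != npos) ct.subtype = Lower(media.substr(slash + 1));

    // Each remaining ';'-separated item is a name=value pair.
    while (pos != npos) {
        const std::string::size_type next = value.find(';', pos + 1);
        const std::string item =
            value.substr(pos + 1, next == npos ? npos : next - pos - 1);
        const std::string::size_type eq = item.find('=');
        if (eq != npos) {
            const std::string name = Lower(Trim(item.substr(0, eq)));
            std::string val = Trim(item.substr(eq + 1));
            if (val.size() >= 2 && val.front() == '"' && val.back() == '"')
                val = val.substr(1, val.size() - 2);  // strip surrounding quotes
            ct.params[name] = val;
        }
        pos = next;
    }
    return ct;
}
```

Passing the example value above to ParseContentType() yields the type "text", the subtype "plain", and a single parameter "charset" with the value "iso-8859-1".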

The more content types your mail reader can handle, the more capable it is. Outlook supports text/html, so it can display HTML messages to the user directly.

It supports image/jpeg, so it can show you pictures right there in the message.

Any content type that the mail reader can’t support itself should be saved to disk so that the user can open it with something that knows how to handle it.

In other words, there’s no such thing as a “file attachment”; there are only content types that are encoded and decoded. What you do after decoding is up to you.

Do you support application/zip? Then unzip the binary block. If not, save it to disk and let someone think of it as a “file attachment”.

What most people think of as a file attachment is usually a body of type application/octet-stream.

That signifies a chunk of 8-bit bytes… but that could be anything! Without more concrete information about the data, there’s really nothing the mail reader can do except save it to disk.

The Content-Transfer-Encoding header identifies the mechanism a decoder should use in order to convert the message body back into its original form.

The biggies are 7bit, 8bit, Binary, Quoted-Printable, and Base64. Regardless of what content types your mail reader supports, you need to be able to decode the message in the first place.

You don’t have to support image/gif, but you need to be able to turn the body into a GIF file that you can save to disk.

Your program can’t do this unless it handles all encoding mechanisms (you can’t control how someone else’s program encoded it, so you need to handle them all).

You can get away with supporting only text/plain because you can just dump other types of data to disk, but you can’t do anything if you can’t read the data to begin with.

Luckily, only Base64 encoding could be considered even slightly difficult to implement.
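For a sense of how much work Base64 actually is, here is a minimal sketch of an encoder and decoder. Base64Encode() and Base64Decode() are names made up for this illustration; they are not the routines used inside the classes in this article. The encoder maps every 3 bytes to 4 characters, pads the final group with "=", and breaks the output into 76-character lines. The decoder simply skips anything outside the Base64 alphabet (such as line breaks) and stops at the first "=".

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

static const char kB64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Encode bytes as Base64 text, inserting CRLF after every 76 characters.
std::string Base64Encode(const std::vector<unsigned char>& in) {
    std::string out;
    std::size_t col = 0;
    for (std::size_t i = 0; i < in.size(); i += 3) {
        const std::size_t n = in.size() - i < 3 ? in.size() - i : 3;
        // Pack up to three bytes into a 24-bit value.
        unsigned long v = static_cast<unsigned long>(in[i]) << 16;
        if (n > 1) v |= static_cast<unsigned long>(in[i + 1]) << 8;
        if (n > 2) v |= in[i + 2];
        // Emit four 6-bit groups; missing bytes become '=' padding.
        const char quad[4] = {
            kB64[(v >> 18) & 63],
            kB64[(v >> 12) & 63],
            n > 1 ? kB64[(v >> 6) & 63] : '=',
            n > 2 ? kB64[v & 63] : '='
        };
        if (col == 76) { out += "\r\n"; col = 0; }
        out.append(quad, 4);
        col += 4;
    }
    return out;
}

// Decode Base64 text. Characters outside the alphabet are ignored,
// and decoding stops at the first '=' padding character.
std::vector<unsigned char> Base64Decode(const std::string& in) {
    std::vector<unsigned char> out;
    unsigned long acc = 0;  // bit accumulator; only the low bits are used
    int bits = 0;           // number of not-yet-emitted bits in acc
    for (char c : in) {
        if (c == '=') break;
        if (c == '\0') continue;
        const char* p = std::strchr(kB64, c);
        if (p == nullptr) continue;  // CR, LF, or other noise
        acc = (acc << 6) | static_cast<unsigned long>(p - kB64);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out.push_back(static_cast<unsigned char>((acc >> bits) & 0xFF));
        }
    }
    return out;
}
```

With this sketch, the three bytes "Man" encode to "TWFu", and a lone "M" encodes to "TQ==".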

Multi-Part Messages

MIME completes the illusion of file attachments by allowing the message body to be divided into distinct parts, each with its own headers.

The content type multipart/mixed means that the content of the body is divided into blocks separated by “--” + a unique string guaranteed to not be found anywhere else in the message.

If you say that your boundary string is “MyBoundaryString”, then all occurrences of that string will be treated as a boundary. So it better not be in the message the user typed or it won’t be decoded correctly.
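One way to guard against a collision is to check a candidate boundary against every part before using it. The following is only a sketch under stated assumptions: ChooseBoundary() is a hypothetical helper, the numeric-suffix scheme is invented for this illustration, and the parts passed in are assumed to be the text exactly as it will appear in the message (that is, already encoded).

```cpp
#include <string>
#include <vector>

// Return a boundary string that does not occur in any of the given parts.
// The default base comes from the example above; the "_1", "_2", ... suffix
// scheme is an assumption made for this sketch.
std::string ChooseBoundary(const std::vector<std::string>& parts,
                           const std::string& base = "MyBoundaryString") {
    std::string candidate = base;
    unsigned long suffix = 0;
    for (;;) {
        bool clash = false;
        for (const std::string& part : parts) {
            if (part.find(candidate) != std::string::npos) {
                clash = true;
                break;
            }
        }
        if (!clash) return candidate;
        // Try the next candidate; the finite input guarantees one will be free.
        candidate = base + "_" + std::to_string(++suffix);
    }
}
```

Many mailers sidestep the search by generating a long random string instead, on the theory that a collision is then vanishingly unlikely.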

The boundary string is specified as a parameter to the Content-Type header:

Content-Type: multipart/mixed; boundary="MyBoundaryString"

Do not include the preceding “--” in the value. The parts should then be separated with this:

--MyBoundaryString

And the end of the entire message will be indicated with a trailing “--” as well:

--MyBoundaryString--

Text before the first boundary and after the end-of-message boundary is ignored by decoders, but since a non-MIME reader will simply display the whole thing as text, these areas can be used to tell the user to get a better mail reader.

From: [email protected]
To: [email protected]
Subject: In One Ear and Out Your Mother
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="MyBoundaryString"

This is a MIME message. If the next few lines look like gibberish,
then your mail reader sucks.
If you are using a MIME reader, then you aren't even seeing this.
--MyBoundaryString
Content-Type: text/plain
Content-Transfer-Encoding: 7bit

The charset= parameter is omitted and will default to US-ASCII.
In fact, I could have had a blank line as my header and it would have
defaulted to exactly what I specified on those header lines.
--MyBoundaryString
Content-Type: application/octet-stream; file="ATTACHMENT.EXE"
Content-Transfer-Encoding: base64

AxfhfujropadladnggnfjgwsaiubvnmkadiuhterqHJSFfuAjkfhrqpeorLAkFn
jNfhgt7Fjd9dfkliodf==
--MyBoundaryString--
This text is ignored.

Note that each part of the message conforms to RFC 822, including the “sub” headers.
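The layout of the example message can be produced mechanically. The following is a minimal sketch, assuming each part has already been converted to its transfer encoding. The MimePart struct and BuildMultipartBody() are hypothetical names for this illustration; the classes in this article build the body their own way.

```cpp
#include <string>
#include <vector>

// Hypothetical description of one body part (illustration only).
struct MimePart {
    std::string contentType;  // e.g. "text/plain"
    std::string encoding;     // e.g. "7bit" or "base64"
    std::string encodedBody;  // content, already in its transfer encoding
};

// Assemble a multipart/mixed body: preamble, one delimited block per part,
// then the closing delimiter with its trailing "--".
std::string BuildMultipartBody(const std::vector<MimePart>& parts,
                               const std::string& boundary) {
    const std::string crlf = "\r\n";
    // Preamble: ignored by MIME readers, displayed by everything else.
    std::string body = "This is a MIME message." + crlf;
    for (const MimePart& p : parts) {
        body += crlf + "--" + boundary + crlf;
        body += "Content-Type: " + p.contentType + crlf;
        body += "Content-Transfer-Encoding: " + p.encoding + crlf;
        body += crlf;  // a blank line ends the part's own headers
        body += p.encodedBody;
    }
    body += crlf + "--" + boundary + "--" + crlf;
    return body;
}
```

The boundary passed in must match the boundary= parameter on the message's top-level Content-Type header, without the leading “--”.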

The MIME Classes

At the application level, these classes are simple to use:

  1. Create a message object of type CMIMEMessage
  2. Populate it
  3. Call AddMIMEPart() passing the path to a file you want to attach (one call per file).
  4. Create a CSMTP object and connect to a server
  5. Call the CSMTP object’s SendMessage(), giving it the message object.
  6. Disconnect from the server

As is, you can send text messages with one or more reasonably sized file attachments.

But since MIME is constantly evolving, and you may need to support various character sets, you may need to extend these classes at some point. So here’s how it all works.

CMIMEMessage

This is where all the action takes place. It is derived from CMailMessage and works closely with its base class.

Conceptually, CMailMessage represents a message that conforms to RFC 822, and since MIME merely adds some headers and formats the body, CMIMEMessage should only do the same.

CMailMessage does what it needs to do to make the message RFC 822-compliant; CMIMEMessage only tacks on additional information to make the message MIME-compliant.

CMIMEMessages are themselves made up of distinct parts, each with its own content type, parameters and encoding.

Instead of a monolithic class that handles all content types, I decided that it should know nothing about content types at all.

In its constructor, it registers as many content “agents” as required. Each agent is identified with a code.

Message-parts are added individually to an internal list along with the code of the agent that knows how to handle them.

When CMIMEMessage needs to build its body, it asks the appropriate agent to actually incorporate the part into the body.

In this way, whenever I need to handle a new MIME content type, I create an agent by deriving from CMIMEContentAgent, and register the agent object in the constructor of a CMIMEMessage-derived class.

There are two content agents in this package: CAppOctetStream and CTextPlain.

These classes (both derived from CMIMEContentAgent) know how to handle the “application/octet-stream” and “text/plain” content types, respectively.

The message-parts and agents are kept in sync with a unique code. Agent objects are given their code when constructed. When message-parts are added to the message, this same code is used.

The codes are defined as a public enum, CMIMEMessage::eMIMETypeCode. Because the message class controls the codes, it’s very easy to keep things in sync.

Note the final item, “NEXT_FREE_MIME_CODE”. When you derive a message class (and agents) to handle other content types, begin your new codes with this one. When you register your new agents, give them one of your new codes.
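The agent arrangement described above can be pictured with a small standalone sketch. Every name here (SketchTypeCode, SketchAgent, SketchTextPlainAgent, SketchTypeManager, AppendPart(), Register(), Find()) is a hypothetical stand-in invented for illustration, and the enum values are assumptions. The package's real declarations for CMIMEContentAgent, CMIMETypeManager and eMIMETypeCode may well differ, so treat this as the shape of the idea rather than the actual interface.

```cpp
#include <map>
#include <memory>
#include <string>
#include <utility>

// Stand-in for the message class's public enum of type codes (assumed layout).
enum SketchTypeCode {
    SKETCH_TEXT_PLAIN = 0,
    SKETCH_APP_OCTET_STREAM,
    SKETCH_NEXT_FREE_CODE  // first value available to a derived message class
};

// Stand-in for a content agent: it knows its code and how to add a part.
class SketchAgent {
public:
    explicit SketchAgent(int code) : m_code(code) {}
    virtual ~SketchAgent() {}
    int Code() const { return m_code; }
    // Append this part's sub-headers and content to the message body.
    virtual void AppendPart(const std::string& content, std::string& body) const = 0;
private:
    int m_code;
};

// An agent for text/plain parts.
class SketchTextPlainAgent : public SketchAgent {
public:
    SketchTextPlainAgent() : SketchAgent(SKETCH_TEXT_PLAIN) {}
    void AppendPart(const std::string& content, std::string& body) const override {
        body += "Content-Type: text/plain\r\n"
                "Content-Transfer-Encoding: 7bit\r\n\r\n";
        body += content + "\r\n";
    }
};

// Stand-in for the manager: owns the agents and looks them up by code.
class SketchTypeManager {
public:
    void Register(std::unique_ptr<SketchAgent> agent) {
        const int code = agent->Code();  // read the code before moving the pointer
        m_agents[code] = std::move(agent);
    }
    const SketchAgent* Find(int code) const {
        auto it = m_agents.find(code);
        return it == m_agents.end() ? nullptr : it->second.get();
    }
private:
    std::map<int, std::unique_ptr<SketchAgent>> m_agents;  // deleted with the manager
};
```

In this sketch the message class would call Register() once per agent in its constructor, then call Find() with each part's code while building the body. Supporting a new content type means writing one more agent class and registering it; nothing else changes.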

Example:

Say you want to create a message class that handles the “application/zip” content type.

You’d first derive a CMIMEContentAgent class that does the useful work. Then you’d derive a message class from CMIMEMessage and create the appropriate code like this:

class CEnhancedMIMEMessage : public CMIMEMessage
{
public:
	enum eMIMETypeCode
	{
		APP_ZIP = CMIMEMessage::NEXT_FREE_MIME_CODE,
		NEXT_FREE_MIME_CODE  // Ready to use by 4th-generation class.
	};
};

In the constructor, you’d create the new agent using this code and register it. You would also use this new code when you add message-parts.

You can continue this indefinitely, with each sub-class adding the ability to handle more content types.

The internal class CMIMETypeManager manages the agents. Its job is to cough up an appropriate agent for a given content type and to delete the agent objects.

CMailMessage::FormatMessage() is what gets the ball rolling. (It is called by CSMTP::SendMessage(), but you can call it yourself whenever you want.)

There is some dancing back and forth between base class and derived class that’s not obvious, and you should be aware of it. CMailMessage::prepare_header() is virtual, so FormatMessage() will call the derived class’s version. 

The first thing the derived class’s version does is call the base class’s version so that it can do whatever it is supposed to do to the header.

In this way each class only worries about its own header lines. (Classes derived from CMIMEMessage don’t need to override prepare_header() because they are CMIMEMessages, and that class knows how to prepare its headers.)
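The back-and-forth between base and derived class is easier to see in code. This is a minimal sketch using hypothetical stand-in classes (SketchMailMessage, SketchMimeMessage, FormatHeader()); only the name prepare_header() and the calling pattern come from the description above, and the header lines each class writes are placeholders.

```cpp
#include <string>

// Stand-in for the RFC 822 base class.
class SketchMailMessage {
public:
    virtual ~SketchMailMessage() {}
    // Plays the role of FormatMessage(): because prepare_header() is virtual,
    // this call lands in the most-derived class's version.
    std::string FormatHeader() {
        m_header.clear();
        prepare_header();
        return m_header;
    }
protected:
    // The base class writes only the header lines it knows about.
    virtual void prepare_header() { m_header += "Subject: (sketch)\r\n"; }
    std::string m_header;
};

// Stand-in for the MIME-aware derived class.
class SketchMimeMessage : public SketchMailMessage {
protected:
    void prepare_header() override {
        SketchMailMessage::prepare_header();  // let the base class go first
        m_header += "MIME-Version: 1.0\r\n";
        m_header += "Content-Type: multipart/mixed; boundary=\"MyBoundaryString\"\r\n";
    }
};
```

Calling FormatHeader() through a base-class reference to a SketchMimeMessage produces the base class's line followed by the two MIME lines, which is the ordering described above.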

CMIMEMessage::AddMIMEPart() adds message-parts to the list of parts used to construct the body.

It takes the code of the agent, a suggested encoding type, the actual content, and a flag argument that indicates whether the content should be treated as a path or as actual content. The default parameters supplied are sufficient to attach a file.

msg.AddMIMEPart( "C:\\AUTOEXEC.BAT" ); // Send someone my autoexec file (note the escaped backslash)

It’s not only for files, but for any message-part (including what’s normally thought of as the “body”).

Indeed, if you assign the m_sBody member directly (which is perfectly acceptable), it will get converted into a message-part to be processed along with the others. Eventually, m_sBody will be a long string of text incorporating everything in the message.

Limitations And Warnings

The CMIMEMessage class doesn’t support message splitting. So if your mail server has a size limit, and doesn’t break apart messages itself, you’ll need to derive a class to handle this. This will involve deriving from CSMTP as well.

CMIMEMessage builds the message body in memory. If you attach a 100-meg file, it will consume 100 megs of memory, plus overhead while processing (and I use CStrings extensively).

These two limitations mean that there is an undefinable size limit for file attachments (undefinable because it depends on your server and the memory manager used by your OS).

One of the first enhancements I did was to override CSMTP::SendMessage() to poll a CMailMessage::FormatMessage()-like function that returned a status code to indicate its next course of action.

CREATE_NEW indicated that the CSMTP object should tell the server it’s done with this message and is going to send another one.

Also, the function accepted a buffer and the message was built, one line per call, into that buffer instead of into m_sHeader and m_sBody.

I can think of 3 other ways to handle this situation, and I’m sure you can too. Heck, you could even break encapsulation and just pass a pointer to the CSocket the SMTP object is using. Let the message object pump its own data to the mail server.
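To illustrate the one-line-per-call idea, here is a sketch of a source that Base64-encodes a stream a line at a time, so only 57 bytes of the attachment are in memory at once. It is not the enhancement described above: SketchStatus, SketchBase64LineSource and NextLine() are hypothetical names, and the two status codes are assumptions chosen for the example.

```cpp
#include <sstream>  // std::istream (std::istringstream is handy for trying this out)
#include <string>

// Hypothetical status codes returned to the polling sender.
enum SketchStatus { SKETCH_LINE_READY, SKETCH_DONE };

// Encodes a stream to Base64 one line per call.
class SketchBase64LineSource {
public:
    explicit SketchBase64LineSource(std::istream& in) : m_in(in) {}

    // Fill 'line' with the next encoded line (no line terminator).
    // Returns SKETCH_DONE, with an empty line, once the input is exhausted.
    SketchStatus NextLine(std::string& line) {
        static const char tbl[] =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
        char chunk[57];  // 57 input bytes encode to exactly 76 characters
        m_in.read(chunk, sizeof chunk);
        const std::streamsize got = m_in.gcount();
        line.clear();
        if (got <= 0) return SKETCH_DONE;
        for (std::streamsize i = 0; i < got; i += 3) {
            const std::streamsize n = got - i < 3 ? got - i : 3;
            // Pack up to three bytes into a 24-bit value.
            unsigned long v =
                static_cast<unsigned long>(static_cast<unsigned char>(chunk[i])) << 16;
            if (n > 1)
                v |= static_cast<unsigned long>(static_cast<unsigned char>(chunk[i + 1])) << 8;
            if (n > 2)
                v |= static_cast<unsigned char>(chunk[i + 2]);
            // Emit four 6-bit groups; missing bytes become '=' padding.
            line += tbl[(v >> 18) & 63];
            line += tbl[(v >> 12) & 63];
            line += n > 1 ? tbl[(v >> 6) & 63] : '=';
            line += n > 2 ? tbl[v & 63] : '=';
        }
        return SKETCH_LINE_READY;
    }

private:
    std::istream& m_in;
};
```

A sender would call NextLine() in a loop, write each line plus CRLF to the socket, and stop on SKETCH_DONE. Because 57 is a multiple of 3, padding can only appear on the final line, so the concatenated output is valid Base64.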

Using the Classes in Your Projects

I know some of you are using the classes from “SMTP MFC Classes” in your applications.

The implementations of those classes have changed, but the public interfaces are the same. So, you won’t need to redo your source code at all. You will just need to compile it with the new source code from this package instead.

Use folder names when you extract the package. You’ll find a folder called \SMTP that contains component files (.OGX) that can be inserted right into your project.

If you have suggestions, find any bugs, or improve these classes, feel free to contact me. In particular, if you develop new content type agents, I’d love to get my grubby hands on them.

Written by Bobby

Bobby Lawson is a seasoned technology writer with over a decade of experience in the industry. He has written extensively on topics such as cybersecurity, cloud computing, and data analytics. His articles have been featured in several prominent publications, and he is known for his ability to distill complex technical concepts into easily digestible content.