Sometimes we use or refer to a software term or technology many times without being very familiar with it. MIME is one of those terms for me. We use the MIME standards to exchange messages between various endpoints, for example in email communication and web services.
MIME is everywhere, and most of us have used it countless times during our software careers. But what exactly is MIME? I posed this question to some of my software developer friends and got ambiguous answers. Some took MIME to mean the MIME type, and some tried to quote what the acronym stands for. It was clear that MIME is not a very well understood concept. In this post we will try to shed some light on what MIME is.
History
As per RFC 822, the original mail format was built to support only the standard US-ASCII charset. This left a lot to be desired.
- What if the sender wants to send a message in a language that needs a different charset, such as Hindi or Spanish?
- What if the sender wants to send a multipart message?
- What if the sender wants to add a non-text attachment?
- What if the sender wants to write message headers in some other charset?
To address these concerns, the Internet Engineering Task Force (IETF) came up with a new format for mail messages. It was an extension to the famous RFC 822. Messages in this new format are referred to as MIME messages.
What is MIME?
MIME stands for Multipurpose Internet Mail Extensions. MIME is an Internet standard that extends email messages to support non-ASCII text content, non-text attachments, multipart message bodies and non-US-ASCII headers. MIME was so successful that it was adopted as a message format for the web and many other technologies. The MIME format is defined by the following RFC documents.
- RFC 2045: Describes various headers used to describe the structure of MIME messages.
- RFC 2046: Defines an initial set of media types.
- RFC 2047: Describes extensions to RFC 822 to allow non-US-ASCII text data in Internet mail header fields.
- RFC 2048: Specifies various IANA registration procedures for MIME-related facilities.
- RFC 2049: Provides MIME conformance criteria as well as some examples of MIME message formats, acknowledgements, and the bibliography.
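To make the RFC 2047 entry concrete, here is a minimal sketch of how a non-US-ASCII header value can be carried in an ASCII-only header. It assumes Python's standard `email.header` module; the subject text is a made-up example and is not taken from any of the RFCs.

```python
from email.header import Header, decode_header, make_header

# Hypothetical subject line containing non-ASCII (Hindi) text.
subject = "नमस्ते from MIME"

# Header(...).encode() produces an RFC 2047 "encoded-word", an ASCII-only
# string of the form =?charset?encoding?encoded-text?=
encoded = Header(subject, charset="utf-8").encode()

# decode_header() splits the value into (bytes, charset) pairs and
# make_header() reassembles them into a readable string.
decoded = str(make_header(decode_header(encoded)))

print(encoded)
print(decoded)
```

The encoded form is what travels in the message header; a MIME-aware client reverses it before showing the subject to the user.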
Structure of a MIME message
From: Nathaniel Borenstein <nsb@nsb.fv.com>
To: Ned Freed <ned@innosoft.com>
Date: Fri, 07 Oct 1994 16:15:05 -0700 (PDT)
Subject: A multipart example
Content-Type: multipart/mixed;
    boundary=unique-boundary-1

... Some text appears here ...

Content-type: text/plain; charset=US-ASCII

Content-Type: multipart/parallel; boundary=unique-boundary-2

Content-Type: audio/basic
Content-Transfer-Encoding: base64

... base64-encoded 8000 Hz single-channel
mu-law-format audio data goes here ...

Content-Type: image/jpeg
Content-Transfer-Encoding: base64

... base64-encoded image data goes here ...

Content-type: text/enriched

Content-Type: message/rfc822

From: (mailbox in US-ASCII)
To: (address in US-ASCII)
Subject: (subject in US-ASCII)
Content-Type: Text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: Quoted-printable

... Additional text in ISO-8859-1 goes here ...
Above is an example of a MIME message. On close inspection you will find that it has the following parts.
- Headers
- Multiple body parts of different content types
Multipart
A MIME multipart message contains one or more body parts, each of which can have its own content type. A body part can itself be a multipart entity that contains further body parts. The parts are separated by the boundary string given in the boundary parameter of the parent's Content-Type header.
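The nesting described here can be explored with a short sketch. It assumes Python's standard `email` package; the addresses, the file name `data.bin` and the helper `describe_parts` are hypothetical names introduced only for illustration.

```python
from email.message import EmailMessage

# Hypothetical addresses and content, used only for illustration.
msg = EmailMessage()
msg["From"] = "sender@example.com"
msg["To"] = "receiver@example.com"
msg["Subject"] = "A multipart example"

# First body part: plain text.
msg.set_content("Some text appears here.")

# Second body part: binary data. Adding an attachment turns the
# message into multipart/mixed with the text as its first part.
msg.add_attachment(b"\x00\x01\x02\x03", maintype="application",
                   subtype="octet-stream", filename="data.bin")

def describe_parts(message):
    """Return (nesting depth, content type) for every part, parents first."""
    found = []

    def visit(part, depth):
        found.append((depth, part.get_content_type()))
        if part.is_multipart():
            for child in part.iter_parts():
                visit(child, depth + 1)

    visit(message, 0)
    return found

for depth, content_type in describe_parts(msg):
    print("  " * depth + content_type)

# The library picks the boundary string when the message is serialized.
raw = msg.as_string()
boundary = msg.get_boundary()
print(boundary)
```

Printing `raw` shows the boundary lines between the parts, which is the wire format the paragraph above describes.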
Dissecting MIME Headers
MIME-Version
The presence of this header tells us that the message is formatted according to MIME. The original intention of the header was to support future versions of MIME, but the way MIME is implemented makes it practically impossible to change the version. The value is therefore always 1.0. It signals that the message may use MIME features such as non-US-ASCII text and non-text attachments.
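A receiver can check for this header before treating a message as MIME. The following is a minimal sketch that assumes Python's standard `email` package; the helper name `mime_version` and both sample messages are hypothetical.

```python
from email import message_from_string
from email.policy import default

def mime_version(raw_text):
    """Return the MIME-Version value of a raw message, or None if absent."""
    msg = message_from_string(raw_text, policy=default)
    value = msg["MIME-Version"]
    return None if value is None else str(value).strip()

# A MIME message and a plain RFC 822 message (both made up for this sketch).
mime_raw = (
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=US-ASCII\n"
    "\n"
    "Hello\n"
)
plain_raw = "Subject: plain RFC 822 message\n\nHello\n"

print(mime_version(mime_raw))
print(mime_version(plain_raw))
```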
Content-Type Header
The Content-Type header defines the type of data in the body and in each body part of the message. This helps the client choose an appropriate mechanism for displaying the message to the user. For multipart types, the type/subtype value is followed by a boundary parameter. Each body part is preceded by a line consisting of two hyphens followed by the boundary value, and the last part is followed by the same line with two more hyphens appended. For example:
Content-Type: multipart/mixed;
    boundary=unique-boundary-1
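The pieces of a Content-Type header can be read back from a parsed message. This sketch assumes Python's standard `email` package and uses a small hand-written message that reuses the boundary value from the example; the message body itself is made up.

```python
from email import message_from_string
from email.policy import default

# A hand-written multipart message with a single text part.
raw = (
    "MIME-Version: 1.0\n"
    "Content-Type: multipart/mixed;\n"
    "    boundary=unique-boundary-1\n"
    "\n"
    "--unique-boundary-1\n"
    "Content-Type: text/plain; charset=US-ASCII\n"
    "\n"
    "Some text appears here.\n"
    "--unique-boundary-1--\n"
)

msg = message_from_string(raw, policy=default)

# Type, subtype and boundary parameter of the top-level entity.
print(msg.get_content_type())
print(msg.get_content_maintype(), msg.get_content_subtype())
print(msg.get_boundary())

# Each body part carries its own Content-Type header.
first = next(msg.iter_parts())
print(first.get_content_type(), first.get_content_charset())
```

Note that `get_content_type()` and `get_content_charset()` return lower-cased values, since media type and charset names are case-insensitive.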
Content-Disposition Header
content-disposition = "Content-Disposition" ":"
                      disposition-type *( ";" disposition-parm )
disposition-type = "attachment" | disp-extension-token
disposition-parm = filename-parm | disp-extension-parm
filename-parm = "filename" "=" quoted-string
disp-extension-token = token
disp-extension-parm = token "=" ( token | quoted-string )

Content-Disposition: attachment; filename="fname.ext"
A body part of a MIME message should be shown as is unless its Content-Disposition header is set to attachment. When Content-Disposition: attachment is specified, the body part should not be displayed normally. Instead it should be shown as an attachment, and clicking it should download the body part to a file with the name given in the filename parameter of the header.
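A client could separate displayable parts from attachments along these lines. The sketch assumes Python's standard `email` package; the helper `split_by_disposition` and the attachment contents are hypothetical, while the file name `fname.ext` is reused from the example header.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg.set_content("See the attached file.")

# add_attachment() sets "Content-Disposition: attachment" and the
# filename parameter on the new body part.
msg.add_attachment(b"hypothetical file contents", maintype="application",
                   subtype="octet-stream", filename="fname.ext")

def split_by_disposition(message):
    """Return (content types to display as is, file names to offer for download)."""
    shown, attachments = [], []
    for part in message.iter_parts():
        if part.get_content_disposition() == "attachment":
            attachments.append(part.get_filename())
        else:
            shown.append(part.get_content_type())
    return shown, attachments

shown, attachments = split_by_disposition(msg)
print(shown)
print(attachments)
```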
Content-Transfer-Encoding
As we know, many protocols such as SMTP only allow 7-bit messages. With MIME it is possible to send 8-bit and binary data as well, by encoding that data in a 7-bit format. To support this, MIME provides the Content-Transfer-Encoding header. For example, consider a body part consisting of an audio file.
Content-Type: audio/basic
Content-Transfer-Encoding: base64
Since the audio file is binary, it has to be re-encoded into a 7-bit format before it is sent. Here the data is Base64-encoded, and the Content-Transfer-Encoding header tells the receiver which encoding was applied so that it can be reversed. The header can take the following values:
- 7BIT – default
- Base64
- QUOTED-PRINTABLE
- 8BIT
- BINARY
- x-EncodingName
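The two encodings that actually transform data, Base64 and quoted-printable, can be tried out directly. This is a minimal sketch using Python's standard `base64`, `quopri` and `email` modules; the sample bytes and the sample text are made up.

```python
import base64
import quopri
from email.message import EmailMessage

# Hypothetical binary payload containing bytes outside the 7-bit range.
binary_data = bytes([0x00, 0xFF, 0x10, 0x80])

# Base64 maps arbitrary bytes onto a 7-bit-safe alphabet.
b64 = base64.b64encode(binary_data)
print(b64)

# Quoted-printable leaves ASCII readable and escapes other bytes as =XX.
latin1_text = "café".encode("iso-8859-1")
qp = quopri.encodestring(latin1_text)
print(qp)

# The email package applies Base64 to binary content and records
# that choice in the Content-Transfer-Encoding header.
part = EmailMessage()
part.set_content(binary_data, maintype="audio", subtype="basic")
print(part["Content-Type"], part["Content-Transfer-Encoding"])
```

Base64 suits binary data such as audio or images, while quoted-printable keeps mostly-ASCII text legible in its encoded form.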
I hope this post sheds some extra light on what MIME is. It is the result of research and reading I have done over the last few days, and since I am human it may contain some errors. If you find that any vital basic points are missing, please let me know so that I can add them to the post. If you find this post useful, please drop a comment or two.