Introduction to HTTP MIME

An article from JCP


Sometimes we use or refer to a software term or technology many times without being very familiar with it. MIME is one of those terms for me. We use the MIME standards to exchange messages between endpoints, for example in email communication and web services.

MIME is everywhere, and we have probably used it countless times during our software careers. But what exactly is MIME? I posed this question to some of my software developer friends and got ambiguous answers. Some referred to MIME as the MIME type; some tried to quote its full form. It was clear that MIME is not a very well understood concept. In this post we will try to shed some light on what MIME is.

History

As per RFC 822, the original mail message format supported only the standard US-ASCII charset. This left a lot to be desired.

  1. What if the sender wants to send a message in a different charset, for example to write in Hindi or Spanish?
  2. What if the sender wants to send a multipart message?
  3. What if the sender wants to add a non-text attachment?
  4. What if the sender wants to use a different charset in a message header?

To address these concerns, the Internet Engineering Task Force (IETF) came up with a new format for mail messages, defined as an extension to the well-known RFC 822. Messages in this new format are referred to as MIME messages.

What is MIME?

MIME stands for Multipurpose Internet Mail Extensions. It is an Internet standard that extends email messages to support non-ASCII text content, non-text attachments, multipart message bodies, and non-US-ASCII headers. MIME was so successful that it was adopted as a message format for the web in general and for lots of other technologies. The MIME format is defined in the following RFC documents.

  1. RFC 2045: Describes the various headers used to describe the structure of MIME messages.
  2. RFC 2046: Defines an initial set of media types.
  3. RFC 2047: Describes extensions to RFC 822 to allow non-US-ASCII text data in Internet mail header fields.
  4. RFC 2048: Specifies various IANA registration procedures for MIME-related facilities.
  5. RFC 2049: Provides MIME conformance criteria as well as some examples of MIME message formats, acknowledgements, and the bibliography.
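
To make the RFC 2047 item more concrete, here is a minimal Python sketch that uses the standard library's email.header module to turn a non-ASCII header value into an ASCII-only "encoded word" and back. The sample subject text and the helper names encode_subject and decode_subject are illustrative assumptions, not something from the RFCs.

```python
from email.header import Header, decode_header


def encode_subject(text, charset="utf-8"):
    """Return an ASCII-only RFC 2047 encoded form of a header value."""
    return Header(text, charset).encode()


def decode_subject(encoded):
    """Decode an RFC 2047 encoded header value back into a str."""
    parts = []
    for value, charset in decode_header(encoded):
        # decode_header yields bytes for encoded words, str for plain text
        if isinstance(value, bytes):
            value = value.decode(charset or "ascii")
        parts.append(value)
    return "".join(parts)


subject = "héllo"  # hypothetical sample text with one non-ASCII character
encoded = encode_subject(subject)
print(encoded)                  # an "=?utf-8?...?=" encoded word
print(decode_subject(encoded))  # the original text again
```

The encoded form contains only ASCII characters, which is what lets it travel through header fields that were originally specified as US-ASCII only.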

Structure of a MIME message

    MIME-Version: 1.0
    From: Nathaniel Borenstein <nsb@nsb.fv.com>
    To: Ned Freed <ned@innosoft.com>
    Date: Fri, 07 Oct 1994 16:15:05 -0700 (PDT)
    Subject: A multipart example
    Content-Type: multipart/mixed;
                  boundary=unique-boundary-1

    --unique-boundary-1

      ... Some text appears here ...

    --unique-boundary-1
    Content-type: text/plain; charset=US-ASCII

    --unique-boundary-1
    Content-Type: multipart/parallel; boundary=unique-boundary-2

    --unique-boundary-2
    Content-Type: audio/basic
    Content-Transfer-Encoding: base64

      ... base64-encoded 8000 Hz single-channel
          mu-law-format audio data goes here ...

    --unique-boundary-2
    Content-Type: image/jpeg
    Content-Transfer-Encoding: base64

      ... base64-encoded image data goes here ...

    --unique-boundary-2--

    --unique-boundary-1
    Content-type: text/enriched

    <bold>this is a test</bold>

    --unique-boundary-1
    Content-Type: message/rfc822

    From: (mailbox in US-ASCII)
    To: (address in US-ASCII)
    Subject: (subject in US-ASCII)
    Content-Type: Text/plain; charset=ISO-8859-1
    Content-Transfer-Encoding: Quoted-printable

      ... Additional text in ISO-8859-1 goes here ...

    --unique-boundary-1--

Above is an example of a MIME message. On close inspection you will find that it has the following parts.

  1. Headers
  2. Multiple body parts of different content types

Multipart

A MIME multipart message contains one or more body parts, each of which can have its own content type. A body part can itself be a multipart, so body parts can be nested inside one another. The parts are separated by the boundary given in the boundary parameter of the Content-Type header of the parent body part.
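
The nesting described above can be explored with a short Python sketch based on the standard library's email package. The message text is a trimmed-down variant of the example message, the base64 payload is a placeholder rather than a real image, and the helper name list_parts is an assumption made for illustration.

```python
from email import message_from_string

# A trimmed-down multipart message; the image payload is a placeholder.
RAW = """\
MIME-Version: 1.0
Subject: A multipart example
Content-Type: multipart/mixed; boundary=unique-boundary-1

--unique-boundary-1

Some text appears here

--unique-boundary-1
Content-Type: multipart/parallel; boundary=unique-boundary-2

--unique-boundary-2
Content-Type: image/jpeg
Content-Transfer-Encoding: base64

/9j/4AAQ

--unique-boundary-2--

--unique-boundary-1--
"""


def list_parts(raw):
    """Return (content type, is_multipart) for the message and every nested part."""
    msg = message_from_string(raw)
    # walk() visits the message itself first, then each part depth-first
    return [(part.get_content_type(), part.is_multipart()) for part in msg.walk()]


for content_type, is_container in list_parts(RAW):
    print(content_type, "(container)" if is_container else "(leaf)")
```

Note that the first body part has no Content-Type header of its own, so the parser reports the default type, text/plain.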

Dissecting MIME Headers

MIME Version

    MIME-Version: 1.0

The presence of this header tells us that we have a MIME message. The original intention of this header was to support future versions of MIME, but the way MIME is implemented makes it impossible in practice to change the version. The version is now always 1.0, and the header simply signifies that the message follows the MIME format.

Content Type Header

    Content-Type: multipart/mixed;
                  boundary=unique-boundary-1

The Content-Type header defines the type of data present in the body and body parts of the message. This helps the client choose an appropriate mechanism for displaying the message to the user. For multipart types, the type/subtype is followed by a boundary parameter. Each body part is preceded by a delimiter line made of two hyphens followed by the boundary value, and the last body part is followed by a closing delimiter that has two more hyphens appended. For example:

    --unique-boundary-1

    body part goes here

    --unique-boundary-1--
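
As a rough illustration of the delimiter rule, the following Python sketch builds a two-part message with the standard library's email.mime classes and collects the delimiter lines from the serialized text. The part contents and the helper name build_message are placeholders chosen for this example.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_message():
    """Build a multipart/mixed message with an explicitly chosen boundary."""
    msg = MIMEMultipart("mixed", boundary="unique-boundary-1")
    msg.attach(MIMEText("first part", "plain", "us-ascii"))
    msg.attach(MIMEText("second part", "plain", "us-ascii"))
    return msg


msg = build_message()
lines = msg.as_string().splitlines()

# Keep only the delimiter lines: "--boundary" before each part,
# and "--boundary--" after the last one.
delimiters = [line for line in lines if line.startswith("--unique-boundary-1")]
print(delimiters)
print(msg.get_content_type(), msg.get_boundary())
```

In real code it is usually safer to let the library generate a boundary, since a fixed value must never occur inside any body part.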

Content Disposition Header

    content-disposition = "Content-Disposition" ":"
                          disposition-type *( ";" disposition-parm )
    disposition-type = "attachment" | disp-extension-token
    disposition-parm = filename-parm | disp-extension-parm
    filename-parm = "filename" "=" quoted-string
    disp-extension-token = token
    disp-extension-parm = token "=" ( token | quoted-string )

An example is

    Content-Disposition: attachment; filename="fname.ext"

A body part of a MIME message should be shown as is unless its Content-Disposition header specifies attachment. When Content-Disposition: attachment is specified, the body part should not be displayed normally; rather, it should be presented as an attachment, and clicking it should download the body part to a file with the name given by the filename parameter of the header.
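
Here is a small Python sketch, again using the standard library's email package, that sets this header on a body part and reads it back. The three payload bytes are dummy data, and the file name is the one from the example above.

```python
from email.mime.application import MIMEApplication

# A binary body part; the three bytes are dummy data for illustration.
part = MIMEApplication(b"\x00\x01\x02", "octet-stream")

# Extra keyword arguments become parameters of the header value.
part.add_header("Content-Disposition", "attachment", filename="fname.ext")

print(part["Content-Disposition"])
print(part.get_content_disposition())  # the disposition type
print(part.get_filename())             # the filename parameter
```

A mail client or browser would use the disposition type to decide between inline display and a download prompt, and the filename parameter as the suggested name for the saved file.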

Content-Transfer-Encoding

As we know, many protocols such as SMTP were designed to carry only 7-bit data. With MIME it is possible to send 8-bit and binary data as well, by encoding that data in a 7-bit format. To declare which encoding was applied, MIME provides the Content-Transfer-Encoding header. For example, consider a body part consisting of an audio file.

    Content-Type: audio/basic
    Content-Transfer-Encoding: base64

Since the audio file is binary, it must be re-encoded in a 7-bit format. The Content-Transfer-Encoding header tells the receiver that the body was encoded with Base64, which uses only 7-bit-safe characters. Base64 is one of the following encodings that the header can name.

  1. 7BIT – default
  2. Base64
  3. QUOTED-PRINTABLE
  4. 8BIT
  5. BINARY
  6. x-EncodingName
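
To show what the two encodings that actually transform data look like, here is a minimal Python sketch using the standard base64 and quopri modules. The sample bytes and the sample word are arbitrary values chosen for illustration.

```python
import base64
import quopri

# Base64: suited to arbitrary binary data such as audio or images.
binary = bytes([0x00, 0xFF, 0x10, 0x80])  # arbitrary sample bytes
b64 = base64.b64encode(binary)
print(b64)

# Quoted-printable: suited to text that is mostly ASCII.
latin1_text = "café".encode("iso-8859-1")  # one non-ASCII byte, 0xE9
qp = quopri.encodestring(latin1_text)
print(qp)

# Both encodings are reversible and produce only 7-bit bytes.
print(base64.b64decode(b64) == binary)
print(quopri.decodestring(qp) == latin1_text)
```

Base64 grows the data by about a third regardless of content, while quoted-printable leaves ASCII characters readable and escapes only the bytes that need it, which is why it is typically preferred for mostly-ASCII text.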

I hope that this post sheds some extra light on what MIME is. This post is the result of research and reading I have done over the last few days, and since I am human, it could contain some errors. If you find any vital basic points missing, please let me know so that I can add them to the post. If you find this post useful, please drop a comment or two.
