Readers interested in looking at C and coding guidelines from an economic and cultural perspective might like to read the material available here. You can find a searchable copy of the latest draft of the C Standard, C0X, here.
The MISRA C guidelines are of a high enough quality (regretfully many coding standards are poorly thought out) to warrant a detailed analysis. The problems with the current version of the MISRA document can be divided into several broad categories:
Download MISRA C test cases.
Those companies interested in a more substantial compiler test suite should look at the Perennial compiler validation suites.
Most of the rules have been taken from Annex G of the C standard. This lists the unspecified, undefined and implementation defined behaviours. A program that does not use any of these constructs is much more likely to give the same results, after being processed by different implementations, than a program that does use one or more of them. In the terminology of the standard such a program is strictly conforming (also, to achieve this status a program "... shall not exceed any minimum implementation limit.").
You can find a searchable copy of the latest draft of C0X here.
The commentary here refers to the 1990 Standard. Text within double quotation marks is quoting from this document. Any mention of 'the Standard' also refers to this document.
The two languages have a different view of the world. Many of the language constructs that they share do behave the same way. But not all of them do.
Developers who know C++ need to have some constructs brought to their attention as behaving differently in C.
The undefined behavior applies "... except in a character constant, a string literal, a header name, a comment, or a preprocessing token that is never converted to a token ...".
The implementation defined behavior kicks in when the characters are converted to an internal representation. This will occur if the characters appear in a character constant, or a string literal.
The rule should be written to exclude the use of these characters within character constants and string literals.
Rule 6: This rule excludes the use of an EBCDIC character set. No use of IBM 370's in embedded systems.
Given that Rule 5 precludes the use of characters outside of the 91 defined by the standard, it seems pointless to specify that a restricted subset of a standard defining nearly 64000 characters should be used.
The rule should be rewritten to specify the ASCII character set (which it already states in a roundabout way), or ISO 8859-1 (Latin 1).
Rule 7: Trigraphs have a well defined meaning. The example given in the commentary refers to an unintentional usage which could result in unexpected behaviour if the third character happened to complete a defined trigraph. MISRA cannot rule against unintentional usage.
Software developers working in countries whose keyboards do not offer all of the characters required to write C programs will either require special equipment or have to break this rule.
Recommending that a compiler operate in a non-standard mode is very bad practice.
Rule 9: This precludes the use of a construct that is not supported by the C standard. As such it is a pointless rule.
Rule 13: C permits the basic type to be omitted, i.e., unsigned is equivalent to unsigned int. This rule needs to state that all object and function declarations shall include a basic type, via a typedefed identifier.
Rule 15: This is not an enforceable coding standards rule.
What is a 'defined standard'? Defined by whom?
Better wording, should this rule be kept, would be to require a fully documented, publicly available, floating point specification.
Rule 18: The commentary on this rule fails to capture the true advantage of using suffixes. The type of an integer literal depends on its value and the range of values supported by a given type on a particular implementation. So, for instance, the literal 33000 has type long int on a platform where int is 16 bits, but type int if int is 32 bits. Thus it is possible for the same expression, containing integer literals, to have a different type on different compilers.
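The dependence of a literal's type on the implementation can be observed directly. The following is a minimal sketch, assuming a C11 compiler for _Generic; the TYPE_NAME macro and the function names are hypothetical helpers, not part of the Standard or of MISRA.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: maps an expression to the name of its type. */
#define TYPE_NAME(e) _Generic((e),      \
        int: "int",                      \
        unsigned int: "unsigned int",    \
        long: "long",                    \
        unsigned long: "unsigned long",  \
        default: "other")

/* Unsuffixed: int where int can hold 33000, long where int is 16 bits. */
const char *unsuffixed_type(void) { return TYPE_NAME(33000); }

/* Suffixed: long on every implementation. */
const char *suffixed_type(void) { return TYPE_NAME(33000L); }

/* Only the suffixed form gives an answer that does not vary. */
void check_literal_types(void)
{
    assert(strcmp(suffixed_type(), "long") == 0);
}
```

Only the suffixed literal has a type that can be relied on across compilers.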
Rule 22: Incorrect use of terminology here (the wording could also be read to imply that no identifiers should be declared within nested blocks). Only labels have function scope. Identifiers denoting objects have file, block or function prototype scope.
This rule could state: 'Declarations of identifiers denoting objects should have block scope unless a wider scope is necessary', or 'Declarations of identifiers denoting objects should have the narrowest block scope unless a wider scope is necessary'.
Rule 23: This is only half the story. File scope identifiers should be explicitly declared static, or extern. Without an explicit storage class an object is a tentative definition and could be given an implicit initializer at the end of the translation unit. It is tentative in the sense that another, later, file scope, declaration of the same name can result in the storage class being external or static.
Also the file scope declaration:
const int ci;
implicitly has the extern storage class in C and implicitly the static storage class in C++.
Rule 24: The example is incorrect. As written the second declaration of x has static storage duration because "... contains the storage-class specifier extern, the identifier has the same linkage as any visible declaration of the identifier with file scope."
The following example exhibits the intended behavior:
static UI_8 x;

void f(void)
{
    extern UI_8 x;      /* matches against file scope and
                         * therefore has static storage class */
    {
        extern UI_8 x;  /* matches against the block scope declaration
                         * which is hiding the file scope declaration.
                         * So it does not have the same linkage as the
                         * prior declaration.
                         * Has external linkage.
                         */
    }
}

Rule 26: Nothing crude at all. 6.1.2.6 says "Two types have compatible type if their types are the same.". It then goes on to reference additional conditions under which two types that are not the same can be compatible.
Rule 27: The wording here does not clearly state what is intended. An external object is declared whenever a file containing the declaration is included.
This rule should state: 'External object declarations should only appear within a single source file, which may be included by other source files one or more times.'
Rule 28: The register storage class specifier is at worst harmless. It does not change the behavior of a conforming program. It does have another use, in that it can be used to explicitly prevent the address of an object being taken.
Rule 29: What does this rule mean? The commentary is no further help. Tags are not initialized; objects declared using tags can be initialized. If the types of the initializers do not match the member types a diagnostic will be issued. Is this a rule about tags, or about initializers?
Rule 31: The undefined behaviour referenced in this rule appears in Annex G, but it does not appear in the Normative body of the standard. Bug in the C standard.
The rule is effective in making code easier to read and understand. The reference to undefined behaviour should be removed.
The commentary appearing in the Note, paragraph 4, is poor practice. MISRA should not encourage coding practices that rely on implicit behaviour. The noted behaviour only applies to block scope object definitions. File scope objects, that do not have an explicit initializer, are given a value of zero on program startup.
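The difference between file scope and block scope objects can be sketched as follows. The object and function names are hypothetical and chosen only for illustration.

```c
#include <assert.h>

int file_scope_total;            /* no initializer: zero at program startup */
static int file_scope_table[4];  /* every element is zero at startup as well */

int sum_of_uninitialized_file_scope(void)
{
    /* A block scope object gets no such guarantee: without the
     * explicit "= 0" below its value would be indeterminate. */
    int local = 0;

    return local + file_scope_total + file_scope_table[3];
}

void check_zero_initialization(void)
{
    assert(sum_of_uninitialized_file_scope() == 0);
}
```

Writing the initializer explicitly in both cases avoids relying on the reader knowing which of the two rules applies.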
Rule 32: This rule has nothing to do with the implementation defined behavior given in the standard. The standard says "Each enumerated type shall be compatible with an integer type; the choice of type is implementation-defined."
In a rather roundabout way this rule seems to be saying that the values in the definition of an enumeration constant list must not implicitly overlap. If this is the case then the rule needs to say so explicitly.
I think it would be good programming practice to ban all implicit and explicit overlaps in each enumeration definition. If there were a need to allow explicit overlapping of values, then it should be required that the value be assigned using the name of the earlier enumeration constant.
Another rule is needed to handle the above mentioned implementation defined behavior.
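The implicit overlap discussed under Rule 32, and the suggested way of writing an intentional overlap, can be sketched as follows. The enumeration names are hypothetical.

```c
#include <assert.h>

/* Implicit overlap: WAIT follows STOP and so is 2, the same value
 * that GO is explicitly given. Nothing in the source says this is intended. */
enum signal { STOP = 1, WAIT, GO = 2 };

/* Explicit overlap written using the name of the earlier constant,
 * which makes the intent visible to the reader. */
enum level { LOWEST = 0, DEFAULT_LEVEL = LOWEST, HIGHEST = 9 };

int implicit_overlap_present(void) { return WAIT == GO; }

void check_enum_overlap(void)
{
    assert(implicit_overlap_present());
    assert(DEFAULT_LEVEL == LOWEST);
}
```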
Rule 34: This rule should be rewritten so that the user does not have to look up what a primary-expression is.
'The operands of the && and || operators shall be enclosed in parentheses unless they are single identifiers'.
MISRA might like to make an exception for some postfix-expressions such as array indexing, structure member selection and function invocations.
The Standard does not specify the name of the syntactic rule to which the preprocessor operator defined belongs. Given the naming of other unary operators, the MISRA committee might reasonably say that use of this operator prevents an expression from being a primary-expression, i.e., it must be bracketed like the other unary operators.
Rule 36: This is not a meaningful rule. It is equivalent to having a rule that states: 'programs shall not contain bugs'.
Rule 40: In C90 the behaviour is predictable. The side-effect will not happen. The code is simply misleading. In C9X the side-effect may happen, making the behaviour unpredictable.
This rule does help prevent coding bugs through misunderstanding of the Standard.
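A minimal sketch of the C90 case, assuming (the text above does not say so) that the construct in question is a side-effect inside the operand of sizeof. The function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

int value_after_sizeof(void)
{
    int i = 0;
    size_t n = sizeof(i++);  /* operand is not evaluated: i is not incremented */

    (void)n;                 /* n is only here to give sizeof a context */
    return i;
}

void check_sizeof_operand(void)
{
    assert(value_after_sizeof() == 0);
}
```

In C99 the operand is evaluated when its type is a variable length array, which is where the unpredictability comes from.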
Rule 41: Practical advice would be to require a simple test at program startup to check that the intended behavior is being exhibited by the implementation, with failure of this test generating a diagnostic.
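Such a startup test might look like the following sketch. It assumes (the text above does not say so) that the behaviour being checked is the rounding direction of integer division with a negative operand, which is implementation defined in C90. The function names, and the use of assert as the diagnostic, are hypothetical choices.

```c
#include <assert.h>

/* Returns 1 if division truncates toward zero, the behaviour
 * this hypothetical program has been written to expect. */
int division_truncates_toward_zero(void)
{
    volatile int minus_seven = -7;  /* volatile: discourage compile-time folding */
    volatile int two = 2;

    return (minus_seven / two == -3) && (minus_seven % two == -1);
}

/* Intended to be called once, before any other processing. */
void startup_check(void)
{
    assert(division_truncates_toward_zero());
}
```

A production system would probably replace assert with whatever error reporting mechanism the project already uses.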
Rule 42: If this rule is deemed necessary why is the control expression of a for loop any different than any other situation?
There are cases where a comma operator is very useful, for instance in automatically generated code and macro definitions. The guidelines specify that the rules apply equally to human and machine generated code.
Rule 43: Implicit conversions lose no more information than explicit ones.
Conversions to types capable of representing a wider range of values do not usually lose information (care must be exercised with integer to floating point conversions; rounding may occur for values containing many significant digits).
Some coding standards require all conversions to be explicit, while others only require explicit casts in those situations where information may be lost.
This rule should either state: 'All conversions to narrower types shall be explicit.', or 'All conversions shall be explicit.'
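The caveat about integer to floating point conversions can be sketched as follows, assuming float is an IEEE 754 single precision type with a 24 bit significand and the default rounding mode. The function name is hypothetical.

```c
#include <assert.h>

/* Converts to float and back. The range of float is far wider than
 * that of long, but it has fewer significant digits. */
long round_trip_through_float(long value)
{
    volatile float f = (float)value;  /* volatile: force rounding to float */

    return (long)f;
}

void check_float_rounding(void)
{
    assert(round_trip_through_float(16777216L) == 16777216L);  /* 2^24: exact */
    assert(round_trip_through_float(16777217L) != 16777217L);  /* 2^24 + 1: rounded */
}
```

The explicit cast makes no difference to the outcome; the same rounding occurs if the conversion is left implicit.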
Rule 44: The use of typedefs can result in explicit casts that are not strictly necessary, for instance when converting between two types that have been defined using the same underlying base type.
It is hard to see how redundant, explicit, casts could cause any more confusion over the rules of promotion than non-redundant, explicit casts. Such a statement by MISRA is itself confusing.
Rule 47: This would be better stated in terms of requiring explicit parenthesising of expressions that contain more than one operator. Its category should be increased to required.
In C = is an operator. Complete adherence to this rule requires that any expression containing more than one operator be parenthesized.
a=(b+c);
What about expressions that rely on the implicit (in the grammar) left to right order of evaluation of expressions? In a+b+c the implicit ordering is a+b followed by adding c to the result. If the three objects have different types it is possible that the result will depend on the order of evaluation.
MISRA also needs to add a rule that states: 'No dependence shall be placed on C's left to right evaluation in expressions. The ordering should be made explicit via the use of parentheses.'
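The effect of operand types on a+b+c can be sketched as follows, assuming unsigned int is narrower than long long (true of most current implementations, but not guaranteed). The function names are hypothetical.

```c
#include <assert.h>

/* The grammar groups this as (a + b) + c, so a is first
 * converted to unsigned int and a negative value wraps. */
long long grouped_by_grammar(int a, unsigned int b, long long c)
{
    return a + b + c;
}

/* Here b is converted to long long first, so a keeps its sign. */
long long grouped_explicitly(int a, unsigned int b, long long c)
{
    return a + (b + c);
}

void check_grouping(void)
{
    assert(grouped_explicitly(-2, 1u, 0) == -1);
    assert(grouped_by_grammar(-2, 1u, 0) != -1);
}
```

The same three operands give -1 with one grouping and a large positive value with the other.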
Rule 48: The coding in the example is sloppy. Parentheses should be used to make explicit that the cast, a unary operator, binds to the identifier on its right, not the complete expression.
Rule 49: Remove the exception case for boolean operands. It is very unlikely that the implementation will generate different code if an explicit test against zero appears in the source.
Rule 51: The behavior for unsigned arithmetic is well defined by the standard. If anything, this rule should apply to signed arithmetic overflow, which is undefined, and not covered by a MISRA rule.
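The contrast between the two kinds of arithmetic can be sketched as follows. The function names are hypothetical, and the signed test is only one of several ways of detecting overflow before it happens.

```c
#include <assert.h>
#include <limits.h>

/* Well defined: the result is reduced modulo UINT_MAX + 1. */
unsigned int add_unsigned(unsigned int a, unsigned int b)
{
    return a + b;
}

/* Signed overflow is undefined, so it has to be tested for
 * before the addition is performed, never afterwards. */
int signed_add_would_overflow(int a, int b)
{
    return (b > 0) ? (a > INT_MAX - b) : (a < INT_MIN - b);
}

void check_arithmetic(void)
{
    assert(add_unsigned(UINT_MAX, 1u) == 0u);
    assert(signed_add_would_overflow(INT_MAX, 1));
    assert(!signed_add_would_overflow(1, 2));
}
```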
Rule 53: As written this rule is too narrow. It is possible for parts of an expression to have no side-effect, e.g., the left hand side of a comma operator (in a for statement or otherwise), or one of the arms of the conditional operator. The rule needs to be widened to cover side-effects in expressions.
Rule 55: The reference to undefined behaviour given for this rule is misleading. The use of labels in itself does not cause undefined behavior. But having more than one label, in the same function, with the same spelling does cause undefined behavior.
Rule 57: This is very narrow minded. continue is a structured flow of control statement.
Rule 58: This is very narrow minded. break is a structured flow of control statement.
Rules 61-64: The reference to Koenig is misleading.
Rule 66: Poor wording.
Presumably what is intended is: 'Only identifiers appearing in the control expression shall appear within the initialization and increment expressions.'
Rule 67: Even poorer wording. Numeric variable, iteration counting? Where are these terms defined? Not in the C standard.
Presumably what is intended is: 'Identifiers modified within the increment expression of a loop header shall not be modified inside the block controlled by that loop header.'
The use of the flag object in the example shows how a break statement could improve code readability.
Rule 73: This rule is a constraint within the C standard and must be diagnosed by a conforming implementation. The MISRA guidelines are not intended to list constraints.
Rule 77: This rule is a poorly reworded form of a constraint within the C standard and must be diagnosed by a conforming implementation. The MISRA guidelines are not intended to list constraints.
Rule 78: This rule is a constraint within the C standard and must be diagnosed by a conforming implementation. The MISRA guidelines are not intended to list constraints.
The undefined behavior highlighted in this rule applies to calls to functions that do not have a prototype declaration in scope. Such a situation is prohibited by Rule 71.
Rule 79: This is a nonsense rule. Functions having a void return type do not return any value. Attempting to use such a value is a constraint within the C standard and must be diagnosed by a conforming implementation. The MISRA guidelines are not intended to list constraints.
The undefined behavior highlighted refers to expressions of type void in general. An attempt to use the non-existent value of a void expression will in general be caught by the type compatibility rules and be a constraint violation.
Rule 80: This rule is a constraint within the C standard and must be diagnosed by a conforming implementation. The MISRA guidelines are not intended to list constraints.
The undefined behavior highlighted refers to words that appear in Annex G, but that do not appear in the normative body of the standard. Bug in the C standard.
Rule 81: This is very poorly worded. All parameters in C are passed by value. It is possible to pass a pointer to an object (but the pointer itself is still passed by value).
This rule should state a general condition that any object that is not intended to have its value modified should be defined using the const qualifier.
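The points made about Rule 81 can be sketched as follows. All of the function names are hypothetical.

```c
#include <assert.h>

/* The parameter is a copy: the caller's object is unaffected. */
void reset_copy(int n) { n = 0; (void)n; }

/* The pointer is itself passed by value, but the object
 * it points at can be modified through it. */
void reset_through(int *p) { *p = 0; }

/* const states that the pointed-to object is not modified. */
int read_only(const int *p) { return *p; }

int value_after_reset_copy(void)
{
    int v = 5;
    reset_copy(v);
    return read_only(&v);
}

int value_after_reset_through(void)
{
    int v = 5;
    reset_through(&v);
    return read_only(&v);
}

void check_parameter_passing(void)
{
    assert(value_after_reset_copy() == 5);
    assert(value_after_reset_through() == 0);
}
```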
Rule 84: This rule is a constraint within the C standard and must be diagnosed by a conforming implementation. The MISRA guidelines are not intended to list constraints.
Rule 85: This is a nonsense rule. An identifier declared to have function, or pointer to function, type is only called if it is followed by parentheses. Otherwise the value of the expression is the address referred to by the object; there is no function call involved.
The reference to Koenig is an example of how omitting the parentheses can lead to completely unexpected behavior, i.e., no function call.
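The difference between the two forms can be sketched as follows. The function sensor_ready and the pointer ready_handler are hypothetical names invented for this illustration.

```c
#include <assert.h>

static int call_count = 0;

/* Hypothetical function: counts its calls and reports "not ready". */
int sensor_ready(void)
{
    call_count++;
    return 0;
}

static int (*ready_handler)(void) = sensor_ready;

/* No parentheses: the pointer value is tested, nothing is called. */
int test_without_call(void) { return ready_handler != 0; }

/* Parentheses: the function is called and its result is tested. */
int test_with_call(void) { return ready_handler() != 0; }

int calls_made(void) { return call_count; }

void check_function_call(void)
{
    int before = calls_made();

    assert(test_without_call() == 1);
    assert(calls_made() == before);
    assert(test_with_call() == 0);
    assert(calls_made() == before + 1);
}
```

The form without parentheses is always true here, which is the kind of surprise being described.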
Rule 87: The reference should be to clause 7.1.2.
Rule 94: This rule is a constraint within the C standard and must be diagnosed by a conforming implementation. The MISRA guidelines are not intended to list constraints.
The referenced undefined behavior refers to the case of an argument that expands to an empty sequence of preprocessor tokens.
This rule should state: 'An argument to a function-like macro shall not consist of no preprocessing tokens.'
Rule 98: As stated the rule is wider than the unspecified behaviour given in the C standard. It is not possible to cause unspecified behaviour through the use of more than one # in a macro body. However, use of more than one ##, or use of both # and ##, can lead to unspecified behaviour.
Rule 101: As written this rule is saying that arrays may not be indexed, since a[i] is explicitly stated to be equivalent to *(a+i) in the Standard. More thought needs to be put into writing a rule that achieves the desired effect, without outlawing the writing of any meaningful C code.
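The equivalence between indexing and pointer arithmetic can be checked with a short sketch; the function name is hypothetical.

```c
#include <assert.h>

/* Returns 1 if indexing and pointer arithmetic designate the same element. */
int index_forms_agree(void)
{
    int a[4] = { 10, 20, 30, 40 };
    int i = 2;

    return (a[i] == *(a + i)) && (&a[i] == a + i);
}

void check_index_forms(void)
{
    assert(index_forms_agree());
}
```

Any rule that bans the second form on the grounds that it is pointer arithmetic also, read literally, bans the first.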
Rule 103: This is poorly worded. The wording reads as if it is permitted to compare pointers to different objects of the same array, structure, or union type.
This rule should state: 'Relational operators shall not be applied to objects of pointer type except where both operands are of the same type and both point into the same object.'
Rule 104: Confusion. The referenced unspecified behavior has nothing to do with this rule. The two undefined behaviors have nothing to do with this rule and should both refer to 6.3.4 if they are intended to have anything to do with pointers to functions.
The first sentence of paragraph one is a good rule. Such casting would make it difficult to check that the correct arguments were being passed to functions. The headline rule refers to non-constant pointers; there is not a lot of connection between the two constructs.
Function pointers do not cause problems with dependence on the order of evaluation. The order is unspecified. Previous rules specify that code shall not depend on a particular order of evaluation of an expression.
Perhaps what is intended is: 'Pointers to functions shall never be cast, or take part in pointer arithmetic.'
Rule 105: Very poorly worded. The reference to undefined behavior should refer to 6.3.4 only.
This rule should state: 'Objects of type pointer to function shall never be assigned a value that is incompatible with the object type.'
Rule 108: As written the rule makes no sense. The commentary associated with this rule is a constraint within the C standard and must be diagnosed by a conforming implementation. The MISRA guidelines are not intended to list constraints.
The reference to undefined behavior should refer to 6.5.
Rule 109: The reference to undefined behavior should refer to 6.3.16.1 (the cited reference to Clause 7 is not relevant). The reference to implementation defined behavior should refer to 6.3.2.3.
This rule means that the union type cannot be used. Within embedded systems, where storage space is at a premium, unions are a type safe way for two objects to share storage, provided it is done in a mutually exclusive fashion. MISRA should consider deleting this rule and relying on Rule 110 to achieve the desired result.
Rule 110: Rule 109 requires that unions not be used; making this rule redundant.
The reference to undefined behavior should refer to 6.3.16.1 (the cited reference to Clause 7 is not relevant).
Rule 112: A bit-field object of type signed int and width 1 can represent the values -1 and 0. Why forbid this yet allow a bit-field of type unsigned int and width 1, which can represent the values 0 and 1?
Of course there is some danger that developers will make the mistaken assumption that the signed case will be capable of holding the values 0 and 1. In which case the rule category should be reduced to advisory.
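The two single bit cases can be sketched as follows, assuming a two's complement representation. The structure and function names are hypothetical.

```c
#include <assert.h>

struct flags {
    signed int   s : 1;  /* the only bit is the sign bit: holds -1 or 0 */
    unsigned int u : 1;  /* holds 0 or 1 */
};

int signed_field_value(void)
{
    struct flags f = { 0, 0 };
    f.s = -1;
    return f.s;
}

int unsigned_field_value(void)
{
    struct flags f = { 0, 0 };
    f.u = 1;
    return f.u;
}

void check_single_bit_fields(void)
{
    assert(signed_field_value() == -1);
    assert(unsigned_field_value() == 1);
}
```

Assigning 1 to the signed member is the mistaken assumption referred to above; the value does not fit, and the result of the conversion is implementation defined.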
Rule 113: Bit-fields are not the most poorly defined part of the language, although they may be the most implementation defined part of the language.
The purpose of unnamed bit-fields is to adjust the padding of named fields. Packing of members is accepted by MISRA. The wording as it stands seems to imply that developers can somehow access these unnamed members. As such half of this rule is misguided.
Declaring a structure or union to contain only unnamed members is undefined behaviour.
Rule 119: A very bad rule. errno is not poorly defined. The additional set of values it may take is implementation defined. But this information, error information which MISRA recommends every effort be made to use, cannot be obtained through any other means.
The reference to implementation defined behavior is incorrect. "If a macro definition is suppressed in order to access an actual object, or a program defines an identifier with the name errno, the behavior is undefined." The Standard is simply trying to make sure that code does not take advantage of a particular implementation of this identifier.
Rule 120: The offsetof macro is a portable method of obtaining the offset of a struct member. The only undefined behavior applies to bit-fields.
This rule should state: 'The use of offsetof is recommended.'
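A minimal sketch of portable use of offsetof. The structure layout and function names are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct message {
    char tag;
    long length;
    char payload[8];
};

size_t offset_of_tag(void)    { return offsetof(struct message, tag); }
size_t offset_of_length(void) { return offsetof(struct message, length); }

/* Locates a member from a pointer to the structure, without
 * hard coding any assumption about padding. */
long *length_member(struct message *m)
{
    return (long *)((char *)m + offsetof(struct message, length));
}

int length_member_matches(void)
{
    struct message m;
    return length_member(&m) == &m.length;
}

void check_offsets(void)
{
    assert(offset_of_tag() == 0);
    assert(length_member_matches());
}
```

The amount of padding between tag and length varies between implementations; offsetof gives the right answer on each of them.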
The following is a list of constructs whose behaviour is unspecified, undefined, or implementation defined, about which MISRA says nothing.
The type of string literals is array of char. Not array of const char, as in C++.
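The consequence of this typing can be sketched as follows; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* The literal is an array of 4 char: three letters and the terminating null. */
size_t literal_size(void)
{
    return sizeof("abc");
}

char first_letter(void)
{
    char *p = "abc";  /* no cast needed in C: the element type is char */

    return p[0];      /* reading is fine; writing through p is undefined */
}

void check_string_literal(void)
{
    assert(literal_size() == 4);
    assert(first_letter() == 'a');
}
```

Because the assignment to a plain char * needs no cast, the compiler is not required to say anything when code later tries to modify the literal.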
6.3 An arithmetic operation is invalid (such as division or modulus by 0) or produces a result that cannot be represented in the space provided.
6.5.3 An attempt is made to refer to an object with const-qualified type by means of an lvalue with non-const-qualified type.
The standard contains no prohibitions against casting away any const qualifier.
There should be a rule that says: 'Objects declared with the const-qualifier shall not be modified.'
6.5.3 An attempt is made to refer to an object with volatile-qualified type by means of an lvalue with non-volatile-qualified type.
The volatile qualifier is a signal to the implementation that the value of an object may change through external factors. This implies that all accesses to such an object do occur, they are not optimized away by keeping any previously accessed values in registers.
There should be a rule that says: 'Objects declared with the volatile-qualifier shall not be accessed via objects that do not have this qualification.'
6.8.3.2 The result of the preprocessing operator # is not a valid character string literal.
6.8.3.3 The result of the preprocessing operator ## is not a valid token.
6.1.3.4 The value of an integer character constant that contains more than one character.
The character constant 'ab' may be represented by either the character a or b being held in the lower order byte. Even more permutations are available for character constants containing more characters.
6.1.2.5 The representation and sets of values of the various types of integers.
6.5.5.3 What constitutes an access to an object that has volatile-qualified type.
It is intended that volatile-qualified objects be used to represent, among other things, memory mapped locations. For these types of objects, for instance, an access may cause a new value to be read into that location from an external source. But what exactly constitutes an access? For instance, is the appearance of an identifier on the left hand side of an assignment operator considered an access?
It was intended that this issue (and other issues concerning sequence points) be dealt with by the revised C standard, C9X. But there was insufficient time and it is planned that these issues will be covered by an Amendment.
6.8.2 The method of locating includable source files.
In development projects that use multiple directories care should be taken to ensure that no two header files have the same name. There is always the danger that an implementation will use a different search path and include the incorrect file.
6.9.3 Storage-class specifiers
"The placement of a storage-class specifier other than at the beginning of the declaration specifiers in a declaration is an obsolescent feature."
The revised standard includes some features that aid in the writing of reliable, maintainable code. It also contains additional constructs whose behaviour can vary between implementations and constructs that developers are likely to misunderstand.
At the very least MISRA needs to update the Standard references to include the revised clause numbers.
Comments from the following people (who do not agree with everything I have written) have helped me improve this commentary:
Stephen Parker
Chris Hills
MISRA is a registered trademark of the Motor Industry Research Association.
© Copyright 1999-2005, 2010. Knowledge Software Ltd. All rights reserved;
Last modified 13 October 2010