Understanding DQS, Part 1

Data Quality Issue: Description
Completeness: Is all the required information available? Are data values missing, or in an unusable state? In some cases missing data is irrelevant, but when the missing information is critical to a specific business process, completeness becomes an issue.
Example: if an email field has only 50,000 values present out of a total of 75,000 records, then the email field is about 66.7% complete.
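The completeness figure in the example is a simple ratio of populated values to total records. Here is a minimal sketch in Python; the `completeness` helper and its rule for what counts as "missing" (`None` or a blank string) are assumptions made for illustration, not something DQS defines.

```python
def completeness(values):
    """Return the share of values that are present (not None and not blank)."""
    if not values:
        return 0.0
    present = sum(1 for v in values if v is not None and str(v).strip() != "")
    return present / len(values)


# Mirror the example: 50,000 populated emails out of 75,000 records.
emails = ["user@example.com"] * 50_000 + [None] * 25_000
ratio = completeness(emails)
print(f"{ratio:.1%}")  # 66.7% (two thirds, rounded to one decimal)
```

In practice the definition of "unusable" would probably be stricter than this, for example by also rejecting placeholder values such as "N/A".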
Conformity: Are there expectations that data values conform to specified formats? If so, do all the values conform to these formats? Maintaining conformance to specific formats is important in data representation, presentation, aggregate reporting, search, and establishing key relationships.
Example: The Gender codes in two different systems are represented differently; in one system the codes are defined as ‘M’, ‘F’ and ‘U’ whereas in the second system they appear as 0, 1, and 2.
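One way to deal with the two gender encodings is to translate both into a single canonical format. The sketch below assumes that 0, 1 and 2 correspond to 'M', 'F' and 'U' in that order; the source does not say which number means what, so treat the mapping (and the `to_canonical_gender` name) as hypothetical.

```python
# Hypothetical mapping: the source does not state which numeric code means what.
NUMERIC_TO_LETTER = {"0": "M", "1": "F", "2": "U"}
CANONICAL = {"M", "F", "U"}


def to_canonical_gender(code):
    """Return 'M', 'F' or 'U', or None when the code fits neither format."""
    key = str(code).strip().upper()
    if key in CANONICAL:
        return key
    return NUMERIC_TO_LETTER.get(key)


# Values as they might arrive from the two systems, plus one bad code.
mixed = ["M", 1, "f", 2, "X", "0"]
normalized = [to_canonical_gender(c) for c in mixed]
print(normalized)  # ['M', 'F', 'F', 'U', None, 'M']
```

Returning `None` for unknown codes keeps non-conforming values visible instead of silently folding them into 'U'.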
Consistency: Do values represent the same meaning?
Example: Is revenue always reported in dollars, or sometimes in euros?
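A quick way to spot this kind of inconsistency is to list the distinct units that actually occur in a column. The snippet below is only a sketch: the record layout with `amount` and `currency` fields is invented for illustration.

```python
def currencies_used(records):
    """Return the set of currency codes found in the records."""
    return {r["currency"] for r in records}


# Hypothetical revenue rows; the field names are assumptions.
revenue = [
    {"amount": 1200.0, "currency": "USD"},
    {"amount": 950.0, "currency": "EUR"},
    {"amount": 430.0, "currency": "USD"},
]

found = currencies_used(revenue)
is_consistent = len(found) <= 1
print(sorted(found), is_consistent)  # ['EUR', 'USD'] False
```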
Accuracy: Do data objects accurately represent the "real-world" values they are expected to model? Misspelled product or person names, incorrect addresses, and even stale or out-of-date data can impact operational and analytical applications.
Example: A customer’s address is a valid USPS address. However, the ZIP code is incorrect and the customer name contains a spelling mistake.
Validity: Do data values fall within acceptable ranges?
Example: Salary values should be between 60,000 and 120,000 for position levels 51 and 52.
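The salary rule can be written as a small range check keyed by position level. This is a sketch under two assumptions: "between" is read as inclusive at both ends, and levels without a rule are reported as unknown rather than invalid. The names `SALARY_RANGES` and `is_valid_salary` are made up for the example.

```python
# Range rule from the example; inclusive bounds are an assumption.
SALARY_RANGES = {
    51: (60_000, 120_000),
    52: (60_000, 120_000),
}


def is_valid_salary(level, salary):
    """Return True or False for a known level, None when no rule is defined."""
    bounds = SALARY_RANGES.get(level)
    if bounds is None:
        return None
    low, high = bounds
    return low <= salary <= high


print(is_valid_salary(51, 75_000))   # True
print(is_valid_salary(52, 130_000))  # False
print(is_valid_salary(40, 50_000))   # None
```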
Duplication: Are there multiple, unnecessary representations of the same data objects within your data set? The inability to maintain a single representation for each entity across your systems poses numerous vulnerabilities and risks. Duplicates are measured as a percentage of the overall number of records. There can be duplicate individuals, companies, addresses, product lines, invoices and so on. The following example depicts duplicate records existing in a data set:
Name | Address | Postal Code | City | State
Mag. Smith | 545 S Valley View D. # 136 | 34563 | <Anytown> | New York
Margaret smith | 545 Valley View ave unit 136 | 34563-2341 | <Anytown> | New-York
Maggie Smith | 545 S Valley View Dr | | <Anytown> | NY.
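To see how the three records above could be recognized as the same person, here is a rough sketch that reduces each record to a crude match key (last name, house number, street words) and groups records by that key. This is not how DQS matches records; the noise-word list, the `match_key` helper and the duplicate-rate formula (redundant records divided by total records) are all assumptions chosen for illustration.

```python
import re
from collections import defaultdict

# Illustrative noise tokens (directionals, street suffixes, unit markers).
# This list is not complete.
NOISE = {"s", "n", "e", "w", "d", "dr", "ave", "unit"}


def match_key(name, address):
    """Build a crude key: last name, house number, remaining street words."""
    last_name = name.lower().split()[-1]
    tokens = re.findall(r"[a-z0-9]+", address.lower())
    house_number = next((t for t in tokens if t.isdigit()), "")
    street = tuple(t for t in tokens if not t.isdigit() and t not in NOISE)
    return (last_name, house_number, street)


records = [
    ("Mag. Smith", "545 S Valley View D. # 136"),
    ("Margaret smith", "545 Valley View ave unit 136"),
    ("Maggie Smith", "545 S Valley View Dr"),
]

# Records that share a key are treated as candidate duplicates.
groups = defaultdict(list)
for name, address in records:
    groups[match_key(name, address)].append(name)

# Assumed definition: redundant records as a share of all records.
duplicate_rate = (len(records) - len(groups)) / len(records)
print(len(groups), f"{duplicate_rate:.1%}")  # 1 66.7%
```

A key this coarse would also merge two different Smiths living at the same house number on Valley View, so real matching usually adds more fields and some form of similarity scoring.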
