DDSM Database

http://figment.csee.usf.edu/Mammography/Database.html

 

University of South Florida
Digital Mammography
Home Page




DDSM: Digital Database for Screening Mammography


The Digital Database for Screening Mammography (DDSM) is a resource for use by the mammographic image analysis research community. Primary support for this project was a grant from the Breast Cancer Research Program of the U.S. Army Medical Research and Materiel Command. The DDSM project is a collaborative effort involving co-PIs at the Massachusetts General Hospital (D. Kopans, R. Moore), the University of South Florida (K. Bowyer), and Sandia National Laboratories (P. Kegelmeyer). Additional cases from Washington University School of Medicine were provided by Peter E. Shile, MD, Assistant Professor of Radiology and Internal Medicine. Additional collaborating institutions include Wake Forest University School of Medicine (Departments of Medical Engineering and Radiology), Sacred Heart Hospital and ISMD, Incorporated.

The primary purpose of the database is to facilitate sound research in the development of computer algorithms to aid in screening. Secondary purposes of the database may include the development of algorithms to aid in diagnosis and the development of teaching or training aids.

The database contains approximately 2,500 studies. Each study includes two images of each breast, along with some associated patient information (age at time of study, ACR breast density rating, subtlety rating for abnormalities, ACR keyword description of abnormalities) and image information (scanner, spatial resolution, ...). Images containing suspicious areas have associated pixel-level "ground truth" information about the locations and types of suspicious regions. Also provided is software both for accessing the mammogram and truth images and for calculating performance figures for automated image analysis algorithms.


 IMPORTANT NOTE:
The DDSM has been extensively used by the research community. It is maintained at the University of South Florida for purposes of keeping it accessible on the web. Additional functionality for DDSM has been created by other research groups and is described in the following publications:

  • Web services for the DDSM and digital mammography research, Chris Rose, Daniele Turi, Alan Williams, Katy Wolstencroft and Chris Taylor, IWDM 2006.

Please click here for information about the special DoD Breast Cancer Research Program mass and calcification datasets mentioned in the 2000 RFP.



The Digital Database for Screening Mammography is organized into "cases" and "volumes." A "case" is a collection of images and information corresponding to one mammography exam of one patient. A "volume" is simply a collection of cases collected together for purposes of ease of distribution. All volumes are available on 8mm tape. Normally all (or almost all) volumes are also available on-line. The README file explaining "everything" about the database is available, and many answers to questions about the database are listed below.

  • What information is included in a case?
    A case consists of between 6 and 10 files. These are an "ics" file, an overview "16-bit PGM" file, four image files that are compressed with lossless JPEG encoding and zero to four overlay files. Normal cases will not have any overlay files. Click here for more detailed information on the files contained in a case.
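
The file counts described above can be checked mechanically. The following is a minimal sketch, not part of the DDSM software: the suffixes in `KIND_BY_SUFFIX`, the example file names, and the function names are all assumptions made for illustration, since the text above names the kinds of files but not their extensions.

```python
from collections import Counter
from pathlib import PurePath

# ASSUMED suffixes for each kind of file; adjust them to match your download.
KIND_BY_SUFFIX = {
    ".ics": "ics",
    ".16_pgm": "overview",
    ".ljpeg": "image",
    ".overlay": "overlay",
}


def classify_case_files(filenames):
    """Count how many files of each kind appear in one case listing."""
    counts = Counter()
    for name in filenames:
        suffix = PurePath(name).suffix.lower()
        counts[KIND_BY_SUFFIX.get(suffix, "unknown")] += 1
    return counts


def check_case(filenames):
    """Return a list of problems; an empty list means the layout looks complete."""
    counts = classify_case_files(filenames)
    problems = []
    if counts["ics"] != 1:
        problems.append(f"expected 1 ics file, found {counts['ics']}")
    if counts["overview"] != 1:
        problems.append(f"expected 1 overview file, found {counts['overview']}")
    if counts["image"] != 4:
        problems.append(f"expected 4 image files, found {counts['image']}")
    if counts["overlay"] > 4:
        problems.append(f"expected at most 4 overlay files, found {counts['overlay']}")
    return problems


# Hypothetical file names for a normal case (no overlay files).
example_case = [
    "A-0001-1.ics",
    "A_0001_1.16_PGM",
    "A_0001_1.LEFT_CC.LJPEG",
    "A_0001_1.LEFT_MLO.LJPEG",
    "A_0001_1.RIGHT_CC.LJPEG",
    "A_0001_1.RIGHT_MLO.LJPEG",
]
print(check_case(example_case))
```

A case that passes this check has between 6 and 10 recognized files, matching the description above. Files with unrecognized suffixes are counted as "unknown" but are not reported as problems.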

  • What is the difference between normal, cancer, benign and benign without callback volumes?
    Each volume is a collection of cases of the corresponding type. Normal cases are formed from a previous normal screening exam (pulled from a file) for a patient with a normal exam at least four years later. A normal screening exam is one in which no further "work-up" was required. Cancer cases are formed from screening exams in which at least one pathology-proven cancer was found. Benign cases are formed from screening exams in which something suspicious was found, but was determined not to be malignant (by pathology, ultrasound or some other means). The term benign without callback is used to identify benign cases in which the benign finding was made without additional films or biopsy. These cases, however, contained something interesting enough for the radiologist to mark. A small number of cancer cases may contain, in addition to one or more regions that are pathology-proven malignant, one or more regions that are unproven. These are suspicious regions for which there is no pathology result. (Click here for more about ground truth.) Information on the BI-RADS lexicon used to encode the descriptions in the DDSM ground truth files is available on the ACR website.

  • How can I search the database?
    DDSM does have a search capability designed to allow you to identify cases that meet specified criteria such as normal/cancer/benign, ACR breast density rating and ACR abnormality keyword description. Click here to try out the search facility.

  • If I use data from DDSM in publications...
    Please credit the DDSM project as the source of the data, and reference:
    • The Digital Database for Screening Mammography, Michael Heath, Kevin Bowyer, Daniel Kopans, Richard Moore and W. Philip Kegelmeyer, in Proceedings of the Fifth International Workshop on Digital Mammography, M.J. Yaffe, ed., 212-218, Medical Physics Publishing, 2001. ISBN 1-930524-00-5.
    • Current status of the Digital Database for Screening Mammography, Michael Heath, Kevin Bowyer, Daniel Kopans, W. Philip Kegelmeyer, Richard Moore, Kyong Chang, and S. MunishKumaran, in Digital Mammography, 457-460, Kluwer Academic Publishers, 1998; Proceedings of the Fourth International Workshop on Digital Mammography.
    Also, please send a copy of your publication to Professor Kevin Bowyer / Computer Science and Engineering / University of Notre Dame / Notre Dame, Indiana 46530.

  • What volumes are available?
    We have 2620 cases available in 43 volumes. The table below summarizes the contents of each volume.
    VOLUME CASES SIZE SCANNER BITS RESOLUTION THUMBNAILS NOTES AVAILABILITY
    normal_01 111 5.8 GB DBA 16 42 microns thumbnails notes ftp
    normal_02 117 6.6 GB DBA 16 42 microns thumbnails notes ftp
    normal_03 38 4.1 GB DBA 16 42 microns thumbnails notes ftp
    normal_04 57 5.1 GB DBA 16 42 microns thumbnails notes ftp
    normal_05 47 4.3 GB DBA 16 42 microns thumbnails notes ftp
    normal_06 60 5.5 GB DBA 16 42 microns thumbnails notes ftp
    normal_07 78 6.2 GB HOWTEK 12 43.5 microns thumbnails notes ftp
    normal_08 27 2.8 GB HOWTEK 12 43.5 microns thumbnails notes ftp
    normal_09 59 4.9 GB LUMISYS 12 50 microns thumbnails notes ftp
    normal_10 23 2.1 GB LUMISYS 12 50 microns thumbnails notes ftp
    normal_11 58 6.1 GB HOWTEK 12 43.5 microns thumbnails notes ftp
    normal_12 20 2.2 GB HOWTEK 12 43.5 microns thumbnails notes ftp
    cancer_01 69 3.9 GB LUMISYS 12 50 microns thumbnails notes ftp
    cancer_02 88 5.7 GB LUMISYS 12 50 microns thumbnails notes ftp
    cancer_03 66 6.0 GB DBA 16 42 microns thumbnails notes ftp
    cancer_04 31 2.8 GB DBA 16 42 microns thumbnails notes ftp
    cancer_05 83 6.6 GB LUMISYS 12 50 microns thumbnails notes ftp
    cancer_06 56 6.3 GB HOWTEK 12 43.5 microns thumbnails notes ftp
    cancer_07 52 6.1 GB HOWTEK 12 43.5 microns thumbnails notes ftp
    cancer_08 55 6.0 GB HOWTEK 12 43.5 microns thumbnails notes ftp
    cancer_09 81 6.5 GB LUMISYS 12 50 microns thumbnails notes ftp
    cancer_10 59 6.6 GB HOWTEK 12 43.5 microns thumbnails notes ftp
    cancer_11 59 5.9 GB HOWTEK 12 43.5 microns thumbnails notes ftp
    cancer_12 80 6.8 GB HOWTEK 12 43.5 microns thumbnails notes ftp
    cancer_13 21 2.0 GB HOWTEK 12 43.5 microns thumbnails notes ftp
    cancer_14 42 4.6 GB HOWTEK 12 43.5 microns thumbnails notes ftp
    cancer_15 72 6.0 GB LUMISYS 12 50 microns thumbnails notes ftp
    benign_01 80 6.5 GB LUMISYS 12 50 microns thumbnails notes ftp
    benign_02 69 6.9 GB HOWTEK 12 43.5 microns thumbnails notes ftp
    benign_03 64 6.7 GB HOWTEK 12 43.5 microns thumbnails notes ftp
    benign_04 81 6.5 GB LUMISYS 12 50 microns thumbnails notes ftp
    benign_05 62 6.5 GB HOWTEK 12 43.5 microns thumbnails notes ftp
    benign_06 74 6.1 GB LUMISYS 12 50 microns thumbnails notes ftp
    benign_07 61 6.1 GB HOWTEK 12 43.5 microns thumbnails notes ftp
    benign_08 64 6.5 GB HOWTEK 12 43.5 microns thumbnails notes ftp
    benign_09 75 6.1 GB HOWTEK 12 43.5 microns thumbnails notes ftp
    benign_10 21 2.1 GB HOWTEK 12 43.5 microns thumbnails notes ftp
    benign_11 62 6.5 GB HOWTEK 12 43.5 microns thumbnails notes ftp
    benign_12 64 6.4 GB HOWTEK 12 43.5 microns thumbnails notes ftp
    benign_13 72 6.1 GB LUMISYS 12 50 microns thumbnails notes ftp
    benign_14 21 2.1 GB LUMISYS 12 50 microns thumbnails notes ftp
    bwc_01 75 6.4 GB LUMISYS 12 50 microns thumbnails notes ftp
    bwc_02 66 5.9 GB LUMISYS 12 50 microns thumbnails notes ftp
    bwc = benign_without_callback
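
For readers who want to work with the table programmatically, here is a small sketch that parses rows in the whitespace-separated layout shown above and totals the case counts per category. The `VolumeRow` class and both function names are hypothetical helpers introduced for illustration; they are not part of any DDSM software.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VolumeRow:
    name: str
    cases: int
    size_gb: float
    scanner: str
    bits: int
    resolution_um: float


def parse_volume_row(line):
    """Parse one table row, ignoring the trailing link labels."""
    # Fields: name, cases, size, "GB", scanner, bits, resolution, "microns", links...
    name, cases, size, _unit, scanner, bits, resolution, *_rest = line.split()
    return VolumeRow(name, int(cases), float(size), scanner, int(bits), float(resolution))


def cases_by_category(rows):
    """Sum case counts per category, where the category is the volume name minus its number."""
    totals = {}
    for row in rows:
        category = row.name.rsplit("_", 1)[0]
        totals[category] = totals.get(category, 0) + row.cases
    return totals


# Three rows copied from the table above.
sample = [
    "normal_07 78 6.2 GB HOWTEK 12 43.5 microns thumbnails notes ftp",
    "cancer_03 66 6.0 GB DBA 16 42 microns thumbnails notes ftp",
    "bwc_01 75 6.4 GB LUMISYS 12 50 microns thumbnails notes ftp",
]
rows = [parse_volume_row(line) for line in sample]
print(cases_by_category(rows))
```

Summing the CASES column over all 43 rows listed above gives 695 normal, 914 cancer, 870 benign and 141 benign-without-callback cases, which adds up to the stated total of 2620.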

  • Do you have a "troubleshooting" section on your web pages?
    Yes. We have compiled a list of frequently asked questions and have provided answers to them. Click here to go to that information.

  • How do I acquire a volume?
    Several volumes will be available by anonymous ftp at any given time (figment.csee.usf.edu in pub/DDSM/cases). You can download individual cases or entire volumes. Occasionally, we will change which volumes are available on line giving preference to the more recently released volumes. The tape option is no longer available.

  • What software is available for working with this data?
    Link Description
    software_v1.1.tar.Z Software for viewing cases in the DDSM database. This code is somewhat outdated. You might try using the software in heathusf_v1.1.0.html for decompressing and converting images and ground truth to other formats.
    Manual.html Documentation on the use of the viewing software.
    JpegInfo.html Source code from the Portable Video Research Group for the lossless JPEG compression program.
    heathusf_v1.1.0.html Source code for software that can be used to extract images and ground truth from DDSM cases. It also includes a mass detection algorithm and performance assessment software. Version 1.1.0 was made available on August 3, 2000. It contains additional source code for a program to display mammography images in X-Windows.
    IWDM 2000 paper outlining use of software.
    M. D. Heath and K. W. Bowyer, "Mass detection by Relative Image Intensity", in The Proceedings of the 5th International Conference on Digital Mammography (Toronto, Canada, June 2000), Medical Physics Publishing (Madison, WI), ISBN 1-930524-00-5.
     

  • Can I preview the cases in a volume?
    Yes, we have made web pages that show "thumbnail" versions of the images. See the table for links to each volume of thumbnails. Each case has a separate web page. On each page, "thumbnail" images are displayed with all of the ground truth markings overlaid on them. The text information from the ics file and all of the overlay files is also provided. Please note that the colors for the overlaid ground truth markings are selected independently for each image. The color of each boundary can be used to index the associated textual information for that marking in the overlay table. Colors are not coordinated across MLO and CC views of the breast.

  • What is the "notes" link in the table of cases?
    The table of cases has a link to a page for each volume. Each page contains additional information about cases, such as presence of pacemaker, implants, skin markers, and other rare occurrences. The notes also contain information on any changes made to the cases after they were released. Although each case is checked thoroughly (and re-checked) before being released, errors may rarely exist in released volumes. When any errors are found, they will be corrected and listed on the notes page for that volume.

  • How do I map grey levels to optical density?
    In some situations, it may be useful to be able to map the grey levels in a mammogram image to optical density values. For example, you may want to run your image analysis software on data sets that were acquired on two different scanners. Since the grey levels in images acquired on different scanners will probably not correspond to the same optical density, you may want to "normalize" the images in some manner prior to processing them.
    Here's how to map grey levels to optical density for images digitized at:
    • DBA scanner at MGH   ('A' and DBA)
    • HOWTEK scanner at MGH   ('A' and HOWTEK)
    • LUMISYS scanner at Wake Forest University   ('B' or 'C' and LUMISYS)
    • HOWTEK scanner at ISMD   ('D' and HOWTEK)
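
The per-scanner calibration formulas live on the linked pages and are not reproduced here. The sketch below only illustrates how such formulas could be organized once you have them: a lookup keyed by the (letter, scanner) pairs listed above. The `CalibrationTable` class, the `linear_curve` helper, and the coefficients in the example are placeholders invented for illustration. They are not the DDSM calibration values, and a real calibration curve need not be linear.

```python
# The (letter, scanner) combinations named in the list above.
CALIBRATION_KEYS = {
    ("A", "DBA"),
    ("A", "HOWTEK"),
    ("B", "LUMISYS"),
    ("C", "LUMISYS"),
    ("D", "HOWTEK"),
}


class CalibrationTable:
    """Maps a (letter, scanner) pair to a grey-level -> optical-density function."""

    def __init__(self):
        self._curves = {}

    def register(self, letter, scanner, curve):
        """Store a calibration function for one listed scanner combination."""
        key = (letter.upper(), scanner.upper())
        if key not in CALIBRATION_KEYS:
            raise KeyError(f"unlisted scanner combination: {key}")
        self._curves[key] = curve

    def to_optical_density(self, letter, scanner, grey_level):
        """Apply the registered function; raises KeyError if none was registered."""
        curve = self._curves[(letter.upper(), scanner.upper())]
        return curve(grey_level)


def linear_curve(intercept, slope):
    """Build OD = intercept + slope * grey_level (an ASSUMED functional form)."""
    def curve(grey_level):
        return intercept + slope * grey_level
    return curve


table = CalibrationTable()
# PLACEHOLDER coefficients, chosen only to make the example run.
table.register("B", "LUMISYS", linear_curve(4.0, -0.001))
print(table.to_optical_density("B", "LUMISYS", 1000))
```

Keeping the curves as plain functions means a non-linear calibration (for example, one involving a logarithm of the grey level) can be registered in the same way as a linear one.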

  • Are statistics available on patient population?
    The largest portion of the DDSM cases comes from the Massachusetts General Hospital (MGH) mammography program. Another substantial portion comes from the Wake Forest University School of Medicine (WFUSM) mammography program. All cases in DDSM are female patients. The general statistical breakdown of patients by race at MGH and WFUSM is:
    RACE             MGH (%)  WFUSM (%)
    Asian            2.06     0.2
    Black            4.12     20.4
    Spanish Surname  6.55     1.8
    American Indian  0.00     0.1
    Other            0.75     0.1
    Unknown          30.34    0.3
    White            56.18    77.0

  • Is there any additional information available on DDSM?
    No. There is no information or support available on DDSM beyond what is listed on these web pages. We regret that support for this project has ended and that we are not able to respond to technical questions. Please see the proceedings of recent instances of the International Workshop on Digital Mammography for examples of how DDSM has been used in research.

  • Do you have anonymous ftp access statistics available?
    Yes. We have a page displaying a graph showing the amount of data downloaded from DDSM (pub/DDSM/cases) by anonymous ftp each week. Click here to view the graph.

  • Are there other mammography resources on this web site?
    Yes. They have been moved to our "Other Resources" page.



Note: The Digital Database for Screening Mammography (DDSM) was developed through a grant from the DOD Breast Cancer Research Program, U.S. Army Medical Research and Materiel Command, grant DAMD17-94-J-4015.
