MIA4LUNG - Dataset for segmentation

Dataset for lung segmentation

MIA4LUNG Dataset

To build the experimental database, we drew on publicly available, annotated chest X-ray databases whose images differ in visual characteristics, with the aim of improving the model's generalization capability. We used images from four public databases: JSRT, miniJSRT, Montgomery, and the COVID-19 Radiography Database.

JSRT

This dataset was created by the Japanese Society of Radiological Technology, and all the information related to it can be found at this link. It comprises 247 images: 154 with lung nodules and 93 without nodules. Every image has a mask marking the position of the lungs.

miniJSRT

This dataset was also created by the Japanese Society of Radiological Technology, and all the information related to it can be found at this link. It comprises 60 images, each with a corresponding lung mask.

Montgomery

This dataset was generated by the Department of Health and Human Services, Montgomery County, Maryland, USA and Shenzhen No. 3 People’s Hospital in China, and all the information related to it can be found at this link. It comprises 138 images: 58 from patients diagnosed with tuberculosis and 80 normal. Each image has two masks, one marking the left lung and one marking the right lung.
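Because the other datasets provide a single lung mask per image, the separate left and right Montgomery masks usually need to be combined before they can be used alongside them. The following is a minimal sketch of one way to do that, assuming the two masks are same-sized arrays in which non-zero pixels mark lung tissue. The function name `merge_lung_masks` and the toy arrays are hypothetical and are not taken from the MIA4LUNG code.

```python
import numpy as np


def merge_lung_masks(left_mask, right_mask):
    """Combine separate left- and right-lung masks into one binary mask.

    Assumption: both inputs are arrays of the same shape in which any
    non-zero pixel marks lung tissue (for example 0/255 mask images).
    """
    left = np.asarray(left_mask) > 0
    right = np.asarray(right_mask) > 0
    if left.shape != right.shape:
        raise ValueError("left and right masks must have the same shape")
    # A pixel belongs to the lung field if it is set in either mask.
    return np.logical_or(left, right).astype(np.uint8)


# Toy 2x4 masks standing in for real mask images (hypothetical values).
left = np.array([[255, 255, 0, 0], [255, 0, 0, 0]], dtype=np.uint8)
right = np.array([[0, 0, 0, 255], [0, 0, 255, 255]], dtype=np.uint8)
combined = merge_lung_masks(left, right)
print(combined)
```

The result is a 0/1 mask; multiply it by 255 if the downstream pipeline expects 8-bit mask images.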

COVID-19 Radiography Database

The COVID-19 Radiography Database originally consisted of 21165 images: 3616 COVID-19, 10192 normal, 6012 lung opacity, and 1345 viral pneumonia. Because of the technical limitations of processing such a large number of images, we selected a subset of 2555 images, which brings the combined database to 3000 images (247 + 60 + 138 + 2555). All the information related to it can be found at this link.
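The subset size follows directly from the counts given above, and the arithmetic can be checked with a short sketch. The text does not say how the 2555 images were chosen, so the uniform random sampling with a fixed seed shown here is only an assumption, and the image identifiers are hypothetical placeholders rather than real file names.

```python
import random

# Image counts stated in the text for the three smaller databases.
SOURCE_COUNTS = {"JSRT": 247, "miniJSRT": 60, "Montgomery": 138}
TARGET_TOTAL = 3000

# Class counts stated for the full COVID-19 Radiography Database.
COVID_CLASS_COUNTS = {
    "COVID-19": 3616,
    "Normal": 10192,
    "Lung opacity": 6012,
    "Viral pneumonia": 1345,
}


def covid_subset_size(source_counts, target_total):
    """Number of images still needed to reach the target total."""
    return target_total - sum(source_counts.values())


def sample_subset(image_ids, k, seed=0):
    """Draw k distinct items uniformly at random (assumed strategy)."""
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    return sorted(rng.sample(image_ids, k))


subset_size = covid_subset_size(SOURCE_COUNTS, TARGET_TOTAL)

# Hypothetical identifiers standing in for the real image file names.
all_ids = [f"image_{i:05d}" for i in range(sum(COVID_CLASS_COUNTS.values()))]
subset = sample_subset(all_ids, subset_size)
print(len(all_ids), subset_size, len(subset))
```

A stratified draw that keeps the four class proportions would be an equally plausible choice; the sketch only illustrates how the subset size is derived and how a reproducible selection could be made.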