晓彤

Kvasir-SEG Data (Polyp segmentation & detection)

health

6

Sold: 0
144.38MB

Data ID: D17222397652417884

Published: 2024/07/29

The following is the data verification report that the seller chose to provide:

Data description

Kvasir-SEG information:

The Kvasir-SEG dataset (size 46.2 MB) contains 1000 polyp images and their corresponding ground truth from the Kvasir Dataset v2. Image resolution in Kvasir-SEG varies from 332x487 to 1920x1072 pixels. The images and their corresponding masks are stored in two separate folders, and each mask has the same filename as its image. The image files are encoded using JPEG compression, which facilitates online browsing. The open-access dataset can be easily downloaded for research and educational purposes.
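Because images and masks share filenames across two folders, pairing them is a simple lookup. The following is a minimal sketch using only the Python standard library. The function name `pair_images_and_masks` is hypothetical, and the folder paths are passed in by the caller because the description does not state the actual directory names.

```python
from pathlib import Path


def pair_images_and_masks(image_dir, mask_dir):
    """Return sorted (image_path, mask_path) pairs that share a filename.

    image_dir and mask_dir are supplied by the caller; the real folder
    names in the downloaded archive are not given in the description.
    """
    image_dir, mask_dir = Path(image_dir), Path(mask_dir)
    pairs = []
    for image_path in sorted(image_dir.iterdir()):
        mask_path = mask_dir / image_path.name
        # Keep only images that have a mask with the identical filename.
        if image_path.is_file() and mask_path.is_file():
            pairs.append((image_path, mask_path))
    return pairs
```

Images without a matching mask are skipped silently here; a stricter loader might raise an error instead.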

Applications of the Dataset

The Kvasir-SEG dataset is intended for researching and developing new and improved methods for segmentation, detection, localization, and classification of polyps. Multiple datasets are a prerequisite for comparing computer vision-based algorithms, and this dataset is useful both as a training dataset and as a validation dataset. Such datasets can assist the development of state-of-the-art solutions for images captured by colonoscopes from different manufacturers. Further research in this field has the potential to help reduce the polyp miss rate and thus improve examination quality. The Kvasir-SEG dataset is also suitable for general segmentation and bounding box detection research. In this context, it can accompany several other datasets from a wide range of fields, both medical and otherwise.

Ground Truth Extraction

We uploaded the entire Kvasir polyp class to Labelbox and created all the segmentations using this application. Labelbox is a tool used for labeling the region of interest (ROI) in image frames, i.e., the polyp regions in our case. We manually annotated and labeled all 1000 images with the help of medical experts. After annotation, we exported the files to generate masks for each annotation. The exported JSON file contained all the information about the image and the coordinate points for generating the mask. To create a mask, we used the ROI coordinates to draw contours on an empty black image and filled the contours with white. The generated masks are 1-bit color depth images. The pixels depicting polyp tissue, the region of interest, are represented by the foreground (white mask), while the background (in black) does not contain positive pixels. Some of the original images contain the image of the endoscope position marking probe, ScopeGuide™ (Olympus, Tokyo, Japan), located in one of the bottom corners and seen as a small green box. As this information is superfluous for the segmentation task, we have replaced these with black boxes in the Kvasir-SEG dataset.
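The mask-generation step described above (filling an ROI contour on an empty black image) can be illustrated with a small sketch. This is not the authors' code: it is a standard-library scanline fill written for illustration, the function name `polygon_to_mask` is hypothetical, and the polygon is assumed to arrive as a list of `(x, y)` points, since the layout of the exported JSON is not described. In practice an image library such as OpenCV would normally do the filling.

```python
def polygon_to_mask(points, width, height):
    """Rasterize one closed polygon into a binary mask (1 = polyp, 0 = background).

    points: list of (x, y) vertices (assumed input format).
    A pixel is foreground when its centre lies inside the polygon
    according to the even-odd rule.
    """
    mask = [[0] * width for _ in range(height)]  # the "empty black image"
    n = len(points)
    for row in range(height):
        y = row + 0.5  # sample at the pixel centre
        # Collect the x positions where polygon edges cross this scanline.
        crossings = []
        for i in range(n):
            x0, y0 = points[i]
            x1, y1 = points[(i + 1) % n]
            if (y0 <= y) != (y1 <= y):  # edge straddles the scanline
                crossings.append(x0 + (y - y0) * (x1 - x0) / (y1 - y0))
        crossings.sort()
        # Fill between successive pairs of crossings ("white" pixels).
        for left, right in zip(crossings[0::2], crossings[1::2]):
            for col in range(width):
                if left <= col + 0.5 < right:
                    mask[row][col] = 1
    return mask
```

Storing 0 and 1 mirrors the 1-bit depth of the published masks; an image with several polyps would need one fill per polygon, combined with a logical OR.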

Suggested Metrics

There are different metrics for evaluating the performance of architectures on an image segmentation dataset. For medical image segmentation tasks, the most commonly used are the Dice coefficient and Intersection over Union (IoU). Based on related work in this field, we have used these metrics to evaluate the algorithms, and we encourage their use in future work for evaluating model performance. It may be even better to report as many metrics as possible so that models can be compared fairly.
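For reference, both metrics reduce to pixel counts on a pair of binary masks: Dice is twice the overlap divided by the sum of the two foreground areas, and IoU is the overlap divided by the union. The sketch below assumes masks are given as nested lists of 0/1 values; the function name `dice_and_iou` and the handling of two empty masks are illustrative choices, not something the dataset description specifies.

```python
def dice_and_iou(pred, target):
    """Compute (Dice, IoU) for two equally sized binary masks (nested lists of 0/1)."""
    intersection = pred_sum = target_sum = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            intersection += 1 if (p and t) else 0
            pred_sum += 1 if p else 0
            target_sum += 1 if t else 0
    union = pred_sum + target_sum - intersection
    if union == 0:
        # Both masks are empty. Scoring this as perfect agreement is a
        # convention assumed here, not part of the metric definitions.
        return 1.0, 1.0
    dice = 2 * intersection / (pred_sum + target_sum)
    iou = intersection / union
    return dice, iou
```

The two scores are monotonically related (Dice = 2·IoU / (1 + IoU) for a single mask pair), so they rank individual predictions identically, although their averages over a dataset can differ.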

The bounding boxes (coordinate points) for the corresponding images are stored in a JSON file. This dataset is designed to push the state of the art in polyp detection.
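A bounding box can also be derived directly from a segmentation mask, which is a useful cross-check against the supplied JSON. The sketch below is an illustration only: the function name `mask_to_bbox` and the keys `xmin`, `ymin`, `xmax`, `ymax` are hypothetical, because the schema of the dataset's JSON file is not described here. It returns a single box around all foreground pixels, so an image with several polyps would need connected-component labeling first.

```python
def mask_to_bbox(mask):
    """Return the tightest box around all foreground pixels, or None if the mask is empty.

    mask: nested lists of 0/1. The returned key names are hypothetical,
    not the dataset's JSON schema. Coordinates are inclusive pixel indices.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        return None
    cols = [c for row in mask for c, value in enumerate(row) if value]
    return {"xmin": min(cols), "ymin": rows[0], "xmax": max(cols), "ymax": rows[-1]}
```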
