Discuss our experiences and thoughts on the DICOM SEG standard.
Compare notes, benchmarks, and experience with interoperability and performance of DICOM SEG instances across platforms. Evaluate the extent to which any observed performance issues are inherent in the format or simply inefficient implementations. Consider proposals to improve the standard to address any inherent issues.
Performed timings with various methods to load segmentations in Slicer
We had several conversations about the importance of DICOM for organizing derived data from quantitative analysis, which underlined the need for efficient implementations.
In discussion with machine learning researchers, e.g. developers and users of tools like TotalSegmentator, we heard that the number of segments is set to increase rapidly, perhaps doubling within months to 200 or more, with over 1000 segments expected within a year.
Example code to load with pydicom-seg
In the Slicer 5.2.1 Python console:
try:
    import pydicom_seg
except ModuleNotFoundError:
    pip_install("pydicom_seg")
import pydicom
import pydicom_seg
import SimpleITK as sitk
dcm = pydicom.dcmread('/Users/pieper/slicer/latest/pydicom-seg/ABD_LYMPH_008_SEG.dcm') # 19 seconds
reader = pydicom_seg.MultiClassReader()
result = reader.read(dcm)
image_data = result.data # directly available
image = result.image # lazy construction
sitk.WriteImage(image, '/tmp/segmentation.nrrd', True)
seg = slicer.util.loadSegmentation('/tmp/segmentation.nrrd')
for segmentID in seg.GetSegmentation().GetSegmentIDs():
    segmentIndex = int(segmentID.split("_")[1])
    description = result.segment_infos[segmentIndex].SegmentDescription
    seg.GetSegmentation().GetSegment(segmentID).SetName(description)
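The "19 seconds" annotation above was presumably measured by hand. To compare loading methods more systematically, a small timing helper along the following lines could be used. This is only a sketch: `timed` and `timings` are hypothetical names that belong to neither Slicer nor pydicom-seg, and the workload shown is a stand-in for the actual read calls.

```python
import time
from contextlib import contextmanager

timings = {}  # hypothetical bookkeeping: label -> elapsed seconds


@contextmanager
def timed(label):
    """Record the wall-clock time spent inside the with-block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[label] = time.perf_counter() - start


# Stand-in workload; in Slicer the block would wrap calls such as
# pydicom.dcmread(...) or reader.read(dcm) from the example above.
with timed("example"):
    total = sum(range(100_000))

for label, seconds in timings.items():
    print(f"{label}: {seconds:.4f} s")
```

Wrapping each step (file read, decoding, writing the temporary nrrd, loading into Slicer) in its own `with timed(...)` block would show where the time actually goes.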
The DICOM SEG standard has been around for several years and has been implemented as part of several tools in various languages:
While interoperability has generally been good, these SEG implementations have in general been orders of magnitude slower than research formats (e.g. nii.gz, nrrd, or seg.nrrd) at supporting segmentation use cases such as using segmentation data for machine learning. For example, this notebook shows that decoding a TotalSegmentator result from DICOM SEG with approximately 100 segments can take several minutes and consume very large amounts of memory for a segmentation that takes less than a second to read from a research format.
Poor performance is due to at least two factors:
We are interested in how the benefits of DICOM (standardized encoding, rich metadata, coded concepts, etc.) can coexist with efficient read-write performance for real-world use cases.
A DICOM SEG may contain many segments (elsewhere known as “classes” or “labels”). But these segments are each stored in separate frames in the segmentation as multiple binary masks (0 or 1 everywhere). This is in contrast to many other formats that use a “label map” style encoding in which a single array contains many segments using pixel values to represent membership of a segment (e.g. pixel value 1 for segment 1, pixel value 2 for segment 2). Using separate frames does confer two important advantages over the label map approach: segments may overlap, and each segment may hold fractional rather than purely binary values.
However, this also comes at a steep cost for what is arguably the overwhelmingly common use case of non-overlapping non-fractional multi-segment segmentations. Especially in the case of a large number of segments (such as the TotalSegmentator mentioned above), this can lead to a very large number of frames and makes the memory/storage utilization much higher than would be necessary with a “label map” style. When you imagine doing instance segmentation of cells in a whole slide image, this becomes completely untenable.
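The difference between the two encodings can be illustrated with NumPy. This is a minimal sketch on synthetic data, not how any of the tools mentioned here implement it: the function names are made up for illustration, the volume is random, and the size estimate assumes one bit per pixel for binary frames and that no empty frames are omitted.

```python
import numpy as np


def labelmap_to_binary_frames(labelmap, num_segments):
    """Expand a (Z, Y, X) label map into one binary mask per segment."""
    segment_numbers = np.arange(1, num_segments + 1, dtype=labelmap.dtype)
    # Broadcast (1, Z, Y, X) against (S, 1, 1, 1) to get (S, Z, Y, X).
    masks = labelmap[np.newaxis, ...] == segment_numbers[:, None, None, None]
    return masks.astype(np.uint8)


def binary_frames_to_labelmap(frames):
    """Collapse non-overlapping binary masks (S, Z, Y, X) into a label map."""
    if (frames.sum(axis=0) > 1).any():
        raise ValueError("segments overlap; a label map cannot represent them")
    segment_numbers = np.arange(1, frames.shape[0] + 1, dtype=np.uint8)
    return (frames * segment_numbers[:, None, None, None]).max(axis=0)


# Synthetic volume with 100 segments (0 is background).
rng = np.random.default_rng(0)
labelmap = rng.integers(0, 101, size=(8, 32, 32), dtype=np.uint8)

frames = labelmap_to_binary_frames(labelmap, num_segments=100)
packed_bytes = np.packbits(frames).nbytes  # one bit per pixel

print("label map bytes:    ", labelmap.nbytes)
print("binary frames bytes:", packed_bytes)
print("round trip ok:      ", np.array_equal(binary_frames_to_labelmap(frames), labelmap))
```

Even with bit packing, the per-segment frames grow linearly with the number of segments, whereas the label map stays the same size until the pixel type has to be widened.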
It has been proposed that this could be solved relatively simply by adding a new Segmentation Type (e.g. “LABELMAP”) in addition to the existing “BINARY” and “FRACTIONAL”. This is not a formal proposal at this stage.
There is a highdicom draft implementation of what this could look like.
One issue is that SEG images are currently limited to 8 bits per pixel, which would limit the number of segments representable in “LABELMAP” style to 255. This may not be high enough for some applications (e.g. instance segmentation). A proposal on “label map” encoding should consider whether this limitation should be relaxed.
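The arithmetic behind the 255-segment limit, and what wider pixels would allow, can be sketched as follows. The 16-bit and 32-bit options are assumptions about what a relaxed limit might permit, not something defined on this page, and the function names are hypothetical.

```python
def max_segments(bits_allocated):
    """Largest segment number storable when value 0 is reserved for background."""
    return 2 ** bits_allocated - 1


def bits_needed(num_segments, allowed=(8, 16, 32)):
    """Smallest allowed pixel width (in bits) that can hold `num_segments` labels."""
    for bits in allowed:
        if num_segments <= max_segments(bits):
            return bits
    raise ValueError(f"{num_segments} segments do not fit in {max(allowed)} bits")


for n in (100, 255, 256, 1000, 100_000):
    print(f"{n:>7} segments -> {bits_needed(n)} bits per pixel")
```

Under these assumptions, the roughly 1000 segments anticipated above would already require 16 bits per pixel.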
Fractional SEGs are quantized and stored as integers. As mentioned above, the bits allocated is currently limited to a maximum of 8. This means that fractional segmentations are quantized to at most 256 values, which is a lower level of precision than users would generally expect.
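The precision loss can be made concrete with a small round-trip experiment. This sketch assumes that fractions in [0, 1] are scaled linearly to the full 8-bit range (a maximum stored value of 255) and rounded to the nearest integer; actual encoders may choose a different maximum or rounding rule.

```python
MAX_STORED_VALUE = 255  # assumption: the full 8-bit range is used


def quantize(fraction):
    """Map a fraction in [0, 1] to an integer in [0, MAX_STORED_VALUE]."""
    return round(fraction * MAX_STORED_VALUE)


def dequantize(stored):
    """Map a stored integer back to a fraction in [0, 1]."""
    return stored / MAX_STORED_VALUE


fractions = [i / 10_000 for i in range(10_001)]
levels = {quantize(f) for f in fractions}
max_error = max(abs(dequantize(quantize(f)) - f) for f in fractions)

print("distinct stored values:", len(levels))
print("largest round-trip error:", max_error)
```

The worst-case error is half a quantization step, i.e. about 0.002 under these assumptions, compared with the roughly 7 significant decimal digits of a 32-bit float.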
Even when encoded as a label map, uncompressed data is an inefficient way to store segmentations. A typical nii.gz or .seg.nrrd file is compressed with gzip and can be 100 or more times smaller than the source data due to redundancy in the segmentation data (large areas of uniform segmentation or repeating patterns that can be more efficiently represented by short codes). DICOM currently offers some options for this, such as RLE, but as yet they have not been widely supported in currently used open source tools.
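The effect of this redundancy is easy to reproduce with the standard library. The sketch below compresses a synthetic label map made of large uniform regions; real segmentations have more irregular boundaries, so the ratio obtained here is optimistic and only illustrates the principle.

```python
import gzip

SLAB_BYTES = 64 * 1024  # pixels per uniform region (synthetic)
NUM_LABELS = 16

# 1 MiB label map: 16 uniform slabs, one byte per pixel, labels 0..15.
raw = b"".join(bytes([label]) * SLAB_BYTES for label in range(NUM_LABELS))

compressed = gzip.compress(raw)
ratio = len(raw) / len(compressed)

print(f"uncompressed: {len(raw)} bytes")
print(f"gzip:         {len(compressed)} bytes")
print(f"ratio:        {ratio:.0f}x")

# Compression is lossless: the original bytes are recovered exactly.
assert gzip.decompress(compressed) == raw
```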
There are repeated reports of interoperability issues between segmentations created with highdicom and viewed in OHIF. See this issue.
Multiple users of highdicom have been asking for support for 2D+T files. This is possible but not straightforward due to the need to create a dimension organization methodology that includes time as a dimension. Due to time limitations this has not been a priority for highdicom but remains an open issue. See
A broader issue is whether these would be understood by viewing software unless the dimension organization method is standardized to some extent.
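To make the interoperability concern concrete, here is one conceivable way to index the frames of a 2D+T segmentation. It is purely illustrative and not what highdicom or any viewer does: the choice of dimensions and their order is an assumption, and the two strings are used only as labels for those dimensions.

```python
from itertools import product

# Hypothetical dimension organization: segment first, then time point.
DIMENSIONS = ("ReferencedSegmentNumber", "TemporalPositionIndex")


def dimension_index_values(num_segments, num_time_points):
    """Return one 1-based (segment, time) index tuple per frame, segment-major."""
    return list(product(range(1, num_segments + 1), range(1, num_time_points + 1)))


def frame_lookup(index_values):
    """Map each (segment, time) index tuple to its 0-based frame position."""
    return {values: frame for frame, values in enumerate(index_values)}


values = dimension_index_values(num_segments=2, num_time_points=3)
lookup = frame_lookup(values)

print(DIMENSIONS)
for frame, (segment, time_point) in enumerate(values):
    print(f"frame {frame}: segment {segment}, time point {time_point}")
```

A writer that chose the opposite order (time first, then segment) would produce an equally valid file, which is why a reader cannot rely on a particular layout unless the organization is standardized or fully described in the file.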