
Medical Data (DICOM) Annotation in Computer Vision Tool



The second decade of the 21st century has shown that artificial neural networks are useful for solving problems that are difficult to formalize with classic methods. These solutions have found many applications in modern life: they are used in security, manufacturing, and business process automation, or simply to make everyday life easier. A significant share of neural network-based solutions belongs to the Computer Vision area, where the main data are visual (pictures, videos, depth maps). When AI researchers develop neural network algorithms, they often face a shortage of reliable training data (ground truth examples). It is no secret that the amount of such data influences the prediction quality of the final model. Here at Intel®, we use different data sources. One of them is preparing data ourselves, with the help of our internal, professional data annotation team. To simplify this process, our team has been developing dedicated software, the Computer Vision Annotation Tool (CVAT), whose source code is available on GitHub. The software is a part of the Intel® Distribution of OpenVINO™ toolkit ecosystem, a toolkit for quick deployment of efficient applications that use different deep learning models. By now, CVAT has gained considerable popularity among regular and commercial users across the world.

A piece of history

Once, after the New Year holidays, one of our customers contacted us asking whether CVAT can work with medical images. One of the most popular ways to store medical data is .dcm files, whose internals are specified by the Digital Imaging and Communications in Medicine (DICOM) standard. CVAT was never designed to support this format, although some of our customers were able to use it for this purpose. For example, it has been used in the development of the NerveTrack product. There were also independent attempts by customers to change the CVAT source code to support the DICOM format (one such instance is described here). In that source, the customer highlighted the following advantages of using CVAT for these purposes: it is open-source software that can be adjusted if necessary; confidential data can be stored locally on internal servers; and it can be deployed in a local or corporate network and used via a browser. The last item is crucial because ordinary users often don’t have advanced system administration knowledge, and it is hard for them to install and set up such tools on their own.

Since then, we have decided that CVAT should be able to work with DICOM data, and that we should provide a convenient user interface for it. We believe this is important because, firstly, the role of AI in medicine is quite significant. This means there is high demand in the community for tools such as CVAT, especially during the COVID-19 pandemic. Besides, we didn't find annotation tools on the software market that label DICOM data and offer a similar set of features (complex, web-based, free, open-source). There are various interesting solutions that provide features to process, store, maintain, and view DICOM data (pydicom, cornerstonejs, orthanc-server, different DICOM viewers), simple annotation tools, and complex solutions with many data annotation features, but the latter are commercial and come with certain restrictions.

Not so easy, guys

Our team conducted research and developed possible high-level design solutions (how DICOM support could be integrated into the existing CVAT architecture). In the process, we faced two main problems which, combined, present a significant obstacle to further development. The first issue is that the DICOM standard allows a tremendous variety of .dcm files. For example, the standard assumes that 79 DICOM modalities exist, each defining a data representation or content (CT: computed tomography, CR: computed radiography, LEN: lensometry, MR: magnetic resonance, ...). Furthermore, the DICOM standard specifies a huge number of different attributes, or tags. Some of them are general, others depend on the modality. A .dcm file can include one image, or several (a slice), or several slices. The data in these images often cannot be interpreted as regular pixels; they may instead be physical values measured by a certain device. Finally, DICOM is not the only format for representing medical data. Supporting all these scenarios would be an overwhelming task for our team. And is it actually necessary?
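To make the point about non-pixel data concrete, here is a minimal Python sketch of how stored values can be mapped to physical values and then to 8-bit gray levels. It assumes the Rescale Slope and Rescale Intercept attributes that the DICOM standard defines for modalities such as CT. The function names, the sample numbers, and the simplified linear window are illustrative assumptions and are not taken from CVAT.

```python
def stored_to_physical(stored, slope=1.0, intercept=0.0):
    """Map a stored value to a physical one (e.g. Hounsfield units for CT)."""
    return stored * slope + intercept


def window_to_uint8(value, center, width):
    """Clamp a physical value to a window and scale it to 0..255.

    Simplified linear window; the DICOM standard defines a slightly
    different formula with half-unit offsets.
    """
    low = center - width / 2
    high = center + width / 2
    if value <= low:
        return 0
    if value >= high:
        return 255
    return round((value - low) / (high - low) * 255)


# Hypothetical CT row: stored values with slope 1 and intercept -1024.
stored_row = [0, 1024, 1064, 4000]
physical_row = [stored_to_physical(v, 1.0, -1024.0) for v in stored_row]

# Illustrative window (center 40, width 400); real values depend on the study.
gray_row = [window_to_uint8(v, center=40, width=400) for v in physical_row]

print(physical_row)
print(gray_row)
```

The same stored number can therefore end up as a very different gray level depending on the attributes of the file and the chosen window, which is one reason a generic converter is hard to get right for every modality.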

Moving on to the second issue. CVAT features are usually developed in response to customer needs. For example, a significant part of CVAT features was added to fulfill requests of our internal data annotation team, which prepared data to train many models in the OpenVINO™ Open Model Zoo, a set of accurate and highly optimized deep learning models. The approach implies an interested customer who asks to annotate a certain dataset in a certain way and expects results in a certain format. We then solve a specific and well-understood task instead of trying to predict what exactly our customers need. It should be noted that CVAT has never been presented as a tool for annotating medical data, so the existing CVAT community is not very interested in supporting such features. This is why we do not have enough ideas about further development. To tell the truth, one of the reasons for writing this article is to gather details about the actual problems the community deals with and the features it would like to see. So, if you are interested in helping with further CVAT development, please contact us via GitHub or by email (you can find all contact information at the end of the article).

Nevertheless, how to..?

This article would not live up to its title if it ended with the previous paragraph. Despite the difficulties described, we were able to design a quick solution for annotating DICOM files in CVAT. The solution might not be the most convenient, but it can be applied in some simple cases. As noted above, DICOM files vary widely, but a significant part of this problem can be solved with good ready-made solutions, including open-source ones. So, we used pydicom, a Python library, to prepare a script that converts DICOM files to regular images. Below we demonstrate how you can use it. A command-line interface for the conversion is in the CVAT repository here. It is available in release 1.4. The commands below were tested on Ubuntu 20.04, but these are generally simple actions that could also be done using the graphical user interface of Windows or other systems.
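As a rough illustration of what any such conversion has to look at first, the sketch below reads a few header attributes with pydicom. It uses pydicom's dcmread function and standard DICOM keywords; the helper name describe_dicom and the selection of attributes are assumptions made for this example and are not taken from the CVAT script.

```python
def describe_dicom(path):
    """Return a few header attributes that affect how a file is converted."""
    # Imported here so this sketch can be loaded even where pydicom is absent.
    import pydicom

    # stop_before_pixels skips the (possibly large) pixel data; only the
    # header is needed for this summary.
    ds = pydicom.dcmread(str(path), stop_before_pixels=True)
    return {
        "modality": ds.get("Modality", "unknown"),
        "rows": ds.get("Rows"),
        "columns": ds.get("Columns"),
        # NumberOfFrames is normally present only in multi-frame files.
        "frames": int(ds.get("NumberOfFrames", 1) or 1),
        "bits_stored": ds.get("BitsStored"),
        "photometric": ds.get("PhotometricInterpretation"),
    }


# Usage (hypothetical path):
# print(describe_dicom("path/to/file.dcm"))
```

A summary like this shows early whether a file is single-frame or multi-frame and how its pixel values are encoded, before any images are written.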

It is assumed that some tools are installed in the system. To install them on Ubuntu, you can use the following command:

sudo apt install curl zip unzip python3 python3-pip python3-venv git

The first step is to clone the CVAT repository if you haven't done so yet:

git clone --branch v1.4.0 cvat && cd cvat

The next step is to change the current working directory to the script directory. It is also recommended to create a Python virtual environment to avoid polluting the system with dependencies. After that, install the dependencies:

cd utils/dicom_converter/
python3 -m venv .env
. .env/bin/activate
pip install -r requirements.txt

Now we need a dataset to convert. For example, let's take the CHAOS dataset (Combined (CT-MR) Healthy Abdominal Organ Segmentation). You can download it manually using the links below, or with the curl tool in Ubuntu:

curl -L --output 
curl -L --output

Unzip the fetched data using a GUI or the Ubuntu CLI:

unzip -d CHAOS_Test
unzip -d CHAOS_Train

Now we can run the script with the right arguments to convert the dataset. The first argument is the root directory with the source data. The second argument is the root directory for the converted files:

python3 CHAOS_Train CHAOS_Train_converted
python3 CHAOS_Test CHAOS_Test_converted

Note: The script searches for DICOM files recursively. The directory tree is preserved in the results.

Note: If a DICOM file contains multiple images, the resulting images get a suffix with the image number. For example, a multi-frame DICOM file called 055829-00000000.dcm is converted to a set of files 055829-00000000_000.png, 055829-00000000_001.png, ...
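The two notes above can be sketched in Python as follows. This is only an illustration of the described behavior, not the script's actual code: the function names are hypothetical, and the three-digit frame suffix is inferred from the example file names.

```python
from pathlib import Path


def find_dicom_files(src_root):
    """Recursively collect .dcm files under src_root (case-sensitive match)."""
    return sorted(Path(src_root).rglob("*.dcm"))


def output_paths(dcm_path, src_root, dst_root, num_frames):
    """Mirror the source tree under dst_root and name the PNG outputs."""
    relative = Path(dcm_path).relative_to(src_root)
    target = Path(dst_root) / relative
    if num_frames == 1:
        # Single image: same name, .png extension.
        return [target.with_suffix(".png")]
    # Multi-frame file: one PNG per frame with a zero-padded suffix.
    return [
        target.with_name(f"{target.stem}_{index:03d}.png")
        for index in range(num_frames)
    ]


# Hypothetical example: a two-frame file in a nested directory.
for path in output_paths(
    "CHAOS_Train/a/055829-00000000.dcm", "CHAOS_Train", "CHAOS_Train_converted", 2
):
    print(path.as_posix())
```

Keeping the relative path intact makes it easy to trace each converted image back to the DICOM file it came from.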

After the script completes, regular images that CVAT can work with are in the directories CHAOS_Train_converted and CHAOS_Test_converted. Depending on your requirements, you can take part of the data or all of it and create an annotation task in CVAT. For convenience, let's zip the converted files back using a GUI or the Ubuntu CLI:

zip -r CHAOS_Train_converted/
zip -r CHAOS_Test_converted/

Another way is to skip zipping and move the converted files to the CVAT shared storage (if it is connected), then use the corresponding files tab in the CVAT interface when creating a task. Alternatively, simply upload them with standard browser capabilities from any place in the filesystem if the images form a flat list (uploading a directory tree is not yet supported).
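If the converted images sit in nested directories and you want to upload them through the browser, one option is to copy them into a single flat directory first. The helper below is a hypothetical sketch (the flatten_tree name and the double-underscore separator are choices made for this example, not a CVAT feature). It encodes the relative path in each file name and refuses to overwrite a file if two names collide.

```python
import shutil
from pathlib import Path


def flatten_tree(src_root, dst_dir, pattern="*.png"):
    """Copy matching files from a directory tree into one flat directory.

    dst_dir should be located outside src_root.
    """
    src_root = Path(src_root)
    dst_dir = Path(dst_dir)
    dst_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src_root.rglob(pattern)):
        if not path.is_file():
            continue
        # "a/b/img.png" becomes "a__b__img.png".
        flat_name = "__".join(path.relative_to(src_root).parts)
        destination = dst_dir / flat_name
        if destination.exists():
            raise FileExistsError(f"name collision: {flat_name}")
        shutil.copy2(path, destination)
        copied.append(flat_name)
    return copied


# Usage (hypothetical directory names):
# flatten_tree("CHAOS_Train_converted", "CHAOS_Train_flat")
```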

Then you need to install CVAT if you haven't done so yet. A detailed installation and configuration guide for your operating system can be found in the documentation pages.

After installing CVAT, let's create a project called DICOM that includes a couple of abstract labels to annotate (we are not medical experts and do not know what should be annotated in the dataset, so this is just an example). Next, open the project and create a couple of tasks inside it, using the two archives created above:

figure 1
figure 2

Now we are going to annotate abstract instances on the images:

figure 3
figure 4

Manually annotated image

Besides manual annotation, we can use semi-automatic methods:

Semi-automatic segmentation using the OpenCV JavaScript library looks efficient in this scenario because the annotated objects have high contrast, which allows the algorithm to work quite accurately. Here is one more example of an annotated image:

figure 5

Image, annotated using semi-automatic methods

Finally, when our data are annotated, we can export the results, for example as PNG masks. CVAT supports a large number of different formats, but in our opinion masks are quite popular when working with DICOM data.

figure 6

The final file contains masks and other useful information:

figure 7
figure 8
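For readers who want to post-process the exported masks, here is a small Python sketch. It assumes the archive contains a labelmap.txt file with lines of the form name:r,g,b:parts:actions, which is our understanding of CVAT's segmentation mask export; check the file in your own archive before relying on this. The function name and the label names below are hypothetical.

```python
def parse_labelmap(text):
    """Parse label names and RGB colors from labelmap-style text.

    Assumed line format: name:r,g,b:parts:actions
    Lines starting with '#' and empty lines are ignored.
    """
    labels = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, color, *_ = line.split(":")
        if color:
            labels[name] = tuple(int(part) for part in color.split(","))
        else:
            # No color given for this label.
            labels[name] = None
    return labels


# Hypothetical file content with two abstract labels.
example = """# label:color_rgb:parts:actions
background:0,0,0::
label_1:128,0,0::
label_2:0,128,0::
"""
print(parse_labelmap(example))
```

With such a mapping, each pixel color in a mask image can be translated back to the label it represents.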

In this brief guide, we considered a possible way to annotate DICOM data using the Computer Vision Annotation Tool. Moreover, we tried to explain how exactly our team arrives at the development of new features, and stressed the importance of customer feedback and detailed requests in this process. If you want to leave feedback about the article or share your ideas on the topic, contact us using the contacts below.

Author and developer who is responsible for DICOM-related features: Boris Sekachev 

CVAT architect and team manager: Nikita Manovich

Link to GitHub
Previous materials about CVAT


Notices & Disclaimers

Intel technologies may require enabled hardware, software or service activation.

No product or component can be absolutely secure. 

The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications.

Current characterized errata are available on request.

Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.

Your costs and results may vary. 

© Intel Corporation.  Intel, the Intel logo, OpenVINO, the OpenVINO logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries.  Other names and brands may be claimed as the property of others.
