Radiomics
From Wikipedia, the free encyclopedia
In the field of medicine, radiomics is a method that extracts a large number of features from medical images using data-characterisation algorithms.[1][2][3][4][5] These features, termed radiomic features, have the potential to uncover tumoral patterns and characteristics that cannot be appreciated by the naked eye.[6] The hypothesis of radiomics is that the distinctive imaging features between disease forms may be useful for predicting prognosis and therapeutic response for various cancer types, thus providing valuable information for personalized therapy.[1][7][8] Radiomics emerged from the medical fields of radiology and oncology[3][9][10] and is most advanced in applications within these fields. However, the technique can be applied to any medical study where a pathological process can be imaged.
Image acquisition
The image data is provided by radiological modalities such as CT,[11] MRI,[12] PET/CT or even PET/MR.[13] Extraction tools are then applied to the resulting raw data volumes to compute different pixel/voxel characteristics.[2]
The extracted features are stored in large databases to which clinics have access, enabling broadly collaborative and cumulative work in which all participants benefit from growing amounts of data and, ideally, a more precise workflow.
Image segmentation
After the images have been saved in the database, they have to be reduced to the essential parts, in this case the tumors, which are called "volumes of interest".[2]
Because of the large amount of image data that needs to be processed, manually segmenting every single image is impractical when a large radiomics database is being built. Automatic or semiautomatic segmentation algorithms have to be used instead. Before it can be applied on a large scale, an algorithm must score as high as possible on the following four criteria:
- First, it must be reproducible: when it is applied to the same data, the outcome does not change.
- Another important factor is consistency: the algorithm solves the problem at hand rather than performing some unrelated task. In this case, it must be able to detect the diseased part in all the different scans.
- The algorithm also needs to be accurate: it should detect the diseased part as precisely as possible, since accurate results can be achieved only with accurate data.
- A less critical but still important point is time efficiency. Results should be generated as fast as possible so that the whole radiomics process is accelerated. Provided the run time stays within an acceptable range, this criterion matters less than the other three.
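Two of the criteria above, reproducibility and accuracy, can be quantified by comparing segmentation masks. The following is a minimal sketch that assumes the Dice similarity coefficient as the overlap measure; the source does not prescribe a specific metric, and the masks and variable names are hypothetical.

```python
import numpy as np

def dice_coefficient(mask_a, mask_b):
    """Dice similarity of two binary masks (1.0 = identical, 0.0 = disjoint)."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # both masks empty: treated here as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / total

# Hypothetical 2D example: a reference contour and two runs of one algorithm.
reference = np.zeros((6, 6), dtype=bool)
reference[1:5, 1:5] = True      # 16 pixels marked as tumor
run_1 = np.zeros((6, 6), dtype=bool)
run_1[2:5, 1:5] = True          # 12 pixels, all inside the reference
run_2 = run_1.copy()            # second run on the same input

# Accuracy: overlap between the algorithm's output and the reference.
accuracy = dice_coefficient(run_1, reference)
# Reproducibility: overlap between two runs on identical data.
reproducibility = dice_coefficient(run_1, run_2)
```

A reproducible algorithm yields a value of 1.0 for repeated runs on the same data, while the comparison against a reference contour gives one possible accuracy score. Consistency could be assessed in the same way by repeating the comparison across different scans.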
Feature extraction and qualification
After the segmentation, many features can be extracted and the relative net change from longitudinal images (delta-radiomics) can be computed. Radiomic features can be divided into five groups: size- and shape-based features, descriptors of the image intensity histogram, descriptors of the relationships between image voxels (e.g. gray-level co-occurrence matrix (GLCM), run length matrix (RLM), size zone matrix (SZM), and neighborhood gray tone difference matrix (NGTDM) derived textures), textures extracted from filtered images, and fractal features. The mathematical definitions of these features are independent of imaging modality and can be found in the literature.[14][15][16][17] A detailed description of texture features for radiomics can be found in Parekh et al. (2016)[4] and Depeursinge et al. (2017).[18]
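As an illustration of two of these feature groups, the sketch below computes a few intensity-histogram descriptors and a GLCM-derived contrast for a small 2D region, together with a delta-radiomics value. It is a simplified example under stated assumptions: the image is already discretised into integer gray levels, only one horizontal offset is used, and the relative net change is taken to be (follow-up − baseline) / baseline. All function names are hypothetical.

```python
import numpy as np

def first_order_features(values):
    """Intensity-histogram descriptors of the voxels in the volume of interest."""
    v = np.asarray(values, dtype=float)
    mean, std = v.mean(), v.std()
    # Skewness is undefined for a constant region; report 0.0 in that case.
    skew = 0.0 if std == 0 else float(np.mean(((v - mean) / std) ** 3))
    return {"mean": float(mean), "std": float(std), "skewness": skew}

def glcm(image, levels, offset=(0, 1)):
    """Symmetric, normalised gray-level co-occurrence matrix for one 2D offset."""
    img = np.asarray(image, dtype=int)
    d_row, d_col = offset
    rows, cols = img.shape
    counts = np.zeros((levels, levels), dtype=float)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[img[r, c], img[r2, c2]] += 1
    counts = counts + counts.T      # count every pair in both directions
    return counts / counts.sum()    # convert counts to joint probabilities

def glcm_contrast(p):
    """Contrast: co-occurrence probabilities weighted by squared level difference."""
    i, j = np.indices(p.shape)
    return float(np.sum(p * (i - j) ** 2))

def relative_net_change(baseline, follow_up):
    """Delta-radiomics value, assuming (follow-up - baseline) / baseline."""
    return (follow_up - baseline) / baseline

# Hypothetical 4x4 region with four gray levels (0..3).
roi = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 2, 2, 2],
                [2, 2, 3, 3]])

first_order = first_order_features(roi.ravel())
P = glcm(roi, levels=4)
contrast = glcm_contrast(P)
delta = relative_net_change(baseline=0.5, follow_up=0.6)
```

In practice, texture matrices are usually computed in 3D over several offsets and then aggregated; the single-offset 2D version here is only meant to show the principle.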
Because of the large number and variety of extracted features, feature reduction needs to be implemented to eliminate redundant information. Hundreds of different features need to be evaluated, and a selection algorithm is used to accelerate this process. Additionally, features that are unstable and non-reproducible should be eliminated, since low-fidelity features will likely lead to spurious findings and unrepeatable models.[19][20]
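One possible way to implement both steps is sketched below. It assumes that each feature has been measured twice on the same patients (a test and a retest scan), uses Lin's concordance correlation coefficient as the stability measure, and then greedily drops features that are highly correlated with an already retained one. The thresholds, data and function names are illustrative assumptions, not values taken from the cited literature, and constant features are assumed to have been removed beforehand.

```python
import numpy as np

def concordance_ccc(x, y):
    """Lin's concordance correlation coefficient between two measurement runs."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    cov = np.mean((x - x.mean()) * (y - y.mean()))
    return 2.0 * cov / (x.var() + y.var() + (x.mean() - y.mean()) ** 2)

def select_features(test, retest, min_ccc=0.9, max_abs_corr=0.95):
    """Return indices of columns that are stable and not redundant."""
    test = np.asarray(test, dtype=float)
    retest = np.asarray(retest, dtype=float)
    # Step 1: keep only features that agree between test and retest.
    stable = [j for j in range(test.shape[1])
              if concordance_ccc(test[:, j], retest[:, j]) >= min_ccc]
    # Step 2: among stable features, skip any that nearly duplicates a kept one.
    corr = np.corrcoef(test, rowvar=False)
    kept = []
    for j in stable:
        if all(abs(corr[j, k]) <= max_abs_corr for k in kept):
            kept.append(j)
    return kept

# Hypothetical data: 5 patients (rows), 3 features (columns).
# Feature 1 is exactly twice feature 0 (redundant); feature 2 changes on retest.
test = np.array([[1.0, 2.0, 5.0],
                 [2.0, 4.0, 3.0],
                 [3.0, 6.0, 8.0],
                 [4.0, 8.0, 1.0],
                 [5.0, 10.0, 7.0]])
retest = np.array([[1.1, 2.2, 7.0],
                   [2.0, 4.0, 8.0],
                   [2.9, 5.8, 1.0],
                   [4.1, 8.2, 5.0],
                   [5.0, 10.0, 3.0]])

kept_columns = select_features(test, retest)
```

With this toy data only the first feature is retained: the second is stable but redundant, and the third fails the stability check.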
Analysis
After the features relevant to the task have been selected, the chosen data must be analyzed. Before the actual analysis, the clinical and molecular (sometimes even genetic) data needs to be integrated, because it has a big impact on what can be deduced from the analysis. There are different methods of analyzing the data. First, the different features are compared to one another to find out whether they have any information in common and to reveal what it means when they all occur at the same time.
Another approach is supervised or unsupervised analysis. Supervised analysis uses an outcome variable to create prediction models. Unsupervised analysis summarizes the available information and can be represented graphically, so that the conclusions are clearly visible.
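The difference between the two approaches can be shown with a short sketch. Here principal component analysis stands in for the unsupervised case (it produces two coordinates per patient that could be plotted), and logistic regression fitted by gradient descent stands in for the supervised case. Both techniques, the synthetic data and all names are assumptions chosen for illustration; the source does not name specific methods.

```python
import numpy as np

def standardize(X):
    """Scale every feature to zero mean and unit variance."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0)

def pca_project(X, n_components=2):
    """Unsupervised: project features onto the leading principal components."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T

def add_intercept(X):
    return np.hstack([np.ones((X.shape[0], 1)), X])

def fit_logistic(X, y, lr=0.1, n_iter=2000):
    """Supervised: logistic regression using the outcome variable y."""
    Xb = add_intercept(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(Xb @ w)))
        w -= lr * (Xb.T @ (p - y)) / len(y)   # gradient of the mean log-loss
    return w

def predict_proba(w, X):
    return 1.0 / (1.0 + np.exp(-(add_intercept(X) @ w)))

# Synthetic stand-in for selected radiomic features: 40 patients, 3 features.
rng = np.random.default_rng(0)
group_a = rng.normal(loc=0.0, scale=1.0, size=(20, 3))
group_b = rng.normal(loc=3.0, scale=1.0, size=(20, 3))
X = standardize(np.vstack([group_a, group_b]))
y = np.array([0] * 20 + [1] * 20)   # hypothetical binary outcome

coords = pca_project(X)             # uses the features only
w = fit_logistic(X, y)              # uses features and outcome
train_accuracy = np.mean((predict_proba(w, X) >= 0.5) == y)
```

The accuracy computed here is measured on the training data only; a real prediction model would need to be validated on independent patients.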
Databases
Creation
Several steps are necessary to create an integrated radiomics database. The imaging data needs to be exported from the clinics. This is already a very challenging step because patient information is very sensitive and governed by privacy laws such as HIPAA. At the same time, the exported data must not lose any of its integrity when compressed, so that the database only incorporates data of the same quality. The integration of clinical and molecular data is important as well, and a large image storage location is needed.
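The integrity requirement can be checked mechanically. The sketch below assumes lossless compression with zlib and a SHA-256 checksum computed before export and after decompression; this is one possible verification scheme, not a procedure described in the source, and the synthetic volume stands in for real image data.

```python
import hashlib
import zlib

import numpy as np

def sha256_of(data: bytes) -> str:
    """Hex digest used as a fingerprint of the raw voxel data."""
    return hashlib.sha256(data).hexdigest()

# Synthetic 16-bit volume standing in for an exported image (4 slices of 8x8).
volume = np.arange(4 * 8 * 8, dtype=np.int16).reshape(4, 8, 8) - 100

raw = volume.tobytes()
digest_before = sha256_of(raw)          # recorded at the exporting site

compressed = zlib.compress(raw, 9)      # lossless compression for transfer
restored_bytes = zlib.decompress(compressed)

# At the database: rebuild the array and compare fingerprints.
restored = np.frombuffer(restored_bytes, dtype=volume.dtype).reshape(volume.shape)
intact = sha256_of(restored_bytes) == digest_before
```

If a lossy codec were used instead, the two digests would differ, which is exactly the loss of integrity the check is meant to detect.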
Use
The goal of radiomics is to be able to use this database for new patients. This requires algorithms that run new input data through the database and return information about the likely course of the patient's disease: for example, how fast the tumor will grow, how likely the patient is to survive for a certain time, and whether and where distant metastases are possible. This information guides the choice of further treatment (such as surgery, chemotherapy, radiotherapy or targeted drugs), so that the option that maximizes survival or improvement can be selected. The algorithm has to recognize correlations between the images and the features, so that it is possible to extrapolate from the database material to the input data.
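A very simple form of such a lookup is sketched below, assuming a k-nearest-neighbour approach: the new patient's features are compared with those in the database, and the outcomes of the most similar patients are averaged. The technique, the toy database and the function name are illustrative assumptions; the source does not specify which algorithm is used.

```python
import numpy as np

def predict_from_database(db_features, db_outcomes, new_features, k=3):
    """Average the outcomes of the k database patients closest in feature space."""
    db = np.asarray(db_features, dtype=float)
    outcomes = np.asarray(db_outcomes, dtype=float)
    # Scale with database statistics so that no single feature dominates.
    mu, sigma = db.mean(axis=0), db.std(axis=0)
    sigma[sigma == 0] = 1.0             # avoid division by zero
    z_db = (db - mu) / sigma
    z_new = (np.asarray(new_features, dtype=float) - mu) / sigma
    distances = np.linalg.norm(z_db - z_new, axis=1)
    nearest = np.argsort(distances)[:k]
    return float(outcomes[nearest].mean()), nearest

# Hypothetical database: two radiomic features per patient,
# outcome 1 = survived a given follow-up period, 0 = did not.
db_features = [[1.0, 10.0], [1.2, 11.0], [0.9, 9.5],
               [3.0, 30.0], [3.2, 29.0], [2.9, 31.0]]
db_outcomes = [1, 1, 1, 0, 0, 1]

estimate, neighbours = predict_from_database(
    db_features, db_outcomes, new_features=[3.1, 30.0], k=3)
```

Here `estimate` is the fraction of the three most similar database patients with outcome 1, which serves as a rough survival estimate for the new patient. The estimate becomes more reliable as the database grows, which is one reason the collaborative accumulation of data matters.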