DEN format
The DEN format is a storage format for multidimensional (ND) stacks of data of various element types. The dentk package provides tools for manipulating such data, including summing, slicing, and operations such as Gaussian blur. The format is uncompressed: files are larger than compressed alternatives, but any element can be located directly from its index, which keeps access to large datasets simple. It supports several element types and up to 16 dimensions.
Overview of Extended DEN format
The extended DEN format is a binary format for storing multidimensional arrays of up to 16 dimensions. It extends the legacy DEN format, which was limited to three dimensions, and is designed to address the limitations of that format.
Supported Element Types and Byte Sizes
Element type | ID | Byte Size |
---|---|---|
UINT16 | 0 | 2 |
INT16 | 1 | 2 |
UINT32 | 2 | 4 |
INT32 | 3 | 4 |
UINT64 | 4 | 8 |
INT64 | 5 | 8 |
FLOAT32 | 6 | 4 |
FLOAT64 | 7 | 8 |
UINT8 | 8 | 1 |
Table 1: Data element types supported in extended DEN format with corresponding type IDs and Byte sizes.
Header Structure
The header is 4096 bytes long. Currently, 74 bytes are used (five uint16 fields followed by sixteen uint32 dimension sizes) and the rest is reserved for future use or user-specific information. The size is fixed so that the data starts at a 4096-byte boundary, which can be beneficial for memory alignment and performance.
Field | Value Description |
---|---|
1st uint16 | 0 |
2nd uint16 | Number of dimensions (at most 16) |
3rd uint16 | Byte size of the element |
4th uint16 | X-major (0) or Y-major (1) specifier |
5th uint16 | Element ID, see table above |
Following these, there are up to 16 dimension specification values:
Byte Field | Value Description |
---|---|
Each uint32 | Dimension size |
Dimension Names | [dim_1(x), dim_2(y), ..., dim_16] |
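The header layout in the two tables above can be read with a few lines of Java. The following is a minimal sketch, not the reference implementation: the class name `ExtendedDenHeader` and its fields are hypothetical, and little-endian byte order is an assumption carried over from the legacy format, since this section does not state the byte order of the extended header.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical parser for the extended DEN header (illustration only). */
final class ExtendedDenHeader {
    static final int HEADER_SIZE = 4096; // data starts at this byte offset
    static final int USED_BYTES = 74;    // 5 * uint16 + 16 * uint32

    final int dimCount;    // 2nd uint16
    final int elementSize; // 3rd uint16
    final boolean yMajor;  // 4th uint16: 0 = x-major, 1 = y-major
    final int elementId;   // 5th uint16, see the element type table
    final long[] dims;     // dim_1 (x), dim_2 (y), ...

    private ExtendedDenHeader(int dimCount, int elementSize, boolean yMajor,
                              int elementId, long[] dims) {
        this.dimCount = dimCount;
        this.elementSize = elementSize;
        this.yMajor = yMajor;
        this.elementId = elementId;
        this.dims = dims;
    }

    static ExtendedDenHeader parse(byte[] header) {
        if (header.length < USED_BYTES) {
            throw new IllegalArgumentException("header too short");
        }
        // Assumption: little-endian, as documented for the legacy format.
        ByteBuffer buf = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN);
        if (Short.toUnsignedInt(buf.getShort()) != 0) {
            throw new IllegalArgumentException("first uint16 must be 0");
        }
        int dimCount = Short.toUnsignedInt(buf.getShort());
        if (dimCount > 16) {
            throw new IllegalArgumentException("more than 16 dimensions");
        }
        int elementSize = Short.toUnsignedInt(buf.getShort());
        boolean yMajor = Short.toUnsignedInt(buf.getShort()) == 1;
        int elementId = Short.toUnsignedInt(buf.getShort());
        long[] dims = new long[dimCount];
        for (int i = 0; i < dimCount; i++) {
            // uint32 does not fit a signed int, so widen to long.
            dims[i] = Integer.toUnsignedLong(buf.getInt());
        }
        return new ExtendedDenHeader(dimCount, elementSize, yMajor, elementId, dims);
    }
}
```

Reading the first 4096 bytes of a file into a `byte[]` and passing it to `parse` would yield the dimension sizes in the order dim_1, dim_2, and so on.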
Note: The NumPy index and shape convention can be counterintuitive. When a DEN file with the dimension specification (dim_1, dim_2, dim_3) is converted into a NumPy array, the shape of the array is (dim_3, dim_2, dim_1). This is because the last dimension in the DEN file is the slowest-changing one and therefore becomes the first NumPy axis. For a 3D file, array[k] is thus the k-th frame (dim_1 x dim_2), where k is in the range [0, dim_3), so the k-th slice can be accessed directly as array[k] instead of array[:,:,k].
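The same ordering determines where a frame sits in the file. The sketch below is a hedged illustration with hypothetical names (`DenFrameOffset`, `frameOffset`, `numpyShape`); it assumes a 3D file whose frames are stored contiguously right after the 4096-byte header, with dim_3 as the slowest-changing dimension as described in the note.

```java
/** Hypothetical helpers for locating frames in a 3D extended DEN file. */
final class DenFrameOffset {
    static final long HEADER_SIZE = 4096L;

    /** Byte offset of the first element of frame k, for k in [0, dim3). */
    static long frameOffset(long dim1, long dim2, long dim3, int elementSize, long k) {
        if (k < 0 || k >= dim3) {
            throw new IndexOutOfBoundsException("k=" + k);
        }
        // Each frame holds dim1 * dim2 elements; frames follow one another.
        return HEADER_SIZE + k * dim1 * dim2 * elementSize;
    }

    /** Shape in the NumPy convention: DEN dimension sizes in reversed order. */
    static long[] numpyShape(long[] denDims) {
        long[] shape = new long[denDims.length];
        for (int i = 0; i < denDims.length; i++) {
            shape[i] = denDims[denDims.length - 1 - i];
        }
        return shape;
    }
}
```

For a 512 x 256 x 100 file of 4-byte elements, frame 0 starts at byte 4096 and each following frame starts 512 * 256 * 4 bytes later.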
Example Java Code for DataType Enumeration
The following Java enumeration lists the supported element types with their byte sizes. The declaration order matches the type IDs in Table 1:
public enum DenDataType {
    UINT16(2),  // 0
    INT16(2),   // 1
    UINT32(4),  // 2
    INT32(4),   // 3
    UINT64(8),  // 4
    INT64(8),   // 5
    FLOAT32(4), // 6
    FLOAT64(8), // 7
    UINT8(1);   // 8

    private final int byteSize;

    private DenDataType(int byteSize) { this.byteSize = byteSize; }

    public int getSize() { return byteSize; }
}
Legacy DEN Format
The legacy DEN format was used to store volume, projection, and other types of data. It is a binary format for storing three-dimensional arrays of uint16, float32, or float64 values. The legacy format has a fixed 6-byte header consisting of three uint16 values that give the array dimensions in the order dimy, dimx, dimz. The data after the header is stored in x-major order, with all values encoded in little-endian. X-major order means that the value at position (ix, iy, iz) in the array has the flat index ix + iy * dimx + iz * dimx * dimy.
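The header order and the flat index formula above translate directly into code. This is a minimal sketch; the class `LegacyDen` and its method names are hypothetical and are not taken from any of the libraries listed in this document.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helpers for the legacy DEN layout (illustration only). */
final class LegacyDen {
    static final int HEADER_SIZE = 6; // three little-endian uint16 values

    /** Returns {dimx, dimy, dimz}; the header stores them as dimy, dimx, dimz. */
    static int[] readDims(byte[] header) {
        ByteBuffer buf = ByteBuffer.wrap(header, 0, HEADER_SIZE)
                .order(ByteOrder.LITTLE_ENDIAN);
        int dimy = Short.toUnsignedInt(buf.getShort());
        int dimx = Short.toUnsignedInt(buf.getShort());
        int dimz = Short.toUnsignedInt(buf.getShort());
        return new int[] {dimx, dimy, dimz};
    }

    /** Flat element index of (ix, iy, iz) in x-major order. */
    static long flatIndex(int ix, int iy, int iz, int dimx, int dimy) {
        // Widen to long before multiplying to avoid int overflow.
        return ix + (long) iy * dimx + (long) iz * dimx * dimy;
    }

    /** Byte offset in the file of the element with the given flat index. */
    static long byteOffset(long flatIndex, int elementSize) {
        return HEADER_SIZE + flatIndex * elementSize;
    }
}
```

For a 4 x 3 x 2 array (dimx = 4, dimy = 3, dimz = 2), position (1, 2, 1) has the flat index 1 + 2 * 4 + 1 * 4 * 3 = 21.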
The DEN format is named after the German CT researcher Dr.-Ing. Frank Dennerlein.
Problems with the Legacy Format
The legacy format has several limitations:
- The data type can only be derived from the file size and number of elements indicated by the header. This limitation prevents the representation or distinction of two types with a 4-byte element size.
- The order of dimensions in the header is (dimy, dimx, dimz), which is counterintuitive for multidimensional arrays.
- There is no support for y-major alignment.
- A single dimension can have a maximum of 65,535 elements.
- Fixing the number of dimensions to 3 reduces the flexibility of the format, making it unsuitable for representing 1D arrays or arrays with more than 3 dimensions.
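The first limitation can be made concrete. A reader of a legacy file can only recover the element byte size by dividing the payload size by the element count, and must then guess the type. The sketch below uses hypothetical names (`LegacyTypeGuess`, `elementSize`, `describe`) and assumes the only candidates are the three legacy types named earlier.

```java
/** Hypothetical illustration of type inference for legacy DEN files. */
final class LegacyTypeGuess {
    static final int HEADER_SIZE = 6;

    /** Element byte size implied by the file size, or -1 if inconsistent. */
    static int elementSize(long fileSize, int dimx, int dimy, int dimz) {
        long count = (long) dimx * dimy * dimz;
        long payload = fileSize - HEADER_SIZE;
        if (count == 0 || payload < 0 || payload % count != 0) {
            return -1;
        }
        long size = payload / count;
        return (size == 2 || size == 4 || size == 8) ? (int) size : -1;
    }

    /** Best guess of the type; the byte size alone cannot prove it. */
    static String describe(int elementSize) {
        switch (elementSize) {
            case 2: return "uint16";
            case 4: return "float32"; // any other 4-byte type would look identical
            case 8: return "float64";
            default: return "unknown";
        }
    }
}
```

A 102-byte file with dimensions 4 x 3 x 2 implies 4-byte elements, which this sketch reports as float32 simply because no other 4-byte type can be told apart from it.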
Use Cases and Tools for Manipulation
The Extended DEN format is supported by several tools for efficient manipulation and visualization of the data:
Tools
- DENTK: DENTK is a collection of command-line programs, written in C++ and run from a shell such as Bash, for manipulating DEN format data.
- CTIOL: CTIOL is a C++ library dedicated to I/O operations with the DEN format.
- KCT denpy: The KCT denpy library, particularly its submodule DEN, provides extensive functionality for manipulating DEN format data in Python.
- KCT den file opener: KCT den file opener is a tool for visualizing DEN data stacks in ImageJ. It also includes classes useful for implementing reading/writing the format in Java.
The extended DEN format is implemented in the DEN file opener from commit 9d5696 (version 1.3.1), denpy from commit 21ded7 (version 1.2.1), and ctiol from commit 618aec3c0bf.
Use Cases
The DEN format is utilized in various applications for tomographic reconstruction, perfusion analysis, and more:
- KCT CBCT: KCT cbct is a tool for tomographic reconstruction that stores projection data and output volumes in DEN format.
- KPCT perfviz: KPCT perfviz is used for processing perfusion and brain perfusion CT datasets, leveraging the DEN format for efficient data handling.
These tools and use cases demonstrate the flexibility and power of the Extended DEN format for storing and manipulating complex multidimensional datasets.