While analytic and iterative reconstruction techniques have been studied extensively for several decades, direct reconstruction of medical images using convolutional neural networks has only recently received attention. In direct reconstruction schemes, network architecture plays a profound role, influencing how efficiently the network learns the necessary domain transform and how efficiently it learns and stores prior information. Previous studies have, however, not rigorously tested diverse architectures against each other. In this work, Monte Carlo simulations of realistic positron emission tomography data acquisition were performed, and the data were binned into 2-D sinograms. A flexible architecture was used to generate reconstruction networks whose characteristics depended on 15 hyperparameters. A Bayesian search algorithm was employed to efficiently search the hyperparameter space for the best-performing networks (tuned networks), according to two quality metrics: mean squared error (MSE) and the structural similarity index (SSIM). A total of 341 networks were trialed. The best-performing networks consistently outperformed reconstruction by maximum likelihood expectation maximization (ML-EM), not only in terms of mean SSIM (0.887 vs. 0.855) and mean MSE (0.711 vs. 0.854), but also on an image-by-image basis, as evidenced by 2-D metric histograms. Furthermore, compared to untuned networks, tuned networks used less training data (70k vs. 105k training examples) and required far fewer epochs to converge (6 vs. 150). Compared to the metrically inferior ML-EM images, however, network-reconstructed images suffered from over-smoothing, loss of finer detail, and over-regularity of high-contrast regions. Since the networks nevertheless scored well on the chosen metrics, this indicates that MSE and SSIM did not adequately quantify these important image features.
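
The following is a minimal sketch of the evaluation and tuning loop described above, assuming a Gaussian-process Bayesian search scored by per-image MSE and SSIM. The abstract does not name the actual toolchain, the 15 hyperparameters, or how the two metrics were combined into a single search objective; the hyperparameter names, the build_and_train placeholder, and the objective weighting below are illustrative assumptions, not the authors' implementation.

    # Sketch only: hypothetical Bayesian hyperparameter search over reconstruction
    # networks, scored image-by-image with MSE and SSIM (scikit-image, scikit-optimize).
    import numpy as np
    from skimage.metrics import mean_squared_error, structural_similarity
    from skopt import gp_minimize
    from skopt.space import Integer, Real


    def build_and_train(n_layers, base_filters, learning_rate):
        """Hypothetical stand-in: build a CNN for the sinogram-to-image domain
        transform, train it, and return a callable mapping sinograms to images."""
        raise NotImplementedError("replace with an actual training pipeline")


    def image_by_image_scores(recon_fn, sinograms, references):
        """Score each validation image separately (the basis of 2-D metric histograms)."""
        mses, ssims = [], []
        for sino, ref in zip(sinograms, references):
            img = recon_fn(sino)
            data_range = float(ref.max() - ref.min())
            mses.append(mean_squared_error(img, ref))
            ssims.append(structural_similarity(img, ref, data_range=data_range))
        return np.asarray(mses), np.asarray(ssims)


    def objective(params, sinograms, references):
        """Lower is better: one possible way to fold mean MSE and mean SSIM
        into a single scalar for the Bayesian search (an assumption here)."""
        n_layers, base_filters, learning_rate = params
        recon_fn = build_and_train(n_layers, base_filters, learning_rate)
        mses, ssims = image_by_image_scores(recon_fn, sinograms, references)
        return float(mses.mean() - ssims.mean())


    # Illustrative 3-dimensional search space; the study tuned 15 hyperparameters.
    space = [
        Integer(3, 12, name="n_layers"),
        Integer(16, 128, name="base_filters"),
        Real(1e-5, 1e-2, prior="log-uniform", name="learning_rate"),
    ]

    # Gaussian-process Bayesian optimisation over candidate architectures (the study
    # trialed 341 networks). Uncomment once build_and_train and validation pairs
    # (val_sinograms, val_references) are available.
    # result = gp_minimize(lambda p: objective(p, val_sinograms, val_references),
    #                      space, n_calls=50, random_state=0)
    # print(result.x, result.fun)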