Advanced signal processing, such as multi-resolution decomposition and the processing of three-dimensional data
sets, is gradually becoming an integral part of medical imaging. With the growing number of signal dimensions,
the bandwidth requirements increase exponentially. Because memory bandwidth is a scarce resource,
this paper focuses on bandwidth optimization at the processor-chip level within multiprocessor systems. We
introduce a practical model including formulas for the computing, memory and cache read/write procedures to
optimize the mapping of data into the memory and cache for different configurations. A substantial performance
improvement is realized by a new memory-communication model that incorporates the data-dependencies of the
image-processing functions. More specifically, bandwidth optimization and minimization are achieved by implementing
two measures: (1) breaking down the algorithm such that the processing obtains a locality that fits
the cache size of the processor, and (2) a known technique based on addressing and organizing the data
prior to processing in such a way that memory traffic is minimized. For the experiments, we have concentrated
particularly on image enhancement and noise reduction built around image pyramids for 3D X-ray data sets.
First experimental results show a bandwidth reduction in the order of 80% and a throughput increase of 60%
compared to straightforward implementations.
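The first measure, fitting the processing locality to the cache size, can be illustrated with a minimal sketch. It is not the authors' model: the function names, the square-tile assumption, the two-buffer assumption and the example numbers are all hypothetical, and filter halos around each tile are ignored.

```python
import math


def tile_side(cache_bytes, bytes_per_pixel, buffers=2):
    """Largest square tile side such that `buffers` tiles fit in the cache.

    Assumption: one input and one output tile must be resident at once.
    """
    pixels_per_tile = cache_bytes // (bytes_per_pixel * buffers)
    return math.isqrt(pixels_per_tile)


def traffic_unfused(width, height, bytes_per_pixel, stages):
    """Memory traffic when every stage reads and writes the whole image."""
    return 2 * width * height * bytes_per_pixel * stages


def traffic_fused(width, height, bytes_per_pixel):
    """Memory traffic when all stages run on a cache-resident tile.

    Each pixel is read once and written once (halo overlap ignored).
    """
    return 2 * width * height * bytes_per_pixel


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the paper.
    side = tile_side(cache_bytes=256 * 1024, bytes_per_pixel=2)
    before = traffic_unfused(1024, 1024, 2, stages=4)
    after = traffic_fused(1024, 1024, 2)
    print(f"tile side: {side} pixels")
    print(f"traffic reduction: {1 - after / before:.0%}")
```

Under these simplifying assumptions the traffic saving depends only on how many stages are fused per tile; a real implementation would also have to account for halo pixels and cache associativity.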
In cardiovascular minimally invasive interventions, physicians require low-latency X-ray imaging applications, as
their actions must be directly visible on the screen. The image-processing system should enable the simultaneous
execution of multiple functions. Because dedicated hardware lacks flexibility, there is a growing interest
in using off-the-shelf computer technology. Because memory bandwidth is a scarce resource, we will focus
on optimization methods for bandwidth reduction within multiprocessor systems at the chip level. We create
a practical, realistic model of the required compute and memory bandwidth for a given set of image-processing
functions. Similar modeling is applied for the available system resources. We concentrate in particular on X-ray
image processing based on multi-resolution decomposition, noise reduction and image-enhancement techniques.
We derive formulas with which we can optimize the mapping of the application onto processors, cache and memory
for different configurations. The data-block granularity is matched to the memory hierarchy, so that caching
will be optimized for low latency. More specifically, we exploit the locality of the signal-processing functions to
streamline the memory communication. A substantial performance improvement is realized by a new memory-communication
model that incorporates the data dependencies of the image-processing functions. Results show
a memory-bandwidth reduction in the order of 60% and a latency reduction in the order of 30-60% compared to
straightforward implementations.
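The idea of modeling the required memory bandwidth and comparing it with the available resources can be sketched as follows. This is only an illustration under stated assumptions, not the formulas derived in the paper: a dyadic 2D pyramid is assumed, every pixel at every level is assumed to be accessed a fixed number of times per frame, and all names and parameters are hypothetical.

```python
def pyramid_pixels(width, height, levels):
    """Total pixel count over a dyadic pyramid (each level halves both sides)."""
    total = 0
    for _ in range(levels):
        total += width * height
        width, height = max(1, width // 2), max(1, height // 2)
    return total


def required_bandwidth(width, height, levels, bytes_per_pixel,
                       accesses_per_pixel, frames_per_second):
    """Required memory bandwidth in bytes per second for one pyramid pass per frame."""
    pixels = pyramid_pixels(width, height, levels)
    return pixels * bytes_per_pixel * accesses_per_pixel * frames_per_second


def fits(required_bytes_per_s, available_bytes_per_s):
    """True when the application demand stays within the platform supply."""
    return required_bytes_per_s <= available_bytes_per_s


if __name__ == "__main__":
    # Illustrative numbers only: 1024x1024 frames, 16-bit pixels,
    # one read and one write per pixel, 30 frames per second.
    demand = required_bandwidth(1024, 1024, 3, 2, 2, 30)
    print(f"required: {demand / 1e6:.1f} MB/s")
    print("fits in 1 GB/s:", fits(demand, 1_000_000_000))
```

A model of this kind makes it possible to compare candidate mappings before implementation, since the demand and supply sides are expressed in the same unit.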
Low-dose X-ray imaging, diagnosis by image analysis and multi-modal medical imaging are example aspects
that lead to more advanced image processing algorithms and the corresponding platforms on which they have to
be executed. In this paper, we investigate the applicability of commercially available off-the-shelf components
for a new computing platform. In the analysis, we adhere to some specific use cases. In cardiovascular
minimally invasive surgery, physicians require low-latency imaging applications, as their actions must be directly
visible on the screen. Typical image-processing algorithms in this domain are based on multi-resolution decomposition,
noise reduction, image analysis and enhancement techniques. We have compared various solutions
for possible processing architectures. The most interesting technology areas for constituting a new architecture
are presented and we discuss the mapping of the use cases onto the various architectural proposals. Results
show that a heterogeneous architecture gives the highest potential for current and upcoming image-processing
applications. However, hardware and software solutions to support low-latency, high-bandwidth image streaming
and an efficient concurrent distribution of functionality still need further development. This validates a clear
direction for the future, which is based on modeling streaming computing architectures and special interconnect
infrastructures.