This PDF file contains the front matter associated with SPIE Proceedings Volume 8656, including the Title Page, Copyright information, Table of Contents, and Conference Committee listing.
Video tracking is a fundamental problem in computer vision with many applications. The goal of video tracking
is to isolate a target object from its background across a sequence of frames. Tracking is inherently a three-dimensional problem in that it incorporates the time dimension. As such, the computational efficiency of video
segmentation is a major challenge. In this paper we present a generic and robust graph-theory-based tracking
scheme in videos. Unlike previous graph-based tracking methods, the suggested approach treats motion as a
pixel's property (like color or position) rather than as consistency constraints (i.e., the location of the object in
the current frame is constrained to appear around its location in the previous frame shifted by the estimated
motion) and solves the tracking problem optimally (i.e., neither heuristics nor approximations are applied).
The suggested scheme is robust enough to incorporate the computationally cheaper MPEG-4 motion estimation schemes. Although block matching techniques generate noisy and coarse motion fields, their use allows faster computation times because a broad variety of off-the-shelf software and hardware components that specialize in this task is available. The evaluation of the method on standard and non-standard benchmark videos shows that the suggested algorithm supports fast and accurate video tracking, thus making it amenable to real-time applications.
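As a rough illustration of treating motion as a pixel property, the following Python sketch builds a per-pixel feature vector in which a dense motion field sits alongside color and position. The Farneback optical flow, the feature weighting and the toy two-centroid labeling are all illustrative assumptions; the paper uses cheap MPEG-4 block matching for the motion field and solves the segmentation optimally on a graph.

```python
import numpy as np
import cv2

def pixel_features(prev_gray, curr_bgr, curr_gray, w_motion=2.0):
    """Stack color, position and motion into one per-pixel feature vector.

    Motion is just another pixel property here, on the same footing as
    color and position -- the core idea sketched in the abstract above.
    """
    h, w = curr_gray.shape
    # Dense motion field; the paper uses cheap MPEG-4 block matching instead.
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float32)
    feats = np.dstack([curr_bgr.astype(np.float32) / 255.0,   # color
                       xs[..., None] / w, ys[..., None] / h,  # position
                       w_motion * flow])                      # motion (u, v)
    return feats.reshape(-1, feats.shape[-1])

def label_pixels(feats, obj_mean, bg_mean):
    """Toy two-class assignment; the paper instead solves this optimally
    on a graph, with no heuristics or approximations."""
    d_obj = np.linalg.norm(feats - obj_mean, axis=1)
    d_bg = np.linalg.norm(feats - bg_mean, axis=1)
    return d_obj < d_bg
```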
An algorithmic framework for real-time localization of single yarns within industrial fabric images is presented. The information about precise yarn locations forms the foundation for a fabric flaw detection system based on individual yarn measurements. Matching a camera frame rate of 15 fps, we define "real-time" as the capability of tracking all yarns within a 5-megapixel image in less than 35 ms, leaving a time slot of 31 ms for further image processing and defect detection algorithms. The processing pipeline comprises adaptive histogram equalization, Wiener deconvolution, normalized template matching and a novel feature point sorting scheme. To meet the real-time requirements, extensive use is made of the NVIDIA CUDA framework. Implementation details are given and source code for selected algorithms is provided. Evaluation results show that wefts and warps can be tracked reliably, independently of the fabric material or binding. Video and image footage is provided on the project website to supplement the paper.
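A hedged Python sketch of the four pipeline stages follows. All parameters (CLAHE settings, noise-to-signal ratio, match threshold) are assumptions of this sketch, and the paper runs equivalent CUDA kernels to meet the 35 ms budget; grayscale 8-bit input is assumed.

```python
import numpy as np
import cv2

def locate_yarns(gray, psf, yarn_template, nsr=0.01, thresh=0.6):
    """Sketch of the yarn-localization pipeline from the abstract:
    CLAHE -> Wiener deconvolution -> normalized template matching -> sorting."""
    # 1. Adaptive histogram equalization.
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    eq = clahe.apply(gray)
    # 2. Wiener deconvolution in the Fourier domain.
    H = np.fft.fft2(psf, s=eq.shape)
    G = np.fft.fft2(eq.astype(np.float32))
    F = np.conj(H) / (np.abs(H) ** 2 + nsr) * G
    deconv = np.real(np.fft.ifft2(F)).astype(np.float32)
    # 3. Normalized cross-correlation against a single-yarn template.
    score = cv2.matchTemplate(deconv, yarn_template.astype(np.float32),
                              cv2.TM_CCOEFF_NORMED)
    ys, xs = np.where(score > thresh)
    # 4. Sort the detected feature points into yarn order (left to right).
    order = np.lexsort((ys, xs))
    return np.column_stack([xs, ys])[order]
```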
More and more governments and authorities around the world are promoting the use of bicycles in cities, as cycling is healthy for the bicyclist and improves the quality of life in general. The safety and efficiency of bicyclists have therefore become a major focus, and achieving them calls for a smarter approach to the control of signalized intersections. Various traditional detection technologies, such as video, microwave radar and electromagnetic loops, can be used to detect vehicles at signalized intersections, but none of these can consistently separate bikes from other traffic, day and night and in various weather conditions.
As bikes should get higher priority and also require longer green time to safely cross the signalized intersection, traffic managers are looking for alternative detection systems that can distinguish bicycles from other vehicles near the stop bar. In this paper, the drawbacks of a video-based approach are presented, alongside the benefits of a thermal-video-based approach for vehicle presence detection with separation of bicycles. The specific technical challenges are also highlighted in developing a system that combines thermal image capture, image processing and output triggering to the traffic light controller in near real time and in a single housing.
Image scaling is a frequent operation in video processing for optical metrology. In this paper, the results of a comparative study of the computational complexity of different algorithms for scaling digital images with arbitrary scaling factors are presented and discussed. The following algorithms were compared: several types of spatial-domain processing algorithms (linear, cubic and cubic-spline interpolation) and a new DCT-based algorithm, which implements perfect (interpolation-error-free) scaling through discrete sinc-interpolation and is virtually free of the boundary effects characteristic of DFT-based scaling algorithms. The comparison results enable evaluation of the feasibility of real-time implementation of the algorithms for arbitrary image scaling.
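For illustration, a minimal 1D sketch of DCT-domain scaling is shown below: zero-padding (or truncating) the DCT spectrum performs discrete sinc-interpolation without the wrap-around boundary effects of DFT-based scaling. The orthonormal DCT and the amplitude factor are conventions of this sketch, and 2D images would be handled separably, rows then columns.

```python
import numpy as np
from scipy.fft import dct, idct

def dct_rescale_1d(x, m):
    """Rescale a length-n signal to length m by zero-padding (or truncating)
    its DCT spectrum. The sqrt(m/n) amplitude factor matches the orthonormal
    DCT used here (a convention of this sketch)."""
    n = len(x)
    X = dct(x, type=2, norm='ortho')
    Y = np.zeros(m)
    k = min(n, m)
    Y[:k] = X[:k]
    return idct(Y, type=2, norm='ortho') * np.sqrt(m / n)

# Upscale a ramp by the non-integer factor 1.7:
x = np.linspace(0.0, 1.0, 20)
y = dct_rescale_1d(x, 34)
```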
Stereo metrology involves obtaining spatial estimates of an object's length or perimeter using the disparity between boundary points. True 3D scene information is required to extract length measurements of an object's projection onto the 2D image plane. In stereo vision, the disparity measurement is highly sensitive to object distance, baseline distance, calibration errors, and relative movement of the left and right demarcation points between successive frames. Therefore, a tracking filter is necessary to reduce position error and improve the accuracy of the length measurement to a useful level. A Cartesian-coordinate extended Kalman filter (EKF) is designed based on the canonical equations of stereo vision. This filter represents a simple reference design that has not seen much exposure in the literature. A second filter, formulated in a modified sensor-disparity (DS) coordinate system, is also presented and shown to exhibit lower errors during a simulated experiment.
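To make the sensitivity argument concrete, here is a hedged sketch of the canonical stereo equations that would underpin such an EKF's measurement model. Rectified cameras with the principal point at the origin are simplifying assumptions of this sketch, and the filter itself is omitted.

```python
import numpy as np

def stereo_measurement(p, f, b):
    """Canonical stereo measurement model h(x): a 3D point p = (X, Y, Z)
    maps to left/right pixel coordinates, with disparity
    d = uL - uR = f * b / Z (rectified cameras assumed)."""
    X, Y, Z = p
    uL, vL = f * X / Z, f * Y / Z
    uR = f * (X - b) / Z
    return np.array([uL, vL, uR, vL])

def length_from_disparity(pL1, pR1, pL2, pR2, f, b):
    """Triangulate two demarcation points and return their separation.
    Small disparity errors are amplified roughly by Z**2 / (f * b), which
    is why a tracking filter is needed to make the length estimate useful."""
    def triangulate(pL, pR):
        d = pL[0] - pR[0]                # disparity
        Z = f * b / d
        return np.array([pL[0] * Z / f, pL[1] * Z / f, Z])
    return np.linalg.norm(triangulate(pL1, pR1) - triangulate(pL2, pR2))
```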
Traditional noise removal filters have the undesirable side effect of blurring edges, which is unacceptable for some image processing applications. To overcome this problem, our ongoing project evaluates an edge-enhancing smoothing filter and implements it on FPGAs to reduce noise while sharpening edges. One such filter consists of a combination of the bilateral filter (used for edge-preserving smoothing) and the shock filter (used for edge enhancement). This paper describes an implementation of the bilateral filter on Altera FPGAs; the shock filter stage is then briefly described. Area and speed performance results for different Altera FPGA families are compared.
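As a software reference for the filter combination (not the FPGA implementation itself), the following sketch applies OpenCV's bilateral filter followed by a few iterations of the classic Osher-Rudin shock filter; the parameters are illustrative and grayscale input is assumed.

```python
import numpy as np
import cv2

def edge_enhancing_smooth(img, iters=5, dt=0.1):
    """Bilateral smoothing followed by Osher-Rudin shock filtering,
    I_t = -sign(laplacian(I)) * |grad I|, which steepens edges.
    Iteration count and time step are illustrative assumptions."""
    smooth = cv2.bilateralFilter(img, d=9, sigmaColor=50, sigmaSpace=5)
    u = smooth.astype(np.float32)
    for _ in range(iters):
        lap = cv2.Laplacian(u, cv2.CV_32F)
        gx = cv2.Sobel(u, cv2.CV_32F, 1, 0)
        gy = cv2.Sobel(u, cv2.CV_32F, 0, 1)
        u -= dt * np.sign(lap) * np.sqrt(gx ** 2 + gy ** 2)  # sharpen edges
    return np.clip(u, 0, 255).astype(np.uint8)
```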
Object filtering by size is a basic task in computer vision. A common way to extract large objects in a binary
image is to run the connected-component labeling (CCL) algorithm and to compute the area of each component.
Selecting the components with large areas is then straightforward. Several CCL algorithms for the GPU have
already been implemented but few of them compute the component area. This extra step can be critical for
real-time applications such as real-time video segmentation. The aim of this paper is to present a new approach for the extraction of visually large objects in a binary image that works in real time. It is implemented using
CUDA (Compute Unified Device Architecture), a parallel computing architecture developed by NVIDIA.
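A minimal CPU reference for the operation (label, measure areas, keep large components) looks like this in Python/OpenCV; the paper's contribution is performing the equivalent in CUDA at real-time rates.

```python
import numpy as np
import cv2

def keep_large_objects(binary, min_area):
    """Label connected components, compute their areas, keep the large ones."""
    n, labels, stats, _ = cv2.connectedComponentsWithStats(binary,
                                                           connectivity=8)
    big = np.zeros(n, dtype=bool)
    big[1:] = stats[1:, cv2.CC_STAT_AREA] >= min_area  # label 0 = background
    return big[labels].astype(np.uint8) * 255
```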
Computer-aided design and manufacturing (CAD/CAM) is increasingly becoming a standard feature and service provided to patients in dentist offices and denture manufacturing laboratories. Although the quality of the tools and data has slowly improved in recent years, practical, accurate, in-vivo, real-time, high-quality 3D data acquisition and processing still needs improving due to various surface measurement challenges. Advances in GPU computational power have made near real-time 3D intraoral in-vivo scanning of a patient's teeth achievable. In this paper we explore, from a real-time perspective, a hardware-software-GPU solution that addresses all of these requirements. Moreover, we exemplify and quantify the hard and soft deadlines required by such a system and illustrate how they are supported in our implementation.
In this paper, we present a fuzzy 3D filter for suppressing impulsive noise in color video sequences. The designed algorithm differs from other state-of-the-art algorithms in that it employs the three RGB bands of the video sequence data, analyzes the fuzzy gradient values obtained in eight directions, and finally processes two temporally neighboring frames together. The simulation results confirm the better performance of the novel 3D filter both in terms of objective metrics (PSNR, MAE, NCD, SSIM) and in subjective perception by human vision in the color sequences. An efficiency analysis of the designed filter and other promising filters has been performed on the Texas Instruments DSP TMS320DM642 through MATLAB's Simulink module, showing that the 3D filter can be used in real-time processing applications.
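The following single-channel Python sketch conveys the flavor of the fuzzy detection step. The membership function, its parameters K and tau, and the spatio-temporal combination rule are simplifying assumptions of this sketch; the paper works on all three RGB bands with direction-wise fuzzy gradients.

```python
import numpy as np

def fuzzy_impulse_mask(prev, curr, nxt, K=48.0, tau=0.25):
    """Differences to the eight spatial neighbors and to the two temporally
    neighboring frames become fuzzy similarity memberships
    mu = max(0, 1 - |d| / K); a pixel that is dissimilar both spatially
    and temporally is flagged as impulsive."""
    f = curr.astype(np.float32)
    mus = []
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy or dx:
                n = np.roll(np.roll(f, dy, axis=0), dx, axis=1)
                mus.append(np.maximum(0.0, 1.0 - np.abs(f - n) / K))
    mu_spatial = np.mean(mus, axis=0)
    mu_temporal = np.maximum(
        np.maximum(0.0, 1.0 - np.abs(f - prev.astype(np.float32)) / K),
        np.maximum(0.0, 1.0 - np.abs(f - nxt.astype(np.float32)) / K))
    return np.minimum(mu_spatial, mu_temporal) < tau  # True = impulse
```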
The pseudo-log image transform is essentially a logarithmic transformation that simulates the distribution of the eye's photoreceptors and finds application in many important areas of real-time image and video processing, such as motion detection and estimation in robots and foveated space-variant cameras. It belongs to a family
of non-linear image processing kernels in which references made to memory are non-linear functions of loop
indices. Non-linear kernels need some form of memory management in order to achieve the required throughput,
to minimize on-chip memory and to maximize possible data re-use. In this paper we present the design of a
pseudo-log image processing hardware accelerator IP, integrated with different interpolation filtering techniques,
using a memory management framework. The framework can automatically generate a memory hierarchy around
the IP and a data transfer controller that facilitates data exchange with main memory. The memory hierarchy
reduces on-chip memory requirements, optimizes throughput and increases data-reuse. The design of the IP is
fully performed at the algorithmic level in C/C++. The algorithmic description is profiled within the framework
to create a customized memory hierarchy, also described at the synthesizable algorithmic level. Finally, high
level synthesis is used to perform hardware design space exploration and performance estimation. Experiments
show that the generated memory hierarchy is able to feed the IP with a very high bandwidth even in presence
of long external memory latencies.
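For illustration, a common pseudo-log-style radial mapping is sketched below; the gain and the exact radial form are assumptions, not the paper's kernel. The point is the access pattern: the source address is a non-linear function of the output loop indices, which is precisely what the memory-management framework is built to handle.

```python
import numpy as np
import cv2

def pseudo_log_remap(img, gain=40.0):
    """Pseudo-log (foveated) resampling sketch: output radius r_out maps back
    to input radius r_in = exp(r_out / gain) - 1, the inverse of
    r_out = gain * ln(1 + r_in). Memory references are therefore a
    non-linear function of the loop indices."""
    h, w = img.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float32)
    r_out = np.hypot(xs - cx, ys - cy)
    theta = np.arctan2(ys - cy, xs - cx)
    r_in = np.expm1(r_out / gain)
    map_x = (cx + r_in * np.cos(theta)).astype(np.float32)
    map_y = (cy + r_in * np.sin(theta)).astype(np.float32)
    # Bilinear interpolation -- one of the filtering options mentioned above.
    return cv2.remap(img, map_x, map_y, interpolation=cv2.INTER_LINEAR)
```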
A real-time system is proposed that acquires traffic signs from an automotive fish-eye CMOS camera and provides their automatic recognition on the vehicle network. Differently from the state of the art, in this work color detection is addressed by exploiting the HSI color space, which is robust to lighting changes. Hence the first stage of the processing system implements fish-eye correction and RGB-to-HSI transformation. After color-based detection, a noise deletion step is implemented; then, for classification, a template-based correlation method is adopted to identify potential traffic signs of different shapes in the acquired images. Starting from a segmented image, matching with templates of the searched signs is carried out using a distance transform. These templates are organized hierarchically to reduce the number of operations and hence ease real-time processing for several types of traffic signs. Finally, for the recognition of the specific traffic sign, a technique based on the extraction of sign characteristics and thresholding is adopted. Implemented on a DSP platform, the system recognizes traffic signs in less than 150 ms at a distance of about 15 meters from 640x480-pixel acquired images. Tests carried out with hundreds of images show a detection and recognition rate of about 93%.
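The RGB-to-HSI stage uses the standard conversion formulas, sketched here in Python for reference (input in [0, 1], shape (..., 3)):

```python
import numpy as np

def rgb_to_hsi(rgb):
    """Standard RGB -> HSI conversion; the intensity/saturation split is what
    makes the color detection robust to lighting changes."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    i = (r + g + b) / 3.0
    s = 1.0 - np.minimum(np.minimum(r, g), b) / np.maximum(i, 1e-8)
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + 1e-8
    h = np.arccos(np.clip(num / den, -1.0, 1.0))   # radians in [0, pi]
    h = np.where(b > g, 2 * np.pi - h, h)          # full [0, 2*pi) hue
    return np.dstack([h, s, i])
```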
On-board data processing is a vital task for any satellite or spacecraft, since the sensing data must be processed before being sent to Earth in order to exploit the bandwidth to the ground station effectively. In recent years the amount of sensing data collected by scientific and commercial space missions has increased significantly, while the available downlink bandwidth has remained comparatively stable. The increasing demand for on-board real-time processing capabilities represents one of the critical issues in forthcoming European missions. Ever faster signal and image processing algorithms are required to accomplish planetary observation, surveillance, Synthetic Aperture Radar imaging and telecommunications. The only available space-qualified Digital Signal Processor (DSP) free of International Traffic in Arms Regulations (ITAR) restrictions offers inadequate performance, so the need for a next-generation European DSP is well known to the space community.
The DSPACE space-qualified DSP architecture fills the gap between the computational requirements and the available devices. It leverages a pipelined and massively parallel core based on the Very Long Instruction Word (VLIW) paradigm, with 64 registers and 8 operational units, along with cache memories, memory controllers and SpaceWire interfaces. Both the synthesizable VHDL and the software development tools are generated from the LISA high-level model. A Xilinx XC7K325T FPGA was chosen to realize a compact PCI demonstrator board. Finally, first synthesis results on CMOS standard-cell technology (180 nm ASIC) show an area of around 380 kgates and a peak performance of 1000 MIPS and 750 MFLOPS at 125 MHz.
The rapid growth in the use of video streaming over IP networks has outstripped the rate at which new network
infrastructure has been deployed. These bandwidth-hungry applications now comprise a significant part of all Internet
traffic and present major challenges for network service providers. The situation is more acute in mobile networks where
the available bandwidth is often limited. Work towards the standardisation of High Efficiency Video Coding (HEVC),
the next generation video coding scheme, is currently on track for completion in 2013. HEVC offers the prospect of a
50% improvement in compression over the current H.264 Advanced Video Coding standard (H.264/AVC) for the same
quality. However, there has been very little published research on HEVC streaming or the challenges of delivering
HEVC streams in resource-constrained network environments.
In this paper we consider the problem of adapting an HEVC encoded video stream to meet the bandwidth limitations of a mobile network environment. Video sequences were encoded using the Test Model under Consideration (TMuC HM6) for HEVC. Network abstraction layer (NAL) units were packetized, on a one-NAL-unit-per-RTP-packet basis, and
transmitted over a realistic hybrid wired/wireless testbed configured with dynamically changing network path conditions
and multiple independent network paths from the streamer to the client. Two different schemes for the prioritisation of
RTP packets, based on the NAL units they contain, have been implemented and empirically compared using a range of
video sequences, encoder configurations, bandwidths and network topologies.
In the first prioritisation method the importance of an RTP packet was determined by the type of picture and the temporal
switching point information carried in the NAL unit header. Packets containing parameter set NAL units and video
coding layer (VCL) NAL units of the instantaneous decoder refresh (IDR) and the clean random access (CRA) pictures
were given the highest priority followed by NAL units containing pictures used as reference pictures from which others
can be predicted. The second method assigned a priority to each NAL unit based on the rate-distortion cost of the VCL
coding units contained in the NAL unit. The sum of the rate-distortion costs of each coding unit contained in a NAL unit
was used as the priority weighting.
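A hedged sketch of the first method's packet ranking is given below. The HEVC NAL header layout (6-bit type after the forbidden bit, temporal ID in the second byte) follows the draft standard; the three-level ranking itself is an illustrative reading of the scheme described above.

```python
def rtp_priority(nal_payload: bytes) -> int:
    """Rank an RTP packet by the HEVC NAL unit it carries (first method).
    Header layout per the HEVC draft: forbidden bit, 6-bit nal_unit_type,
    6-bit layer id, 3-bit temporal_id_plus1. Type constants: VPS=32, SPS=33,
    PPS=34, IDR=19/20, CRA=21. Using the temporal layer as a proxy for
    'picture used as a reference' is a simplifying assumption of this sketch."""
    nal_type = (nal_payload[0] >> 1) & 0x3F
    if nal_type in (32, 33, 34) or nal_type in (19, 20, 21):
        return 0                            # parameter sets, IDR/CRA: highest
    temporal_id = (nal_payload[1] & 0x07) - 1
    return 1 if temporal_id == 0 else 2     # lower layers are referenced more
```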
The preliminary results of extensive experiments have shown that both schemes offered an improvement in PSNR, when comparing original and decoded received streams, over uncontrolled packet loss. The first method consistently delivered a significant average improvement of 0.97 dB over the uncontrolled scenario, while the second method provided a measurable, but less consistent, improvement across the range of testing conditions and encoder configurations.
In this paper, we define a low-complexity 2D-DCT architecture able to transform spatial pixels into spectral coefficients while taking into account the constraints of the considered compression standard. Indeed, this work is our first attempt to obtain a single reconfigurable multistandard DCT. Thanks to our new matrix decomposition, we could define one common 2D-DCT architecture whose constant multipliers can be configured to handle the cases of the real DCT and/or the integer DCT (multiplication by 2). Our optimized algorithm not only reduces computational complexity, but also leads to a scalable pipelined design in systolic arrays. Indeed, the 8 × 8 StdDCT can be computed using the 4 × 4 StdDCT, which in turn can be obtained by calculating the 2 × 2 StdDCT. Moreover, the proposed structure can be extended to larger values of N (i.e., 16 × 16 and 32 × 32). The performance of the proposed architecture is better than that of conventional designs. In particular, for N = 4, the proposed design has nearly one third of the area-time complexity of existing DCT structures, and this gain is expected to be higher for larger 2D-DCT sizes.
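As a reference point for the kind of datapath involved, the separable 2D transform Y = A X A^T with the 4 × 4 HEVC integer core matrix can be written as follows. This is a behavioral sketch, not the paper's decomposition or RTL, and the usual rounding/scaling shifts are omitted.

```python
import numpy as np

# HEVC 4x4 core (integer) transform matrix; the architecture above computes
# Y = A @ X @ A.T with configurable constant multipliers so one datapath
# covers both the integer DCT and a real DCT.
A4 = np.array([[64,  64,  64,  64],
               [83,  36, -36, -83],
               [64, -64, -64,  64],
               [36, -83,  83, -36]], dtype=np.int64)

def int_dct_2d_4x4(block):
    """Separable 2D transform: rows first, then columns."""
    return A4 @ block.astype(np.int64) @ A4.T

x = np.arange(16).reshape(4, 4)
print(int_dct_2d_4x4(x))  # spectral coefficients (unscaled, no rounding)
```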
This paper presents an optimized H.264/AVC coding system for HDTV displays based on a typical flow with high coding efficiency and statistics-adaptivity features. For high-quality streaming, the codec uses a high-complexity binary arithmetic encoding/decoding algorithm and a JVCE (joint video compression and encryption) scheme. Particular attention is given to simultaneous compression and encryption applications, to gain security without compromising the speed of transactions [1].
The proposed design allows us to encrypt the information using a pseudo-random number generator (PRNG). We thus achieve the two operations (compression and encryption) simultaneously and in a dependent manner, which is a novelty in this kind of architecture.
Moreover, we investigated the hardware implementation of the CABAC (context-based adaptive binary arithmetic coding) codec. The proposed architecture is based on an optimized binarizer/de-binarizer that handles high-pixel-rate videos with low cost and high performance for the most frequent syntax elements (SEs). This was verified using HD video frames. Synthesis results obtained using an FPGA (Xilinx ISE) show that our design is suitable for coding main-profile video streams.
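A toy Python illustration of the joint compression/encryption idea follows: the binarized symbols headed for the arithmetic coder are XORed with a PRNG keystream, so encryption rides along with entropy coding rather than being a separate pass. The PRNG used here is not cryptographic, and this is not the paper's exact design.

```python
import numpy as np

def encrypt_bins(bins: np.ndarray, seed: int) -> np.ndarray:
    """XOR the binarized symbols (uint8 array of 0/1 'bins') with a seeded
    keystream before arithmetic coding. numpy's PCG64 is used purely for
    illustration -- it is NOT a cryptographic PRNG."""
    rng = np.random.Generator(np.random.PCG64(seed))
    keystream = rng.integers(0, 2, size=bins.size, dtype=np.uint8)
    return bins ^ keystream  # decryption: same XOR with the same seed
```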
In this paper, we present a modified inter-view prediction scheme for multiview video coding (MVC). With more inter-view prediction, the number of reference frames required to decode a single view increases. Consequently, the amount of data needed to decode a single view increases, impacting decoder performance. We propose an MVC scheme that requires less inter-view prediction than the MVC standard scheme. The proposed scheme is implemented and tested on real multiview video sequences, and shows improvements in terms of the average data size required either to decode a single view or to access any frame (i.e., random access), with comparable rate-distortion. It is compared to the MVC standard scheme and to other improved techniques from the literature.
In this paper, we present a novel real-time capsule endoscopy (CE) video visualization concept based on panoramic imaging. Typical CE videos run about 8 hours and are manually reviewed by physicians to locate diseases such as bleedings and polyps. To date, there is no commercially available tool capable of providing stabilized and processed CE video that is easy to analyze in real time, so the burden on physicians' disease-finding efforts is considerable. In fact, since the CE camera sensor has a limited forward-looking view and a low image frame rate (typically 2 frames per second), and captures very close-range images of the GI tract surface, it is no surprise that traditional visualization methods based on tracking and registration often fail to work. This paper presents a novel concept for real-time CE video stabilization and display. Instead of working directly on traditional forward-looking FOV (field of view) images, we work on panoramic images to bypass many of the problems facing traditional imaging modalities. Methods for panoramic image generation based on optical lens principles, leading to real-time data visualization, are presented. In addition, non-rigid panoramic image registration methods are discussed.
The current trend in embedded vision systems is to propose bespoke solutions for specific problems, as each application has different requirements and constraints. There is no widely used model or benchmark that aims to facilitate generic solutions in embedded vision systems. Providing such a model is a challenging task due to the wide range of use cases, environmental factors, and available technologies. However, common characteristics can be identified to propose an abstract model. Indeed, the majority of vision applications focus on the detection, analysis and recognition of objects. These tasks can be reduced to vision functions which can be used to characterize vision systems. In this paper, we present the results of a thorough analysis of a large number of different types of vision systems. This analysis led us to the development of a systems taxonomy, in which a number of vision functions, as well as their combinations, characterize embedded vision systems. To illustrate the use of this taxonomy, we have tested it against a real vision system that detects magnetic particles in a flowing liquid to predict and avoid critical machinery failure. The proposed taxonomy is evaluated using a quantitative parameter which shows that it covers 95 percent of the investigated vision systems, and that its flow is ordered for 60 percent of those systems. This taxonomy will serve as a tool for the classification and comparison of systems and will enable researchers to propose generic and efficient solutions for the same class of systems.
To realize fast search and tracking of dim ground targets while the infrared detector is moving, an infrared image registration method based on phase correlation and feature-point registration is proposed, which makes infrared image motion compensation possible. Phase correlation is first used for coarse registration of the moving infrared images; high-precision registration is then achieved by using this result as prior information for feature-point image registration, thereby combining the strengths of phase correlation and feature-point registration. Experimental results validate that the algorithm provides the infrared image displacement information and improves the dim-target detection rate and accuracy, without reducing registration accuracy in a real-time IRST system.
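The coarse registration stage can be sketched with the standard phase-correlation recipe: the inverse FFT of the normalized cross-power spectrum of two frames peaks at the translation between them.

```python
import numpy as np

def phase_correlation_shift(a, b):
    """Estimate the integer (dy, dx) translation between two frames from the
    peak of the inverse FFT of their normalized cross-power spectrum."""
    A, B = np.fft.fft2(a), np.fft.fft2(b)
    R = A * np.conj(B)
    R /= np.abs(R) + 1e-12
    corr = np.real(np.fft.ifft2(R))
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    # Map wrap-around indices to signed shifts.
    if dy > a.shape[0] // 2:
        dy -= a.shape[0]
    if dx > a.shape[1] // 2:
        dx -= a.shape[1]
    return dy, dx
```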
A Wireless Visual Sensor Network (WVSN) is formed by deploying many Visual Sensor Nodes (VSNs) in the field. Typical applications of WVSNs include environmental monitoring, health care, industrial process monitoring, stadium/airport monitoring for security reasons, and many more. The energy budget in outdoor applications of WVSNs is limited by the batteries, and frequent battery replacement is usually not desirable, so both the processing and the communication energy consumption of the VSN need to be optimized in such a way that the network remains functional for a long duration. The images captured by a VSN contain a huge amount of data and require efficient computational resources for processing and wide communication bandwidth for transmitting the results. Image processing algorithms must therefore be designed to be computationally simple while providing a high compression rate. For some applications of WVSNs, the captured images can be segmented into bi-level images, and bi-level image coding methods then efficiently reduce the information amount in these segmented images; however, the compression rate of bi-level image coding is limited by the underlying compression algorithm. Hence there is a need for other intelligent and efficient algorithms which are computationally simpler and provide a better compression rate than bi-level image coding. Change coding is one such algorithm: it is computationally simple (requiring only exclusive-OR operations) and provides better compression efficiency than image coding, but it is effective only for applications with slight changes between adjacent frames of the video. Detecting and coding the Regions of Interest (ROIs) in the change frame further reduces the information amount in the change frame. But if the number of objects in the change frames rises above a certain level, the compression efficiency of both change coding and ROI coding becomes worse than that of image coding. This paper explores the compression efficiency of the Binary Video Codec (BVC) for data reduction in WVSNs. We propose to implement all three compression techniques, i.e. image coding, change coding and ROI coding, at the VSN and then select the smallest bit stream among the results of the three, as sketched below. In this way the compression performance of the BVC never becomes worse than that of image coding. We conclude that the compression efficiency of BVC is always better than that of change coding and always better than or equal to that of ROI coding and image coding.
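A toy Python sketch of the BVC selection rule follows. The run-length cost model stands in for a real bi-level coder, and the 64-bit headers are illustrative assumptions; only the select-the-smallest logic reflects the abstract.

```python
import numpy as np

def run_length_bits(bits: np.ndarray) -> int:
    """Crude stand-in for a bi-level coder: cost = number of runs * 16 bits."""
    flat = bits.ravel()
    runs = 1 + np.count_nonzero(flat[1:] != flat[:-1])
    return runs * 16

def bvc_encode_size(prev_frame, curr_frame):
    """BVC selection rule: cost the frame itself (image coding), the XOR
    change frame (change coding), and the bounding box of the changes
    (ROI coding), then keep the smallest -- so BVC is never worse than
    image coding. Frames are binary uint8 arrays of 0/1."""
    image_cost = run_length_bits(curr_frame)
    change = prev_frame ^ curr_frame
    change_cost = run_length_bits(change)
    ys, xs = np.nonzero(change)
    if ys.size == 0:
        roi_cost = 64                         # "no change" signal + header
    else:
        roi = change[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
        roi_cost = run_length_bits(roi) + 64  # 64 header bits for box coords
    return min(image_cost, change_cost, roi_cost)
```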
Finding a given object in an image or a sequence of frames is one of the fundamental computer vision challenges. Humans can recognize a multitude of objects with little effort despite scale, lighting and perspective changes. A robust computer-vision-based object recognition system is achievable only with considerable tolerance to changes in scale, rotation and lighting. Tolerance to partial occlusion is also of paramount importance for robust object recognition in real-time applications. In this paper, we propose an effective method for recognizing a given object from a class of trained objects in the presence of partial occlusions and considerable variance in scale, rotation and lighting conditions. The proposed method can also identify the absence of a given object from the class of trained objects. Unlike conventional methods for object recognition based on key feature matches between the training image and a test image, the proposed algorithm uses a statistical measure derived from the homography transform to determine an object match. The magnitude of the determinant of the homography matrix estimated between the test image and the set of training images is used as the criterion to recognize the object contained in the test image. This magnitude is found to be very close to zero (i.e., less than 0.005) for out-of-class objects and to range between 0.05 and 1 for in-class objects. Hence, an out-of-class object can be identified by applying a low-threshold criterion to the magnitude of the determinant. The proposed method has been extensively tested on a database of about 100 similar and difficult objects and gives positive results for both out-of-class and in-class recognition scenarios. The overall system performance has been documented to be about 95% accurate over a varied range of testing scenarios.
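A hedged sketch of the decision rule is given below, with ORB features and RANSAC standing in for the paper's feature pipeline (an assumption); the determinant threshold follows the values quoted above.

```python
import numpy as np
import cv2

def is_in_class(train_img, test_img, det_thresh=0.005, min_matches=10):
    """Estimate a homography H from feature matches and accept the object
    when |det(H)| is well away from zero; |det(H)| < 0.005 indicates an
    out-of-class (degenerate) mapping per the thresholds quoted above."""
    orb = cv2.ORB_create(1000)
    k1, d1 = orb.detectAndCompute(train_img, None)
    k2, d2 = orb.detectAndCompute(test_img, None)
    if d1 is None or d2 is None:
        return False
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(d1, d2)
    if len(matches) < min_matches:
        return False
    src = np.float32([k1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
    dst = np.float32([k2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H is not None and abs(np.linalg.det(H)) > det_thresh
```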
This paper presents a novel method for the real-time protection of the emerging High Efficiency Video Coding (HEVC) standard. Structure-preserving selective encryption is performed in the CABAC entropy coding module of HEVC, which is significantly different from the CABAC entropy coding of H.264/AVC. In the CABAC of HEVC, exponential Golomb coding is replaced by truncated Rice (TR) coding up to a specific value for the binarization of transform coefficients. Selective encryption is performed using the AES cipher in cipher-feedback mode on a plaintext of binstrings, in a context-aware manner. The encrypted bitstream has exactly the same bit-rate and is format compliant. Experimental evaluation and security analysis of the proposed algorithm are performed on several benchmark video sequences containing different combinations of motion, texture and objects.
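The length-preserving property can be sketched as follows: AES in 8-bit cipher-feedback mode maps N plaintext bytes to exactly N ciphertext bytes, so the bit-rate is untouched. Gathering the encryptable bins in a context-aware way, which is what keeps the stream format compliant, is the part this sketch omits.

```python
from Crypto.Cipher import AES  # pycryptodome

def encrypt_binstrings(plaintext: bytes, key: bytes, iv: bytes) -> bytes:
    """AES in CFB mode with an 8-bit segment size is length-preserving:
    N bytes in, N bytes out, hence an unchanged bit-rate. The plaintext
    here stands for the gathered encryptable bins (a simplification)."""
    cipher = AES.new(key, AES.MODE_CFB, iv=iv, segment_size=8)
    return cipher.encrypt(plaintext)

pt = b"binstring-bits-go-here"
ct = encrypt_binstrings(pt, b"0" * 16, b"1" * 16)
assert len(ct) == len(pt)  # same size, same bit-rate
```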
This paper addresses the computational efficiency of the 3D point cloud reconstruction pipeline for uncalibrated image sequences. In existing pipelines, bundle adjustment is carried out globally, which is quite time-consuming since the computational complexity keeps growing as the number of image frames increases. Furthermore, a searching and sorting algorithm needs to be used in order to store feature points and 3D locations. To reduce the computational complexity of the 3D point cloud reconstruction pipeline, a local refinement approach is introduced in this paper. The results obtained indicate that the introduced local refinement improves computational efficiency compared to global bundle adjustment.
Positron emission tomography (PET) is a clinical and research tool for in vivo metabolic imaging. The demand for better
image quality entails continuous research to improve PET instrumentation. In clinical applications, PET image quality
benefits from the time-of-flight (TOF) feature. Indeed, by measuring the photons' arrival times on the detectors with a resolution better than 100 ps, the annihilation point can be estimated with centimeter resolution. This leads to better noise level, contrast and clarity of detail in the images, using either analytical or iterative reconstruction algorithms. This work
discusses a silicon photomultiplier (SiPM)-based magnetic-field compatible TOF-PET module with depth of interaction
(DOI) correction. The detector features a 3D architecture with two tiles of SiPMs coupled to a single LYSO scintillator
on both its faces. The real-time front-end electronics is based on a current-mode ASIC where a low input impedance, fast
current buffer allows achieving the required time resolution. A pipelined time to digital converter (TDC) measures and
digitizes the arrival time and the energy of the events with a timestamp of 100 ps and 400 ps, respectively. An FPGA
clusters the data and evaluates the DOI, with a simulated z resolution of the PET image of 1.4 mm FWHM.
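The centimeter claim follows directly from the TOF geometry, as this small worked example shows:

```python
C = 299_792_458.0  # speed of light, m/s

def annihilation_uncertainty(dt_seconds):
    """Position uncertainty along the line of response from a TOF
    measurement: dx = c * dt / 2 (the factor 2 because the timing error
    is shared between the two back-to-back photons)."""
    return C * dt_seconds / 2.0

print(annihilation_uncertainty(100e-12) * 100, "cm")  # 100 ps -> ~1.5 cm
```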
Cameras used in outdoor scenes require high visibility performance under various environmental conditions. We present a visibility improvement technique which can improve the visibility of images captured in bad weather such as fog and haze, and which is also applicable to real-time processing in surveillance cameras and vehicle cameras. Our algorithm enhances contrast pixel by pixel according to the brightness and sharpness of neighboring pixels. In order to reduce computational costs, we precompute the adaptive functions which determine the contrast gain from the brightness and sharpness of neighboring pixels. We optimize these functions using sets of fog images and examine how well they can predict the fog-degraded area, using both qualitative and quantitative assessment. We demonstrate that our method can prevent excessive correction of fog-free areas, suppressing noise amplification in sky or shadow regions, while applying powerful correction to the fog-degraded area. In comparison with other real-time-oriented methods, our method can reproduce clear-day visibility while preserving gradation in shadows and highlights, as well as the naturalness of the original image. With its low computational costs, our algorithm can be compactly implemented in hardware and is thus applicable to a wide range of video equipment for visibility improvement in surveillance cameras, vehicle cameras, and displays.
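A hedged sketch of the pixel-wise adaptive gain follows; the particular gain law is an illustrative stand-in for the paper's optimized functions.

```python
import numpy as np
import cv2

def defog(gray, k=15):
    """Pixel-wise adaptive contrast gain: neighborhood brightness and local
    sharpness index a gain function, so bright, low-sharpness (fog-like)
    areas get strong local-contrast amplification while dark or already
    sharp areas are left nearly untouched. The gain law is an assumption."""
    f = gray.astype(np.float32) / 255.0
    mean = cv2.blur(f, (k, k))                       # neighborhood brightness
    sharp = cv2.blur(np.abs(cv2.Laplacian(f, cv2.CV_32F)), (k, k))
    gain = 1.0 + 3.0 * mean * np.exp(-sharp / 0.02)  # assumed gain function
    out = mean + gain * (f - mean)                   # amplify local contrast
    return np.clip(out * 255.0, 0, 255).astype(np.uint8)
```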
In ultrasound medical imaging, the blurring introduced by the scanner is characterized by the point spread function (PSF), which describes the response of the imaging system to a point source. De-blurring can therefore be achieved by de-convolving the images with an estimate of the PSF. However, an accurate PSF estimate is hard to obtain because of the unknown properties of the human tissues through which the ultrasound signal propagates. In addition, the complexity of estimating the PSF and de-convolving the image with it is too high for real-time implementation of ultrasound imaging. Conventional restoration methods are therefore based on a simple 1D PSF estimate [8] in the axial direction only, so no improvement is obtained in the lateral direction; 2D PSF estimation and restoration are not widely used because of their high complexity. In this paper, we propose a new method that selects a 2D PSF (estimated for the average speed of sound and depth) while performing fast non-blind 2D de-convolution in the ultrasound imaging system. Our algorithm works on the beam-formed, uncompressed radio-frequency data, with a database of 2D PSFs pre-measured and estimated from the actual probe used. The database contains 2D PSFs classified by depth (about 5 different depths) and speed of sound (about 1450 or 1540 m/s). Using minimum variance and a simple Wiener-filter method, we present a novel way to select the optimal 2D PSF from this database. For the de-convolution step with the chosen PSF, we focus on low complexity: we use the Wiener filter and a fast de-convolution technique based on hyper-Laplacian priors [11], [12], which is several orders of magnitude faster than existing techniques using such priors. To prevent discontinuities between the separately restored depth regions, piecewise linear interpolation is applied to the overlapping regions. We have tested our algorithm with a Verasonics system and a commercial ultrasound scanner (Philips C4-2), on phantoms with known speed of sound and on in-vivo scans with unknown speeds. Using the real PSF from the actual transducer, our algorithm produces a better restoration of the ultrasound image than de-convolution with a simulated PSF, and its complexity is low enough for real-time ultrasound imaging. The method is robust, easy to implement, and may be a realistic candidate for real-time implementation.
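A simplified Python sketch of the restoration core follows: frequency-domain Wiener de-convolution plus a minimum-residual-variance selection over the PSF database. The selection criterion here is a stand-in for the paper's minimum-variance rule, and the NSR constant is an assumption.

```python
import numpy as np

def wiener_deconv(rf, psf, nsr=0.02):
    """Non-blind 2D Wiener de-convolution of beam-formed RF data:
    X = conj(H) / (|H|^2 + NSR) * Y in the Fourier domain."""
    H = np.fft.fft2(psf, s=rf.shape)
    Y = np.fft.fft2(rf)
    return np.real(np.fft.ifft2(np.conj(H) / (np.abs(H) ** 2 + nsr) * Y))

def pick_psf_and_restore(rf, psf_db, nsr=0.02):
    """Try every pre-measured PSF (~5 depths x 2 sound speeds) and keep the
    restoration whose re-blurred residual has the smallest variance."""
    best = None
    for psf in psf_db:
        x = wiener_deconv(rf, psf, nsr)
        reblur = np.real(np.fft.ifft2(
            np.fft.fft2(psf, s=rf.shape) * np.fft.fft2(x)))
        resid_var = np.var(rf - reblur)
        if best is None or resid_var < best[0]:
            best = (resid_var, x)
    return best[1]
```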