In surveillance, reconnaissance, and numerous other video applications, improving the resolution and quality of captured video increases its usability. In many such applications, video is acquired from low-cost legacy sensors whose modest optics and low-resolution arrays yield imagery that is grainy and missing important details, and that must still pass through transmission bottlenecks. Many post-processing techniques have been proposed to enhance the quality of such video, and superresolution is one of them. In this paper, we extend previous work on a real-time
superresolution application implemented in ASIC/FPGA hardware. A gradient-based technique is used to register the frames at the sub-pixel level. Once we obtain the high-resolution grid, we use an improved regularization technique in which the image is iteratively modified by applying back-projection to get a
sharp and undistorted image. The Matlab/Simulink-proven algorithm was migrated to hardware, achieving 320x240 -> 1280x960 at more than 38 fps, a 16X superresolution in total pixels. This significant advance beyond real time is the main contribution of this paper. Additionally, the algorithm is implemented in C, with optimizations for the Intel i7 processor, to achieve real-time performance in software. A fixed 32-bit processing structure is used for easy migration across platforms, giving a fine balance between quality and performance. The proposed system is robust and highly efficient.
Superresolution also greatly decreases camera jitter, delivering smooth, stabilized, high-quality video.
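As a rough illustration of the gradient-based sub-pixel registration step, the following Python/NumPy sketch estimates a global translational shift by solving the linearized brightness-constancy equations in least squares. The purely translational model, the function name, and the absence of a coarse-to-fine pyramid are simplifying assumptions; the paper's actual formulation is not reproduced here.

```python
import numpy as np

def estimate_subpixel_shift(ref, cur):
    """Least-squares estimate of a global (dx, dy) shift between two
    frames from the linearized brightness-constancy constraint
    Ix*dx + Iy*dy = ref - cur. Valid for small shifts; a pyramid
    would be added for larger motion."""
    ref = ref.astype(np.float64)
    cur = cur.astype(np.float64)
    gy, gx = np.gradient(ref)              # image gradients (rows=y, cols=x)
    A = np.stack([gx.ravel(), gy.ravel()], axis=1)
    b = (ref - cur).ravel()
    (dx, dy), *_ = np.linalg.lstsq(A, b, rcond=None)
    return dx, dy                          # generally fractional pixels
```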
In numerous computer vision applications, enhancing the quality and resolution of captured video can be critical. Acquired video is often grainy and of low quality due to motion, transmission bottlenecks, and other factors; post-processing can enhance it. Superresolution greatly decreases camera jitter to deliver smooth, stabilized, high-quality video. In this paper, we extend previous work on a real-time superresolution application implemented in ASIC/FPGA hardware. A gradient-based technique is used to register the frames at the sub-pixel level. Once we obtain the high-resolution grid, we use an improved regularization technique in which the image is iteratively modified by applying back-projection to get a sharp and undistorted image. The algorithm was first tested in software and then migrated to hardware, achieving 320x240 -> 1280x960 at about 30 fps, a 16X superresolution in total pixels. Various input parameters, such as the size of the input image, the enlargement factor, and the number of nearest neighbors, can be tuned conveniently by the user. We use a maximum word size of 32 bits to implement the algorithm in Matlab/Simulink as well as in FPGA hardware, which gives us a fine balance between the number of bits and performance. The proposed system is robust and highly efficient. We have shown the performance improvement of the hardware superresolution over the software version (C code).
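To illustrate what a fixed 32-bit word size means in practice, here is a minimal fixed-point sketch; the Q16.16 split chosen below is an assumption for illustration, not the format used in the actual design.

```python
# Minimal Q16.16 fixed-point helpers: a 32-bit word assumed split into
# 16 integer and 16 fractional bits (the actual split used in the
# design is not stated in the abstract).
FRAC_BITS = 16

def to_fixed(x):
    return int(round(x * (1 << FRAC_BITS)))

def fixed_mul(a, b):
    # Product of two Q16.16 values, renormalized back to Q16.16.
    return (a * b) >> FRAC_BITS

def to_float(a):
    return a / (1 << FRAC_BITS)

# Example: 1.5 * 2.25 = 3.375
print(to_float(fixed_mul(to_fixed(1.5), to_fixed(2.25))))  # 3.375
```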
KEYWORDS: Particles, Video, Video surveillance, Detection and tracking algorithms, Electronic filtering, Particle filters, Affine motion model, Systems modeling, Surveillance, Monte Carlo methods
In this paper, we present an improved target tracking algorithm for aerial video. An adaptive appearance model is incorporated into a Sequential Monte Carlo framework to infer the deformation (or tracking) parameters that best describe the differences between the observed appearances of the target and the appearance model. The appearance model of the target is adaptively updated based on the tracking results up to the current frame, balancing a fixed model and a dynamic model with a pre-defined forgetting parameter. For targets in aerial video, an affine model is accurate enough to describe the transformation of the targets across frames. Particles are formed from the elements of the affine model. To accommodate the dynamics embedded in the video sequence, we employ a state-space time-series model, and the system noise constrains the particle coverage. Instead of directly using the affine parameters as elements of the particles, each affine matrix is decomposed into two rotation angles, two scales, and the translation parameters, which together form particles with more geometric meaning. Larger variances are given to the translation parameters and the rotation angles, which greatly improves tracking performance compared with treating these parameters equally, especially for fast-rotating targets. Experimental results show that our approach provides high performance for target tracking in aerial video.
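A minimal sketch of the decomposition described above, assuming the standard SVD factorization of the 2x2 linear part into two rotation angles and two scales; the particle variances shown are illustrative values, not the ones used in the paper.

```python
import numpy as np

def decompose_affine(A2x2, t):
    """Decompose the 2x2 linear part of an affine transform via SVD,
    A = R(theta) @ diag(s1, s2) @ R(phi); together with the
    translation t this yields the six particle elements."""
    U, s, Vt = np.linalg.svd(A2x2)
    # Fold possible reflections into the scales so U, Vt stay rotations.
    if np.linalg.det(U) < 0:
        U[:, 1] *= -1; s[1] *= -1
    if np.linalg.det(Vt) < 0:
        Vt[1, :] *= -1; s[1] *= -1
    theta = np.arctan2(U[1, 0], U[0, 0])
    phi = np.arctan2(Vt[1, 0], Vt[0, 0])
    return np.array([theta, phi, s[0], s[1], t[0], t[1]])

def sample_particles(state, n):
    # Larger variances on the rotation angles and translation, as the
    # abstract suggests; the specific values here are assumptions.
    sigma = np.array([0.05, 0.05, 0.01, 0.01, 4.0, 4.0])
    return state + sigma * np.random.randn(n, 6)
```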
Video, and especially massive video archives, is by nature a dense information medium. Compactly presenting the activities of targets of interest provides an efficient and cost-saving way to analyze the content of the video. In this paper, we propose a video content analysis system to summarize and visualize the trajectories of targets from massive video archives. We first present an adaptive appearance-based algorithm to robustly track the targets in a particle filtering framework. It provides high performance while facilitating implementation in hardware with parallel processing. A phase correlation algorithm is used to estimate the motion of the observation platform, which is then compensated for in order to extract the independent trajectories of the targets. Based on the trajectory information, we develop an interface for browsing the videos that enables direct manipulation of the video. The user can scroll over objects to view their trajectories and, if interested, click on an object and drag it along the displayed path. The actual video is played in synchrony with the mouse movement.
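A minimal sketch of the phase correlation step, assuming purely translational platform motion and integer-pixel accuracy; sub-pixel peak interpolation, which a production system would add, is omitted.

```python
import numpy as np

def phase_correlation(prev, cur):
    """Integer-pixel estimate of the global platform shift between two
    frames from the phase of the cross-power spectrum."""
    F1 = np.fft.fft2(prev)
    F2 = np.fft.fft2(cur)
    cross = F2 * np.conj(F1)
    cross /= np.abs(cross) + 1e-12         # keep phase only
    corr = np.fft.ifft2(cross).real
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    if dy > prev.shape[0] // 2:            # map wrapped index to signed shift
        dy -= prev.shape[0]
    if dx > prev.shape[1] // 2:
        dx -= prev.shape[1]
    return dx, dy                          # shift of cur relative to prev
```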
We propose an improvement on an existing super-resolution technique that produces high-resolution video from low-quality, low-resolution video. Our method has two steps: (1) motion registration and (2) regularization using back-projection. Sub-pixel motion parameters are estimated for a group of 16 low-resolution frames with reference to the next frame, and these are used to position the low-resolution pixels on the high-resolution grid. A gradient-based technique is used to register the frames at the sub-pixel level. Once we obtain the high-resolution grid, we use an improved state-of-the-art regularization technique in which the image is iteratively modified by applying back-projection to get a sharp and undistorted image. This technique is based on a bilateral prior and deals with different data and noise models. This computationally inexpensive method is robust to errors in motion/blur estimation and results in images with sharp edges. The proposed system is faster than existing ones, as the post-processing steps involve only simple filtering. The results show that the proposed method gives high-quality, high-resolution video and minimizes the effects of camera jerks. This technique can easily be ported to hardware and developed into a product.
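The following sketch illustrates the iterative back-projection idea with a sign-of-residual (L1) data term in the spirit of the bilateral-prior formulation; the box-average camera model, the step size, and the assumption of pre-registered frames are all simplifications.

```python
import numpy as np

def downsample(x, r):
    """Box-average decimation by r, standing in for the camera model
    (blur + decimation); assumes dimensions divisible by r."""
    h, w = x.shape[0] // r, x.shape[1] // r
    return x.reshape(h, r, w, r).mean(axis=(1, 3))

def back_project(hr, lr_frames, r, iters=10, beta=1.0):
    """Iterative back-projection: simulate each LR frame, take the
    sign of the residual (a robust L1 data term), up-project it, and
    correct the HR estimate. Frame-wise warping is omitted; the LR
    frames are assumed already registered to a common grid."""
    hr = hr.astype(np.float64)
    for _ in range(iters):
        sim = downsample(hr, r)                   # simulated LR observation
        err = np.zeros_like(hr)
        for lr in lr_frames:
            e = np.sign(lr - sim)                 # robust L1 residual
            err += np.kron(e, np.ones((r, r)))    # back-project to HR grid
        hr = hr + beta * err / len(lr_frames)
    return hr
```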
In this paper, we propose a multi-agent system that uses swarming techniques to perform high-accuracy Automatic Target Recognition (ATR) in a distributed manner. The proposed system can cooperatively share information from low-resolution images of different looks and use this information to perform high-accuracy ATR. An advanced approach based on multiple-agent Unmanned Aerial Vehicle (UAV) systems is proposed that integrates the processing capabilities, combines detection reporting with live video exchange, and employs swarm behavior modalities that dramatically surpass individual sensor-system performance levels. We employ a real-time block-based motion analysis and compensation scheme for efficient estimation and correction of camera jitter, global motion of the camera/scene, and the effects of atmospheric turbulence. Our optimized Partition Weighted Sum (PWS) approach requires only bit-shifts and additions, yet achieves a 16X pixel resolution enhancement, and is moreover parallelizable. We develop advanced, adaptive particle-filtering-based algorithms to robustly track multiple mobile targets by adaptively changing the appearance model of the selected targets. The collaborative ATR system utilizes the homographies between the sensors induced by the ground plane to overlap the local observation with the received images from other UAVs. The motion of the UAVs distorts the estimated homography from frame to frame. A robust dynamic homography estimation algorithm is proposed to address this, using homography decomposition and ground-plane surface estimation.
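The abstract does not give the PWS partition weights, so the sketch below only illustrates the multiplier-free flavor of the approach: a 2x upsampling whose weights are all powers of two, so the arithmetic reduces to bit-shifts and additions.

```python
import numpy as np

def shift_add_upsample2x(img):
    """2x upsampling in which every weight is a power of two, so all
    arithmetic is bit-shifts and additions (illustrative only; these
    are not the actual PWS partition weights, which the abstract does
    not give). Edges wrap via np.roll for brevity."""
    img = img.astype(np.int32)
    h, w = img.shape
    out = np.zeros((2 * h, 2 * w), dtype=np.int32)
    out[0::2, 0::2] = img                                # copy originals
    out[0::2, 1::2] = (img + np.roll(img, -1, 1)) >> 1   # right midpoints
    out[1::2, 0::2] = (img + np.roll(img, -1, 0)) >> 1   # lower midpoints
    out[1::2, 1::2] = (img + np.roll(img, -1, 0)         # diagonal midpoints
                       + np.roll(img, -1, 1)
                       + np.roll(np.roll(img, -1, 0), -1, 1)) >> 2
    return out
```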
Superresolution of images is an important step in many applications, such as target recognition, where the input images are often grainy and of low quality due to bandwidth constraints. In this paper, we present a real-time superresolution application implemented in ASIC/FPGA hardware, capable of 30 fps of superresolution by 16X in total pixels. Consecutive frames from the video sequence are grouped, and the registered values between them are used to fill the pixels of the higher-resolution image. The registration between consecutive frames is evaluated using the algorithm proposed by Schaum et al. The pixels are filled by averaging a fixed number of frames associated with the smallest error distances. The number of frames (the number of nearest neighbors) is a user-defined parameter, whereas the weights in the averaging process are obtained by inverting the corresponding smallest error distances. A Wiener filter is used to post-process the image. Various input parameters, such as the size of the input image, the enlargement factor, and the number of nearest neighbors, can be tuned conveniently by the user. We use a maximum word size of 32 bits to implement the algorithm in Matlab/Simulink as well as in the hardware, which gives us a fine balance between the number of bits and performance. The algorithm runs at real-time speed with very impressive superresolution results.
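A minimal sketch of the per-pixel fill rule described above: keep the k frames with the smallest error distances and weight their pixel values by the inverted distances. The epsilon guard against division by zero is an added assumption.

```python
import numpy as np

def knn_pixel_fill(errors, values, k):
    """Fill one HR pixel from candidate frames: keep the k frames with
    the smallest registration error distances and average their pixel
    values with weights proportional to the inverted errors."""
    errors = np.asarray(errors, dtype=np.float64)
    values = np.asarray(values, dtype=np.float64)
    idx = np.argsort(errors)[:k]            # k nearest neighbors
    w = 1.0 / (errors[idx] + 1e-9)          # invert error distances
    return np.sum(w * values[idx]) / np.sum(w)
```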
We propose a real-time vehicle detection and tracking system for an electro-optical (EO) surveillance camera. Real-time object detection remains a challenging computer vision problem in uncontrolled environments. The state-of-the-art AdaBoost technique serves as a robust object detector. In addition to the commonly used Haar features, we propose to include corner features to improve the detection performance for vehicles. Once the objects of interest are detected, we use the detection results to initialize the object tracking module. We propose an advanced, adaptive particle-filtering-based algorithm to robustly track multiple mobile targets by adaptively changing the appearance model of the selected targets. We use an affine transformation to describe the motion of an object across frames. By drawing multiple particles over the transformation parameters, our approach provides high performance while facilitating implementation in hardware with parallel processing. To recover from lost tracks, which may result from objects moving out of the frame or being occluded, we utilize prior information (the height-to-width ratio) and temporal information about the objects to estimate whether the tracking is reliable. Object detectors are invoked at frames where the objects cannot be tracked reliably. We also check for occlusion by comparing hue values within the rectangular region of the current frame with those of the previous frame. Detection is re-initialized for the next frame if an occlusion is declared in the current frame. The system works very well in terms of speed and performance on real surveillance video.
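The hue-based occlusion check might look roughly like the sketch below, which compares hue histograms of the tracked rectangle across consecutive frames with a Bhattacharyya coefficient; the bin count, the OpenCV-style 0-180 hue range, and the threshold are illustrative assumptions.

```python
import numpy as np

def occlusion_by_hue(hue_prev, hue_cur, bins=16, thresh=0.5):
    """Compare hue histograms of the tracked rectangle in consecutive
    frames; a low histogram similarity suggests occlusion. Hue values
    assumed in the OpenCV range [0, 180)."""
    h1, _ = np.histogram(hue_prev, bins=bins, range=(0, 180))
    h2, _ = np.histogram(hue_cur, bins=bins, range=(0, 180))
    h1 = h1 / max(h1.sum(), 1)               # normalize to distributions
    h2 = h2 / max(h2.sum(), 1)
    bc = np.sum(np.sqrt(h1 * h2))             # Bhattacharyya coeff. in [0, 1]
    return bc < thresh                         # True -> declare occlusion
```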
The registration of images from cameras of different types and/or at different locations is a well-researched topic. It is of great interest for both military and civilian applications. Researchers have devised pixel-level registration techniques that exploit intensity correlations to spatially align pixels from the two cameras. This approach is computationally expensive, as it requires pixel-level operations on the images, which makes it difficult to register the images in real time. Furthermore, images from different types of cameras may have different intensity distributions for corresponding pixels, which degrades registration accuracy. In this paper, we propose to use a Multilayer Perceptron (MLP) neural network to solve the image registration problem. The experimental results show that the proposed method is well suited for registration in both speed and accuracy.
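A minimal sketch of the idea, assuming the MLP learns the coordinate mapping between the two cameras from matched control points; the 2-16-2 architecture, plain gradient descent, and the requirement that coordinates be pre-normalized are all assumptions, as the abstract gives no training details.

```python
import numpy as np

def train_mlp_registration(src_pts, dst_pts, hidden=16, lr=1e-3, epochs=5000):
    """Tiny 2-hidden-2 MLP mapping coordinates from one camera to the
    other, trained on matched control points with MSE loss. Assumes
    coordinates are pre-normalized to roughly [-1, 1]."""
    rng = np.random.default_rng(0)
    W1 = rng.normal(0, 0.1, (2, hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.1, (hidden, 2)); b2 = np.zeros(2)
    X, Y = np.asarray(src_pts, float), np.asarray(dst_pts, float)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)                 # hidden activations
        P = H @ W2 + b2                          # predicted coordinates
        G = 2 * (P - Y) / len(X)                 # dLoss/dP for MSE
        W2 -= lr * H.T @ G; b2 -= lr * G.sum(0)  # output-layer update
        GH = (G @ W2.T) * (1 - H ** 2)           # backprop through tanh
        W1 -= lr * X.T @ GH; b1 -= lr * GH.sum(0)
    return lambda pts: np.tanh(np.asarray(pts, float) @ W1 + b1) @ W2 + b2
```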
Real-time object detection is still a challenging computer vision problem in uncontrolled environments. Unlike traditional classification problems, where the training data can properly describe the statistical models, it is much harder to discriminate a particular object class from the rest of the world with limited negative training samples. Due to the large variation among negatives, the intra-class difference may sometimes be even larger than the difference between objects and non-objects. Beyond this, there are many other problems that obstruct object detection, such as pose variation, illumination variation, and occlusion.
Previous studies have also demonstrated that infrared (IR) imagery provides a promising alternative to visible imagery. Detectors using IR imagery are robust to illumination variations and able to detect objects under all lighting conditions, including total darkness, where detectors based on visible imagery generally fail. However, IR imagery has drawbacks of its own, and visible imagery is more robust in the situations where IR fails. This suggests building a better detection system by fusing the information from both visible and IR imagery.
Moreover, the object detector requires an exhaustive search over both the spatial and scale domains, which inevitably leads to a high computational load.
In this paper, we propose to use boosting-based vehicle detection in both infrared and visible imagery. The final decision is a combination of detection results from the IR and visible images. Experiments are carried out using an ATR helmet device with both EO and IR sensors.
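The abstract does not specify how the two detector outputs are combined; as one plausible decision-level scheme, the sketch below boosts the score of detections confirmed in both modalities (after the IR boxes have been registered into EO coordinates) and keeps unmatched single-sensor detections.

```python
def iou(a, b):
    """Intersection-over-union of two boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / float(area(a) + area(b) - inter + 1e-9)

def fuse_detections(eo_dets, ir_dets, boost=1.5, tau=0.3):
    """Decision-level fusion sketch: detections (box, score) confirmed
    in both modalities (IoU > tau) get their scores boosted; the rest
    keep their single-sensor scores. IR boxes are assumed already
    registered into EO coordinates. Parameters are illustrative."""
    fused = []
    for box, score in eo_dets:
        if any(iou(box, b) > tau for b, _ in ir_dets):
            score *= boost                 # confirmed in both sensors
        fused.append((box, score))
    fused += [(b, s) for b, s in ir_dets   # IR-only detections
              if all(iou(b, box) <= tau for box, _ in eo_dets)]
    return fused
```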
The registration of images from cameras of different types and/or at different locations is of great interest for both military and civilian applications. Most available techniques perform pixel-level registration and use intensity correlation to spatially align pixels from the two cameras. A great deal of computation is spent operating on each pixel of the images, and as a result it is difficult to register the images in real time. Furthermore, images from different types of cameras may have different intensity distributions for corresponding pixels, which degrades registration accuracy. In this paper, we propose an improved Minimal Resource Allocation Network (MRAN) to solve the image registration problem for two cameras. New features are added to improve the performance of MRAN. There are two main contributions in this paper. First, weights going directly from inputs to outputs are introduced, and these parameters are updated by including them in the extended Kalman filter algorithm. Second, the initial number of hidden units for the sequential training of MRAN is specified, and the means of the initial hidden units are precalculated using Self-Organizing Maps. The experimental results show that the proposed algorithm performs very well in both speed and accuracy.
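A minimal sketch of an MRAN-style RBF forward pass extended with the direct input-to-output weights introduced in the paper; all names and shapes here are illustrative, and the extended Kalman filter update and the SOM-based initialization of the hidden-unit means are omitted.

```python
import numpy as np

def mran_forward(x, centers, widths, w_hidden, W_direct, bias):
    """Forward pass of an MRAN-type RBF network with an added direct
    input-to-output term. x: (d,) input; centers: (m, d) Gaussian
    means; widths: (m,) Gaussian widths; w_hidden: (m, k) hidden-to-
    output weights; W_direct: (d, k) direct weights; bias: (k,)."""
    d2 = np.sum((centers - x) ** 2, axis=1)          # squared distances
    phi = np.exp(-d2 / (2.0 * widths ** 2))          # Gaussian hidden units
    return phi @ w_hidden + x @ W_direct + bias      # hidden + direct terms
```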
Multi-sensor platforms are widely used in surveillance video systems for both military and civilian applications. The complementary nature of different types of sensors (e.g., EO and IR sensors) makes it possible to observe the scene under almost any conditions (day/night/fog/smoke). In this paper, we propose an innovative EO/IR sensor registration and fusion algorithm that runs in real time on a portable computing unit with a head-mounted display. The EO/IR sensor suite is mounted on the helmet of a dismounted soldier, and the fused scene is shown on the goggle display after processing on the portable computing unit. The linear homography transformation between images from the two sensors is precomputed for the mid-to-far scene, which reduces the computational cost of the online calibration of the sensors. The system is implemented in highly optimized C++ code with MMX/SSE and performs real-time registration. Experimental results on real captured video show that the system works very well in both speed and performance.
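Applying a precomputed homography amounts to a fixed per-pixel remap; the sketch below does this with inverse mapping and nearest-neighbor sampling in NumPy, as a hypothetical stand-in for the paper's optimized MMX/SSE implementation.

```python
import numpy as np

def warp_homography(src, H, out_shape):
    """Warp the IR image src onto the EO frame with a precomputed 3x3
    homography H (inverse mapping, nearest-neighbor sampling)."""
    h, w = out_shape
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    u, v, s = np.linalg.inv(H) @ pts           # back-map output pixels
    u = np.rint(u / s).astype(int)             # dehomogenize and round
    v = np.rint(v / s).astype(int)
    valid = (u >= 0) & (u < src.shape[1]) & (v >= 0) & (v < src.shape[0])
    out = np.zeros(h * w, dtype=src.dtype)
    out[valid] = src[v[valid], u[valid]]       # sample inside the source
    return out.reshape(out_shape)
```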