This PDF file contains the front matter associated with SPIE Proceedings Volume 13057, including the Title Page, Copyright information, Table of Contents, and Conference Committee information.
Multisensor Fusion, Multitarget Tracking, and Resource Management I
Modern multitarget trackers must be increasingly self-adaptive to cope with a wide range of detection and clutter contexts. For instance, current naval surveillance systems have to deal with diverse threats, including asymmetric targets engaged in suicide attacks or harassment missions. Moreover, heavy clutter that fluctuates in time and space (solar reflection on the sea, ground returns, etc.) must be handled. Usually, dedicated clutter-map techniques are implemented to adapt tracker parameters to local detection statistics and false alarm rates. In this paper, a self-adaptive tracking concept is proposed. It combines a PHD (Probability Hypothesis Density) filter for clutter suppression with a conventional track-oriented tracker; the first module adapts the operating point of the terminal tracker to low- and medium-clutter situations. Results for the algorithms under study are provided on registered infrared maritime scenes and demonstrate the performance achieved.
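As a concrete illustration of the clutter-map adaptation this paper builds on, the sketch below maintains a per-cell exponential moving average of false-alarm counts and derives a local gating threshold for a downstream tracker. All grid sizes, smoothing factors, and threshold rules are hypothetical, not the paper's.

```python
import numpy as np

class ClutterMap:
    """Per-cell exponential moving average of false-alarm counts.

    A sketch of the classical clutter-map idea; the grid size, smoothing
    factor, and threshold rule here are illustrative only.
    """

    def __init__(self, shape=(32, 32), alpha=0.1):
        self.rate = np.zeros(shape)   # estimated false alarms per scan, per cell
        self.alpha = alpha            # EMA smoothing factor

    def update(self, false_alarm_counts):
        self.rate = (1 - self.alpha) * self.rate + self.alpha * false_alarm_counts

    def gate_threshold(self, base=2.0, scale=1.5):
        # Raise the tracker's association threshold in heavily cluttered cells.
        return base + scale * np.log1p(self.rate)

cmap = ClutterMap()
for scan in range(100):
    counts = np.random.poisson(lam=0.5, size=(32, 32))  # simulated clutter
    cmap.update(counts)
print(cmap.gate_threshold().mean())
```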
Electric pumps are widely used in applications such as sanitation, manufacturing, and agriculture. Electric current supplied to a pump translates into a corresponding flow rate and therefore output pressure; the relationship between a pump's pressure and flow rate is described by its performance curve. This conference paper uses estimation theory and cognitive system techniques to improve the efficiency of electric pumps. Specifically, it applies the perception-action cycle to observe the system states, predict the system behaviour, and then optimize it. The states are estimated from sensor measurements and the system dynamics, and the control system uses these estimates to find the optimal flow rate on the performance curve and adjust the system accordingly. The methodology is validated in simulation: the model represents a sprayer powered by a DC motor, in which the ideal spray angle is maintained based on the distance to the surface. Optimizing electric pump performance reduces energy consumption and fluid usage, which can provide savings in many industries and systems.
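A minimal sketch of the perception-action loop described above, assuming a hypothetical quadratic performance curve and a scalar Kalman filter on the flow rate; all coefficients and noise levels are illustrative stand-ins, not the paper's model.

```python
import numpy as np

# Hypothetical performance curve: pressure p(q) = a - b*q**2 (q = flow rate).
a, b = 10.0, 0.05

def pressure_from_flow(q):
    return a - b * q**2

# Scalar Kalman filter tracking the flow rate from noisy flow measurements.
q_est, P = 5.0, 1.0          # state estimate and variance
Q_proc, R_meas = 0.01, 0.25  # process/measurement noise (assumed)

target_pressure = 8.0
q_opt = np.sqrt((a - target_pressure) / b)  # flow yielding the target pressure

for k in range(50):
    # Perceive: predict (random-walk flow model), then update with a reading.
    P += Q_proc
    z = q_opt + np.random.randn() * np.sqrt(R_meas)  # simulated sensor reading
    K = P / (P + R_meas)
    q_est += K * (z - q_est)
    P *= (1 - K)
    # Act: simple proportional correction of the commanded flow toward optimum.
    command = q_est + 0.5 * (q_opt - q_est)

print(q_est, pressure_from_flow(q_est))
```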
Nonlinear Gaussian filters have traditionally used cubature rules and/or Gaussian quadrature to compute multidimensional expectation integrals recursively, which provide mean and covariance estimates of the state and state error, respectively. Minimal and near-minimal-point filters attain moderate accuracy while avoiding the so-called “curse of dimensionality”, but their accuracy can diverge over time. Recent trends in cubature-based filtering have opted for more evaluation points to increase accuracy at the cost of higher computational overhead, while still avoiding the dreaded curse. These methods use more complex and higher-degree cubature rules. The present work, contrary to recent trends, uses a quadrature method other than that of the Gaussian variety. Double exponential quadrature is used to achieve high levels of relative accuracy with a moderate number of evaluation points, rivalling that of current state-of-the-art Gaussian filters and the best-in-class Gauss-Hermite filter.
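For intuition, here is a minimal double-exponential-type rule for a one-dimensional Gaussian expectation: substituting x = sinh(u) makes the transformed integrand decay doubly exponentially in u, so the uniform trapezoidal rule converges very rapidly. The step size and truncation below are illustrative; the paper's multidimensional construction is more elaborate.

```python
import numpy as np

def gaussian_expectation_de(f, h=0.1, umax=4.0):
    """E[f(X)], X ~ N(0,1), via a double-exponential-type substitution.

    With x = sinh(u), the weight phi(sinh(u)) * cosh(u) decays doubly
    exponentially, so the uniform trapezoidal rule in u converges fast.
    """
    u = np.arange(-umax, umax + h, h)
    x = np.sinh(u)
    w = np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi) * np.cosh(u)  # transformed weight
    return h * np.sum(f(x) * w)

print(gaussian_expectation_de(lambda x: x**2))       # ~1.0 (variance of N(0,1))
print(gaussian_expectation_de(lambda x: np.cos(x)))  # ~exp(-0.5) = 0.6065...
```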
This paper proposes a vision-aided system to control the horizontal velocity of Unmanned Aerial Vehicles (UAVs). It fuses data from an Inertial Measurement Unit (IMU) and an optical flow sensor to measure horizontal velocity. The IMU provides angular rate and acceleration data, while the optical flow sensor provides a two-dimensional incremental displacement of the scene in view. Fusing these complementary data sources facilitates velocity control without Global Positioning System (GPS) dependency. A series of simulations, performed at the low altitudes where the optical flow sensor functions best, validated the system's effectiveness. Results demonstrate that fusing the sensors enables accurate horizontal velocity control, reducing position drift and navigation error compared to using inertial data alone.
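One simple way to fuse the two sources, sketched below, is a complementary filter that blends high-rate inertial propagation with flow-derived velocity (flow rate times altitude). The gains, rates, bias, and noise levels are assumptions; the paper's filter may differ.

```python
import numpy as np

dt, beta = 0.01, 0.02       # sample time and complementary-filter blend factor
v_est = 0.0                 # fused horizontal velocity estimate
v_true, accel_bias = 1.0, 0.05
altitude = 2.0              # low altitude, where optical flow works best

for k in range(1000):
    accel = 0.0 + accel_bias + 0.02 * np.random.randn()   # IMU accel (biased)
    flow = v_true / altitude + 0.05 * np.random.randn()   # optical flow (rad/s)
    v_flow = flow * altitude                              # flow-derived velocity
    # High-rate inertial propagation, slowly corrected by the flow measurement.
    v_est = (1 - beta) * (v_est + accel * dt) + beta * v_flow

print(v_est)  # converges near v_true; drift from the accel bias stays bounded
```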
Centralized fusion ensures minimal information loss and maximizes the effectiveness of state estimation. Statistically, it is the optimal solution among all sensor fusion configurations. In this paper, we introduce a local-sensor-driven asynchronous low-level centralized fusion methodology that seamlessly integrates radar and camera data at the level of detections from each sensor. In a local-sensor-driven asynchronous system, detections from the two sensing modalities, which have different sampling rates, are transmitted to a centralized filter that is updated whenever it receives a measurement. We implemented the proposed algorithm and validated the results using real data from manufacturing and industrial work sites. The data was obtained by Plato System's Argus perception system, which combines high-resolution imaging mm-wave radar with camera sensors to provide indoor and outdoor activity tracking. We further compare the fusion results with vision-only multiple-object tracking (MOT), as well as track-level (track-to-track) fusion.
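The measurement-driven update scheme can be sketched with a toy asynchronous Kalman filter: detections from two sensors with different rates and noise levels are processed in time order, with propagation to each timestamp. All models and numbers below are illustrative, not those of the Argus system.

```python
import numpy as np

# State: [x, vx], constant-velocity model. Both sensors measure position only,
# with different rates and noise levels (all values illustrative).
x, P = np.zeros(2), np.eye(2)
q = 0.1
H = np.array([[1.0, 0.0]])
R = {"radar": 0.05, "camera": 0.5}

# Asynchronous detection stream: (timestamp, sensor, measurement), time-sorted.
meas = sorted(
    [(0.1 * k, "radar", 0.1 * k + np.random.randn() * 0.2) for k in range(50)] +
    [(0.33 * k, "camera", 0.33 * k + np.random.randn() * 0.7) for k in range(15)]
)

t_prev = 0.0
for t, sensor, z in meas:
    dt = t - t_prev
    t_prev = t
    F = np.array([[1.0, dt], [0.0, 1.0]])
    Qk = q * np.array([[dt**3 / 3, dt**2 / 2], [dt**2 / 2, dt]])
    x, P = F @ x, F @ P @ F.T + Qk                     # propagate to time t
    S = H @ P @ H.T + R[sensor]                        # update with this sensor
    K = P @ H.T / S
    x = x + (K * (z - H @ x)).ravel()
    P = P - K @ H @ P

print(x)  # fused position/velocity after the asynchronous stream
```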
Multisensor Fusion, Multitarget Tracking, and Resource Management II
When searching for a target whose state is unknown, it is desirable to implement an appropriate search method to maximise efficiency through the minimisation of an associated cost function. The posterior distribution over the target state returned by Bayesian search provides just such a function. Nevertheless, finding the best algorithm for a given task is often non-trivial; a common approach is to build a model that accurately represents the scenario and to compare the efficacy of competing algorithms. This requires a toolkit that is easy to adapt and is able to demonstrate a range of sensor characteristics, target behaviours and search schemes. This paper shows how Stone Soup, an open source state estimation and tracking framework, can be an effective tool for Bayesian search. It demonstrates how user-defined search scenarios can be incorporated into Stone Soup's sensor management capability to model Bayesian search algorithms and compare them against heuristic methods. Several examples are provided to demonstrate this. The benefit of using Stone Soup is that the implementer of Bayesian search need not exert significant energy understanding or reinventing algorithms for modelling all aspects of sensor management. Instead, they can focus on their area of expertise, building up an appropriate model, and use the relevant tools in Stone Soup to implement the search algorithms. This paper lays the foundations for more complex search scenarios to be modelled using Stone Soup, offering more realism to the user.
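The core Bayesian-search update is compact enough to sketch directly, outside Stone Soup, with a hypothetical cell count and detection probability: a negative look at a cell scales its posterior mass by (1 - Pd), and the next cell is chosen greedily by detection probability mass.

```python
import numpy as np

rng = np.random.default_rng(0)
n, pd = 25, 0.8                      # number of cells, detection probability
posterior = np.full(n, 1.0 / n)     # uniform prior over target location
target = rng.integers(n)

for step in range(100):
    cell = np.argmax(pd * posterior)          # greedy: maximize P(detect now)
    if cell == target and rng.random() < pd:
        print(f"found in cell {cell} after {step + 1} looks")
        break
    # Bayes update after a negative look: scale searched cell by (1 - pd).
    posterior[cell] *= (1 - pd)
    posterior /= posterior.sum()
else:
    print("not found within 100 looks")
```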
We derive a new theory of Bayesian particle flow that bulletproofs the algorithm against stiffness. Many researchers have attempted to apply particle flow filters, but sometimes they have gotten disappointing results owing to stiffness. Such researchers may be experts in fancy estimation algorithms, but they are rarely experts in mitigating stiffness for Itô stochastic differential equations. We solve this problem by bulletproofing the Bayesian particle flow algorithm itself against stiffness. The new theory allows us to avoid fancy stiff ODE solvers that require large amounts of computer run time, and which are not parallelizable on GPUs. We also derive a very simple upper bound on the stiffness of our Gromov particle flow that gives us insight into the root cause of the problem. This shows that the stiffness of the flow is infinite if we do not directly measure all components of the state vector. The new theory fixes this problem completely. This paper is for normal engineers who do not have Itô calculus for breakfast.
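For orientation, the sketch below implements the simpler linear-Gaussian "exact" particle flow dx/dλ = A(λ)x + b(λ), a relative of the Gromov flow treated in the paper, with naive Euler steps, which is precisely the regime where stiffness forces small step sizes. Dimensions, noise values, and step count are illustrative assumptions.

```python
import numpy as np

# Exact Daum-Huang flow for a linear-Gaussian update (illustrative 2D case):
# dx/dlam = A(lam) x + b(lam), lam from 0 (prior) to 1 (posterior).
P = np.array([[1.0, 0.3], [0.3, 1.0]])   # prior covariance
xbar = np.array([0.0, 0.0])              # prior mean
H = np.array([[1.0, 0.0]])               # only x[0] is measured
R = np.array([[0.1]])
z = np.array([1.5])

particles = np.random.multivariate_normal(xbar, P, size=500)
nsteps = 200                              # naive Euler: stiffness demands small steps
lam, dlam = 0.0, 1.0 / nsteps
for _ in range(nsteps):
    S = lam * H @ P @ H.T + R
    A = -0.5 * P @ H.T @ np.linalg.solve(S, H)
    b = (np.eye(2) + 2 * lam * A) @ (
        (np.eye(2) + lam * A) @ P @ H.T @ np.linalg.solve(R, z) + A @ xbar)
    particles += dlam * (particles @ A.T + b)
    lam += dlam

print(particles.mean(axis=0))  # approaches the Kalman posterior mean
```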
In this work, we consider a discrete-time scalar Wiener process driven by a zero-mean Gaussian white noise with unknown variance, observed with an additive Gaussian white measurement noise, also with unknown variance. The estimators of these noise variances are obtained using Maximum Likelihood Estimation (MLE). We demonstrate that the Log-Likelihood Function (LLF) of the Kalman filter gain and innovation variance in this system can be expressed as a quadratic function of the measurements. This quadratic formulation of the LLF, derived from the measurements' probability density function (pdf) as a product of the pdf of the innovations, allows for an analytical expression of the LLF in terms of the filter gain and innovation variance. This approach facilitates the evaluation of the Cramér-Rao Lower Bound (CRLB) and makes it possible to confirm the statistical efficiency of the MLE for the filter gain and innovation variance, i.e., achieving the CRLB and thus demonstrating optimality. The practical application of this methodology is shown for an inertial navigation sensor, characterized by a Wiener process drift and measurement noise.
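A numerical counterpart to the analytical development: the innovation-form log-likelihood of the scalar Kalman filter evaluated on a (q, r) grid, with the maximizer taken as the MLE. The grid search below is only an illustration; the paper works analytically in terms of the filter gain and innovation variance.

```python
import numpy as np

rng = np.random.default_rng(1)
q_true, r_true, n = 0.04, 0.25, 2000
x = np.cumsum(np.sqrt(q_true) * rng.standard_normal(n))   # scalar Wiener process
z = x + np.sqrt(r_true) * rng.standard_normal(n)          # noisy observations

def llf(q, r):
    """Innovation-form log-likelihood of a scalar Kalman filter."""
    xk, P, total = 0.0, 1.0, 0.0
    for zk in z:
        P += q                       # predict (random-walk dynamics)
        S = P + r                    # innovation variance
        nu = zk - xk                 # innovation
        total += -0.5 * (np.log(2 * np.pi * S) + nu**2 / S)
        K = P / S
        xk += K * nu
        P *= (1 - K)
    return total

qs = np.linspace(0.01, 0.1, 19)
rs = np.linspace(0.1, 0.5, 17)
grid = np.array([[llf(q, r) for r in rs] for q in qs])
i, j = np.unravel_index(grid.argmax(), grid.shape)
print(qs[i], rs[j])  # MLE close to (q_true, r_true) for long records
```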
Information Fusion Methodologies and Applications I
DoD agencies produce a deluge of heterogeneous data from arrays of multimodal sensor sources. Ideally, these collects contribute to global and local situational awareness, supporting decision speed. Amid ever-increasing adversarial deception capabilities, the effective management, orchestration, and interpretation of this data for (near) real-time actionable processes has become obscured, resulting in imminent costs and loss of life. A key factor contributing to these mission needs is the lack of exploitation of the degree-of-freedom spaces that upstream multimodal sensor data and their fused manifolds possess. Within these structures are rich, alternative sources of mathematically rigorous organization and data fusion techniques from which a paradigm shift in local or global situational awareness could be instantiated. This research expands upon and validates a TDA AI/ML network design (U.S. Patent Pending No. 63/499,338) presented at the 2023 SPIE DCS conference. Modified custom approaches, involving the data fusion of its three modalities (acoustic, electro-optical, and infrared), and testing results for predictive automatic target recognition are presented, along with several mathematical generalizations and clustering capacities.
Based upon bench and field testing of five distinct sensors as candidates for a large-area ground sensor network, our team has determined cooperative sensor fusion modes of operation for infrasound, audible acoustic, fence vibration, visible cameras, and seismic geophones. The goal has been efficient coverage of large areas to provide alerts for poaching-activity hot spots, so that UAVs (Unoccupied Aerial Vehicles) can respond rapidly and prevent poaching without the need for constant patrolling. Prior work by our research team focused on evaluating sensing modes with a range of spatial, temporal, and spectral capabilities (satellite, aerial, ground acoustic, electro-optical/infrared, seismic, and fence vibration). The focus has been the construction of low-cost sensors whose fused outputs provide situational awareness via web interfaces. These systems, jointly developed by Embry-Riddle Aeronautical University and California State University, Chico, have been tested at the Chico State University farm; top-down sensor fusion methods such as deep learning (to detect or classify animals and threats), as well as bottom-up image and signal processing, have been developed to create a fog and edge computing architecture. Modalities that specifically target elephant communication through infrasound and seismic activity are being investigated to enhance overall animal detection, tracking, and behavioral assessment. The goal is to evaluate effectiveness prior to on-site testing at a game park in South Africa, and to determine whether the methods can scale to areas as large as Rietvlei, Medikwe, and Coleridge in South Africa. Preliminary results from fog and edge node testing of visual and acoustic sensor fusion, with artificial emulation of elephant vocalizations, infrasound rumbles, and stimuli typical of human presence (vehicles and voices), are provided, along with their promise to drive a heat map showing where park rangers should respond with highest priority.
This study explores the increasingly complex domain of drone behavior and management within urban airspaces, focusing on Europe's U-space framework. It highlights the critical evolution from basic drone detection to the identification of abnormal drone behaviors in the challenging and variable environments of city skies, and it examines the advancement of drone movement management through data fusion techniques that integrate diverse sensor data to distinguish between normal and abnormal drone activities. The study identifies a notable gap in current methodologies for detecting abnormal drone behaviors, particularly in the specialized application of data fusion techniques, and introduces a comprehensive approach built on Dynamic Bayesian Networks (DBNs). The integration of advanced data fusion techniques with DBNs is shown to significantly enhance the capability to identify and analyze abnormal drone activities accurately. This dual focus on DBNs and data fusion methodologies not only addresses the identified research gap but also establishes a framework for future advancements in drone behavior monitoring. Through this work, we demonstrate the effectiveness of DBNs in drone management research, setting a new standard for precision and reliability in the field.
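A minimal special case of the DBN machinery can illustrate the idea: a hidden Markov model forward filter scores observed maneuver sequences, and a low per-step log-likelihood flags abnormal behavior. The states, transition matrix, and emission matrix below are hypothetical.

```python
import numpy as np

# Hypothetical 3-state behavior model: 0=transit, 1=hover, 2=evasive.
T = np.array([[0.90, 0.08, 0.02],     # state transition probabilities
              [0.10, 0.85, 0.05],
              [0.05, 0.15, 0.80]])
# Emission probabilities over discretized speed: 0=slow, 1=medium, 2=fast.
E = np.array([[0.1, 0.3, 0.6],
              [0.8, 0.15, 0.05],
              [0.2, 0.3, 0.5]])

def per_step_loglik(obs):
    """Average log-likelihood per step from the HMM forward recursion."""
    alpha = np.full(3, 1 / 3) * E[:, obs[0]]
    ll = np.log(alpha.sum())
    alpha /= alpha.sum()
    for o in obs[1:]:
        alpha = (T.T @ alpha) * E[:, o]
        ll += np.log(alpha.sum())
        alpha /= alpha.sum()
    return ll / len(obs)

normal = [2, 2, 2, 1, 0, 0, 0, 1, 2]
erratic = [0, 2, 0, 2, 0, 2, 0, 2, 0]
for name, seq in [("normal", normal), ("erratic", erratic)]:
    print(name, per_step_loglik(seq))  # flag sequences below a threshold
```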
Strategic competition with near-peer threats such as Russia and China has replaced the focus on non-state extremist threats. Competition "success" is measured by domestic stability and strong alliances, networks, and partners; the goal is greater influence in order to shape international norms and institutions. In the current information-saturated age, manipulation of truth is common, and propaganda can be used to weaponize information in order to compete. Diplomatic and economic strategies (e.g., the Belt and Road Initiative) are also important. Competitive advantage requires national ambition, unified identity, will, and effective institutions. Analytics and modeling for strategic competition must characterize competitors across multiple information vectors (culture, cyber, diplomatic, social-cyber, economic, etc.) from both "emic" (first-person) and "etic" (third-person) perspectives, leveraging expert-AI to contextualize information for situation awareness, planning, and developing tactics for behavior change without war. This paper highlights the state of the knowledge and art regarding behaviors and assessment of strategic competition, spotlights gaps and issues (e.g., inappropriate or outdated assumptions), and identifies future research areas.
Information Fusion Methodologies and Applications II
In recent years, the concept of the Digital Twin (DT) has emerged to support the digital engineering of physical systems through the processing and analysis of device components, data flows, and networked systems. As an element of systems engineering, the DT serves as a method of life-cycle management for operations and maintenance through monitoring, diagnostics, and prognostics. Key to DT methods is the use of distributed sensors to monitor the system and determine its functioning with the use of physical design information, such as a series of distributed edge sensors and the layout of a physical electrical grid. While industries like industrial manufacturing, electrical power, space systems, and healthcare maintenance have embraced DT, other groups are utilizing the concept for analysis. Given that a large number of sensors are used to gather data on the health of a system, it is natural that data fusion, estimation theory, and signal processing support digital twin fusion (DTwF); but there are a variety of challenges, such as big data, distributed fusion, and edge analytics. The panel will discuss areas in which data and information fusion techniques enhance DT applications.
There are many examples, methods, and processes showing the importance of sensor, data, and information fusion. However, there is a need to determine the value added of information fusion in the context of data (e.g., multimodal, amount, and resolution), software (e.g., artificial intelligence/machine learning), hardware (e.g., size, weight, and performance), as well as architectures (e.g., cloud, fog, and edge computing). This paper utilizes the analytical hierarchy process (AHP) to determine the pairwise performance needs among different human-machine information fusion system tradeoffs to show the value added from sensor fusion. The paper examines the potential value added by coordinating deep, active, and reinforcement learning for information fusion systems. Among the information metrics, the combined methods of artificial intelligence learning methods highlight user requirements for accuracy, confidence, and timeliness.
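The AHP step itself is standard and can be sketched directly: the principal eigenvector of a pairwise-comparison matrix yields the priority weights, and the consistency ratio checks the judgments. The matrix below, over (accuracy, confidence, timeliness), is illustrative and not the paper's elicited values.

```python
import numpy as np

# Hypothetical pairwise-comparison matrix over (accuracy, confidence, timeliness).
A = np.array([[1.0, 3.0, 5.0],
              [1/3, 1.0, 2.0],
              [1/5, 1/2, 1.0]])

vals, vecs = np.linalg.eig(A)
k = np.argmax(vals.real)
w = np.abs(vecs[:, k].real)
w /= w.sum()                       # priority weights

lam_max = vals[k].real
n = A.shape[0]
ci = (lam_max - n) / (n - 1)       # consistency index
ri = 0.58                          # Saaty random index for n = 3
print("weights:", w, "CR:", ci / ri)  # CR < 0.1 is conventionally acceptable
```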
Signal and Image Processing, and Information Fusion Applications I
Compressive sensing of complex stepped frequency radar returns is explored in this paper. The goal is to classify targets after reconstructing their sparse high range resolution profile from complex frequency compressive measurements. Given the limited number of scatterers along an aircraft, the target range profile (or impulse response) is considered sparse in the range or time domains. The paper focuses on 1) comparing different methods (or strategies) for compressive sensing of complex signals, 2) exploring feature selection algorithms as compressive sensing tools, and 3) exploring the benefits of denoising compressively sensed radar returns corrupted with additive noise. A synthetic radar target of five different scatterers is first used to assess the performance of different sensing strategies. Real radar returns of commercial aircraft models are used to assess the performance of a target recognition system that utilizes compressive sensing with or without denoising, and with or without feature selection.
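One common strategy for this reconstruction, sketched below under simplifying assumptions, is orthogonal matching pursuit on a partial Fourier (stepped-frequency) sensing matrix; the problem sizes, sparsity, and noise level are illustrative, and this is not necessarily one of the paper's specific sensing strategies.

```python
import numpy as np

rng = np.random.default_rng(2)
N, M, K = 128, 40, 5                       # range bins, measurements, scatterers

# Stepped-frequency sensing: each row of F is one measured frequency.
F = np.exp(-2j * np.pi * np.outer(rng.choice(N, M, replace=False),
                                  np.arange(N)) / N) / np.sqrt(N)
x = np.zeros(N, complex)
x[rng.choice(N, K, replace=False)] = rng.standard_normal(K) + 1j * rng.standard_normal(K)
y = F @ x + 0.01 * (rng.standard_normal(M) + 1j * rng.standard_normal(M))

# Orthogonal matching pursuit for the complex-valued sparse range profile.
support, r = [], y.copy()
for _ in range(K):
    support.append(int(np.argmax(np.abs(F.conj().T @ r))))  # best-matching bin
    coef, *_ = np.linalg.lstsq(F[:, support], y, rcond=None)
    r = y - F[:, support] @ coef                             # update residual

x_hat = np.zeros(N, complex)
x_hat[support] = coef
print(sorted(support), np.flatnonzero(x))  # recovered vs true scatterer bins
```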
This paper introduces an innovative technique for object recognition in environments characterized by high-scattering conditions, leveraging radio and microwave frequencies. The method integrates the angular spectrum of diffracted and scattered components into Fresnel Coherent Diffraction Imaging, enabling real-time digital recording of distribution patterns. Traditional approaches often grapple with limitations imposed by minimum pixel size, which this technique overcomes through the application of digital holography. Here, imaging resolution is dictated by minimum phase measurements and signal sampling steps. Long-wavelength radio frequency signals, recognized for their exceptional penetration capabilities, are harnessed for high-resolution detection and recognition. The diffraction components offer valuable insights into an object's dielectric properties, size, and shape. Importantly, this information remains invariant to the object's spatial position when radio and microwave wavelengths significantly exceed the object's size. The integration of monopulse multi-channel signal processing with Software Defined Radios plays a pivotal role in substantially reducing computational complexity for object recognition. The application of simple one-step algorithms enhances efficiency in decision-making processes. This proposed technique marks a promising advancement in overcoming challenges associated with object recognition in high-scattering conditions, offering improved resolution and computational efficiency.
Combining data from multiple sensors to improve the overall robustness and reliability of a classification system has become crucial in many applications, from military surveillance and decision support, to autonomous driving, robotics, and medical imaging. This so-called sensor fusion is especially interesting for fine-grained target classification, in which very specific sub-categories (e.g. ship types) need to be distinguished, a task that can be challenging with data from a single modality. Typical modalities are electro-optical (EO) image sensors, that can provide rich visual details of an object of interest, and radar, that can yield additional spatial information. Defined by the approach used to combine data from these sensors, several fusion techniques exist. For example, late fusion can merge class probabilities outputted by separate processing pipelines dedicated to each of the individual sensor data. In particular, deep learning (DL) has been widely leveraged for EO image analysis, but typically requires a lot of data to adapt to the nuances of a fine-grained classification task. Recent advances in DL on foundation models have shown a high potential when dealing with in-domain data scarcity, especially in combination with few-shot learning. This paper presents a framework to effectively combine EO and radar sensor data, and shows how this method outperforms stand-alone single sensor methods for fine-grained target classification. We adopt a strong few-shot image classification baseline based on foundation models, which robustly handles the lack of in-domain data and exploits rich visual features. In addition, we investigate a weighted and a Bayesian fusion approach to combine target class probabilities outputted by the image classification model and radar kinematic features. Experiments with data acquired in a measurement campaign at the port of Rotterdam show that our fusion method improves on the classification performance of individual modalities.
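The two late-fusion rules investigated can be stated in a few lines; the class posteriors below are stand-ins, and the Bayesian (product) rule assumes conditional independence of the sensors and a uniform class prior.

```python
import numpy as np

# Per-sensor class posteriors for one target (classes: tug, ferry, cargo).
p_eo = np.array([0.55, 0.30, 0.15])      # from the EO few-shot classifier
p_radar = np.array([0.40, 0.20, 0.40])   # from radar kinematic features

# Weighted fusion: convex combination, weight reflecting sensor confidence.
w = 0.7
p_weighted = w * p_eo + (1 - w) * p_radar

# Bayesian (product) fusion: assumes conditional independence, uniform prior.
p_bayes = p_eo * p_radar
p_bayes /= p_bayes.sum()

print("weighted:", p_weighted, "-> class", p_weighted.argmax())
print("bayesian:", p_bayes, "-> class", p_bayes.argmax())
```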
The development of Activity Based Intelligence (ABI) requires an understanding of individual actors’ intents, their interactions with other entities in the environment, and how these interactions facilitate accomplishment of their goals. Statistical modelling alone is insufficient for such analyses, mandating higher-level representations such as ontology to capture important relationships. However, constructing ontologies for ABI, ensuring they remain grounded to real-world entities, and maintaining their applicability to downstream tasks requires substantial hand-tooling by domain experts. In this paper, we propose the use of a Large Language Model (LLM) to bootstrap a grounding for such an ontology. Subsequently, we demonstrate that the experience encoded within the weights of a pre-trained LLM can be used in a zero-shot manner to provide a model of normalcy, enabling ABI analysis at the semantics level, agnostic to the precise coordinate data. This is accomplished through a sequence of two transformations, made upon a kinematic track, toward natural language narratives suitable for LLM input. The first transformation generates an abstraction of the low-level kinematic track, embedding it within a knowledge graph using a domain-specific ABI ontology. Secondly, we employ a template-driven narrative generation process to form natural language descriptions of behavior. Computation of the LLM perplexity score upon these narratives achieves grounding of the ontology. This use does not rely on any prompt engineering. In characterizing the perplexity score for any given track, we observe significant variability given chosen parameters such as sentence verbosity, attribute count, clause ordering, and so on. Consequently, we propose an approach that considers multiple generated narratives for an individual track and the distribution of perplexity scores for downstream applications. We demonstrate the successful application of this methodology against a semantic track association task. Our subsequent analysis establishes how such an approach can be used to augment existing kinematics-based association algorithms.
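A hedged sketch of the perplexity-scoring step using the Hugging Face transformers API; the paper does not name its LLM, so GPT-2 serves purely as an illustrative stand-in, and the narratives below are invented.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def perplexity(text: str) -> float:
    enc = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])  # mean token cross-entropy
    return float(torch.exp(out.loss))

# Several narrative renderings of one track; downstream use should consider
# the distribution of scores across renderings, as the paper proposes.
narratives = [
    "The vessel departed the harbor, loitered near the channel, then docked.",
    "The vessel docked, then loitered near the channel, then departed the harbor.",
]
print([perplexity(n) for n in narratives])  # higher = less "normal" narrative
```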
Signal and Image Processing, and Information Fusion Applications II
StrokeChange is a proof-of-concept system that uses computer vision and machine learning to monitor stroke patients in situ, with the goal of tracking patient progress and changes. The system uses advanced algorithms and deep learning techniques to analyze and interpret the patient's facial data in real time. StrokeChange is designed to assist healthcare professionals in remotely monitoring the patient's condition and intervening promptly if necessary. The system can detect changes in a patient's facial expressions that may indicate changes in their condition, such as the onset of new symptoms or the worsening of existing ones, and it has the potential to improve stroke care affordably, giving patients greater independence while reducing the burden on healthcare systems. Various approaches to this problem, including detection, classification, and regression, are implemented and compared. A dataset was curated for training and testing StrokeChange. When evaluated on test data, StrokeChange achieved its best results with a detection-and-regression combination: 98% accuracy on facial-area detection (eye/mouth detector), a 1.255 mean average loss, a "perceived accuracy" of 80.3% on regression of eye patterns, and 83.3% on regression of mouth patterns. The StrokeChange model is deployed in an Android application as a proof of concept. Results, including testing and application outcomes, are demonstrated, along with challenging cases, open problems, and areas of future work.
Face recognition (FR) technology has gained widespread popularity due to its diverse utility and broad range of applications. It is extensively used in various domains, including information security, access control, and surveillance. Achieving better real-time face detection (FD) performance can be challenging, especially when running multiple algorithms that require both high accuracy and swift execution (high frame rate) on embedded Systems on Chip (SoCs). In this study, a comprehensive methodology and system implementation are proposed for concurrent face detection, landmark extraction, quality assessment, and face recognition directly at the edge, without relying on external resources. The approach integrates cutting-edge techniques, including the utilization of the Extended YOLO model for face detection and the ArcFace model for feature extraction, optimized for deployment on embedded devices. By leveraging these models alongside a dedicated recognition database and efficient software architecture, the system achieves remarkable accuracy and real-time processing capabilities. Critical aspects of the methodology involve tailoring model optimization for SoC environments, specifically focusing on the YOLO face detection model and the ArcFace feature extraction model. These optimizations aim to enhance computational efficiency while preserving accuracy. Furthermore, efficient software architecture plays a crucial role, allowing for the seamless integration of multiple components on embedded devices. Optimization techniques are employed to minimize overhead and maximize performance, ensuring real-time processing capabilities. By offering a detailed framework and implementation strategy, this research contributes significantly to the development of a high-performance, highly accurate real-time face recognition system optimized for embedded devices.
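The final matching stage is model-agnostic and easy to sketch: cosine similarity between an ArcFace-style probe embedding and an enrollment gallery, with an operating threshold. The embeddings, identity names, and threshold below are stand-ins.

```python
import numpy as np

def normalize(v):
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

# Hypothetical 512-D embeddings: an enrollment gallery and one probe face.
rng = np.random.default_rng(3)
gallery = normalize(rng.standard_normal((100, 512)))   # enrolled identities
names = [f"id_{i}" for i in range(100)]
probe = normalize(gallery[42] + 0.3 * rng.standard_normal(512))  # noisy re-capture

sims = gallery @ probe                 # cosine similarity (unit-norm vectors)
best = int(sims.argmax())
threshold = 0.5                        # operating point; tuned on validation data
print(names[best] if sims[best] > threshold else "unknown", float(sims[best]))
```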
Attritable sensor systems with microphones are ubiquitous and can detect a variety of moving targets from ground vehicles to unmanned autonomous systems (UAS), aircraft, and spacecraft. Smartphone platforms have open data collection interfaces and can use cellular, Wi-Fi, and other public communications to relay data. In addition, acoustic signatures are more difficult to mask than transmissions in radio-frequency bands, making sound a passive complement to RF in an early warning system. This presentation discusses the development, deployment, and ongoing AI/ML work behind RedVox, a smartphone app available through the Apple App Store and Google Play Store. RedVox records data using the smartphone’s onboard sensors, including the accelerometer, gyroscope, magnetometer, barometer, and microphone; these data packets can be streamed to interested parties in real-time and analyzed online using RedVox’s open-source software tools. One of RedVox’s unique features is the ability to collect infrasonic data at frequencies below human hearing, which has been shown to detect natural disasters, explosions, and motorized vehicles. The RedVox app is rapidly deployable at scale in regions of interest. Two current focus areas are Guam and Coconut Island in Oahu, where networks of smartphones can perform edge processing and dispersed sensing of airborne and maritime targets. The RedVox ecosystem has matured over the past decade across diverse environments and currently provides advanced edge and cloud analytics. Further development could lead to a lightweight machine learning model that flags anomalous entities for human review, offering a resilient ad hoc sensor network for defense or emergency response applications.
A human-centric navigation system has been developed with a focus on supporting blind users of prosthetic vision devices by providing these users the ability to navigate their environment independently. The system maps the environment and localizes the user while incorporating context-enhanced information about the scene generated by AI-based methods. A deep learning semantic segmentation engine is utilized to process information from RGB imagery and incorporates depth imaging sensors to produce semantic mappings of the scene. The heightened level of environmental interpretability provided by semantic mapping enables high-level human-computer interactions with the user, such as queries for guidance to specific objects or features within the environment. Unlike traditional sensor-based mapping frameworks that represent the environment as simple occupied/unoccupied space, our semantic mapping approach interprets the identity of occupied space as specific types of objects and their association with region types (e.g., static, movable, dynamic). The semantic segmentation also enables contextually-aware scene processing, which our framework leverages for robust ground estimation and tracking with fused depth data to distinguish above-ground obstacles. To help address the highly limited vision performance of current prosthetic vision technology, the processed depth information is used to generate augmented vision feedback for the prosthetic vision user by filtering out ground and background scene elements and highlighting near-field obstacles to aid in visual identification and avoidance of obstacles while navigating. Supplemental user feedback is provided via a directional haptic headband and voice-based notifications paired with spatial sound for path following along autonomously computed trajectories towards desired destinations. An optimized architecture enables real-time performance on a wearable embedded processing platform, which provides high-fidelity update rates for time-critical tasks such as localization and user feedback while decoupling tasks with heavy computational loads. Substantial speed-up is thereby achieved compared to the conventional baseline implementation.
Vision Transformers (ViTs) have demonstrated remarkable performance in various visual tasks, but they suffer from expensive computational and memory challenges, which hinder their practical application in the real world. Model quantization methods reduce the model computation and memory requirements through low-bit representations. Knowledge distillation is used to guide the quantized student network to imitate the performance of its full-precision counterpart teacher network. However, for ultra-low bit quantization, the student networks experience a noticeable performance drop. This is primarily due to the limited learning capacity of the smaller network to capture the knowledge of the full-precision teacher, especially when the representation gaps between the student and the teacher networks are significant. In this paper, we introduce a multi-step knowledge distillation approach, utilizing intermediate-quantized networks with varying bit precision. This multi-step knowledge distillation approach enables an ultra-low bit quantized student network to effectively bridge the gap with the teacher network by gradually reducing the model's bit representation. We progressively teach each intermediate teacher assistant (TA) network to learn by distilling the knowledge from higher-bit quantized teacher networks from the previous step. The target student network learns from the combined knowledge of the teacher assistants and the full-precision teacher network, resulting in improved learning capacity even when faced with significant knowledge gaps. We evaluate our methods using the DeiT vision transformer for both ground level and aerial image classification tasks.
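One step of the distillation chain can be sketched as the usual soft-target loss: KL divergence at temperature T against the previous (higher-bit) network, blended with hard-label cross-entropy. The logits and labels below are random stand-ins, and the loss weights are assumptions, not the paper's settings.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Soft-target KL (temperature T) blended with hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * T * T                      # standard T^2 gradient rescaling
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

# One step in the chain: an 8-bit TA distills into a 4-bit student, which
# in the next step would serve as teacher for a 2-bit student, and so on.
student_logits = torch.randn(32, 1000, requires_grad=True)  # stand-in outputs
teacher_logits = torch.randn(32, 1000)
labels = torch.randint(0, 1000, (32,))
print(kd_loss(student_logits, teacher_logits, labels))
```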
A topic model is a probabilistic method for data analysis and characterization that provides insight into the topics that comprise each document in a corpus, where each topic is described by an associated word distribution. A dynamic topic model is an extension of this model that can be applied to time series data. These models have typically been applied to the text domain, where the concepts of tokens and words are well defined. Applying these models to the image domain is non-obvious because the concepts of tokens and words need to be hand-crafted. In this work, we apply the dynamic topic model to a sequence of images to provide insight into their dynamic nature, e.g., by helping to identify interesting locations in time that correspond to changes in operating conditions. We apply this model to images from the KITTI dataset and show that the model captures the evolving nature of these topics over time.
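A sketch of the hand-crafted tokenization the text refers to: cluster patch descriptors into a visual vocabulary, histogram each image into bag-of-visual-words counts, and fit a topic model (here scikit-learn's static LDA, as a stand-in for the dynamic topic model). All data below are random placeholders.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import LatentDirichletAllocation

rng = np.random.default_rng(4)
# Stand-in patch descriptors (e.g., SIFT-like), grouped per image.
patches_per_image = [rng.standard_normal((200, 64)) for _ in range(30)]

# 1) Build a visual vocabulary by clustering all patch descriptors.
vocab = KMeans(n_clusters=50, n_init=3, random_state=0).fit(
    np.vstack(patches_per_image))

# 2) Tokenize: each image becomes a bag-of-visual-words histogram.
def histogram(patches):
    words = vocab.predict(patches)
    return np.bincount(words, minlength=50)

counts = np.array([histogram(p) for p in patches_per_image])

# 3) Fit a topic model on the histograms (per time window, for dynamics).
lda = LatentDirichletAllocation(n_components=5, random_state=0).fit(counts)
print(lda.transform(counts[:3]))   # per-image topic mixtures
```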
Signal and Image Processing, and Information Fusion Applications III
In order to protect and secure critical infrastructures, it is important to advance technologies for the automated detection and mapping of crucial components like the pipes of fire suppression systems. This work explores instance segmentation of such pipes in multimodal image media. Instance segmentation has been significantly enhanced by machine learning techniques, but it faces complications when the RGB training set used is limited and lacks diversity. To improve training efficiency, we study the influence of enhancing the training set with thermal infrared images on the model's performance. In our study, we harnessed both RGB and infrared images captured from the same location. We employed Mask R-CNN with transfer learning from the COCO weights and trained multiple neural networks on different training-set combinations of RGB and thermal infrared images. To further enlarge each training set, we applied different augmentation methods during training. Subsequently, we conducted fine-tuning and optimization procedures for the Mask R-CNN training and assessed the quality of the pipe instance segmentation of the resulting models on test sets containing RGB and infrared images. Using the Jaccard score, we provide a quantitative measure of pipe segmentation. The results show that adding a comparatively low number of infrared images to the training process not only improves detection performance on infrared data but also improves the model's performance on RGB data. A state-of-the-art comparison shows that the detection quality of our models is comparable with that of Ultralytics' YOLOv8 and Meta's SAM, both of which were released more recently.
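The Jaccard score used for evaluation is mask IoU; a short sketch on synthetic masks:

```python
import numpy as np

def jaccard(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Jaccard index (IoU) between two boolean segmentation masks."""
    inter = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return inter / union if union else 1.0

pred = np.zeros((64, 64), bool); pred[10:40, 20:30] = True    # predicted pipe
truth = np.zeros((64, 64), bool); truth[12:42, 20:31] = True  # annotated pipe
print(jaccard(pred, truth))
```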
A physics-based approach to detecting and classifying surface and sub-surface objects in longwave (thermal) infrared imagery is described. The main premise is to associate a heat capacity and effective depth with each voxel (or segment) in the image. An energy budget for the voxel then leads to a linear, first-order differential equation, in which the temperature is forced by fluxes in and out of the voxel (shortwave solar radiation, longwave radiation, sensible and latent turbulent heat exchanges with the atmosphere), while relaxing towards an equilibrium temperature determined by a weighted mean of the air and ground temperatures. Next, it is shown how this simplified model can be incorporated into maximum-likelihood and Bayesian classifiers to distinguish buried objects from their surroundings. In particular, a version of the Bayesian classifier is formulated that leverages the differing amplitude and phase response of a buried object over the diurnal cycle. These classifiers will be tested on experimental data in future work.
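The voxel model is a linear first-order relaxation toward the equilibrium temperature; under sinusoidal diurnal forcing its steady-state response has amplitude A/sqrt(1 + (wt)^2) and phase lag arctan(wt)/w, the amplitude and phase features the Bayesian classifier leverages (w the diurnal angular frequency, t the voxel time constant). The sketch below, with illustrative numbers, contrasts a fast-responding background voxel with a slower "buried object" voxel.

```python
import numpy as np

# dT/dt = -(T - Teq(t)) / tau, with a sinusoidal diurnal equilibrium temperature.
day = 86400.0
omega = 2 * np.pi / day
A_eq, T_mean = 10.0, 290.0            # diurnal swing (K) and mean (K); illustrative

def simulate(tau, n_days=4, dt=60.0):
    t = np.arange(0.0, n_days * day, dt)
    T = np.full(t.size, T_mean)
    for k in range(1, t.size):
        Teq = T_mean + A_eq * np.cos(omega * t[k - 1])
        T[k] = T[k - 1] - dt * (T[k - 1] - Teq) / tau
    return t, T

# A buried object changes the voxel's effective time constant tau, which shows
# up as a damped amplitude and a phase lag over the diurnal cycle.
for label, tau in [("background", 2e3), ("buried object", 2e4)]:
    t, T = simulate(tau)
    last_day = T[t >= t[-1] - day]
    amp_sim = 0.5 * (last_day.max() - last_day.min())      # simulated amplitude
    amp_an = A_eq / np.sqrt(1 + (omega * tau) ** 2)        # analytic amplitude
    lag_h = np.arctan(omega * tau) / omega / 3600.0        # analytic lag (hours)
    print(label, round(amp_sim, 2), round(amp_an, 2), round(lag_h, 2))
```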
Gas pipelines are critical for transporting vast quantities of natural gas across regions. Third-party damage, such as unauthorized excavation activities, is a primary cause of accidents and damage to these pipelines, leading to significant economic losses, environmental harm, and potential threats to human safety. The importance of detecting third-party damage as early as possible cannot be overstated, as it allows pipeline operators to take timely actions to prevent damage. Recent technological advancements, particularly the development of fiber optic sensors, offer promising solutions for real-time monitoring and early warning systems against such damage. Analyzing incidents related to third-party damage presents significant challenges due to its non-adherence to physical laws, in contrast to phenomena like corrosion, temperature, and pressure changes. Traditional analytical or empirical models fall short of detecting such damage effectively. However, deep learning techniques have demonstrated notable success in identifying distinctive features from non-physical data sources, including images, speech, and the third-party damage acoustic signals pertinent to this study. The efficacy of deep learning methods is contingent upon the availability of a robust dataset for training, and the scarcity of fiber optic sensor data pertaining to third-party damage is a critical limitation in this field. This research aims to mitigate this challenge by generating a dataset of third-party damage events at laboratory scale utilizing a single-mode-multimode-single-mode (SMS) fiber acoustic sensor. Sound samples representative of various third-party activities, such as vehicle movements, excavation, and digging, were sourced from open-source databases. These samples were then played through a speaker in proximity to an SMS sensor, and the resultant fiber acoustic vibration data were recorded for each event. This process yielded a collection of 200 samples across 13 distinct third-party events. Convolutional neural networks (CNNs) were employed to classify these samples into their respective categories, and an accuracy exceeding 97% was obtained.
The ability to detect satellite maneuvers is an integral part of space domain awareness. While satellite maneuver detection typically refers to orbital maneuvers, the ability to detect a change in attitude or spin state is equally important. In many cases, these attitude maneuvers are the predecessors of orbital maneuvers and can serve as an indication of future behaviors. One means of assessing a satellite's attitude and spin state involves analysis of the object's light curve, which is the temporal history of light reflected off the satellite and collected by an observer. Model-based approaches have been demonstrated to be excellent tools for eliciting attitude information from light curves, but the results are heavily dependent on the accuracy of the input model. An alternative approach, based on a wavelet decomposition of the light curve signal, has been used to identify attitude activity without the need for a well-defined satellite model. Wavelet decomposition allows the frequency content of a signal to be assessed over time, providing critical insight into temporal changes of the signal. When applied to a light curve, these changes can be indicative of attitude activities. This paper focuses on methods to determine and assert that a change has occurred using light curve signals and wavelet decomposition. Various approaches to identifying changes are discussed and compared. The practical application of these methods to noisy sensor data under real collection scenarios is also discussed. Use cases considered include a simulation study as well as real-world data collection on GOES-16, a geostationary satellite, during multiple phases of its lifetime.
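A hedged sketch of the wavelet change cue using PyWavelets on a synthetic light curve with a mid-pass spin-period change: the dominant CWT scale is tracked over time and an abrupt jump flags a candidate attitude change. The threshold, wavelet choice, and synthetic signal are illustrative, not the paper's specific method.

```python
import numpy as np
import pywt

# Synthetic light curve: a spin-period change halfway through the pass.
t = np.linspace(0, 600, 3000)                    # seconds
f1, f2 = 0.05, 0.11                              # Hz, before/after the maneuver
lc = np.where(t < 300, np.sin(2 * np.pi * f1 * t), np.sin(2 * np.pi * f2 * t))
lc += 0.1 * np.random.default_rng(5).standard_normal(t.size)

scales = np.arange(1, 128)
coefs, freqs = pywt.cwt(lc, scales, "morl", sampling_period=t[1] - t[0])

# Track the dominant scale over time; an abrupt jump flags an attitude change.
dominant = scales[np.argmax(np.abs(coefs), axis=0)].astype(float)
d = np.abs(np.diff(dominant))
d[:50] = d[-50:] = 0.0                           # ignore CWT edge artifacts
jump = np.argmax(d > 20)
print("change detected near t =", round(t[jump], 1), "s")
```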
Automatic Modulation Recognition (AMR) is critical for identifying modulation types in wireless communication systems. Recent advancements in deep learning have facilitated the integration of learning algorithms into AMR techniques. However, this integration typically follows a centralized approach that necessitates collecting and processing all training data on high-powered computing devices, which may prove impractical for bandwidth-limited wireless networks. In response to this challenge, this study introduces two methods for distributed learning-based AMR that rely on the collaboration of multiple receivers to perform AMR tasks. The TeMuRAMRD 2023 dataset, uniquely suited for multi-receiver AMR tasks, is employed to support this investigation. Within this distributed sensing environment, multiple receivers collaborate in identifying modulation types from the same RF signal, each possessing a partial view of the overall environment. Experimental results demonstrate that centralized AMR with six receivers attains an impressive accuracy of 91%, while individual receivers exhibit a notably lower accuracy of around 41%. Nonetheless, the two proposed decentralized learning-based AMR methods exhibit noteworthy enhancements. The first method, based on consensus voting among the six receivers, achieves marginally lower accuracy than the centralized model while reducing bandwidth demands to 1/256th of it. In the second distributed method, each receiver shares its feature map, which a central node subsequently aggregates; this approach reduces bandwidth to 1/8th of the centralized approach. These findings highlight the capacity of distributed AMR to achieve high accuracy while effectively addressing the constraints of bandwidth-limited wireless networks.
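The two decentralized schemes in miniature, with stand-in predictions and features: (1) each receiver transmits only a label and the center takes a majority vote; (2) each receiver transmits a compact feature vector that the center averages before classifying. All numbers below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)
n_rx, n_classes, true_class = 6, 8, 3

# Scheme 1: consensus voting -- each receiver transmits one label (tiny bandwidth).
local_preds = [true_class if rng.random() < 0.41 else int(rng.integers(n_classes))
               for _ in range(n_rx)]                # ~41% single-receiver accuracy
vote = np.bincount(local_preds, minlength=n_classes).argmax()

# Scheme 2: feature sharing -- each receiver transmits a compact feature map,
# which the central node averages before applying a shared classifier head.
features = [np.eye(n_classes)[true_class] + 0.8 * rng.standard_normal(n_classes)
            for _ in range(n_rx)]                   # noisy per-receiver features
fused_pred = np.mean(features, axis=0).argmax()

print("vote:", vote, "fused:", fused_pred, "truth:", true_class)
```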
The Havana Syndrome, named after its initial detection in Havana, Cuba, in 2016, has been a subject of numerous studies and debates in the scientific community. The mysterious condition manifests with various symptoms, ranging from auditory disturbances to neurological deficits. This study aimed to collate and analyze data from reported cases of the Havana Syndrome and the existing literature to see if a possible mechanism for the Havana Syndrome could be found. Our research analyzes how two modulated ultrasound phased array beams could explain the recorded waveform, as well as the experiences of the victims. This is shown through simulations of the beam propagation pattern and the sound produced at the target, which show remarkably similar characteristics to those observed in eyewitness reports. We then show experiments using small ultrasound phased arrays, comparing those results to the simulations and recordings from the affected individuals. We conclude that a scenario involving two directed ultrasound beams could potentially link external interference to the onset of the Havana Syndrome. While our findings suggest this plausible connection, establishing whether such interference is borne from malicious intent or is incidental remains beyond the scope of this work.
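The core acoustic mechanism can be illustrated numerically: two ultrasonic carriers that overlap at a target produce, after any square-law (nonlinear) demodulation, an audible tone at their difference frequency. The carrier frequencies and sample rate below are illustrative, not those inferred in the paper.

```python
import numpy as np

fs = 192_000                              # sample rate (Hz)
t = np.arange(0, 0.05, 1 / fs)
f1, f2 = 40_000.0, 47_000.0               # two ultrasonic carriers (Hz)

field = np.cos(2 * np.pi * f1 * t) + np.cos(2 * np.pi * f2 * t)
demod = field**2                          # square-law nonlinearity at the target

spec = np.abs(np.fft.rfft(demod * np.hanning(t.size)))
freqs = np.fft.rfftfreq(t.size, 1 / fs)
band = (freqs > 100) & (freqs < 20_000)   # audible band, excluding the DC term
peak = freqs[band][np.argmax(spec[band])]
print("dominant audible component:", round(peak), "Hz")  # ~ f2 - f1 = 7 kHz
```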
A case study is presented demonstrating the application of a deinterleaving algorithm for PRI-modulation recognition associated with ELINT analysis, and its characteristics in terms of parameter sensitivity. The study examines data obtained from RADAR collection sites, exhibiting various PRI-modulation characteristics such as those typically encountered in ELINT-analysis environments. A specific aspect of this algorithm is that it is structured for deinterleaving of RADAR scans. Ultimately, the goal of the algorithm is to construct and identify RADAR signals, which represent a post-processed data format with respect to scans.
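A classic first step in PRI deinterleaving, sketched on synthetic data: a histogram of low-order time-of-arrival differences whose peaks propose candidate PRIs for pulse-train extraction. The emitters, jitter, difference orders, and bin width below are hypothetical, and this is not necessarily the algorithm of the case study.

```python
import numpy as np

rng = np.random.default_rng(7)
# Two interleaved emitters with stable PRIs (microseconds), plus TOA jitter.
pri_a, pri_b = 1000.0, 1370.0
toas = np.sort(np.concatenate([
    np.arange(0, 1e6, pri_a) + rng.normal(0, 2, size=1000),
    np.arange(13, 1e6, pri_b) + rng.normal(0, 2, size=730),
]))

# Low-order TOA difference histogram: peaks accumulate at candidate PRIs
# (harmonics and cross-emitter sums also appear and must be screened out).
diffs = np.concatenate([toas[k:] - toas[:-k] for k in (1, 2, 3)])
hist, edges = np.histogram(diffs, bins=np.arange(0, 3000, 10))
top = np.argsort(hist)[-4:]
print("candidate PRIs (us):", np.sort(edges[top] + 5))
```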
This paper presents a comprehensive analysis of sliding mode and variable structure filters, essential signal-processing methodologies for noisy and volatile environments. The study examines the theoretical foundations of these filters, the extent of their practical implementation, and current research frontiers, emphasizing the critical role they play in making signal-processing systems more precise, adaptable, and resilient. Sliding mode and variable structure filters are well suited to nonlinear problems compounded by interference; their purpose lies in noise reduction, error minimization, and system stabilization. The review covers a wide range of applications, from robotics and aerospace to telecommunications, highlighting the variety and success of these filtering techniques in complex signal-processing situations. By synthesizing the most recent technological advances and identifying potential future directions, the work aims to encourage developers to pursue advanced, application-specific solutions. The discussion also demonstrates the flexibility these filters offer in meeting the demands of dynamic signal-processing environments, laying the foundation for the next generation of adaptive and robust systems.
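To make the idea concrete, a minimal scalar sketch in the spirit of a variable-structure filter: the gain switches along the measurement innovation and is smoothed inside a boundary layer to suppress chattering. The model, gain structure, and parameter values here are illustrative assumptions, not drawn from the paper:

```python
# Scalar variable-structure-style filter with a smoothing boundary layer.
import numpy as np

def vsf_scalar(z, a=0.98, gamma=0.3, psi=2.0):
    """Filter noisy measurements z of a first-order system x[k+1] = a*x[k].
    The correction switches with the sign of the innovation; the saturation
    term smooths the switching inside a boundary layer of width psi."""
    x_est, e_prev, estimates = 0.0, 0.0, []
    for zk in z:
        x_pred = a * x_est                        # model-based prediction
        e = zk - x_pred                           # measurement innovation
        sat = np.clip(e / psi, -1.0, 1.0)         # smoothed switching term
        x_est = x_pred + (abs(e) + gamma * abs(e_prev)) * sat
        e_prev = e
        estimates.append(x_est)
    return np.array(estimates)

# Noisy decaying signal: compare raw vs. filtered squared error.
rng = np.random.default_rng(1)
truth = 5.0 * 0.98 ** np.arange(200)
noisy = truth + 0.4 * rng.standard_normal(200)
filtered = vsf_scalar(noisy)
print(np.mean((noisy - truth) ** 2), np.mean((filtered - truth) ** 2))
```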
In recent years, random number generators for engineering applications have attracted the attention of many researchers. One important goal is to obtain sequences of both truly random and pseudorandom numbers that can be applied in cryptography, signal processing, and many other fields of engineering. In pseudorandom number generators, chaotic maps often serve as the entropy source. However, many methods based on chaotic maps have recently been broken with the help of nonlinear prediction (forecasting) and phase-space analysis of the map. True random number generation therefore remains an active area of research for encryption, since encrypting data with a truly random source helps secure the information. The purpose of this article is to develop self-generated truly random numbers for audio encryption. To this end, a sound encryption system was created: digital audio data is converted to binary form, and an XOR operation is used to build the truly random number generator. The NIST 800-22 test suite and TestU01 are applied to the resulting bit stream, and both the source data and the encryption key are evaluated for randomness. Audio encryption is then performed with the generated bits. This article shows that a self-generated audio encryption system can be implemented; statistical analysis and the data distribution show that the encryption process with self-generation is successful. In summary, the authors propose a model of sound encryption based on self-generated true random numbers.
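A hedged sketch of the XOR stage described above; here os.urandom stands in for the paper's self-generated entropy source, and all names and sizes are illustrative:

```python
# XOR stream cipher over raw audio bytes; XOR is its own inverse.
import os
import numpy as np

def xor_cipher(audio_bytes, key_bytes):
    """XOR a byte buffer with an equally long key stream; applying the
    function twice with the same key restores the original."""
    a = np.frombuffer(audio_bytes, dtype=np.uint8)
    k = np.frombuffer(key_bytes, dtype=np.uint8)
    return (a ^ k).tobytes()

pcm = bytes(range(256))                 # toy stand-in for raw PCM audio
key = os.urandom(len(pcm))              # stand-in for the self-generated key
ciphertext = xor_cipher(pcm, key)
assert xor_cipher(ciphertext, key) == pcm   # decryption recovers the audio
```

With XOR, the security of the scheme rests entirely on the randomness and secrecy of the key stream, which is why the paper subjects the generated bits to NIST 800-22 and TestU01.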
In an era where data security is paramount, encrypting sensitive information is ubiquitous. Traditional encryption methods often rely on complex algorithms and keys, making decryption an intricate and computationally intensive task. This paper introduces a novel approach to signal recovery and decryption, leveraging data-driven techniques, specifically Long Short-Term Memory (LSTM) neural networks. Our methodology is not limited to any specific encryption algorithm, making it applicable across various domains. We explore LSTM networks as powerful tools for deciphering encrypted signals without prior knowledge of the encryption process. The key insight behind our approach is the ability of LSTMs to capture intricate patterns and dependencies in the encrypted data, thus enabling the reconstruction of the original signal. We present experimental results that showcase the effectiveness of our data-driven decryption approach in a range of scenarios. This research signifies a paradigm shift in signal recovery and decryption, offering an alternative to traditional cryptanalysis techniques. By harnessing the power of data-driven modelling, we open new avenues for retrieving valuable information from encrypted signals, with potential applications in data cybersecurity and beyond.
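One plausible (assumed) realization of such a data-driven decryptor: a small Keras LSTM regressor trained on pairs of encrypted and original signals. The layer sizes, sequence length, and the stand-in "encryption" nonlinearity are all illustrative assumptions, not the authors' configuration:

```python
# LSTM sequence regressor: learns to map encrypted signals back to clean ones.
import numpy as np
import tensorflow as tf

SEQ_LEN = 128   # assumed signal window length

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(SEQ_LEN, 1)),
    tf.keras.layers.LSTM(64, return_sequences=True),
    tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(1)),
])
model.compile(optimizer="adam", loss="mse")

# Toy data: the "encryption" is an unknown fixed nonlinearity; the network
# learns to invert it from examples alone, with no key material.
rng = np.random.default_rng(2)
clean = rng.standard_normal((512, SEQ_LEN, 1)).astype("float32")
encrypted = np.tanh(1.7 * clean + 0.3)         # stand-in transformation
model.fit(encrypted, clean, epochs=3, batch_size=32, verbose=0)
```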
This study provides a bibliometric analysis of the Unscented Kalman Filter (UKF) within chaotic digital communication, highlighting its significance and evolving role in enhancing the robustness of signal processing in unpredictable environments. Through a systematic review of literature from key databases and a detailed bibliometric examination, we trace the UKF's development, applications, and its intersection with various research themes, including chaos synchronization and nonlinear filtering. The findings underscore the UKF's importance in addressing the complexities of chaotic systems, revealing its impact across algorithms, chaos theory, and secure digital communication strategies. This analysis not only showcases the UKF's practical advantages and theoretical contributions but also positions it as a crucial tool for future advancements in digital communication technologies. The study encapsulates the UKF's potential to revolutionize signal processing by leveraging chaos for improved system performance, emphasizing the need for continued interdisciplinary research to harness this potential fully.
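For readers unfamiliar with the filter at the center of this bibliometric study, a brief background sketch of the unscented transform underlying the UKF, in its standard textbook formulation (not taken from the surveyed papers): instead of linearizing a nonlinear map, it propagates a small set of sigma points through it, which is what makes the filter attractive for chaotic dynamics.

```python
# Unscented transform: propagate (mean, covariance) through a nonlinearity.
import numpy as np

def sigma_points(x, P, alpha=1e-3, beta=2.0, kappa=0.0):
    n = len(x)
    lam = alpha**2 * (n + kappa) - n
    S = np.linalg.cholesky((n + lam) * P)        # matrix square root
    pts = np.vstack([x, x + S.T, x - S.T])       # 2n + 1 sigma points
    wm = np.full(2 * n + 1, 0.5 / (n + lam))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = wm[0] + 1 - alpha**2 + beta
    return pts, wm, wc

def unscented_mean_cov(f, x, P):
    """Approximate the mean and covariance of f(x) for x ~ N(x, P)."""
    pts, wm, wc = sigma_points(x, P)
    fx = np.array([f(p) for p in pts])
    mean = wm @ fx
    cov = (wc[:, None] * (fx - mean)).T @ (fx - mean)
    return mean, cov

# Example: a logistic-map-like chaotic nonlinearity on a 1-D state.
m, c = unscented_mean_cov(lambda s: 3.9 * s * (1 - s),
                          np.array([0.5]), np.eye(1) * 0.01)
print(m, c)
```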
Recognition of the behavioral intent of group or formation targets cannot rely solely on a single moment or an individual target, nor can it ignore the dynamic changes of situation elements in the time domain; in particular, the integrity of the formation must be considered. To address this issue, this paper proposes a group-target intent recognition model based on BiConvLSTM-Attention. A feature set is constructed from the electromagnetic and motion characteristics of the group targets over a period of time, encoded into time-series features, and assigned corresponding intention labels. Simulation results show that the proposed method recognizes the behavioral intent of group targets more accurately and reliably than the comparison methods.
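One plausible realization of such a model, with assumed input dimensions and layer sizes (the paper's exact BiConvLSTM-Attention configuration is not reproduced here): convolution extracts local patterns from the time-series features, a bidirectional LSTM models the temporal dynamics in both directions, and an attention layer weights the informative time steps before intent classification.

```python
# Sketch of a Conv + BiLSTM + attention intent classifier (assumed shapes).
import tensorflow as tf

TIME_STEPS, NUM_FEATURES, NUM_INTENTS = 32, 12, 5   # illustrative dimensions

inputs = tf.keras.Input(shape=(TIME_STEPS, NUM_FEATURES))
x = tf.keras.layers.Conv1D(32, 3, padding="same", activation="relu")(inputs)
x = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(64, return_sequences=True))(x)
att = tf.keras.layers.Attention()([x, x])           # self-attention over time
x = tf.keras.layers.GlobalAveragePooling1D()(att)
outputs = tf.keras.layers.Dense(NUM_INTENTS, activation="softmax")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")
model.summary()
```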