KEYWORDS: Visualization, Image retrieval, Data modeling, Taxonomy, Visual process modeling, Systems modeling, Nickel, Neural networks, Information visualization, Image visualization
Visual search and similarity can aid an e-commerce platform by providing appropriate recommendations where semantic labels and associated metadata do not always exist. In this work, we detail the specifics of our system that powers visually similar recommendations. While a common approach leverages representations learned from standard classification tasks using DNNs, the crux of the problem is the labels and ontology applied so that the DNN can perform effectively. Our approach, in production for a variety of products, supplies these recommendations based on a defined taxonomy hierarchy that has been carefully curated and further scaled up through our platform's natural crowd-sourcing interface. To scale the use of these taxonomies in production, we use quantization schemes for retrieving approximate nearest neighbors after applying base transformations to the images using Apache Beam and TensorFlow Transform. The nearest-neighbor retrievals are based on a ResNet model architecture trained from scratch on 3,000+ classes; models are retrained daily in a distributed fashion with optimized data throughput. Finally, to verify appropriateness, we use an extensive human evaluation and quality-control pipeline. We share the product design learnings from the various attempts and experiments we conducted for a successful launch.
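The retrieval step above — quantized embeddings queried for approximate nearest neighbors — can be illustrated with a minimal sketch. This is not the production system: the function names, the scalar quantization scheme, and the brute-force quantized search are illustrative stand-ins for the paper's ResNet embeddings and production quantization pipeline.

```python
import numpy as np

def quantize(embeddings, bits=8):
    """Scalar-quantize float embeddings to uint8 for a compact ANN index.
    A simplified stand-in for a production quantization scheme."""
    lo, hi = embeddings.min(), embeddings.max()
    scale = (2 ** bits - 1) / (hi - lo)
    codes = np.round((embeddings - lo) * scale).astype(np.uint8)
    return codes, lo, scale

def approx_neighbors(codes, query_code, k=3):
    """Approximate nearest neighbors by L2 distance in quantized space."""
    diff = codes.astype(np.int32) - query_code.astype(np.int32)
    dist = np.linalg.norm(diff, axis=1)
    return np.argsort(dist)[:k]

rng = np.random.default_rng(0)
catalog = rng.normal(size=(100, 64))                   # stand-in for image embeddings
codes, lo, scale = quantize(catalog)
query = catalog[7] + rng.normal(scale=0.01, size=64)   # near-duplicate of item 7
qcode = np.round((query - lo) * scale).astype(np.uint8)
neighbors = approx_neighbors(codes, qcode, k=3)        # item 7 should rank first
```

The quantized index trades a small amount of recall for an 4x memory reduction over float32 embeddings, which is what makes daily large-catalog retrieval tractable.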
The performance of recommendation systems depends heavily on candidate matching techniques for scoping users' information needs. Existing candidate matching methods are based on text embedding and collaborative filtering, which base similarity primarily on semantics or co-occurrences of listings. Unfortunately, they do not leverage valuable user behavior such as recommendation impressions, clicks, or the sequences leading up to a purchase. This rich information reflects accurate user preferences and has been widely used to enhance the ranking stage of a recommendation system. Yet integrating contextual information into the matching stage is challenging because it also increases feature dimensionality and sparsity. Recently, graph representation learning (GRL) has seen much success in industrial applications such as item-to-item recommendation systems. GRL represents users' behavior as an activity graph and learns a mapping of its nodes (and edges) into a low-dimensional space; the goal is to optimize this mapping so that the learned geometric relationships reflect the structural information of the original graph. The trained embeddings can be used as features for downstream applications such as nearest-neighbor search and ranking. Our work focuses on a GRL framework to enhance the performance of candidate generation. Such an approach inevitably faces the cold-start problem: listings with few or no user interactions cannot be learned effectively. To address this, side information such as shop, category, and price is integrated into the listing embedding by learning an integrated multi-view embedding.
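The cold-start fallback described above can be sketched as follows. This is a toy illustration of the multi-view idea, not the paper's model: the embedding tables, dimension, and blending weights are all hypothetical, and a real system would learn the side-information embeddings jointly with the behavior-graph embeddings.

```python
import numpy as np

rng = np.random.default_rng(1)
DIM = 16

# Hypothetical side-information embedding tables (illustrative names only).
shop_emb = {s: rng.normal(size=DIM) for s in ["shopA", "shopB"]}
cat_emb = {c: rng.normal(size=DIM) for c in ["mugs", "prints"]}

def listing_embedding(interaction_vec, shop, category):
    """Blend a behavior-graph embedding with side-info embeddings.
    A cold-start listing (interaction_vec=None) falls back entirely on
    side information, so it still lands near similar listings."""
    side = (shop_emb[shop] + cat_emb[category]) / 2.0
    if interaction_vec is None:      # cold start: no user interactions yet
        return side
    return 0.7 * interaction_vec + 0.3 * side

warm = listing_embedding(rng.normal(size=DIM), "shopA", "mugs")
cold = listing_embedding(None, "shopA", "mugs")
```

Because the cold listing inherits its position in embedding space from its shop and category, it can surface in nearest-neighbor candidate generation before any interactions accumulate.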
Development of a silicon-based on-chip light source could be facilitated by the incorporation of nanocrystalline silicon
(nc-Si) into a multislot waveguide structure, using erbium embedded in silicon oxide as a luminescence source. The
multislot waveguide confines TM polarized light in the oxide (low-index) layers, thus reducing the loss caused by
interaction with free carriers in the nc-Si layers. Here we demonstrate a lateral electrical injection scheme using a p-i-n
junction embedded into the multislot, allowing much more efficient charge injection than alternative vertical injection
approaches which have been limited by the highly insulating oxide layers. By exploiting the difference in the mode
profiles of TE and TM light, we were able to gauge the injection of free carriers as a function of applied voltage, by
measuring the polarization-dependent optical loss for light transmitted through the multislot waveguide. Experimental
measurements are well-predicted by numerical computations using both FDTD and the transfer matrix method.
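The transfer matrix method mentioned above can be sketched for a simple lossless multilayer stack at normal incidence. This is a textbook characteristic-matrix calculation, not the authors' waveguide model: the layer indices, thicknesses, and wavelength below are illustrative values loosely inspired by an nc-Si/oxide stack.

```python
import numpy as np

def transfer_matrix(n_layers, d_layers, wavelength, n_in=1.0, n_out=1.0):
    """Characteristic-matrix (transfer matrix) method for a lossless
    multilayer stack at normal incidence. Returns reflectance R and
    transmittance T."""
    M = np.eye(2, dtype=complex)
    for n, d in zip(n_layers, d_layers):
        delta = 2 * np.pi * n * d / wavelength      # phase thickness of the layer
        layer = np.array([[np.cos(delta), 1j * np.sin(delta) / n],
                          [1j * n * np.sin(delta), np.cos(delta)]])
        M = M @ layer
    B, C = M @ np.array([1.0, n_out])
    r = (n_in * B - C) / (n_in * B + C)
    t = 2 * n_in / (n_in * B + C)
    R = abs(r) ** 2
    T = (n_out / n_in) * abs(t) ** 2
    return R, T

# Alternating high-index (n ~ 3.5, nc-Si-like) and low-index (n ~ 1.45,
# oxide-like) layers; thicknesses and wavelength in the same units (um).
R, T = transfer_matrix([3.5, 1.45, 3.5], [0.1, 0.2, 0.1], 1.55)
```

For a lossless stack, energy conservation requires R + T = 1, which is a convenient sanity check on the matrix product.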
KEYWORDS: Image registration, Cameras, 3D image processing, 3D modeling, Clouds, Atomic force microscopy, Detection and tracking algorithms, 3D image reconstruction, Error analysis, Machine vision
This paper proposes a trainable computer vision approach for visual object registration relative to a collection of training images obtained a priori. The algorithm first identifies whether the image belongs to the scene location; if it does, it identifies objects of interest within the image and geo-registers them. To accomplish this task, the processing chain relies on 3-D structure derived from motion to represent feature locations in a proposed model. Using current state-of-the-art algorithms, detected objects are extracted and their two-dimensional sizes in pixels are converted into relative 3-D real-world coordinates using scene information, homography, and camera geometry. Locations can then be given with distance-alignment information, and the tasks can be accomplished efficiently. Finally, algorithmic evaluation is presented with receiver operating characteristics, computational analysis, and registration errors in physical distances.
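The pixel-to-world conversion described above can be sketched with a basic pinhole camera model. This is a generic illustration, not the paper's pipeline: the intrinsic matrix, depth value, and detection width below are hypothetical, and in practice the depth would come from the structure-from-motion reconstruction.

```python
import numpy as np

def pixel_to_camera(u, v, depth, K):
    """Back-project a pixel (u, v) at a known depth into 3-D camera
    coordinates using pinhole intrinsics K."""
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth])

def pixel_width_to_meters(w_px, depth, fx):
    """Convert a detection's width in pixels to meters at the given depth."""
    return w_px * depth / fx

# Hypothetical intrinsics: focal length 800 px, principal point (320, 240).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
p = pixel_to_camera(400, 240, 10.0, K)       # pixel 80 px right of center, 10 m deep
w = pixel_width_to_meters(80, 10.0, 800.0)   # an 80 px wide detection at 10 m
```

The same similar-triangles relation is what lets a 2-D bounding-box size be turned into a physical object size once scene depth is known.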
A stochastic framework combining classification with nonlinear regression is proposed, with performance evaluated on a patch-based image superresolution problem. Assuming a multivariate Gaussian mixture model for the distribution of all image content, unsupervised probabilistic clustering via expectation maximization allows segmentation of the domain. Subsequently, for the regression component of the algorithm, a modified support vector regression provides per-class nonlinear regression while appropriately weighting the relevancy of training points during training; relevancy is determined by the probabilistic values from clustering. Support vector machines, an established convex optimization problem, provide the foundation for additional formulations that learn the kernel matrix via semi-definite programming and quadratically constrained quadratic programming problems.
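The clustering-then-weighted-regression pipeline can be sketched with off-the-shelf components. Note the hedges: this uses scikit-learn's standard epsilon-SVR with per-sample weights rather than the paper's modified SVR formulation, the two-regime data is synthetic, and the kernel-learning extensions (SDP/QCQP) are not shown.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVR

rng = np.random.default_rng(2)
# Synthetic data with two regimes; the regime labels are unknown to the model.
X = np.concatenate([rng.normal(-2, 0.3, 200), rng.normal(2, 0.3, 200)])[:, None]
y = np.where(X[:, 0] < 0, np.sin(X[:, 0]), X[:, 0] ** 2)

# Unsupervised probabilistic clustering via EM; each point's responsibility
# for a component becomes its relevancy weight for that component's SVR.
gmm = GaussianMixture(n_components=2, random_state=0).fit(X)
resp = gmm.predict_proba(X)

models = []
for k in range(2):
    svr = SVR(kernel="rbf")
    svr.fit(X, y, sample_weight=resp[:, k] + 1e-6)  # relevancy-weighted training
    models.append(svr)

def predict(x):
    """Blend the per-class regressors by cluster responsibility."""
    r = gmm.predict_proba(np.atleast_2d(x))
    preds = np.column_stack([m.predict(np.atleast_2d(x)) for m in models])
    return (r * preds).sum(axis=1)
```

Weighting by responsibilities, rather than hard-assigning each point to one cluster, keeps the per-class regressors well-behaved near cluster boundaries where assignments are ambiguous.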
Conference Committee Involvement (7)
Applications of Machine Learning 2025
3 August 2025 | San Diego, California, United States
Applications of Machine Learning 2024
20 August 2024 | San Diego, California, United States
Applications of Machine Learning 2023
23 August 2023 | San Diego, California, United States
Applications of Machine Learning 2022
23 August 2022 | San Diego, California, United States
Applications of Machine Learning 2021
4 August 2021 | San Diego, California, United States
Applications of Machine Learning 2020
23 August 2020 | Online Only, California, United States
Applications of Machine Learning
13 August 2019 | San Diego, California, United States