1. Introduction

The progressive scaling of transistors in semiconductor manufacturing pushes the limits of lithographic techniques toward patterning features well below the 7-nm node. Electron-beam imaging has been established as the technology of choice for in-line and off-line metrology tasks. Massive metrology results are necessary to evaluate the process quality of printed resist patterns. With the scaling of semiconductor manufacturing technology, the requirements for metrology and the tolerance for edge placement error (EPE) are becoming increasingly demanding. High accuracy, precision, repeatability, and fast turn-around time are desirable. As the patterns scale, a lower electron dosage is needed to avoid photoresist damage. Under such conditions, scanning electron microscope (SEM) images contain an excessive amount of noise and blurring. Electronic, thermal, mechanical, and quantization noise all contribute to images with low signal-to-noise ratio (SNR). Furthermore, e-beam aberration as well as relative movement between the wafer stage and detector introduces optical and motion blur to the images and causes high uncertainty in the edge profiles of printed resist patterns. These effects pose challenges to SEM image pre-processing algorithms for high-quality metrology tasks. In extreme conditions, the SEM images are simply unmeasurable with edge profile-based metrology algorithms. Hence, an effective image restoration methodology is crucial to turning unmeasurable SEM images metrology-ready and effectively improving metrology performance to meet demanding specifications for massive metrology. Below, we give a high-level review of existing technology for image quality (IQ) enhancement and restoration.

1.1. Non-Deep Learning Based Image Restoration

Before machine learning (ML) became the dominant methodology in science and technology, the problem of image restoration had already been extensively studied. Non-deep-learning (DL) image deblurring methodology can be classified into non-blind deblurring and blind deblurring. The first category makes explicit assumptions about the form of the underlying blurring kernel and uses iterative optimization or Fourier-domain inverse filtering to solve for the restored image.1,2 The second category does not explicitly assume a functional form for the kernel and usually follows a Bayesian approach; complex optimization algorithms are typically needed to obtain the solution.3 Both approaches are considered ill-posed inverse problems. Most image restoration research models the image degradation process as a convolution of the underlying sharp image with a degradation kernel, plus an additive noise term introduced by the SEM image acquisition process, as in Eq. (1):

$y = k \otimes x + n$,  (1)

where $x$ represents the latent sharp and clean image, $k$ models the kernel, $n$ models noise intrinsic to the SEM image, and $y$ is the captured SEM image. In the signal restoration process, the goal is to recover the underlying image $x$. Figure 1 illustrates the image degradation process of Eq. (1). This is an ill-posed, highly under-constrained problem, hence it is necessary to reduce the solution space to obtain meaningful solutions. Assumptions are usually made on the noise form, the degradation kernel, and sometimes on the image itself.
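As a point of reference, the degradation model of Eq. (1) can be simulated directly. The following is a minimal NumPy/SciPy sketch that assumes a Gaussian blur kernel and additive Gaussian noise purely for illustration; the true SEM kernel and noise statistics are unknown and are exactly what the methods discussed below try to handle.

```python
import numpy as np
from scipy.signal import fftconvolve

def degrade(x, kernel_sigma=2.0, kernel_size=15, noise_sigma=0.05, seed=0):
    """Simulate Eq. (1): y = k (*) x + n, with an assumed Gaussian kernel
    and additive Gaussian noise (illustrative only; real SEM blur/noise differ)."""
    rng = np.random.default_rng(seed)
    # Build a normalized 2D Gaussian blur kernel k
    ax = np.arange(kernel_size) - kernel_size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx**2 + yy**2) / (2 * kernel_sigma**2))
    k /= k.sum()
    # Convolve the latent image x with the kernel and add noise n
    y = fftconvolve(x, k, mode="same") + noise_sigma * rng.standard_normal(x.shape)
    return y, k

# Example: degrade a synthetic line/space pattern
x = np.tile(np.repeat([0.2, 0.8], 16), 32)[None, :].repeat(256, axis=0)[:, :256]
y, k = degrade(x)
```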
An alternative approach is to solve the maximum a posteriori (MAP) problem of Eq. (2), in which the latent image is described by a conditional probability density given the kernel assumption and the degraded image:

$\hat{x} = \arg\max_x p(x \mid y, k) \propto \arg\max_x p(y \mid x, k)\, p(x)$.  (2)

The proper choice of the prior term $p(x)$ has been researched extensively to ensure a correct model. Among many proposed terms, the most widely used is the $\ell_1$ norm and its variants. This is based on the observation that the gradient intensity of natural images follows a Laplace distribution,4 as in Eq. (3):

$p(\nabla x) \propto \exp(-\lambda \lVert \nabla x \rVert_1)$.  (3)

To recover the sharp image, many non-blind deblurring approaches solve an optimization problem with an objective function of the form of Eq. (4):

$\hat{x} = \arg\min_x \lVert y - k \otimes x \rVert_2^2 + \lambda\, \phi(x)$,  (4)

where $\phi(\cdot)$ is the regularization (prior) term. The total variation (TV) prior on sparse image gradients as well as norm priors on image pixel intensity are popular choices.2,5,6 Improved regularization terms have been proposed to replace the $\ell_1$ or $\ell_2$ penalty alone, both of which tend to over-regularize the gradient, whereas the $\ell_1$ term scaled by the $\ell_2$ term has a better chance of converging to a sharp restored image instead of a smooth and blurry solution.4 All the above-mentioned methodologies assume an initial kernel form and then iteratively optimize the kernel so that it converges to the underlying true kernel, using an appropriate image prior as regularization.

More recent research has focused on the blind restoration task, where no kernel form is assumed.7,8 This is an even more challenging task, since both the kernel and the latent image distribution must be modeled given only the degraded image. From a Bayesian perspective, a generative model is needed to describe the joint distribution of kernel and latent image. It has been pointed out that the full posterior of Eq. (5),

$p(x, k \mid y) \propto p(y \mid x, k)\, p(x)\, p(k)$,  (5)

can never be modeled from enough measurements, because the number of unknowns (every pixel of $x$ plus the kernel entries) always exceeds the number of measured pixels in $y$ and grows with image size; hence, instead of modeling $p(x, k \mid y)$, one can model the marginal $p(k \mid y)$, which has much lower dimension and far fewer unknowns, since the kernel is much smaller than the image itself.9 After solving for $k$, $x$ can then be recovered by Fourier techniques. In Ref. 10, instead of modeling an image prior, the authors modeled image gradients. It has been observed that image gradients obey a heavy-tailed distribution, with most of the mass concentrated at small values but significantly higher probability at large values than a Gaussian distribution, due to abrupt intensity variations caused by strong image features such as edges and corners. That work approached the restoration problem from a Bayesian perspective, modeling the latent image gradients as a mixture of Gaussians and assuming an exponential distribution for the kernel to induce sparsity and smoothness,10 as in Eq. (6):

$p(k, \nabla x \mid \nabla y) \propto p(\nabla y \mid k, \nabla x)\, p(\nabla x)\, p(k)$,  (6)

where the first term on the right-hand side is a Gaussian term on the gradient of the degraded image, the second term is a mixture of Gaussians on the restored image gradient, and the third term is the exponential assumption on the kernel. The authors approximate the full posterior distribution and then compute the kernel with maximum marginal probability.

In real situations, the blurring kernel can be arbitrary, without a specific form, and this is especially true for random motion blur. Hence, non-blind deblurring often suffers in performance due to simplistic or simply wrong assumptions about the kernel. Another challenge is to apply the optimization algorithm with an appropriate regularization strength so that it converges to the desired solution. On the other hand, blind deblurring loosens the strong prior term but requires more sophisticated variational inference algorithms. The two solutions carry an intrinsic trade-off between over- and under-regularization. The next section goes through some typical optimization techniques used in non-blind deblurring tasks.
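Before that, as a concrete illustration of the non-blind objective in Eq. (4), the following minimal PyTorch sketch minimizes the data fidelity term plus an anisotropic TV penalty by gradient descent for a known kernel. It is a toy example of the classical formulation, not the method proposed later in this paper.

```python
import torch
import torch.nn.functional as F

def tv(x):
    """Anisotropic total variation of a 2D image tensor."""
    return (x[1:, :] - x[:-1, :]).abs().sum() + (x[:, 1:] - x[:, :-1]).abs().sum()

def nonblind_deblur(y, k, lam=1e-3, iters=300, lr=0.1):
    """Minimize ||k (*) x - y||^2 + lam * TV(x) for a known kernel k,
    i.e., an Eq. (4)-style objective with a TV prior (toy example only)."""
    x = y.clone().requires_grad_(True)              # initialize latent image with the degraded image
    w = torch.flip(k, dims=[0, 1])[None, None]      # conv2d is correlation; flip kernel for convolution
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(iters):
        opt.zero_grad()
        pred = F.conv2d(x[None, None], w, padding="same")[0, 0]
        loss = F.mse_loss(pred, y) + lam * tv(x)
        loss.backward()
        opt.step()
    return x.detach()
```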
1.2. Iterative Optimization for Non-Blind Deblurring

Non-blind deblurring usually resorts to optimization algorithms such as half-quadratic splitting (HQS) or the alternating direction method of multipliers (ADMM).3,5 A routine practice is to initialize the degradation kernel as a 2D Gaussian function and the degraded image as the initial latent image, then alternately solve for the kernel and the latent image while holding the other term fixed. The authors of Ref. 11 took advantage of the denoising effect of U-Net and used dilated convolutional layers to increase the receptive field of the convolutional neural networks (CNNs).12 Such an optimization process can be formulated as Eqs. (7) and (8), in which an auxiliary variable $z$ splits the problem into a data fidelity subproblem and a prior (denoising) subproblem:

$x_{t+1} = \arg\min_x \lVert y - k \otimes x \rVert_2^2 + \mu \lVert x - z_t \rVert_2^2$,  (7)

$z_{t+1} = \arg\min_z \tfrac{\mu}{2} \lVert z - x_{t+1} \rVert_2^2 + \lambda\, \phi(z)$.  (8)

Both methods plug CNN priors into the HQS iterations. The data fidelity subproblem, Eq. 9(a), is solved for the sharp image in closed form using Fourier-domain inverse filtering,

$x_{t+1} = \mathcal{F}^{-1}\!\left[\dfrac{\overline{\mathcal{F}(k)}\,\mathcal{F}(y) + \mu\,\mathcal{F}(z_t)}{|\mathcal{F}(k)|^2 + \mu}\right]$,  (9a)

while Eq. 9(b),

$z_{t+1} = \mathrm{Denoiser}\!\left(x_{t+1}, \sqrt{\lambda/\mu}\right)$,  (9b)

corresponds to Gaussian denoising of $x_{t+1}$ with noise level $\sqrt{\lambda/\mu}$; hence, any CNN-based Gaussian denoiser can be plugged into the formula to solve Eq. 9(b). This methodology has been extended to other image restoration problems such as demosaicking and image super-resolution.11 A prior regularization term is routinely added to the optimization formulation to reduce the solution space and stabilize convergence, avoiding the noise amplification or trivial solutions that are common for the unregularized counterpart.

1.3. Deep Learning Based Image Restoration

Due to the high heterogeneity of image statistics, a fixed prior form usually cannot model the true image prior, which leads to model mismatch and unsatisfactory results. This problem led researchers to experiment with data-driven approaches to model the prior term. In Refs. 6 and 13, the authors used CNNs to model the prior of the training images. In Ref. 14, the authors generalized shrinkage fields by removing unnecessary parameter sharing and replacing pixel-wise shrinkage functions with a CNN applied to the entire image. These approaches outperform the results obtained with hand-crafted prior terms, illustrating the flexible modeling capability of CNNs. To further exploit this capability, end-to-end image restoration frameworks have been proposed and researched. In such frameworks, the CNN is treated as a black box that takes a low-quality image as input and outputs a restored high-quality image. Convolutional layers followed by element-wise rectified linear unit (ReLU) layers, with the sum of a data fidelity term and a regularization term as the loss function, have been used.15,16 Researchers have also used powerful generative adversarial networks to restore images by optimizing an adversarial loss.17 Reference 18 used a CNN as a feature extraction module that extracts image features from degraded images and then estimates the kernel and latent image. Reference 19 used a six-layer CNN to learn the gradient map, obtained the kernel using the Fourier transform, and applied HQS to obtain the restored image. CNN-empowered approaches, when properly combined with a physical model, can achieve state-of-the-art results compared with methods that assume an explicit prior, and they can provide end-to-end solutions.20,21 However, these supervised methods carry crucial pitfalls that make them undesirable for the SEM image restoration task: a trained model suffers from poor generalization to images and kernel forms it was not trained on.
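To connect the HQS iteration of Eqs. (7)-(9) with the plug-in denoiser idea discussed above, here is a minimal NumPy sketch. It assumes a circular convolution model and uses a Gaussian filter merely as a stand-in for a learned CNN denoiser; it is not one of the cited implementations.

```python
import numpy as np
from numpy.fft import fft2, ifft2
from scipy.ndimage import gaussian_filter

def psf2otf(k, shape):
    """Zero-pad the kernel to the image shape and center it for FFT-based filtering."""
    pad = np.zeros(shape)
    pad[:k.shape[0], :k.shape[1]] = k
    pad = np.roll(pad, (-(k.shape[0] // 2), -(k.shape[1] // 2)), axis=(0, 1))
    return fft2(pad)

def hqs_plug_and_play(y, k, mu=0.05, lam=0.005, iters=30):
    """Plug-and-play HQS: closed-form data step (Eq. 9a) + denoiser prior step (Eq. 9b).
    A Gaussian filter stands in for the CNN denoiser used in the cited works."""
    K = psf2otf(k, y.shape)
    Y = fft2(y)
    z = y.copy()
    for _ in range(iters):
        # Data fidelity subproblem: Wiener-like inverse filtering in the Fourier domain
        x = np.real(ifft2((np.conj(K) * Y + mu * fft2(z)) / (np.abs(K) ** 2 + mu)))
        # Prior subproblem: denoising with strength tied to sqrt(lam / mu)
        z = gaussian_filter(x, sigma=np.sqrt(lam / mu))
    return z
```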
Such poor generalization limits the real-world applicability of these approaches, since realistic blurring kernels in SEM images can be arbitrary, without fixed patterns.22,23 Another crucial challenge in restoring low-quality SEM images is the lack of ground-truth images, and supervised ML models do not work without ground truth. It therefore becomes clear that a pre-training-free, self-supervised, generative methodology that imposes just enough prior assumptions is strongly preferred for SEM image restoration applications. In this paper, we present a thorough investigation of such an SEM image restoration methodology and a detailed evaluation of its performance.

2. Dataset and Methodology

In this section, we describe in detail (1) the SEM image dataset, (2) the architecture of the neural networks, (3) the IQ evaluation metrics, and (4) model regularization and convergence.

2.1. SEM Image Dataset

SEM images were collected from 10 dies, with 10 runs per die and four images captured per run, resulting in 400 images. The images were captured using an ASML HMI eP5 metrology and inspection machine. The patterns include after-etch inspection (AEI) and after-develop inspection (ADI) line/space (LS) and contact hole (CH) patterns. The pixel size is 1 nm; each LS image contains 19 line patterns within the field of view, and each CH image contains 1024 contact holes. To create a dataset with improving image quality, we used frame averaging to obtain 1-, 2-, 3-, and 4-frame averaged images from the original datasets. Figure 2 summarizes the entire dataset.

2.2. Neural Network Architecture

The proposed image restoration framework is composed of a generative network $G_k$, which generates the kernel, and a generative network $G_x$, which generates the latent image. Because the blurring kernel $k$ has much lower dimension than the latent image, $G_k$ is modeled with a lightweight fully connected network (FCN). The FCN takes a one-dimensional (1D) noise vector with 200 dimensions as input and has a hidden layer of 1000 nodes and an output layer with one node per kernel entry. To guarantee the non-negativity constraint, the SoftMax nonlinearity is applied to the output layer of $G_k$. Finally, the 1D output is reshaped to a 2D blurring kernel. Table 1 shows the architecture details.

Table 1. Architecture of generative network $G_k$.
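For illustration, the following is a minimal PyTorch sketch of a kernel generator with the structure described above: a 200-dimensional noise input, a 1000-node hidden layer, and a SoftMax output reshaped to a 2D kernel. The hidden ReLU activation and the kernel_size value are assumptions made for this sketch and do not reproduce Table 1 exactly.

```python
import torch
import torch.nn as nn

class KernelGenerator(nn.Module):
    """FCN G_k sketch: 200-d noise -> 1000-node hidden layer -> SoftMax -> 2D kernel.
    kernel_size and the ReLU hidden activation are placeholders, not the paper's exact values."""
    def __init__(self, noise_dim=200, hidden=1000, kernel_size=15):
        super().__init__()
        self.kernel_size = kernel_size
        self.net = nn.Sequential(
            nn.Linear(noise_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, kernel_size * kernel_size),
            nn.Softmax(dim=-1),          # non-negative entries that sum to 1
        )

    def forward(self, z):
        k = self.net(z)
        return k.view(self.kernel_size, self.kernel_size)

# Example: sample a fixed noise vector and generate a kernel
z_k = torch.randn(200)
kernel = KernelGenerator()(z_k)
```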
Generative network $G_x$ is an asymmetric autoencoder with skip connections and is used to generate the latent clean image.24 The first five layers of the encoder are skip-connected to the last five layers of the decoder. Finally, a convolutional output layer is used to generate the latent clean image. Since the output image needs to have positive values, the Sigmoid nonlinearity is applied to the output layer. Figure 3 illustrates the $i$'th unit of the encoder–decoder architecture; for each unit we specify the number of filters, the kernel size, and the padding of its convolutions. The filter size in the last convolutional layer is fixed at $1 \times 1$, since we apply the Sigmoid activation function to each pixel without any spatial averaging to avoid blurring the image. Down-sampling and up-sampling both use a stride of 2, and bilinear interpolation is used for up-sampling. The filter number at the $i$'th skip-connection layer is proportional to the filter number at the $i$'th encoder layer, with a fixed ratio that determines the proportion of information the $i$'th decoder layer obtains directly from the $i$'th encoder layer. The convolutional kernel size is fixed throughout the network, with the padding fixed at 1.

Neural network architecture usually has a major impact on the results, in this case the quality of the restored image. The key architectural hyperparameters are the input dimension, the filter number combination for each layer, and the ratio of the filter number between skip layer and encoder layer. We experimented with a series of choices and chose the best combination guided by the IQ metrics. Since SEM images are single-channel gray-level images, the input has a single channel. The architecture details are presented in Table 2, and the network is illustrated in Fig. 4. The choice of model architecture is determined by factors such as pattern complexity, computational latency, and restoration performance. For 1D patterns, the filter numbers are (4, 4, 8, 16, 16), while for 2D patterns, the filter numbers are (8, 8, 16, 32, 32). We also set the skip-connection filter ratio to the value that provides the best restored image while still maintaining enough pattern style from the low-dosage image.

Table 2. Architecture of generative network $G_x$.
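A simplified PyTorch sketch of such a skip-connected encoder-decoder is given below. The 3x3 kernels, BatchNorm/LeakyReLU blocks, skip channel count, and exact up/down-sampling arrangement are assumptions made for illustration and do not reproduce Table 2, but the filter numbers follow the 1D-pattern setting (4, 4, 8, 16, 16) and the output uses a 1x1 convolution with Sigmoid as described.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(c_in, c_out, stride=1):
    """Conv -> BatchNorm -> LeakyReLU; 3x3 kernel with padding 1 is an assumption."""
    return nn.Sequential(
        nn.Conv2d(c_in, c_out, kernel_size=3, stride=stride, padding=1),
        nn.BatchNorm2d(c_out),
        nn.LeakyReLU(0.1, inplace=True),
    )

class ImageGenerator(nn.Module):
    """Simplified skip-connected encoder-decoder G_x (a sketch, not the exact Table 2 network)."""
    def __init__(self, in_ch=1, filters=(4, 4, 8, 16, 16), skip_ch=4):
        super().__init__()
        self.encoders, self.skips, self.decoders = nn.ModuleList(), nn.ModuleList(), nn.ModuleList()
        c_prev = in_ch
        for c in filters:
            self.encoders.append(conv_block(c_prev, c, stride=2))   # downsample by 2
            self.skips.append(conv_block(c, skip_ch))                # skip branch
            c_prev = c
        c_prev = filters[-1]
        for c in reversed(filters):
            self.decoders.append(conv_block(c_prev + skip_ch, c))    # fuse skip features
            c_prev = c
        self.out = nn.Sequential(nn.Conv2d(c_prev, 1, kernel_size=1), nn.Sigmoid())

    def forward(self, z):
        feats, x = [], z
        for enc, skip in zip(self.encoders, self.skips):
            x = enc(x)
            feats.append(skip(x))
        for dec, s in zip(self.decoders, reversed(feats)):
            x = F.interpolate(x, size=s.shape[-2:], mode="bilinear", align_corners=False)
            x = dec(torch.cat([x, s], dim=1))
        x = F.interpolate(x, scale_factor=2, mode="bilinear", align_corners=False)  # back to input size
        return self.out(x)

# Example: a fixed noise input with the same spatial size as the SEM image crop
z_x = torch.randn(1, 1, 256, 256)
x_hat = ImageGenerator()(z_x)    # values in (0, 1), shape (1, 1, 256, 256)
```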
The image restoration network is composed of $G_k$ and $G_x$: the latent image $G_x(z_x)$ is convolved with the kernel $G_k(z_k)$, resulting in a blurred image, and we use its mean squared error (MSE) with respect to the low-quality SEM image as the image fidelity term. We formulate the image restoration as an unconstrained optimization problem. As seen in the introduction, appropriate regularization strength is crucial to obtaining satisfactory results, so we add TV regularization terms for both the image and the kernel and weigh them by coefficients $\lambda_x$ and $\lambda_k$, respectively, to encourage sparsity. The choice of $\lambda_x$ should be related to the noise level in the image, with a larger $\lambda_x$ for a higher noise level to prevent overfitting the noise. The choice of $\lambda_k$ should reflect the complexity of the kernel; since the kernel is expected to be simple for SEM images, this regularization term can be fixed at a higher value. Our image restoration loss function is composed of the MSE term plus the two TV regularization terms, and the goal is to minimize the total loss, which can be written as Eq. (10):

$\mathcal{L} = \lVert G_k(z_k) \otimes G_x(z_x) - y \rVert_2^2 + \lambda_x\, \mathrm{TV}\big(G_x(z_x)\big) + \lambda_k\, \mathrm{TV}\big(G_k(z_k)\big)$.  (10)

The optimization of Eq. (10) can be viewed as "zero-shot," self-supervised learning,25 in which both generative networks are trained using only the low-quality SEM image and no ground-truth clean image. To optimize the networks, we adopt joint optimization, which takes advantage of automatic differentiation: the gradients with respect to the parameters of $G_k$ and $G_x$ are derived and the network parameters updated. Alternating minimization is not used because this is an unconstrained problem and it is easy to get stuck at a saddle point due to the highly non-convex nature of the loss function. We picked the ADAM optimization algorithm to update $G_k$ and $G_x$ simultaneously in one step, due to its gradient- and learning-rate-adaptive nature achieved through momentum.26 Table 3 shows the pseudo code for the joint optimization process. Both generative networks and the training process are schematically illustrated in Fig. 5.

Table 3. Pseudo code for joint optimization.
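A minimal sketch of the joint optimization loop is shown below; it is consistent with the description above but does not reproduce the actual pseudo code of Table 3. It reuses the KernelGenerator and ImageGenerator sketches from the previous code blocks, and the regularization weights shown are placeholders rather than tuned values.

```python
import torch
import torch.nn.functional as F

def tv2d(t):
    """Anisotropic total variation of a tensor whose last two dims are spatial."""
    return (t[..., 1:, :] - t[..., :-1, :]).abs().mean() + \
           (t[..., :, 1:] - t[..., :, :-1]).abs().mean()

def restore(y, g_k, g_x, lam_x=1e-4, lam_k=1e-2, lr=0.002, iters=120, noise_std=0.01):
    """Jointly train G_k and G_x on a single low-quality image y (shape 1x1xHxW),
    minimizing Eq. (10). lam_x and lam_k here are placeholder values."""
    z_k = torch.randn(200) * noise_std                       # fixed noise input for G_k
    z_x = torch.randn_like(y) * noise_std                    # fixed noise input for G_x
    opt = torch.optim.Adam(list(g_k.parameters()) + list(g_x.parameters()), lr=lr)
    for _ in range(iters):
        opt.zero_grad()
        k = g_k(z_k)                                         # 2D kernel
        x = g_x(z_x)                                         # latent clean image
        y_hat = F.conv2d(x, k[None, None], padding="same")   # blurred reconstruction
        loss = F.mse_loss(y_hat, y) + lam_x * tv2d(x) + lam_k * tv2d(k)
        loss.backward()
        opt.step()                                           # one joint ADAM step for both networks
    return x.detach(), k.detach()
```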
2.3. Image Quality Evaluation Metrics

We describe several popular IQ evaluation metrics that will be used to evaluate the image restoration process and guide the learning process.

2.3.1. Peak signal-to-noise ratio

Peak signal-to-noise ratio (PSNR) is calculated by Eq. (11), where $m$ and $n$ are the image dimensions, MAX is the maximum signal value (255 for grayscale 8-bit SEM images), and $I$ and $K$ represent the pixel value matrices of the reference image and the image to be measured, respectively:

$\mathrm{PSNR} = 10 \log_{10}\!\left(\dfrac{\mathrm{MAX}^2}{\frac{1}{mn}\sum_{i=1}^{m}\sum_{j=1}^{n}\left[I(i,j) - K(i,j)\right]^2}\right)$.  (11)

A high PSNR value indicates a low noise level in the measured image compared with the reference image. We picked this metric because we want to denoise low-dosage SEM images and hope to achieve a high PSNR.

2.3.2. Structural similarity index measure

The structural similarity index measure (SSIM) is calculated by Eq. (12), where $\mu_I$, $\mu_K$, $\sigma_I^2$, and $\sigma_K^2$ are the means and variances of the image matrices $I$ and $K$, $\sigma_{IK}$ is their covariance, and $c_1$ and $c_2$ are two scalars that numerically stabilize the division:

$\mathrm{SSIM}(I, K) = \dfrac{(2\mu_I\mu_K + c_1)(2\sigma_{IK} + c_2)}{(\mu_I^2 + \mu_K^2 + c_1)(\sigma_I^2 + \sigma_K^2 + c_2)}$.  (12)

SSIM measures the relative geometrical similarity of the test image with respect to the reference image, and its value ranges from 0 to 1. We picked this metric because we want to restore the underlying patterns from the low-dosage SEM image with high integrity. Since SSIM is sensitive to pattern shift, this metric helps to identify potential pattern misalignment, which could cause issues in metrology.

2.3.3. Pattern sharpness

Pattern sharpness (PS) is a measure used to evaluate the quality of pattern edges. Figure 6 shows how this measure is calculated. It is computed from the gray-level profile extracted at a pattern edge, and the value is in units of nm; a smaller value means a faster rise in the edge slope, hence sharper edges. The x-axis is the pixel location and the y-axis is the gray-level value. The calculated measure is converted to a score in nm by multiplying by the pixel size, which is 1 nm in this case. We picked this metric because we want the restoration framework to deblur, or sharpen, the image, especially at the pattern edges. Since we have 10 runs for each die and each run has four frames, we average these 40 images and use the result as the reference image when calculating SSIM and PSNR.

2.4. Model Architecture

As computer vision tasks have become more challenging over the years, the complexity of neural network architectures has increased drastically to model complex functions. The search space for optimal hyperparameters and training parameters is so high-dimensional that optimizing by trial and error becomes infeasible. Neural architecture search has become an active research area aiming to provide an automatic way to select an optimized architecture.27 This approach usually relies on cross-validation performance to guide the automatic selection. However, due to the lack of ground-truth images and the self-supervised nature of our methodology, using cross-validation to optimize the model architecture is infeasible. Alternatively, since we want to restore the IQ as much as possible while keeping the intrinsic geometrical patterns undistorted, we use the IQ metrics introduced in the previous sections to guide architecture selection. The neural network architectures under evaluation are listed in Table 4. To better balance underfitting and overfitting and to maximize the application scenarios across different pattern complexities, we picked a network of intermediate complexity. Table 5 shows how all IQ metrics improve significantly after restoration and reach values comparable to the four-frame averaged image, suggesting successful restoration.
In the tables, an up-arrow means that a higher value indicates better IQ, and vice versa.

Table 4. Neural network architectures under evaluation.
Table 5. IQ metrics before and after restoration.
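For reference, the PSNR and SSIM values reported in Table 5 can be computed against the frame-averaged reference with scikit-image, as in the following sketch; the pattern-sharpness score depends on the edge-profile extraction of Fig. 6 and is not reproduced here.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def iq_metrics(restored, frames):
    """PSNR/SSIM of a restored 8-bit image against the average of repeated frames.
    `frames` is a list/array of co-aligned 8-bit images of the same feature."""
    reference = np.mean(np.asarray(frames, dtype=np.float64), axis=0)
    restored = np.asarray(restored, dtype=np.float64)
    psnr = peak_signal_noise_ratio(reference, restored, data_range=255)
    ssim = structural_similarity(reference, restored, data_range=255)
    return psnr, ssim
```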
2.5. Model Regularization and Convergence

In this section, we discuss how regularization affects convergence. We use an AEI pattern for restoration and fix the training parameters with an input noise standard deviation of 0.01, a fixed learning rate of 0.002, and a total of 120 iterations. The ADAM optimizer was used for all learning processes. As discussed previously, we can add a prior on the image or the kernel as a form of regularization so that the inverse optimization does not easily converge to trivial or random solutions. We explore how the TV regularization strengths on the image and the kernel affect the convergence behavior, experimenting with different regularization coefficient combinations to better understand how the strength of each term affects convergence as well as the learned kernel and the restored image. Figure 7 shows the loss function versus iterations for several selected regularization coefficient combinations to illustrate their effects on the learned kernels; the loss is plotted on a log scale for better visualization. We use zero regularization (blue curve) as the baseline condition and study the effects of the regularization coefficients by varying them individually. The kernel snapshots are taken at iterations 20, 60, and 120, respectively. When the kernel coefficient $\lambda_k$ increases from 1 to 100 (orange to green), the kernel becomes sparser and shows pixelation characteristics. This is due to the typical sparsity-inducing effect of TV regularization; nevertheless, over-regularization leads to failure of kernel convergence. With a moderate $\lambda_k$, we observe the kernel becoming less noisy and sparser as the iterations increase, while still converging to the characteristics of the underlying blurring kernel. Note that the kernel snapshots are artificially enlarged for better visualization. This experiment confirms the divergent nature of this inverse problem and the importance of choosing an appropriate regularization strength.

We further examine whether the IQ metrics can be used as a proxy for the quality of the restored latent images. Figure 8 shows SSIM and PSNR under different image regularization strengths. When $\lambda_x$ is too large (over-regularized), the model converges to a trivial solution containing no pattern, because the over-regularization strongly prefers an image with few features such as edges and corners; this correlates well with low IQ metrics. Comparatively, with an appropriately regularized $\lambda_x$, the model successfully restores a low-noise, sharp latent image with high structural similarity to the reference image, which correlates well with high IQ metrics. Hence, appropriate regularization strength is crucial to successful restoration, and IQ metrics such as PSNR and SSIM can be used to evaluate convergence and determine stopping criteria from an IQ perspective when proper cross-validation is not feasible.

3. Results and Analysis

This section analyzes line edge roughness (LER) and critical dimension (CD), which provides insight into how image restoration affects metrology results. It has been established that the CD histogram of the image after restoration shows a much smaller standard deviation and suppressed outliers compared with that of the low-dosage image, as demonstrated in a previous conference paper.28 This is because poor IQ leads to inaccurate CD measurements with a higher standard deviation (1.676 nm), which is reduced after restoration (1.057 nm). The restoration also biases the mean CD value by only a small amount (47.08 nm before versus 46.78 nm after). These are metrology results of AEI patterns from Fig. 9.
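The CD comparison above reduces to simple per-gauge statistics; a minimal sketch, assuming hypothetical arrays of per-gauge CD measurements in nanometers extracted from the low-dosage and restored images, is given below.

```python
import numpy as np

def cd_summary(cds, label=""):
    """Mean CD, CD sigma, and outlier count (> 3 sigma from the mean) for a 1D array
    of per-gauge CD measurements in nm. Input data here are hypothetical."""
    cds = np.asarray(cds, dtype=float)
    mean, sigma = cds.mean(), cds.std(ddof=1)
    outliers = int(np.sum(np.abs(cds - mean) > 3 * sigma))
    print(f"{label}: mean CD = {mean:.2f} nm, sigma = {sigma:.3f} nm, outliers = {outliers}")
    return mean, sigma, outliers
```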
We conduct LER characterization for the left and right EPE using power spectral density (PSD) estimation to further confirm our observations. We place measurement gauges along the direction of the line patterns with a 2-nm interval for both the left and right edges to obtain the EPEs, and then we average the left and right LERs over all line patterns from all images to obtain a better estimate. The PSD estimate is given by Eq. (13), where $f$ is the spatial frequency, $x_n$, $n = 0, \ldots, N-1$, are the edge-position signals sampled at discrete positions along the line for a total of $N$ samples, and $\Delta$ is the sampling interval; we study the squared magnitude since we care about the spectral amplitude:29,30

$\mathrm{PSD}(f) = \dfrac{\Delta}{N}\left|\sum_{n=0}^{N-1} x_n\, e^{-i 2\pi f n \Delta}\right|^2$.  (13)

Note that the true PSD is obtained only in the limit of infinitely many samples, where the expected value can be estimated accurately. In real-world cases, the number of measurement samples is finite, hence averaging the PSD over many trials is necessary to estimate the underlying physical process more accurately.29,30

Due to the stochastic nature of LER measurement, we break the variance into three parts, as in Eq. (14):31–33

$\sigma_{\mathrm{total}}^2 = \sigma_{\mathrm{process}}^2 + \sigma_{\mathrm{metrology}}^2 + \sigma_{\mathrm{noise}}^2$,  (14)

where $\sigma_{\mathrm{process}}^2$ refers to the stochasticity of LER profiles caused by process variation, $\sigma_{\mathrm{metrology}}^2$ is the measurement uncertainty introduced by the metrology algorithm and, in our case, influenced by IQ, and $\sigma_{\mathrm{noise}}^2$ refers to extrinsic noise contributions from factors such as SEM shot noise, SEM tool stage movement, and beam profile. Unbiasing can be used to remove the high-frequency extrinsic noise, and the results are shown in Fig. 9.31 Comparing the unbiased PSD spectra in Fig. 9, we can see that the middle frequency range is drastically attenuated after restoration. This is due to the reduction of metrology noise resulting from the improvement in IQ, in accordance with the observation that the CD standard deviation is reduced after restoration. Hence, this image restoration method can be used to reduce metrology noise by improving IQ without introducing a mean shift, thereby exposing the intrinsic stochasticity of the pattern edges. This might provide valuable process information if the underlying process of the patterns is known. The relationship between the process and the corresponding PSD characteristics is complicated, beyond the scope of this paper, and not discussed here.

4. Conclusion

Effective restoration of low-quality SEM images is critical for future high-performance metrology applications as users push for higher throughput and faster turn-around time. This paper introduced a new methodology based on a self-supervised, generative neural network model. A major advantage of this approach is that it does not require a high-fidelity "ground truth" image for training, making it especially desirable for low-dosage metrology applications, where such ground-truth data are usually unavailable. A detailed description of the model architecture and regularization was provided. It has been shown that, by applying the proposed framework, IQ can be improved greatly while preserving the intrinsic pattern geometry. CD precision, mean CD, and the overall distribution confirm the effectiveness in metrology applications. Extension to 2D patterns is also promising: the image is transformed from a non-measurable state to one that enables reliable metrology results.28 PSD-based LER analysis suggests that the restoration method can reduce metrology noise by improving IQ, thereby exposing the intrinsic process-induced stochasticity of the line edge profile, which is of great value as process stochasticity becomes more prominent while device features keep shrinking. We believe that more use cases of this restoration framework remain to be discovered.
Code, Data, and Material Availability

The data utilized in this study were obtained from wafers manufactured by our customer, who does not permit sharing the images freely. The source code that supports the findings of this article is not publicly available because the work is patented and is considered internal IP of ASML. Nevertheless, to foster collaboration and research, the code can be requested by contacting the author at zijian.du@asml.com.

Acknowledgments

This article is based on a conference proceedings paper presented at 2022 SPIE Advanced Lithography (Ref. 28) and is a detailed extension of it. The authors would like to thank Dr. Rui Yuan for his technical help in conducting the LER analysis and for general discussions of research ideas.

References

1. L. B. Lucy, "An iterative technique for the rectification of observed distributions," Astron. J. 79(6), 745 (1974). https://doi.org/10.1086/111605
2. S. Boyd et al., "Distributed optimization and statistical learning via the alternating direction method of multipliers," Found. Trends Mach. Learn. 3(1), 1–122 (2010). https://doi.org/10.1561/2200000016
3. W. H. Richardson, "Bayesian-based iterative method of image reconstruction," J. Opt. Soc. Am. 62(1), 55–59 (1972). https://doi.org/10.1364/JOSA.62.000055
4. L. Xu, S. Zheng, and J. Jia, "Unnatural L0 sparse representation for natural image deblurring," http://www.cse.cuhk.edu.hk/leojia/projects/l0deblur/ (2013).
5. A. Chakrabarti, "A neural approach to blind motion deblurring," Lect. Notes Comput. Sci. 9907, 221–235 (2016). https://doi.org/10.1007/978-3-319-46487-9_14
6. S. H. Chan et al., "An augmented Lagrangian method for total variation video restoration," IEEE Trans. Image Process. 20(11), 3097–3111 (2011). https://doi.org/10.1109/TIP.2011.2158229
7. D. Krishnan, T. Tay, and R. Fergus, "Blind deconvolution using a normalized sparsity measure," in CVPR (2011).
8. R. Wang and D. Tao, "Recent progress in image deblurring," in SIGGRAPH Asia (2013).
9. C. J. Schuler et al., "A machine learning approach for non-blind image deconvolution," in CVPR (2013).
10. A. Levin et al., "Understanding and evaluating blind deconvolution algorithms," in CVPR (2009).
11. J. Kruse, C. Rother, and U. Schmidt, "Learning to push the limits of efficient FFT-based image deconvolution," in ICCV (2017).
12. K. Zhang et al., "Plug-and-play image restoration with deep denoiser prior," in CVPR (2019).
13. K. Zhang et al., "Learning deep CNN denoiser prior for image restoration," in CVPR (2017).
14. S. Xie et al., "Non-blind image deblurring method by the total variation deep network," IEEE Access 7, 37536–37544 (2019). https://doi.org/10.1109/ACCESS.2019.2891626
15. D. Gong et al., "Self-paced kernel estimation for robust blind image deblurring," in ICCV (2017).
16. Y. Nan, Y. Quan, and H. Ji, "Variational-EM-based deep learning for noise-blind image deblurring," in CVPR (2020).
17. M. Hradis et al., "Convolutional neural networks for direct text deblurring," in BMVC (2015).
18. O. Kupyn et al., "DeblurGAN: blind motion deblurring using conditional adversarial networks," in CVPR (2018).
19. M. L. Green, "Statistics of images, the TV algorithm of Rudin-Osher-Fatemi for image denoising and an improved denoising algorithm," https://ww3.math.ucla.edu/camreport/cam02-55.pdf
20. C. J. Schuler et al., "Learning to deblur," IEEE Trans. Pattern Anal. Mach. Intell. 38(7), 1439–1451 (2016). https://doi.org/10.1109/TPAMI.2015.2481418
21. X. Xu et al., "Motion blur kernel estimation via deep learning," IEEE Trans. Image Process. 27(1), 194–205 (2018). https://doi.org/10.1109/TIP.2017.2753658
22. R. Fergus et al., "Removing camera shake from a single photograph," ACM Trans. Graphics 25(3), 787–794 (2006). https://doi.org/10.1145/1141911.1141956
23. A. Levin et al., "Efficient marginal likelihood optimization in blind deconvolution," in CVPR (2011).
24. O. Ronneberger, P. Fischer, and T. Brox, "U-Net: convolutional networks for biomedical image segmentation," in Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015 (2015).
25. A. Shocher, N. Cohen, and M. Irani, "Zero-shot super-resolution using deep internal learning," in 2018 IEEE/CVF Conf. Comput. Vision and Pattern Recognit. (CVPR), 3118–3126 (2018). https://doi.org/10.1109/CVPR.2018.00329
26. D. P. Kingma and J. L. Ba, "Adam: a method for stochastic optimization," in ICLR (2015).
27. T. Elsken, J. H. Metzen, and F. Hutter, "Neural architecture search: a survey," J. Mach. Learn. Res. 20, 1–21 (2019).
28. Z. Du et al., "Low dosage SEM image processing for metrology applications," Proc. SPIE 12053, 1205309 (2022). https://doi.org/10.1117/12.2614281
29. A. Hiraiwa and A. Nishida, "Spectral analysis of line edge and line-width roughness with long-range correlation," J. Appl. Phys. 108, 034908 (2010). https://doi.org/10.1063/1.3466777
30. R. Bonam et al., "Comprehensive analysis of line-edge and line-width roughness for EUV lithography," Proc. SPIE 10143, 101431A (2017). https://doi.org/10.1117/12.2258194
31. L. Pu et al., "Analyze line roughness sources using power spectral density (PSD)," Proc. SPIE 10959, 109592W (2019). https://doi.org/10.1117/12.2516570
32. C. A. Mack, "Reducing roughness in extreme ultraviolet lithography," Proc. SPIE 10450, 104500P (2017). https://doi.org/10.1117/12.2281605
33. A. Hiraiwa and A. Nishida, "Spectral analysis of line edge and line-width roughness with long-range correlation," J. Appl. Phys. 108, 034908 (2010). https://doi.org/10.1063/1.3466777
Biography

Zijian Du obtained his PhD in electrical engineering from Arizona State University in 2019 and joined ASML Silicon Valley as a senior software engineer. His research interests include applying machine learning and deep learning based techniques to SEM image quality enhancement as well as contour-based edge placement error (EPE) massive-metrology applications for high-volume manufacturing (HVM).