This paper solves the Sparse Photometric stereo through Lighting Interpolation and Normal Estimation using a generative Network (SPLINE-Net). SPLINE-Net contains a lighting interpolation network to generate dense lighting observations given a sparse set of lights as inputs followed by a normal estimation network to estimate surface normals. Both networks are jointly constrained by the proposed symmetric and asymmetric loss functions to enforce isotropic constrain and perform outlier rejection of global illumination effects. SPLINE-Net is verified to outperform existing methods for photometric stereo of general BRDFs by using only ten images of different lights instead of using nearly one hundred images.
It is difficult to identify the historical period in which some ancient murals were created because of damage due to artificial and/or natural factors; similarities in content, style, and color among murals; low image resolution; and other reasons. This study proposed a transfer learning-fused Inception-v3 model for dynasty-based classification. First, the model adopted Inception-v3 with frozen fully connected and softmax layers for pretraining over ImageNet. Second, the model fused Inception-v3 with transfer learning for parameter readjustment over small datasets. Third, the corresponding bottleneck files of the mural images were generated, and the deep-level features of the images were extracted. Fourth, the cross-entropy loss function was employed to calculate the loss value at each step of the training, and an algorithm for the adaptive learning rate on the stochastic gradient descent was applied to unify the learning rate. Finally, the updated softmax classifier was utilized for the dynasty-based classification of the images. On the constructed small datasets, the accuracy rate, recall rate, and F1 value of the proposed model were 88.4%, 88.36%, and 88.32%, respectively, which exhibited noticeable increases compared with those of typical deep learning models and modified convolutional neural networks. Comparisons of the classification outcomes for the mural dataset with those for other painting datasets and natural image datasets showed that the proposed model achieved stable classification outcomes with a powerful generalization capacity. The training time of the proposed model was only 0.7 s, and overfitting seldom occurred.
A stable enhanced superresolution generative adversarial network (SESRGAN) algorithm was proposed in this study to address the low-resolution and blurred texture details in ancient murals. This algorithm makes improvements on the basis of GANs, which use dense residual blocks to extract image features. After two upsampling steps, the feature information of the image is input into the high-resolution (HR) image space to realize an improvement in resolution, and the reconstructed HR image is finally generated. The discriminator network uses VGG as its basic framework to judge the authenticity of the input image. This study further optimized the details of the network model. In addition, three loss optimization models, i.e., the perceptual loss, content loss, and adversarial loss models, were integrated into the proposed algorithm. The Wasserstein GAN-gradient penalty (WGAN-GP) theory was used to optimize the adversarial loss of the model when calculating the perceptual loss and when using the preactivation feature information for calculation purposes. In addition, public data sets were used to pretrain the generative network model to achieve a high-quality initialization. The simulation experiment results showed that the proposed algorithm outperforms other related superresolution algorithms in terms of both objective and subjective evaluation indicators. A subjective perception evaluation was also conducted, and the reconstructed images produced by our algorithm were more in line with the general public’s visual perception than those produced by the other compared algorithms.
Ancient murals are of high artistic value and boast rich content. The accurate classification of murals is a challenging task for researchers and can be arduous even for experienced researchers. The image classification algorithms currently available are not effective in the classification of mural images with strong background noise. A new multichannel separable network model (MCSN) is proposed in this study to solve this issue. Using the GoogLeNet network model as the basic framework, we adopt a small convolution kernel for the extraction of the shallow-layer background features of murals and then decompose larger, two-dimensional convolution kernels into smaller convolution kernels, for example, 7 × 7 and 3 × 3 kernels into 7 × 1 and 1 × 7 kernels and 3 × 1 and 1 × 3 kernels, respectively, to extract important deep-layer feature information. A soft thresholding activation scaling strategy is adopted to enhance the stability of the network during training, and finally, the murals are classified through the softmax layer. A minibatch SGD algorithm is employed to update the parameters. The accuracy, recall and F1-score reached 88.16%, 90.01%, and 90.38%, respectively. Compared with mainstream classification algorithms, the model demonstrates improvement in terms of classification accuracy, generalizability, and stability to a certain extent, supporting its suitability in efficiently classifying murals.
Traditional methods for ancient mural segmentation have drawbacks, including fuzzy target boundaries and low efficiency. Targeting these problems, this study proposes a pyramid scene parsing MobileNetV2 network (PSP-M) by fusing a deep separable convolution-based lightweight neural network with a multiscale image segmentation model. In this model, deep separable convolution-fused MobileNetV2, as the backbone network, is embedded in the image segmentation model, PSPNet. The pyramid scene parsing structure, particularly owned by the two models, is used to process the background features of images, which aims to reduce feature loss and to improve the efficiency of image feature extraction. In the meantime, atrous convolution is utilized to expand the perceptive field, aiming to ensure the integrity of image semantic information without changing the number of parameters. Compared with traditional image segmentation models, PSP-M increases the average training accuracy by 2%, increases the peak signal-to-noise ratio by 1–2 dB and improves the structural similarity index by 0.1–0.2.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
hi@scite.ai
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.