2018 International Joint Conference on Neural Networks (IJCNN)
DOI: 10.1109/ijcnn.2018.8489651

Deep Quaternion Networks

Abstract: The field of deep learning has seen significant advancement in recent years. However, much of the existing work has focused on real-valued numbers. Recent work has shown that a deep learning system using complex numbers can be deeper for a fixed parameter budget than its real-valued counterpart. In this work, we explore the benefits of generalizing one step further into the hyper-complex numbers, quaternions specifically, and provide the architecture components needed to build deep quaternion n…
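The generalization the abstract describes rests on quaternion algebra, whose multiplication (the Hamilton product) is non-commutative. As a quick illustration (a standard identity, not code from the paper), the Hamilton product of two quaternions stored as (w, x, y, z) vectors can be sketched as:

```python
import numpy as np

def hamilton_product(p, q):
    # p, q: arrays of shape (..., 4) holding quaternions as (w, x, y, z).
    # Implements the standard (non-commutative) Hamilton product.
    pw, px, py, pz = p[..., 0], p[..., 1], p[..., 2], p[..., 3]
    qw, qx, qy, qz = q[..., 0], q[..., 1], q[..., 2], q[..., 3]
    return np.stack([
        pw * qw - px * qx - py * qy - pz * qz,  # real part
        pw * qx + px * qw + py * qz - pz * qy,  # i component
        pw * qy - px * qz + py * qw + pz * qx,  # j component
        pw * qz + px * qy - py * qx + pz * qw,  # k component
    ], axis=-1)
```

For example, the unit quaternions i and j satisfy i·j = k, while j·i = −k, which is the non-commutativity that quaternion layers must respect.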

Cited by 147 publications (157 citation statements)
References 16 publications
“…Similar to [65] and [67], the diagonal of γ is initialized to 1/√8, and the off-diagonal terms of γ and all components of β are initialized to 0.…”
Section: Octonion Batch Normalization Module
confidence: 99%
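The initialization this excerpt describes can be sketched as follows. This is an illustrative reconstruction, not code from the cited work: the parameter names, shapes, and the per-feature 8×8 layout of γ are assumptions; only the values (diagonal of γ at 1/√8, everything else at 0) come from the excerpt.

```python
import numpy as np

def init_octonion_bn_params(num_features):
    """Initialize octonion batch-norm parameters as the excerpt describes.

    gamma: per-feature 8x8 scaling matrix over the 8 octonion components,
           diagonal set to 1/sqrt(8), off-diagonal terms set to 0.
    beta:  per-feature 8-component shift, all components set to 0.
    (Shapes and names are illustrative assumptions.)
    """
    gamma = np.zeros((num_features, 8, 8))
    idx = np.arange(8)
    gamma[:, idx, idx] = 1.0 / np.sqrt(8)
    beta = np.zeros((num_features, 8))
    return gamma, beta
```

The 1/√8 diagonal mirrors the 1/√2 choice in deep complex networks: it splits unit variance evenly across the 8 octonion components at initialization.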
“…Since the images in the CIFAR-10 and CIFAR-100 datasets are real-valued, the input to the proposed deep octonion networks must first be converted into an octonion matrix. The octonion has one real part and seven imaginary parts; we put the original N real training images into the real part and construct the remaining parts similarly to [65] and [67]. To speed up the training, the following layer is an AveragePooling2D layer, which is then followed by a fully connected (Dense) layer to classify the input. The deep octonion network model sets the number of residual blocks in the three stages to 10, 9, and 9, respectively, and the number of convolution filters to 32, 64, and 128.…”
Section: Octonion Input Construction
confidence: 99%
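The input construction this excerpt describes can be sketched minimally: place the real-valued image batch in the octonion's real component. The cited works obtain the seven imaginary parts via learned layers; initializing them to zero here is a deliberate simplification, and the function name and tensor layout are illustrative assumptions.

```python
import numpy as np

def real_to_octonion(images):
    """Embed a real-valued image batch into octonion-valued input.

    images: real array of shape (N, H, W, C).
    Returns shape (N, H, W, C, 8): component 0 (the real part) holds the
    pixel values; the seven imaginary components start at zero. (The cited
    works learn the imaginary parts instead; zeros are a simplification.)
    """
    oct_input = np.zeros(images.shape + (8,), dtype=images.dtype)
    oct_input[..., 0] = images
    return oct_input
```

For CIFAR-10 this would turn a (N, 32, 32, 3) batch into a (N, 32, 32, 3, 8) octonion tensor before the first octonion convolution.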
“…The goal of learning effective representations lives at the heart of deep learning research. While most neural architectures for NLP have mainly explored the use of real-valued representations (Vaswani et al., 2017; Bahdanau et al., 2014; Parikh et al., 2016), there has also been emerging interest in complex (Danihelka et al., 2016; Arjovsky et al., 2016; Gaudet and Maida, 2017) and hypercomplex representations (Parcollet et al., 2018b,a; Gaudet and Maida, 2017).…”
Section: Related Work
confidence: 99%
“…Notably, progress on Quaternion and hypercomplex representations for deep learning is still in its infancy, and consequently most works on this topic are very recent. Gaudet and Maida proposed deep Quaternion networks for image classification, introducing basic tools such as Quaternion batch normalization and Quaternion initialization (Gaudet and Maida, 2017). In a similar vein, Quaternion RNNs and CNNs were proposed for speech recognition (Parcollet et al., 2018a,b).…”
Section: Related Work
confidence: 99%