2020 IEEE 31st International Conference on Application-Specific Systems, Architectures and Processors (ASAP)
DOI: 10.1109/asap49362.2020.00016

Array Aware Training/Pruning: Methods for Efficient Forward Propagation on Array-based Neural Network Accelerators

Abstract: Due to the increase in the use of large-sized Deep Neural Networks (DNNs) over the years, specialized hardware accelerators such as the Tensor Processing Unit and Eyeriss have been developed to accelerate the forward pass of the network. The essential component of these devices is an array processor composed of multiple individual compute units for efficiently executing Multiplication and Accumulation (MAC) operations. As the size of this array limits the amount of DNN processing of a single layer, the com…
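To make the array-size constraint concrete, the sketch below (not from the paper) shows a matrix multiply processed in fixed-size blocks, the way a layer larger than the MAC array must be executed over several passes. The 8x8 tile size and the NumPy implementation are illustrative assumptions.

```python
import numpy as np

def tiled_matmul(A, B, tile=8):
    # Compute A @ B in tile x tile blocks, mimicking how a fixed-size
    # MAC array forces a large layer to be split across multiple passes.
    m, k = A.shape
    _, n = B.shape
    C = np.zeros((m, n))
    for i in range(0, m, tile):
        for j in range(0, n, tile):
            for p in range(0, k, tile):
                # One pass through the array: a block of MAC updates.
                C[i:i+tile, j:j+tile] += A[i:i+tile, p:p+tile] @ B[p:p+tile, j:j+tile]
    return C

rng = np.random.default_rng(0)
A, B = rng.standard_normal((16, 16)), rng.standard_normal((16, 16))
assert np.allclose(tiled_matmul(A, B), A @ B)
```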

Cited by 5 publications (3 citation statements); references 16 publications.
“…One application of neural networks is the Deep Neural Network, or DNN. A Deep Neural Network is a neural network composed of more than one layer [11]. Research using a DNN was carried out by [12] to analyze sentiment in Indonesian-language Twitter posts about government institutions and government figures.…”
Section: Introduction
“…The result of this computation is then passed through an activation function, which forms the output of that layer. Equation 1 gives the computation performed at each layer [11].…”
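Equation 1 itself is not reproduced in this excerpt; the computation the statement describes is the standard fully connected layer, a weighted sum followed by an element-wise activation, i.e. y = f(Wx + b). A minimal sketch, assuming NumPy and a tanh activation (both illustrative choices):

```python
import numpy as np

def dense_forward(x, W, b, activation=np.tanh):
    # Weighted sum of the inputs (the MAC work the accelerator array
    # performs), then an element-wise activation; the result is the
    # layer's output and the next layer's input.
    z = W @ x + b
    return activation(z)

# Toy layer mapping 4 inputs to 3 outputs.
rng = np.random.default_rng(0)
x, W, b = rng.standard_normal(4), rng.standard_normal((3, 4)), np.zeros(3)
print(dense_forward(x, W, b))
```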
“…Neural Network model optimization techniques such as efficient network design, pruning, and quantization are essential for efficient real-time inference on hardware. Pruning [4,5,11,12] reduces the size and computational complexity of a network by removing redundant connections/neurons that do not contribute significantly to model accuracy. Weight/connection-wise pruning is irregular in nature and hence introduces non-uniform sparsity in the weight matrices.…”
Section: Introduction
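To illustrate why weight/connection-wise pruning yields non-uniform sparsity, here is a minimal sketch of unstructured magnitude pruning, assuming NumPy; the function name, threshold rule, and 50% sparsity target are illustrative, not the paper's array-aware method.

```python
import numpy as np

def magnitude_prune(W, sparsity=0.5):
    # Zero out the smallest-magnitude fraction of the weights.
    k = int(W.size * sparsity)
    if k == 0:
        return W.copy()
    threshold = np.partition(np.abs(W).ravel(), k - 1)[k - 1]
    return W * (np.abs(W) > threshold)

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))
Wp = magnitude_prune(W)
print(np.count_nonzero(W), "->", np.count_nonzero(Wp))
```

The surviving weights land wherever their magnitudes happen to be large, so a fixed-size MAC array cannot skip the zeros without extra indexing logic; this irregularity is what motivates structured, array-aware pruning.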