2013
DOI: 10.12785/amis/072l28
Accelerating GOR Algorithm Using CUDA

Abstract: Protein secondary structure prediction is important for understanding a protein's molecular structure. The GOR algorithm is one of the most successful computational methods and has been widely used as an efficient analysis tool to predict secondary structure from a protein sequence. However, its running time becomes prohibitive as protein databases grow rapidly. Fortunately, CUDA (Compute Unified Device Architecture) provides a promising approach to accelerating secondary structure prediction. Therefore, we propose a fine-grained parall…

Cited by 4 publications (3 citation statements). References 10 publications (15 reference statements).
“…The new graphics processing unit (GPU) has fast shared memory and slow memory. Reusing the data in shared memory is a key point to improve the performance of GPU applications [105][106][107][108][109]. A very useful optimization is presented for fractional derivative [69,110].…”
Section: Memory Access Optimization (Fractional Precomputing Operator) (citation type: mentioning)
confidence: 99%
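The data-reuse pattern this citation describes can be made concrete with a minimal sketch (not taken from the paper itself; kernel name, block size, and the 3-point smoothing operation are illustrative assumptions): each thread block stages a tile of the input in fast shared memory once, then every element is reused by several threads instead of being re-fetched from slow global memory.

```cuda
// Illustrative sketch: a 3-point moving average. Each input element is
// loaded from global memory once into the shared-memory tile, then
// reused up to three times by neighboring threads.
__global__ void smooth3(const float *in, float *out, int n)
{
    __shared__ float tile[256 + 2];          // blockDim.x == 256, plus 1 halo cell per side
    int gid = blockIdx.x * blockDim.x + threadIdx.x;
    int lid = threadIdx.x + 1;               // shift by 1 for the left halo

    if (gid < n)
        tile[lid] = in[gid];
    if (threadIdx.x == 0)                    // left halo cell
        tile[0] = (gid > 0) ? in[gid - 1] : 0.0f;
    if (threadIdx.x == blockDim.x - 1)       // right halo cell
        tile[lid + 1] = (gid + 1 < n) ? in[gid + 1] : 0.0f;
    __syncthreads();                         // tile fully staged; reuse begins

    if (gid < n)
        out[gid] = (tile[lid - 1] + tile[lid] + tile[lid + 1]) / 3.0f;
}
```

Without the shared tile, each output would issue three global loads; with it, the block issues roughly one global load per element, which is the reuse effect the cited works exploit.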
“…However, how to make best use of accelerator resources is still a challenge. Most research work is focused on maximizing utilization of accelerator cores by exploiting millions of concurrent threads, 7,[10][11][12][13] while few works are interested in data transfer into accelerators and fewer works have referred to non-contiguous data transfer with special respect to strided data transfer. However, non-contiguous chunks of data, especially for strided data, are widely applied to real-life scenarios, such as regions-of-interest (ROI) coding and critical component of dataset duplication for reliability.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
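The strided, region-of-interest transfer this citation highlights can be sketched with the CUDA runtime's `cudaMemcpy2D`, which moves all strided rows in one call rather than one `cudaMemcpy` per row (a minimal host-side sketch; the image dimensions and buffer names are assumptions, not from the cited work):

```cuda
// Illustrative sketch: copy a W x H region of interest out of a larger
// row-major host image whose rows are W_total floats wide. One
// cudaMemcpy2D replaces H separate small host-to-device copies.
int W = 64, H = 64, W_total = 1024;                    // hypothetical sizes
float *h_img = (float *)malloc(W_total * W_total * sizeof(float));
float *d_roi;
cudaMalloc(&d_roi, W * H * sizeof(float));

cudaMemcpy2D(d_roi, W * sizeof(float),                 // dest: contiguous rows on device
             h_img, W_total * sizeof(float),           // source pitch = full row stride
             W * sizeof(float), H,                     // row width in bytes, row count
             cudaMemcpyHostToDevice);
```

Batching the strided rows into a single pitched copy avoids the per-call launch overhead that makes many small transfers slow, which is the gap in prior work the citing paper points at.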
“…However, it is more expensive to transfer data into CUDA memory. [10][11][12][13][14][15][16][17] Therefore, minimizing data transfer is proposed for GPU clusters, 18 and asynchronous transfer is also resorted to in porting FFT into CUDA. 11,[19][20][21] But, the above techniques are proposed for contiguous chunks of data transfer and difficult to extend into strided data in practice.…”
Section: Introduction (citation type: mentioning)
confidence: 99%
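The asynchronous-transfer technique mentioned here is typically implemented with pinned host memory, `cudaMemcpyAsync`, and multiple streams so that copying one chunk overlaps computing on another. A minimal sketch follows (the `process` kernel, chunk sizes, and launch configuration are hypothetical placeholders, not from the cited works):

```cuda
// Illustrative sketch: ping-pong two streams so the host-to-device copy
// of one chunk overlaps the kernel working on the other chunk.
const int CHUNK = 1 << 20, NUM_CHUNKS = 8;             // hypothetical sizes
cudaStream_t s[2];
float *h_buf, *d_buf;
cudaMallocHost(&h_buf, 2 * CHUNK * sizeof(float));     // pinned memory: required
cudaMalloc(&d_buf, 2 * CHUNK * sizeof(float));         // for truly async copies
for (int i = 0; i < 2; ++i)
    cudaStreamCreate(&s[i]);

for (int c = 0; c < NUM_CHUNKS; ++c) {
    int i = c % 2;                                     // alternate between streams
    cudaMemcpyAsync(d_buf + i * CHUNK, h_buf + i * CHUNK,
                    CHUNK * sizeof(float), cudaMemcpyHostToDevice, s[i]);
    // process<<<grid, block, 0, s[i]>>> is a hypothetical kernel; ops
    // queued in the same stream run in order, so it sees the copied data.
    process<<<grid, block, 0, s[i]>>>(d_buf + i * CHUNK, CHUNK);
}
cudaDeviceSynchronize();                               // drain both streams
```

This is the contiguous-chunk pipelining the citation describes; as it notes, extending it to strided data is harder because each strided piece would otherwise need its own small async copy.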