Rapid lossless compression of short text messages

Kalajdzic, Kenan; Ali, Samaher Hussein; Patel, Ahmed

doi:10.1016/j.csi.2014.05.005

Cited by 38 publications

(9 citation statements)

References 1 publication

“…The recommended technique for compressing text data is reversible, or lossless, compression. In the literature, several text compression techniques [15] [16][17] try to reduce the storage required while the data resides in the application, without discarding any information from the existing data (lossless compression). In all three of these studies, compression converts symbols or characters into fewer bits than before, and for writing to storage those bits are encoded into 8-bit codes, one per byte, in a step called encoding.…”

Section: Data Compression | unclassified

Kompresi Multilevel Pada Metaheuristic Focused Web Crawler

Santoso, Ginardi. 2019. JUTI

Abstract: A Focused Web Crawler is a method for finding websites that match what the user is searching for. To obtain good matches, the Focused Web Crawler method takes longer than ordinary web crawler search methods that use DFS (Depth First Search) or BFS (Breadth First Search). To address this, a Focused Web Crawler search technique was developed that uses the cuckoo search metaheuristic combined with a search over stored search-history data. However, because data is stored on every link search, the stored data keeps growing, so a way to reduce the storage requirement is needed. The way to reduce storage space without reducing the information value of the previously stored data is data compression. This study proposes a multilevel data compression method that uses two techniques: reduction of word prefixes and postfixes, and dictionary-based string compression that builds a word dictionary index. The result of the dictionary string compression is encoded data. The compression results were tested by comparing link search results obtained with the KMP (Knuth-Morris-Pratt) method on uncompressed data and on compressed data. The tests show a maximum precision of 1, a recall of 0.73, and an average file compression ratio of 36.4%.
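
The dictionary-index stage mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not describe the index layout or the prefix/postfix reduction step, so the function names (`build_index`, `encode`, `decode`), the first-seen ordering of indexes, and the single-space tokenization are all assumptions made for illustration.

```python
def build_index(words):
    """Assign each distinct word an integer index in first-seen order (assumed scheme)."""
    index = {}
    for word in words:
        if word not in index:
            index[word] = len(index)
    return index


def encode(text):
    """Replace every word with its dictionary index.

    Returns the vocabulary (needed for decoding) and the list of codes.
    Splitting on a single space keeps the round trip exact, including
    runs of spaces, which show up as empty-string "words".
    """
    words = text.split(" ")
    index = build_index(words)
    vocab = list(index)  # dicts preserve insertion order
    codes = [index[word] for word in words]
    return vocab, codes


def decode(vocab, codes):
    """Rebuild the original text from the vocabulary and the codes."""
    return " ".join(vocab[code] for code in codes)
```

Repeated words cost one small integer each after their first occurrence, which is where a scheme of this kind would save space on crawler history data with many recurring terms.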

“…First, their method splits input text to word and nonword and then uses them as initial alphabet of LZW. Reference [17] proposed a technique to compress short text messages based on two phases. In the first phase, it converts the input text consisting of letters, numbers, spaces, and punctuation marks commonly used in English writing to a format which can be compressed in the second phase.…”

Section: Related Work | mentioning

confidence: 99%

n-Gram-Based Text Compression

Nguyen, Duong, et al. 2016. Computational Intelligence and Neuroscience

We propose an efficient method for compressing Vietnamese text using n-gram dictionaries. It has a significant compression ratio in comparison with those of state-of-the-art methods on the same dataset. Given a text, first, the proposed method splits it into n-grams and then encodes them based on n-gram dictionaries. In the encoding phase, we use a sliding window with a size that ranges from bigram to five grams to obtain the best encoding stream. Each n-gram is encoded by two to four bytes accordingly based on its corresponding n-gram dictionary. We collected 2.5 GB text corpus from some Vietnamese news agencies to build n-gram dictionaries from unigram to five grams and achieve dictionaries with a size of 12 GB in total. In order to evaluate our method, we collected a testing set of 10 different text files with different sizes. The experimental results indicate that our method achieves compression ratio around 90% and outperforms state-of-the-art methods.
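
As a rough illustration of encoding text against n-gram dictionaries, the Python sketch below tries the longest n-gram first at each position and falls back to shorter ones. The abstract does not say how the sliding window chooses the best encoding stream or how the two-to-four-byte codes are laid out, so the greedy longest-match strategy, the `(n, id)` token format, the literal fallback for unknown words, and the names `encode_ngrams` and `decode_ngrams` are assumptions, not the paper's method.

```python
def encode_ngrams(words, dictionaries, max_n=5):
    """Greedy longest-match encoding (assumed strategy).

    `dictionaries` maps n to a dict from n-gram tuples to integer ids.
    Each output token is (n, id); a word with no dictionary entry is
    kept as the literal token (0, word).
    """
    tokens = []
    i = 0
    while i < len(words):
        for n in range(min(max_n, len(words) - i), 0, -1):
            gram = tuple(words[i:i + n])
            table = dictionaries.get(n, {})
            if gram in table:
                tokens.append((n, table[gram]))
                i += n
                break
        else:
            # No n-gram of any length matched at this position.
            tokens.append((0, words[i]))
            i += 1
    return tokens


def decode_ngrams(tokens, dictionaries):
    """Invert encode_ngrams using reversed id-to-n-gram tables."""
    reverse = {n: {gram_id: gram for gram, gram_id in table.items()}
               for n, table in dictionaries.items()}
    words = []
    for n, value in tokens:
        if n == 0:
            words.append(value)
        else:
            words.extend(reverse[n][value])
    return words
```

A greedy match is simple but not necessarily optimal; a real encoder could compare alternative segmentations before committing to one.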

“…Recently, some researches in lossless compression methods commonly aim to optimize existing compression method for specific data type [7][8][9][10][11][12][13][14][15][16][17][18] or to improve the existing compression method by transforming data to other form before compression process or by combining several compression method [19][20][21][22]. One of novel researches in compression method is Asymmetric Numerical System (ANS) [23][24].…”

Section: New Lossless Compression Methods Using CRLCM (Hendra Mesra) | mentioning

confidence: 99%

New Lossless Compression Method using Cyclic Reversible Low Contrast Mapping (CRLCM)

Mesra, Tjandrasa, Fatichah. 2016. IJECE

In general, the compression method is developed to reduce the redundancy of data. This study uses a different approach to embed some bits of datum in image data into other datum using a Reversible Low Contrast Mapping (RLCM) transformation. Besides using the RLCM for embedding, this method also applies the properties of RLCM to compress the datum before it is embedded. In its algorithm, the proposed method engages Queue and Recursive Indexing. The algorithm encodes the data in a cyclic manner. In contrast to RLCM, the proposed method is a coding method as Huffman coding. This research uses publicly available image data to examine the proposed method. For all testing images, the proposed method has higher compression ratio than the Huffman coding.
