Traditional Factorization Machine (FM) algorithms and their various deep-learning-based variants perform well in tasks such as feature learning and recommendation, but they typically run on a single machine, demand substantial computational and storage resources, and suffer from redundant irrelevant features, slow convergence, and low parallel-training efficiency on large-scale datasets. To address these problems, a Spark-based parallelized FM (SFM) algorithm is proposed. Using the efficient distributed processing platform provided by Spark, parallel computation over RDDs across the machines of an HDFS cluster accelerates processing while preserving scalability and fault tolerance on large-scale data. Part of a publicly available advertising dataset is used as the experimental data. The experimental results show that the parallelized FM algorithm achieves better recommendation precision, recall, F1 score, and scalability, and that multi-node parallelism significantly improves runtime efficiency on large-scale data.
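For context, the standard second-order FM model referenced above scores a feature vector with a global bias, linear weights, and pairwise feature interactions factorized through latent vectors. The pairwise term can be computed in O(k·n) time rather than O(n²) via a well-known algebraic reformulation. The sketch below (plain NumPy, with illustrative names; it is not the paper's Spark implementation) shows that prediction:

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """Second-order FM prediction.

    x  : feature vector, shape (n,)
    w0 : global bias (scalar)
    w  : linear weights, shape (n,)
    V  : latent factor matrix, shape (n, k)

    Uses the linear-time identity:
      sum_{i<j} <v_i, v_j> x_i x_j
        = 0.5 * sum_f [ (sum_i V[i,f] x_i)^2 - sum_i V[i,f]^2 x_i^2 ]
    """
    linear = w0 + w @ x
    s = V.T @ x                   # per-factor weighted sums, shape (k,)
    s2 = (V ** 2).T @ (x ** 2)    # per-factor sums of squares, shape (k,)
    return linear + 0.5 * float(np.sum(s ** 2 - s2))
```

In a data-parallel setting such as the one the abstract describes, each worker would evaluate terms like these on its own data partition and the partial results (e.g. gradients) would be aggregated across nodes; the formula itself is unchanged.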