High Impact Academic Paper Prediction Using Temporal and Topological Features

Davletov, Feruz; Aydın, Ali Selman; Çakmak, Ali

doi:10.1145/2661829.2662066

Cited by 27 publications

(16 citation statements)

References 26 publications

“…Predicting popularity with feature-driven models. Data-driven approaches [11,19] treat popularity as a nondecomposable process and take a bottom-up approach. They rely on machine learning algorithms to make the connection between item popularity and an extensive set of features.…”

Section: Related Work (mentioning)

confidence: 99%

“…Similar to the setup in the previous Sec 6.1, we observe cascades for 5 minutes, 10 minutes and 1 hour and fit the Hawkes and Seismic models for each cascade. The train-test split is different, in order to replicate closer the experimental setup in Martin et al [25]: the data from first half of July (1-15) is used for training and the data from second half of July (16-31) for testing. We use the News historical data from April to June to construct the past user success feature for the feature-driven approach.…”

Section: Cascade Size: Generative Vs Feature-driven (mentioning)

confidence: 99%

Feature Driven and Point Process Approaches for Popularity Prediction

Mishra

Rizoiu

Xie

2016

Proceedings of the 25th ACM International on Conference on Information and Knowledge Management

148

166

Predicting popularity, or the total volume of information outbreaks, is an important subproblem for understanding collective behavior in networks. Each of the two main types of recent approaches to the problem, feature-driven and generative models, has desired qualities and clear limitations. This paper bridges the gap between these solutions with a new hybrid approach and a new performance benchmark. We model each social cascade with a marked Hawkes self-exciting point process, and estimate the content virality, memory decay, and user influence. We then learn a predictive layer for popularity prediction using a collection of cascade history. To our surprise, the Hawkes process with a predictive overlay outperforms recent feature-driven and generative approaches on existing tweet data [43] and a new public benchmark on news tweets. We also found that a basic set of user features and event time summary statistics performs competitively in both classification and regression tasks, and that adding point process information to the feature set further improves predictions. From these observations, we argue that future work on popularity prediction should compare across feature-driven and generative modeling approaches in both classification and regression tasks.

“…There are no existing work related to prediction of company's future trend. Therefore, we compared our method to the most recent state of the art citation-based work related to high impact academic paper prediction proposed by Davletov et al (Davletov et al, 2014). More precisely, we applied Davletov's method (Citation) to open patents and compared it with the result obtained by our method.…”

Section: Methods (mentioning)

confidence: 99%

“…McNamara et al proposed a method for predicting paper's future impact by using topological features extracted from citation network (McNamara et al, 2013). In addition to topological features, Davletov et al predicted high impact academic paper based on temporal features of citations (Davletov et al, 2014). There are a few academic paper prediction method used on textual features (Kogan et al, 2009; Joshi et al, 2010; Yagatama et al, 2011), while much of the previous work on paper prediction used mainly citation statistics (Shi et al, 2010; Yan et al, 2012).…”

Section: Introduction (mentioning)

confidence: 99%

Prediction of Company's Trend based on Publication Statistics and Sentiment Analysis

Fukumoto

Suzuki

Nonaka

et al. 2016

Proceedings of the 8th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management

This paper presents a method for predicting a company's trend in research and technological innovation/development (R&D) in its business area. We used three types of data collections, i.e., scientific papers, open patents, and newspaper articles, to estimate temporal changes of trends in a company's business area. We used frequency counts of scientific papers and open patents published in time series. For news articles, we applied sentiment analysis to extract positive news reports related to the company's business areas, and counted their frequencies. For each company, we then created temporal changes based on these frequency statistics. For each business area, we clustered these temporal changes. Finally, we estimated prediction models for each cluster. The results show that the model obtained by combining the three types of data is effective for predicting a company's future trends; in particular, SP clustering contributes to overall performance.

“…Hitherto, existing approaches have focused on the prediction of future h-index values at author level [14], [15] or citation counts at publication level [16], [17]. Another categorization of existing approaches occurs with regards to their modeling methodology.…”

Section: Related Work (mentioning)

confidence: 99%

A Data-Driven Unified Framework for Predicting Citation Dynamics

Gogoglou

Manolopoulos

2020

IEEE Trans. Big Data

With the rising interest in predicting the scientific output, various efforts have been made to predict a scientist's h-index or the citation trajectory of a publication. In this work, we employ a dynamic categorization for scientists to ensure at each stage of their careers a comparison amongst their peers and combine this grouping with predictive models to estimate a scientist's future impact, as expressed by citation counts. Moreover, we investigate a wide range of factors identifying their importance in determining the future of science for different performance and academic levels with particular emphasis on features describing a scholar's position in multi-layered collaboration and citation networks. The robustness of the approach is examined on a longitudinal dataset centered around 700,302 data points representing Computer Scientists in various time periods with their complete networks of over 18 million collaboration links and 36 million citations. Our results indicate up to 30% improvement in prediction performance compared to baseline methods along with an average R² = 0.96 for short term and R² = 0.91 for long term predictions.
