Bio-oil produced through pyrolysis of lignocellulosic biomass has recently received significant attention because of its potential use as a second-generation biofuel. The yield and characteristics of the produced bio-oil are affected by the reaction conditions (reactor type, particle size, feed rate, operating temperature, heating rate, retention time, etc.) and by the type of feedstock used (softwood, hardwood, agricultural plant residues, miscanthus, etc.). Recently, machine learning (ML) techniques have been widely employed to predict pyrolysis performance and the characteristics of bio-oil. In this study, a comprehensive review of ML research on bio-oil was carried out. Regression methods were most frequently employed to build prediction models. The five most frequently used ML methods in bio-oil research were random forest, artificial neural network, gradient boosting, support vector regression, and linear regression. In addition, features were frequently extracted manually based on the researchers' own knowledge, and restricted datasets were employed in previous studies. We highlight the challenges and potential of cutting-edge ML techniques in bio-oil production.