Among the vast range of evolutionary and metaheuristic optimization techniques that have recently been used in different fields of water resources research, the firefly algorithm had not been reported until it was thoroughly considered and evaluated in the discussed paper. In that paper, the authors evaluated the performance of the firefly algorithm (FA) for the optimization of reservoir systems. First, FA was applied to five mathematical test functions. The authors then used FA to solve two reservoir operation problems with different purposes, namely irrigation supply and hydropower production. Their results demonstrated the superior performance of FA, in terms of convergence rate to the global optimum and lower variance of results about the global optimum, compared with a commonly used evolutionary algorithm, the genetic algorithm (GA). Despite the authors' great effort, the discusser found several issues to add to the subjects treated in the original paper.

Regarding the mathematical test functions, although the authors attempted to cover different types of test functions, they are questioned for not using any discrete problem to test FA. The claim of "superior performance" cannot be accepted until the algorithm's performance is also demonstrated on discrete problems.

Another issue is that the authors did not consider the carryover constraint in their reservoir operation model. This constraint is commonly imposed in reservoir operation modeling and can be found in the papers by Garousi-Nejad and Bozorg-Haddad (2014), Garousi-Nejad et al. (2016), and Bozorg-Haddad et al. (2015).

A further question is why the authors used evolutionary and metaheuristic algorithms to solve reservoir operation problems at all, given that the classic methods are less complicated and that evolutionary or metaheuristic algorithms usually require more processing time to find a solution.
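The carryover constraint mentioned above ties the end-of-horizon storage back to the initial storage, so an optimized release policy cannot simply drain the reservoir over the planning horizon. A minimal sketch of how such a constraint might be checked is given below; the variable names are purely illustrative (not taken from the discussed paper), and the mass balance ignores evaporation and spill for brevity.

```python
def simulate_reservoir(s0, inflows, releases, s_min, s_max):
    """Simple mass-balance simulation of reservoir storage.

    s0       : initial storage (illustrative units)
    inflows  : inflow per operation period
    releases : decision variables produced by the optimizer
    Returns the storage trajectory; raises if a storage bound
    is violated in any period.
    """
    storage = [s0]
    for inflow, release in zip(inflows, releases):
        s_next = storage[-1] + inflow - release  # losses ignored for brevity
        if not (s_min <= s_next <= s_max):
            raise ValueError(f"storage bound violated: {s_next}")
        storage.append(s_next)
    return storage

def satisfies_carryover(s0, inflows, releases, s_min, s_max):
    """Carryover constraint: end-of-horizon storage must not fall
    below the initial storage, i.e. S_T >= S_0, so the policy does
    not meet demands by emptying the reservoir."""
    trajectory = simulate_reservoir(s0, inflows, releases, s_min, s_max)
    return trajectory[-1] >= s0
```

In an evolutionary or metaheuristic framework such as GA or FA, a candidate release schedule that fails this check would typically be penalized in the objective function or repaired before evaluation.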
What makes evolutionary or metaheuristic algorithms appropriate for the purposes mentioned in the original article?

The next issue is why the results of only five independent runs were reported. As the authors of the original paper themselves mentioned, the numbers of runs taken in other articles were 15, 25, and even 100. Is there a specific reason that just five runs were conducted, or was there a limitation forcing this number of runs?

Moreover, the authors used the number of function evaluations to compare the results of GA and FA without stating the reason for using this criterion instead of the number of iterations. Another important issue is how the number of function evaluations was computed for each algorithm. When the number of function evaluations is used to compare the performance of different algorithms, the way it was computed should be reported.
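The accounting question raised above is not pedantic, because the count depends on implementation conventions. One common, though by no means universal, way of computing the number of function evaluations (NFE) is sketched below; the per-iteration counts and the `include_initial` flag are assumptions introduced here for illustration, which is precisely why the original paper should have stated its own convention.

```python
def nfe_ga(pop_size, generations, include_initial=True):
    """Illustrative NFE count for a generational GA: one objective
    evaluation per offspring per generation, optionally plus the
    initial population (conventions differ on this point)."""
    nfe = pop_size * generations
    if include_initial:
        nfe += pop_size
    return nfe

def nfe_fa(n_fireflies, iterations, include_initial=True):
    """Illustrative NFE count for FA, assuming each firefly is moved
    and re-evaluated once per iteration. Some implementations instead
    re-evaluate after every pairwise attraction move, which can cost
    up to n_fireflies**2 evaluations per iteration, so the two
    conventions can differ by a factor of the swarm size."""
    nfe = n_fireflies * iterations
    if include_initial:
        nfe += n_fireflies
    return nfe
```

Under this convention, a GA with a population of 50 run for 100 generations and an FA with 25 fireflies run for 200 iterations perform nearly the same number of evaluations, which is what makes NFE a fairer basis than iteration counts, provided the computation is disclosed.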