Background: The central nervous system (CNS) optimizes arm movements to minimize some cost function. Simulating parts of the nervous system is one way of obtaining accurate information about the neurology and treatment of neuromuscular diseases. The primary purpose of this paper is to model and control the human arm in a reaching movement based on reinforcement learning (RL) theory. Methods: First, Zajac’s muscle model is improved with a fuzzy system. Second, the proposed muscle model is applied to the six muscles that actuate a two-link arm moving in the horizontal plane. Third, the model parameters are estimated with a genetic algorithm (GA). Experimental data recorded from healthy subjects are used to assess the approach. Finally, the RL algorithm is used to guide the arm in reaching tasks. Results: The results show that (1) the movement of the proposed system is temporally similar to real arm movement; (2) the RL algorithm can generate the motor commands obtained from electromyograms (EMGs); and (3) the activation function obtained from the system is similar to the activation function obtained from real data, which may support the possibility that RL operates in the CNS (basal ganglia). In addition, the virtual reality environment of MATLAB is used to provide an effective graphical representation of the arm model. Conclusion: Because the RL method is representative of the brain’s control function, it offers features such as a better settling time, no peak overshoot, and robustness.