Deep reinforcement learning (DRL) algorithms are often used to search for the best trading strategy in algorithmic trading. However, classical DRL models converge slowly, and the features they extract from market data are relatively simple, so the agent learns from incomplete information. In this paper, we propose a supervised reinforcement learning method: a hybrid method for forming an optimal investment strategy that combines a long short-term memory (LSTM) neural network with deep deterministic policy gradient (DDPG). Supervised learning takes part in the early stage of reinforcement learning, so the agent obtains guiding prior experience, which reduces its learning cost and accelerates convergence. In addition, a multi-feature state input is added to the model to improve the agent's learning of the environment. Compared with the DDPG algorithm, the LSTM-DDPG algorithm converges faster. Experiments on three regional stock markets, in China, the United States, and Europe, show that the LSTM-DDPG algorithm achieves higher profit and lower risk than the buy-and-hold (B&H), moving average convergence divergence (MACD), and LSTM trading strategies.

INDEX TERMS Supervised reinforcement learning, finance and operations, reinforcement learning, deep deterministic policy gradient, long short-term memory.