2022
DOI: 10.3390/e24101394
A Fault Detection Method Based on an Oil Temperature Forecasting Model Using an Improved Deep Deterministic Policy Gradient Algorithm in the Helicopter Gearbox

Abstract: The main gearbox is very important for the operation safety of helicopters, and the oil temperature reflects the health degree of the gearbox; therefore, establishing an accurate oil temperature forecasting model is an important step toward reliable fault detection. Firstly, in order to achieve accurate gearbox oil temperature forecasting, an improved deep deterministic policy gradient algorithm with a CNN–LSTM base learner is proposed, which can excavate the complex relationship between oil temperature and worki…
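The abstract names a CNN–LSTM base learner for oil-temperature forecasting. As a rough sketch only (the paper's layer widths, window length, and number of monitored channels are not given in this snippet, so every size below is an assumption), such a forecaster could look like this in PyTorch:

```python
import torch
import torch.nn as nn

class CNNLSTMForecaster(nn.Module):
    """1-D CNN feature extractor followed by an LSTM, ending in a
    single-value regression head for the next oil-temperature reading.
    All layer sizes are illustrative assumptions, not the paper's."""

    def __init__(self, n_features: int, hidden: int = 64):
        super().__init__()
        # Convolution over the time axis captures short-term patterns
        # among the condition-monitoring channels.
        self.cnn = nn.Sequential(
            nn.Conv1d(n_features, 32, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # The LSTM models longer-range temporal dependencies.
        self.lstm = nn.LSTM(32, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, n_features)
        z = self.cnn(x.transpose(1, 2)).transpose(1, 2)  # (batch, time, 32)
        out, _ = self.lstm(z)
        return self.head(out[:, -1, :]).squeeze(-1)  # next-step temperature

# Example: forecast from a 30-step window of 6 monitored quantities.
model = CNNLSTMForecaster(n_features=6)
window = torch.randn(8, 30, 6)
print(model(window).shape)  # torch.Size([8])
```

The improved DDPG algorithm described in the abstract would sit on top of base learners like this one; the sketch covers only the forecasting component.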

Cited by 4 publications (2 citation statements) | References: 36 publications
“…The reward structure of the game stipulates that if the agent correctly predicts the class of a sample, it receives a favorable reward; conversely, an adverse reward is imparted for a wrong prediction [38]. The agent's primary objective is to maximize the cumulative reward throughout this game by adopting an optimal behavior policy acquired through sustained interaction with the environment [39]. To execute this deduction game via DRL, we delineate it with the tuple {S_f, A_f, P_f, R_f}, where S_f denotes the state space, A_f the action space, P_f the transition probability, and R_f the reward function.…”
Section: Fundamental Principle of Fault Detection Agent
confidence: 99%
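As a toy illustration of the reward function R_f described in this citation statement (not the citing authors' code; the ±1 reward values and the class encoding are assumptions), one step of the deduction game might be scored like this:

```python
import random

def fault_reward(predicted_class: int, true_class: int) -> float:
    """Reward R_f for the deduction game: favorable for a correct class
    prediction, adverse for a wrong one (values assumed to be +/-1)."""
    return 1.0 if predicted_class == true_class else -1.0

# One step: the state is the current sample, the action is a class label,
# and the environment scores the agent's answer.
random.seed(0)
true_class = random.randrange(3)   # true class of the current sample
action = random.randrange(3)       # agent's predicted class
print(fault_reward(action, true_class))  # +1.0 or -1.0
```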
“…Subsequent to each move, the environment furnishes the agent with an immediate reward and the next query (i.e., the subsequent sample). The reward structure of the game stipulates that if the agent correctly predicts the class of the sample, it receives a favorable reward; conversely, an adverse reward is imparted for a wrong prediction [38]. The agent's primary objective is to maximize the cumulative reward throughout this game by adopting an optimal behavior policy acquired through sustained interaction with the environment [39].…”
Section: Fault-Tolerant Action Using TD3PG Agents
confidence: 99%
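To make the reward-and-next-sample cycle concrete, here is a minimal interaction loop under the same assumptions as the sketch above (a hypothetical list of labeled samples, and a stand-in random policy in place of the trained TD3PG actor):

```python
import random

def run_episode(samples, labels, policy):
    """Play one pass of the deduction game: for each sample the agent
    emits a class label, the environment returns the immediate reward
    and presents the next sample."""
    total_reward = 0.0
    for sample, label in zip(samples, labels):
        action = policy(sample)                           # predicted class
        total_reward += 1.0 if action == label else -1.0  # immediate reward
    return total_reward  # the quantity the agent learns to maximize

# Stand-in random policy; a trained TD3PG actor would replace this.
random.seed(0)
samples = [[random.random() for _ in range(4)] for _ in range(10)]
labels = [random.randrange(3) for _ in range(10)]
print(run_episode(samples, labels, lambda s: random.randrange(3)))
```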