Virtual Network Function Resource Adaptation (VNF-RA) aims to adapt Network Function Virtualization Infrastructure (NFVI) resources to the geographical fluctuation of user demand, maximizing the quality of service (QoS) of the offered services and the energy efficiency of the NFVI while limiting the risk of Service Level Agreement (SLA) breaches and the CAPEX and OPEX of cloud operators and their customers. Virtual Network Function (VNF) resource usage forecasting therefore plays a key role in enabling proactive resource adaptation in dynamic Network Function Virtualization (NFV) environments, where resource demand changes constantly. In parallel, Long Short-Term Memory (LSTM)-based prediction has attracted considerable interest in the research community, and several research teams have proposed VNF resource usage prediction algorithms based on this machine learning technique. However, current LSTM-based VNF resource usage forecasting techniques lack the flexibility to take several resource attributes of different scales, observed over many time steps, as input in order to predict several other resource attributes of a Service Function Chain (SFC) over many time steps. In this paper, we advance the state of the art by presenting A-NFVLearn, a flexible multivariate LSTM-based model with an attention mechanism that uses different attributes of the resource load history (CPU, memory, I/O bandwidth) of the various VNFs of an SFC to forecast the future load of multiple resources of a VNF. We also propose a multivariate outlier filtering scheme for the pre-processing stage, based on Adjusted Outlyingness (AO), which reduces the training time of LSTM-based models without degrading prediction accuracy.
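To make the forecasting setting concrete, the following minimal sketch shows one way to turn multivariate load histories into many-to-many training samples: several input attributes of different scales over `n_in` time steps, mapped to several target attributes over `n_out` time steps. It is an illustration only, not the A-NFVLearn pipeline. The names `min_max_scale` and `make_windows`, the attribute keys, and the choice of min-max scaling are all assumptions introduced here, and the sketch covers neither the attention mechanism nor the AO-based filtering.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical layout: attribute name -> one measurement per time step.
Series = Dict[str, List[float]]
Sample = Tuple[List[List[float]], List[List[float]]]


def min_max_scale(values: Sequence[float]) -> List[float]:
    """Rescale one attribute to [0, 1] so attributes of different scales are comparable."""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A constant attribute carries no variation; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def make_windows(
    series: Series,
    input_attrs: Sequence[str],
    target_attrs: Sequence[str],
    n_in: int,
    n_out: int,
) -> List[Sample]:
    """Build (input window, target window) pairs with a sliding window.

    Each input window has shape n_in x len(input_attrs) and each target
    window has shape n_out x len(target_attrs).
    """
    lengths = {len(vals) for vals in series.values()}
    if len(lengths) != 1:
        raise ValueError("all attributes must cover the same time steps")
    n_steps = lengths.pop()

    # Assumption: scaling uses the whole series for brevity. In a real
    # training setup the bounds would be computed on the training split only.
    scaled = {name: min_max_scale(vals) for name, vals in series.items()}

    samples: List[Sample] = []
    for start in range(n_steps - n_in - n_out + 1):
        split = start + n_in
        x = [[scaled[a][t] for a in input_attrs] for t in range(start, split)]
        y = [[scaled[a][t] for a in target_attrs] for t in range(split, split + n_out)]
        samples.append((x, y))
    return samples
```

With, for example, CPU and memory histories of two VNFs as inputs and the CPU load of a third VNF as target, each sample pairs an `n_in`-step multivariate history with an `n_out`-step future window, which is the input/output shape a sequence model of this kind would be trained on.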