Abstract-The application of machine learning algorithms in wireless communications has attracted increasing attention due to the promising performance gains recently achieved. Static classification algorithms have been successfully applied to training protocols that adapt transmission parameters according to context information. However, in reality, there are many timevarying reasons for fading channel quality including mobility of sender, receiver, and/or obstacles within the environment. Moreover, time-varying noise further exacerbates the dynamics of the channel. These problems pose new challenges for the application of static classification algorithms in context-aware algorithms and suggest that sequential classifiers which leverage the temporal dynamics and correlation of context information might be more appropriate. In this paper, we apply sequential training to rate adaptation (ASTRA), leveraging the temporal correlation of context information. In particular, linear and non-linear sequential coding schemes are used in the training process for selecting the modulation/coding rate that achieves the highest throughput for the given context. Experimental results on measurements from emulated and in-field channels demonstrate that ASTRA can significantly increase the accuracy of selecting these target rates by up to 175% and increase the resulting throughput by up to 66% over rate adaptation training which uses static classifier-based methods.I. INTRODUCTION Wireless channels are known to have time-varying quality, especially in mobile and vehicular networks. In such scenarios, algorithms attempt to adapt the transmission rate to by measuring either the packet losses [1], [2] or the channel quality [3], [4], [5]. As channel fluctuations increase, the ability to converge to optimality becomes more and more elusive [6]. Thus, recent works have proposed using the context information and machine learning to quickly converge to optimality [7], [8].Context-aware rate adaptation schemes attempt to leverage existing patterns in the collected context information to adjust the transmission parameters to improve performance. Examples of such schemes include neural networks and genetic algorithms for parameter adaptation in cognitive radio networks [9], [10], distributed classification with data from different sensors [11], and static classification-based rate adaptation [7]. Existing works in context-aware rate adaptation have mainly focus on operations on static sets of attributes. these works treat measurements from sensors as independent and identically distributed data points, using them to infer decision rules.However, the mapping of context information with network performance in the field is not static, because in-field channels and contexts change over time. Since the fluctuating channel state cannot sensibly be represented as a fixed set of measurements, it is not sufficient to simply determine when