Abstract-Designing efficient channel access schemes for wireless communications without any prior knowledge of the environment is very challenging, because the channel state distribution of the spectrum resources may be entirely or partially stochastic or adversarial at different times and locations. In this paper, we propose an online learning algorithm for adaptive channel access in unknown environments, based on the theory of multi-armed bandit (MAB) problems. By automatically tuning two control parameters, the learning rate and the exploration probability, our algorithm finds the optimal channel access strategy and achieves almost optimal learning performance over time in different scenarios. Quantitative performance studies indicate superior throughput gains compared with previous solutions and demonstrate the flexibility of our algorithm in practice: it is resilient to both oblivious and adaptive jamming attacks of varying intelligence and attacking strength, ranging from no attack to a full attack on all spectrum resources. We conduct extensive simulations to validate our theoretical analysis.

Index Terms-Online learning, jamming attack, stochastic and adversarial bandits, wireless communications, security.