Multi-agent reinforcement learning has made substantial empirical progress in solving games with a large number of players. However, theoretically, the best known sample complexity for finding a Nash equilibrium in general-sum games scales exponentially in the number of players due to the size of the joint action space, and there is a matching exponential lower bound. This paper investigates which learning goals admit better sample complexities in the setting of $m$-player general-sum Markov games with $H$ steps, $S$ states, and $A_i$ actions per player. First, we design algorithms for learning an $\epsilon$-Coarse Correlated Equilibrium (CCE) in $\widetilde{\mathcal{O}}(H^5 S \max_{i\le m} A_i / \epsilon^2)$ episodes, and an $\epsilon$-Correlated Equilibrium (CE) in $\widetilde{\mathcal{O}}(H^6 S \max_{i\le m} A_i^2 / \epsilon^2)$ episodes. This is the first line of results for learning CCE and CE with sample complexity polynomial in $\max_{i\le m} A_i$. Our algorithm for learning CE integrates an adversarial bandit subroutine which minimizes a weighted swap regret, along with several novel designs in the outer loop. Second, we consider the important special case of Markov Potential Games, and design an algorithm that learns an $\epsilon$-approximate Nash equilibrium within $\widetilde{\mathcal{O}}(S \sum_{i\le m} A_i / \epsilon^3)$ episodes (when only highlighting the dependence on $S$, $A_i$, and $\epsilon$), which depends only linearly on $\sum_{i\le m} A_i$ and significantly improves over existing efficient algorithms in the $\epsilon$ dependence. Overall, our results shed light on what equilibria or structural assumptions on the game may enable sample-efficient learning with many players.
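To make the swap-regret component concrete, below is a minimal sketch of the classical Blum–Mansour reduction from swap regret to external regret under bandit feedback, with one Exp3 instance per action. This is illustrative only and is not the paper's algorithm: the paper's subroutine minimizes a *weighted* swap regret with a specific weighting scheme, which this sketch omits. The names `Exp3`, `stationary`, and `swap_regret_play` are hypothetical identifiers introduced here for illustration.

```python
import numpy as np

class Exp3:
    """One external-regret (Exp3) learner; the Blum-Mansour reduction
    keeps one such instance per action."""
    def __init__(self, n_actions, lr):
        self.w = np.zeros(n_actions)  # negative cumulative loss estimates
        self.lr = lr

    def dist(self):
        # Softmax over accumulated (negated) losses.
        w = self.w - self.w.max()
        p = np.exp(w)
        return p / p.sum()

    def update(self, loss_vec):
        self.w -= self.lr * loss_vec

def stationary(Q, iters=200):
    """Power-iterate p = pQ to approximate the stationary distribution
    of the row-stochastic matrix Q (rows have full support under Exp3)."""
    p = np.full(Q.shape[0], 1.0 / Q.shape[0])
    for _ in range(iters):
        p = p @ Q
    return p / p.sum()

def swap_regret_play(n_actions, loss_fn, T, lr=0.1, seed=0):
    """loss_fn(t, a) -> loss in [0, 1] of the single played action
    (bandit feedback); loss_fn is a hypothetical placeholder."""
    rng = np.random.default_rng(seed)
    experts = [Exp3(n_actions, lr) for _ in range(n_actions)]
    for t in range(T):
        Q = np.stack([e.dist() for e in experts])  # row i: instance i's dist
        p = stationary(Q)              # play the stationary distribution of Q
        a = rng.choice(n_actions, p=p)
        loss = loss_fn(t, a)
        est = np.zeros(n_actions)      # importance-weighted loss estimate
        est[a] = loss / p[a]
        for i, e in enumerate(experts):
            e.update(p[i] * est)       # instance i is charged p[i] * est
    return p
```

The key step is playing the stationary distribution $p = pQ$: it makes the learner's loss decompose across the per-action instances, so each instance's external regret bounds the regret of swapping that action, and summing over instances bounds the total swap regret.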