3D point cloud classification has been a hot research topic in recent years. Most existing point cloud processing frameworks lack context-aware features because they extract insufficient local feature information. We therefore design an augmented sampling and grouping (ASG) module to efficiently obtain fine-grained features from the original point cloud. In particular, this module enhances the neighborhood around each centroid and exploits the local mean and the global standard deviation to mine the point cloud’s local and global features. In addition, inspired by UFO-ViT, a transformer architecture for 2D vision tasks, we make the first attempt to apply a linearly-normalized attention mechanism to point cloud processing, and propose UFO-Net, a novel transformer-based point cloud classification architecture. An effective local feature learning module is adopted as a bridge connecting the different feature extraction modules. Notably, UFO-Net stacks multiple blocks to better capture the feature representation of the point cloud. Extensive experiments and ablation studies on public datasets show that our method outperforms other state-of-the-art methods: UFO-Net achieves 93.7% overall accuracy on ModelNet40, 0.5% higher than PCT, and 83.8% overall accuracy on ScanObjectNN, 3.8% higher than PCT.
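To make the linearly-normalized attention idea concrete, the following is a minimal NumPy sketch of softmax-free attention in the style of UFO-ViT's cross-normalization (XNorm), where the key–value product is formed first for linear complexity in the number of points and both factors are L2-normalized. This is an illustrative simplification under stated assumptions, not the UFO-Net implementation: the function names are ours, `gamma` stands in for a learnable scale, and multi-head structure is omitted.

```python
import numpy as np

def xnorm(x, gamma=1.0, axis=-1, eps=1e-6):
    # XNorm-style normalization: L2-normalize along `axis`, scaled by gamma
    # (a learnable parameter in the real module; a constant here).
    return gamma * x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def linear_normalized_attention(q, k, v):
    # Softmax-free attention: associate (k^T v) first, a (d, d) matrix,
    # so the cost is O(n d^2) in the number of points n, not O(n^2 d).
    kv = k.T @ v                                   # (d, d)
    # Normalize the query rows and the key-value columns before mixing.
    return xnorm(q, axis=-1) @ xnorm(kv, axis=0)   # (n, d)

# Toy example: n points with d-dimensional features per attention head.
n, d = 128, 32
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
out = linear_normalized_attention(q, k, v)
print(out.shape)  # (128, 32)
```

Because each query row and each key–value column is unit-normalized (with `gamma=1`), every output entry is a dot product of two unit vectors and is therefore bounded in magnitude by 1, which is what removes the need for a softmax to keep the attention output well-scaled.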