Embeddings derived from cell graphs are a promising tool for exploring spatial transcriptomics (ST) datasets. However, existing methods rely on a graph structure defined by spatial proximity, which inadequately represents the diversity of cell-cell interactions (CCIs). This study introduces STAGUE, a framework that jointly learns a cell graph structure and a low-dimensional embedding from ST data. STAGUE uses graph structure learning to parameterize and refine the cell graph adjacency matrix, which yields learnable graph views for contrastive learning. The resulting embeddings and cell graph improve spatial clustering accuracy and facilitate the discovery of novel CCIs. In benchmarks across 86 real and simulated ST datasets, STAGUE outperforms 15 comparison methods in clustering performance. STAGUE also delineates heterogeneity in human breast cancer tissues, revealing activation of epithelial-to-mesenchymal transition and PI3K/AKT signaling in specific sub-regions. Furthermore, the CCIs that STAGUE identifies align more closely with established biological knowledge than those identified by existing graph autoencoder-based methods. Finally, STAGUE reveals regulatory genes that participate in these CCIs, including genes enriched in neuropeptide signaling and receptor tyrosine kinase signaling pathways, providing insight into the underlying biological processes.
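
To make the contrast between a fixed spatial-proximity graph and a parameterized graph view concrete, the following is a minimal NumPy sketch and not the STAGUE implementation. It rests on several assumptions that do not come from the abstract: the helper names (`topk_graph`, `spatial_view`, `learned_view`, `propagate`, `info_nce`), the use of a per-gene weight vector as the learnable graph parameters, a single parameter-free propagation step in place of a trained encoder, and an InfoNCE-style loss as the contrastive objective. In an actual structure-learning setup the weights would be updated by gradient descent; here they are held fixed to keep the example short.

```python
import numpy as np


def topk_graph(score, k):
    """Symmetric binary adjacency keeping each node's k highest-scoring neighbors."""
    n = score.shape[0]
    s = score.astype(float)          # astype returns a copy, input is untouched
    np.fill_diagonal(s, -np.inf)     # never select a node as its own neighbor
    idx = np.argsort(-s, axis=1)[:, :k]
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(adj, adj.T)    # symmetrize


def spatial_view(coords, k):
    """Fixed view: k-NN graph from Euclidean distance between spot coordinates."""
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return topk_graph(-dist, k)      # closer spots score higher


def learned_view(expr, weights, k):
    """Hypothetical parameterized view: k-NN over cosine similarity of
    expression re-weighted by `weights` (the assumed learnable parameters)."""
    z = expr * weights
    z = z / (np.linalg.norm(z, axis=1, keepdims=True) + 1e-12)
    return topk_graph(z @ z.T, k)


def propagate(adj, x):
    """One step of symmetric-normalized propagation with self-loops."""
    a = adj + np.eye(adj.shape[0])
    d = a.sum(axis=1)
    return (a / np.sqrt(d[:, None] * d[None, :])) @ x


def info_nce(z1, z2, tau=0.5):
    """InfoNCE-style loss: the same cell in both views is the positive pair."""
    z1 = z1 / (np.linalg.norm(z1, axis=1, keepdims=True) + 1e-12)
    z2 = z2 / (np.linalg.norm(z2, axis=1, keepdims=True) + 1e-12)
    logits = z1 @ z2.T / tau
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


# Synthetic stand-in data: 30 spots, 2-D coordinates, 8 genes.
rng = np.random.default_rng(0)
coords = rng.random((30, 2))
expr = rng.random((30, 8))
weights = np.ones(8)  # would be trained in a real structure-learning loop

anchor = spatial_view(coords, k=5)
learner = learned_view(expr, weights, k=5)
loss = info_nce(propagate(anchor, expr), propagate(learner, expr))
print(f"contrastive loss between views: {loss:.4f}")
```

The design point illustrated here is that the second view depends on parameters (`weights`), so minimizing the contrastive loss with respect to them can move the graph away from pure spatial proximity, whereas the spatial view is fixed by the coordinates alone.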