Understanding the public’s diverse linguistic expressions about rainfall and floods provides a basis for flood disaster studies and enhances linguistic and cultural awareness. However, existing research tends to overlook this linguistic complexity, which can introduce bias. In this study, we introduce a novel algorithm that captures rainfall- and flood-related expressions by considering the relationship between precipitation observations and linguistic expressions. Analyzing 210 million social media microblogs from 2017, we identified 594 keywords, 20 times more than a typical manually created bag of words. Using a large language model, we categorized these keywords into rainfall, flood, and other related terms. We analyzed the semantic features of these keywords in terms of popularity, credibility, time delay, and part of speech, and found that rainfall-related terms are the most commonly used, that flood-related keywords often show a longer time delay relative to precipitation, and that part of speech differs notably across categories. We also assessed spatial characteristics from keyword-centric and city-centric perspectives, revealing that 49.5% of the keywords have significant spatial correlation, with differing median centers that reflect regional variations. Large and disaster-impacted cities show the richest diversity of rainfall- and flood-related expressions.