Fine particulate matter (PM2.5), consisting of tiny airborne particles, is a form of air pollution that harms the environment and human health at high concentrations. Elevated PM2.5 levels also reduce visibility and make the air appear hazy. Because of these environmental and health impacts, almost every country monitors PM2.5 levels and records them repeatedly over time at many sites. Since the data are collected repeatedly, the repeated measurements of PM2.5 at a given site are likely to be dependent. Modeling and analyzing these repeated measurements can help policymakers recommend new policies or update existing ones, so adequate modeling of such data is of great interest to researchers and policymakers. Moreover, because the data are collected repeatedly and in immense volume, big data modeling techniques are required. This paper proposes a new modeling framework for analyzing categorical responses from repeatedly collected big data and for predicting their trajectory risk. We develop a divide-and-recombine approach for analyzing such data. We use a Markov model for data division, and a Markov chain to recombine the marginal and conditional probabilities and to estimate the joint probabilities of a trajectory. We illustrate the proposed model using PM2.5 outdoor air pollution data from the United States for the years 2000 to 2020. The performance of the proposed methodology is also assessed through bootstrap simulation studies. The