The explosion of data on the internet has ushered in the big data era, and extracting useful information from that data requires effective processing. One of the challenges is gesture recognition in video processing. This study therefore proposes Latent-Dynamic Conditional Neural Fields and compares them with other members of the Conditional Random Fields family. To improve accuracy, these methods are combined with Fuzzy Clustering. The results show that Fuzzy Latent-Dynamic Conditional Neural Fields achieve the highest performance. In addition, combining the basic classifiers with Fuzzy C-Means Clustering performs better than the original classifiers alone. The evaluation is carried out on a temporal dataset for gesture phase segmentation.