Sound event detection systems, which classify events in audio data, typically comprise two main stages: in the first, sound events are separated from the acoustic background; in the second, the detected events are classified. In recent years this research area has grown increasingly popular across a wide range of applications, such as surveillance and the learning and recognition of urban activity patterns, particularly when combined with imaging sensors. It nonetheless poses challenging problems owing to noise, the complexity of the events, poor microphone quality, unfavorable microphone placement, and events occurring simultaneously. This research compared accurate signal-processing and classification methods in order to propose a novel method for detecting sound events against the background in urban scenes. Advantages of the proposed method include the use of wavelet and Mel-frequency cepstral coefficients, an analysis of the effect of the classification method, and a reduction in the amount of training data required. Applied to a standard sound database, the proposed method achieved an accuracy of about 99% in event detection.
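As an illustration of the feature-extraction step mentioned above, the following is a minimal numpy-only sketch of the standard Mel-frequency cepstral coefficient (MFCC) pipeline: framing, windowing, power spectrum, mel filterbank, log compression, and a DCT-II. All parameter values (sample rate, frame length, filter count) are generic defaults assumed for illustration, not the settings used in this work, and the synthetic sine-wave clip merely stands in for a recorded urban sound.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    # Triangular filters spaced evenly on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fb[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fb[i - 1, k] = (right - k) / max(right - center, 1)
    return fb

def mfcc(signal, sr=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_ceps=13):
    # Slice the signal into overlapping frames and apply a Hamming window.
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Log mel-filterbank energies (small floor avoids log(0)).
    energies = np.log(power @ mel_filterbank(n_filters, n_fft, sr).T + 1e-10)
    # DCT-II decorrelates the filterbank energies; keep the first n_ceps.
    n = np.arange(n_filters)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_filters))
    return energies @ dct.T

# One second of a 440 Hz tone stands in for a recorded audio clip.
sr = 16000
t = np.arange(sr) / sr
clip = np.sin(2 * np.pi * 440.0 * t)
feats = mfcc(clip, sr=sr)
print(feats.shape)  # one 13-coefficient vector per 10 ms hop
```

In a detection pipeline such as the one described here, these per-frame coefficient vectors (often alongside wavelet features) would then be fed to the classification stage.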