This paper proposes a two-stage scheme for acoustic echo cancellation (AEC) in a single-channel scenario in the presence of noise. The major challenge in adaptive filter-based AEC is obtaining a separate reference signal; to overcome it, a delayed version of the echo- and noise-suppressed signal is used as the reference. A modified objective function is derived accordingly for a gradient-based adaptive filter algorithm, and its convergence to the optimum Wiener-Hopf solution is proved. The output of the AEC block is fed to an acoustic noise cancellation (ANC) block, which employs a spectral subtraction-based algorithm with adaptive spectral floor estimation. To obtain fast but smooth convergence with maximum possible echo and noise suppression, a set of updating constraints is proposed based on speech characteristics (e.g., energy and correlation) of the reference and current frames, taking into account whether each frame is voiced, unvoiced, or a pause. Extensive experiments on several echo- and noise-corrupted natural utterances from the TIMIT database show that the proposed scheme significantly reduces the effect of both echo and noise in terms of objective and subjective quality measures.
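The central idea of the first stage, using a delayed copy of the canceller's own output as the reference of a gradient-based adaptive filter, can be illustrated with a minimal sketch. It is not the algorithm derived in the paper. It assumes a normalized LMS (NLMS) update as one possible gradient-based rule, a known echo delay, a single echo path, no additive noise, and a white-noise stand-in for speech. The names `add_echo` and `single_channel_aec` and all parameter values are hypothetical. The ANC stage and the voiced/unvoiced/pause updating constraints are omitted.

```python
import numpy as np


def add_echo(s, delay, gain):
    """Hypothetical test signal: s plus one delayed, attenuated copy of itself."""
    s = np.asarray(s, dtype=float)
    y = s.copy()
    y[delay:] += gain * s[:-delay]  # requires delay > 0
    return y


def single_channel_aec(y, delay, num_taps=4, mu=0.005, eps=1e-8):
    """NLMS echo canceller whose reference is its own delayed output.

    y      : single-channel microphone signal (source plus echo)
    delay  : assumed echo delay in samples (taken as known in this sketch)
    Returns (e, w): the echo-suppressed signal and the final filter taps.
    """
    y = np.asarray(y, dtype=float)
    e = np.zeros_like(y)
    w = np.zeros(num_taps)
    for n in range(len(y)):
        # Reference vector e[n-delay], e[n-delay-1], ..., zero-padded at the start.
        x = np.zeros(num_taps)
        for k in range(num_taps):
            idx = n - delay - k
            if idx >= 0:
                x[k] = e[idx]
        e[n] = y[n] - w @ x                 # subtract the current echo estimate
        w += mu * e[n] * x / (x @ x + eps)  # normalized gradient step
    return e, w


# Small demonstration on synthetic data (assumed values, not from the paper).
rng = np.random.default_rng(0)
s = rng.standard_normal(20000)            # white stand-in for the source signal
y = add_echo(s, delay=50, gain=0.5)
e, w = single_channel_aec(y, delay=50)
tail = slice(15000, None)                 # evaluate after the filter has adapted
echo_before = np.mean((y[tail] - s[tail]) ** 2)
echo_after = np.mean((e[tail] - s[tail]) ** 2)
print("first tap:", w[0])
print("residual-to-initial echo power ratio:", echo_after / echo_before)
```

In this toy setting the first tap should move toward the echo gain, because the output then approaches the echo-free source and the delayed output becomes a valid reference. With real speech the source is itself correlated across the delay, which is one reason constrained updating of the kind the paper proposes is needed.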