With the introduction of edge analytics, IoT devices are becoming smart and ready for AI applications. A few modern ML frameworks focus on generating small ML models (often only kilobytes in size) that can be flashed and executed directly on tiny IoT devices, particularly embedded systems. Edge analytics eliminates expensive device-to-cloud communication, producing intelligent devices that can perform energy-efficient, real-time, offline analytics. However, any increase in the training data causes a linear increase in the size and space complexity of the trained ML models, making them too large to deploy on IoT devices with limited memory. To alleviate this memory issue, a few studies have optimized and fine-tuned existing ML algorithms to reduce their complexity and size; such optimization, however, usually depends on the nature of the IoT data being trained on. In this paper, we present an approach that preserves model quality without requiring any alteration to existing ML algorithms. We propose an SRAM-optimized implementation and efficient deployment of classifiers from widely used, standard/stable ML frameworks (e.g., Python scikit-learn). Our initial evaluation results demonstrate that ours is the most resource-friendly approach: it has a very limited memory footprint when executing large and complex ML models on MCU-based IoT devices and performs ultra-fast classification while consuming 0 bytes of SRAM. When we tested our approach on a variety of MCU-based devices, the majority of ported models produced 1-4x faster inference than the same models ported by the sklearn-porter, m2cgen, and emlearn libraries.