Since the beginning of modern algorithmic computing, many new computer vision techniques have been developed. This progress has led to the adoption of machine learning algorithms and has contributed to the evolution of convolutional neural networks (CNNs) for building state-of-the-art object detection, segmentation, and classification algorithms. These CNNs can achieve human-like results in computer vision applications, but at the expense of more computation. To meet the requirements of machine learning applications in hardware deployments, various AI-accelerated FPGA development kits have been developed, along with specialized toolkits aimed at efficient optimization and deployment of the models. In theory, FPGA solutions can offer similar accuracy with better inference time and power consumption than GPUs; however, this comes at the cost of limited CNN model support and additional FPGA hardware design complexity. In this thesis, an existing object detection algorithm is studied, and a real-time simulation of it is carried out under the Darknet framework, which uses both the CPU and the GPU efficiently through Nvidia's CUDA. The GoogleNet and ResNet50 object detection algorithms are implemented on a cloud-based FPGA platform using the Xilinx Vitis-AI Toolkit. The tools use strategies such as model quantization and hardware architecture setup to achieve accuracy similar to that of a GPU, although a difference of at least 10% remains. A broad case study of the hardware and software configurations made on the Xilinx ALVEO U-200 FPGA for efficient deployment via the cloud is also presented. The results from both simulation platforms are compared and discussed with a view to further optimization and development.