In this work, we study how to effectively manage and use deep learning models at each edge location in order to provide performance guarantees for inference requests. We identify the challenges of using these deep learning models at resource-constrained edge locations, and propose adapting existing cache algorithms to manage them effectively.

CCS CONCEPTS
• Computing methodologies → Neural networks; • Computer systems organization → Cloud computing; • General and reference → Performance.