The large memory footprint of recent machine learning applications places a heavy burden on systems in terms of power consumption and processing speed. Processing-In-Memory (PIM) techniques offer an alternative solution to this problem. In particular, recommendation systems, one of the major machine learning workloads in data centers, require a huge memory capacity and are therefore good candidates for PIM acceleration. In this paper, we introduce PIMCaffe, which comprises a machine learning framework designed for in-memory neural processing units and its evaluation environment. PIMCaffe consists of two components: a Caffe2-based deep learning framework that supports PIM acceleration and a PIM-emulating hardware platform. We develop a suite of functions, libraries, application programming interfaces, and a device driver to support the framework. In addition, we implement a prototype Neural Processing Unit (NPU) in PIMCaffe to evaluate the performance of our platform on machine learning applications. The prototype NPU design includes a vector processor for parallel vector processing and a systolic array unit for matrix multiplication. Using the proposed software framework, we perform a detailed analysis of the in-memory neural processing unit. PIMCaffe supports the evaluation of recommendation systems and various convolutional neural network models on the in-memory NPU. PIMCaffe with the NPU achieves speedups of up to 2.26x, 5.99x, and 1.71x for the recommendation system, AlexNet, and ResNet-50, respectively, compared to an ARM Cortex-A53 CPU.
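To illustrate how a systolic array unit can carry out matrix multiplication, the following is a minimal cycle-level sketch in Python. It is not the prototype NPU's design: the output-stationary dataflow, the input skewing scheme, and the function name `systolic_matmul` are all assumptions made for illustration only.

```python
def systolic_matmul(a, b):
    """Toy cycle-by-cycle model of a systolic array (hypothetical, illustrative).

    Assumed dataflow: output-stationary. Processing element PE(i, j) holds the
    accumulator for C[i][j]. Row i of A enters from the left delayed by i
    cycles, and column j of B enters from the top delayed by j cycles, so at
    cycle t PE(i, j) sees the operand pair A[i][k], B[k][j] with k = t - i - j.
    """
    n, k_dim, m = len(a), len(b), len(b[0])
    assert all(len(row) == k_dim for row in a), "inner dimensions must match"

    acc = [[0] * m for _ in range(n)]   # one accumulator per PE
    total_cycles = n + m + k_dim - 2    # cycle at which the last PE finishes

    for t in range(total_cycles):
        for i in range(n):
            for j in range(m):
                k = t - i - j           # operand index reaching PE(i, j) now
                if 0 <= k < k_dim:
                    acc[i][j] += a[i][k] * b[k][j]
    return acc, total_cycles


if __name__ == "__main__":
    product, cycles = systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
    print(product, cycles)
```

In this model every processing element performs at most one multiply-accumulate per cycle, so an n-by-m array finishes a product with inner dimension k in n + m + k - 2 cycles, compared with n * m * k sequential multiply-accumulates on a scalar core. The sketch ignores memory bandwidth, tiling, and data movement, all of which affect real speedups.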