Abstract-Hardware support for floating-point (FP) arithmetic is an essential feature of modern microprocessor design. Although division and square root are relatively infrequent operations in traditional general-purpose applications, they are indispensable and becoming increasingly important in many modern applications. In this paper, a fused floating-point multiply/divide/square root unit based on Taylor-series expansion algorithm is presented. The implementation results of the proposed fused unit based on standard cell methodology in IBM 90nm technology exhibits that the incorporation of square root function to an existing multiply/divide unit requires only a modest 23% area increase and the same low latency for divide and square root operation can be achieved (12 cycles). The proposed arithmetic unit also exhibits a reasonably good area-performance balance.