Vision-based tracking of an object under perspective projection inherently yields measurement equations that are non-linear in Cartesian coordinates, even though the underlying object kinematics can be modelled by a linear system. In this paper we introduce a measurement conversion technique that analytically transforms the non-linear measurement equations obtained from a stereo-vision system into a system of linear measurement equations. We then design a robust linear filter around the converted measurement system. We show that the state estimation error of the proposed filter is bounded and provide a rigorous theoretical analysis of this result. The performance of the robust filter is demonstrated via computer simulation and via practical experiments that use a robotic manipulator as the target. The proposed filter is shown to outperform the extended Kalman filter (EKF).
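To illustrate the general idea of measurement conversion, the following minimal sketch rewrites ideal pinhole stereo measurements as equations that are linear in the target position. It is not necessarily the conversion derived in this paper. The camera model (a rectified pair with focal length `f` and baseline `b` along the x-axis), the function names, and the numerical values are all assumptions made for illustration only.

```python
# Illustrative sketch only: a rectified pinhole stereo pair is ASSUMED
# (left camera at the origin, right camera shifted by baseline b along x).
# Function names and numbers are hypothetical, not taken from the paper.

def project_stereo(p, f, b):
    """Non-linear perspective measurements (u_left, u_right, v) of point p."""
    X, Y, Z = p
    return (f * X / Z, f * (X - b) / Z, f * Y / Z)

def converted_measurement(y, f, b):
    """Rewrite the projections as H(y) p = c, which is linear in p.

    Multiplying each projection equation by the depth Z gives
        u_left  * Z - f * X = 0
        u_right * Z - f * X = -f * b
        v       * Z - f * Y = 0
    """
    u_l, u_r, v = y
    H = [[-f, 0.0, u_l],
         [-f, 0.0, u_r],
         [0.0, -f, v]]
    c = [0.0, -f * b, 0.0]
    return H, c

def residual(H, c, p):
    """Return H p - c; close to zero when p is consistent with the measurements."""
    return [sum(h * x for h, x in zip(row, p)) - ci for row, ci in zip(H, c)]

p_true = (0.4, -0.2, 2.5)   # hypothetical target position in metres
f, b = 800.0, 0.12          # hypothetical focal length (pixels) and baseline (metres)

y = project_stereo(p_true, f, b)
H, c = converted_measurement(y, f, b)
r = residual(H, c, p_true)
```

In this noise-free setting the residual `r` vanishes up to rounding. With noisy pixel measurements the entries of `H` are themselves perturbed, so the converted system is linear in the state but has an uncertain measurement matrix.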