Efficient Policy Learning for Robust Robot Grasping