The Kinect cannot communicate directly with the Arduino. What I suggest is having some sort of computer (e.g. a Raspberry Pi, a netbook, etc.) run Processing (http://processing.org/), which could then send commands to the Arduino to move the motors in a certain direction.
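Here is a rough sketch of the computer side of that idea, written in Python rather than Processing just to keep it short. Everything specific in it is an assumption for illustration: the one-byte commands (`L`, `R`, `S`), the 100 mm dead zone, and the function names are made up, and the Arduino sketch that reads the bytes is not shown. `port` can be any object with a `write` method; on real hardware that would probably be a pyserial `Serial` object.

```python
import io

# Hypothetical one-byte protocol: the Arduino side would read a byte
# from its serial port and drive the motors accordingly.
COMMANDS = {"left": b"L", "right": b"R", "stop": b"S"}


def direction_from_hand(hand_x, torso_x, dead_zone=100.0):
    """Pick a direction from the hand's x offset relative to the torso (mm)."""
    offset = hand_x - torso_x
    if offset > dead_zone:
        return "right"
    if offset < -dead_zone:
        return "left"
    return "stop"


def send_command(port, direction):
    """Write the one-byte command for `direction` to a file-like port."""
    port.write(COMMANDS[direction])


if __name__ == "__main__":
    fake_port = io.BytesIO()  # stands in for a real serial port
    for hand_x in (350.0, 20.0, -400.0):
        send_command(fake_port, direction_from_hand(hand_x, torso_x=0.0))
    print(fake_port.getvalue())
```

The dead zone is there so the motors do not jitter when the hand hovers near the torso; the right value would have to be found by experiment.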
Check out this page on skeleton tracking with the Kinect and Processing: http://learning.codasign.com/index.php?title=Skeleton_Tracking_with_the_Kinect
I have seen some hacks built around a working desktop or notebook computer, for example a quadcopter that avoids obstacles (google for it). I think it could also be done with a Raspberry Pi (which is actually a PC). It's worth the research to do great things. It IS possible, but you may need a starting point ;)
I don't think the Kinect does any computer vision processing; as far as I know, it just outputs video pixels plus depth information. You may be able to implement some simple processing, such as blob detection, with an Arduino. To process video to recognize human gestures, you will probably need quite a bit more CPU power than an 8-bit Arduino offers, and you will probably want to use OpenCV.
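To give an idea of what simple blob detection on depth data could look like, here is a minimal sketch in plain Python (so it runs on a PC or Raspberry Pi, not on the Arduino itself). It keeps pixels whose depth falls inside a range and flood-fills connected regions to find the largest one. The function name, the tiny 4x4 frame, and the millimetre values are assumptions made up for the example; a real Kinect frame is far larger, and OpenCV would be the more practical tool for it.

```python
from collections import deque


def largest_blob(depth, near, far):
    """Return (pixel_count, (row, col) centroid) of the largest 4-connected
    region whose depth lies in [near, far], or None if no pixel qualifies."""
    rows, cols = len(depth), len(depth[0])
    seen = [[False] * cols for _ in range(rows)]
    best = None
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0] or not (near <= depth[r0][c0] <= far):
                continue
            # Breadth-first flood fill starting from this seed pixel.
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            count = sum_r = sum_c = 0
            while queue:
                r, c = queue.popleft()
                count += 1
                sum_r += r
                sum_c += c
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc]
                            and near <= depth[nr][nc] <= far):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if best is None or count > best[0]:
                best = (count, (sum_r / count, sum_c / count))
    return best


if __name__ == "__main__":
    # Made-up depth frame in millimetres; small values are close objects.
    frame = [
        [2000, 2000, 2000, 2000],
        [2000,  800,  850, 2000],
        [2000,  820, 2000, 2000],
        [ 900, 2000, 2000, 2000],
    ]
    print(largest_blob(frame, near=500, far=1000))
```

The centroid of the largest near blob is the kind of single number pair that is cheap to send on to a microcontroller.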
Where are you now on the project? How far have you gotten?
It would help if you would add the links to:
This way we can all catch up on what you are looking at now. Once we know where you are (looking for tutorials, etc.), we can figure out how best to help.