kkk.mp4
mediapipe
the MediaPipe Hands module from the MediaPipe library to detect and track hand landmarks in real-time from webcam feed. The specific model used is a pre-trained model provided by MediaPipe for hand tracking.
model_complexity=0:
min_detection_confidence=0.5:
min_tracking_confidence=0.5:
This first parameter controls the complexity of the hand tracking model. A value of 0 indicates the simplest model.This 2nd parameter sets the minimum confidence threshold for a hand detection to be considered valid. The 3rd parameter sets the minimum confidence threshold for hand landmarks to be considered during tracking.
cv2.VideoCapture(2)
The number inside parentheses is the webcam's info from the user's device. If there are multiple webcams, try the following to check the webcam indeces
$ sudo apt-get install v4l-utils
$ v4l2-ctl --list-devices
It will show as follows:
HD720P Webcam: HD720P Webcam (usb-0000:06:00.3-2):
/dev/video2
/dev/video3
/dev/media1
USB2.0 HD UVC WebCam: USB2.0 HD (usb-0000:06:00.3-4):
/dev/video0
/dev/video1
/dev/media0
in this case, HD720P Webcam /dev/video2
is used, instead of built-in webcam USB2.0 HD /dev/video0
There are two functions such as .click()
and .press()
from pyautogui
. For first function, the distance between thumb
and index
fingers tips are measured and once the two tips meet each other, it clicks.
For .press()
, keyword f
is passed into that function .press('f')
to exit or enter full-screen mode. The closer the tip of thumb
and middle
finger gets, press
initiates. In the image above, ID
of finger joints can be found.
Tip
one must choose trade-off between FPS and Image Resolution
In this repo, resolution is set to minimal, since FPS is the priority while moving cursor around. As Futureworks, calibration techniques must be considered.
Check webcam's resolution and fps as follows:
$ v4l2-ctl --list-formats-ext
It will show minimum and maximum trade-offs between Resolution and FPS as follows:
ioctl: VIDIOC_ENUM_FMT
Type: Video Capture
[0]: 'MJPG' (Motion-JPEG, compressed)
Size: Discrete 1280x720
Interval: Discrete 0.033s (30.000 fps)
Size: Discrete 800x600
Interval: Discrete 0.033s (30.000 fps)
Size: Discrete 640x480
Interval: Discrete 0.033s (30.000 fps)
Size: Discrete 352x288
Interval: Discrete 0.033s (30.000 fps)
Size: Discrete 320x240
Interval: Discrete 0.033s (30.000 fps)
Size: Discrete 176x144
Interval: Discrete 0.033s (30.000 fps)
Size: Discrete 160x120
Interval: Discrete 0.033s (30.000 fps)
[1]: 'YUYV' (YUYV 4:2:2)
Size: Discrete 640x480
Interval: Discrete 0.033s (30.000 fps)
Size: Discrete 1280x720
Interval: Discrete 0.100s (10.000 fps)
Size: Discrete 800x600
Interval: Discrete 0.050s (20.000 fps)
Size: Discrete 352x288
Interval: Discrete 0.033s (30.000 fps)
Size: Discrete 320x240
Interval: Discrete 0.033s (30.000 fps)
Size: Discrete 176x144
Interval: Discrete 0.033s (30.000 fps)
Size: Discrete 160x120
Interval: Discrete 0.033s (30.000 fps)
$ git clone https://github.com/leo007-htun/Cursor_Control_by_Hand_Landmark_Detection.git
$ pip install -r requirements.txt
$ source best.sh