Abstract

Computer vision (CV) can facilitate several inventory management tasks. It allows an inventory to be analyzed continuously, so that every movement is recorded and an up-to-date report can be delivered whenever required. This improves security: with strict control over the items in the inventory, it is possible to know whether an item belongs to the inventory and when an item is removed or added. This need for inventory control motivates the design of an intelligent system that facilitates it. By combining two frameworks, an algorithm was created that identifies and counts objects and also detects hands, in order to determine when a person manipulates the inventory. To achieve this objective, two tools were used: MediaPipe, and YOLOv5 combined with the COCO dataset. The first was used for hand detection, and the second identifies and counts the objects. Testing showed that MediaPipe's hand recognition had an accuracy of 96%, while object detection and classification using YOLO reached 43.7%. The main challenges for the algorithm were overlapping objects, occlusion and self-occlusion of objects, and loss of focus on items caused by the sensor.
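The way the two outputs described above could be combined can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each video frame has already been reduced to a list of object class names (as an object detector such as YOLOv5 would provide) and a flag saying whether a hand was found (as a hand detector such as MediaPipe would provide). The `Frame` structure, the function names, and the rule "compare the counts before and after a hand appears" are hypothetical choices made only for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """One processed video frame (hypothetical structure, not from the article)."""
    labels: tuple        # class names of the detected objects in this frame
    hand_present: bool   # True if a hand was detected in this frame


def count_objects(labels):
    """Count detected objects per class name."""
    return Counter(labels)


def inventory_events(frames):
    """Return (frame_index, item, delta) tuples for changes seen after a hand leaves.

    The most recent hand-free counts act as the baseline. Frames with a hand
    in view are skipped, because the hand may occlude items (assumed rule).
    """
    baseline = None
    hand_seen = False
    events = []
    for index, frame in enumerate(frames):
        if frame.hand_present:
            hand_seen = True
            continue
        counts = count_objects(frame.labels)
        if baseline is not None and hand_seen:
            # Counter returns 0 for missing keys, so new and vanished items both work.
            for item in sorted(set(baseline) | set(counts)):
                delta = counts[item] - baseline[item]
                if delta:
                    events.append((index, item, delta))
        baseline = counts
        hand_seen = False
    return events


# Example: one bottle is taken while a hand is in view.
frames = [
    Frame(("bottle", "bottle", "cup"), False),
    Frame(("bottle", "cup"), True),
    Frame(("bottle", "cup"), False),
]
print(inventory_events(frames))  # [(2, 'bottle', -1)]
```

Ignoring the counts while a hand is visible is one simple way to reduce the effect of the occlusion problem mentioned above. A real system would also need to smooth the counts over several frames, since the detector can miss objects in individual frames.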

Francisco Bernal Baquero, Universidad Sergio Arboleda, Bogotá, Colombia

https://orcid.org/0009-0000-0399-8994

Darwin E. Martínez, Universidad Sergio Arboleda, Bogotá, Colombia

https://orcid.org/0000-0002-9486-2781

Bernal Baquero F, Martínez DE. Human action detection for inventory control using computer vision. inycomp [Internet]. 2024 Feb. 26 [cited 2024 Dec. 21];26(1):e-21813230. Available from: https://revistaingenieria.univalle.edu.co/index.php/ingenieria_y_competitividad/article/view/13230

Received 2023-09-14
Accepted 2024-02-26
Published 2024-02-26