Dr M Naveen Kumar

Assistant Professor

Department of Computer Science and Engineering

Contact Details

naveenkumar.m@srmap.edu.in

Office Location

CV Raman Block, Level 5, Cabin No: 3

Education

  • 2020 – PhD – National Institute of Technology, Tiruchirappalli, India
  • 2011 – MTech – JNTU Anantapur, Andhra Pradesh, India
  • 2009 – MSc (CS) – SV University, Andhra Pradesh, India
  • 2007 – BSc (CS) – SV University, Andhra Pradesh, India

Experience

  • Aug 2020 – Nov 2021: Assistant Professor – Sri Ramachandra Institute of Higher Education and Research (Deemed to be University), Chennai.
  • May 2012 – Feb 2015: Assistant Professor – SV Engineering College for Women, Tirupati.
  • Jul 2011 – Apr 2012: Assistant Professor – CVS College of Engineering, Tirupati.

Research Interests

  • Computer Vision: developing deep learning frameworks for 3D action recognition from skeleton data.
  • Medical Imaging: designing and developing deep learning models for pneumonia classification from chest X-ray images.
  • Assistive Technology: designing and developing an AI-powered assistive wearable device for the visually impaired.

Awards

  • 2015 to 2020 – PhD Fellowship – MHRD, Govt. of India
  • 2018 – Best Paper Award for the paper entitled “Vector Quantization based Pairwise Joint Distance Maps (VQ-PJDM) for 3D Action Recognition” – IIITDM Kancheepuram
  • 2013 – AP SET Qualified – Osmania University, AP
  • 2009 to 2011 – PG Fellowship (GATE) – MHRD, Govt. of India

Publications

  • A Comparative Study of 2D and 3D Convolutional Neural Networks for Melanoma Classification

    Suryadevara T., Rafi M., Mahamkali N., Nalamothu R.

    Conference paper, Intelligent Computing and Emerging Communication Technologies, ICEC 2024, 2024, DOI Link

    Skin melanoma is a lethal type of cancer whose early diagnosis is crucial to improving patient survival rates. Convolutional neural networks are at the heart of deep learning algorithms. In the present work, the authors experimentally compare 2D and 3D Convolutional Neural Network (CNN) models for identifying melanoma. Three datasets are employed: PH2, the ISIC archive, and the ISIC skin cancer dataset. Both models are applied to each dataset to determine their accuracy, precision, recall, F1 score and ROC curves. The experimental results provide insights into the advantages and limitations of 2D and 3D CNN models for the identification of skin melanoma. The authors observe that the 2D CNN model shows enhanced capability to detect skin lesion structures compared to the 3D CNN, and its classification accuracy is also better than that of the 3D CNN.
  • Comparative Study of ML Techniques for Classification of Crop Pests

    Pamidimukkala J.S., Tarun Teja P., Suman Paul K., Kosaraju D.S., Mahamkali N.

    Conference paper, 2024 4th International Conference on Artificial Intelligence and Signal Processing, AISP 2024, 2024, DOI Link

    Crop pests pose a great threat to global food security; thus, the best pest prevention measures must be implemented. By using different machine learning (ML) techniques to perform crop pest classification, this research provides ways to improve the accuracy and speed of identifying pests in agricultural sectors. Conventional methods for identifying pests frequently depend on manual observation, which is tedious, error-prone, and labor-intensive. On the other hand, machine learning (ML) presents an effective way to automate this procedure by using sophisticated techniques to analyze massive data sets and produce precise predictions. The study applies a variety of machine learning approaches, such as Random Forests, K-Nearest Neighbor, and Naive Bayes, to classify agricultural pests according to features that have been extracted from images. For model training and validation, an extensive collection of high-resolution images of different agricultural pests taken in a range of environmental settings is used. Metrics like accuracy are used to determine how well the machine learning models perform. The potential of machine learning approaches to revolutionize pest management in agriculture is evident from the results, which indicate how accurately they can identify and classify agricultural pests. The suggested method improves the overall effectiveness of pest management procedures and drastically reduces the time and effort required to identify pests. Ultimately, this research promotes more resilient and productive farming systems by supporting efforts to develop sustainable and technologically advanced solutions for addressing agricultural difficulties. The results demonstrate the potential of machine learning (ML) as an invaluable tool for farmers, agronomists, and policymakers, encouraging a proactive and data-driven approach to pest management in contemporary agriculture.
  • Synergistic Integration of Skeletal Kinematic Features for Vision-Based Fall Detection

    Inturi A.R., Manikandan V.M., Kumar M.N., Wang S., Zhang Y.

    Article, Sensors, 2023, DOI Link

    According to the World Health Organisation, falling is a major health problem with potentially fatal implications. Each year, thousands of people die as a result of falls, with seniors making up 80% of these fatalities. The automatic detection of falls may reduce the severity of the consequences. Our study focuses on developing a vision-based fall detection system and proposes a new feature descriptor that results in a new fall detection framework. The body geometry of the subject is analyzed, and patterns that help to distinguish falls from non-fall activities are identified. An AlphaPose network is employed to identify 17 keypoints on the human skeleton. Thirteen of these keypoints are used in our study, and we compute two additional keypoints. These 15 keypoints are divided into five segments, each consisting of a group of three non-collinear points, representing the left hand, right hand, left leg, right leg and craniocaudal section. A novel feature descriptor is generated by extracting the distances from the segmented parts, the angles within the segmented parts and the angle of inclination for every segmented part. As a result, we extract three features from each segment, giving us 15 features per frame that preserve spatial information. To capture temporal dynamics, the extracted spatial features are arranged in temporal sequence, so the feature descriptor preserves the spatio-temporal dynamics. Thus, a feature descriptor of size 15 × m is formed, where m is the number of frames. To recognize fall patterns, machine learning approaches such as decision trees, random forests, and gradient boosting are applied to the feature descriptor. Our system was evaluated on the UPfall benchmark dataset and shows very good performance compared to state-of-the-art approaches.
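The per-segment geometry described above (a distance, an internal angle, and an inclination for each three-keypoint segment) can be sketched as follows. This is an illustrative reconstruction under assumed angle conventions, not the authors' implementation:

```python
import numpy as np

def segment_features(p1, p2, p3):
    """Features for one three-keypoint segment: an illustrative sketch
    of the distance/angle/inclination idea described in the abstract
    (the exact conventions used in the paper are assumptions here)."""
    p1, p2, p3 = (np.asarray(p, dtype=float) for p in (p1, p2, p3))
    # Distance between the two outer keypoints of the segment.
    dist = np.linalg.norm(p3 - p1)
    # Internal angle at the middle keypoint p2.
    v1, v2 = p1 - p2, p3 - p2
    cos_a = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    angle = np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))
    # Inclination of the segment axis (p1 -> p3) w.r.t. the horizontal.
    axis = p3 - p1
    incline = np.degrees(np.arctan2(abs(axis[1]), abs(axis[0])))
    return dist, angle, incline
```

Applying such a function to the five segments of a frame would yield the 15 spatial features per frame mentioned in the abstract, which are then stacked over the m frames of the sequence.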
  • Spatio Temporal Joint Distance Maps for Skeleton-Based Action Recognition Using Convolutional Neural Networks

    Naveenkumar M., Domnic S.

    Article, International Journal of Image and Graphics, 2021, DOI Link

    Skeleton-based action recognition has become popular with the recent developments in sensor technology and fast pose estimation algorithms. Existing research has attempted to address the action recognition problem by considering either the spatial or the temporal dynamics of the actions, but both kinds of features contribute to solving the problem. In this paper, we address the action recognition problem using 3D skeleton data by introducing eight Joint Distance Maps, referred to as Spatio-Temporal Joint Distance Maps (ST-JDMs), to capture spatio-temporal variations from skeleton data for action recognition. Among these, four maps are defined in the spatial domain and the remaining four in the temporal domain. After construction of the ST-JDMs from an action sequence, they are encoded into color images. This representation enables us to fine-tune a Convolutional Neural Network (CNN) for action classification. The empirical results on two datasets, UTD MHAD and NTU RGB+D, show that ST-JDMs outperform the other state-of-the-art skeleton-based approaches, achieving recognition accuracies of 91.63% and 80.16%, respectively.
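One spatial map of this kind can be sketched as follows: per-frame pairwise joint distances are stacked over time and scaled to an 8-bit image. This is only a hedged illustration of the general idea; the paper defines eight specific maps and a particular color encoding.

```python
import numpy as np

def joint_distance_map(seq):
    """Sketch of one spatial Joint Distance Map: pairwise joint
    distances per frame, stacked over time, scaled to 8-bit values.
    seq has shape (m frames, J joints, 3 coordinates)."""
    m, j, _ = seq.shape
    iu = np.triu_indices(j, k=1)                 # unique joint pairs only
    rows = []
    for frame in seq:                            # frame: (J, 3) positions
        d = np.linalg.norm(frame[:, None] - frame[None, :], axis=-1)
        rows.append(d[iu])
    jdm = np.stack(rows)                         # (m, J*(J-1)/2)
    lo, hi = jdm.min(), jdm.max()
    jdm = 255.0 * (jdm - lo) / (hi - lo + 1e-9)  # avoid divide-by-zero
    return jdm.astype(np.uint8)                  # gray image; color-map later
```

Each such image could then be fed to a CNN fine-tuned for action classification, as the abstract describes.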
  • Learning representations from quadrilateral based geometric features for skeleton-based action recognition using LSTM networks

    Naveenkumar M., Domnic S.

    Article, Intelligent Decision Technologies, 2020, DOI Link

    With the recent developments in sensor technology and pose estimation algorithms, skeleton-based action recognition has become popular. Classical machine learning methods based on hand-crafted features fail on large-scale datasets due to their limited representation power. Recent recurrent neural network (RNN) based methods focus on the temporal evolution of body joints and neglect the geometric relations between them. In this paper, we propose eleven quadrilaterals to capture the geometric relations among joints for action recognition. An end-to-end 3-layer Bi-LSTM network is designed as a Base-Net to learn robust representations. We propose two subnets based on the Base-Net to extract discriminative spatio-temporal features: the first subnet (SQuadNet) uses four spatial features and the second (TQuadNet) uses two temporal features. The empirical results on two benchmark datasets, NTU RGB+D and UTD MHAD, show that our method achieves state-of-the-art performance compared to recent methods in the literature.
  • Deep ensemble network using distance maps and body part features for skeleton based action recognition

    Naveenkumar M., Domnic S.

    Article, Pattern Recognition, 2020, DOI Link

    Human action recognition is an active research topic in the field of computer vision. The availability of low-cost depth sensors has made the extraction of reliable skeleton maps of human subjects easier. This paper proposes three subnets, referred to as SNet, TNet and BodyNet, to capture diverse spatio-temporal dynamics for the action recognition task. Specifically, SNet captures pose dynamics from distance maps in the spatial domain, TNet captures the temporal dynamics along the sequence, and BodyNet extracts distinct features from fine-grained body parts in the temporal domain. Motivated by ensemble learning, a hybrid network, referred to as HNet, is modeled using two subnets (TNet and BodyNet) to capture robust temporal dynamics. Finally, SNet and HNet are fused into one ensemble network for the action classification task. Our method achieves competitive results on three widely used datasets: UTD MHAD, UT Kinect and NTU RGB+D.
  • Ensemble spatio-temporal distance net for Skeleton based action recognition

    Naveenkumar M., Domnic S.

    Article, Scalable Computing, 2019, DOI Link

    With the recent developments in sensor technology and pose estimation algorithms, skeleton-based action recognition has become popular. This paper proposes a deep learning framework for the action recognition task using ensemble learning. We design two subnets to capture the spatial and temporal dynamics of the entire video sequence, referred to as the Spatial-distance Net (SdNet) and the Temporal-distance Net (TdNet), respectively. More specifically, SdNet is a Convolutional Neural Network based subnet that captures the spatial dynamics of joints within a frame, and TdNet is a long short-term memory based subnet that exploits the temporal dynamics of joints between frames along the sequence. Finally, the two subnets are fused into one ensemble network, referred to as the Spatio-Temporal distance Net (STdNet), to explore both spatial and temporal information. The efficacy of the proposed method is evaluated on two widely used datasets, UTD MHAD and NTU RGB+D, on which STdNet achieved accuracies of 91.16% and 82.55%, respectively.
  • Learning Representations from Spatio-Temporal Distance Maps for 3D Action Recognition with Convolutional Neural Networks

    Naveenkumar M., Domnic S.

    Article, Advances in Distributed Computing and Artificial Intelligence Journal, 2019, DOI Link

    This paper addresses the action recognition problem using skeleton data. A novel method is proposed that employs five Distance Maps (DMs), named Spatio-Temporal Distance Maps (ST-DMs), to capture spatio-temporal information from skeleton data for 3D action recognition. Among the five DMs, four capture the pose dynamics within a frame in the spatial domain and one captures the variations between consecutive frames along the action sequence in the temporal domain. All DMs are encoded into texture images, and a Convolutional Neural Network is employed to learn informative features from these texture images for the action classification task. A statistics-based normalization method is also introduced to deal with the variable heights of subjects. The efficacy of the proposed method is evaluated on two datasets, UTD MHAD and NTU RGB+D, achieving recognition accuracies of 91.63% and 80.36%, respectively.
  • Skeleton Joint Difference Maps for 3D Action Recognition with Convolutional Neural Networks

    Naveenkumar M., Domnic S.

    Conference paper, Communications in Computer and Information Science, 2019, DOI Link

    Action recognition is a leading research topic in the field of computer vision. This paper proposes an effective method for the action recognition task based on skeleton data. Four features are proposed based on joint differences computed from 3D skeleton data. From the differences of the 3D coordinates of corresponding joints in successive frames, three maps are extracted for the x, y and z coordinates respectively, and these maps are encoded into 2D color images, named Joint Difference Maps (JDMs). The fourth JDM is formed by mapping the individual x, y and z difference maps to red, green and blue values. Hence, the 3D action recognition problem is converted into a 2D image classification problem, which enables us to fine-tune CNNs to learn informative features for the 3D action recognition problem. The proposed method achieved a 79.30% recognition rate on the UTD MHAD dataset.
  • Vector Quantization based Pairwise Joint Distance Maps (VQ-PJDM) for 3D Action Recognition

    Naveenkumar M., Domnic S.

    Conference paper, Procedia Computer Science, 2018, DOI Link

    This paper presents an approach for 3D action recognition using vector quantization with pairwise joint distance maps, named VQ-PJDM. The main problem in 3D action recognition using skeleton data is dealing with the variable length of action sequences. We solve this problem by approximating each action sequence as a codebook, the output of the Vector Quantization (VQ) method; the codebook size is fixed for any length of action sequence. After all actions in the dataset are approximated by the VQ method, the Pairwise Joint Distance Maps (PJDMs) are calculated from the approximated actions. A voting classifier is employed for action classification. The empirical results on the UT Kinect dataset show that the proposed method gives better results than the state of the art.
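The fixed-size-codebook idea can be sketched with plain k-means, one common way to implement vector quantization. This is an assumption-laden illustration of the general approach, not the authors' exact VQ procedure:

```python
import numpy as np

def vq_codebook(frames, k=8, iters=20, seed=0):
    """Approximate a variable-length sequence (n frames x d features)
    by a fixed k-row codebook via k-means, so that every action
    sequence, whatever its length, maps to the same-size summary."""
    rng = np.random.default_rng(seed)
    cb = frames[rng.choice(len(frames), size=k, replace=False)]
    for _ in range(iters):
        # Assign each frame to its nearest centroid.
        d = np.linalg.norm(frames[:, None] - cb[None, :], axis=-1)
        labels = d.argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster empties.
        cb = np.stack([frames[labels == c].mean(axis=0)
                       if np.any(labels == c) else cb[c]
                       for c in range(k)])
    return cb
```

Distance maps computed from the k codebook rows then have a fixed size regardless of the original sequence length, which is the property the abstract relies on.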
  • 3-D Projected PCA Based DMM Feature Fusing with SMO-SVM for Human Action Recognition

    Naveenkumar M., Vadivel A.

    Conference paper, Procedia Computer Science, 2016, DOI Link

    Action recognition in video sequences remains an important and challenging problem. This paper presents an efficient feature extraction method for human action recognition from depth video sequences. For a video sequence acquired by a depth sensor, all three 3-D projections (xy, yz and zx) are calculated for each depth frame. For each projection view, the differences between alternate frames are accumulated to form the Depth Motion Map (DMM). Principal Component Analysis is applied to reduce the dimensionality of the DMM feature, and Sequential Minimal Optimization (SMO) is used to train a Support Vector Machine (SVM). The proposed approach is evaluated on the MSR Action-3D dataset and compared with existing approaches; the empirical results show that it achieves better results than the existing approaches.
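The DMM construction can be sketched as an accumulation of absolute differences between frames of one projection view. This is an illustrative reading of the abstract, not the authors' code; the frame step and the absence of thresholding are assumptions:

```python
import numpy as np

def depth_motion_map(frames, step=2):
    """Accumulate absolute differences between frames `step` apart
    (for one projection view) into a Depth Motion Map (DMM)."""
    dmm = np.zeros_like(frames[0], dtype=np.float64)
    for i in range(len(frames) - step):
        dmm += np.abs(frames[i + step].astype(np.float64)
                      - frames[i].astype(np.float64))
    return dmm
```

The PCA and SMO-SVM stages described in the abstract would then operate on the flattened DMMs of the three projection views.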

Patents

  • A system and a method for real-time fall detection and monitoring for eldercare

    Dr M Mahesh Kumar, Dr M Naveen Kumar

    Patent Application No: 202541004750, Date Filed: 21/01/2025, Date Published: 31/01/2025, Status: Published

  • A system and a method for cancer classification

    Dr M Krishna Siva Prasad, Dr M Naveen Kumar

    Patent Application No: 202541004676, Date Filed: 21/01/2025, Date Published: 31/01/2025, Status: Published

Interests

  • Artificial Intelligence
  • Computer Vision
  • Deep Learning
  • Image Processing
  • Machine Learning
  • Vision Computing

Thought Leaderships

There are no Thought Leaderships associated with this faculty.

Top Achievements

Research Area

No research areas found for this faculty.

Recent Updates

No recent updates found.

Education
2007
BSc (CS)
SV University, Andhra Pradesh
India
2009
MSc (CS)
SV University, Andhra Pradesh
India
2011
MTech
JNTU Anantapur, Andhra Pradesh
India
2020
National Institute of Technology, Tiruchirappalli
India
Experience
  • Aug 2020 - Nov 2021 : Assistant Professor – Sri Ramachandra Institute of Higher Education and Research, Deemed to be University, Chennai.
  • May 2012 – Feb 2015 : Assistant Professor – SV Engineering College for Women, Tirupati.
  • Jul 2011 – Apr 2012: : Assistant Professor – CVS College of Engineering, Tirupati.
Research Interests
  • Computer Vision : Develop deep learning frameworks for 3D Action Recognition from Skeleton data.
  • Medical Imaging : Design and develop deep learning models for Pneumonia classification from Chest X-Ray images.
  • Assistive technology : Design and develop Artificial Intelligence Powered Assistive Wearable Device for Visually Impaired.
Awards & Fellowships
  • 2015 to 2020 – PhD Fellowship – MHRD, Govt. of India
  • 2018 – Best Paper Award for the paper entitled “Vector Quantization based Pairwise Joint Distance Maps (VQ-PJDM) for 3D Action Recognition” - IIITDM Kancheepuram
  • 2013 – AP SET Qualified – Osmania University, AP
  • 2009 to 2011 – PG Fellowship (GATE) – MHRD, Govt. of India
Memberships
Publications
  • A Comparative Study of 2D and 3D Convolutional Neural Networks for Melanoma Classification

    Suryadevara T., Rafi M., Mahamkali N., Nalamothu R.

    Conference paper, Intelligent Computing and Emerging Communication Technologies, ICEC 2024, 2024, DOI Link

    View abstract ⏷

    Skin Melanoma is a lethal type of cancer. The early diagnosis of which is crucial to improve the survival rate of the patients. Convolution neural networks are at the heart of the deep learning algorithms. In the present work authors have experimentally compared 2D and 3D Convolution Neural Network (CNN) models to identify the melanoma. We have employed three different types of datasets namely PH2, ISIC archive, and ISIC skin cancer datasets. We applied the two models on each of the datasets to determine their accuracy, precision, recall, f1 score and ROC curves. The experimental results provide the insights about the advantages and limitations of using 2D and 3D CNN models for the identification of skin melanoma. The authors have observed that 2D CNN model shows enhanced capabilities to detect skin lesion structures compared to 3D CNN. Moreover, the classification accuracy of the 2D CNN is also found better than 3D CNN.
  • Comparative Study of ML Techniques for Classification of Crop Pests

    Pamidimukkala J.S., Tarun Teja P., Suman Paul K., Kosaraju D.S., Mahamkali N.

    Conference paper, 2024 4th International Conference on Artificial Intelligence and Signal Processing, AISP 2024, 2024, DOI Link

    View abstract ⏷

    Crop pests pose a great threat to global food security; thus, the best pest prevention measures must be implemented. By using different machine learning (ML) techniques to perform crop pest classification, this research provides ways to improve the accuracy and speed of identifying pests in agricultural sectors. Conventional methods for identifying pests frequently depend on manual observation, which is tedious, error-prone, and labor-intensive. On the other hand, machine learning (ML) presents an effective way to automate this procedure by using sophisticated techniques to analyze massive data sets and produce precise predictions. The study applies a variety of machine learning approaches, such as Random Forests, K-Nearest Neighbor, and Naive Bayes, to classify agricultural pests according to features that have been extracted from images. For model training and validation, an extensive collection of high-resolution images of different agricultural pests taken in a range of environmental settings is used. Metrics like accuracy are used to determine how well the machine learning models perform. The potential of machine learning approaches to revolutionize pest management in agriculture is evident from the results, which indicate how accurately they can identify and classify agricultural pests. The suggested method improves the overall effectiveness of pest management procedures and drastically reduces the time and effort required to identify pests. Ultimately, this research promotes more resilient and productive farming systems by supporting efforts to develop sustainable and technologically advanced solutions for addressing agricultural difficulties. The results demonstrate the potential of machine learning (ML) as an invaluable tool for farmers, agronomists, and policymakers, encouraging a proactive and data-driven approach to pest management in contemporary agriculture.
  • Synergistic Integration of Skeletal Kinematic Features for Vision-Based Fall Detection

    Inturi A.R., Manikandan V.M., Kumar M.N., Wang S., Zhang Y.

    Article, Sensors, 2023, DOI Link

    View abstract ⏷

    According to the World Health Organisation, falling is a major health problem with potentially fatal implications. Each year, thousands of people die as a result of falls, with seniors making up 80% of these fatalities. The automatic detection of falls may reduce the severity of the consequences. Our study focuses on developing a vision-based fall detection system. Our work proposes a new feature descriptor that results in a new fall detection framework. The body geometry of the subject is analyzed and patterns that help to distinguish falls from non-fall activities are identified in our proposed method. An AlphaPose network is employed to identify 17 keypoints on the human skeleton. Thirteen keypoints are used in our study, and we compute two additional keypoints. These 15 keypoints are divided into five segments, each of which consists of a group of three non-collinear points. These five segments represent the left hand, right hand, left leg, right leg and craniocaudal section. A novel feature descriptor is generated by extracting the distances from the segmented parts, angles within the segmented parts and the angle of inclination for every segmented part. As a result, we may extract three features from each segment, giving us 15 features per frame that preserve spatial information. To capture temporal dynamics, the extracted spatial features are arranged in the temporal sequence. As a result, the feature descriptor in the proposed approach preserves the spatio-temporal dynamics. Thus, a feature descriptor of size (Formula presented.) is formed where m is the number of frames. To recognize fall patterns, machine learning approaches such as decision trees, random forests, and gradient boost are applied to the feature descriptor. Our system was evaluated on the UPfall dataset, which is a benchmark dataset. It has shown very good performance compared to the state-of-the-art approaches.
  • Spatio Temporal Joint Distance Maps for Skeleton-Based Action Recognition Using Convolutional Neural Networks

    Naveenkumar M., Domnic S.

    Article, International Journal of Image and Graphics, 2021, DOI Link

    View abstract ⏷

    Skeleton-based action recognition has become popular with the recent developments in sensor technology and fast pose estimation algorithms. The existing research works have attempted to address the action recognition problem by considering either spatial or temporal dynamics of the actions. But, both the features (spatial and temporal) would contribute to solve the problem. In this paper, we address the action recognition problem using 3D skeleton data by introducing eight Joint Distance Maps, referred to as Spatio Temporal Joint Distance Maps (ST-JDMs), to capture spatio temporal variations from skeleton data for action recognition. Among these, four maps are defined in spatial domain and remaining four are in temporal domain. After construction of ST-JDMs from an action sequence, they are encoded into color images. This representation enables us to fine-tune the Convolutional Neural Network (CNN) for action classification. The empirical results on the two datasets, UTD MHAD and NTU RGB+D, show that ST-JDMs outperforms the other state-of-the-art skeleton-based approaches by achieving recognition accuracies 91.63% and 80.16%, respectively.
  • Learning representations from quadrilateral based geometric features for skeleton-based action recognition using LSTM networks

    Naveenkumar M., Domnic S.

    Article, Intelligent Decision Technologies, 2020, DOI Link

    View abstract ⏷

    With the recent developments in sensor technology and pose estimation algorithms, skeleton based action recognition has become popular. Classical machine learning methods based on hand-crafted features fail on large scale datasets due to their limited representation power. Recently, recurrent neural networks (RNN) based methods focus on the temporal evolution of body joints and neglect the geometric relations between them. In this paper, we propose eleven quadrilaterals to capture the geometric relations among joints for action recognition. An end-to-end 3-layer Bi-LSTM network is designed as Base-Net to learn robust representations. We propose two subnets based on the Base-Net to extract discriminative spatio temporal features. Specifically, the first subnet (SQuadNet) uses four spatial features and the second one (TQuadNet) uses two temporal features. The empirical results on two benchmark datasets, NTU RGB+D and UTD MHAD, show how our method achieves state of the art performance when compared to recent methods in the literature.
  • Deep ensemble network using distance maps and body part features for skeleton based action recognition

    Naveenkumar M., Domnic S.

    Article, Pattern Recognition, 2020, DOI Link

    View abstract ⏷

    Human action recognition is a hot research topic in the field of computer vision. The availability of low cost depth sensors in the market made the extraction of reliable skeleton maps of human objects easier. This paper proposes three subnets, referred to as SNet, TNet, and BodyNet to capture diverse spatio-temporal dynamics for action recognition task. Specifically, SNet is used to capture pose dynamics from the distance maps in the spatial domain. The second subnet (TNet) captures the temporal dynamics along the sequence. The third net (BodyNet) extracts distinct features from the fine-grained body parts in the temporal domain. With the motivation of ensemble learning, a hybrid network, referred to as HNet, is modeled using two subnets (TNet and BodyNet) to capture robust temporal dynamics. Finally, SNet and HNet are fused as one ensemble network for action classification task. Our method achieves competitive results on three widely used datasets: UTD MHAD, UT Kinect and NTU RGB+D.
  • Ensemble spatio-temporal distance net for Skeleton based action recognition

    Naveenkumar M., Domnic S.

    Article, Scalable Computing, 2019, DOI Link

    View abstract ⏷

    With the recent developments in sensor technology and pose estimation algorithms, skeleton based action recognition has become popular. This paper proposes a deep learning framework for action recognition task using ensemble learning. We design two subnets to capture spatial and temporal dynamics of the entire video sequence, referred to as Spatial -distance Net (SdNet) and Temporal - distance Net (TdNet) respectively. More specifically, SdNet is a Convolutional Neural Network based subnet to capture spatial dynamics of joints within a frame and TdNet is a long short term memory based subnet to exploit temporal dynamics of joints between frames along the sequence. Finally, two subnets are fused as one ensemble network, referred to as Spatio-Temporal distance Net (STdNet) to explore both spatial and temporal information. The efficacy of the proposed method is evaluated on two widely used datasets, UTD MHAD and NTU RGB+D, and the proposed STdNet achieved 91.16% and 82.55% accuracies respectively.
  • Learning Representations from Spatio-Temporal Distance Maps for 3D Action Recognition with Convolutional Neural Networks

    Naveenkumar M., Domnic S.

    Article, Advances in Distributed Computing and Artificial Intelligence Journal, 2019, DOI Link

    View abstract ⏷

    This paper addresses the action recognition problem using skeleton data. In this work, a novel method is proposed, which employs five Distance Maps (DM), named as Spatio- Temporal Distance Maps (ST-DMs), to capture the spatio-temporal information from skeleton data for 3D action recognition. Among five DMs, four DMs capture the pose dynamics within a frame in the spatial domain and one DM captures the variations between consecutive frames along the action sequence in the temporal domain. All DMs are encoded into texture images, and Convolutional Neural Network is employed to learn informative features from these texture images for action classification task. Also, a statistical based normalization method is introduced in this proposed method to deal with variable heights of subjects. The efficacy of the proposed method is evaluated on two datasets: UTD MHAD and NTU RGB+D, by achieving recognition accuracies 91.63% and 80.36% respectively.
  • Skeleton Joint Difference Maps for 3D Action Recognition with Convolutional Neural Networks

    Naveenkumar M., Domnic S.

    Conference paper, Communications in Computer and Information Science, 2019, DOI Link

    View abstract ⏷

    Action recognition is a leading research topic in the field of computer vision. This paper proposes an effective method for action recognition task based on the skeleton data. Four features are proposed based on the joint differences from 3D skeleton data. From the differences of 3D coordinates of corresponding joints in successive frames, three maps are extracted related to x, y and z coordinates respectively and then these maps are encoded into 2D color images, named as Joint Difference Maps (JDMs). The fourth JDM is formed by mapping the individual x, y and z difference maps into red, green and blue values. Hence, the 3D action recognition problem is converted into 2D image classification problem. It enables us to fine tune CNNs to learn informative features for 3D action recognition problem. The proposed method achieved 79.30% recognition rate on UTD MHAD dataset.
  • Vector Quantization based Pairwise Joint Distance Maps (VQ-PJDM) for 3D Action Recognition

    Naveenkumar M., Domnic S.

    Conference paper, Procedia Computer Science, 2018, DOI Link

    This paper presents an approach for 3D action recognition using vector quantization with pairwise joint distance maps, named VQ-PJDM. The main difficulty in 3D action recognition from skeleton data is dealing with the variable length of action sequences. We solve this problem by approximating each action sequence as a codebook, the output of a Vector Quantization (VQ) method; the codebook size is fixed regardless of the length of the action sequence. After all actions in the dataset are approximated by the VQ method, Pairwise Joint Distance Maps (PJDMs) are calculated from the approximated actions. A voting classifier is employed for action classification. The empirical results on the UT Kinect dataset show that the proposed method outperforms the state of the art.
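The fixed-size-codebook idea can be sketched with a plain k-means quantizer. This is an illustrative approximation, not the paper's VQ method, and `vq_codebook` with its `iters` and `seed` parameters is a hypothetical interface; each frame is assumed to be one flat feature vector.

```python
import random

def vq_codebook(frames, k, iters=20, seed=0):
    # Approximate a variable-length sequence of frame vectors by a
    # fixed-size codebook of k centroids, using plain k-means.
    rng = random.Random(seed)
    codebook = [list(f) for f in rng.sample(frames, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for f in frames:
            nearest = min(range(k),
                          key=lambda c: sum((a - b) ** 2
                                            for a, b in zip(f, codebook[c])))
            clusters[nearest].append(f)
        for c, members in enumerate(clusters):
            if members:  # empty cells keep their previous centroid
                codebook[c] = [sum(vals) / len(members)
                               for vals in zip(*members)]
    return codebook
```

However long the input sequence, the codebook always has k entries, which is what makes the subsequent distance-map features fixed-size.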
  • 3-D Projected PCA Based DMM Feature Fusing with SMO-SVM for Human Action Recognition

    Naveenkumar M., Vadivel A.

    Conference paper, Procedia Computer Science, 2016, DOI Link

    Action recognition in video sequences remains an important and challenging problem. This paper presents an efficient feature extraction method for human action recognition from depth video sequences. For a video sequence acquired by a depth sensor, all three projection views (xy, yz and zx) are computed for each depth frame. For each projection view, the differences between alternate frames are accumulated to form a Depth Motion Map (DMM). Principal Component Analysis is applied to reduce the dimensionality of the DMM feature. Sequential Minimal Optimization (SMO) is used to train a Support Vector Machine (SVM). The proposed approach is evaluated on the MSR Action-3D dataset and compared with existing approaches; the empirical results show that it achieves better results than the existing approaches.
Interests

  • Artificial Intelligence
  • Computer Vision
  • Deep Learning
  • Image Processing
  • Machine Learning
  • Vision Computing
