Mr Arun Kumar Sivapuram

Assistant Professor

Department of Computer Science and Engineering

Contact Details

arunkumar.s@srmap.edu.in

Office Location

Homi J Bhabha Block, Level 4, Cubicle No: 23

Education

  • 2025 – Ph.D. – IIT Tirupati
  • 2020 – M.Tech – IIT Tirupati
  • 2016 – B.Tech – JNTUK

Experience

  • 2018 to 2025 – Teaching Assistant – IIT Tirupati; Courses: Image Processing Lab, Computer Vision Lab, Advanced Signal Processing, Speech Signal Processing, Computer Vision, Deep Learning, and Machine Learning.
  • 2019 to 2025 – Research Scholar – IIT Tirupati.
  • 2022 to 2025 – Reviewer for NCC, CVIP, ICVGIP, Geoscience and Remote Sensing Letters, Engineering Applications of Artificial Intelligence, CVPR, ECCV, BMVC, and CVPRW.
  • 2023 to 2025 – GPU Lab Administrator (SPCV Lab) – IIT Tirupati.

Research Interests

  • Developing specialized loss functions to address class imbalance in object identification tasks, including classification, detection, and segmentation.
  • Designing lightweight models for object detection and enhancing detection modules within tracking frameworks to improve tracking performance.
  • Creating advanced frameworks for generating more challenging adversarial samples using adversarial attack techniques to enhance the robustness of object detection methods.
  • These methodologies are applied in Precision Agriculture, Transportation, Autonomous Systems, and Healthcare to enhance efficiency, safety, and automation in real-world scenarios.

Awards & Fellowships

  • 2023 – First Position in the "Infrared imaging-based drone detection and tracking in distorted surveillance videos" challenge competition at the IEEE International Conference on Image Processing.
  • Ministry of Education Research Scholarship of USD 27,000 for PhD at IIT Tirupati (Jan 2018 – Feb 2025).
  • MHRD Teaching Assistant Scholarship of USD 3,000 during M.Tech at IIT Tirupati (Aug 2018 – Dec 2019).
  • IITT-NIF scholarship of USD 2,725 during PhD at IIT Tirupati (Aug 2023 – Jan 2025).
  • MHRD conference scholarship of USD 1,500 for attending the Asian Conference on Machine Learning (Dec 4–8, 2024, Hanoi, Vietnam).
  • Visiting Research Scholar under the INMOST Project – USD 3,650 (Oct 7 – Nov 28, 2024) at the University of Agder, Norway.

Memberships

  • IAPR member

Publications

  • The Second Visual Object Tracking Segmentation VOTS2024 Challenge Results

    Kristan M., Matas J., Tokmakov P., Felsberg M., Zajc L.C., Lukezic A., Tran K.-T., Vu X.-S., Bjorklund J., Chang H.J., Fernandez G., Attari M., Chan A., Chen L., Chen X., Collins J., Cui Y., Devarapu G.S.M., Du Y., Fan H., Fan W.-C., Feng Z., Gao M., Gorthi R.K.S., Goyal R., Han J., Hatuwal B., He Z., Hu X., Huang X., Huang Y., Jiang D., Kang B., Kannappan P., Kittler J., Lai S., Li N., Li X., Li X., Liang C., Lin L., Ling H., Liu T., Liu Z., Lu H., Luo Y., Miao D., Mogollon J., Pang Z., Pochimireddy J.R., Prutyanov V., Rahmon G., Romanov A., Shi L., Siam M., Sigal L., Sivapuram A.K., Solovyev R., Kazemi E.S., Toubal I.E., Wan J., Wang L., Wang X., Wang Y., Wang Y.-X., Wang Z., Wu G., Wu Q., Wu X., Xia Z., Xie J., Xu C., Xu T., Xu Y., Xue C., Yang C., Yang J., Yang M.-H., Yu C., Yu K., Zhang C., Zhang J., Zhang Z., Zheng F., Zheng Y., Zhong B., Zhou J., Zhou J., Zhou Y., Zhou Z., Zhu G., Zhu J., Zhu X., Zunin V.

    Conference paper, Lecture Notes in Computer Science, 2025, DOI Link

    The Visual Object Tracking Segmentation VOTS2024 challenge is the twelfth annual tracker benchmarking activity of the VOT initiative. This challenge consolidates the new tracking setup proposed in VOTS2023, which merges short-term and long-term as well as single-target and multiple-target tracking, with segmentation masks as the only target location specification. Two sub-challenges are considered: the VOTS2024 standard challenge, focusing on classical objects, and the VOTSt2024 challenge, which considers objects undergoing a topological transformation. Both challenges use the same performance evaluation methodology. Results of 28 submissions are presented and analyzed. A leaderboard with participating tracker details, the source code, the datasets, and the evaluation kit are publicly available on the website (https://www.votchallenge.net/vots2024/).
  • SA-LfV: self-annotated labeling from videos for object detection

    Sivapuram A.K., Komuravelli P., Gorthi R.K.S.

    Article, Machine Learning, 2025, DOI Link

    In the realm of object detection, the remarkable strides made by deep neural networks over the past decade have been hampered by challenges such as data labeling and the need to capture natural variations in training samples. Existing benchmark datasets are confined to a limited set of classes and natural variations. This paper presents "SA-LfV", a novel framework designed to streamline object detection from videos with minimal human input. By utilizing basic computer vision tasks, such as image classification and single-object tracking, our method generates pseudo-labels for object detection efficiently. To ensure a rich variety of training samples, we introduce two innovative sampling strategies. The first applies density-based clustering, choosing samples that represent a wide range of scenarios. The second analyzes object movements and their mutual information, capturing diverse behaviors and appearances. The proposed object detection data labeling procedure is demonstrated on object-tracking datasets and custom-downloaded videos. Through these methods, our framework has produced a dataset with 70,000 pseudo-labeled bounding boxes across 13 object classes, significantly diversifying the available data for object detection tasks. Our experiments show that the proposed framework can effectively adapt to unlabelled ImageNet classes, indicating its potential to broaden the capabilities of object detection models. Moreover, integrating our self-annotated dataset with standard benchmark datasets leads to a notable improvement in object detection performance. This new approach not only simplifies the traditionally labor-intensive process of manual labeling but also paves the way for expanding object detection to a wider range of classes and applications.
  • Towards Accurate Disease Segmentation in Plant Images: A Comprehensive Dataset Creation and Network Evaluation

    Prashanth K., Harsha J.S., Kumar S.A., Srilekha J.

    Conference paper, Proceedings - 2024 IEEE Winter Conference on Applications of Computer Vision, WACV 2024, 2024, DOI Link

    Automated disease segmentation in plant images plays a crucial role in identifying and mitigating the impact of plant diseases on agricultural productivity. In this study, we address the problem of Northern Leaf Blight (NLB) disease segmentation in maize plants. We present a comprehensive dataset of 1000 plant images annotated with NLB disease regions. We employ the Mask R-CNN and Cascaded Mask R-CNN models with various backbone architectures to perform NLB disease segmentation. The experimental results demonstrate the effectiveness of the models in accurately delineating NLB disease regions. Specifically, the ResNet Strikes Back-50 backbone architecture achieves the highest mean average precision (mAP) score, indicating its ability to capture intricate details of NLB disease spots. Additionally, the cascaded approach enhances segmentation accuracy compared to the single-stage Mask R-CNN models. Our findings provide valuable insights into the performance of different backbone architectures and contribute to the development of automated NLB disease segmentation methods in plant images. The generated dataset and experimental results serve as a resource for further research in plant disease segmentation and management.
  • VISAL—A novel learning strategy to address class imbalance

    S. S.R.V., Sivapuram A.K., Ravi V., Senthil G., Gorthi R.K.

    Article, Neural Networks, 2023, DOI Link

    In imbalanced data scenarios, Deep Neural Networks (DNNs) fail to generalize well on minority classes. In this letter, we propose a simple and effective learning function, i.e., Visually Interpretable Space Adjustment Learning (VISAL), to handle the imbalanced data classification task. VISAL's objective is to create more room for the generalization of minority class samples by bringing both the angular and Euclidean margins into the cross-entropy learning strategy. When evaluated on the imbalanced versions of the CIFAR, Tiny ImageNet, COVIDx, and IMDB reviews datasets, our proposed method outperforms the state-of-the-art works by a significant margin.
  • Depth camera based dataset of hand gestures

    Jeeru S., Sivapuram A.K., Leon D.G., Groli J., Yeduri S.R., Cenkeramaddi L.R.

    Data Paper, Data in Brief, 2022, DOI Link

    The dataset contains RGB and depth video frames of various hand movements captured with the Intel RealSense Depth Camera D435. The camera has two channels for collecting both RGB and depth frames at the same time. A large dataset is created for accurate classification of hand gestures under complex backgrounds. The dataset is made up of 29,718 frames from the RGB and depth versions, corresponding to various hand gestures from different people collected at different time instances with complex backgrounds. Hand movements corresponding to scroll-right, scroll-left, scroll-up, scroll-down, zoom-in, and zoom-out are included in the data. Each sequence consists of 40 frames, and there is a total of 662 sequences for each gesture in the dataset. To capture all the variations in the dataset, the hand is oriented in various ways while capturing.

Interests

  • Artificial Intelligence
  • Computer Vision
  • Deep Learning
  • Image Processing
  • Machine Learning

