Dr Ravi Kant Kumar

Assistant Professor

Department of Computer Science and Engineering

Contact Details

ravikant.k@srmap.edu.in

Office Location

SR Block, Level 3, Cabin No: 20


Faculty Information

Education

2019

Ph.D.

National Institute of Technology, Durgapur

India

2014

M.Tech.

Central University of Hyderabad

India

2008

MCA

VIT University, Vellore

India

Experience

4.5 Months – Assistant Professor – Madanapalle Institute of Technology and Science, Madanapalle, Andhra Pradesh, India.
1 Year – Research Intern – IDRBT, Hyderabad, India.
1 Year – Software Engineer – Pellucid Healthcare Network Pvt. Ltd., Chennai, India.

Research Interest

Computational Modelling of Visual Attention using Low-Level and High-Level Features.
Design and Development of Algorithms for Saliency-Based Intelligent Cameras.
Mathematical Modelling of Computer Science Problems.

Awards

2014–2019 – Institute Fellowship (during PhD) – MHRD, Govt. of India.
2012–2014 – GATE Fellowship (during MTech) – MHRD, Govt. of India.
2013 – 1st Prize in Software Designing Competition (as a team) – JP Morgan.
2006 – IBM Great Mind Challenge – IBM.

Publications

Enhanced Salient Object Detection from Single Haze Images
Dhara G., Kumar R.K.
Conference paper, Lecture Notes in Electrical Engineering, 2025
Salient object detection is difficult in single hazy images because haze degrades visibility and lowers contrast. To address this challenge, this study introduces a computational model of visual saliency. The proposed methodology begins by determining whether an image is hazy, and if so, leverages the Dark Channel Prior (DCP) to extract essential haze-related information. The DCP calculation serves as the basis for subsequent dehazing, in which the Multiscale Retinex algorithm is applied to improve image clarity and obtain a dehazed version. This haze-free image is given as input to a trained U-Net architecture, which produces a saliency map that identifies notable and prominent regions within the image. Simultaneously, the image undergoes region-based segmentation, and a geodesic saliency map is calculated using geodesic distance, considering both spatial proximity and feature similarity. In the final step, the saliency maps generated by the U-Net and the geodesic saliency computation are fused to generate the final saliency map. The effectiveness of the proposed method in detecting salient objects in hazy images is supported by the experimental findings, which showcase state-of-the-art performance in dehazing. The integration of DCP, Multiscale Retinex, and dual saliency maps enhances both dehazing and object detection, making this method valuable in a variety of computer vision applications, including autonomous driving, video surveillance, and image restoration. The AUC and MAE results confirm the effectiveness and accuracy of the proposed saliency model.
Evaluation and Enhancement of Standard Classifier Performance by Resolving Class Imbalance Issue Using Smote-Variants Over Multiple Medical Datasets
Kumar V., Kumar R.K., Singh S.K.
Article, SN Computer Science, 2025
In machine learning, classification problems are solved by training on labeled classes. Sometimes, however, some training classes contain too little data, so the system is inadequately trained for these minority classes. In this case, the predictions for the classes with less training data are poor and biased towards the classes having more data. This is known as the class imbalance problem. In such cases, standard classifiers tend to be overwhelmed by the large classes and disregard the small ones. As a result, the performance of machine learning and deep learning algorithms degrades, sometimes to an unacceptable level, particularly for crucial data such as medical and health records. Although various researchers have proposed methods to solve this problem, most are problem-specific and suited to a specific classifier only. To find a generalized and effective solution, we applied various SMOTE variants to resolve the imbalance in the datasets and thereby improved the performance of various machine learning and deep learning algorithms. We experimented with and analyzed the effects of SMOTE variants on various machine learning techniques over six standard medical datasets. We found that SMOTE variants are very effective and that they improve the standard performance measures (Accuracy, Precision, Recall and F1-Score). Additionally, based on our research, it is feasible to determine which SMOTE variant works best with which machine learning methods and datasets.
A survey on visual saliency detection approaches and attention models
Dhara G., Kumar R.K.
Article, Multimedia Tools and Applications, 2025
Visual saliency detection models are widely used in computer vision tasks to mimic the human visual system’s perception of scenes. The part of an image that stands out from its surroundings and captures attention at a glance is referred to as the salient region. This paper presents a comprehensive review of recent advancements in Salient Object Detection (SOD) and its subfields, namely Co-Salient Object Detection (CSD) and RGB-Depth (RGBD) saliency detection. Salient Object Detection refers to techniques that analyze image surroundings to extract prominent regions from the background. Co-saliency detection, on the other hand, focuses on identifying common and salient regions across a group of related images that share similar content. In contrast to traditional saliency detection, RGBD models incorporate both color and depth information to more accurately identify salient objects. All the aforementioned SOD approaches have numerous applications in pattern recognition and computer vision. The concept of saliency detection has garnered significant interest among researchers. However, there remains a need for extensive research to produce highly accurate saliency maps that bridge the gap between perceptual accuracy and computational performance. Although many novel methods have been developed to address these challenges, further efforts are required to enhance the overall accuracy of saliency detection. This review covers a wide range of techniques, from traditional approaches to deep learning-based models. In addition to analyzing various proposed algorithms, the paper provides a comprehensive overview of evaluation metrics used to assess the performance of SOD algorithms. It also explores benchmark datasets commonly used in SOD research and presents both qualitative and quantitative experimental results for SOD and its subfields. 
Finally, this study examines open research problems, current challenges, and future directions in salient object detection, offering valuable insights and guidance to support future research advancements in the field.
A Machine Learning-based Pneumonia Detection System
Cheekuri S.V., Veeramachaneni M., Manikandan V.M., Kumar R.K.
Conference paper, 2024 5th International Conference for Emerging Technology, INCET 2024, 2024
Pneumonia ranks among the world's major causes of mortality and is the greatest cause of death for young children. It is an infectious condition that can be fatal, affects one or both lungs, and is brought on by harmful bacteria. An accurate and timely diagnosis is essential for managing and treating patients effectively. Radiologists with specialized training are needed to assess chest X-rays to diagnose pneumonia. Therefore, an automated approach to identifying pneumonia would help treat the illness quickly, especially in isolated locations. Early disease identification is greatly aided by medical imaging, and chest X-rays are a frequent method of identifying lung disorders like pneumonia. This project offers a novel method for improving chest X-ray image quality, which is then used in conjunction with machine learning approaches to increase the detection accuracy of pneumonia. Subtle details in X-rays can be seen much better using image enhancement techniques including sharpening, contrast stretching, and histogram equalization. A VGG net and a convolutional neural network (CNN) model that can accurately diagnose pneumonia are trained using this augmented image dataset. By bridging the gap between conventional X-ray imaging and sophisticated machine learning, the project offers a viable approach to the early and accurate detection of pneumonia.
DEM-UFR: Deep Ensemble Method for Enhanced Unconstraint Face Recognition System
Kumar D., Kumar R.K., Garain J., Kisku D.R., Sing J.K., Gupta P.
Conference paper, Proceedings - 2024 OITS International Conference on Information Technology, OCIT 2024, 2024
The widespread usage of mobile devices and social media has led to a growing interest in face recognition technology. This study introduces a novel deep ensemble method designed to enhance facial recognition accuracy on a mobile selfie dataset by integrating three pre-trained models, viz. Inception-v3, ResNet-50, and EfficientNet B7 for automatic feature extraction and representation. The approach utilizes feature-level fusion through concatenation, followed by dimensionality reduction via principal component analysis (PCA). Feature optimization is carried out using the Firefly algorithm, and classification is achieved through a soft voting ensemble of classifiers, including Support Vector Machine (SVM), Random Forest, and a Deep Neural Network (DNN). When evaluated on the LFW, UTK face, and Wild Selfie datasets, the proposed method achieved recognition accuracies of 99.76%, 98.92%, and 98.73%, respectively, demonstrating competitive and significantly improved performance over existing models. The results indicate that the system performs effectively in real-world conditions, especially in environments with varying conditions.
Hybrid Deep Learning Architecture With K-means Clustering For Weapon Detection In CCTV Surveillance
Raj S., Anant A., Suryadevara H., Kumar R.K.
Conference paper, 2024 IEEE International Conference on Computer Vision and Machine Intelligence, CVMI 2024, 2024
For quick alerts on criminal activity, it has become common to use multiple camera capture points, and with them comes a need for automated alerting systems so that a human observer can manage all the CCTV feeds in real time, since it is not humanly possible to watch that many video feeds without a crime slipping by undetected. Deep learning based approaches for this task have been researched previously on various networks. Researchers have turned to ever larger networks to perform weapon detection for surveillance; although they achieved more than 90% accuracy, they had to rely on the largest and most complex networks available. Large deep learning models are costly in both computation and memory for a CCTV device running an AI workload. There was a gap in studying hybrid approaches in which deep learning is evaluated together with machine learning for the task. To close this gap, our study employs a hybrid approach that combines machine learning and deep learning methods. Training on a customised dataset was attempted initially, but when implementation proved challenging, the study transitioned to the 'OD-weapon detection dataset' collected from GitHub. Different levels of accuracy were achieved on first validation by using deep learning models such as VGG16, VGG19, InceptionNet, and MobileNet, which were maximised by applying this diversified collection of weapon images. Techniques for clustering, fine-tuning, and PCA dimension reduction were used to improve the classification performance.
Experimental Study of Free Vibration Fibre Reinforced Laminated Fractured Composite Beams
Kumar R.K., Harish G., Singh R.K., Khan K.
Conference paper, Lecture Notes in Mechanical Engineering, 2024
In this paper, the free vibration analysis for the fundamental mode of vibration of fibre reinforced laminated fractured composite beams has been carried out. The beams are manufactured by the hand lay-up method using glass as the fibre material and epoxy resin as the matrix material. The material properties are obtained from a Universal Testing Machine. The analysis is carried out for different lamination schemes, boundary conditions, and numbers of cracks. The free vibration frequencies for the fundamental mode of vibration and the time history are obtained from LabVIEW. The acceleration data for the dynamic response of the beam are obtained from an accelerometer fixed at the middle of the beam. The experimental stress and modal analysis of the glass fibre reinforced laminated composite beam has been carried out for clamped-clamped and clamped-free end conditions. The experimental results obtained from LabVIEW are compared with the FE simulation results obtained from ANSYS using 8-node SHELL281 elements. It is observed that for the [0°]4 specimen, the free vibration frequency of the beam decreases as the number of cracks increases, whereas for [0°]8 the frequency shows no significant change even when the number of cracks is varied. The frequency increases, however, when the number of layers is increased, irrespective of cracks. For the [0°/90°]2 cross-ply laminate, the free vibration frequencies decrease as the number of cracks increases. For the [0°/90°]2 lamination scheme, the maximum stress for the zero-crack beam is at the clamped end, and for single, double, and triple crack beams the maximum stress is found at the position of the first crack and is higher than that of the beam with zero cracks, whereas for [0°/90°]4 the maximum stress values for single, double, and triple crack beams are lower than for the beam with zero cracks.
DeepFusion-Net: A U-Net and CGAN-Based Approach for Salient Object Detection
Dhara G., Kumar R.K.
Conference paper, Lecture Notes in Networks and Systems, 2024
Saliency detection is a crucial task in computer vision whose goal is to identify the visually prominent regions within an input image. Automated saliency detection has attracted interest from various application fields during the last decade. We propose a method for saliency detection using Conditional Generative Adversarial Networks (CGANs) with a pre-trained U-Net model as the generator. The generated saliency maps are evaluated by the discriminator for authenticity, and its feedback enhances the generator's ability to generate high-resolution saliency maps. By iteratively training the discriminator and generator networks, the model achieves improved results in finding the salient object. By combining the strengths of conditional generative adversarial networks and the U-Net architecture, our goal is to improve the accuracy and quality of the saliency maps. Once the U-Net model is trained and its weights are saved, we integrate it into the CGAN framework for salient object detection. The U-Net serves as part of the generator for the CGAN, responsible for generating saliency maps for input images. The components of the CGAN are trained using adversarial learning to enhance the quality and realism of the resulting saliency maps. Precision, recall, MAE, and Fβ score are used to evaluate performance. Thorough experiments were conducted on three challenging saliency detection datasets, where our model demonstrated remarkable performance, surpassing the latest saliency models. Further, faster convergence is observed in our model due to the initialization of the CGAN's generator with pre-trained U-Net model weights.
Spatial attention guided cGAN for improved salient object detection
Dhara G., Kumar R.K.
Article, Frontiers in Computer Science, 2024
Recent research shows that Conditional Generative Adversarial Networks (cGANs) are effective for Salient Object Detection (SOD), a challenging computer vision task that mimics the way human vision focuses on important parts of an image. However, implementing cGANs for this task has presented several complexities, including instability during training with skip connections, weak generators, and difficulty in capturing context information for challenging images. These challenges are particularly evident when dealing with input images containing small salient objects against complex backgrounds, underscoring the need for careful design and tuning of cGANs to ensure accurate segmentation and detection of salient objects. To address these issues, we propose an innovative method for SOD using a cGAN framework. Our method uses an encoder-decoder framework as the generator component of the cGAN, enhancing the feature extraction process and facilitating accurate segmentation of the salient objects. We incorporate the Wasserstein-1 distance within the cGAN training process to improve the accuracy of finding the salient objects and to stabilize training. Additionally, our enhanced model efficiently captures intricate saliency cues by leveraging a spatial attention gate with global average pooling and regularization. The introduction of global average pooling layers in the encoder and decoder paths enhances the network's global perception and fine-grained detail capture, while the channel attention mechanism, facilitated by dense layers, dynamically modulates feature maps to amplify saliency cues. The discriminator evaluates the generated saliency maps for authenticity and gives feedback that enhances the generator's ability to generate high-resolution saliency maps. By iteratively training the discriminator and generator networks, the model achieves improved results in finding the salient object.
We trained and validated our model using large-scale benchmark datasets commonly used for salient object detection, namely DUTS, ECSSD, and DUT-OMRON. Our approach was evaluated using standard performance metrics on these datasets. Precision, recall, MAE and Fβ score metrics are used to evaluate performance. Our method achieved the lowest MAE values: 0.0292 on the ECSSD dataset, 0.033 on the DUTS-TE dataset, and 0.0439 on the challenging and complex DUT-OMRON dataset, compared to other state-of-the-art methods. Our proposed method demonstrates significant improvements in salient object detection, highlighting its potential benefits for real-life applications.
Enhancing Salient Object Detection with Supervised Learning and Multi-prior Integration
Dhara G., Kumar R.K.
Article, Journal of Image and Graphics (United Kingdom), 2024
Salient Object Detection (SOD) can mimic the human vision system by using algorithms that simulate the way the eye detects and processes visual information. It focuses mainly on the visually distinctive parts of an image, similar to how the human brain processes visual information. The approach proposed in this study is an ensemble approach that incorporates a classification algorithm, foreground connectivity, and prior calculations. As a first step, it involves a series of preprocessing, feature generation, feature selection, training, and prediction stages using a random forest to identify and extract salient objects in an image. Next, an object proposals map is created for the foreground object. Subsequently, a fusion map is generated using boundary, global, and local contrast priors. In the feature generation step, different edge filters are applied because the saliency score at edges is high; additionally, texture-based features are calculated with Gabor filters. The Boruta feature selection algorithm is then used to identify the most appropriate and discriminative features, which helps to reduce the computational time required for feature selection. Ultimately, the initial map obtained from the random forest is merged with the fusion saliency maps based on foreground connectivity and prior calculations to produce a saliency map. This map is then refined using post-processing techniques to obtain the final saliency map. The proposed approach surpasses the performance of 17 cutting-edge techniques across three benchmark datasets, showing superior results in terms of precision, recall, and F-measure. The proposed method performs well even on the DUT-OMRON dataset, known for its multiple salient objects and complex backgrounds, achieving a Mean Absolute Error (MAE) value of 0.113.
The method also demonstrates high recall values (0.862, 0.923, 0.849 for ECSSD, MSRA-B and DUT-OMRON datasets, respectively) across all datasets, further establishing its suitability for salient object detection.
A novel multiscale cGAN approach for enhanced salient object detection in single haze images
Dhara G., Kumar R.K.
Article, EURASIP Journal on Image and Video Processing, 2024
In computer vision, image dehazing is a low-level task that employs algorithms to analyze and remove haze from images, resulting in haze-free visuals. The aim of Salient Object Detection (SOD) is to locate the most visually prominent areas in images. However, most SOD techniques applied to visible images struggle in complex scenarios characterized by similarities between the foreground and background, cluttered backgrounds, adverse weather conditions, and low lighting. Identifying objects in hazy images is challenging due to the degradation of visibility caused by atmospheric conditions, leading to diminished visibility and reduced contrast. This paper introduces an innovative approach called Dehaze-SOD, a unique integrated model that addresses two vital tasks: dehazing and salient object detection. The key novelty of Dehaze-SOD lies in its dual functionality, seamlessly integrating dehazing and salient object identification into a unified framework. This is achieved using a conditional Generative Adversarial Network (cGAN) comprising two distinct subnetworks: one for image dehazing and another for salient object detection. The first module, designed with residual blocks, Dark Channel Prior (DCP), total variation, and the multiscale Retinex algorithm, processes the input hazy images. The second module employs an enhanced EfficientNet architecture with added attention mechanisms and pixel-wise refinement to further improve the dehazing process. The outputs from these subnetworks are combined to produce dehazed images, which are then fed into our proposed encoder–decoder framework for salient object detection. The cGAN is trained with two modules working together: the generator aims to produce haze-free images, whereas the discriminator distinguishes between the generated haze-free images and real haze-free images. 
Dehaze-SOD demonstrates superior performance compared to state-of-the-art dehazing methods in terms of color fidelity, visibility enhancement, and haze removal. The proposed method effectively produces high-quality, haze-free images from various hazy inputs and accurately detects salient objects within them. This makes Dehaze-SOD a promising tool for improving salient object detection in challenging hazy conditions. The effectiveness of our approach has been validated using benchmark evaluation metrics such as mean absolute error (MAE), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM).
Experimental Stress and Vibration Analysis of Hybrid Composite Laminated Cracked Beam
Kumar R.K., Khan K.
Article, NanoWorld Journal, 2023
In this paper, the fundamental mode of free vibration of hybrid laminated cracked composite beams is analyzed. The beams are made by hand lay-up. Glass and carbon are used as fibers, and epoxy and resin are used as matrix materials. The material properties are measured on a Universal Testing Machine. The experimental stress and modal analysis of carbon and glass fiber reinforced cracked hybrid laminated beams has been carried out for fixed-fixed and fixed-free beams. The [0°-90°-0°-90°] lamination scheme and the [C-G-G-C] and [G-C-C-G] compositions have been used. LabVIEW software was used to perform the experiment. Strain gauges are used for stress analysis, and an accelerometer is used for modal analysis. The natural frequencies for all the cases have been determined. A finite element model has been developed using ANSYS. The experimental data are compared with the numerical results obtained from ANSYS. For the compositions [C-G-G-C] and [G-C-C-G], the natural frequency drops as the number of cracks rises. Additionally, both laminated compositions experience an increase in stress, strain, and deflection as the number of cracks increases. Maximum stress occurs at the fixed end if there is no crack; if a crack is present, the maximum stress is located close to the fixed end, and as the number of cracks rises, the site of maximum stress shifts.
Study and analysis of visual saliency applications using graph neural networks
Dhara G., Kumar R.K.
Book chapter, Concepts and Techniques of Graph Neural Networks, 2023
GNNs (graph neural networks) are deep learning algorithms that operate on graphs. A graph's unique ability to capture structural relationships among data yields more insight than analyzing data in isolation. GNNs have numerous applications in different areas, including computer vision. In this chapter, the authors investigate the application of GNNs to common computer vision problems, specifically visual saliency, salient object detection, and co-saliency. The chapter gives a thorough overview of numerous visual saliency problems that have been addressed using graph neural networks. The different research approaches that used GNNs to find saliency and co-saliency between objects are also analyzed.
Parallel Big Bang-Big Crunch-LSTM Approach for Developing a Marathi Speech Recognition System
Sharma A., Bachate R.P., Singh P., Kumar V., Kumar R.K., Singh A., Kadariya M.
Article, Mobile Information Systems, 2022
The Voice User Interface (VUI) for human-computer interaction has received wide acceptance, and as a result speech recognition systems for regional languages are now being developed, taking into account all of the dialects. Because of the limited availability of speech corpora (SC) for regional languages, designing a speech recognition system is challenging. This contribution provides a Parallel Big Bang-Big Crunch (PB3C)-based mechanism to automatically evolve the optimal architecture of an LSTM (Long Short-Term Memory) network. To decide the optimal architecture, we evolved the number of neurons and hidden layers of the LSTM model. We validated the proposed approach on a Marathi speech recognition system. In this research work, the proposed method is compared with a BBBC-based LSTM and a manually configured LSTM. The results indicate that the proposed approach performs better than the two other approaches.
Improving performance of classifiers for diagnosis of critical diseases to prevent COVID risk
Kumar V., Lalotra G.S., Kumar R.K.
Article, Computers and Electrical Engineering, 2022
The risk of developing COVID-19 and its variants may be higher in those with pre-existing health conditions such as thyroid disease, Hepatitis C Virus (HCV), breast tissue disease, chronic dermatitis, and other severe infections. Early and precise identification of these disorders is critical. A huge number of patients in nations like India require early and rapid testing as a preventative measure. The problem of imbalance arises from the skewed nature of the data, in which instances from the majority class are classified correctly while the minority class is misclassified by many classifiers. When it comes to human life, this kind of misclassification is unacceptable. To solve the misclassification issue and improve accuracy on such datasets, we applied a variety of data balancing techniques in combination with several machine learning algorithms. The outcomes are encouraging, with a considerable increase in accuracy. With these correct diagnoses, plans can be made and the required actions taken to stop patients from acquiring serious health issues or viral infections.
Water Body Identification from the Satellite Images using Color Component Analysis with Morphological Operations
Jagruth K., Manikandan V.M., Kumar R.K.
Conference paper, 2021 12th International Conference on Computing Communication and Networking Technologies, ICCCNT 2021, 2021
Many countries, including India, are frequently affected by natural disasters like floods. In general, predicting natural disasters accurately is very difficult, but advanced technologies can be used to overcome such difficulties or to reduce the impact of natural disasters. Satellite image processing is an efficient way to detect water bodies on the Earth's surface, which may help the agriculture industry or help identify flooded regions. In this paper, we propose a scheme to identify water bodies in satellite images, which will be useful for various applications. During our research, we created a set of water body images by cropping satellite images. The properties of the water body regions were analyzed using an algorithm, and a set of possible threshold values was computed for the pixels representing water bodies. The threshold values obtained from the analysis of the 'water body images' are used in the proposed algorithm to identify water bodies in any given image. A sequence of morphological operations is introduced to refine the results obtained through pixel color component analysis. The result analysis was carried out on a set of satellite images, and the scheme achieved good results.
BAT algorithm based feature selection: Application in credit scoring
Tripathi D., Ramachandra Reddy B., Padmanabha Reddy Y.C.A., Shukla A.K., Kumar R.K., Sharma N.K.
Article, Journal of Intelligent and Fuzzy Systems, 2021
Credit scoring plays a vital role for financial institutions in estimating the risk associated with an applicant for a credit product. The score is estimated from the applicant's credentials and directly affects the viability of the issuing institution. However, a credit scoring dataset may contain a large number of irrelevant features, which can lead to poorer classification performance and higher model complexity. Removing redundant and irrelevant features may therefore overcome the problem of a large feature set. In this work, we emphasize the role of feature selection in enhancing the predictive performance of a credit scoring model. For feature selection, the Binary BAT optimization technique is used with a novel fitness function. Further, the proposed approach is combined with 'Radial Basis Function Neural Network (RBFN)', 'Support Vector Machine (SVM)' and 'Random Forest (RF)' for classification. The proposed approach is validated on four benchmark credit scoring datasets obtained from the UCI repository. Further, a comprehensive analysis of the experimental results compares classification performance using features selected by various approaches and against other state-of-the-art approaches for credit scoring.
Constraint saliency based intelligent camera for enhancing viewers attention towards intended face
Kumar R.K., Garain J., Kisku D.R., Sanyal G.
Article, Pattern Recognition Letters, 2020, DOI Link
Visual saliency decides the focus of attention towards a region in a scene. When we attend to faces in a crowd or in a set of multiple faces, our attention is not distributed equally across all the faces. The bias of the human visual system towards a particular face occurs because some of its features dominate those of the other faces. So, for faces that are not intrinsically salient in a scene, their attentiveness needs to be increased. The current study attempts to improve the saliency of a face (in a set of multiple faces) that is not significantly salient. This can be achieved by enhancing the contrast of the intended face with its surrounding faces in terms of low-level visual features such as intensity and color. Modifying such feature values changes the face's potential to attract the observer's gaze, but excessive change destroys the originality of the image. Therefore, the problem of enhancing the saliency of a target face is framed as an optimization (maximization) problem under some constraints. This concept can be applied to develop a saliency-based intelligent camera capable of enhancing the attractiveness of a particular face in the crowd, so that the enhanced face may receive more attention from viewers of the photograph. Experiments have been conducted on grayscale as well as colour images. Moreover, the effect of saliency on faces wearing jewellery has also been measured.
Enhancing face recognition through overlaying training images
Sinha A.S., Rahman A.U., Kumar R.K., Sanyal G.
Conference paper, 2019 2nd International Conference on Advanced Computational and Communication Paradigms, ICACCP 2019, 2019, DOI Link
In the usual face recognition approach, the system is trained on a large number of training samples. This means that, during training, features are extracted from all the training images individually. In this process, many redundant features also need to be eliminated. During feature elimination, some features are also suppressed because of inappropriate thresholds. This approach is therefore typically time consuming and costly to train. Hence, features need to be extracted in a way that reduces data redundancy and system complexity. This paper presents a facial recognition technique that includes a superimposed version of all relevant images, which improves the accuracy of the model by roughly 43 percent. The algorithm aims to establish the importance of the superimposition strategy in the field of face recognition. A Haar feature-based classifier is used, where a cascade function is trained from a set of images. We have used the open-source database of faces from the archives of AT&T Laboratories Cambridge to train and test our model.
Bezier Cohort Fusion in Doubling States for Human Identity Recognition with Multifaceted Constrained Faces
Garain J., Mishra S.R., Kumar R.K., Kisku D.R., Sanyal G.
Article, Arabian Journal for Science and Engineering, 2019, DOI Link
Cohort selection benefits a biometric system by providing information collected from non-match templates, whereas fusion benefits a system by combining information collected from different sources or from the same source in different ways. The benefits of both approaches are exploited here by proposing a cohort selection technique that is applied before and after fusion of matching scores for a face recognition system. Two robust facial features, viz. scale-invariant feature transform and speeded up robust features, are used. This study presents a novel way of fusion based on cohort selection, unlike the traditional levels of fusion (i.e., sensor, feature, match score, rank and decision level fusion). Cohort-based fusion is performed in two different fashions: pre-cohort fusion and post-cohort fusion. In pre-cohort (early) fusion, fusion rules such as the sum, max, min and average rules are applied before cohort selection is performed. In contrast, in post-cohort (late) fusion, cohort selection is followed by fusion. The union operation is applied as the late fusion rule. The matching scores are normalized by the T-norm cohort score normalization technique before being compared with the threshold value that governs the system's acceptance decision. The experiments are carried out on the FEI and Look-alike (IIIT Delhi) face databases. The outcomes of the proposed method appear encouraging and convincing compared with non-cohort systems and state-of-the-art methods.
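T-norm score normalization, which recurs across several of the cohort papers listed here, is commonly defined as centring a raw matching score on the mean of the cohort scores and dividing by their standard deviation. The sketch below illustrates that common definition followed by a threshold decision. The score values, the threshold, the use of the population standard deviation, and the function names are assumptions for illustration and are not taken from the paper.

```python
from statistics import mean, pstdev

def t_norm(raw_score, cohort_scores):
    """Centre the raw score on the cohort mean and scale by the cohort spread."""
    mu = mean(cohort_scores)
    sigma = pstdev(cohort_scores)   # population std; a sample std is an equally valid choice
    if sigma == 0:
        raise ValueError("cohort scores have zero spread; T-norm is undefined")
    return (raw_score - mu) / sigma

def accept(raw_score, cohort_scores, threshold):
    """Accept the identity claim when the normalized score reaches the threshold."""
    return t_norm(raw_score, cohort_scores) >= threshold

# Hypothetical scores of one probe against a cohort subset (mean 5, std 2).
cohort = [2, 4, 4, 4, 5, 5, 7, 9]
normalized = t_norm(9, cohort)            # (9 - 5) / 2
decision = accept(9, cohort, threshold=1.5)
```

Because the normalized score is expressed in units of cohort spread, one threshold can be shared across subjects whose raw score distributions differ.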
Guiding attention of faces through graph based visual saliency (GBVS)
Kumar R.K., Garain J., Kisku D.R., Sanyal G.
Article, Cognitive Neurodynamics, 2019, DOI Link
In a general scenario, when attending to a scene containing multiple faces or looking at a group photograph, our attention is not directed equally towards all the faces. That is, we are naturally biased towards some faces. This bias arises from the presence of dominant perceptual features in those faces. In visual saliency terminology, such a face can be called a ‘salient face’. Humans focus their gaze on a face that carries the ‘dominating look’ in the crowd. This happens because of the comparative saliency of the faces. The saliency of a face is determined by its feature dissimilarity with the surrounding faces. Human psychology and cognitive science also play a big role in this context. Therefore, a great deal of research has been carried out on modeling computer vision systems after human vision. This paper proposes a graph-based bottom-up approach to point out the salient face in a crowd or in an image containing multiple faces. In this novel method, the visual saliency of faces is calculated from the intensity values, facial areas and relative spatial distances. Experiments have been conducted on grayscale images. To verify the experiment, three levels of validation have been performed. In the first level, our results are verified against the prepared ground truth. In the second level, intensity scores of the proposed saliency maps are cross-verified with the saliency score. In the third level, the saliency map is validated with some standard parameters. The results are found to be interesting, and in some respects the saliency predictions resemble those of the human vision system. The evaluation of the proposed approach shows moderately improved results; hence, this idea can be useful in the future modeling of intelligent vision (robot vision) systems.
Addressing facial dynamics using k-medoids cohort selection algorithm for face recognition
Garain J., Kumar R.K., Kisku D.R., Sanyal G.
Article, Multimedia Tools and Applications, 2019, DOI Link
Face recognition is itself a very challenging task, and it becomes more challenging when the input images have large intra-class variations and inter-class similarities. Yet the recognition accuracy can be improved to some extent by supporting the system with non-matched templates. A set of cohort images is therefore used for this purpose. However, not all the cohort templates of the initial cohort pool may be relevant for each enrolled subject. So the main focus of this work is to select a subject-specific and meaningful cohort subset. This paper proposes a cohort selection method called K-medoids Cohort Selection (KMCS) to select a reference set of non-matched templates that are largely appropriate to the respective subjects. All cohort scores of a subject are first clustered using K-medoids clustering. Afterwards, the cluster whose members/scores are more scattered from its medoid is selected as the cohort subset, because this cluster consists of the cohorts carrying more discriminative features than the others. SIFT points and SURF points are extracted as facial features. The experiments are conducted on the FEI, ORL and Look-alike face image databases. The matching scores between probe and query images are normalized using the T-norm, Max-Min and Aggarwal (Max rule) cohort score normalization techniques before the final decision of acceptance or rejection is taken. The results obtained from the experiments show the superiority of the proposed system over the non-cohort face recognition system as well as over random and Top 10 cohort selection methods. A further comparative study between k-means and K-medoids clustering for cohort selection is also presented.
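The selection rule described in this abstract (cluster a subject's cohort scores with K-medoids, then keep the cluster whose members are most scattered around its medoid) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the published KMCS implementation: scores are treated as one-dimensional, the initial medoids are picked deterministically from the sorted scores, scatter is measured as mean absolute distance to the medoid, and all function names and score values are hypothetical.

```python
def _assign(scores, medoids):
    """Put every score into the cluster of its nearest medoid."""
    clusters = [[] for _ in medoids]
    for s in scores:
        nearest = min(range(len(medoids)), key=lambda j: abs(s - medoids[j]))
        clusters[nearest].append(s)
    return clusters

def k_medoids_1d(scores, k, max_iter=100):
    """Alternate assignment and medoid update until the medoids stop changing."""
    ordered = sorted(scores)
    n = len(ordered)
    # Assumed initialisation: evenly spaced order statistics of the scores.
    medoids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(max_iter):
        clusters = _assign(scores, medoids)
        updated = []
        for j, members in enumerate(clusters):
            if not members:                      # keep the old medoid for an empty cluster
                updated.append(medoids[j])
                continue
            # The medoid is the member with the smallest total distance to the others.
            updated.append(min(members, key=lambda c: sum(abs(c - m) for m in members)))
        if updated == medoids:
            break
        medoids = updated
    return medoids, _assign(scores, medoids)

def select_cohort_subset(scores, k):
    """Return the cluster with the largest mean distance from its medoid."""
    medoids, clusters = k_medoids_1d(scores, k)
    def scatter(j):
        members = clusters[j]
        return sum(abs(m - medoids[j]) for m in members) / len(members) if members else 0.0
    return clusters[max(range(k), key=scatter)]

# Hypothetical cohort scores of one enrolled subject.
cohort_scores = [10, 11, 12, 40, 50, 70, 90]
subset = select_cohort_subset(cohort_scores, k=2)
```

With these example scores the tight cluster around 11 is passed over in favour of the more widely spread one, mirroring the abstract's preference for the more scattered cluster.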
Combined effect of cohort selection and decision level fusion in a face biometric system
Garain J., Kumar R.K., Kumar D., Kisku D.R., Sanyal G.
Conference paper, Advances in Intelligent Systems and Computing, 2018, DOI Link
Variations in different parameters degrade the performance of a face biometric system. Baseline biometric systems can be relieved of this negative effect to some extent by using the information in cohort images and fusion methods. However, obtaining a set of suitable cohorts for each enrolled person is a highly challenging task. This paper presents a way of determining the cohort subset using k-means clustering cohort selection based on matching proximity. SIFT and SURF are used as facial features to represent each face image and to calculate the similarity score between two face images. The clusters with the highest and lowest centroid values are fused using the union rule to form the target, user-dependent cohort subset. The query-claimed matching scores are normalized with the T-norm cohort normalization technique. The normalized scores are used in recognition separately for SIFT and SURF. Finally, the responses from the classifier for these two features are fused at the decision level to cover any shortcomings of the cohort selection method. The experiments are performed on the FEI face database. This integrated face biometric system gains a significant improvement in performance, which demonstrates its effectiveness over the baseline.
Image Specific Cross Cohort Normalization for Face Pair matching
Garain J., Kumar R.K., Kumar D., Kisku D.R., Sanyal G.
Conference paper, Procedia Computer Science, 2018, DOI Link
Image matching, or face pair matching, is quite different from other problems in computer vision and pattern recognition. It is a very active and challenging topic because the matching expert has no prior information about the input images to be matched. An additional set of images can resolve this problem to some extent. In this context, a cohort-based face pair matching system is proposed. Initially, the cohort set is common to all images, but finally a subset of cohort images specific to each of the paired images is selected. Here, Max-Min-Centroid-Cluster (MMCC) is applied, which is capable of choosing highly relevant cohorts corresponding to the target images. The raw similarity score between the input images is normalized with these sets of cohort scores to obtain two normalized matching scores. Afterwards, the closeness between the images is measured by cross cohort normalization. The absolute difference of these two cross-normalized scores is calculated and compared with a threshold value to decide whether the input images belong to the same person or to different persons. The experiment has been conducted on the ORL face database, and the results provide evidence that the proposed system is efficient.
A master map: An alternative approach to explore human’s eye fixation for generating ground truth based on various state-of-the-art techniques
Kumar R.K., Garain J., Kisku D.R., Sanyal G.
Conference paper, Lecture Notes in Electrical Engineering, 2018, DOI Link
A saliency map is an efficient way to represent the salient objects in an image. A good deal of research has been carried out in the area of object-based saliency. These works use normal input images in which the salient region or arousing object is easily perceived. The experiments are validated against ground truth obtained from volunteers, either through a human eye fixation machine or through viewers' voting. For complex images, however, salient locations are very ambiguous, so preparing ground truth is very difficult. For such images, results vary across different state-of-the-art saliency models. To address this problem, this paper implements combination strategies to obtain a composite saliency map that incorporates the properties of every individual component map. Fusing saliency maps in this way can be used to generate good ground truth information, offering an alternative to preparing ground truth through volunteers' voting or an eye fixation machine. As this approach incorporates the concept of reusability, it may reduce the time and cost of preparing ground truth.
Attending prominent face in the set of multiple faces through relative visual saliency
Kumar R.K., Garain J., Kisku D.R., Sanyal G.
Conference paper, Advances in Intelligent Systems and Computing, 2018, DOI Link
Visual saliency determines the extent of attentiveness of a region in a scene. In the context of attending to faces in a crowd, the face components and their dominant features decide the focus of attention. Attention boosts the recognition and identification process in a crowd and hence plays an important role in visual surveillance and robotic vision. Using different computer vision-based techniques, a great deal of research has been carried out on attention, recognition, and identification of the human face in the context of different applications. This paper proposes a novel technique to analyze and explore the prominent face in a set of multiple faces (crowd). The proposed method uses the concept of relative visual saliency, which is evaluated on various parameters of the face as a whole and of its components. These parameters are face area, spatial location, intensity, hue, RGB values, etc. The proposed work furnishes satisfactory results. The assessment made with this approach shows quite encouraging results, which may lead to a future model for robotic vision and intelligent decision-making systems.
A bezier curve cohort selection strategy for face pair matching
Garain J., Kumar R.K., Kisku D.R., Sanyal G.
Conference paper, ACM International Conference Proceeding Series, 2018, DOI Link
Matching two face images without any prior information is a very challenging task, unlike in a verification or identification system where some knowledge about the images of each subject is already stored in the system's database. This paper proposes a methodology to improve the performance of a face pair matching system by using the complementary information collected from a set of cohort face images with the help of a Bezier curve cohort selection algorithm. A pair of face images is given as input to the system. Each image is compared with a predefined cohort pool to form two separate sets of cohort scores. These sets of cohort scores are then passed through the Bezier curve cohort selection method, which provides two suitable cohort subsets. Afterwards, cross normalization is performed in conjunction with the T-norm score normalization method, and the absolute normalized difference between the paired face images is determined. On the basis of this normalized difference, it is finally decided whether or not the input face pair is from the same person. The system is evaluated on the FEI face database, and the results are quite impressive.
Estimating attention of faces due to its growing level of emotions
Kumar R.K., Garain J., Kisku D.R., Sanyal G.
Conference paper, IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2018, DOI Link
When attending to faces in a disciplined assembly (such as an examination hall or a silent public place), our gaze automatically goes towards those persons who exhibit an expression other than the normal expression. This happens because a dissimilar expression stands out among a gathering of normal ones. Hardly any effective research has succeeded in modeling this concept in the intelligent vision of a computer system. Therefore, in this proposal we have tried to provide a solution for this challenging computer vision task. This problem is related to the cognitive aspect of visual attention. In the visual saliency literature, authors have dealt with expressionless objects, but objects such as faces, which exhibit expressions, have not been addressed. Visual saliency is a term that differentiates 'appealing' visual substance from others based on feature differences. In this paper, the 'salient face' in a set of multiple faces is explored based on 'emotion deviation' from the normal. In the first phase of the experiment, face detection is performed using the Viola-Jones face detector. A deep convolution neural network (CNN) is applied for training and classification of different facial expressions of emotion. The saliency score of every face in the input image is then computed by measuring its 'emotion score', which depends on the deviation from the 'normal expression' scores. The proposed approach exhibits fairly good results, which may give researchers a new dimension for modeling an intelligent vision system that can be useful in visual security and surveillance.
BCP-BCS: Best-fit cascaded matching paradigm with cohort selection using bezier curve for individual recognition
Garain J., Shah A., Kumar R.K., Kisku D.R., Sanyal G.
Conference paper, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2017, DOI Link
The concept of cohort selection has emerged as a very interesting and promising topic for ongoing research in biometrics. It can give traditional biometric systems a higher performance rate with lower complexity and cost. This paper describes a novel matching technique combined with Bezier curve cohort selection. Best-Fit matching with a dynamic threshold is proposed here to reduce the number of false matches. This algorithm is applied to match Speeded Up Robust Feature (SURF) points detected on face images to find the matching score between two faces. After that, a Bezier curve is applied as a cohort selection technique. All the cohort scores are plotted in a 2D plane as if they were the control points of a Bezier curve, and then a Bezier curve of degree n is plotted on the same plane using the De Casteljau algorithm, where the number of control points is n + 1. The farther a template's score point lies from the curve, the more discriminative features the template contains. All the templates whose score points are far from the curve are included in the cohort subset. A specific cohort subset is determined for each enrolled user. Once the subset is formed, the T-norm cohort score normalization technique is applied to obtain the normalized scores, which are further used for person identification and verification. Experiments are conducted on the FEI face database, and the results show dominance over the non-cohort system.
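The Bezier curve step can be sketched as follows. The De Casteljau evaluation is standard, but everything around it is an assumption made for illustration: the abstract does not say how scores are placed on the 2D plane (here the x coordinate is the template index), how distance to the curve is measured (here the nearest of a fixed number of sampled curve points), or what counts as "far" (here, above the mean distance). The names `de_casteljau` and `bezier_cohort_subset` are hypothetical.

```python
import math

def de_casteljau(control_points, t):
    """Evaluate a Bezier curve at parameter t by repeated linear interpolation."""
    pts = list(control_points)
    while len(pts) > 1:
        pts = [((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
               for (x0, y0), (x1, y1) in zip(pts, pts[1:])]
    return pts[0]

def bezier_cohort_subset(cohort_scores, samples=200):
    """Return indices of templates whose score points lie far from the curve."""
    # Assumed layout: template i becomes the control point (i, score_i).
    control = [(float(i), float(s)) for i, s in enumerate(cohort_scores)]
    curve = [de_casteljau(control, k / samples) for k in range(samples + 1)]
    # Approximate point-to-curve distance by the nearest sampled curve point.
    distances = [min(math.dist(p, c) for c in curve) for p in control]
    cutoff = sum(distances) / len(distances)     # assumed rule: "far" = above the mean
    return [i for i, d in enumerate(distances) if d > cutoff]

# n + 1 = 3 control points give a curve of degree n = 2.
midpoint = de_casteljau([(0, 0), (1, 2), (2, 0)], 0.5)

# Hypothetical cohort scores; the middle template stands out from its neighbours.
selected = bezier_cohort_subset([0.2, 0.9, 0.2])
```

Because the x axis (template index) and the y axis (score) are on different scales in this sketch, a real system would likely need to rescale them before measuring distances.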
A novel approach to attend faces in the crowd through relative visual saliency
Das A., Kumar R.K., Kisku D.R., Sanyal G.
Conference paper, IEEE Region 10 Annual International Conference, Proceedings/TENCON, 2017, DOI Link
Visual saliency plays an important role in attending to faces in a crowd. When considering faces in a crowd, humans are inclined towards certain faces because of the dominance of some features in those faces, such as color, texture, intensity and geometry. In this paper, we propose a novel method to analyze the saliency of faces in a crowd image based on four parameters, namely feature difference (intensity difference), spatial distance, area of each face, and camera distance, i.e. the distance of each face in the crowd from the viewer's point. To the best of our knowledge, this area has not yet been explored by researchers to its full capacity. The experimental results have been found motivating. This method can find future application in artificially intelligent systems, and cognitive models that mimic the ability of the human vision system can be established based on this theory.
Emotion recognition through facial gestures – a deep learning approach
Mishra S., Prasada G.R.B., Kumar R.K., Sanyal G.
Conference paper, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2017, DOI Link
As defined by some theorists, human emotions are discrete and consistent responses to internal or external events which have significance for an organism. They constitute a major part of our non-verbal communication. Among the human emotions, happy, sad, fear, anger, surprise, disgust and neutral are the seven basic emotions. Facial expressions are the best way to exhibit emotions. In this era of booming human-computer interaction, enabling machines to recognize these emotions is a paramount task. There is an amalgamation of emotions in every facial expression. In this paper, we identify the different emotions and their intensity levels in a human face by implementing a deep learning approach through our proposed Convolution Neural Network (CNN). The architecture and the algorithm yield appreciable results that can motivate further research in computer-based emotion recognition systems.
Determine attention of faces through growing level of emotion using deep Convolution Neural Network
Kumar R.K., Kumar G.A.R., Garain J., Kisku D.R., Sanyal G.
Conference paper, 2017 International Conference on Intelligent Computing, Instrumentation and Control Technologies, ICICICT 2017, 2017, DOI Link
Facial emotions are internal feelings of a human that are reflected on the face as specific expressions. They appear naturally rather than through forceful effort and are accompanied by physical variations in facial muscles that produce various expressions on the face. Several standard emotions are happy, anger, sad, fear, surprise, disgust, etc. Facial expressions play an essential role in non-verbal communication. A good deal of research has been performed on modeling intelligent computer vision that can recognize human emotion. However, advanced research that analyzes beyond 'just emotion prediction', for example 'emotion attention' or 'salient emotion', has not been covered much in the literature. In this paper, we find the attention scores of the same emotion at various levels. The obtained results show that a higher expression level of an emotion attracts more attention. The experiment has been performed on different levels of the same emotions (frame by frame) using a deep Convolution Neural Network (CNN), analyzing how the saliency of a face changes from a low level to a high level of the same emotion. The FER-2013 and CK+ databases have been used for training and testing, respectively. The proposed approach delivers fairly good results and may inspire researchers to model an intelligent vision system that goes beyond emotion recognition.
Discriminating real from fake smile using convolution neural network
Kumar G.A.R., Kumar R.K., Sanyal G.
Conference paper, ICCIDS 2017 - International Conference on Computational Intelligence in Data Science, Proceedings, 2017, DOI Link
In our society, we sometimes hide our genuine feelings and emotions and purposely express a different emotion in front of the people around us. But as it is not actually a natural emotion, it is more or less detectable by others. The human vision system has an enormous capability to recognize genuine and fake smiles of an individual. Discriminating between genuine and fake smiles is a very thought-provoking task, yet very little research has been carried out on this topic. In this paper, we explore a method to distinguish real from fake smiles with high precision using convolution neural networks (CNN). The system has been trained on the FERC-2013 dataset, which has seven types of emotions, namely happy, sad, disgust, angry, fearful, surprised and neutral. The emotion percentages of real and fake faces are recorded by the emotion detection system. Based on the recorded scores, we investigate the effect of the various percentages of emotions present on both faces and then classify whether the smile on a face is real or fake.
Facial emotion analysis using deep convolution neural network
Kumar G.A.R., Kumar R.K., Sanyal G.
Conference paper, Proceedings of IEEE International Conference on Signal Processing and Communication, ICSPC 2017, 2017, DOI Link
Human emotions are mental states of feeling that arise spontaneously rather than through conscious effort and are accompanied by physiological changes in facial muscles, which produce expressions on the face. Some of the critical emotions are happy, sad, anger, disgust, fear, surprise, etc. Facial expressions, which appear when a person's internal feelings are reflected on the face, play a key role in non-verbal communication. Plenty of research has been carried out on computer modeling of human emotion, but it is still far behind the human vision system. In this paper, we provide a better approach to predict human emotions (frame by frame) using a deep Convolution Neural Network (CNN) and to show how emotion intensity changes on a face from a low level to a high level of emotion. In this algorithm, the FERC-2013 database has been used for training. The assessment through the proposed experiment gives quite good results, and the obtained accuracy may encourage researchers towards future models of computer-based emotion recognition systems.
Selection of user-dependent cohorts using bezier curve for person identification
Garain J., Kumar R.K., Kisku D.R., Sanyal G.
Conference paper, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2016, DOI Link
Traditional biometric systems can be strengthened further by exploiting the concept of cohort selection to meet organizations' high demands for a robust automated person identification system. To accomplish this task, researchers are motivated to develop robust biometric systems using cohort selection. This paper proposes a novel user-dependent cohort selection method using a Bezier curve. It uses the invariant SIFT descriptor to generate matching point pairs between a pair of face images. Further, for each subject, considering all the imposter scores as control points, a Bezier curve of degree n is plotted by applying the De Casteljau algorithm. As the imposter scores represent the control points of the curve, a cohort subset is formed from the points determined to be far from the Bezier curve. To obtain the normalized cohort scores, the T-norm cohort normalization technique is applied. The normalized scores are then used in recognition. The experiment is conducted on the FEI face database. This novel cohort selection method achieves superior performance, which validates its efficiency.
A novel approach to enlighten the effect of neighbor faces during attending a face in the crowd
Kumar R.K., Garain J., Kisku D.R., Sanyal G.
Conference paper, IEEE Region 10 Annual International Conference, Proceedings/TENCON, 2016, DOI Link
While attending to a crowd, humans do not look at all the faces with the same gaze. Human beings naturally have an extraordinary capability to focus their gaze on the face carrying the 'dominating look' in the crowd in terms of beauty or ugliness. This happens because of the relative saliency of that face with respect to its surrounding faces. A great deal of research has been carried out on modeling computer vision based on human psychology and cognitive science. To extend this research, this paper proposes a novel method to illustrate how the saliency of a face varies with its neighboring faces in the crowd. This variation affects our perception of and attention towards a face in the crowd. In this method, the distribution of visual saliency is calculated from the intensity values and respective spatial distances of the faces. Experiments have been carried out on various facial images. The results and accuracy are found to be satisfactory. The evaluation of the proposed approach shows quite encouraging results, and it can lead to a future model of intelligent computer vision.
Heterogeneous face detection
Das A., Kumar R.K., Kisku D.R.
Conference paper, ACM International Conference Proceeding Series, 2016, DOI Link
Face detection is the process of determining the location of human faces in an image. Like the human visual system, a face detection system should be capable of performing the detection task irrespective of illumination, absence of texture, orientation and camera distance. Detecting faces in heterogeneous, infrared and thermal images is a challenging job because of variation in texture, orientation, lighting conditions, intensity, etc. Many researchers have proposed various methods for detecting visible faces. However, face detection in heterogeneous face images with existing algorithms is found to be quite difficult. This paper attempts to improve the accuracy of an existing face detection algorithm that was originally designed for detecting visible faces, so that it can also detect heterogeneous faces, by applying various image enhancement techniques before face detection. The proposed improved algorithm is suitable for heterogeneous face images, including thermal and infrared images of crowds. The approach has been tested on a test image dataset. The experimental results are found to be encouraging.
Constrained maximization of saliency of intended object for guiding attention
Kumar R.K., Pal R.
Conference paper, 12th IEEE International Conference Electronics, Energy, Environment, Communication, Computer, Control: (E3-C3), INDICON 2015, 2016, DOI Link
The saliency of an object in an image determines the attentiveness of the object with respect to the human visual system. Saliency, in the absence of any external stimuli, is determined by the contrast of features (such as intensity and color) of the object with its surroundings. This paper proposes a novel technique to enhance the saliency of an object that is not intrinsically salient. To enhance attentiveness, the feature values of a user-defined target object (whose saliency has to be enhanced) should differ more from the feature values of its surrounding objects. But too much modification of the target object's values will destroy the naturalness of the image. So the problem of enhancing the saliency of a target object is treated as a maximization problem under some constraints.
Attention identification via relative saliency of localized crowd faces
Das A., Kumar R.K., Kisku D.R., Sanyal G.
Conference paper, ACM International Conference Proceeding Series, 2016, DOI Link
While viewing a crowd, human vision automatically focuses on the most attentive face. This happens when a particular face in the crowd dominates the other faces in terms of beauty, expression, color, shape, size, structure, etc. Human attention is drawn towards such faces because of their higher visual saliency values. A computer vision system can be modeled on this aspect of human psychology. The visual saliency of a face in a crowd may vary according to many parameters. In this paper, we propose a novel method to calculate the distribution of visual saliency of faces in a crowd based on their feature difference, spatial distance and size. This method has been tested on various crowd images, and encouraging results have been found. It can therefore be used to mimic the cognitive behavior of the human vision system to create artificially intelligent computer vision systems.
Automated skull stripping in brain MR images
Aruchamy S., Kumar R.K., Bhattacharjee P., Sanyal G.
Conference paper, Proceedings of the 10th INDIACom; 2016 3rd International Conference on Computing for Sustainable Global Development, INDIACom 2016, 2016
Skull stripping is a significant preliminary step in diagnosing brain disorders. It removes extra-meningeal tissues from Magnetic Resonance Images of the brain. Magnetic Resonance Imaging (MRI) is a widely used technique for analysis of brain images. An efficient hardware-based algorithm for skull segmentation would help in developing an automated brain image analysis system for real-time applications in biomedical sciences. In this work, an image analysis algorithm for automatic skull stripping, running on a Raspberry Pi single-board computer, is reported. The experiment has been carried out with T1 weighted axis images. The images were first pre-processed to reduce noise and enhance quality. Edge detection and morphological operations were then performed to extract the skull from the brain images. The proposed method has been validated using quantitative performance metrics such as the Jaccard similarity index and the Dice coefficient. This technique can serve as a major step toward future systems for automated skull stripping of brain images.
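The two validation metrics named in this abstract can be illustrated with a minimal sketch. It assumes binary masks stored as nested lists (1 for the extracted region, 0 for background); the function names and the toy masks are hypothetical and are not taken from the paper.

```python
def jaccard_index(pred, truth):
    """Jaccard index |A ∩ B| / |A ∪ B| for two binary masks (nested lists)."""
    inter = union = 0
    for row_p, row_t in zip(pred, truth):
        for p, t in zip(row_p, row_t):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks are treated as a perfect match.
    return inter / union if union else 1.0


def dice_coefficient(pred, truth):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    inter = size_sum = 0
    for row_p, row_t in zip(pred, truth):
        for p, t in zip(row_p, row_t):
            inter += 1 if (p and t) else 0
            size_sum += (1 if p else 0) + (1 if t else 0)
    return 2 * inter / size_sum if size_sum else 1.0


# Toy 2x3 masks (illustrative values only).
predicted = [[0, 1, 1],
             [0, 1, 0]]
reference = [[0, 1, 0],
             [0, 1, 1]]

jaccard = jaccard_index(predicted, reference)
dice = dice_coefficient(predicted, reference)
```

The two scores are related by Dice = 2J / (1 + J), so they always rank segmentations in the same order.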
Estimating normalized attention of viewers on account of relative visual saliency of faces (NRVS)
Kumar R.K., Garain J., Sanyal G., Kisku D.R.
Article, International Journal of Software Engineering and its Applications, 2015, DOI Link
Human psychological and behavioral understanding often leads to natural decisions in which people accurately identify and remember faces they strongly appreciate or criticize, in terms of beauty, ugliness or unique appearance, compared with ordinarily viewed faces. This happens because human psychology is biased towards the salient face during face recognition and identification. This paper presents a novel method to measure how our attention is restricted to particular faces in a crowd. This restricted attention is strongly guided by the relative visual saliency of these faces. In this paper, the normalized relative visual saliency (NRVS) of the faces is evaluated using their intensity values modulated by the respective spatial distance. The experiment has been carried out on a test image dataset via a bottom-up approach. The experimental results are encouraging, and the measured accuracy demonstrates the efficacy of the proposed approach.
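The abstract does not give the NRVS formula, so the following is only a rough sketch of the general idea of intensity contrast modulated by spatial distance. The weighting `1 / (1 + distance)`, the face tuples and the function name are all assumptions made for illustration.

```python
import math


def normalized_relative_saliency(faces):
    """faces: list of (mean_intensity, x, y). Returns scores that sum to 1.

    Assumed rule (not from the paper): a face gains saliency from its
    intensity difference to every other face, damped by their distance.
    """
    if not faces:
        return []
    raw = []
    for i, (int_i, xi, yi) in enumerate(faces):
        score = 0.0
        for j, (int_j, xj, yj) in enumerate(faces):
            if i == j:
                continue
            distance = math.hypot(xi - xj, yi - yj)
            score += abs(int_i - int_j) / (1.0 + distance)
        raw.append(score)
    total = sum(raw)
    if total == 0:
        # All faces look identical: share attention equally.
        return [1.0 / len(faces)] * len(faces)
    return [s / total for s in raw]


# Three hypothetical faces: one bright face and two similar darker ones.
faces = [(200, 0, 0), (90, 3, 4), (100, 6, 8)]
scores = normalized_relative_saliency(faces)
most_salient = scores.index(max(scores))
```

With these toy values the bright face at the origin receives the largest share of attention, because it differs most from its neighbours.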
Analysis of Attention Identification and Recognition of Faces through Segmentation and Relative Visual Saliency (SRVS)
Kumar R.K., Garain J., Sanyal G., Kisku D.R.
Conference paper, Procedia Computer Science, 2015, DOI Link
Attentiveness, identification and recognition of a human face in a crowd play a key role in visual surveillance. The human vision system does not attend to, identify and recognize all faces with the same perception. It is biased towards some faces in the crowd due to their higher relative visual saliency and segment-wise perception with respect to the surrounding faces. Extensive research using different computer vision techniques has been carried out on attention, recognition and identification of the human face in the context of different applications. This paper proposes a novel technique to explain and analyse how the attention, identification and recognition of a face in a crowd are guided by segmentation and relative visual saliency. The proposed solution combines segmentation with relative visual saliency, which is evaluated on the intensity values and respective spatial distance of the faces.
Cohort selection of specific user using max-min-centroid-cluster (MMCC) method to enhance the performance of a biometric system
Garain J., Kumar R.K., Sanyal G., Kisku D.R.
Article, International Journal of Security and its Applications, 2015, DOI Link
Selection of cohort models plays a vital role to increase the accuracy of a biometric authentication system as well as to reduce the computational cost. This paper proposes a novel approach for cohort selection called Max-Min-Centroid-Cluster (MMCC) method. The clusters of cohorts are generated by K-means clustering technique. The union of the clusters having largest and smallest centroid value is taken as cohort subset. The cohort scores, after normalization using different cohort based score normalization techniques, are used in authentication process of the system. Evaluation has been carried out on FEI face datasets. The performance of this novel methodology is analyzed using T-norm and Aggarwal (max rule) normalization techniques. Experimental results exhibit the efficacy of the proposed method.
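The selection rule described in this abstract (cluster the cohorts with K-means, then keep the union of the clusters with the largest and smallest centroids) can be sketched as follows. The sketch assumes scalar cohort scores, three clusters and a deterministic initialisation; these choices, the function names and the sample scores are illustrative and not taken from the paper.

```python
def kmeans_1d(values, k, iterations=100):
    """Plain Lloyd's k-means on scalars with deterministic starting centroids."""
    if len(values) < k:
        raise ValueError("need at least k values")
    ordered = sorted(values)
    # Deterministic start: k evenly spaced order statistics.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters


def mmcc_cohort_subset(cohort_scores, k=3):
    """Union of the clusters with the largest and the smallest centroid."""
    centroids, clusters = kmeans_1d(cohort_scores, k)
    non_empty = [i for i in range(k) if clusters[i]]
    high = max(non_empty, key=lambda i: centroids[i])
    low = min(non_empty, key=lambda i: centroids[i])
    if high == low:
        return sorted(clusters[low])
    return sorted(clusters[low] + clusters[high])


# Hypothetical cohort scores forming three well-separated groups.
cohort_scores = [0.10, 0.12, 0.15, 0.48, 0.50, 0.52, 0.88, 0.90, 0.93]
subset = mmcc_cohort_subset(cohort_scores, k=3)
```

Here the middle group is discarded, which shrinks the cohort set that later score normalization has to process.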
Novel methodology for guiding attention of faces through relative visual saliency (RVS)
Kumar R.K., Garain J., Sanyal G., Kisku D.R.
Conference paper, Proceedings - 2015 International Conference on Control, Automation and Robotics, ICCAR 2015, 2015, DOI Link
Identification of a human face in a crowd plays an important role in the context of surveillance. A considerable amount of research has been carried out on face identification in different applications, and researchers continue to propose new algorithms. This paper presents a novel methodology through which any face may be identified in a large crowd of human faces. The proposed technique is based on relative visual saliency, which is evaluated on the intensity values and respective spatial distance of the faces. In addition to visual saliency, top-down and bottom-up approaches to visual attention are presented and explained in the context of face identification. Both approaches are considered to contribute significantly when visual saliency is measured for attention-based face identification. The experiment has been carried out on a test image dataset. The results are satisfactory and accuracy has also been measured. The evaluation of the proposed approach shows encouraging results, and its accuracy points towards a future model of a human face tracking and recognition system.
Enhancement of fuzzy implication operator
Kumar R.K.
Conference paper, Proceedings - 2014 4th International Conference on Communication Systems and Network Technologies, CSNT 2014, 2014, DOI Link
In this paper, the fuzzy implication operator, the ply operator and their properties have been studied. On the basis of existing properties of the implication operator, some further properties have been derived in order to enhance the use of implication in approximate reasoning. In some form of monotonicity, the five ply operators and their various properties suggested by Zadeh have been explained. In the last part of the paper, the increasing and decreasing behavior of the five ply operators (obtained by Zadeh using possibility distribution) under 'Special Cases' has also been proposed. © 2014 IEEE.
Low voltage DCI based low power VLSI circuit implementation on FPGA
Pandey B., Kumar R.
Conference paper, 2013 IEEE Conference on Information and Communication Technologies, ICT 2013, 2013, DOI Link
In this paper, we study the effect of using the digitally controlled impedance (DCI) IO standard in memory interface design in terms of power consumption. In this work, we achieved 50% dynamic power reduction at 1.5V output driver voltage and 35.2% dynamic power reduction at 1.8V output driver voltage, in comparison to 2.5V output driver voltage, in a DCI based IO standard implementation on the input or output port of the target design. The target device, XC6VLX75TFF484-1, a Virtex-6 FPGA of -1 speed grade with 484 pins, is used for implementation of this design. The target design is a RAM-UART memory interface. XPower 13.4 is used for power analysis of our low power memory interface design. ISim is the simulator used to generate waveforms. PlanAhead is used for design, synthesis and implementation. © 2013 IEEE.
Clock gating aware low power global reset ALU and implementation on 28nm FPGA
Pandey B., Yadav J., Kumar J., Kumar R.
Conference paper, Proceedings - 5th International Conference on Computational Intelligence and Communication Networks, CICN 2013, 2013, DOI Link
In this paper, we apply the clock gating technique in a Global Reset ALU design on a 28nm Artix7 FPGA to save both dynamic and clock power. This technique is simulated in the Xilinx 14.3 tool and implemented on a 28nm Artix7 XC7A200T FFG1156-1 FPGA. When clock gating is not applied, clock power contributes 32.25%, 4.24%, 3.06%, 3.09%, and 3.09% of overall dynamic power at 100 MHz, 1 GHz, 10 GHz, 100 GHz and 1 THz device frequency respectively. When clock gating is applied, clock power contributes 0%, 1.02%, 1.06%, 1.06%, and 1.06% of overall dynamic power at 100 MHz, 1 GHz, 10 GHz, 100 GHz and 1 THz device frequency respectively. With clock gating, there is a 100%, 76.92%, 66.30%, 66.55% and 66.58% reduction in clock power in comparison to clock power consumption without clock gating at 100 MHz, 1 GHz, 10 GHz, 100 GHz and 1 THz operating frequency respectively. Clock gating is more effective on 28nm in comparison to 40nm and 90nm technology files. © 2013 IEEE.

Patents

A wearable device for assisting visually impaired individuals in navigation and object interaction
Dr Ravi Kant Kumar
Patent Application No: 202441048386, Date Filed: 24/06/2024, Date Published: 05/07/2024
A System and a method for face recognition in unconstrained events
Dr Ravi Kant Kumar
Patent Application No: 202541020827, Date Filed: 07/03/2025, Date Published: 21/03/2025, Status: Published
A system and method for detecting emotions through real-time facial gestures and method thereof
Dr Ravi Kant Kumar
Patent Application No: 202241042820, Date Filed: 26/07/2022, Date Published: 29/07/2022, Status: Granted

Projects

Scholars

Doctoral Scholars

Ms Shaik Reehana
Ms Keerthi Garisa
Ms Gayathri Dhara

Interests

Artificial Intelligence
Data Science
Image Processing
Machine Learning
Vision Computing


Publications

Evaluation and Enhancement of Standard Classifier Performance by Resolving Class Imbalance Issue Using Smote-Variants Over Multiple Medical Datasets
Kumar V., Kumar R.K., Singh S.K.
Article, SN Computer Science, 2025, DOI Link
In the era of machine learning, classification problems are solved by training on labeled classes. Sometimes, however, some training classes contain too little data, so the system is inadequately trained for these minority classes. In this case, the outputs for the classes with little training data are poor and biased towards the classes with more data. This is known as the class imbalance problem. In such cases, standard classifiers tend to be overwhelmed by the large classes and disregard the small ones. As a result, the performance of machine learning and deep learning algorithms drops, sometimes to an unacceptable level, particularly for crucial data such as medical and health records. Although various researchers have proposed methods to solve this problem, most are problem-specific and suit only a specific classifier. To find a generalized and effective solution, we applied various SMOTE variants to resolve the imbalance in the datasets and thereby improved the performance of various machine learning and deep learning algorithms. We have experimented with and analyzed the effects of SMOTE variants on various machine learning techniques over six standard medical datasets. We have found that SMOTE variants are very effective and that they improve the standard performance measures (Accuracy, Precision, Recall and F1-Score). Additionally, based on our research, it is feasible to determine which SMOTE variant works best with which machine learning methods and datasets.
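The abstract does not say which SMOTE variants were used. As a minimal sketch, the block below shows only the basic SMOTE step that the variants build on: a synthetic minority sample is placed at a random point on the line between a minority sample and one of its nearest minority neighbours. The function name, parameters and sample points are hypothetical.

```python
import random


def smote_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority samples."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base_idx = rng.randrange(len(minority))
        base = minority[base_idx]
        # The k nearest minority neighbours of the base sample.
        others = [p for j, p in enumerate(minority) if j != base_idx]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        # Random position on the segment between base and neighbour.
        gap = rng.random()
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Hypothetical two-feature minority class with four samples.
minority_samples = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
new_samples = smote_oversample(minority_samples, n_new=4)
```

Because every synthetic point lies between two real minority samples, the new data stays inside the region the minority class already occupies.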
A survey on visual saliency detection approaches and attention models
Dhara G., Kumar R.K.
Article, Multimedia Tools and Applications, 2025, DOI Link
Visual saliency detection models are widely used in computer vision tasks to mimic the human visual system’s perception of scenes. The part of an image that stands out from its surroundings and captures attention at a glance is referred to as the salient region. This paper presents a comprehensive review of recent advancements in Salient Object Detection (SOD) and its subfields, namely Co-Salient Object Detection (CSD) and RGB-Depth (RGBD) saliency detection. Salient Object Detection refers to techniques that analyze image surroundings to extract prominent regions from the background. Co-saliency detection, on the other hand, focuses on identifying common and salient regions across a group of related images that share similar content. In contrast to traditional saliency detection, RGBD models incorporate both color and depth information to more accurately identify salient objects. All the aforementioned SOD approaches have numerous applications in pattern recognition and computer vision. The concept of saliency detection has garnered significant interest among researchers. However, there remains a need for extensive research to produce highly accurate saliency maps that bridge the gap between perceptual accuracy and computational performance. Although many novel methods have been developed to address these challenges, further efforts are required to enhance the overall accuracy of saliency detection. This review covers a wide range of techniques, from traditional approaches to deep learning-based models. In addition to analyzing various proposed algorithms, the paper provides a comprehensive overview of evaluation metrics used to assess the performance of SOD algorithms. It also explores benchmark datasets commonly used in SOD research and presents both qualitative and quantitative experimental results for SOD and its subfields. 
Finally, this study examines open research problems, current challenges, and future directions in salient object detection, offering valuable insights and guidance to support future research advancements in the field.
A Machine Learning-based Pneumonia Detection System
Cheekuri S.V., Veeramachaneni M., Manikandan V.M., Kumar R.K.
Conference paper, 2024 5th International Conference for Emerging Technology, INCET 2024, 2024, DOI Link
Pneumonia ranks among the world's major causes of mortality and is the greatest cause of death for young children. It is an infectious condition that can be fatal, affects one or both lungs and is brought on by harmful bacteria. An accurate and timely diagnosis is essential for managing and treating patients effectively. Radiologists with specialized training are needed to assess chest X-rays to diagnose pneumonia. Therefore, an automated approach to identifying pneumonia would help treat the illness quickly, especially in isolated locations. Early disease identification is greatly aided by medical imaging, and chest X-rays are a frequent method of identifying lung disorders like pneumonia. This project offers a novel method for improving chest X-ray image quality, which is then used in conjunction with machine learning approaches to increase the detection accuracy of pneumonia. Subtle details in X-rays can be seen much better using image enhancement techniques including sharpening, contrast stretching, and histogram equalization. A VGG net and a convolutional neural network (CNN) model that can accurately diagnose pneumonia are trained using this augmented image dataset. By bridging the gap between conventional X-ray imaging and sophisticated machine learning, the initiative offers a viable approach to the early and accurate detection of pneumonia.
DEM-UFR: Deep Ensemble Method for Enhanced Unconstraint Face Recognition System
Kumar D., Kumar R.K., Garain J., Kisku D.R., Sing J.K., Gupta P.
Conference paper, Proceedings - 2024 OITS International Conference on Information Technology, OCIT 2024, 2024, DOI Link
The widespread usage of mobile devices and social media has led to a growing interest in face recognition technology. This study introduces a novel deep ensemble method designed to enhance facial recognition accuracy on a mobile selfie dataset by integrating three pre-trained models, viz. Inception-v3, ResNet-50, and EfficientNet B7 for automatic feature extraction and representation. The approach utilizes feature-level fusion through concatenation, followed by dimensionality reduction via principal component analysis (PCA). Feature optimization is carried out using the Firefly algorithm, and classification is achieved through a soft voting ensemble of classifiers, including Support Vector Machine (SVM), Random Forest, and a Deep Neural Network (DNN). When evaluated on the LFW, UTK face, and Wild Selfie datasets, the proposed method achieved recognition accuracies of 99.76%, 98.92%, and 98.73%, respectively, demonstrating competitive and significantly improved performance over existing models. The results indicate that the system performs effectively in real-world conditions, especially in environments with varying conditions.
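The soft voting step mentioned in this abstract can be sketched in a few lines: each classifier outputs class probabilities, the probabilities are averaged, and the class with the highest average wins. Equal weights and the probability values below are assumptions for illustration, not figures from the paper.

```python
def soft_vote(probabilities_per_model):
    """Average class probabilities across models and return (label, averages)."""
    n_models = len(probabilities_per_model)
    n_classes = len(probabilities_per_model[0])
    averages = [
        sum(model[c] for model in probabilities_per_model) / n_models
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=lambda c: averages[c])
    return label, averages


# Hypothetical outputs of three classifiers for one face image, three identities.
svm_probs = [0.90, 0.10, 0.00]
forest_probs = [0.40, 0.60, 0.00]
dnn_probs = [0.45, 0.55, 0.00]

label, averages = soft_vote([svm_probs, forest_probs, dnn_probs])
```

In this toy case two of the three models prefer class 1, yet soft voting picks class 0 because one model is far more confident. That sensitivity to confidence is the difference from majority (hard) voting.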
Hybrid Deep Learning Architecture With K-means Clustering For Weapon Detection In CCTV Surveillance
Raj S., Anant A., Suryadevara H., Kumar R.K.
Conference paper, 2024 IEEE International Conference on Computer Vision and Machine Intelligence, CVMI 2024, 2024, DOI Link
To raise quick alerts about criminal activity, it has become common to use multiple camera capture points. With multiple capture points, there is a need for automated alerting systems that help a human observer manage all the CCTV feeds in real time, as it is not humanly possible to go through that many video feeds without a crime slipping by undetected. Deep learning based approaches to this task have been researched previously on various networks. Researchers have tuned ever larger networks to perform weapon detection for surveillance; although they have achieved more than 90% accuracy, they have had to use the largest and most complex networks possible. Large deep learning models are costly in both computation and memory for a CCTV device running an AI workload. There was a gap in studying hybrid approaches, where deep learning is evaluated along with machine learning for the task. To close this gap, our study employs a hybrid approach that combines machine learning and deep learning methods. Training on a customised dataset was attempted initially, but when implementation proved challenging, the study transitioned to the 'OD-weapon detection dataset' collected from GitHub. Different levels of accuracy were achieved on first validation using deep learning models such as VGG16, VGG19, InceptionNet, and MobileNet, which were maximised by applying this diversified collection of weapon images. Techniques for clustering, fine-tuning, and PCA dimension reduction were used to improve the classification performance.
Experimental Study of Free Vibration Fibre Reinforced Laminated Fractured Composite Beams
Kumar R.K., Harish G., Singh R.K., Khan K.
Conference paper, Lecture Notes in Mechanical Engineering, 2024, DOI Link
In this paper, the free vibration analysis for the fundamental mode of vibration of fibre reinforced laminated fractured composite beams has been carried out. The beams are manufactured by the hand lay-up method using glass as the fibre material and epoxy resin as the matrix material. The material properties are obtained from a Universal Testing Machine. The analysis is carried out for different lamination schemes, boundary conditions, and numbers of cracks. The free vibration frequencies for the fundamental mode of vibration and the Time-History are obtained from LabVIEW. The acceleration data for the dynamic response of the beam are obtained from an accelerometer fixed at the middle of the beam. The experimental stress and modal analysis of the glass fibre reinforced laminated composite beam has been carried out for clamped-clamped and clamped-free end conditions. The experimental results obtained from LabVIEW are compared with the FE simulation results obtained from ANSYS using shell 8 node 281 elements. It is observed that for the [0°]4 specimen, the free vibration frequency of the beam decreases as the number of cracks increases, whereas for [0°]8 the frequency of the beam shows no significant change even when the number of cracks is varied. However, the frequency increases when the number of layers is increased, irrespective of cracks. For the [0°/90°]2 cross-ply laminate, the free vibration frequency decreases as the number of cracks increases. For the [0°/90°]2 lamination scheme, the maximum stress for the zero-crack beam is at the clamped end, and for single-, double- and triple-crack beams the maximum stress is found at the position of the first crack and is higher than that of the zero-crack beam, whereas for [0°/90°]4 the maximum stress values for single-, double- and triple-crack beams are lower than for the zero-crack beam.
DeepFusion-Net: A U-Net and CGAN-Based Approach for Salient Object Detection
Dhara G., Kumar R.K.
Conference paper, Lecture Notes in Networks and Systems, 2024, DOI Link
Saliency detection is a crucial task in vision computing whose goal is to identify the visually prominent regions within an input image. Automated saliency identification has attracted interest from various application fields during the last decade. An innovative method is suggested for saliency detection through Conditional Generative Adversarial Networks (CGANs) with a pre-trained U-Net model as the generator. The generated saliency maps are evaluated by the discriminator for authenticity, and the discriminator gives feedback to enhance the generator’s ability to generate high-resolution saliency maps. By iteratively training the discriminator and generator networks, the model achieves improved results in finding the salient object. By combining the strengths of conditional generative adversarial networks and the U-Net architecture, our goal is to improve the accuracy and quality of the saliency maps. Once the U-Net model is trained and its weights are saved, we integrate it into the CGAN framework for salient object detection. The U-Net serves as part of the generator for the CGAN, responsible for generating saliency maps for input images. The components of the CGAN are trained using adversarial learning to enhance the quality and realism of the resulting saliency maps. Precision, recall, MAE, and Fβ score measurements are used to evaluate performance. Thorough experiments have been conducted on three challenging saliency detection datasets, where our model has demonstrated remarkable performance, surpassing the latest saliency models. Further, faster convergence is observed in our model due to the initialization of the CGAN’s generator using pre-trained U-Net model weights.
Spatial attention guided cGAN for improved salient object detection
Dhara G., Kumar R.K.
Article, Frontiers in Computer Science, 2024, DOI Link
Recent research shows that Conditional Generative Adversarial Networks (cGANs) are effective for Salient Object Detection (SOD), a challenging computer vision task that mimics the way human vision focuses on important parts of an image. However, implementing cGANs for this task has presented several complexities, including instability during training with skip connections, weak generators, and difficulty in capturing context information for challenging images. These challenges are particularly evident when dealing with input images containing small salient objects against complex backgrounds, underscoring the need for careful design and tuning of cGANs to ensure accurate segmentation and detection of salient objects. To address these issues, we propose an innovative method for SOD using a cGAN framework. Our method utilizes an encoder-decoder framework as the generator component for the cGAN, enhancing the feature extraction process and facilitating accurate segmentation of the salient objects. We incorporate Wasserstein-1 distance within the cGAN training process to improve the accuracy of finding the salient objects and stabilize the training process. Additionally, our enhanced model efficiently captures intricate saliency cues by leveraging the spatial attention gate with global average pooling and regularization. The introduction of global average pooling layers in the encoder and decoder paths enhances the network's global perception and fine-grained detail capture, while the channel attention mechanism, facilitated by dense layers, dynamically modulates feature maps to amplify saliency cues. The generated saliency maps are evaluated by the discriminator for authenticity, and the discriminator gives feedback to enhance the generator's ability to generate high-resolution saliency maps. By iteratively training the discriminator and generator networks, the model achieves improved results in finding the salient object.
We trained and validated our model using large-scale benchmark datasets commonly used for salient object detection, namely DUTS, ECSSD, and DUT-OMRON. Our approach was evaluated using standard performance metrics on these datasets. Precision, recall, MAE and Fβ score metrics are used to evaluate performance. Our method achieved the lowest MAE values: 0.0292 on the ECSSD dataset, 0.033 on the DUTS-TE dataset, and 0.0439 on the challenging and complex DUT-OMRON dataset, compared to other state-of-the-art methods. Our proposed method demonstrates significant improvements in salient object detection, highlighting its potential benefits for real-life applications.
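The evaluation metrics listed in this abstract can be sketched as below. The sketch assumes saliency maps with values in [0, 1], binary ground truth, a fixed binarisation threshold of 0.5 and β² = 0.3; the threshold, the β² value, the function names and the toy maps are assumptions for illustration, not settings reported in the paper.

```python
def mean_absolute_error(saliency, truth):
    """Mean absolute per-pixel difference between a saliency map and ground truth."""
    total = 0.0
    count = 0
    for row_s, row_t in zip(saliency, truth):
        for s, t in zip(row_s, row_t):
            total += abs(s - t)
            count += 1
    return total / count


def precision_recall_fbeta(saliency, truth, threshold=0.5, beta_sq=0.3):
    """Binarise the map, then compute precision, recall and the F-beta score."""
    tp = fp = fn = 0
    for row_s, row_t in zip(saliency, truth):
        for s, t in zip(row_s, row_t):
            predicted = s >= threshold
            if predicted and t:
                tp += 1
            elif predicted and not t:
                fp += 1
            elif not predicted and t:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = beta_sq * precision + recall
    fbeta = (1 + beta_sq) * precision * recall / denom if denom else 0.0
    return precision, recall, fbeta


# Toy 2x3 saliency map and ground truth (illustrative values only).
saliency_map = [[0.9, 0.8, 0.1],
                [0.6, 0.2, 0.0]]
ground_truth = [[1, 1, 0],
                [0, 1, 0]]

mae = mean_absolute_error(saliency_map, ground_truth)
precision, recall, fbeta = precision_recall_fbeta(saliency_map, ground_truth)
```

A lower MAE means the map is closer to the ground truth overall, while Fβ with β² below 1 weights precision more heavily than recall.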
Enhancing Salient Object Detection with Supervised Learning and Multi-prior Integration
Dhara G., Kumar R.K.
Article, Journal of Image and Graphics (United Kingdom), 2024, DOI Link
Salient Object Detection (SOD) can mimic the human vision system by using algorithms that simulate the way the eye detects and processes visual information. It focuses mainly on the visually distinctive parts of an image, similar to how the human brain processes visual information. The approach proposed in this study is an ensemble approach that incorporates a classification algorithm, foreground connectivity and prior calculations. It involves a series of preprocessing, feature generation, selection, training, and prediction steps using random forest to identify and extract salient objects in an image as a first step. Next, an object proposals map is created for the foreground object. Subsequently, a fusion map is generated using boundary, global, and local contrast priors. In the feature generation step, different edge filters are implemented because the saliency score at edges will be high; additionally, texture-based features are calculated using Gabor filters. The Boruta feature selection algorithm is then used to identify the most appropriate and discriminative features, which helps to reduce the computational time required for feature selection. Ultimately, the initial map obtained from the random forest, along with the fusion saliency maps based on foreground connectivity and prior calculations, is merged to produce a saliency map. This map is then refined using post-processing techniques to acquire the final saliency map. The approach we propose surpasses the performance of 17 cutting-edge techniques across three benchmark datasets, showcasing superior results in terms of precision, recall, and f-measure. The proposed method performs well even on the DUT-OMRON dataset, known for its multiple salient objects and complex backgrounds, achieving a Mean Absolute Error (MAE) value of 0.113.
The method also demonstrates high recall values (0.862, 0.923, 0.849 for ECSSD, MSRA-B and DUT-OMRON datasets, respectively) across all datasets, further establishing its suitability for salient object detection.
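The evaluation measures named in this abstract (precision, recall, F-measure and MAE) can be illustrated with a minimal sketch. It assumes a saliency map with values in [0, 1] and a binary ground-truth mask, both given as nested lists; the fixed binarization threshold of 0.5 and the weighting beta squared = 0.3 are illustrative assumptions, not values taken from the paper.

```python
def mean_absolute_error(saliency, ground_truth):
    """MAE between a saliency map in [0, 1] and a binary ground-truth mask."""
    total = 0.0
    count = 0
    for sal_row, gt_row in zip(saliency, ground_truth):
        for s, g in zip(sal_row, gt_row):
            total += abs(s - g)
            count += 1
    return total / count


def precision_recall_f(saliency, ground_truth, threshold=0.5, beta_sq=0.3):
    """Binarize the map at `threshold` (assumed), then score it against the mask."""
    tp = fp = fn = 0
    for sal_row, gt_row in zip(saliency, ground_truth):
        for s, g in zip(sal_row, gt_row):
            predicted = s >= threshold
            if predicted and g == 1:
                tp += 1
            elif predicted and g == 0:
                fp += 1
            elif not predicted and g == 1:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Weighted F-measure; beta_sq is an assumed weighting, not from the paper.
    denom = beta_sq * precision + recall
    f_measure = (1 + beta_sq) * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure
```

A lower MAE and a higher F-measure both indicate a saliency map closer to the ground truth.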
A novel multiscale cGAN approach for enhanced salient object detection in single haze images
Dhara G., Kumar R.K.
Article, Eurasip Journal on Image and Video Processing, 2024, DOI Link
In computer vision, image dehazing is a low-level task that employs algorithms to analyze and remove haze from images, resulting in haze-free visuals. The aim of Salient Object Detection (SOD) is to locate the most visually prominent areas in images. However, most SOD techniques applied to visible images struggle in complex scenarios characterized by similarities between the foreground and background, cluttered backgrounds, adverse weather conditions, and low lighting. Identifying objects in hazy images is challenging due to the degradation of visibility caused by atmospheric conditions, leading to diminished visibility and reduced contrast. This paper introduces an innovative approach called Dehaze-SOD, a unique integrated model that addresses two vital tasks: dehazing and salient object detection. The key novelty of Dehaze-SOD lies in its dual functionality, seamlessly integrating dehazing and salient object identification into a unified framework. This is achieved using a conditional Generative Adversarial Network (cGAN) comprising two distinct subnetworks: one for image dehazing and another for salient object detection. The first module, designed with residual blocks, Dark Channel Prior (DCP), total variation, and the multiscale Retinex algorithm, processes the input hazy images. The second module employs an enhanced EfficientNet architecture with added attention mechanisms and pixel-wise refinement to further improve the dehazing process. The outputs from these subnetworks are combined to produce dehazed images, which are then fed into our proposed encoder–decoder framework for salient object detection. The cGAN is trained with two modules working together: the generator aims to produce haze-free images, whereas the discriminator distinguishes between the generated haze-free images and real haze-free images. 
Dehaze-SOD demonstrates superior performance compared to state-of-the-art dehazing methods in terms of color fidelity, visibility enhancement, and haze removal. The proposed method effectively produces high-quality, haze-free images from various hazy inputs and accurately detects salient objects within them. This makes Dehaze-SOD a promising tool for improving salient object detection in challenging hazy conditions. The effectiveness of our approach has been validated using benchmark evaluation metrics such as mean absolute error (MAE), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM).
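The Dark Channel Prior mentioned in this abstract can be sketched in a few lines. This is only the standard dark channel computation (minimum over color channels, then minimum over a local patch), not the Dehaze-SOD network itself; the image layout (nested lists of RGB tuples in [0, 1]) and the `patch_radius` parameter are assumptions made for illustration.

```python
def dark_channel(image, patch_radius=1):
    """Dark channel of an H x W image given as rows of (r, g, b) tuples in [0, 1]."""
    height, width = len(image), len(image[0])
    # Step 1: per-pixel minimum across the three color channels.
    min_rgb = [[min(pixel) for pixel in row] for row in image]
    # Step 2: minimum of that map over a square patch around each pixel.
    dark = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            y0, y1 = max(0, y - patch_radius), min(height, y + patch_radius + 1)
            x0, x1 = max(0, x - patch_radius), min(width, x + patch_radius + 1)
            dark[y][x] = min(
                min_rgb[j][i] for j in range(y0, y1) for i in range(x0, x1)
            )
    return dark
```

In haze-free outdoor images the dark channel tends to be close to zero, so larger dark channel values are commonly read as a sign of haze.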
Experimental Stress and Vibration Analysis of Hybrid Composite Laminated Cracked Beam
Kumar R.K., Khan K.
Article, NanoWorld Journal, 2023, DOI Link
In this paper, the fundamental mode of free vibration of hybrid laminated cracked composite beams is analyzed. The beams are made by hand layup. Glass and carbon are used as fibers, and epoxy and resin are used as matrix materials. The material properties are determined on a Universal Testing Machine. Experimental stress and modal analysis of carbon and glass fiber reinforced cracked hybrid laminated beams has been carried out for fixed-fixed and fixed-free beams. The [0°-90°-0°-90°] lamination scheme and the [C-G-G-C] and [G-C-C-G] compositions have been used. LabVIEW software was used to perform the experiment. Strain gauges are used for stress analysis, and an accelerometer is used for modal analysis. The natural frequencies for all the cases have been determined. A finite element model has been developed using ANSYS. The experimental data are compared with the numerical results obtained from ANSYS. For the compositions [C-G-G-C] and [G-C-C-G], the natural frequency drops as the number of cracks rises. Additionally, it has been found that both the laminated compositions [C-G-G-C] and [G-C-C-G] experience an increase in stress, strain, and deflection when the number of cracks increases. Maximum stress occurs at the fixed end if there is no crack; if a crack is present, it is located close to the fixed end, and as the number of cracks rises, the site of maximum stress shifts.
Study and analysis of visual saliency applications using graph neural networks
Dhara G., Kumar R.K.
Book chapter, Concepts and Techniques of Graph Neural Networks, 2023, DOI Link
GNNs (graph neural networks) are deep learning algorithms that operate on graphs. A graph's unique ability to capture structural relationships among data gives more insight than analyzing data in isolation. GNNs have numerous applications in different areas, including computer vision. In this chapter, the authors investigate the application of GNNs to common computer vision problems, specifically visual saliency, salient object detection, and co-saliency. A thorough overview of numerous visual saliency problems that have been resolved using graph neural networks is presented in this chapter. The different research approaches that used GNNs to find saliency and co-saliency between objects are also analyzed.
Parallel Big Bang-Big Crunch-LSTM Approach for Developing a Marathi Speech Recognition System
Sharma A., Bachate R.P., Singh P., Kumar V., Kumar R.K., Singh A., Kadariya M.
Article, Mobile Information Systems, 2022, DOI Link
The Voice User Interface (VUI) for human-computer interaction has received wide acceptance, and as a result speech recognition systems for regional languages are now being developed, taking into account all of the dialects. Because of the limited availability of speech corpora (SC) for regional languages, designing a speech recognition system is challenging. This contribution provides a Parallel Big Bang-Big Crunch (PB3C)-based mechanism to automatically evolve the optimal architecture of an LSTM (Long Short-Term Memory) network. To decide the optimal architecture, we evolved the number of neurons and hidden layers of the LSTM model. We validated the proposed approach on a Marathi speech recognition system. In this research work, the performance of the proposed method is compared with BBBC-based LSTM and manually configured LSTM. The results indicate that the proposed approach is better than the other two approaches.
Improving performance of classifiers for diagnosis of critical diseases to prevent COVID risk
Kumar V., Lalotra G.S., Kumar R.K.
Article, Computers and Electrical Engineering, 2022, DOI Link
The risk of developing COVID-19 and its variants may be higher in those with pre-existing health conditions such as thyroid disease, Hepatitis C Virus (HCV), breast tissue disease, chronic dermatitis, and other severe infections. Early and precise identification of these disorders is critical. A huge number of patients in nations like India require early and rapid testing as a preventative measure. The problem of imbalance arises from the skewed nature of the data, in which instances from the majority class are classified correctly while instances from the minority class are misclassified by many classifiers. When it comes to human life, this kind of misclassification is unacceptable. To solve the misclassification issue and improve accuracy on such datasets, we applied a variety of data balancing techniques to several machine learning algorithms. The outcomes are encouraging, with a considerable increase in accuracy. With such accurate diagnoses, we can make plans and take the required actions to stop patients from acquiring serious health issues or viral infections.
Water Body Identification from the Satellite Images using Color Component Analysis with Morphological Operations
Jagruth K., Manikandan V.M., Kumar R.K.
Conference paper, 2021 12th International Conference on Computing Communication and Networking Technologies, ICCCNT 2021, 2021, DOI Link
Many countries, including India, are frequently affected by natural disasters such as floods. In general, predicting natural disasters accurately is very difficult, but advanced technologies can be used to overcome such difficulties or to reduce the impact of natural disasters. Satellite image processing is an efficient way to detect water bodies on the earth's surface, which may help the agriculture industry or help identify flooded regions. In this paper, we propose a scheme to identify water bodies in satellite images, which will be useful for various applications. During our research, we created a set of water body images by cropping satellite images. The properties of the water body regions were analyzed using an algorithm, and a set of possible threshold values was computed for the pixels representing water bodies. The threshold values obtained from the analysis of the 'water body images' are used in the proposed algorithm to identify water bodies in any given image. A sequence of morphological operations is introduced to refine the results obtained through pixel color component analysis. The result analysis was carried out on a set of satellite images, and the scheme achieved good results.
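The two stages described in this abstract, per-channel color thresholding followed by morphological refinement, can be sketched as follows. The threshold ranges, the 3x3 structuring element and the choice of a single opening step are assumptions for illustration; the abstract does not state the actual values or the exact sequence of operations.

```python
def threshold_mask(image, lower, upper):
    """Mark pixels whose (r, g, b) components all fall inside per-channel ranges."""
    return [
        [
            int(all(lo <= c <= hi for c, lo, hi in zip(pixel, lower, upper)))
            for pixel in row
        ]
        for row in image
    ]


def _morph(mask, op):
    """Erosion (op=min) or dilation (op=max) with a 3x3 square element."""
    height, width = len(mask), len(mask[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Neighbours outside the image are ignored.
            window = [
                mask[j][i]
                for j in range(y - 1, y + 2)
                for i in range(x - 1, x + 2)
                if 0 <= j < height and 0 <= i < width
            ]
            out[y][x] = op(window)
    return out


def opening(mask):
    """Erosion followed by dilation: removes small isolated detections."""
    return _morph(_morph(mask, min), max)
```

Opening is used here because isolated pixels that happen to pass the color test are removed while compact regions survive.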
BAT algorithm based feature selection: Application in credit scoring
Tripathi D., Ramachandra Reddy B., Padmanabha Reddy Y.C.A., Shukla A.K., Kumar R.K., Sharma N.K.
Article, Journal of Intelligent and Fuzzy Systems, 2021, DOI Link
Credit scoring plays a vital role for financial institutions in estimating the risk associated with an applicant for a credit product. It is estimated based on the applicant's credentials and directly affects the viability of the issuing institution. However, there may be a large number of irrelevant features in a credit scoring dataset. Due to irrelevant features, credit scoring models may suffer from poorer classification performance and higher complexity. Removing redundant and irrelevant features may therefore overcome the problem of a large number of features. In this work, we emphasize the role of feature selection in enhancing the predictive performance of a credit scoring model. For feature selection, the Binary BAT optimization technique is used with a novel fitness function. The proposed approach is then combined with 'Radial Basis Function Neural Network (RBFN)', 'Support Vector Machine (SVM)' and 'Random Forest (RF)' for classification. The proposed approach is validated on four benchmark credit scoring datasets obtained from the UCI repository. Further, a comprehensive analysis of the experimental results compares the classification performance obtained with features selected by various approaches and with other state-of-the-art approaches for credit scoring.
Constraint saliency based intelligent camera for enhancing viewers attention towards intended face
Kumar R.K., Garain J., Kisku D.R., Sanyal G.
Article, Pattern Recognition Letters, 2020, DOI Link
Visual saliency decides the focus of attention towards a region in a scene. When we attend to faces in a crowd or in a set of multiple faces, our attention is not distributed equally across all the faces. The bias of the human visual system towards a particular face occurs because some of its features dominate those of the other faces. So, for faces that are not intrinsically salient in a scene, attentiveness needs to be increased. The current study attempts to improve the saliency of a face (in a set of multiple faces) that is not significantly salient. This can be achieved by enhancing the contrast of the intended face with its surrounding faces in terms of low-level visual features such as intensity and color. Modifying such feature values changes the face's potential to attract the observer's gaze, but excessive change destroys the originality of the image. Therefore, the problem of enhancing the saliency of a target face is framed as an optimization (maximization) problem under some constraints. This concept can be applied to develop a saliency-based intelligent camera capable of enhancing the attractiveness of a particular face in the crowd, so that the enhanced face may draw more attention from viewers of the captured photograph. Experiments have been conducted on grayscale as well as colour images. Moreover, the effect of saliency on faces wearing jewellery has also been measured.
Enhancing face recognition through overlaying training images
Sinha A.S., Rahman A.U., Kumar R.K., Sanyal G.
Conference paper, 2019 2nd International Conference on Advanced Computational and Communication Paradigms, ICACCP 2019, 2019, DOI Link
In the usual face recognition approach, the system is trained on a large number of training samples. This means that, during training, features are extracted from all the training images individually. In this process, many redundant features also need to be eliminated. During feature elimination, some features also get suppressed due to inappropriate thresholds. So, this approach is typically time consuming and costly in the training stage. Hence, features need to be extracted in a way that reduces data redundancy and system complexity. This paper presents a facial recognition technique that includes a superimposed version of all relevant images, which improves the accuracy of the model by roughly 43 percent. The algorithm aims to establish the importance of the superimposition strategy in the field of face recognition. A Haar feature-based classifier is used, where a cascade function is trained from a set of images. We have used the open-source database of faces from the archives of AT&T Laboratories Cambridge to train and test our model.
Bezier Cohort Fusion in Doubling States for Human Identity Recognition with Multifaceted Constrained Faces
Garain J., Mishra S.R., Kumar R.K., Kisku D.R., Sanyal G.
Article, Arabian Journal for Science and Engineering, 2019, DOI Link
Cohort selection benefits a biometric system by providing information collected from non-match templates, whereas fusion benefits a system by combining information collected from different sources or from the same source in different ways. The benefits of both approaches are exploited here by proposing a cohort selection technique that is applied before and after the fusion of matching scores for a face recognition system. Two robust facial features, viz. scale-invariant feature transform and speeded up robust features, are used here. This study presents a novel way of fusion based on cohort selection, unlike the traditional levels of fusion (i.e., sensor, feature, match score, rank and decision level fusions). Cohort-based fusion is performed in two different fashions: pre-cohort fusion and post-cohort fusion. In pre-cohort (early) fusion, fusion rules such as the sum, max, min and average rules are applied before cohort selection is performed. In contrast, cohort selection is followed by fusion in post-cohort (late) fusion. The union operation is applied as the late fusion rule. The matching scores are normalized by the T-norm cohort score normalization technique before being compared with the threshold value that governs the acceptance decision of the system. The experiments are carried out on the FEI and the Look-alike (IIIT Delhi) face databases. The outcomes of the proposed method are encouraging and convincing compared with non-cohort systems and state-of-the-art methods.
Guiding attention of faces through graph based visual saliency (GBVS)
Kumar R.K., Garain J., Kisku D.R., Sanyal G.
Article, Cognitive Neurodynamics, 2019, DOI Link
In a general scenario, while attending to a scene containing multiple faces or looking at a group photograph, our attention is not distributed equally across all the faces. This means that we are naturally biased towards some faces. This bias arises from the presence of dominant perceptual features in those faces. In visual saliency terminology, such a face can be called a ‘salient face’. Humans focus their gaze on the face that carries the ‘dominating look’ in the crowd. This happens due to the comparative saliency of the faces. The saliency of a face is determined by its feature dissimilarity with the surrounding faces. Human psychology and cognitive science also play a big role in this context. Therefore, a great deal of research has been carried out on modeling computer vision systems after human vision. This paper proposes a graph-based bottom-up approach to highlight the salient face in a crowd or in an image containing multiple faces. In this novel method, the visual saliency of faces is calculated based on intensity values, facial areas and their relative spatial distances. Experiments have been conducted on grayscale images. To verify the experiments, three levels of validation have been performed. In the first level, our results are verified against the prepared ground truth. In the second level, the intensity scores of the proposed saliency maps are cross-verified with the saliency score. In the third level, the saliency map is validated with some standard parameters. The results are found to be interesting, and in some aspects the saliency predictions resemble the human vision system. The evaluation of the proposed approach shows moderately improved results, and hence this idea can be useful in the future modeling of intelligent vision (robot vision) systems.
Addressing facial dynamics using k-medoids cohort selection algorithm for face recognition
Garain J., Kumar R.K., Kisku D.R., Sanyal G.
Article, Multimedia Tools and Applications, 2019, DOI Link
Face recognition is itself a very challenging task, and it becomes more challenging when the input images exhibit large intra-class variations and inter-class similarities. Yet the recognition accuracy can be improved to some extent by supporting the system with non-matched templates. Therefore, a set of cohort images is used for this purpose. However, not all the cohort templates of the initial cohort pool may be relevant for each enrolled subject. So the main focus of this work is to select a subject-specific and meaningful cohort subset. This paper proposes a cohort selection method called K-medoids Cohort Selection (KMCS) to select a reference set of non-matched templates that are well suited to the respective subjects. All cohort scores of a subject are first clustered using K-medoids clustering. Afterward, the cluster whose members/scores are most scattered from its medoid is selected as the cohort subset, because this cluster consists of the cohorts carrying more discriminative features than the others. SIFT points and SURF points are extracted as facial features. The experiments are conducted on the FEI, ORL and Look-alike databases of face images. The matching scores between probe and query images are normalized using the T-norm, Max-Min and Aggarwal (Max rule) cohort score normalization techniques before the final decision of acceptance or rejection is taken. The results obtained from the experiments show that the proposed system outperforms the non-cohort face recognition system as well as the random and Top 10 cohort selection methods. A further comparative study between k-means and K-medoids clustering for cohort selection is also presented.
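The selection rule described in this abstract (cluster the cohort scores with K-medoids, then keep the cluster most scattered around its medoid) might look roughly like the sketch below. It works on one-dimensional scores; the deterministic initialization, the use of mean absolute distance as the scatter measure, and the function names are assumptions, not details from the paper.

```python
def _assign(scores, medoids):
    """Put every score into the cluster of its nearest medoid."""
    clusters = [[] for _ in medoids]
    for s in scores:
        nearest = min(range(len(medoids)), key=lambda i: abs(s - medoids[i]))
        clusters[nearest].append(s)
    return clusters


def k_medoids_1d(scores, k, max_iter=100):
    """Simple alternating K-medoids on scalar scores."""
    ordered = sorted(scores)
    n = len(ordered)
    # Assumed initialization: medoids spread evenly over the sorted scores.
    medoids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(max_iter):
        clusters = _assign(scores, medoids)
        updated = [
            min(members, key=lambda m: sum(abs(m - o) for o in members))
            if members else medoids[i]
            for i, members in enumerate(clusters)
        ]
        if updated == medoids:
            break
        medoids = updated
    return medoids, _assign(scores, medoids)


def select_cohort_subset(scores, k=2):
    """Return the cluster with the largest mean distance from its medoid."""
    medoids, clusters = k_medoids_1d(scores, k)

    def scatter(i):
        members = clusters[i]
        if not members:
            return -1.0
        return sum(abs(m - medoids[i]) for m in members) / len(members)

    return clusters[max(range(k), key=scatter)]
```

For example, with scores `[0.10, 0.11, 0.12, 0.40, 0.55, 0.70]` and `k=2`, the tight low-score cluster is discarded and the widely spread cluster is returned.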
Combined effect of cohort selection and decision level fusion in a face biometric system
Garain J., Kumar R.K., Kumar D., Kisku D.R., Sanyal G.
Conference paper, Advances in Intelligent Systems and Computing, 2018, DOI Link
Variations in several parameters degrade the performance of a face biometric system. Baseline biometric systems can be relieved of this negative effect to some extent by utilizing the information in cohort images and fusion methods. However, obtaining a set of suitable cohorts for each enrolled person is a very challenging task. This paper presents a method that determines the cohort subset using k-means clustering cohort selection based on matching proximity. SIFT and SURF are used as facial features to represent each face image and to calculate the similarity score between two face images. The clusters having the highest and lowest centroid values are fused using the union rule to form the target, user-dependent cohort subset. The query-claimed matching scores are normalized with the T-norm cohort normalization technique. The normalized scores are used in recognition separately for SIFT and SURF. Finally, the responses from the classifier for these two features are fused at decision level to cover any shortcomings of the cohort selection method. The experiments are carried out on the FEI face database. This integrated face biometric system achieves a significant gain in performance, which demonstrates its effectiveness over the baseline.
Image Specific Cross Cohort Normalization for Face Pair matching
Garain J., Kumar R.K., Kumar D., Kisku D.R., Sanyal G.
Conference paper, Procedia Computer Science, 2018, DOI Link
Image matching, or face pair matching, is a distinctly different problem from other problems in computer vision and pattern recognition. It is a very active and challenging topic because the matching expert has no prior information about the input images to be matched. An additional set of images can resolve this problem to some extent. In this context, a cohort-based face pair matching system is proposed. Initially the cohort set is common to all images, but finally a subset of cohort images specific to each of the paired images is selected. Here Max-Min-Centroid-Cluster (MMCC) is applied, which is capable of choosing highly relevant cohorts corresponding to the target images. The raw similarity score between the input images is normalized with these sets of cohort scores to obtain two normalized matching scores. Afterwards, the closeness between the images is measured by cross cohort normalization. The absolute difference of these two cross-normalized scores is calculated and compared with a threshold value to decide whether the input images belong to the same person or to different persons. The experiment has been conducted on the ORL face database, and the results provide evidence that the proposed system is efficient.
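The score normalization step in this abstract can be illustrated with a small sketch. It uses the T-norm formula (raw score minus the cohort mean, divided by the cohort standard deviation) as a stand-in, since the abstract does not name the exact normalization; the use of the population standard deviation and the function names are likewise assumptions.

```python
from statistics import mean, pstdev


def t_norm(raw_score, cohort_scores):
    """T-norm: centre and scale a raw score by the cohort score statistics."""
    spread = pstdev(cohort_scores)  # population std is an assumed choice
    if spread == 0:
        raise ValueError("cohort scores have zero variance")
    return (raw_score - mean(cohort_scores)) / spread


def cross_normalized_difference(raw_score, cohort_scores_a, cohort_scores_b):
    """Normalize one raw pair score against each image's cohort subset
    and return the absolute difference of the two normalized scores."""
    score_a = t_norm(raw_score, cohort_scores_a)
    score_b = t_norm(raw_score, cohort_scores_b)
    return abs(score_a - score_b)
```

The returned difference would then be compared with a threshold chosen on validation data; the abstract does not say which side of the threshold corresponds to "same person", so that decision is left to the caller.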
A master map: An alternative approach to explore human’s eye fixation for generating ground truth based on various state-of-the-art techniques
Kumar R.K., Garain J., Kisku D.R., Sanyal G.
Conference paper, Lecture Notes in Electrical Engineering, 2018, DOI Link
A saliency map is an efficient way to represent the salient objects in an image. Several studies have been carried out in the area of object-based saliency. These works use normal input images in which the salient region or arousal object is easily perceived. These experiments are validated against ground truth obtained from volunteers, either through a human eye fixation machine or through viewers' voting. For complex images, however, salient locations are very confusing, and preparing ground truth is therefore very difficult. For such images, results vary across different state-of-the-art saliency models. To address this problem, this paper implements combination strategies to obtain a composite saliency map that incorporates the properties of every individual component map. Fusing saliency maps in this way can generate good ground truth information, offering an alternative to preparing ground truth through volunteers' voting or an eye fixation machine. Because this approach incorporates the concept of reusability, it may reduce the time and cost of preparing ground truth.
Attending prominent face in the set of multiple faces through relative visual saliency
Kumar R.K., Garain J., Kisku D.R., Sanyal G.
Conference paper, Advances in Intelligent Systems and Computing, 2018, DOI Link
Visual saliency determines the extent of attentiveness of a region in a scene. In the context of attending to faces in a crowd, the components of a face and its dominant features decide the focus of attention. Attention boosts the recognition and identification process in a crowd and hence plays an important role in the areas of visual surveillance and robotic vision. Using different computer vision-based techniques, a great deal of research has been carried out on attention, recognition, and identification of the human face in the context of different applications. This paper proposes a novel technique to analyze and explore the prominent face in a set of multiple faces (crowd). The proposed method builds the solution on the concept of relative visual saliency, which has been evaluated on various parameters of the face as a whole and of its components. These parameters are face area, spatial location, intensity, hue, RGB values, etc. The proposed work furnishes satisfactory results. The assessment made with this approach shows quite encouraging results, which may lead to a future model for robotic vision and intelligent decision-making systems.
A bezier curve cohort selection strategy for face pair matching
Garain J., Kumar R.K., Kisku D.R., Sanyal G.
Conference paper, ACM International Conference Proceeding Series, 2018, DOI Link
Matching two face images without any prior information is a very challenging task, unlike a verification or identification system where some knowledge about the images of each subject is already stored in the system's database. This paper proposes a methodology to improve the performance of a face pair matching system by utilizing the complementary information collected from a set of cohort face images with the help of a Bezier curve cohort selection algorithm. A pair of face images is given as input to the system. Each image is compared with a predefined cohort pool to form two separate sets of cohort scores. These sets of cohort scores are then passed through the Bezier curve cohort selection method, which provides two suitable cohort subsets. Afterwards, cross normalization is performed in conjunction with the T-norm score normalization method, and then the absolute normalized difference between the paired face images is determined. On the basis of this normalized difference, it is finally decided whether the input face pair is from the same person or not. The system is evaluated on the FEI face database, and the results are quite impressive.
Estimating attention of faces due to its growing level of emotions
Kumar R.K., Garain J., Kisku D.R., Sanyal G.
Conference paper, IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2018, DOI Link
When attending to faces in a disciplined assembly (such as an examination hall or a silent public place), our gaze automatically goes towards those persons who exhibit an expression other than the normal expression. This happens because a dissimilar expression stands out among a gathering of normal expressions. Few effective studies have succeeded in modeling this concept in the intelligent vision of a computer system. Therefore, in this proposal we have tried to come up with a solution for handling this challenging computer vision task. This problem is related to the cognitive aspect of visual attention. In the visual saliency literature, authors have dealt with expressionless objects, but saliency has not been addressed for objects such as faces, which exhibit expressions. Visual saliency is a term that differentiates 'appealing' visual substance from others based on their feature differences. In this paper, the 'salient face' in a set of multiple faces is explored based on its 'emotion deviation' from the normal. In the first phase of the experiment, the face detection task is accomplished using the Viola-Jones face detector. A deep convolutional neural network (CNN) is applied for training and classification of different facial expressions of emotion. Moreover, the saliency score of every face in the input image is computed by measuring its 'emotion score', which depends upon the deviation from the 'normal expression' scores. The proposed approach exhibits fairly good results, which may give researchers a new direction towards modeling an intelligent vision system that can be useful in visual security and surveillance tasks.
BCP-BCS: Best-fit cascaded matching paradigm with cohort selection using bezier curve for individual recognition
Garain J., Shah A., Kumar R.K., Kisku D.R., Sanyal G.
Conference paper, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2017, DOI Link
Cohort selection has emerged as a very interesting and promising topic for ongoing research in biometrics. It can give traditional biometric systems a higher performance rate with lower complexity and cost. This paper describes a novel matching technique combined with Bezier curve cohort selection. Best-Fit matching with a dynamic threshold is proposed here to reduce the number of false matches. This algorithm is applied to match Speeded Up Robust Feature (SURF) points detected on face images in order to find the matching score between two faces. After that, a Bezier curve is applied as a cohort selection technique. All the cohort scores are plotted in a 2D plane as if they were the control points of a Bezier curve, and then a Bezier curve of degree n is plotted on the same plane using the De Casteljau algorithm, where the number of control points is n + 1. The farther a template's score point lies from the curve, the more discriminative features the template contains. All the templates whose score points lie far from the curve are included in the cohort subset. A specific cohort subset is determined for each enrolled user. Once the subset is formed, the T-norm cohort score normalization technique is applied to obtain the normalized scores, which are further used for person identification and verification. Experiments are conducted on the FEI face database, and the results show dominance over the non-cohort system.
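The Bezier curve selection idea in this abstract can be sketched as follows. The De Casteljau evaluation is standard; everything else is an assumption made for illustration: scores are placed at x = index, the distance to the curve is approximated by dense sampling, and `min_distance` is a hypothetical cut-off that the abstract does not specify.

```python
import math


def de_casteljau(control_points, t):
    """Evaluate a Bezier curve at parameter t by repeated linear interpolation."""
    points = list(control_points)
    while len(points) > 1:
        points = [
            ((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
            for (x0, y0), (x1, y1) in zip(points, points[1:])
        ]
    return points[0]


def distance_to_curve(point, control_points, samples=200):
    """Approximate the distance from a point to the curve by sampling t in [0, 1]."""
    return min(
        math.dist(point, de_casteljau(control_points, i / samples))
        for i in range(samples + 1)
    )


def select_cohort(scores, min_distance):
    """Return indices of cohort scores lying farther than min_distance from the curve."""
    # Assumed layout: the i-th score becomes the control point (i, score).
    control = [(float(i), s) for i, s in enumerate(scores)]
    return [
        i
        for i, point in enumerate(control)
        if distance_to_curve(point, control) > min_distance
    ]
```

With n + 1 scores the control polygon defines a curve of degree n, and the curve always passes through the first and last control points, so those two are never selected under this layout.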
A novel approach to attend faces in the crowd through relative visual saliency
Das A., Kumar R.K., Kisku D.R., Sanyal G.
Conference paper, IEEE Region 10 Annual International Conference, Proceedings/TENCON, 2017, DOI Link
Visual saliency plays an important role in attending to faces in a crowd. While considering faces in a crowd, humans are inclined towards certain faces due to the dominance of some features in those faces, such as color, texture, intensity, and geometry. In this paper, we propose a novel method to analyze the saliency of faces in a crowd face image based on four parameters, namely feature difference (intensity difference), spatial distance, the area of each face, and camera distance, i.e. the distance of each face in the crowd from the viewer's point. To the best of our knowledge, this area has not yet been explored by researchers to its full capacity. The experimental results have been found motivating. This method can find future application in artificially intelligent systems, and cognitive models that mimic the ability of the human vision system can be established based on this theory.
Emotion recognition through facial gestures – a deep learning approach
Mishra S., Prasada G.R.B., Kumar R.K., Sanyal G.
Conference paper, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2017, DOI Link
As defined by some theorists, human emotions are discrete and consistent responses to internal or external events which have significance for an organism. They constitute a major part of our non-verbal communication. Among the human emotions, happy, sad, fear, anger, surprise, disgust and neutral are the seven basic emotions. Facial expressions are the best way to exhibit emotions. In this era of booming human-computer interaction, enabling machines to recognize these emotions is a paramount task. Every facial expression carries an amalgamation of emotions. In this paper, we identify the different emotions and their intensity levels in a human face by implementing a deep learning approach through our proposed Convolution Neural Network (CNN). The architecture and the algorithm yield appreciable results that can motivate further research in computer-based emotion recognition systems.
Determine attention of faces through growing level of emotion using deep Convolution Neural Network
Kumar R.K., Kumar G.A.R., Garain J., Kisku D.R., Sanyal G.
Conference paper, 2017 International Conference on Intelligent Computing, Instrumentation and Control Technologies, ICICICT 2017, 2017, DOI Link
Facial emotions are internal human feelings that are reflected on the face as specific expressions. They appear naturally rather than through forced effort and are accompanied by physical variations in the facial muscles that produce various expressions on the face. Standard emotions include happiness, anger, sadness, fear, surprise and disgust. Facial expressions play an essential role in non-verbal communication. Much research has been carried out on modeling intelligent computer vision that can recognize human emotion. However, research that goes beyond 'just emotion prediction', for example 'emotion attention' or 'salient emotion', has not been covered much in the literature. In this paper, we compute the attention scores of the same emotion at various levels. The results show that a higher expression level of an emotion attracts more attention. The experiment was performed on different levels of the same emotions (frame by frame) using a deep Convolution Neural Network (CNN), and we analyzed how the saliency of a face changes from a low level to a high level of the same emotion. The FER-2013 and CK+ databases were used for training and testing respectively. The proposed approach delivers fairly good results and may inspire researchers to model intelligent vision systems that go beyond emotion recognition.
Discriminating real from fake smile using convolution neural network
Kumar G.A.R., Kumar R.K., Sanyal G.
Conference paper, ICCIDS 2017 - International Conference on Computational Intelligence in Data Science, Proceedings, 2017, DOI Link
In society, we sometimes hide our genuine feelings and emotions and purposely express a different emotion in front of the people around us. Because such an expression is not a natural emotion, others can more or less detect it. The human vision system has an enormous capability to distinguish a genuine smile from a fake one. Discriminating genuine from fake smiles is a thought-provoking task, yet very little research has been carried out on this topic. In this paper, we explore a method to distinguish real from fake smiles with high precision using convolution neural networks (CNN). The system was trained on the FERC-2013 dataset, which has seven types of emotions: happy, sad, disgust, angry, fearful, surprised and neutral. The emotion detection system records the emotion percentages of real and fake faces. Based on the recorded scores, we investigate the effect of the various emotion percentages present on both faces and then classify the smile on a face as real or fake.
Facial emotion analysis using deep convolution neural network
Kumar G.A.R., Kumar R.K., Sanyal G.
Conference paper, Proceedings of IEEE International Conference on Signal Processing and Communication, ICSPC 2017, 2017, DOI Link
Human emotions are mental states of feeling that arise spontaneously rather than through conscious effort and are accompanied by physiological changes in the facial muscles, which produce expressions on the face. Some of the critical emotions are happiness, sadness, anger, disgust, fear and surprise. Facial expressions, which arise from a person's internal feelings reflected on the face, play a key role in non-verbal communication. Plenty of research has been carried out on computer modeling of human emotion, but it is still far behind the human vision system. In this paper, we provide a better approach to predict human emotions (frame by frame) using a deep Convolution Neural Network (CNN) and to track how emotion intensity changes on a face from a low level to a high level of emotion. The FERC-2013 database was used for training. The proposed experiment gives quite good results, and the accuracy obtained may encourage researchers working on future models of computer-based emotion recognition systems.
Selection of user-dependent cohorts using bezier curve for person identification
Garain J., Kumar R.K., Kisku D.R., Sanyal G.
Conference paper, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2016, DOI Link
Traditional biometric systems can be strengthened further by exploiting the concept of cohort selection to meet organizations' high demands for a robust automated person identification system. This has motivated researchers to develop robust biometric systems using cohort selection. This paper proposes a novel user-dependent cohort selection method using a Bezier curve. It uses the invariant SIFT descriptor to generate matching point pairs between a pair of face images. For each subject, all the imposter scores are then taken as control points, and a Bezier curve of degree n is plotted by applying the De Casteljau algorithm. With the imposter scores serving as the control points of the curve, a cohort subset is formed from the points determined to be far from the Bezier curve. The T-norm cohort normalization technique is applied to obtain the normalized cohort scores, which are then used in recognition. The experiment is conducted on the FEI face database. The superior performance of this novel cohort selection method validates its efficiency.
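The pipeline described in this abstract (scores as control points, De Casteljau evaluation, selection of points far from the curve, T-norm normalization) can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: each score is plotted at an x-coordinate equal to its index, "far from the curve" is taken to mean the `keep` largest point-to-curve distances, the curve is approximated by `samples` evaluated points, and T-norm is taken in its usual form (raw score minus the cohort mean, divided by the cohort standard deviation). The function names are hypothetical.

```python
from statistics import mean, pstdev


def de_casteljau(points, t):
    """Evaluate the Bezier curve with the given control points at parameter t."""
    pts = [tuple(p) for p in points]
    # Repeated linear interpolation: n + 1 control points collapse to one point.
    while len(pts) > 1:
        pts = [((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
               for (x0, y0), (x1, y1) in zip(pts, pts[1:])]
    return pts[0]


def select_cohort(scores, keep=3, samples=200):
    """Return indices of the `keep` scores lying farthest from the curve.

    Assumption: score i is plotted at (i, score); the abstract does not
    state the x-coordinate or the distance threshold.
    """
    ctrl = [(float(i), s) for i, s in enumerate(scores)]
    curve = [de_casteljau(ctrl, k / samples) for k in range(samples + 1)]

    def dist(p):
        # Distance from p to the nearest sampled curve point.
        return min(((p[0] - cx) ** 2 + (p[1] - cy) ** 2) ** 0.5
                   for cx, cy in curve)

    ranked = sorted(range(len(ctrl)), key=lambda i: dist(ctrl[i]), reverse=True)
    return sorted(ranked[:keep])


def t_norm(raw_score, cohort_scores):
    """T-norm: centre and scale a raw score by the cohort statistics."""
    sigma = pstdev(cohort_scores)
    if sigma == 0:
        raise ValueError("cohort scores have zero spread")
    return (raw_score - mean(cohort_scores)) / sigma


if __name__ == "__main__":
    imposter_scores = [0.2, 0.2, 0.9, 0.2, 0.2]   # made-up scores
    idx = select_cohort(imposter_scores, keep=1)
    print(idx, t_norm(0.8, [0.2, 0.4, 0.6]))
```

In this toy input the outlying score at index 2 is the one selected, because the curve is pulled only part of the way towards it.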
A novel approach to enlighten the effect of neighbor faces during attending a face in the crowd
Kumar R.K., Garain J., Kisku D.R., Sanyal G.
Conference paper, IEEE Region 10 Annual International Conference, Proceedings/TENCON, 2016, DOI Link
When attending to a crowd, humans do not look at all the faces with the same gaze. Human beings have a natural, extraordinary capability to focus their gaze on the face with the 'dominating look' in the crowd in terms of beauty or ugliness. This happens because of the relative saliency of that face with respect to its surrounding faces. A great deal of research has been carried out to model computer vision on human psychology and cognitive science. To extend this research, this paper proposes a novel method to illustrate how the saliency of a face varies with its neighboring faces in the crowd. This variation affects our perception of, and attention towards, a face in the crowd. In this method, the distribution of visual saliency is calculated based on the intensity values and the respective spatial distances of the faces. The experiment was carried out on various facial images. The results and accuracy are found to be satisfactory. The evaluation of the proposed approach shows quite encouraging results, and it could lead to a future model of intelligent computer vision.
Heterogeneous face detection
Das A., Kumar R.K., Kisku D.R.
Conference paper, ACM International Conference Proceeding Series, 2016, DOI Link
Face detection is the process of determining the location of human faces in an image. Like the human visual system, a face detection system should be able to perform detection irrespective of illumination, absence of texture, orientation and camera distance. Detecting faces in heterogeneous, infrared and thermal images is challenging because of variation in texture, orientation, lighting conditions, intensity, etc. Many researchers have proposed methods for detecting visible faces. However, detecting faces in heterogeneous face images with existing algorithms is quite difficult. This paper attempts to improve the accuracy of an existing face detection algorithm that was designed for visible faces, so that it can also detect heterogeneous faces, by applying various image enhancement techniques before face detection. The improved algorithm is suitable for heterogeneous face images, including thermal and infrared images of crowds. The approach has been tested on a test image dataset. The experimental results are found to be encouraging.
Constrained maximization of saliency of intended object for guiding attention
Kumar R.K., Pal R.
Conference paper, 12th IEEE International Conference Electronics, Energy, Environment, Communication, Computer, Control: (E3-C3), INDICON 2015, 2016, DOI Link
The saliency of an object in an image determines the attentiveness of the object with respect to the human visual system. Saliency, in the absence of any external stimuli, is determined by the contrast of features (like intensity, color, etc.) of the object with its surroundings. This paper proposes a novel technique to enhance the saliency of an object which is not intrinsically salient. To enhance attentiveness, the feature values of a user-defined target object (whose saliency has to be enhanced) should differ more from the feature values of its surrounding objects. But too much modification of these values of the target object will destroy the naturalness of the image. So the problem of enhancing the saliency of a target object is treated as a maximization problem under some constraints.
Attention identification via relative saliency of localized crowd faces
Das A., Kumar R.K., Kisku D.R., Sanyal G.
Conference paper, ACM International Conference Proceeding Series, 2016, DOI Link
While viewing a crowd, human vision automatically focuses on the most attention-drawing face. This happens when a particular face in the crowd dominates the other faces in terms of beauty, expression, color, shape, size, structure, etc. Human attention is drawn towards such faces because of their higher visual saliency values. A computer vision system can be modeled on this aspect of human psychology. The visual saliency of a face in a crowd may vary according to many parameters. In this paper, we propose a novel method to calculate the distribution of visual saliency of faces in the crowd based on their feature difference, spatial distance and size. The method has been tested on various crowd images, with encouraging results. It can therefore be used to mimic the cognitive behavior of the human vision system in order to create artificially intelligent computer vision systems.
Automated skull stripping in brain MR images
Aruchamy S., Kumar R.K., Bhattacharjee P., Sanyal G.
Conference paper, Proceedings of the 10th INDIACom; 2016 3rd International Conference on Computing for Sustainable Global Development, INDIACom 2016, 2016
Skull stripping is a significant preliminary step in diagnosing brain disorders. It removes extra-meningeal tissues from Magnetic Resonance Images of the brain. Magnetic Resonance Imaging (MRI) is a widely used technique for analysis of brain images. An efficient hardware-based algorithm for skull segmentation would help in developing an automated brain image analysis system for real-time applications in biomedical sciences. In this work, an image analysis algorithm for automatic skull stripping, based on a Raspberry Pi single board computer, is reported. The experiment was carried out with T1 weighted axis images. The images were first pre-processed to reduce noise and enhance quality. Edge detection and morphological operations were then performed to extract the skull from the brain images. The proposed method was validated by evaluating quantitative performance metrics such as the Jaccard similarity index and the Dice coefficient. This technique will serve as a major step toward future systems for automated skull stripping of brain images.
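The two validation metrics named in this abstract have standard definitions: for a predicted mask A and a reference mask B, Jaccard = |A ∩ B| / |A ∪ B| and Dice = 2|A ∩ B| / (|A| + |B|). The short Python sketch below computes both for binary masks stored as nested lists. The function name and the toy masks are hypothetical, and returning 1.0 when both masks are empty is a convention chosen here, not something stated in the abstract.

```python
def jaccard_and_dice(pred, truth):
    """Return (Jaccard index, Dice coefficient) for two binary masks.

    Masks are nested lists of 0/1 values with the same shape.
    """
    p = {(r, c) for r, row in enumerate(pred) for c, v in enumerate(row) if v}
    t = {(r, c) for r, row in enumerate(truth) for c, v in enumerate(row) if v}
    inter = len(p & t)
    union = len(p | t)
    if union == 0:
        # Both masks empty: treat as perfect agreement (assumed convention).
        return 1.0, 1.0
    return inter / union, 2 * inter / (len(p) + len(t))


if __name__ == "__main__":
    predicted = [[1, 1, 0],
                 [0, 1, 0]]
    reference = [[1, 0, 0],
                 [0, 1, 1]]
    print(jaccard_and_dice(predicted, reference))
```

The two scores are monotonically related (Dice = 2J / (1 + J)), so they rank segmentations identically; reporting both is mainly a matter of convention.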
Estimating normalized attention of viewers on account of relative visual saliency of faces (NRVS)
Kumar R.K., Garain J., Sanyal G., Kisku D.R.
Article, International Journal of Software Engineering and its Applications, 2015, DOI Link
Human psychological and behavioral understanding often leads to natural decisions: people accurately identify and remember the faces that they highly appreciate or criticize, compared with normally viewed faces, in terms of beauty, ugliness or unique appearance. This happens because human psychology is biased towards the salient face during face recognition and identification. This paper attempts a novel method to measure how our attention is restricted to particular faces in a crowd. This restricted attention is strongly guided by the relative visual saliency of these faces. In this paper, the normalized relative visual saliency (NRVS) of the faces is evaluated using their intensity values modulated by the respective spatial distances. The experiment was carried out on a test image dataset via a bottom-up approach. The experimental results are encouraging, and the measured accuracy demonstrates the efficacy of the proposed approach.
Analysis of Attention Identification and Recognition of Faces through Segmentation and Relative Visual Saliency (SRVS)
Kumar R.K., Garain J., Sanyal G., Kisku D.R.
Conference paper, Procedia Computer Science, 2015, DOI Link
Attentiveness to, and identification and recognition of, a human face in a crowd play a central role in visual surveillance. The human vision system does not attend to, identify and recognize all faces with the same perception. It is biased towards some faces in the crowd because of their higher relative visual saliency and segment-wise perception with respect to the surrounding faces. A great deal of research using different computer-vision-based techniques has been carried out on attention, recognition and identification of the human face in the context of different applications. This paper proposes a novel technique to explain and analyse how the attention, identification and recognition of a face in the crowd are guided by segmentation and relative visual saliency. The proposed method builds its solution on the concepts of segmentation and relative visual saliency, which is evaluated on the intensity values and respective spatial distances of the faces.
Cohort selection of specific user using max-min-centroid-cluster (MMCC) method to enhance the performance of a biometric system
Garain J., Kumar R.K., Sanyal G., Kisku D.R.
Article, International Journal of Security and its Applications, 2015, DOI Link
Selection of cohort models plays a vital role in increasing the accuracy of a biometric authentication system as well as in reducing the computational cost. This paper proposes a novel approach for cohort selection called the Max-Min-Centroid-Cluster (MMCC) method. The clusters of cohorts are generated by the K-means clustering technique. The union of the clusters having the largest and smallest centroid values is taken as the cohort subset. The cohort scores, after normalization using different cohort-based score normalization techniques, are used in the authentication process of the system. Evaluation has been carried out on FEI face datasets. The performance of this novel methodology is analyzed using T-norm and Aggarwal (max rule) normalization techniques. Experimental results exhibit the efficacy of the proposed method.
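The selection rule stated in this abstract (cluster the cohort scores with K-means, then take the union of the clusters with the largest and smallest centroids) can be sketched in a few lines of Python. This is an illustrative sketch only: the one-dimensional K-means, its deterministic initialisation, the default k = 3 and the function names are assumptions, since the abstract does not specify them.

```python
def kmeans_1d(values, k, iters=100):
    """Plain 1-D K-means with a deterministic, evenly spread initialisation."""
    ordered = sorted(values)
    n = len(ordered)
    # Assumed initialisation: pick centroids at evenly spaced ranks.
    centroids = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        updated = [sum(m) / len(m) if m else centroids[j]
                   for j, m in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters


def mmcc_cohort(scores, k=3):
    """Union of the clusters with the smallest and the largest centroid."""
    centroids, clusters = kmeans_1d(scores, k)
    filled = [(c, m) for c, m in zip(centroids, clusters) if m]
    low = min(filled, key=lambda cm: cm[0])
    high = max(filled, key=lambda cm: cm[0])
    if low is high:
        return sorted(low[1])
    return sorted(low[1] + high[1])


if __name__ == "__main__":
    cohort_scores = [0.10, 0.12, 0.11, 0.50, 0.52, 0.48, 0.90, 0.88, 0.92]
    print(mmcc_cohort(cohort_scores))
```

With the made-up scores above, the middle cluster around 0.5 is discarded and the six extreme scores form the cohort subset.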
Novel methodology for guiding attention of faces through relative visual saliency (RVS)
Kumar R.K., Garain J., Sanyal G., Kisku D.R.
Conference paper, Proceedings - 2015 International Conference on Control, Automation and Robotics, ICCAR 2015, 2015, DOI Link
Identification of a human face in a crowded flux plays an important role in the context of surveillance. A considerable amount of research has been carried out on face identification in different applications, and researchers continue to propose new algorithms. This paper attempts to showcase a novel methodology through which any face may be identified in a large crowd of human faces. The proposed technique is based on relative visual saliency, which is evaluated on the intensity values and respective spatial distances of the faces. In addition to visual saliency, top-down and bottom-up approaches to visual attention are presented and explained in the context of face identification. Both approaches are considered to contribute significantly when visual saliency is measured for attention-based face identification. The experiment was carried out on a test image dataset. The results are satisfactory, and accuracy has also been measured. The evaluation of the proposed approach shows quite encouraging results, and its accuracy points towards a future model of a human face tracking and recognition system.
Enhancement of fuzzy implication operator
Kumar R.K.
Conference paper, Proceedings - 2014 4th International Conference on Communication Systems and Network Technologies, CSNT 2014, 2014, DOI Link
In this paper, the fuzzy implication operator, the ply operator and their properties are studied. On the basis of the existing properties of the implication operator, some further properties are derived in order to enhance the use of implication in approximate reasoning. In some form of monotonicity, five ply operators and their various properties suggested by Zadeh are explained. In the last part of the paper, the increasing and decreasing behavior of the five ply operators (obtained by Zadeh using possibility distribution) under 'Special Cases' is also proposed. © 2014 IEEE.
Low voltage DCI based low power VLSI circuit implementation on FPGA
Pandey B., Kumar R.
Conference paper, 2013 IEEE Conference on Information and Communication Technologies, ICT 2013, 2013, DOI Link
In this paper, we study the effect on power consumption of using a digitally controlled impedance (DCI) IO standard in memory interface design. We achieved a 50% dynamic power reduction at 1.5V output driver voltage and a 35.2% dynamic power reduction at 1.8V output driver voltage, in comparison to 2.5V output driver voltage, in a DCI-based IO standard implementation on the input or output port of the target design. The target device, XC6VLX75TFF484-1, a Virtex-6 FPGA of -1 speed grade with 484 pins, is used to implement this design. The target design is a RAM-UART memory interface. XPower 13.4 is used for power analysis of our low power memory interface design. ISim is the simulator used to generate waveforms. PlanAhead is used for design, synthesis and implementation. © 2013 IEEE.
Clock gating aware low power global reset ALU and implementation on 28nm FPGA
Pandey B., Yadav J., Kumar J., Kumar R.
Conference paper, Proceedings - 5th International Conference on Computational Intelligence and Communication Networks, CICN 2013, 2013, DOI Link
In this paper, we apply the clock gating technique to a Global Reset ALU design on a 28nm Artix7 FPGA to save both dynamic and clock power. The technique is simulated in the Xilinx 14.3 tool and implemented on a 28nm Artix7 XC7A200T FFG1156-1 FPGA. Without clock gating, clock power contributes 32.25%, 4.24%, 3.06%, 3.09%, and 3.09% of overall dynamic power at 100 MHz, 1 GHz, 10 GHz, 100 GHz and 1 THz device frequency respectively. With clock gating, clock power contributes 0%, 1.02%, 1.06%, 1.06%, and 1.06% of overall dynamic power at 100 MHz, 1 GHz, 10 GHz, 100 GHz and 1 THz device frequency respectively. With clock gating, there is a 100%, 76.92%, 66.30%, 66.55% and 66.58% reduction in clock power compared with clock power consumption without clock gating at 100 MHz, 1 GHz, 10 GHz, 100 GHz and 1 THz operating frequency respectively. Clock gating is more effective on 28nm than on the 40nm and 90nm technology files. © 2013 IEEE.

Scholars

Doctoral Scholars

Ms Shaik Reehana
Ms Keerthi Garisa
Ms Gayathri Dhara

Interests

Artificial Intelligence
Data Science
Image Processing
Machine Learning
Vision Computing

Publications

Evaluation and Enhancement of Standard Classifier Performance by Resolving Class Imbalance Issue Using Smote-Variants Over Multiple Medical Datasets
Kumar V., Kumar R.K., Singh S.K.
Article, SN Computer Science, 2025, DOI Link
In the era of machine learning, classification problems are solved by training on labeled classes. Sometimes, however, some training classes contain too little data, so the system is inadequately trained for these minority classes. In that case, the outputs for the classes with little training data are poor and biased towards the classes with more data. This is known as the class imbalance problem. In such cases, standard classifiers tend to be overwhelmed by the large classes and disregard the small ones. As a result, the performance of machine learning and deep learning algorithms degrades and is sometimes unacceptable, especially for crucial data such as medical and health-related data. Various researchers have proposed methods to solve this problem, but most are problem-specific and suit only a specific classifier. To find a generalized and effective solution, we applied various SMOTE variants to resolve the imbalance in the datasets and thereby improved the performance of various machine learning and deep learning algorithms. We experimented with and analyzed the effects of SMOTE variants on various machine learning techniques over six standard medical datasets. We found that SMOTE variants are very effective and improve the standard performance measures (Accuracy, Precision, Recall and F1-Score). Additionally, based on our research, it is feasible to determine which SMOTE variant works best with which machine learning methods and datasets.
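For readers unfamiliar with the technique, the core idea shared by SMOTE and its variants is to create synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours. The Python sketch below shows only this basic interpolation step, not any of the specific variants evaluated in the paper. The function name, the parameters `k` and `seed`, and the toy data are hypothetical.

```python
import random


def smote_basic(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from a list of minority-class tuples.

    Each synthetic point lies on the segment between a randomly chosen
    minority point and one of its k nearest minority neighbours.
    Requires at least two minority points.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # k nearest minority neighbours of x by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not x),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(x, p)),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic


if __name__ == "__main__":
    minority_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    for point in smote_basic(minority_points, n_new=3):
        print(point)
```

In practice a maintained library implementation would normally be preferred over a hand-written one; the sketch is meant only to make the interpolation step concrete.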
A survey on visual saliency detection approaches and attention models
Dhara G., Kumar R.K.
Article, Multimedia Tools and Applications, 2025, DOI Link
Visual saliency detection models are widely used in computer vision tasks to mimic the human visual system’s perception of scenes. The part of an image that stands out from its surroundings and captures attention at a glance is referred to as the salient region. This paper presents a comprehensive review of recent advancements in Salient Object Detection (SOD) and its subfields, namely Co-Salient Object Detection (CSD) and RGB-Depth (RGBD) saliency detection. Salient Object Detection refers to techniques that analyze image surroundings to extract prominent regions from the background. Co-saliency detection, on the other hand, focuses on identifying common and salient regions across a group of related images that share similar content. In contrast to traditional saliency detection, RGBD models incorporate both color and depth information to more accurately identify salient objects. All the aforementioned SOD approaches have numerous applications in pattern recognition and computer vision. The concept of saliency detection has garnered significant interest among researchers. However, there remains a need for extensive research to produce highly accurate saliency maps that bridge the gap between perceptual accuracy and computational performance. Although many novel methods have been developed to address these challenges, further efforts are required to enhance the overall accuracy of saliency detection. This review covers a wide range of techniques, from traditional approaches to deep learning-based models. In addition to analyzing various proposed algorithms, the paper provides a comprehensive overview of evaluation metrics used to assess the performance of SOD algorithms. It also explores benchmark datasets commonly used in SOD research and presents both qualitative and quantitative experimental results for SOD and its subfields. 
Finally, this study examines open research problems, current challenges, and future directions in salient object detection, offering valuable insights and guidance to support future research advancements in the field.
A Machine Learning-based Pneumonia Detection System
Cheekuri S.V., Veeramachaneni M., Manikandan V.M., Kumar R.K.
Conference paper, 2024 5th International Conference for Emerging Technology, INCET 2024, 2024, DOI Link
Pneumonia ranks among the world's major causes of mortality and is the greatest cause of death for young children. It is an infectious condition that can be fatal, affects one or both lungs and is brought on by harmful bacteria. An accurate and timely diagnosis is essential for managing and treating patients effectively. Radiologists with specialized training are needed to assess chest X-rays to diagnose pneumonia. Therefore, an automated approach to identifying pneumonia would help treat the illness quickly, especially in isolated locations. Early disease identification is greatly aided by medical imaging, and chest X-rays are a frequent method of identifying lung disorders like pneumonia. This project offers a novel method for improving chest X-ray image quality, which is then used in conjunction with machine learning approaches to increase the detection accuracy of pneumonia. Subtle details in X-rays can be seen much better using image enhancement techniques including sharpening, contrast stretching, and histogram equalization. A VGG net and a convolutional neural network (CNN) model that can accurately diagnose pneumonia are trained using this augmented image dataset. By bridging the gap between conventional X-ray imaging and sophisticated machine learning, the initiative offers a viable approach to the early and accurate detection of pneumonia.
DEM-UFR: Deep Ensemble Method for Enhanced Unconstraint Face Recognition System
Kumar D., Kumar R.K., Garain J., Kisku D.R., Sing J.K., Gupta P.
Conference paper, Proceedings - 2024 OITS International Conference on Information Technology, OCIT 2024, 2024, DOI Link
The widespread usage of mobile devices and social media has led to a growing interest in face recognition technology. This study introduces a novel deep ensemble method designed to enhance facial recognition accuracy on a mobile selfie dataset by integrating three pre-trained models, viz. Inception-v3, ResNet-50, and EfficientNet B7 for automatic feature extraction and representation. The approach utilizes feature-level fusion through concatenation, followed by dimensionality reduction via principal component analysis (PCA). Feature optimization is carried out using the Firefly algorithm, and classification is achieved through a soft voting ensemble of classifiers, including Support Vector Machine (SVM), Random Forest, and a Deep Neural Network (DNN). When evaluated on the LFW, UTK face, and Wild Selfie datasets, the proposed method achieved recognition accuracies of 99.76%, 98.92%, and 98.73%, respectively, demonstrating competitive and significantly improved performance over existing models. The results indicate that the system performs effectively in real-world conditions, especially in environments with varying conditions.
Hybrid Deep Learning Architecture With K-means Clustering For Weapon Detection In CCTV Surveillance
Raj S., Anant A., Suryadevara H., Kumar R.K.
Conference paper, 2024 IEEE International Conference on Computer Vision and Machine Intelligence, CVMI 2024, 2024, DOI Link
To raise quick alerts about criminal activity, cameras are now deployed at multiple capture points, and with multiple capture points there is a need for automated alerting systems so that a human observer can manage all the CCTV feeds in real time, as it is not humanly possible to go through that many video feeds without a crime slipping by undetected. Deep learning based approaches to this task have been researched previously on various networks. Researchers have tuned ever larger networks to perform weapon detection for surveillance; although they have achieved more than 90% accuracy on the task, they have had to rely on the largest and most complex networks available. Large deep learning models are costly in both computation and memory for a CCTV device running an AI workload. There was a gap in studying hybrid approaches in which deep learning is evaluated together with classical machine learning for the task. To close this gap, our study employs a hybrid approach that combines machine learning and deep learning methods. Training on a customised dataset was attempted initially, but when implementation proved challenging, the study transitioned to the 'OD-weapon detection dataset' collected from GitHub. Different levels of accuracy were achieved on first validation by using deep learning models such as VGG16, VGG19, InceptionNet, and MobileNet, which were maximised by applying this diversified collection of weapon images. Techniques for clustering, fine-tuning, and PCA dimension reduction were used to improve the classification performance.
Experimental Study of Free Vibration Fibre Reinforced Laminated Fractured Composite Beams
Kumar R.K., Harish G., Singh R.K., Khan K.
Conference paper, Lecture Notes in Mechanical Engineering, 2024, DOI Link
In this paper, the free vibration analysis for the fundamental mode of vibration of fibre reinforced laminated fractured composite beams has been carried out. The beams are manufactured by the hand lay-up method using glass as fibre material and epoxy resin as matrix material. The material properties are obtained from a Universal Testing Machine. The analysis is carried out for different lamination schemes, boundary conditions, and numbers of cracks. The free vibration frequencies for the fundamental mode of vibration and the time history are obtained from LabVIEW. The acceleration data for the dynamic response of the beam are obtained from an accelerometer fixed at the middle of the beam. The experimental stress and modal analysis of the glass fibre reinforced laminated composite beam has been carried out for clamped-clamped and clamped-free end conditions. The experimental results obtained from LabVIEW are compared with the FE simulation results obtained from ANSYS using 8-node SHELL281 elements. It is observed that for the [0°]4 specimen the free vibration frequency of the beam decreases as the number of cracks increases, whereas for [0°]8 the frequency shows no significant change even when the number of cracks is varied. However, the frequency increases when the number of layers is increased, irrespective of cracks. For the [0°/90°]2 cross-ply laminate, the free vibration frequencies decrease as the number of cracks increases. For the [0°/90°]2 lamination scheme, the maximum stress for the zero-crack beam is at the clamped end, and for single, double, and triple crack beams the maximum stress is found at the position of the first crack and is higher than that of the zero-crack beam, whereas for [0°/90°]4 the maximum stress values for single, double, and triple crack beams are lower than for the zero-crack beam.
DeepFusion-Net: A U-Net and CGAN-Based Approach for Salient Object Detection
Dhara G., Kumar R.K.
Conference paper, Lecture Notes in Networks and Systems, 2024, DOI Link
Saliency detection is a crucial task in computer vision whose goal is to identify the visually prominent regions within an input image. Automated saliency detection has attracted interest from various application fields during the last decade. An innovative method is suggested for saliency detection through Conditional Generative Adversarial Networks (CGANs) with a pre-trained U-Net model as the generator. The generated saliency maps are evaluated by the discriminator for authenticity, and its feedback enhances the generator's ability to generate high-resolution saliency maps. By iteratively training the discriminator and generator networks, the model achieves improved results in finding the salient object. By combining the strengths of conditional generative adversarial networks and the U-Net architecture, our goal is to improve the accuracy and quality of the saliency maps. Once the U-Net model is trained and its weights are saved, we integrate it into the CGAN framework for salient object detection. The U-Net serves as part of the generator for the CGAN, responsible for generating saliency maps for input images. The components of the CGAN are trained using adversarial learning to enhance the quality and realism of the resulting saliency maps. Precision, recall, MAE, and Fβ score measurements are used to evaluate performance. Thorough experiments have been conducted on three challenging saliency detection datasets, and our model has demonstrated remarkable performance, surpassing the latest saliency models. Further, faster convergence is observed in our model due to the initialization of the CGAN's generator with pre-trained U-Net model weights.
Spatial attention guided cGAN for improved salient object detection
Dhara G., Kumar R.K.
Article, Frontiers in Computer Science, 2024, DOI Link
Recent research shows that Conditional Generative Adversarial Networks (cGANs) are effective for Salient Object Detection (SOD), a challenging computer vision task that mimics the way human vision focuses on important parts of an image. However, implementing cGANs for this task has presented several complexities, including instability during training with skip connections, weak generators, and difficulty in capturing context information for challenging images. These challenges are particularly evident when dealing with input images containing small salient objects against complex backgrounds, underscoring the need for careful design and tuning of cGANs to ensure accurate segmentation and detection of salient objects. To address these issues, we propose an innovative method for SOD using a cGAN framework. Our method utilizes an encoder-decoder framework as the generator component of the cGAN, enhancing the feature extraction process and facilitating accurate segmentation of the salient objects. We incorporate the Wasserstein-1 distance within the cGAN training process to improve the accuracy of finding the salient objects and stabilize the training process. Additionally, our enhanced model efficiently captures intricate saliency cues by leveraging the spatial attention gate with global average pooling and regularization. The introduction of global average pooling layers in the encoder and decoder paths enhances the network's global perception and fine-grained detail capture, while the channel attention mechanism, facilitated by dense layers, dynamically modulates feature maps to amplify saliency cues. The generated saliency maps are evaluated by the discriminator for authenticity, and its feedback enhances the generator's ability to generate high-resolution saliency maps. By iteratively training the discriminator and generator networks, the model achieves improved results in finding the salient object.
We trained and validated our model using large-scale benchmark datasets commonly used for salient object detection, namely DUTS, ECSSD, and DUT-OMRON. Our approach was evaluated using standard performance metrics on these datasets. Precision, recall, MAE and Fβ score metrics are used to evaluate performance. Our method achieved the lowest MAE values: 0.0292 on the ECSSD dataset, 0.033 on the DUTS-TE dataset, and 0.0439 on the challenging and complex DUT-OMRON dataset, compared to other state-of-the-art methods. Our proposed method demonstrates significant improvements in salient object detection, highlighting its potential benefits for real-life applications.
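The evaluation metrics named in this abstract (MAE, precision, recall and Fβ) can be computed along the following lines. This is a minimal sketch that assumes saliency maps with values in [0, 1] and a binary ground-truth mask, both stored as lists of rows. The fixed threshold of 0.5 and the weighting β² = 0.3 are assumptions chosen for illustration (the latter is a common convention in the salient object detection literature) and are not stated in the abstract.

```python
def mean_absolute_error(saliency, ground_truth):
    """Average absolute per-pixel difference between prediction and mask."""
    flat_s = [p for row in saliency for p in row]
    flat_g = [p for row in ground_truth for p in row]
    return sum(abs(s - g) for s, g in zip(flat_s, flat_g)) / len(flat_s)


def precision_recall_fbeta(saliency, ground_truth, threshold=0.5, beta_sq=0.3):
    """Binarise the saliency map, then score it against the ground truth."""
    tp = fp = fn = 0
    for s_row, g_row in zip(saliency, ground_truth):
        for s, g in zip(s_row, g_row):
            predicted = s >= threshold
            if predicted and g:
                tp += 1
            elif predicted and not g:
                fp += 1
            elif not predicted and g:
                fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = beta_sq * precision + recall
    f_beta = (1 + beta_sq) * precision * recall / denom if denom else 0.0
    return precision, recall, f_beta


if __name__ == "__main__":
    predicted = [[0.9, 0.2], [0.6, 0.1]]
    mask = [[1, 0], [0, 0]]
    print(mean_absolute_error(predicted, mask))
    print(precision_recall_fbeta(predicted, mask))
```

A lower MAE means the continuous map is closer to the mask overall, while Fβ with β² below 1 weights precision more heavily than recall.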
Enhancing Salient Object Detection with Supervised Learning and Multi-prior Integration
Dhara G., Kumar R.K.
Article, Journal of Image and Graphics (United Kingdom), 2024, DOI Link
Salient Object Detection (SOD) can mimic the human vision system by using algorithms that simulate the way the eye detects and processes visual information. It focuses mainly on the visually distinctive parts of an image, similar to how the human brain processes visual information. The approach proposed in this study is an ensemble approach that incorporates a classification algorithm, foreground connectivity and prior calculations. As a first step, it involves a series of preprocessing, feature generation, selection, training, and prediction stages using random forest to identify and extract salient objects in an image. Next, an object proposals map is created for the foreground object. Subsequently, a fusion map is generated using boundary, global, and local contrast priors. In the feature generation step, different edge filters are implemented because the saliency score at edges will be high; additionally, texture-based features are calculated with Gabor filters. The Boruta feature selection algorithm is then used to identify the most appropriate and discriminative features, which helps to reduce the computational time required for feature selection. Ultimately, the initial map obtained from the random forest, along with the fusion saliency maps based on foreground connectivity and prior calculations, is merged to produce a saliency map. This map is then refined using post-processing techniques to acquire the final saliency map. The approach we propose surpasses the performance of 17 cutting-edge techniques across three benchmark datasets, showcasing superior results in terms of precision, recall, and f-measure. The proposed method performs well even on the DUT-OMRON dataset, known for its multiple salient objects and complex backgrounds, achieving a Mean Absolute Error (MAE) value of 0.113.
The method also demonstrates high recall values (0.862, 0.923, 0.849 for ECSSD, MSRA-B and DUT-OMRON datasets, respectively) across all datasets, further establishing its suitability for salient object detection.
A novel multiscale cGAN approach for enhanced salient object detection in single haze images
Dhara G., Kumar R.K.
Article, Eurasip Journal on Image and Video Processing, 2024, DOI Link
In computer vision, image dehazing is a low-level task that employs algorithms to analyze and remove haze from images, resulting in haze-free visuals. The aim of Salient Object Detection (SOD) is to locate the most visually prominent areas in images. However, most SOD techniques applied to visible images struggle in complex scenarios characterized by similarities between the foreground and background, cluttered backgrounds, adverse weather conditions, and low lighting. Identifying objects in hazy images is challenging due to the degradation of visibility caused by atmospheric conditions, leading to diminished visibility and reduced contrast. This paper introduces an innovative approach called Dehaze-SOD, a unique integrated model that addresses two vital tasks: dehazing and salient object detection. The key novelty of Dehaze-SOD lies in its dual functionality, seamlessly integrating dehazing and salient object identification into a unified framework. This is achieved using a conditional Generative Adversarial Network (cGAN) comprising two distinct subnetworks: one for image dehazing and another for salient object detection. The first module, designed with residual blocks, Dark Channel Prior (DCP), total variation, and the multiscale Retinex algorithm, processes the input hazy images. The second module employs an enhanced EfficientNet architecture with added attention mechanisms and pixel-wise refinement to further improve the dehazing process. The outputs from these subnetworks are combined to produce dehazed images, which are then fed into our proposed encoder–decoder framework for salient object detection. The cGAN is trained with two modules working together: the generator aims to produce haze-free images, whereas the discriminator distinguishes between the generated haze-free images and real haze-free images. 
Dehaze-SOD demonstrates superior performance compared to state-of-the-art dehazing methods in terms of color fidelity, visibility enhancement, and haze removal. The proposed method effectively produces high-quality, haze-free images from various hazy inputs and accurately detects salient objects within them. This makes Dehaze-SOD a promising tool for improving salient object detection in challenging hazy conditions. The effectiveness of our approach has been validated using benchmark evaluation metrics such as mean absolute error (MAE), peak signal-to-noise ratio (PSNR), and structural similarity index measure (SSIM).
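The Dark Channel Prior (DCP) used in the dehazing module rests on a simple image statistic: in haze-free outdoor images, most local patches contain at least one pixel that is very dark in some colour channel, whereas haze lifts that minimum. The sketch below computes the standard dark channel of an RGB image stored as rows of (r, g, b) tuples with values in [0, 1]. It illustrates only the generic definition and is not a description of how Dehaze-SOD integrates DCP; the function name and patch size are assumptions.

```python
def dark_channel(image, patch=3):
    """Minimum over colour channels, then minimum over a local square patch."""
    height, width = len(image), len(image[0])
    # Per-pixel minimum across the three colour channels.
    channel_min = [[min(pixel) for pixel in row] for row in image]
    radius = patch // 2
    dark = []
    for y in range(height):
        row = []
        for x in range(width):
            # Patch is clipped at the image border.
            window = [
                channel_min[yy][xx]
                for yy in range(max(0, y - radius), min(height, y + radius + 1))
                for xx in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            row.append(min(window))
        dark.append(row)
    return dark


if __name__ == "__main__":
    clear = [[(0.9, 0.1, 0.0), (0.2, 0.8, 0.3)],
             [(0.5, 0.5, 0.05), (1.0, 0.4, 0.6)]]
    hazy = [[(0.8, 0.7, 0.9), (0.9, 0.8, 0.8)],
            [(0.7, 0.9, 0.8), (0.8, 0.8, 0.75)]]
    print(dark_channel(clear))  # values near zero
    print(dark_channel(hazy))   # noticeably brighter
```

A bright dark channel is therefore a cue for haze density, which is what makes it useful as an input to a dehazing stage.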
Experimental Stress and Vibration Analysis of Hybrid Composite Laminated Cracked Beam
Kumar R.K., Khan K.
Article, NanoWorld Journal, 2023, DOI Link
In this paper, the fundamental mode of free vibration of hybrid laminated cracked composite beams is analyzed. The beams are made by hand lay-up. Glass and carbon are used as fibers, and epoxy and resin are used as matrix materials. The material properties are obtained from a Universal Testing Machine. The experimental stress and modal analysis of the carbon and glass fiber reinforced cracked hybrid laminated beam has been carried out for fixed-fixed and fixed-free beams. The [0°-90°-0°-90°] lamination scheme and the [C-G-G-C] and [G-C-C-G] compositions have been used. LabVIEW software was used to perform the experiment. Strain gauges are used for stress analysis, and an accelerometer is used for modal analysis. The natural frequencies for all the cases have been determined. A finite element model has been developed using ANSYS. The experimental data are compared with the numerical results obtained from ANSYS. For the compositions [C-G-G-C] and [G-C-C-G], the natural frequency drops as the number of cracks rises. Additionally, it has been found that both laminated compositions, [C-G-G-C] and [G-C-C-G], experience an increase in stress, strain, and deflection when the number of cracks increases. Maximum stress occurs at the fixed end if there is no crack; if a crack is present, it is located close to the fixed end, and as the number of cracks rises, the site of maximum stress shifts.
Study and analysis of visual saliency applications using graph neural networks
Dhara G., Kumar R.K.
Book chapter, Concepts and Techniques of Graph Neural Networks, 2023, DOI Link
Graph neural networks (GNNs) are deep learning algorithms that operate on graphs. A graph's ability to capture structural relationships among data gives more insight than analyzing data in isolation. GNNs have numerous applications in different areas, including computer vision. In this chapter, the authors investigate the application of GNNs to common computer vision problems, specifically visual saliency, salient object detection, and co-saliency. The chapter gives a thorough overview of numerous visual saliency problems that have been resolved using graph neural networks. The different research approaches that used GNNs to find saliency and co-saliency between objects are also analyzed.
Parallel Big Bang-Big Crunch-LSTM Approach for Developing a Marathi Speech Recognition System
Sharma A., Bachate R.P., Singh P., Kumar V., Kumar R.K., Singh A., Kadariya M.
Article, Mobile Information Systems, 2022, DOI Link
The Voice User Interface (VUI) for human-computer interaction has received wide acceptance, and as a result speech recognition systems for regional languages are now being developed, taking into account all of the dialects. Because the speech corpus (SC) available for research on regional languages is limited, designing a speech recognition system is challenging. This contribution provides a Parallel Big Bang-Big Crunch (PB3C)-based mechanism to automatically evolve the optimal architecture of an LSTM (Long Short-Term Memory) network. To decide the optimal architecture, we evolved the number of neurons and hidden layers of the LSTM model. We validated the proposed approach on a Marathi speech recognition system. In this research work, the proposed method is compared with a BBBC-based LSTM and a manually configured LSTM. The results indicate that the proposed approach is better than the two other approaches.
Improving performance of classifiers for diagnosis of critical diseases to prevent COVID risk
Kumar V., Lalotra G.S., Kumar R.K.
Article, Computers and Electrical Engineering, 2022, DOI Link
The risk of developing COVID-19 and its variants may be higher in those with pre-existing health conditions such as thyroid disease, Hepatitis C Virus (HCV), breast tissue disease, chronic dermatitis, and other severe infections. Early and precise identification of these disorders is critical. A huge number of patients in nations like India require early and rapid testing as a preventative measure. The problem of imbalance arises from the skewed nature of the data, in which instances from the majority class are classified correctly while the minority class is misclassified by many classifiers. When it comes to human life, this kind of misclassification is unacceptable. To solve the misclassification issue and improve accuracy on such datasets, we applied a variety of data balancing techniques to several machine learning algorithms. The outcomes are encouraging, with a considerable increase in accuracy. As an outcome of these proper diagnoses, we can make plans and take the required actions to stop patients from acquiring serious health issues or viral infections.
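The abstract does not say which data balancing techniques were applied. As one simple representative, random oversampling duplicates minority-class samples until every class is as large as the majority class. The sketch below is an illustration under that assumption only; the function name and the toy data are hypothetical.

```python
import random
from collections import Counter


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority samples until all classes match the largest."""
    rng = random.Random(seed)  # seeded for reproducibility
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for label, count in counts.items():
        pool = [s for s, l in zip(samples, labels) if l == label]
        for _ in range(target - count):
            out_samples.append(rng.choice(pool))
            out_labels.append(label)
    return out_samples, out_labels


if __name__ == "__main__":
    features = [[i] for i in range(8)]
    labels = [0] * 6 + [1] * 2  # class 1 is the minority
    _, balanced_labels = random_oversample(features, labels)
    print(Counter(balanced_labels))
```

Balancing should be applied to the training split only; oversampling before the train/test split would leak duplicated samples into the evaluation data.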
Water Body Identification from the Satellite Images using Color Component Analysis with Morphological Operations
Jagruth K., Manikandan V.M., Kumar R.K.
Conference paper, 2021 12th International Conference on Computing Communication and Networking Technologies, ICCCNT 2021, 2021, DOI Link
Many countries, including India, are frequently affected by natural disasters such as floods. In general, predicting natural disasters accurately is very difficult, but advanced technologies can be utilized to overcome such difficulties or to reduce the impact of natural disasters. Satellite image processing is an efficient way to detect water bodies, which may help the agriculture industry or help identify flooded regions. In this paper, we propose a scheme to identify water bodies from satellite images, which will be useful for various applications. During our research, we created a set of water body images by cropping satellite images. The properties of the water body regions were analyzed using an algorithm to compute a set of possible threshold values for the pixels representing water bodies. The threshold values obtained from the analysis of 'water body images' are used in the proposed algorithm to identify water bodies in any given image. A sequence of morphological operations is introduced to refine the results that are obtained through pixel color component analysis. The result analysis is carried out on a set of satellite images, and the scheme achieved good results.
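The two stages described in this abstract, colour-component thresholding followed by morphological refinement, can be sketched as below. The per-channel threshold ranges, the 3x3 structuring element and the choice of an opening (erosion followed by dilation) are illustrative assumptions; the abstract does not give the actual thresholds or the exact sequence of operations.

```python
def threshold_mask(image, lower, upper):
    """Mark pixels whose (r, g, b) values all fall inside the per-channel range."""
    return [
        [all(lo <= c <= hi for c, lo, hi in zip(pixel, lower, upper)) for pixel in row]
        for row in image
    ]


def _morph(mask, combine):
    """Apply `combine` (all -> erosion, any -> dilation) over each 3x3 neighbourhood.

    Neighbourhoods are clipped at the image border.
    """
    height, width = len(mask), len(mask[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            neighbours = [
                mask[yy][xx]
                for yy in range(max(0, y - 1), min(height, y + 2))
                for xx in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(combine(neighbours))
        out.append(row)
    return out


def opening(mask):
    """Erosion then dilation: removes isolated specks, keeps larger regions."""
    return _morph(_morph(mask, all), any)


if __name__ == "__main__":
    water, land = (10, 60, 150), (120, 100, 60)
    image = [[land] * 5 for _ in range(5)]
    for y in range(3):
        for x in range(3):
            image[y][x] = water      # a 3x3 lake in the top-left corner
    image[4][4] = water              # an isolated false positive
    mask = threshold_mask(image, lower=(0, 30, 100), upper=(60, 120, 255))
    for row in opening(mask):
        print("".join("#" if v else "." for v in row))
```

In this toy example the opening keeps the 3x3 water region and discards the single stray pixel, which is the kind of clean-up the morphological stage is meant to provide.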
BAT algorithm based feature selection: Application in credit scoring
Tripathi D., Ramachandra Reddy B., Padmanabha Reddy Y.C.A., Shukla A.K., Kumar R.K., Sharma N.K.
Article, Journal of Intelligent and Fuzzy Systems, 2021, DOI Link
Credit scoring plays a vital role for financial institutions in estimating the risk associated with an applicant for a credit product. The risk is estimated from the applicant's credentials and directly affects the viability of the issuing institution. However, there may be a large number of irrelevant features in a credit scoring dataset. Because of irrelevant features, credit scoring models may suffer from poorer classification performance and higher complexity. Removing redundant and irrelevant features may therefore overcome the problem of a large number of features. In this work, we emphasize the role of feature selection in enhancing the predictive performance of a credit scoring model. For feature selection, the Binary BAT optimization technique is utilized with a novel fitness function. Further, the proposed approach is combined with 'Radial Basis Function Neural Network (RBFN)', 'Support Vector Machine (SVM)' and 'Random Forest (RF)' for classification. The proposed approach is validated on four benchmark credit scoring datasets obtained from the UCI repository. Further, a comprehensive analysis of the experimental results compares classification performance with features selected by various approaches and with other state-of-the-art approaches for credit scoring.
Constraint saliency based intelligent camera for enhancing viewers attention towards intended face
Kumar R.K., Garain J., Kisku D.R., Sanyal G.
Article, Pattern Recognition Letters, 2020, DOI Link
Visual saliency decides the focus of attention towards a region in a scene. When we attend to faces in a crowd or a set of multiple faces, our attention is not distributed equally across all the faces. The bias of the human visual system towards a particular face arises from some dominant features of that face relative to the rest. So, for faces that are not intrinsically salient in a scene, attentiveness needs to be increased. The current study aims to improve the saliency of a face (in a set of multiple faces) that is not significantly salient. This can be achieved by enhancing the contrast of the intended face with its surrounding faces in terms of low-level visual features like intensity, color, etc. Modifying such feature values changes the face's potential to attract the observer's gaze, but excessive change destroys the originality of the image. Therefore, the problem of enhancing the saliency of a target face is framed as an optimization (maximization) problem under some constraints. This concept can be applied to develop a saliency based intelligent camera with the power to enhance the attractiveness of a particular face in the crowd, so that in the captured photograph the enhanced face may draw more attention from viewers. Experiments have been conducted on grayscale as well as colour images. Moreover, the effect on saliency of faces wearing jewellery has also been measured.
Enhancing face recognition through overlaying training images
Sinha A.S., Rahman A.U., Kumar R.K., Sanyal G.
Conference paper, 2019 2nd International Conference on Advanced Computational and Communication Paradigms, ICACCP 2019, 2019, DOI Link
In the usual face recognition approach, the system is trained on a large number of training samples, which means that features are extracted from all the training images individually. In this process, many redundant features also need to be eliminated. During feature elimination, some features are also suppressed because of inappropriate thresholds. This approach is therefore typically time consuming and costly in training. Hence, features need to be extracted in a way that reduces the chance of data redundancy and system complexity. This paper presents a facial recognition technique that includes a superimposed version of all relevant images, which improves the accuracy of the model by roughly 43 percent. The algorithm aims to establish the importance of the superimposition strategy in the field of face recognition. A Haar feature based classifier is used, where a cascade function is trained from a set of images. We have used the open source database of faces from the archives of AT&T Laboratories Cambridge to train and test our model.
Bezier Cohort Fusion in Doubling States for Human Identity Recognition with Multifaceted Constrained Faces
Garain J., Mishra S.R., Kumar R.K., Kisku D.R., Sanyal G.
Article, Arabian Journal for Science and Engineering, 2019, DOI Link
Cohort selection benefits a biometric system by providing the information collected from non-match templates, whereas fusion benefits a system by combining information collected from different sources or from the same source in different ways. The benefits of both approaches are obtained here by proposing a cohort selection technique that is applied before and after the fusion of matching scores for a face recognition system. Two robust facial features, viz. scale-invariant feature transform and speeded up robust features, are used here. This study presents a novel way of fusion based on cohort selection, unlike the traditional levels of fusion (i.e., sensor, feature, match score, rank and decision level fusions). Cohort-based fusion is performed in two different fashions: pre-cohort fusion and post-cohort fusion. In pre-cohort (early) fusion, fusion rules like the sum, max, min and average rules are applied before cohort selection is performed. In contrast, cohort selection is followed by fusion in post-cohort (late) fusion. The union operation is applied as the late fusion rule. The matching scores are normalized by the T-norm cohort score normalization technique before being compared with the threshold value that governs the system's acceptance decision. The experiments are carried out on the FEI and the Look-alike (IIIT Delhi) face databases. The outcomes of the proposed method appear encouraging and convincing compared with non-cohort systems and state-of-the-art methods.
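T-norm cohort score normalization, mentioned in this abstract, is commonly defined as subtracting the mean of the cohort scores from the raw matching score and dividing by their standard deviation. The sketch below follows that common definition and then applies a threshold for the acceptance decision. The use of the population standard deviation, the function names and the example scores are assumptions made for illustration.

```python
from statistics import mean, pstdev


def t_norm(raw_score, cohort_scores):
    """Normalise a raw matching score against the scores of its cohort set."""
    mu = mean(cohort_scores)
    sigma = pstdev(cohort_scores)  # population standard deviation (assumed)
    if sigma == 0:
        raise ValueError("cohort scores have zero spread; T-norm is undefined")
    return (raw_score - mu) / sigma


def accept(raw_score, cohort_scores, threshold):
    """Accept the identity claim when the normalised score reaches the threshold."""
    return t_norm(raw_score, cohort_scores) >= threshold


if __name__ == "__main__":
    cohort = [0.2, 0.4, 0.6]   # similarity of the probe to non-matching templates
    print(t_norm(0.9, cohort))
    print(accept(0.9, cohort, threshold=2.0))
```

The normalised score expresses how far the claimed match stands out from the non-matching cohort, which makes a single threshold more comparable across subjects than a threshold on raw scores.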
Guiding attention of faces through graph based visual saliency (GBVS)
Kumar R.K., Garain J., Kisku D.R., Sanyal G.
Article, Cognitive Neurodynamics, 2019, DOI Link
In a general scenario, while attending to a scene containing multiple faces or looking at a group photograph, our attention is not distributed equally across all the faces. This means we are naturally biased towards some faces. This bias arises from the presence of dominant perceptual features in those faces. In visual saliency terminology, such a face can be called a ‘salient face’. Humans focus their gaze on a face that carries the ‘dominating look’ in the crowd. This happens because of the comparative saliency of the faces. The saliency of a face is determined by its feature dissimilarity with the surrounding faces. Human psychology and cognitive science also play a big role in this context. Therefore, extensive research has been carried out on modeling computer vision systems after human vision. This paper proposes a graph based bottom-up approach to point out the salient face in a crowd or in an image having multiple faces. In this novel method, the visual saliency of faces is calculated from intensity values, facial areas and relative spatial distances. Experiments have been conducted on grayscale images. To verify the experiments, three levels of validation have been carried out. In the first level, our results are verified against the prepared ground truth. In the second level, intensity scores of the proposed saliency maps are cross-verified with the saliency score. In the third level, the saliency map is validated with some standard parameters. The results are found to be interesting, and in some aspects the saliency predictions resemble the human vision system. The evaluation of the proposed approach shows moderately improved results, and hence this idea can be useful in the future modeling of intelligent vision (robot vision) systems.
Addressing facial dynamics using k-medoids cohort selection algorithm for face recognition
Garain J., Kumar R.K., Kisku D.R., Sanyal G.
Article, Multimedia Tools and Applications, 2019, DOI Link
Face recognition is itself a very challenging task, and it becomes more challenging when the input images have large-scale intra-class variations and inter-class similarities. Yet the recognition accuracy can be improved to some extent by supporting the system with non-matched templates, so a set of cohort images is used for this purpose. However, not all the cohort templates of the initial cohort pool may be relevant for each enrolled subject. The main focus of this work is therefore to select a subject-specific and meaningful cohort subset. This paper proposes a cohort selection method called K-medoids Cohort Selection (KMCS) to select a reference set of non-matched templates that are well suited to the respective subjects. All cohort scores of a subject are first clustered using K-medoids clustering. Afterward, the cluster whose members/scores are more scattered from its medoid is selected as the cohort subset, because this cluster consists of the cohorts carrying more discriminative features compared to others. SIFT and SURF points are extracted as facial features. The experiments are conducted on the FEI, ORL and Look-alike face image databases. The matching scores between probe and query images are normalized using T-norm, Max-Min and Aggarwal (Max rule) cohort score normalization techniques before taking the final decision of acceptance or rejection. The results obtained from the experiments show that the proposed system outperforms the non-cohort face recognition system as well as random and Top 10 cohort selection methods. A further comparative study between k-means and K-medoids clustering for cohort selection is also presented.
Combined effect of cohort selection and decision level fusion in a face biometric system
Garain J., Kumar R.K., Kumar D., Kisku D.R., Sanyal G.
Conference paper, Advances in Intelligent Systems and Computing, 2018, DOI Link
Variations in several parameters degrade the performance of a face biometric system. Baseline biometric systems can be relieved of this negative effect to some extent by utilizing information from cohort images and fusion methods. However, obtaining a set of suitable cohorts for each enrolled person is a great challenge. This paper presents a way of determining the cohort subset using k-means clustering cohort selection based on matching proximity. SIFT and SURF are used as facial features to represent each face image and to calculate the similarity score between two face images. The clusters having the highest and lowest centroid values are fused using the union rule to form the target user-dependent cohort subset. The query-claimed matching scores are normalized with the T-norm cohort normalization technique. The normalized scores are used in recognition separately for SIFT and SURF. Finally, the responses from the classifier for these two features are fused at decision level to cover any shortcomings of the cohort selection method. The experiments are carried out on the FEI face database. This integrated face biometric system gains a significant improvement in performance, which evidences its effectiveness over the baseline.
Image Specific Cross Cohort Normalization for Face Pair matching
Garain J., Kumar R.K., Kumar D., Kisku D.R., Sanyal G.
Conference paper, Procedia Computer Science, 2018, DOI Link
View abstract ⏷
Image matching, or face pair matching, differs fundamentally from other problems in computer vision and pattern recognition. It is a very active and challenging topic because the matching expert has no prior information about the input images to be matched. An additional set of images can therefore resolve this problem to some extent. In this context, a cohort-based face pair matching system is proposed. Initially the cohort set is common to all images, but finally a subset of cohort images specific to each of the paired images is selected. Here the Max-Min-Centroid-Cluster (MMCC) method is applied, which is capable of choosing highly relevant cohorts corresponding to the target images. The raw similarity score between the input images is normalized with these sets of cohort scores to obtain two normalized matching scores. Afterwards, the closeness between the images is measured by cross cohort normalization. The absolute difference of these two cross-normalized scores is calculated and compared with a threshold value to decide whether the input images belong to the same person or to different persons. The experiment has been conducted on the ORL face database, and the results indicate that the proposed system is efficient.
A master map: An alternative approach to explore human’s eye fixation for generating ground truth based on various state-of-the-art techniques
Kumar R.K., Garain J., Kisku D.R., Sanyal G.
Conference paper, Lecture Notes in Electrical Engineering, 2018, DOI Link
View abstract ⏷
A saliency map is an efficient way to represent the salient objects in an image. Several studies have been carried out in the area of object-based saliency. These works use ordinary input images in which the salient region or arousal object is easily perceived. The experiments are validated against ground truth obtained from volunteers, either through a human eye fixation machine or through viewers' voting. For complex images, however, the salient locations are very confusing, so preparing ground truth is very difficult. For such images, the results vary across different state-of-the-art saliency models. To address this problem, this paper implements combination strategies to obtain a composite saliency map that incorporates the properties of every individual component map. Fusing saliency maps in this way can generate good ground truth information, providing an alternative to ground truth prepared from volunteers' voting or an eye fixation machine. Because this approach incorporates the concept of reusability, it may reduce the time and cost of preparing ground truth.
Attending prominent face in the set of multiple faces through relative visual saliency
Kumar R.K., Garain J., Kisku D.R., Sanyal G.
Conference paper, Advances in Intelligent Systems and Computing, 2018, DOI Link
View abstract ⏷
Visual saliency determines the extent of attentiveness of a region in a scene. In the context of attending to faces in a crowd, the face components and their dominant features decide the focus of attention. Attention boosts the recognition and identification process in a crowd and hence plays an important role in visual surveillance and robotic vision. Extensive research using different computer vision-based techniques has been carried out on attention, recognition, and identification of the human face in the context of different applications. This paper proposes a novel technique to analyze and explore the prominent face in a set of multiple faces (crowd). The proposed method builds its solution on the concept of relative visual saliency, which is evaluated on various parameters of the face as a whole and of its individual components. These parameters are face area, spatial location, intensity, hue, RGB values, etc. The proposed work furnishes satisfactory results. The assessment made with this approach shows quite encouraging results, which may lead to a future model for robotic vision and intelligent decision-making systems.
A bezier curve cohort selection strategy for face pair matching
Garain J., Kumar R.K., Kisku D.R., Sanyal G.
Conference paper, ACM International Conference Proceeding Series, 2018, DOI Link
View abstract ⏷
Matching two face images without any prior information is a very challenging task, unlike a verification or identification system in which some knowledge about the images of each subject is already stored in the system's database. This paper proposes a methodology to improve the performance of a face pair matching system by utilizing the complementary information collected from a set of cohort face images with the help of a Bezier curve cohort selection algorithm. A pair of face images is given as input to the system. Each image is compared with a predefined cohort pool to form two separate sets of cohort scores. These sets of cohort scores are then passed through the Bezier curve cohort selection method, which provides two suitable cohort subsets. Afterwards, a cross normalization is performed in conjunction with the T-norm score normalization method, and then the absolute normalized difference between the paired face images is determined. On the basis of this normalized difference, it is finally decided whether or not the input face pair is from the same person. The system is evaluated on the FEI face database, and the results are quite impressive.
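The cross-normalization step described in this abstract can be illustrated with a minimal Python sketch. It assumes the standard T-norm form (raw score minus the cohort mean, divided by the cohort standard deviation). The function names, the use of the population standard deviation, and the rule that a small difference means "same person" are illustrative assumptions, not details taken from the paper.

```python
from statistics import mean, pstdev


def t_norm(score, cohort_scores):
    """T-norm: centre and scale a raw score by the cohort score statistics."""
    sigma = pstdev(cohort_scores)  # population std. dev. (an assumption)
    if sigma == 0:
        raise ValueError("cohort scores must not all be identical")
    return (score - mean(cohort_scores)) / sigma


def same_person(raw_score, cohort_a, cohort_b, threshold):
    """Hypothetical decision rule for a face pair.

    raw_score: similarity between the two input faces.
    cohort_a / cohort_b: selected cohort scores of the first / second face.
    The pair is accepted when the two normalized scores differ by no more
    than `threshold` (the direction of the comparison is assumed).
    """
    norm_a = t_norm(raw_score, cohort_a)
    norm_b = t_norm(raw_score, cohort_b)
    return abs(norm_a - norm_b) <= threshold
```

In practice the threshold would be tuned on validation data; the abstract does not state how it is chosen.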
Estimating attention of faces due to its growing level of emotions
Kumar R.K., Garain J., Kisku D.R., Sanyal G.
Conference paper, IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, 2018, DOI Link
View abstract ⏷
When attending to faces in a disciplined assembly (such as an examination hall or a silent public place), our gaze automatically goes towards those persons who exhibit an expression other than the normal expression. This happens because a dissimilar expression stands out among a gathering of normal ones. Few effective studies have succeeded in modeling this concept in the intelligent vision of a computer system. In this proposal we have therefore tried to come up with a solution for handling this challenging task of computer vision. The problem is related to the cognitive aspect of visual attention. In the literature on visual saliency, authors have dealt with expressionless objects, but saliency has not been addressed for objects such as faces, which exhibit expressions. Visual saliency is a term that differentiates 'appealing' visual substance from others based on their feature differences. In this paper, the 'salient face' in a set of multiple faces is explored based on 'emotion deviation' from the normal. In the first phase of the experiment, face detection is accomplished using the Viola-Jones face detector. A deep convolution neural network (CNN) is applied for training and classification of the different facial expressions of emotion. Moreover, the saliency score of every face in the input image is computed by measuring its 'emotion score', which depends upon the deviation from the 'normal expression' scores. The proposed approach exhibits fairly good results, which may give researchers a new dimension for modeling an intelligent vision system that can be useful in visual security and surveillance.
BCP-BCS: Best-fit cascaded matching paradigm with cohort selection using bezier curve for individual recognition
Garain J., Shah A., Kumar R.K., Kisku D.R., Sanyal G.
Conference paper, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2017, DOI Link
View abstract ⏷
The concept of cohort selection has emerged as a very interesting and promising topic for ongoing research in biometrics. It can give traditional biometric systems a higher performance rate with lower complexity and cost. This paper describes a novel matching technique combined with Bezier curve cohort selection. Best-Fit matching with a dynamic threshold is proposed here to reduce the number of false matches. This algorithm is applied to match Speeded Up Robust Feature (SURF) points detected on face images in order to find the matching score between two faces. After that, a Bezier curve is applied as a cohort selection technique. All the cohort scores are plotted in a 2D plane as if they were the control points of a Bezier curve, and then a Bezier curve of degree n is plotted on the same plane using the De Casteljau algorithm, where the number of control points is n + 1. The farther a template lies from the curve, the more discriminative features it contains. All the templates whose score points are far from the curve are included in the cohort subset. A specific cohort subset is determined for each enrolled user. Once the subset is formed, the T-norm cohort score normalization technique is applied to obtain the normalized scores, which are then used for person identification and verification. Experiments are conducted on the FEI face database, and the results show dominance over the non-cohort system.
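The Bezier curve selection idea in this abstract can be sketched in Python as follows. The De Casteljau evaluation is the standard algorithm. Everything else is an illustrative assumption: the abstract does not say how scores are placed on the 2D plane or how "far from the curve" is measured, so this sketch uses the cohort index as the x coordinate, the score as the y coordinate, and a fixed distance threshold against a sampled curve.

```python
import math


def de_casteljau(control_points, t):
    """Evaluate a Bezier curve at parameter t by repeated linear interpolation."""
    pts = list(control_points)
    while len(pts) > 1:
        pts = [((1 - t) * x0 + t * x1, (1 - t) * y0 + t * y1)
               for (x0, y0), (x1, y1) in zip(pts, pts[1:])]
    return pts[0]


def select_cohorts(scores, distance_threshold, samples=200):
    """Return indices of cohort scores lying far from the Bezier curve.

    Assumed mapping: cohort i becomes the control point (i, score_i).
    The distance to the curve is approximated by the nearest of `samples`
    evenly spaced curve points.
    """
    control = [(float(i), float(s)) for i, s in enumerate(scores)]
    curve = [de_casteljau(control, k / (samples - 1)) for k in range(samples)]
    selected = []
    for idx, (px, py) in enumerate(control):
        nearest = min(math.hypot(px - cx, py - cy) for cx, cy in curve)
        if nearest > distance_threshold:
            selected.append(idx)
    return selected
```

With n + 1 scores as control points the curve has degree n, as in the abstract; each curve sample costs O(n^2) interpolations, which is acceptable for small cohort pools.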
A novel approach to attend faces in the crowd through relative visual saliency
Das A., Kumar R.K., Kisku D.R., Sanyal G.
Conference paper, IEEE Region 10 Annual International Conference, Proceedings/TENCON, 2017, DOI Link
View abstract ⏷
Visual Saliency plays an important role in attending faces in the crowd. While considering faces in a crowd, humans are inclined towards certain faces due to dominance of some features in those faces like color, texture, intensity, geometry etc. In this paper, we have proposed a novel method to analyze the saliency of faces in a crowd face image based on four parameters namely feature difference (intensity difference), spatial distance, area of each face and camera distance i.e. distance of each face in crowd from the viewer's point. To the best of our knowledge it is confirmed that this zone has not been explored by researchers to its full capacity till now. The experimental results have been found motivating. This method can find its future application in artificial intelligent systems and cognitive models can be established based on this theory which has the capacity to mimic the ability of human vision system.
Emotion recognition through facial gestures – a deep learning approach
Mishra S., Prasada G.R.B., Kumar R.K., Sanyal G.
Conference paper, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2017, DOI Link
View abstract ⏷
As defined by some theorists, human emotions are discrete and consistent responses to internal or external events which have significance for an organism. They constitute a major part of our non-verbal communication. Among the human emotions, happy, sad, fear, anger, surprise, disgust and neutral are the seven basic emotions. Facial expressions are the best way to exhibit emotions. In this era of booming human-computer interaction, enabling the machines to recognize these emotions is a paramount task. There is an amalgamation of emotions in every facial expression. In this paper, we identified the different emotions and their intensity level in a human face by implementing deep learning approach through our proposed Convolution Neural Network (CNN). The architecture and the algorithm here yield appreciable results that can be used as a motivation for further research in computer based emotion recognition system.
Determine attention of faces through growing level of emotion using deep Convolution Neural Network
Kumar R.K., Kumar G.A.R., Garain J., Kisku D.R., Sanyal G.
Conference paper, 2017 International Conference on Intelligent Computing, Instrumentation and Control Technologies, ICICICT 2017, 2017, DOI Link
View abstract ⏷
Face emotions are internal feelings of a human that are reflected on the face as specific expressions. They appear naturally rather than through forceful effort and are accompanied by physical variations in facial muscles that produce various expressions on the face. Several standard emotions are happy, anger, sad, fear, surprise, disgust, etc. Facial expressions play an essential role in non-verbal communication. Considerable research has been performed on modeling intelligent computer vision that can recognize human emotion. However, advanced research that goes beyond 'just emotion prediction', for example 'emotion attention' or 'salient emotion', has not been covered much in the literature. In this paper, we find the attention scores of the same emotion at various levels. The results obtained confirm that a higher expression level of an emotion draws more attention. The experiment has been performed on different levels of the same emotion (frame by frame) using a deep Convolution Neural Network (CNN), and we analyze how the saliency of a face changes from a low level to a high level of the same emotion. The FER-2013 and CK+ databases have been used for training and testing respectively. The proposed approach delivers fairly good results and may inspire researchers to model an intelligent vision system that goes beyond emotion recognition.
Discriminating real from fake smile using convolution neural network
Kumar G.A.R., Kumar R.K., Sanyal G.
Conference paper, ICCIDS 2017 - International Conference on Computational Intelligence in Data Science, Proceedings, 2017, DOI Link
View abstract ⏷
In our society, we sometimes hide our genuine feelings and emotions and purposely express a different emotion in front of the people around us. Because this is not actually a natural emotion, it is more or less predictable by others. The human vision system has an enormous capability to recognize the genuine and fake smiles of an individual. Discriminating genuine from fake smiles is a very thought-provoking task, yet very little research has been carried out on this topic. In this paper, we explore a method to distinguish real from fake smiles with high precision by using convolution neural networks (CNN). The system has been trained with the FERC-2013 dataset, which has seven types of emotions, namely happy, sad, disgust, angry, fearful, surprised and neutral. The emotion percentages of real and fake faces are recorded by the emotion detection system. Based on the recorded scores, we investigate the effect of the various percentages of emotions present on both faces and then classify whether the smile on the face is real or fake.
Facial emotion analysis using deep convolution neural network
Kumar G.A.R., Kumar R.K., Sanyal G.
Conference paper, Proceedings of IEEE International Conference on Signal Processing and Communication, ICSPC 2017, 2017, DOI Link
View abstract ⏷
Human emotions are mental states of feeling that arise spontaneously rather than through conscious effort and are accompanied by physiological changes in facial muscles, which produce expressions on the face. Some of the critical emotions are happy, sad, anger, disgust, fear, surprise, etc. Facial expressions, which appear when a person's internal feelings are reflected on the face, play a key role in non-verbal communication. Plenty of research has been done on computer modeling of human emotion, but it still lags far behind the human vision system. In this paper, we provide a better approach to predict human emotions (frame by frame) using a deep Convolution Neural Network (CNN) and to show how emotion intensity changes on a face from a low level to a high level of emotion. In this algorithm, the FERC-2013 database has been used for training. The assessment through the proposed experiment gives quite good results, and the accuracy obtained may encourage researchers working on future models of computer-based emotion recognition systems.
Selection of user-dependent cohorts using bezier curve for person identification
Garain J., Kumar R.K., Kisku D.R., Sanyal G.
Conference paper, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2016, DOI Link
View abstract ⏷
Traditional biometric systems can be strengthened further by exploiting the concept of cohort selection to meet the high demands of organizations for a robust automated person identification system. This motivates researchers to develop robust biometric systems using cohort selection. This paper proposes a novel user-dependent cohort selection method using a Bezier curve. It makes use of the invariant SIFT descriptor to generate matching point pairs between a pair of face images. Then, for each subject, considering all the imposter scores as control points, a Bezier curve of degree n is plotted by applying the De Casteljau algorithm. With the imposter scores representing the control points of the curve, a cohort subset is formed from the points determined to be far from the Bezier curve. In order to obtain the normalized cohort scores, the T-norm cohort normalization technique is applied. The normalized scores are then used in recognition. The experiment is conducted on the FEI face database. This novel cohort selection method achieves superior performance, which validates its efficiency.
A novel approach to enlighten the effect of neighbor faces during attending a face in the crowd
Kumar R.K., Garain J., Kisku D.R., Sanyal G.
Conference paper, IEEE Region 10 Annual International Conference, Proceedings/TENCON, 2016, DOI Link
View abstract ⏷
While attending to a crowd, humans do not look at all the faces with the same gaze. Human beings naturally have an extraordinary capability to focus their gaze on a face carrying the 'dominating look' in the crowd in terms of beauty or ugliness. This happens due to the relative saliency of that face with respect to its surrounding faces. Extensive research has been carried out to model computer vision based on human psychology and cognitive science. To extend this research, this paper proposes a novel method to illustrate how the saliency of a face varies with its neighboring faces in the crowd. This variation affects our perception of and attention towards a face in the crowd. In this method, the distribution of visual saliency is calculated based on the intensity values and respective spatial distances of the faces. The experiment has been carried out on various facial images. The results and accuracy are found to be satisfactory. The evaluation made with the proposed approach exhibits quite encouraging results and can lead to a future model of intelligent computer vision.
Heterogeneous face detection
Das A., Kumar R.K., Kisku D.R.
Conference paper, ACM International Conference Proceeding Series, 2016, DOI Link
View abstract ⏷
Face detection is the process of determining the location of human faces in an image. Like the human visual system, a face detection system should be capable of performing the detection task irrespective of illumination, absence of texture, orientation and camera distance. Detecting faces in heterogeneous, infrared and thermal images is a challenging job due to variation in texture, orientation, lighting condition, intensity, etc. Many researchers have proposed various methods for visible faces in the domain of face detection. However, face detection with the existing algorithms is found to be quite difficult on heterogeneous face images. This paper attempts to improve the accuracy of an existing face detection algorithm, which was originally designed for detecting visible faces, so that it can now be used to detect heterogeneous faces by applying various image enhancement techniques before face detection. The improved algorithm is suitable for heterogeneous face images, which include thermal and infrared images of crowds. The approach has been tested on a test image dataset. The experimental results are found to be encouraging.
Constrained maximization of saliency of intended object for guiding attention
Kumar R.K., Pal R.
Conference paper, 12th IEEE International Conference Electronics, Energy, Environment, Communication, Computer, Control: (E3-C3), INDICON 2015, 2016, DOI Link
View abstract ⏷
Saliency of an object in an image determines the attentiveness of the object with respect to the human visual system. Saliency, in the absence of any external stimuli, is determined by the contrast of features (like intensity, color, etc.) of the object with its surroundings. This paper proposes a novel technique to enhance the saliency of an object that is not intrinsically salient. In order to enhance the attentiveness, the feature values of a user-defined target object (whose saliency has to be enhanced) should differ more from the feature values of its surrounding objects. But too much modification of these values of the target object will destroy the naturalness of the image. So this problem of enhancing the saliency of a target object is treated as a maximization problem under some constraints.
Attention identification via relative saliency of localized crowd faces
Das A., Kumar R.K., Kisku D.R., Sanyal G.
Conference paper, ACM International Conference Proceeding Series, 2016, DOI Link
View abstract ⏷
While viewing a crowd, the human vision automatically gets focused towards the most attentive face. This happens when a particular face in the crowd dominates the other faces in terms of beauty, expression, color, shape, size and structure, etc. Human attention towards such faces happens due to its higher visual saliency values. A computer vision system can be modeled based on this aspect of human psychology. Visual saliency of a face in a crowd may vary according to many parameters. In this paper, we propose a novel method to calculate the distribution of visual saliency of faces in the crowd based on their feature difference, spatial distance and size. This method has been tested on various crowd images and inspiring results have been found. Therefore, this can be used to mimic the cognitive behavior of the human vision system to create artificially intelligent computer vision systems.
Automated skull stripping in brain MR images
Aruchamy S., Kumar R.K., Bhattacharjee P., Sanyal G.
Conference paper, Proceedings of the 10th INDIACom; 2016 3rd International Conference on Computing for Sustainable Global Development, INDIACom 2016, 2016,
View abstract ⏷
Skull stripping is a significant as well as a preliminary step in diagnosing brain disorders. It removes extra-meningeal tissues from Magnetic Resonance Images of the brain. Magnetic Resonance Imaging (MRI) is a widely used technique for analysis of brain images. An efficient hardware-based algorithm for skull segmentation would help in developing an automated brain image analysis system for real-time applications in biomedical sciences. In this work, an image analysis algorithm for automatic skull stripping, based on a Raspberry Pi single board computer, is reported. The experiment has been carried out with T1 weighted axis images. Initially, the images were pre-processed to reduce the noise and enhance the quality. Then, edge detection and morphological operations were performed to extract the skull from the brain images. The proposed method has been validated by evaluating quantitative performance metrics such as the Jaccard similarity index and the Dice coefficient. This technique will serve as a major step towards technological advances in developing systems for automated skull stripping of brain images in the future.
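The two validation metrics named in this abstract have standard definitions for binary masks: Jaccard = |A ∩ B| / |A ∪ B| and Dice = 2|A ∩ B| / (|A| + |B|). The Python sketch below computes both for masks given as nested lists of 0/1 values. The function name and the convention of returning 1.0 for two empty masks are assumptions made for illustration.

```python
def jaccard_and_dice(pred_mask, true_mask):
    """Return (Jaccard index, Dice coefficient) for two binary 2D masks."""
    pred = [bool(v) for row in pred_mask for v in row]
    true = [bool(v) for row in true_mask for v in row]
    if len(pred) != len(true):
        raise ValueError("masks must have the same number of pixels")

    intersection = sum(p and t for p, t in zip(pred, true))
    pred_count = sum(pred)
    true_count = sum(true)
    union = pred_count + true_count - intersection

    if union == 0:
        # Both masks are empty; treating this as perfect agreement is a convention.
        return 1.0, 1.0
    jaccard = intersection / union
    dice = 2 * intersection / (pred_count + true_count)
    return jaccard, dice
```

The two metrics are monotonically related (Dice = 2J / (1 + J)), so they rank segmentations identically and differ only in scale.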
Estimating normalized attention of viewers on account of relative visual saliency of faces (NRVS)
Kumar R.K., Garain J., Sanyal G., Kisku D.R.
Article, International Journal of Software Engineering and its Applications, 2015, DOI Link
View abstract ⏷
Human psychological and behavioral understanding often leads to natural decisions that accurately identify and remember faces which the viewer highly appreciates or criticizes, compared with normally viewed faces, in terms of beauty, ugliness or unique appearance. This happens because human psychology is biased towards the salient face in the process of face recognition and identification. This paper attempts a novel method to measure how our attention is restricted towards some particular faces in the crowd. This restricted attention is strongly guided by the relative visual saliency of these faces. In this paper, the normalized relative visual saliency (NRVS) of the faces is evaluated using their intensity values modulated with their respective spatial distances. The experiment has been carried out on a test image dataset via a bottom-up approach. The experimental results are found to be encouraging, and the accuracy has also been measured, exhibiting the efficacy of the proposed approach.
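The abstract does not give the NRVS formula, so the following Python sketch is only one plausible reading of "intensity values modulated with respective spatial distance". It assumes that each face's raw saliency is the sum of its absolute intensity differences to every other face, each divided by (1 + Euclidean distance), and that the scores are then normalized to sum to 1. The formula, the function name, and the input layout are all hypothetical.

```python
import math


def normalized_relative_saliency(faces):
    """faces: list of (mean_intensity, (x, y)) tuples, one per detected face.

    Returns one score per face; the scores sum to 1. The weighting
    |I_i - I_j| / (1 + d_ij) is an illustrative assumption, not the
    formula from the paper.
    """
    raw = []
    for i, (intensity_i, (xi, yi)) in enumerate(faces):
        score = 0.0
        for j, (intensity_j, (xj, yj)) in enumerate(faces):
            if i == j:
                continue
            distance = math.hypot(xi - xj, yi - yj)
            score += abs(intensity_i - intensity_j) / (1.0 + distance)
        raw.append(score)

    total = sum(raw)
    if total == 0:
        # All faces look alike: fall back to equal attention.
        return [1.0 / len(faces)] * len(faces)
    return [s / total for s in raw]
```

Under this assumed weighting, a face that differs strongly in intensity from nearby faces receives the largest share of attention.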
Analysis of Attention Identification and Recognition of Faces through Segmentation and Relative Visual Saliency (SRVS)
Kumar R.K., Garain J., Sanyal G., Kisku D.R.
Conference paper, Procedia Computer Science, 2015, DOI Link
View abstract ⏷
Attentiveness to, and identification and recognition of, a human face in a crowd play a leading role in visual surveillance. The human vision system does not attend to, identify and recognize all faces with the same perception. It is biased towards some faces in the crowd due to their higher relative visual saliency and segment-wise perception with respect to the surrounding faces. Extensive research using different computer vision-based techniques has been carried out on attention, recognition and identification of the human face in the context of different applications. This paper proposes a novel technique to explain and analyse how the attention, identification and recognition of a face in the crowd are guided by segmentation and relative visual saliency. The proposed method builds its solution on the concepts of segmentation and relative visual saliency, which is evaluated on the intensity values and respective spatial distances of the faces.
Cohort selection of specific user using max-min-centroid-cluster (MMCC) method to enhance the performance of a biometric system
Garain J., Kumar R.K., Sanyal G., Kisku D.R.
Article, International Journal of Security and its Applications, 2015, DOI Link
View abstract ⏷
Selection of cohort models plays a vital role to increase the accuracy of a biometric authentication system as well as to reduce the computational cost. This paper proposes a novel approach for cohort selection called Max-Min-Centroid-Cluster (MMCC) method. The clusters of cohorts are generated by K-means clustering technique. The union of the clusters having largest and smallest centroid value is taken as cohort subset. The cohort scores, after normalization using different cohort based score normalization techniques, are used in authentication process of the system. Evaluation has been carried out on FEI face datasets. The performance of this novel methodology is analyzed using T-norm and Aggarwal (max rule) normalization techniques. Experimental results exhibit the efficacy of the proposed method.
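The MMCC selection rule described in this abstract (cluster the cohort scores with K-means, then take the union of the clusters with the largest and smallest centroids) can be sketched in Python as below. The one-dimensional Lloyd iteration, the deterministic initialization from the sorted scores, and the default k = 3 are illustrative assumptions; the abstract does not specify them.

```python
def kmeans_1d(values, k, iterations=100):
    """Plain Lloyd's k-means on scalar scores with a deterministic start."""
    if k < 1 or len(values) < k:
        raise ValueError("need at least k values and k >= 1")
    ordered = sorted(values)
    # Assumed initialization: spread the centroids across the sorted scores.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        updated = [sum(c) / len(c) if c else centroids[i]
                   for i, c in enumerate(clusters)]
        if updated == centroids:
            break  # assignments are stable
        centroids = updated
    return centroids, clusters


def mmcc_select(cohort_scores, k=3):
    """Union of the clusters with the smallest and the largest centroid."""
    centroids, clusters = kmeans_1d(cohort_scores, k)
    non_empty = [(c, members) for c, members in zip(centroids, clusters) if members]
    low = min(non_empty, key=lambda pair: pair[0])[1]
    high = max(non_empty, key=lambda pair: pair[0])[1]
    if low is high:
        return sorted(low)
    return sorted(low + high)
```

For example, with scores grouped around 0.1, 0.5 and 0.9, the middle group is dropped and the two extreme groups form the cohort subset.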
Novel methodology for guiding attention of faces through relative visual saliency (RVS)
Kumar R.K., Garain J., Sanyal G., Kisku D.R.
Conference paper, Proceedings - 2015 International Conference on Control, Automation and Robotics, ICCAR 2015, 2015, DOI Link
View abstract ⏷
Identification of a human face in a crowded flux plays an important role in the context of surveillance. A considerable amount of research has been carried out on face identification in different applications, and researchers have accordingly proposed new algorithms. This paper attempts to showcase a novel methodology through which any face may be identified in a large crowd of human faces. The proposed technique is based on relative visual saliency, which is evaluated on the intensity values and respective spatial distances of the faces. In addition to visual saliency, top-down and bottom-up approaches to visual attention are also presented and explained in the context of face identification. Both of these approaches are considered to make a significant contribution when visual saliency is measured for attention-based face identification. The experiment has been carried out on a test image dataset. The results are satisfactory, and accuracy has also been measured. The evaluation made with the proposed approach exhibits quite encouraging results, and its accuracy points towards a future model of a human face tracking and recognition system.
Enhancement of fuzzy implication operator
Kumar R.K.
Conference paper, Proceedings - 2014 4th International Conference on Communication Systems and Network Technologies, CSNT 2014, 2014, DOI Link
View abstract ⏷
In this paper, fuzzy implication operator, ply operator and their properties have been studied. On the basis of existing properties of Implication operator, some more properties have been extracted in order to enhance the use of Implication in the approximate reasoning. In some form of monotonicity, five ply operators and its various properties suggested by Zadeh have been explained. In the last part of the paper, increasing and decreasing behavior of the five Ply operators (obtained by Zadeh using possibility distribution) under 'Special Cases' have also been proposed. © 2014 IEEE.
Low voltage DCI based low power VLSI circuit implementation on FPGA
Pandey B., Kumar R.
Conference paper, 2013 IEEE Conference on Information and Communication Technologies, ICT 2013, 2013, DOI Link
View abstract ⏷
In this paper, we study the effect of using the digitally controlled impedance (DCI) IO standard in memory interface design in terms of power consumption. In this work, we achieved a 50% dynamic power reduction at 1.5V output driver voltage and a 35.2% dynamic power reduction at 1.8V output driver voltage, in comparison to 2.5V output driver voltage, in a DCI-based IO standard implementation on the input or output port of the target design. The target device, XC6VLX75TFF484-1, a Virtex-6 FPGA of -1 speed grade with 484 pins, is used for implementation of this design. The target design is a RAM-UART memory interface. XPower 13.4 is used for power analysis of our low power memory interface design. ISim is the simulator used to generate waveforms. PlanAhead is used for design, synthesis and implementation. © 2013 IEEE.
Clock gating aware low power global reset ALU and implementation on 28nm FPGA
Pandey B., Yadav J., Kumar J., Kumar R.
Conference paper, Proceedings - 5th International Conference on Computational Intelligence and Communication Networks, CICN 2013, 2013, DOI Link
View abstract ⏷
In this paper, we apply the clock gating technique to a Global Reset ALU design on a 28nm Artix7 FPGA to save both dynamic and clock power. This technique is simulated in the Xilinx14.3 tool and implemented on a 28nm Artix7 XC7A200T FFG1156-1 FPGA. When the clock gating technique is not applied, clock power contributes 32.25%, 4.24%, 3.06%, 3.09%, and 3.09% of overall dynamic power at device frequencies of 100 MHz, 1 GHz, 10 GHz, 100 GHz and 1 THz respectively. When the clock gating technique is applied, clock power contributes 0%, 1.02%, 1.06%, 1.06%, and 1.06% of overall dynamic power at device frequencies of 100 MHz, 1 GHz, 10 GHz, 100 GHz and 1 THz respectively. With clock gating, there is a 100%, 76.92%, 66.30%, 66.55% and 66.58% reduction in clock power compared to the clock power consumption without clock gating at operating frequencies of 100 MHz, 1 GHz, 10 GHz, 100 GHz and 1 THz respectively. Clock gating is more effective on 28nm in comparison to the 40nm and 90nm technology files. © 2013 IEEE.

Scholars

Doctoral Scholars

Ms Shaik Reehana
Ms Keerthi Garisa
Ms Gayathri Dhara
