Dr Shaik Rafi

Assistant Professor

Department of Computer Science and Engineering

Contact Details

rafi.s@srmap.edu.in

Office Location

Homi J Bhabha Block, Level 3, Cubicle No: 23

Education

  • 2025 Ph.D – National Institute of Technology Mizoram, India
  • 2020 AMIE – Institution of Engineers (India)
  • 2014 M.Tech (CSE) – JNTU Kakinada, India
  • 2012 MCA – JNTU Kakinada, India

Experience

  • 2013-2018 Assistant Professor – Tirumala Engineering College, Narasaraopet
  • 2018-2019 Assistant Professor – Eswar College of Engineering, Narasaraopet
  • 2019-2025 Assistant Professor – Narasaraopeta Engineering College (Autonomous), Narasaraopet

Research Interests

  • Multimodal abstractive text summarization: extracting features from text and image modalities to generate abstractive summaries that draw on both.
  • Retrieving images relevant to the generated multimodal abstractive summaries using deep learning techniques.

Awards

  • Gold Medal in M.Tech – Narasaraopeta Engineering College (2013)
  • Best Paper Award – DoSCI-2025

Memberships

  • ACM
  • IAENG

Publications

  • Feature Augmentation and Convolutional Neural Networks for Accurate Prediction of Heart Disease

    Ramnadh Babu T.G., Bhavana K., Chaitanya G., Sravani M., Shaik R., Moturi S.

    Conference paper, Smart Innovation, Systems and Technologies, 2026, DOI Link

    Heart diseases are considered the foremost cause of death in developing nations; therefore, prediction of heart disease is crucial for evaluating patient risk. This paper introduces a new method to enhance prediction accuracy by combining CNNs with an SAE for feature enhancement. Our method uses a dataset of 918 patient records with 11 clinical variables, and it addresses the drawbacks of traditional classifiers by using feature augmentation to build more informative features. Experimental results show that our model's accuracy is 93.478%, an improvement over traditional classifiers like MLP and RF by 4.98%. The latent space size is also optimized, and the 100 best features are obtained. The results suggest that deep learning methods, especially the combination of SAE with CNN, bring notable benefits for heart disease prediction and may further support earlier clinical intervention.
  • Smartphone Price Patterns Prediction Using Machine Learning Algorithms

    Rafi S., Rambabu Y., Rajasekhar P., Suhas B., Reddy M.S., Moturi S., Doda V.R.

    Conference paper, 6th IEEE International Conference on Recent Advances in Information Technology, RAIT 2025, 2025, DOI Link

    Selecting the best smartphone can be challenging due to the wide range of models available on the market. This study shows how machine learning models can predict mobile phone prices based on their features. We evaluated several machine learning techniques, including Logistic Regression, Decision Trees, Random Forest, SVC, K-Neighbors Classifier, Gaussian Naive Bayes (GaussianNB), AdaBoost, Gradient Boosting, Extra Trees, Bagging Classifiers, and XGBoost. The primary objective was to identify the most effective model for price forecasting and to investigate the factors influencing phone prices. Our research offers insights to both consumers and manufacturers, helping them make more informed decisions about phone features and pricing. We emphasize the importance of using diverse datasets that accurately represent various smartphone models and pricing points. Key factors affecting phone costs were identified, and model performance was assessed using metrics such as accuracy, F1-score, and classification reports. Model performance was further enhanced through hyperparameter tuning with GridSearchCV, achieving 97% accuracy with the Decision Tree, K-Neighbors Classifier, SVC, AdaBoost, and Random Forest models. Among these, the Decision Tree and SVC were selected as the optimal models, offering a good tradeoff between accuracy, flexibility, and time complexity. This study aims to provide valuable data to guide consumers in making informed choices about mobile phone features and price ranges.
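As a sketch of the exhaustive search that GridSearchCV performs, the loop below evaluates every combination in a parameter grid against a scoring function; `toy_accuracy` and the grid values are hypothetical stand-ins, not the paper's setup.

```python
from itertools import product

def grid_search(score_fn, param_grid):
    """Evaluate every parameter combination; return the best score
    and the parameters that achieved it."""
    best_score, best_params = float("-inf"), None
    keys = sorted(param_grid)
    for values in product(*(param_grid[k] for k in keys)):
        params = dict(zip(keys, values))
        s = score_fn(params)
        if s > best_score:
            best_score, best_params = s, params
    return best_score, best_params

# Toy stand-in for the cross-validated accuracy of a classifier.
def toy_accuracy(params):
    return 1.0 - abs(params["max_depth"] - 4) * 0.05 \
               - abs(params["min_samples"] - 2) * 0.02

grid = {"max_depth": [2, 4, 8], "min_samples": [2, 5]}
score, params = grid_search(toy_accuracy, grid)
```

With this toy scorer the search settles on `max_depth=4, min_samples=2`; in the real study the scorer would be cross-validated model accuracy.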
  • Chronic Kidney Disease Prediction Using Machine Learning and Deep Learning Models

    Rafi S., Revanth N., Reddy K.V.R., Babu K.M., Kumar Y.L.P., Kumar N.V.

    Conference paper, 2025 IEEE International Conference on Interdisciplinary Approaches in Technology and Management for Social Innovation, IATMSI 2025, 2025, DOI Link

    Chronic kidney disease is a notable health condition that can persist throughout an individual's life, resulting from either kidney malignancy or diminished kidney function. In this work, we investigate how several machine learning techniques might provide an early CKD diagnosis. While previous research has extensively explored this area, our aim is to refine our approach by employing predictive modeling techniques. Initially, we considered 25 variables alongside the class attribute. The dataset used in this study underwent extensive preprocessing, including renaming columns for clarity, converting identified columns to numbers, handling unique and partitioned values, fixing incorrect values, filling null values with the mean, and encoding categorical values into mathematical notation. In addition, principal component analysis (PCA) was employed to reduce dimensionality. Our findings demonstrate that the XGBoost classifier surpassed every other algorithm, with an accuracy of 0.991.
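The PCA step can be illustrated in miniature: the sketch below finds the leading principal component of a toy 2-D dataset via the covariance matrix and power iteration. The data and code are illustrative, not from the study.

```python
def covariance(data):
    """Sample covariance matrix of a list of feature rows."""
    n, d = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(d)]
    cov = [[0.0] * d for _ in range(d)]
    for row in data:
        c = [row[j] - means[j] for j in range(d)]
        for i in range(d):
            for j in range(d):
                cov[i][j] += c[i] * c[j] / (n - 1)
    return cov

def leading_component(cov, iters=200):
    """Power iteration: converges to the dominant eigenvector,
    i.e. the first principal axis."""
    d = len(cov)
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

# Toy 2-D dataset with correlated features.
data = [[2.5, 2.4], [0.5, 0.7], [2.2, 2.9], [1.9, 2.2], [3.1, 3.0],
        [2.3, 2.7], [2.0, 1.6], [1.0, 1.1], [1.5, 1.6], [1.1, 0.9]]
pc1 = leading_component(covariance(data))
```

Projecting each row onto `pc1` would give the 1-D reduced representation; real PCA keeps as many components as needed.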
  • Detecting Sarcasm Across Headlines and Text

    Rafi S., Niharika A.L., Neelima S., Nikhitha K., Reddy M.S., Sireesha M.

    Conference paper, 2025 IEEE International Conference on Interdisciplinary Approaches in Technology and Management for Social Innovation, IATMSI 2025, 2025, DOI Link

    In this era of rapid growth in social media usage among the current generation, a huge amount of content and comments, many of them sarcastic, is seen. Sarcasm has become an important part of daily life, especially in news and social media, where sarcastic comments are often used to attract attention. However, detecting sarcasm is challenging because it requires understanding the difference between what is said and what is meant. This paper focuses on the detection of sarcasm in news headlines with the help of deep learning. Previous works were based on a wide range of datasets; however, these had limitations in either size or quality. The authors therefore propose a new dataset of headlines from sarcastic and real news sites that is large and of high quality, hence appropriate for training machine learning models. The authors also use a CNN-BiLSTM architecture for text analysis, identifying sarcastic expression and deciding whether a headline is sarcastic or non-sarcastic, achieving an accuracy of 97%. The dataset is made publicly available to enable further research in this direction.
  • Multimodal Multi-objective Grey Wolf Optimisation with SVM and Random Forest as Classifier in Feature Selection

    Das R., Rafi S., Purwar H., Laskar R.H., Rajshekhar A., Chandrawanshi N.

    Conference paper, Lecture Notes in Networks and Systems, 2025, DOI Link

    Feature selection, often referred to as attribute subset selection, selects the optimal subset of features for a given dataset by reducing dimensionality and eliminating unnecessary characteristics. A dataset with n features has 2^n feasible subsets, which is challenging to address with conventional attribute selection methods. Metaheuristic-based approaches perform better than traditional procedures in such situations, and numerous evolutionary computing techniques have been applied effectively to feature selection challenges. Some research has also examined the distribution of solutions in the decision space. Many optimisation problems, however, contain two or more competing goals, where improving one necessitates sacrificing another. The multi-objective optimisation technique discussed in this research finds the most effective trade-off between multiple objectives; multi-objective programs require multiple non-dominated solutions rather than just one. In the first stage, we applied Grey Wolf Optimisation (GWO) to acquire the optimised features. In the second stage, we trained classifiers, Support Vector Machine (SVM) and Random Forest (RF), on the selected features. Experiments were carried out on three benchmark datasets, namely the Glass, Wine, and Breast Cancer datasets retrieved from the UCI repository, to evaluate the effectiveness of the recommended feature selection approach. The testing results show that GWO with Random Forest performs better than GWO with SVM.
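A minimal sketch of the GWO position-update rule that the first stage relies on, applied here to a toy sphere-minimisation problem rather than the paper's feature-selection fitness; all parameter values are illustrative.

```python
import random

def gwo_minimize(fitness, dim, n_wolves=10, iters=100, lo=-5.0, hi=5.0, seed=0):
    """Plain Grey Wolf Optimisation: each wolf moves toward the three best
    solutions (alpha, beta, delta); the step coefficient `a` decays to 0."""
    rng = random.Random(seed)
    wolves = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_wolves)]
    for t in range(iters):
        wolves.sort(key=fitness)
        leaders = [w[:] for w in wolves[:3]]  # alpha, beta, delta (copies)
        a = 2.0 - 2.0 * t / iters             # decays linearly from 2 to 0
        for w in wolves:
            for j in range(dim):
                pos = 0.0
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    D = abs(C * leader[j] - w[j])  # distance to the leader
                    pos += leader[j] - A * D
                w[j] = min(hi, max(lo, pos / 3.0))  # average of the 3 pulls
    return min(wolves, key=fitness)

# Toy use: minimise the sphere function. The paper's fitness (classifier
# accuracy on a feature subset) is not reproduced here.
best = gwo_minimize(lambda x: sum(v * v for v in x), dim=3)
```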
  • Optimizing English Learning with AI: Machine Learning Techniques and Tools

    Jayaranjan M., Shekhar G.R., Rafi S., Ramesh J.V.N., Kiran A., Vimochana M.

    Conference paper, International Conference on Intelligent Systems and Computational Networks, ICISCN 2025, 2025, DOI Link

    This research presents a new AI-driven approach to optimizing English language learning with a TF-IDF-based Gated Recurrent Unit model. It provides an effective framework for assessing and improving student writing competency using advanced text classification approaches. Six criteria are evaluated on the basis of the ELLIPSE corpus of essays by grade 8-12 learners: coherence, syntax, vocabulary, phraseology, grammar, and conventions. After extensive preprocessing (tokenization, stop-word removal, stemming, and lemmatization), TF-IDF extracts key text features into numerical representations that are fed into a GRU model, which captures long-term dependencies and contextual meaning, followed by classification with a dense layer and softmax activation. Implemented in Python, the GRU model yields an accuracy of 99.7%, precision of 99.04%, recall of 99.56%, and F1-score of 99.54%. The work provides a rigorous methodology for text classification that improves students' writing skills and advances AI tools in education.
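The TF-IDF feature-extraction step can be sketched in pure Python; the mini-corpus below is hypothetical, not the ELLIPSE corpus, and the idf variant shown is just one common choice.

```python
import math
from collections import Counter

def tfidf(docs):
    """docs: list of token lists -> per-document {term: tf-idf weight}."""
    n = len(docs)
    df = Counter()                    # document frequency per term
    for doc in docs:
        df.update(set(doc))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}  # a common idf variant
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({t: (c / len(doc)) * idf[t] for t, c in tf.items()})
    return weights

# Hypothetical mini-corpus of essay fragments.
docs = [["the", "essay", "shows", "good", "grammar"],
        ["the", "essay", "lacks", "coherence"],
        ["the", "coherence", "and", "grammar", "matter"]]
weights = tfidf(docs)
```

A term appearing in every document ("the") gets a lower weight than a rarer term ("essay"); these per-document weight vectors are what would feed the GRU.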
  • AI vs Human Text Detector: Identifying AI-Generated Content Using NLP

    Reddy C.R.G.R., Muthukumar D.S., Thogai Rani B., Ganesh N., Rafi S.M., Vamsi P.

    Conference paper, Proceedings of the 2025 3rd International Conference on Inventive Computing and Informatics, ICICI 2025, 2025, DOI Link

    The evolving capabilities of AI models such as GPT-3, GPT-4, and DeepSeek have made it increasingly difficult to distinguish between human-written and AI-generated text. This raises serious concerns regarding plagiarism, disinformation, and authorship of electronic content in academic, journalistic, and educational spheres. To this end, an innovative tool, the AI vs Human Text Detector, is introduced: a web application created to ascertain whether content was authored by a human or by an artificial intelligence model. The system uses Natural Language Processing (NLP) methods to examine prominent linguistic characteristics such as perplexity and burstiness that distinguish the homogeneous patterns of AI text from the naturally heterogeneous patterns of human writing. The pre-trained GPT-2 model is utilized to quantify textual predictability and variability, improving classification accuracy. Visualization of results is facilitated through Matplotlib and Plotly. The tool is optimized to run on low-cost, commonly available resources, ensuring both accessibility and scalability. The proposed model achieves a high accuracy of 99.2% in detecting AI-generated content, outperforming existing models like ERT, CNN, and BERT-CNN. Future enhancements involve multilingual detection and hybrid detection models. The contribution of this work lies in supporting content integrity and ethical AI use in digital communication.
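Burstiness, one of the linguistic signals mentioned above, is often approximated as variation in sentence length; the sketch below uses that simple proxy (the paper's exact formulation may differ).

```python
import re
from statistics import mean, pstdev

def burstiness(text):
    """Coefficient of variation of sentence lengths (words per sentence).
    Uniform lengths -> low burstiness, a pattern more typical of AI text;
    varied lengths -> higher burstiness, more typical of human writing."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return pstdev(lengths) / mean(lengths)

uniform = "The cat sat here. The dog sat here. The fox sat here."
varied = "Wait. The dog sat on the mat all day. No."
```

A full detector would combine such signals with model-based perplexity from GPT-2 before classifying.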
  • Multiple Disease Detection using CNN based Model from Chest X-rays

    Seviappan A., Chethan D., Sri P.K., Rafi S., Shahi S.G.

    Conference paper, Proceedings of 3rd International Conference on Sustainable Computing and Data Communication Systems, ICSCDS 2025, 2025, DOI Link

    Chest X-rays are one of the most widely used diagnostic tools for identifying lung-related diseases such as pneumonia, tuberculosis, and lung cancer. However, manual interpretation of these images can be time-consuming and subject to inconsistencies, especially in high-pressure clinical settings or areas with limited access to radiologists. To address these challenges, this study introduces a deep learning-based system that leverages Convolutional Neural Networks (CNNs) for automatic disease detection from chest X-ray images. The model is trained on a large, diverse dataset and incorporates essential preprocessing steps like histogram equalization to improve image quality and enhance diagnostic accuracy. The CNN-based framework is designed to automatically extract features and classify X-ray images into multiple disease categories. It aims to reduce diagnostic delays and support radiologists by offering fast and consistent evaluations. The system's performance is assessed using standard metrics such as accuracy, sensitivity, specificity, and F1-score, demonstrating its effectiveness as a decision-support tool in clinical workflows. By reducing dependence on manual review, this approach enhances both scalability and reliability in medical imaging diagnostics. Beyond automated image analysis, this paper also explores the integration of a real-time health monitoring solution for elderly individuals using a smartwatch-based system. This wearable device tracks vital signs such as heart rate and physical activity and provides instant alerts to caregivers in the event of abnormal readings. The combined application of AI-powered diagnostics and wearable health tracking presents a dual solution that supports early disease detection and improves continuous patient monitoring, paving the way for more responsive, accessible, and patient-centered healthcare.
  • Detecting multimodal cyber-bullying behaviour in social-media using deep learning techniques

    MohammedJany S., Killi C.B.R., Rafi S., Rizwana S.

    Article, Journal of Supercomputing, 2025, DOI Link

    Cyberbullying detection refers to the process of classifying and identifying cyberbullying behavior, which involves the use of technology to harass or bully individuals, typically through online platforms. A growing concern is the spread of bullying memes on social media, which can perpetuate harmful behavior. While much of the existing research focuses on detecting cyberbullying in text-based data, image-based cyberbullying has not received as much attention. This is a significant issue because many social media posts combine images with text, and the visual content can be a key component of cyberbullying. To address this, our research develops a multimodal cyberbullying detection model (MCB) capable of detecting bullying in both images and text. We used the VGG16 pretrained model to detect bullying in images and an XLM-RoBERTa with BiGRU model to detect bullying in text. We then integrated these models (VGG16 + XLM-RoBERTa and BiGRU) using attention mechanisms, CLIP, feedback mechanisms, CentralNet, and related techniques, and proposed a model for detecting cyberbullying in image-text-based memes. The resulting model produced a reasonable accuracy of 74%, indicating that the system is effective in recognizing most cyberbullying activity.
  • Multimodal: A Text-Image based Cyber-Bullying Detecting with Deep Learning

    Ankarao A., Rafi S., Eluri R.K., Reddy K.V.N.

    Conference paper, 2025 IEEE International Conference on Interdisciplinary Approaches in Technology and Management for Social Innovation, IATMSI 2025, 2025, DOI Link

    Cyberbullying detection is the practice of categorizing and identifying cyberbullying behavior, which includes using technology to harass or intimidate people, usually through online platforms. To tackle this, we used a publicly available dataset labeled as bully or non-bully based on text, image, and image-text content. We then proposed a deep learning model that can identify cyberbullying in multimodal data. Bullying in text is detected using an XLM-RoBERTa with BiGRU model, while bullying in images is identified by the VGG16 pre-trained model. Using attention mechanisms, CLIP, feedback mechanisms, CentralNet, and other tools, we combined these models (VGG16 + XLM-RoBERTa and BiGRU) and developed a model for identifying cyberbullying in image-text based memes. With a respectable accuracy of 72%, our final model demonstrated that the system is capable of identifying the majority of cyberbullying incidents.
  • Recognizing Image Manipulations Utilizing CNN and ELA

    Nallamothu K., Rafi S., Kokkiligadda S., Jany S.M.

    Conference paper, Lecture Notes in Networks and Systems, 2025, DOI Link

    Tampering with digital photos or images is known as image forgery. The ability to create fake images or information has become easier due to the rapid advancement of technology. In order to detect image forgeries, this paper proposes a model that employs Error Level Analysis (ELA) with Convolutional Neural Networks (CNN). ELA is used as a preprocessing step to highlight regions of an image that may have been tampered with. A CNN is then trained on this enhanced data to classify images by their authenticity and detect digital modifications. This initiative's main goals include image classification, attribute extraction, image authenticity verification, and digital image modification detection. Our suggested solution makes use of CNNs' deep learning capabilities and the refinement provided by ELA.
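The ELA idea (recompress, then diff) can be shown with a toy stand-in: real ELA re-saves the image as JPEG and compares error levels, whereas the sketch below substitutes coarse quantisation for JPEG recompression purely for illustration.

```python
def quantize(img, step=16):
    """Coarse quantisation: a crude stand-in for JPEG recompression."""
    return [[(p // step) * step for p in row] for row in img]

def error_level_map(img, step=16):
    """Absolute difference between the image and its 'recompressed' version.
    Pixels already aligned to the compression grid (old compression) show
    low error; freshly edited pixel values stand out."""
    recompressed = quantize(img, step)
    return [[abs(a - b) for a, b in zip(r1, r2)]
            for r1, r2 in zip(img, recompressed)]

# 4x4 grayscale image: most pixels sit on the 16-step grid, but one
# "tampered" pixel (value 37) does not.
img = [[16, 32, 48, 64],
       [16, 32, 48, 64],
       [16, 37, 48, 64],
       [16, 32, 48, 64]]
ela = error_level_map(img)
```

In the paper, a map like `ela` (computed with real JPEG recompression) is what the CNN consumes to classify images as authentic or forged.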
  • Topic-guided abstractive multimodal summarization with multimodal output

    Rafi S., Das R.

    Article, Neural Computing and Applications, 2025, DOI Link

    Summarization is a technique that produces condensed text from large text documents using different deep-learning techniques. Over the past few years, abstractive summarization has drawn much attention because of the capability of generating human-like sentences with the help of machines. However, it still suffers from repetition, redundancy, and lexical problems while generating sentences. Previous studies show that incorporating images with the text modality in abstractive summarization may reduce redundancy, but attention must still be paid to the semantics of the sentences. This paper considers adding a topic to a multimodal summary to address semantic and linguistic problems, stressing the need to develop a topic-aware multimodal summarization system. Multimodal summarization uses two or more modalities to extract essential features and increase user satisfaction with the generated abstractive summary. The paper's primary aim is to explore the generation of user-preference summaries on a particular topic by proposing a Hybrid Image Text Topic (HITT) model that guides the essential information extracted from text and image modalities with the help of a topic, addressing semantic and linguistic problems to generate a topic-guided abstractive multimodal summary. Furthermore, a caption-summary order space technique is introduced to retrieve the relevant image for the generated summary. Finally, results are compared and validated on the MSMO dataset with ROUGE and image-precision scores. We also calculated the model's loss using sparse categorical cross-entropy and showed significant improvement over other state-of-the-art techniques.
  • Reducing extrinsic hallucination in multimodal abstractive summaries with post-processing technique

    Rafi S., Laitonjam L., Das R.

    Article, Neural Computing and Applications, 2025, DOI Link

    Multimodal abstractive summarization integrates information from diverse modalities, such as text and images, to generate concise, coherent summaries. Despite its advancements, extrinsic hallucination, a phenomenon where generated summaries include content not present in the source, remains a critical challenge, leading to inaccuracies and reduced reliability. Existing techniques largely focus on intrinsic hallucinations and often require substantial architectural changes, leaving extrinsic hallucination inadequately addressed. This paper proposes a novel post-processing approach to mitigate extrinsic hallucination by leveraging external knowledge vocabularies as a corrective mechanism. The framework identifies hallucinated tokens in generated summaries using a cosine similarity metric and replaces them with factually consistent tokens sourced from external knowledge, ensuring improved coherence and faithfulness. By operating at the token level, the approach preserves linguistic, semantic, and syntactic structures, effectively reducing unfaithful content. The proposed method incorporates domain knowledge through a Word2Vec-based vocabulary, offering a scalable solution to enhance factual consistency without modifying model architectures. The contributions of this study include a detailed methodology for identifying and correcting hallucinated tokens, a robust post-processing pipeline for refining summaries, and a demonstration of the method's effectiveness in reducing extrinsic hallucinations. Experimental results on the MSMO dataset highlight the approach's potential to improve the accuracy, coherence, and reliability of multimodal abstractive summarization systems, addressing a significant gap in the field, and show state-of-the-art results with the RMSProp optimizer.
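A toy version of the token-level correction loop: tokens whose best cosine similarity to the source falls below a threshold are treated as extrinsic hallucinations and swapped for the external-vocabulary token closest to the mean source vector. The 2-D vectors, threshold, and replacement heuristic are illustrative assumptions, not the paper's exact pipeline.

```python
def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = (sum(a * a for a in u) ** 0.5) * (sum(b * b for b in v) ** 0.5)
    return dot / norm if norm else 0.0

def correct_summary(summary, source, vectors, threshold=0.9):
    """Keep each summary token that has a sufficiently similar token in the
    source; otherwise replace it with the external-vocabulary token closest
    to the mean source vector. Assumes every token is in `vectors`."""
    dim = len(next(iter(vectors.values())))
    mean_src = [sum(vectors[s][i] for s in source) / len(source)
                for i in range(dim)]
    out = []
    for tok in summary:
        support = max(cosine(vectors[tok], vectors[s]) for s in source)
        if support >= threshold:
            out.append(tok)  # supported by the source: keep
        else:                # hallucination candidate: replace
            out.append(max(vectors, key=lambda w: cosine(vectors[w], mean_src)))
    return out

# Hypothetical 2-D embeddings standing in for Word2Vec vectors.
vectors = {
    "paris":   [1.00, 0.00],
    "france":  [0.95, 0.10],
    "capital": [0.90, 0.20],
    "london":  [0.00, 1.00],
}
fixed = correct_summary(["london", "capital"],
                        ["paris", "capital", "france"], vectors)
```

Here "london" has no close match in the source, so it is replaced by the vocabulary token nearest the source's mean vector, while "capital" is kept.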
  • Unveiling Student Success: A Multifaceted Approach with Learning Coefficients and Beyond

    VijayaKumar N., Rafi S., Mahith T., Reddy D.V., Naik K.R., Raju K.

    Conference paper, 2nd IEEE International Conference on Integrated Intelligence and Communication Systems, ICIICS 2024, 2024, DOI Link

    Student performance is examined in this study using a number of Educational Data Mining (EDM) methods. Clustering and classification techniques are employed to classify course performance as well as performance in the entrance examination. The results show that Random Forest and XGBoost, which are machine learning models, outperform traditional methods for predicting student success. Moreover, CNN and LSTM networks, which are deep learning models, improve prediction accuracy even further. Evaluated through metrics like accuracy, precision, recall, and F1-score, this study shows that early recognition of patterns helps to reduce failure rates considerably. The results suggest there is scope for further improving prediction algorithms and the management of educational resources, which are of great relevance to institutions in furthering student success.
  • Rainfall Prediction Using Machine Learning

    Sreenivasu S.V.N., Rafi S., Lakshmi V.V.A.S., Sivanageswara Rao S., Rajani Ch.

    Conference paper, Proceedings of 2024 2nd International Conference on Recent Trends in Microelectronics, Automation, Computing, and Communications Systems: Exploration and Blend of Emerging Technologies for Future Innovation, ICMACC 2024, 2024, DOI Link

    Rainfall forecasting is critical for avoiding major natural disasters in many parts of the country. The project is implemented in Python, with the dataset stored in Microsoft Excel. A wide range of machine learning algorithms is used to discover which strategy generates the most accurate predictions. The forecast was created using a variety of machine learning approaches, including CatBoost, XGBoost, decision tree, random forest, logistic regression, neural network, and LightGBM, applied to the Weather Dataset. The primary goal of the research is to evaluate a variety of algorithms and determine which one performs best. Farmers may greatly profit from growing the appropriate crops based on the amount of water they require.
  • Automatic Attendance Management System Using CNN

    Sreenivasu S.V.N., Rajani C., Rani B.U., Dasaradha A., Rafi S.

    Conference paper, 2024 4th International Conference on Artificial Intelligence and Signal Processing, AISP 2024, 2024, DOI Link

    Facial recognition technology plays a crucial role in various applications, from enhancing security at banks and organizations to streamlining attendance tracking in public gatherings and educational institutions. Traditional methods of attendance marking, such as signatures, names, and biometrics, can be time-consuming and error-prone. To address these challenges, a smart attendance system is proposed, leveraging Deep Learning, Convolutional Neural Networks (CNN), and the OpenCV library in Python for efficient face detection and recognition. The system utilizes advanced algorithms, including Eigenfaces and Fisherfaces, to recognize faces accurately. While deep learning models excel with large datasets, they may not perform optimally with few samples. By comparing input faces with images in the dataset, the system automatically records recognized names and timestamps in a CSV file, which is then sent to the respective organization's head. Additionally, the system allows users to upload a single photo or a group photo and returns matched photos as output using a CNN. This feature enhances the system's flexibility and usability, providing users with a convenient way to retrieve matched photos.
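The CSV logging step described above is straightforward to sketch with Python's standard library; the file layout and column names here are assumptions, not the paper's exact format.

```python
import csv
import os
import tempfile
from datetime import datetime

def log_attendance(csv_path, name, when=None):
    """Append a recognised name with a timestamp; write a header row
    the first time the file is created."""
    when = when or datetime.now()
    new_file = not os.path.exists(csv_path)
    with open(csv_path, "a", newline="") as f:
        writer = csv.writer(f)
        if new_file:
            writer.writerow(["name", "timestamp"])
        writer.writerow([name, when.strftime("%Y-%m-%d %H:%M:%S")])

# Demo with fixed timestamps so the output is deterministic.
path = os.path.join(tempfile.mkdtemp(), "attendance.csv")
log_attendance(path, "Alice", datetime(2024, 3, 1, 9, 0, 0))
log_attendance(path, "Bob", datetime(2024, 3, 1, 9, 1, 30))
with open(path, newline="") as f:
    rows = list(csv.reader(f))
```

In the full system, `log_attendance` would be called once per face the recogniser matches against the enrolment dataset.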
  • Classification of Music Genre Using Deep Learning Approaches

    Srinivas U.M., Rafi S., Manohar T.V., Rao M.V.

    Conference paper, 2024 4th International Conference on Artificial Intelligence and Signal Processing, AISP 2024, 2024, DOI Link

    Music genre classification is a pivotal area of research within audio technology, holding immense importance for content organization and recommendation. Audio feature extraction and music genre classification together form an integrated recognition system for comprehensive music genre identification and organization. This technology is frequently utilized to accurately detect and classify various music genres or characteristics present in audio signals, contributing significantly to the effective organization and recommendation of music content. Our experiment was conducted with the GTZAN dataset taken from the Kaggle repository. Convolutional neural networks (CNN) are employed to train our model, which is subsequently used to classify music genres in audio signals.
  • ChronicNet: Random Forest Classifier-based Chronic Heart Failure Detection with CNN Feature Analysis

    Malleswari D., Vazram B.J., Shaik R.

    Conference paper, Proceedings - 2024 13th IEEE International Conference on Communication Systems and Network Technologies, CSNT 2024, 2024, DOI Link

    Chronic heart failure (CHF) is a common illness that affects the heart. Conventional machine learning and deep learning algorithms have failed to identify CHF in its early stages, so this work implements ChronicNet by combining the properties of both deep learning and machine learning. The study creates a network for detecting CHF from sound recordings of the heart, called phonocardiogram (PCG) data. The first step, comprising noise removal, signal boosting, and segmentation, ensures that heartbeat sounds are of good quality. We use Mel-frequency cepstral coefficients (MFCC) to analyze the frequency changes in a heartbeat signal, which helps pick out features that distinguish CHF heart disease. Convolutional neural network (CNN) feature extraction then automatically learns discriminative features from the MFCCs. The Random Forest classifier (RFC) is used to build a model that can predict CHF faster; for hard-to-classify groups, the RFC uses a set of decision trees, which brings benefits such as robustness, scalability, and high accuracy. The proposed system uses CNN features to find CHF in PCG data and correctly identify it using the RFC. The simulation results show that the proposed ChronicNet outperformed traditional approaches with 98.84% accuracy.
  • SCT: Summary Caption Technique for Retrieving Relevant Images in Alignment with Multimodal Abstractive Summary

    Rafi S., Das R.

    Article, ACM Transactions on Asian and Low-Resource Language Information Processing, 2024, DOI Link

    This work proposes an efficient Summary Caption Technique that takes the multimodal summary and image captions as input to retrieve, from the captions, the images most influential to the multimodal summary. Matching a multimodal summary with an appropriate image is a challenging task in computer vision and natural language processing, and merging these fields is tedious, though the research community has steadily focused on cross-modal retrieval. Related problems include visual question answering, matching queries with images, and semantic relationship matching between two modalities for retrieving the corresponding image. Relevant works use questions to match visual information and object detection, match text with visual information, and employ structural-level representations to align images with text. However, these techniques primarily focus on retrieving images for text or for image captioning; less effort has been spent on retrieving relevant images for a multimodal summary. Hence, our proposed technique extracts and merges features in the Hybrid Image Text layer and embeds captions in semantic embeddings with word2vec, where contextual features and semantic relationships are compared and matched between the modalities using cosine semantic similarity. In cross-modal retrieval, we obtain the top five related images and align with the multimodal summary the image that achieves the highest cosine score among those retrieved. The model has been trained as a seq-to-seq model for 100 epochs, reducing information loss with sparse categorical cross-entropy. Further experiments on the multimodal summarization with multimodal output dataset evaluate the quality of image alignment with an image-precision metric, demonstrating the best results.
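The cross-modal ranking step can be sketched with cosine similarity over toy embedding vectors; the vectors and image ids below are illustrative, whereas the real system compares learned caption and summary embeddings.

```python
def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = (sum(a * a for a in u) ** 0.5) * (sum(b * b for b in v) ** 0.5)
    return dot / norm if norm else 0.0

def top_k_images(summary_vec, caption_vecs, k=5):
    """Rank candidate image captions by cosine similarity to the summary
    embedding and return the ids of the top-k matches."""
    ranked = sorted(caption_vecs.items(),
                    key=lambda kv: cosine(summary_vec, kv[1]),
                    reverse=True)
    return [img_id for img_id, _ in ranked[:k]]

# Hypothetical 3-D caption embeddings for four candidate images.
captions = {
    "img1": [0.9, 0.1, 0.0],
    "img2": [0.1, 0.9, 0.0],
    "img3": [0.7, 0.3, 0.1],
    "img4": [0.0, 0.0, 1.0],
}
summary = [1.0, 0.2, 0.0]
top2 = top_k_images(summary, captions, k=2)
```

The paper's pipeline then aligns the single highest-scoring image with the generated multimodal summary.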
  • Machine Learning Framework for Women Safety Prediction using Decision Tree

    Sowmika P.S., Rao S.S.N., Rafi S.

    Conference paper, Proceedings - 5th International Conference on Smart Systems and Inventive Technology, ICSSIT 2023, 2023, DOI Link

    In every city, harassment and violence are among the major problems women face. Further, women's personal lives suffer from the bullying and abusive content present in Online Social Networking (OSN). Therefore, it is necessary to assess women's safety in the OSN environment. Traditional methodologies, however, came up short when predicting maximum safety analysis. This study therefore employs a decision tree (WSP-DT) classifier to make predictions about women's safety. The Twitter dataset considered for system implementation is pre-processed to remove blanks and unknowns. The tweets are then processed with the Natural Language Toolkit (NLTK), which handles tasks including tokenization, case conversion, stop-word detection, stemming, and lemmatization. Next, a TextBlob procedure determines the positive, negative, and neutral polarity of the pre-processed tweets. To extract data characteristics based on word and character frequency, term frequency-inverse document frequency (TF-IDF) is used. Finally, a decision tree classifier, trained over several rounds, determines whether a tweet is phoney or real. Testing on the Twitter dataset demonstrates that the proposed WSP-DT classifier outperforms the competition in simulations.
  • Abstractive Text Summarization Using Multimodal Information

    Rafi S., Das R.

    Conference paper, 2023 10th International Conference on Soft Computing and Machine Intelligence, ISCMI 2023, 2023, DOI Link

    A vast amount of text is generated on the internet through news articles, story writing, and blogs. Reading and understanding such an enormous amount of data is problematic for users, costing both time and effort. Automatic abstractive text summarization has therefore gained importance as a way to increase users' understanding and reduce reading time. It shortens the given input while preserving its meaning, identifying the context of the whole document to generate meaningful sentences. The research community has proposed different methods for text reduction and abstractive summary generation; however, semantics and contextual relationships in the summary generation process still need improvement. Multimodal abstractive text summarization combines text and image information to address semantics and contextual relationships. The proposed Multimodality Image Text (MIT) layer fuses global text features extracted with GloVe embeddings, which preserve the semantics of the vocabulary, with contextual-relationship features extracted from text-related images by Inception v3; the fused features are used to generate efficient multimodal abstractive summaries by training and testing a seq-to-seq model. Experiments on the MSMO dataset achieve superior performance over other state-of-the-art results.
  • Sentiment-Based Abstractive Text Summarization Using Attention Oriented LSTM Model

    Debnath D., Das R., Rafi S.

    Conference paper, Smart Innovation, Systems and Technologies, 2022, DOI Link

    Product reviews are essential as they help customers make purchase decisions. Often, however, these reviews are too abundant, lengthy, and descriptive. An abstractive text summarization (ATS) system can build an internal semantic representation of the text and use natural language processing to create sentiment-based summaries of entire reviews. An ATS system is proposed in this work, cast as a many-to-many sequence problem in which attention-based long short-term memory (LSTM) is incorporated to generate summaries of product reviews. The proposed ATS system works in two stages. In the first stage, a data-processing module creates a structured representation of the text and removes noise and other irrelevant data. In the second stage, the attention-based LSTM model is designed to train, validate, and test the system. The proposed approach is found to deal better with rare words and to generate readable, short, and informative summaries. An experimental analysis has been carried out on the Amazon product review dataset, and the ROUGE measure is used to validate the results. The obtained results reflect the efficacy of the proposed method over other state-of-the-art methods.
  • Detection of Active Attacks Using Ensemble Machine Learning Approach

    Revathy S., Sathya Priya S., Rafi S., Pranideep P., Rajesh D.

    Conference paper, Lecture Notes in Networks and Systems, 2022, DOI Link

    Cyber attackers entrap online users and companies by stealing their sensitive information without their knowledge. Sensitive information such as login credentials, bank card details, and the centralized servers of organizations are targeted by attackers. Phishing is one such attack, in which attackers lead online users to trust their websites as legitimate and retrieve personal information, making the users prey to cyber attacks. In a malware attack, the attacker injects malicious software into a company server or an online user's device without their knowledge and steals the data on those devices and servers. Intrusion is an invasion in which an attacker attacks the network and steals network resources. Many types of solutions to cyber attacks have been proposed, such as visual-similarity-based approaches, intrusion detection systems, and signature-based, heuristic-based, specification-based, and anomaly-based methods, but they have disadvantages. Because of unsecured HTTP websites and a lack of cyber-security knowledge, cyber attacks are increasing day by day. In the proposed system, a unified ensemble approach (UEA) combines different machine learning algorithms in an ensemble, giving better accuracy and detection rate. This model aims to detect intrusions and phishing attacks and to prevent malware, thereby mitigating the cyber attacks encountered by individuals and organizations.
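The core combination rule of such an ensemble can be sketched as simple majority voting over base classifiers; the stand-in lambda classifiers below are illustrative, not the paper's actual learners:

```python
from collections import Counter

def ensemble_predict(classifiers, sample):
    # Unified-ensemble sketch by majority vote: each base classifier
    # casts one vote (e.g. 'attack' or 'benign') on the sample, and the
    # most common label wins.
    votes = [clf(sample) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]
```

In practice each base model would be a trained detector (e.g. for phishing, malware, or intrusion), and the vote could be weighted by per-model accuracy.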
  • RNN Encoder and Decoder with Teacher Forcing Attention Mechanism for Abstractive Summarization

    Rafi S., Das R.

    Conference paper, Proceedings of the 2021 IEEE 18th India Council International Conference, INDICON 2021, 2021, DOI Link

    Automatic text summarization addresses the sustained increase in the proportion of text data available online; to cope with such volumes of data, short summaries are preferred. In this work, abstractive text summarization is studied with a Recurrent Neural Network based Long Short-Term Memory encoder and attention decoder. In recent years, this type of summarization has been carried out with Recurrent Neural Networks, which learn from previous time steps. In this paper, we propose the Teacher Forcing technique to improve the slow convergence and poor performance of Recurrent Neural Networks. Our technique improves summarization performance by using an attention decoder that learns from the ground truth instead of from previous time steps, and minimizes the error rate with a Stochastic Gradient Descent optimizer. It yields more accurate results on the WikiHow dataset when measured with the ROUGE metric and performs well compared with state-of-the-art results.
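Teacher forcing itself is easy to sketch: at each decoding step during training, the ground-truth token from the previous step, not the model's own prediction, is fed back as the next decoder input. The toy `step` function and token names below are illustrative:

```python
def decode_with_teacher_forcing(step, target, start_token="<s>"):
    # step: a stand-in for one decoder step (token in -> token out).
    # target: the ground-truth token sequence for this training example.
    inputs, outputs = [], []
    prev = start_token
    for t in range(len(target)):
        inputs.append(prev)
        outputs.append(step(prev))   # model's prediction (may be wrong)
        prev = target[t]             # feed the ground truth regardless
    return inputs, outputs
```

Because the decoder input sequence is just the shifted ground truth, errors do not compound during training, which is the source of the faster convergence the abstract describes.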
  • A Linear Sub-Structure with Co-Variance Shift for Image Captioning

    Rafi S., Das R.

    Conference paper, 2021 8th International Conference on Soft Computing and Machine Intelligence, ISCMI 2021, 2021, DOI Link

    Automatic description of images has attracted many researchers in computer vision to image captioning in artificial intelligence, where it connects with natural language processing. Generating exact captions for an image is necessary but is hindered by the gradient-diminishing problem; LSTM can overcome this problem by fusing local and global characteristics of image and text to generate sequenced word prediction for accurate image captioning. We consider the Flickr8k dataset, which pairs images with textual descriptions. GloVe embeddings provide the word representation, considering the global and local features of images, with Euclidean distance used to understand the relationships between words in the vector space. The Inception V3 architecture, pretrained on ImageNet, is used to extract image features of the different objects in scenes. We propose a Linear Sub-Structure that helps generate a sequenced order of words for captioning by understanding the relationships between words. Feature extraction considers co-variance shift, which concentrates on the moving parts of the image, to generate an accurate description that maintains a semantic visual-grammar relationship between the predicted caption and the image. The proposed model is evaluated with the BLEU score, achieving a state-of-the-art result compared with others, with greater than 81% accuracy.
  • Enhanced biomedical data modeling using unsupervised probabilistic machine learning technique

    Rizwana S., Challa K., Rafi S., Imambi S.S.

    Article, International Journal of Recent Technology and Engineering, 2019

    Text mining approaches use feature-similarity techniques or distributed keyword-searching techniques, whereas machine learning techniques develop a statistical model to categorize documents by learning from the vast number of medical documents available at PubMed. The technique is unsupervised. The proposed algorithm enhances traditional document clustering techniques and generates an accurate and reliable model. We experimented with the algorithm on a 1000-document dataset, and it showed a significant improvement over other traditional algorithms.
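The abstract does not spell out its probabilistic model, but the general shape of unsupervised document clustering can be sketched with a toy one-dimensional k-means, where each document is reduced to a single numeric feature such as a topic score (a stand-in for the paper's technique, all names illustrative):

```python
def kmeans_1d(xs, k, iters=20):
    # Toy 1-D k-means: assign each point to the nearest centre, then
    # move each centre to the mean of its assigned points.
    centers = sorted(xs)[:: max(1, len(xs) // k)][:k]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for x in xs:
            nearest = min(range(k), key=lambda j: abs(x - centers[j]))
            clusters[nearest].append(x)
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters
```

A probabilistic variant would replace the hard nearest-centre assignment with per-cluster membership probabilities, as in a Gaussian mixture model.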


Interests

  • Computer Vision
  • Content-Based Image Retrieval
  • Natural Language Processing


  • Smartphone Price Patterns Prediction Using Machine Learning Algorithms

    Rafi S., Rambabu Y., Rajasekhar P., Suhas B., Reddy M.S., Moturi S., Doda V.R.

    Conference paper, 6th IEEE International Conference on Recent Advances in Information Technology, RAIT 2025, 2025, DOI Link

    Selecting the best smartphone can be challenging due to the wide range of models available on the market. This study shows how machine learning models can predict mobile phone prices based on their features. We evaluated several machine learning techniques, including Logistic Regression, Decision Trees, Random Forest, SVC, K-Neighbors Classifier, Gaussian Naive Bayes (GaussianNB), AdaBoost, Gradient Boosting, Extra Trees, Bagging Classifiers, and XGBoost. The primary objective was to identify the most effective model for price forecasting and to investigate the factors influencing phone prices. Our research offers insights to both consumers and manufacturers, helping them make more informed decisions about phone features and pricing. We emphasize the importance of using diverse datasets that accurately represent various smartphone models and price points. Key factors affecting phone costs were identified, and model performance was assessed using metrics such as accuracy, F1-score, and classification reports. Performance was further enhanced through hyperparameter tuning with GridSearchCV, achieving 97% accuracy with the Decision Tree, K-Neighbors Classifier, SVC, AdaBoost, and Random Forest models. Among these, the Decision Tree and SVC were selected as the optimal models, offering a good trade-off between accuracy, flexibility, and time complexity. This study aims to provide valuable data to guide consumers in making informed choices about mobile phone features and price ranges.
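The GridSearchCV-style tuning the abstract describes amounts to exhaustive search over a parameter grid; this standalone sketch uses a stand-in `evaluate` function in place of cross-validated accuracy:

```python
from itertools import product

def grid_search(evaluate, grid):
    # Try every combination of hyperparameter values in `grid` and
    # keep the best-scoring one, in the spirit of GridSearchCV.
    keys = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[k] for k in keys)):
        params = dict(zip(keys, values))
        score = evaluate(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

In a real pipeline `evaluate` would fit the classifier with `params` and return a cross-validation score, which is exactly what scikit-learn's `GridSearchCV` automates.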
  • Chronic Kidney Disease Prediction Using Machine Learning and Deep Learning Models

    Rafi S., Revanth N., Reddy K.V.R., Babu K.M., Kumar Y.L.P., Kumar N.V.

    Conference paper, 2025 IEEE International Conference on Interdisciplinary Approaches in Technology and Management for Social Innovation, IATMSI 2025, 2025, DOI Link

    Chronic kidney disease (CKD) is a noticeable health condition that can persist throughout an individual's life, resulting from either kidney malignancy or diminished kidney function. In this work, we investigate how several machine learning techniques might provide an early CKD diagnosis. While previous research has extensively explored this area, our aim is to refine the approach with predictive modeling techniques. Initially, we considered 25 variables alongside the class attribute. The dataset used in this study underwent extensive processing, including renaming columns for clarity, converting categorical columns to numbers, handling unique and partitioned values, fixing incorrect values, filling null values with the mean, and encoding categorical values into numerical notation. In addition, Principal Component Analysis (PCA) was employed to lower dimensionality. Our findings demonstrated that the XGBoost classifier surpassed every other algorithm, with an accuracy of 0.991.
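The preprocessing steps listed above (mean imputation of missing numeric values, integer encoding of categorical columns) can be sketched without any libraries; the column layout and values here are illustrative:

```python
def preprocess(rows):
    # rows: list of records; None marks a missing numeric value.
    # Numeric columns: fill None with the column mean.
    # Categorical (string) columns: map each value to an integer code.
    cols = list(zip(*rows))
    out = []
    for col in cols:
        if any(isinstance(v, str) for v in col):
            codes = {v: i for i, v in enumerate(sorted(set(col)))}
            out.append([codes[v] for v in col])
        else:
            present = [v for v in col if v is not None]
            mean = sum(present) / len(present)
            out.append([mean if v is None else v for v in col])
    return [list(r) for r in zip(*out)]
```

After this step every record is fully numeric, which is what classifiers like XGBoost and dimensionality reduction like PCA require.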
  • Detecting Sarcasm Across Headlines and Text

    Rafi S., Niharika A.L., Neelima S., Nikhitha K., Reddy M.S., Sireesha M.

    Conference paper, 2025 IEEE International Conference on Interdisciplinary Approaches in Technology and Management for Social Innovation, IATMSI 2025, 2025, DOI Link

    In this era of rapid growth in social media usage, a huge amount of content and comments, much of it sarcastic, is produced. Sarcasm has become an important part of daily life, especially in news and social media, where sarcastic comments are often used to draw attention. However, detecting sarcasm is challenging because it requires understanding the difference between what is said and what is meant. This paper focuses on detecting sarcasm in news headlines with the help of deep learning. Previous works were based on a wide range of datasets that had limitations in either size or quality. The authors therefore propose a new dataset of headlines from sarcastic and real news sites that is large and of high quality, and hence appropriate for training machine learning models. A CNN-BiLSTM architecture is used for text analysis, identifying sarcastic expression and deciding whether a headline is sarcastic or not, achieving an accuracy of 97%. The dataset is made publicly available to enable further research in this direction.
  • Multimodal Multi-objective Grey Wolf Optimisation with SVM and Random Forest as Classifier in Feature Selection

    Das R., Rafi S., Purwar H., Laskar R.H., Rajshekhar A., Chandrawanshi N.

    Conference paper, Lecture Notes in Networks and Systems, 2025, DOI Link

    A technique called feature selection, often referred to as attribute subset selection, selects the optimal subset of features for a given dataset, reducing dimensionality and eliminating unnecessary characteristics. A dataset with n features has 2^n feasible solutions, which is challenging to address with conventional attribute selection methods. Metaheuristic-based approaches perform better than traditional procedures in such situations, and numerous evolutionary computing techniques have been applied effectively to feature selection challenges. Some research has addressed the distribution of options in the choice space; however, many optimisation problems contain two or more competing goals, where advancing one necessitates sacrificing another. The multi-objective optimisation technique discussed in this research finds the most effective trade-off between numerous objectives: multi-objective programs require multiple non-dominated solutions rather than just one. In the initial stage, we applied Grey Wolf Optimisation (GWO) to acquire the optimised features. Based on the selected features, we trained the classifiers, Support Vector Machine (SVM) and Random Forest (RF), in the second phase. Experiments on three benchmark datasets, Glass, Wine and Breast Cancer, retrieved from the UCI repository, evaluate the effectiveness of the recommended feature selection approach. The testing results show that the proposed GWO with Random Forest performs better than GWO with SVM.
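The 2^n search space that motivates metaheuristics like GWO can be seen in a brute-force baseline, feasible only for tiny n; the toy scoring function below stands in for classifier accuracy on a feature subset:

```python
from itertools import combinations

def best_subset(features, score):
    # Score every non-empty feature subset (2^n - 1 of them) and
    # return the best; this is what GWO approximates for large n.
    best, best_score = None, float("-inf")
    for r in range(1, len(features) + 1):
        for subset in combinations(features, r):
            s = score(subset)
            if s > best_score:
                best, best_score = subset, s
    return best, best_score
```

With even 30 features this loop would visit over a billion subsets, which is why wrapper methods swap the exhaustive search for a guided population-based one.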
  • Optimizing English Learning with AI: Machine Learning Techniques and Tools

    Jayaranjan M., Shekhar G.R., Rafi S., Ramesh J.V.N., Kiran A., Vimochana M.

    Conference paper, International Conference on Intelligent Systems and Computational Networks, ICISCN 2025, 2025, DOI Link

    This research presents a new AI-driven approach to optimizing English language learning with a TF-IDF-based Gated Recurrent Unit (GRU) model. It provides an effective framework for assessing and improving student writing competency using advanced text classification approaches. Six criteria are evaluated on the ELLIPSE corpus of essays by grade 8-12 learners: coherence, syntax, vocabulary, phraseology, grammar, and conventions. After extensive preprocessing (tokenization, stop-word removal, stemming, and lemmatization), TF-IDF converts key text features into numerical representations that are fed into a GRU model capturing long-term dependencies and contextual meaning, followed by classification with a dense layer and softmax activation. Implemented in Python, the GRU model yields an accuracy of 99.7%, precision of 99.04%, recall of 99.56%, and an F1-score of 99.54%. The work provides a rigorous methodology for text classification that improves students' writing skills and advances AI tools in education.
  • AI vs Human Text Detector: Identifying AI-Generated Content Using NLP

    Reddy C.R.G.R., Muthukumar D.S., Thogai Rani B., Ganesh N., Rafi S.M., Vamsi P.

    Conference paper, Proceedings of the 2025 3rd International Conference on Inventive Computing and Informatics, ICICI 2025, 2025, DOI Link

    The evolving capabilities of AI models such as GPT-3, GPT-4, and DeepSeek have made it ever more difficult to distinguish human-written text from AI-generated text. This raises serious concerns regarding plagiarism, disinformation, and the authorship of electronic content in academic, journalistic, and educational spheres. To this end, an innovative tool, the AI vs Human Text Detector, is introduced: a web application created to ascertain whether information was authored by a human or by an artificial intelligence model. The system uses Natural Language Processing (NLP) methods to examine prominent linguistic characteristics, such as perplexity and burstiness, that distinguish the homogeneous pattern of AI text from the naturally heterogeneous patterns of human writing. The pre-trained GPT-2 model is utilized to quantify textual predictability and variability, improving classification accuracy. Visualization of results is facilitated through Matplotlib and Plotly. The software tool is optimized to run on low-cost, commonly available resources, ensuring both accessibility and scalability. The proposed model achieves a high accuracy of 99.2% in detecting AI-generated content, outperforming existing models like ERT, CNN, and BERT-CNN. Future enhancements involve multilingual detection and hybrid detection models. The contribution of this work lies in supporting content integrity and ethical AI use in digital communication.
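Of the two signals mentioned, burstiness is easy to approximate. One common proxy (an assumption here, not necessarily the paper's exact formula) is the coefficient of variation of sentence lengths: human prose mixes short and long sentences, while AI text tends to be more uniform and scores lower:

```python
import statistics

def burstiness(text):
    # Coefficient of variation (stdev / mean) of sentence lengths,
    # splitting naively on '.' and counting words per sentence.
    lengths = [len(s.split()) for s in text.split(".") if s.strip()]
    if len(lengths) < 2:
        return 0.0
    return statistics.stdev(lengths) / statistics.mean(lengths)
```

Perplexity, the other signal, requires a language model such as GPT-2 to score token probabilities, so it is not sketched here.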
  • Multiple Disease Detection using CNN based Model from Chest X-rays

    Seviappan A., Chethan D., Sri P.K., Rafi S., Shahi S.G.

    Conference paper, Proceedings of 3rd International Conference on Sustainable Computing and Data Communication Systems, ICSCDS 2025, 2025, DOI Link

    Chest X-rays are one of the most widely used diagnostic tools for identifying lung-related diseases such as pneumonia, tuberculosis, and lung cancer. However, manual interpretation of these images can be time-consuming and inconsistent, especially in high-pressure clinical settings or areas with limited access to radiologists. To address these challenges, this study introduces a deep learning-based system that leverages Convolutional Neural Networks (CNNs) for automatic disease detection from chest X-ray images. The model is trained on a large, diverse dataset and incorporates essential preprocessing steps such as histogram equalization to improve image quality and enhance diagnostic accuracy. The CNN-based framework automatically extracts features and classifies X-ray images into multiple disease categories. It aims to reduce diagnostic delays and support radiologists by offering fast and consistent evaluations. The system's performance is assessed using standard metrics such as accuracy, sensitivity, specificity, and F1-score, demonstrating its effectiveness as a decision-support tool in clinical workflows. By reducing dependence on manual review, this approach enhances both scalability and reliability in medical imaging diagnostics. Beyond automated image analysis, the paper also explores the integration of a real-time health monitoring solution for elderly individuals using a smartwatch-based system. The wearable device tracks vital signs such as heart rate and physical activity and provides instant alerts to caregivers in the event of abnormal readings. The combined application of AI-powered diagnostics and wearable health tracking presents a dual solution that supports early disease detection and improves continuous patient monitoring, paving the way for more responsive, accessible, and patient-centered healthcare.
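The histogram-equalization preprocessing mentioned above can be sketched in pure Python on a flat list of grayscale values; this is a minimal version of the standard algorithm (real pipelines would typically call OpenCV's `cv2.equalizeHist` on the image array):

```python
def equalize(pixels, levels=256):
    # Map each intensity through the normalized cumulative histogram
    # so the output spreads over the full dynamic range.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    cdf, total = [], 0
    for h in hist:
        total += h
        cdf.append(total)
    cdf_min = next(c for c in cdf if c)      # first non-zero CDF value
    n = len(pixels)
    if n == cdf_min:                         # flat image: nothing to do
        return pixels[:]
    return [round((cdf[p] - cdf_min) / (n - cdf_min) * (levels - 1))
            for p in pixels]
```

A low-contrast X-ray whose intensities cluster in a narrow band gets stretched across 0-255, making subtle structures easier for both radiologists and the CNN to pick up.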
  • Detecting multimodal cyber-bullying behaviour in social-media using deep learning techniques

    MohammedJany S., Killi C.B.R., Rafi S., Rizwana S.

    Article, Journal of Supercomputing, 2025, DOI Link

    Cyberbullying detection refers to the process of classifying and identifying cyberbullying behavior, which involves the use of technology to harass or bully individuals, typically through online platforms. A growing concern is the spread of bullying memes on social media, which can perpetuate harmful behavior. While much of the existing research focuses on detecting cyberbullying in text-based data, image-based cyberbullying has not received as much attention. This is a significant issue because many social media posts combine images with text, and the visual content can be a key component of cyberbullying. To address this, our research aims to develop a multimodal cyberbullying detection model (MCB) capable of detecting bullying in both images and text. We used a pretrained VGG16 model to detect bullying in images and an XLM-RoBERTa with BiGRU model to detect bullying in text. We integrated these models (VGG16 + XLM-RoBERTa and BiGRU) using attention mechanisms, CLIP, feedback mechanisms, CentralNet, etc., and proposed a model for detecting cyberbullying in image-text memes. The resulting model produced a reasonable accuracy of 74%, indicating that the system is effective in recognizing most cyberbullying activity.
  • Multimodal: A Text-Image based Cyber-Bullying Detecting with Deep Learning

    Ankarao A., Rafi S., Eluri R.K., Reddy K.V.N.

    Conference paper, 2025 IEEE International Conference on Interdisciplinary Approaches in Technology and Management for Social Innovation, IATMSI 2025, 2025, DOI Link

    The practice of categorizing and identifying cyberbullying behavior, which includes using technology to harass or intimidate people, usually through online platforms, is known as cyberbullying detection. To tackle this, we examined a publicly available dataset labeled as bully or non-bully based on text, image, and image-text. We then proposed a deep learning model that can identify cyberbullying in multimodal data. Bullying in text is detected using an XLM-RoBERTa with BiGRU model, while bullying in images is identified by a pretrained VGG16 model. Using attention mechanisms, CLIP, feedback mechanisms, CentralNet, and other tools, we combined these models (VGG16 + XLM-RoBERTa and BiGRU) and developed a model for identifying cyberbullying in image-text memes. With a respectable accuracy of 72%, our final model demonstrated that the system is capable of identifying the majority of cyberbullying incidents.
  • Recognizing Image Manipulations Utilizing CNN and ELA

    Nallamothu K., Rafi S., Kokkiligadda S., Jany S.M.

    Conference paper, Lecture Notes in Networks and Systems, 2025, DOI Link

    The tampering of digital photos or images is known as image forgery. The rapid advancement of technology has made it easier to create fake images or information. To detect image forgeries, this paper proposes a model that employs Error Level Analysis (ELA) with Convolutional Neural Networks (CNNs). ELA is used as a preprocessing step to highlight regions of an image that may have been tampered with. A CNN is then trained on this enhanced data to classify images by authenticity and detect digital modifications. This initiative's main goals include image classification, attribute extraction, image-authenticity verification, and digital-image-modification detection. Our suggested solution makes use of CNNs' deep learning capabilities and the refinement provided by ELA.
  • Topic-guided abstractive multimodal summarization with multimodal output

    Rafi S., Das R.

    Article, Neural Computing and Applications, 2025, DOI Link

    Summarization produces condensed text from large documents using different deep-learning techniques. Over the past few years, abstractive summarization has drawn much attention because of its capability to generate human-like sentences with the help of machines. However, repetition, redundancy, and lexical problems still need to be addressed while generating sentences. Previous studies show that incorporating images with the text modality in abstractive summarization may reduce redundancy, but attention must still be paid to the semantics of the sentences. This paper considers adding a topic to a multimodal summary to address semantic and linguistic problems, stressing the need to develop a topic-guided multimodal summarization system. Multimodal summarization uses two or more modalities to extract essential features, increasing user satisfaction with the generated abstractive summary. The paper's primary aim is to explore the generation of user-preference summaries on a particular topic by proposing a Hybrid Image Text Topic (HITT) model that guides the essential information extracted from the text and image modalities with a topic, addressing semantic and linguistic problems to generate a topic-guided abstractive multimodal summary. Furthermore, a caption-summary order space technique is introduced to retrieve the relevant image for the generated summary. Finally, results are compared and validated on the MSMO dataset with ROUGE and image-precision scores. We also calculate the model's loss using sparse categorical cross-entropy and show significant improvement over other state-of-the-art techniques.
  • Reducing extrinsic hallucination in multimodal abstractive summaries with post-processing technique

    Rafi S., Laitonjam L., Das R.

    Article, Neural Computing and Applications, 2025, DOI Link

    Multimodal abstractive summarization integrates information from diverse modalities, such as text and images, to generate concise, coherent summaries. Despite its advancements, extrinsic hallucination, a phenomenon where generated summaries include content not present in the source, remains a critical challenge, leading to inaccuracies and reduced reliability. Existing techniques largely focus on intrinsic hallucinations and often require substantial architectural changes, leaving extrinsic hallucination inadequately addressed. This paper proposes a novel post-processing approach to mitigate extrinsic hallucination by leveraging external knowledge vocabularies as a corrective mechanism. The framework identifies hallucinated tokens in generated summaries using a cosine-similarity metric and replaces them with factually consistent tokens sourced from external knowledge, ensuring improved coherence and faithfulness. By operating at the token level, the approach preserves linguistic, semantic, and syntactic structures, effectively reducing unfaithful content. The proposed method incorporates domain knowledge through a Word2Vec-based vocabulary, offering a scalable solution that enhances factual consistency without modifying model architectures. The contributions of this study include a detailed methodology for identifying and correcting hallucinated tokens, a robust post-processing pipeline for refining summaries, and a demonstration of the method's effectiveness in reducing extrinsic hallucinations. Experimental results on the MSMO dataset highlight the approach's potential to improve the accuracy, coherence, and reliability of multimodal abstractive summarization systems, addressing a significant gap in the field and showing state-of-the-art results with the RMSProp optimizer.
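A minimal sketch of the token-level correction idea, assuming toy embeddings and treating any summary token absent from the source vocabulary as hallucinated (the paper's actual detection criterion and vocabulary construction are richer than this):

```python
import math

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def correct_summary(tokens, source_vocab, embeddings):
    # Post-processing: tokens already in the source vocabulary (or with
    # no embedding) are kept; any other token is swapped for the most
    # cosine-similar token that does appear in the source.
    fixed = []
    for tok in tokens:
        if tok in source_vocab or tok not in embeddings:
            fixed.append(tok)
        else:
            fixed.append(max(source_vocab,
                             key=lambda w: cosine(embeddings[tok],
                                                  embeddings.get(w, []))))
    return fixed
```

Because only individual tokens are replaced, the summary's surrounding syntax is left intact, which is the property the abstract highlights.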
  • Unveiling Student Success: A Multifaceted Approach with Learning Coefficients and Beyond

    VijayaKumar N., Rafi S., Mahith T., Reddy D.V., Naik K.R., Raju K.

    Conference paper, 2nd IEEE International Conference on Integrated Intelligence and Communication Systems, ICIICS 2024, 2024, DOI Link

    Student performance is examined in this study using a number of Educational Data Mining (EDM) methods; clustering and classification techniques are employed to classify both course performance and entrance-examination performance. The results show that the Random Forest and XGBoost machine learning models outperform traditional methods for predicting student success. Moreover, the CNN and LSTM deep learning networks improve prediction accuracy even further. Evaluated with metrics such as accuracy, precision, recall, and F1-score, this study shows that early recognition of at-risk patterns helps reduce failure rates considerably. The results suggest there is scope for further improving prediction algorithms and the management of educational resources, both of great relevance to institutions seeking to further student success.
  • Rainfall Prediction Using Machine Learning

    Sreenivasu S.V.N., Rafi S., Lakshmi V.V.A.S., Sivanageswara Rao S., Rajani Ch.

    Conference paper, Proceedings of 2024 2nd International Conference on Recent Trends in Microelectronics, Automation, Computing, and Communications Systems: Exploration and Blend of Emerging Technologies for Future Innovation, ICMACC 2024, 2024, DOI Link

    View abstract ⏷

    This work, 'Rainfall Prediction Using Machine Learning', is implemented in Python, with its dataset stored in Microsoft Excel. A wide range of machine learning algorithms is applied to discover which strategy generates the most accurate predictions. In many parts of the country, rainfall forecasting is critical for avoiding major natural disasters. The forecast is built using a variety of machine learning approaches, including CatBoost, XGBoost, decision tree, random forest, logistic regression, neural network, and LightGBM. The Weather Dataset was utilized. The primary goal of the research is to evaluate a variety of algorithms and determine which one performs best. Farmers may greatly profit from growing the appropriate crops based on the amount of water they require.
  • Automatic Attendance Management System Using CNN

    Sreenivasu S.V.N., Rajani C., Rani B.U., Dasaradha A., Rafi S.

    Conference paper, 2024 4th International Conference on Artificial Intelligence and Signal Processing, AISP 2024, 2024, DOI Link

    View abstract ⏷

    Facial recognition technology plays a crucial role in various applications, from enhancing security at banks and organizations to streamlining attendance tracking in public gatherings and educational institutions. Traditional methods of attendance marking, such as signatures, names, and biometrics, can be time-consuming and error-prone. To address these challenges, a smart attendance system is proposed, leveraging deep learning, Convolutional Neural Networks (CNN), and the OpenCV library in Python for efficient face detection and recognition. The system utilizes advanced algorithms, including Eigenfaces and Fisherfaces, to recognize faces accurately. While deep learning models excel with large datasets, they may not perform optimally with few samples. By comparing input faces with images in the dataset, the system automatically records recognized names and timestamps in a CSV file, which is then sent to the respective organization's head. Additionally, the system allows users to upload a single photo or a group photo, and it returns matched photos as output using a CNN. This feature enhances the system's flexibility and usability, providing users with a convenient way to retrieve matches.
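    The CSV logging step described above (recognized names plus timestamps) needs only Python's standard library; the file name and two-column record layout here are assumptions for illustration.

    ```python
    import csv
    from datetime import datetime

    def log_attendance(names, path="attendance.csv"):
        """Append one (name, timestamp) row per recognized face to a CSV log."""
        timestamp = datetime.now().isoformat(timespec="seconds")
        with open(path, "a", newline="") as f:
            writer = csv.writer(f)
            for name in names:
                writer.writerow([name, timestamp])
    ```

    Opening the file in append mode lets each recognition pass add to the same day's log rather than overwrite it.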
  • Classification of Music Genre Using Deep Learning Approaches

    Srinivas U.M., Rafi S., Manohar T.V., Rao M.V.

    Conference paper, 2024 4th International Conference on Artificial Intelligence and Signal Processing, AISP 2024, 2024, DOI Link

    View abstract ⏷

    Music genre classification is a pivotal area of research within audio technology, holding immense importance for content organization and recommendation. Audio feature extraction and music genre classification together form an integrated recognition system for comprehensive genre identification and organization. This technology is frequently utilized to accurately detect and classify the genres or characteristics present in audio signals, contributing significantly to the effective organization and recommendation of music content. Our experiment was conducted on the GTZAN dataset taken from the Kaggle repository. Convolutional neural networks (CNN) are employed to train our model, which is subsequently used to classify the music genres of audio signals.
  • ChronicNet: Random Forest Classifier-based Chronic Heart Failure Detection with CNN Feature Analysis

    Malleswari D., Vazram B.J., Shaik R.

    Conference paper, Proceedings - 2024 13th IEEE International Conference on Communication Systems and Network Technologies, CSNT 2024, 2024, DOI Link

    View abstract ⏷

    Chronic heart failure (CHF) is a common illness that affects the heart. Conventional machine learning and deep learning algorithms have failed to identify CHF in its early stages, so this work implements ChronicNet by combining the strengths of both deep learning and machine learning. The study builds a network for detecting CHF from sound recordings of the heart, known as phonocardiogram (PCG) data. The first step, comprising noise removal, signal boosting, and segmentation, ensures that the heartbeat sounds are of good quality. We use Mel-frequency cepstral coefficients (MFCC) to analyze the frequency changes in a heartbeat signal, which helps pick out the features that distinguish CHF. Deep-learning-based convolutional neural network (CNN) feature extraction then automatically discovers discriminative features from the MFCCs. A Random Forest classifier (RFC) is used to build a model that predicts CHF faster. For hard-to-classify groups, the RFC uses a set of decision trees, which brings benefits such as robustness, scalability, and high accuracy. The proposed system uses CNN features to detect CHF from PCG data and correctly identifies it using the RFC. The simulation results show that the proposed ChronicNet outperformed traditional approaches with 98.84% accuracy.
  • SCT: Summary Caption Technique for Retrieving Relevant Images in Alignment with Multimodal Abstractive Summary

    Rafi S., Das R.

    Article, ACM Transactions on Asian and Low-Resource Language Information Processing, 2024, DOI Link

    View abstract ⏷

    This work proposes an efficient Summary Caption Technique that takes the multimodal summary and image captions as input to retrieve, from the captions, the images most relevant to the multimodal summary. Matching a multimodal summary with an appropriate image is a challenging task in computer vision and natural language processing; merging these fields is tedious, though the research community has steadily focused on cross-modal retrieval. Related issues include visual question answering, matching queries with images, and semantic relationship matching between two modalities for retrieving the corresponding image. Relevant works consider questions to match visual information with object detection, match text with visual information, and employ structural-level representations to align images with text. However, these techniques primarily focus on retrieving images for text or for image captioning; less effort has been spent on retrieving relevant images for a multimodal summary. Hence, our proposed technique extracts and merges features in the Hybrid Image Text layer and embeds captions semantically with word2vec, where contextual features and semantic relationships are compared and matched between the modalities, vector by vector, using cosine semantic similarity. In cross-modal retrieval, we obtain the top five related images and align with the multimodal summary the image that achieves the highest cosine score among those retrieved. The model has been trained with a sequence-to-sequence model for 100 epochs, reducing information loss with sparse categorical cross-entropy. Further, experimenting with the multimodal summarization with multimodal output dataset in cross-modal retrieval helps evaluate the quality of image alignment with an image-precision metric, demonstrating the best results.
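    The top-five cross-modal retrieval step can be sketched as a cosine-similarity ranking; the embeddings below are toy vectors, not the model's actual word2vec features.

    ```python
    import math

    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return dot / (nu * nv) if nu and nv else 0.0

    def retrieve_best_image(summary_vec, caption_vecs, k=5):
        """Rank caption embeddings against the summary embedding, keep the top-k,
        and return the image id with the highest cosine score plus the ranking."""
        ranked = sorted(
            ((img_id, cosine(summary_vec, vec)) for img_id, vec in caption_vecs.items()),
            key=lambda p: p[1], reverse=True,
        )
        top_k = ranked[:k]
        return top_k[0][0], top_k
    ```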
  • Machine Learning Framework for Women Safety Prediction using Decision Tree

    Sowmika P.S., Rao S.S.N., Rafi S.

    Conference paper, Proceedings - 5th International Conference on Smart Systems and Inventive Technology, ICSSIT 2023, 2023, DOI Link

    View abstract ⏷

    In every city, harassment and violence have become major problems for women. Further, women's personal lives suffer from the bullying and abusive content present on Online Social Networking (OSN) platforms. Therefore, it is necessary to assess women's safety in the OSN environment. Traditional methodologies, however, came up short when predicting maximum-safety analysis. This study therefore employs a decision tree (WSP-DT) classifier to make predictions about women's safety. The Twitter dataset chosen for system implementation is first pre-processed to remove blanks and unknowns. The tweets are then processed with the Natural Language Toolkit (NLTK), which handles tasks including tokenization, case conversion, stop-word detection, stemming, and lemmatization. Next, a TextBlob procedure determines the positive, negative, and neutral polarity of the pre-processed tweets. To further extract data characteristics based on word and character frequency, term frequency-inverse document frequency (TF-IDF) is used. Finally, a decision tree classifier, after several rounds of training, determines whether a tweet is phoney or real. Testing on the Twitter dataset demonstrates that the proposed WSP-DT classifier outperforms the competition in simulations.
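    The TF-IDF weighting used for feature extraction can be illustrated with a standard-library sketch of one common variant (raw term frequency times log inverse document frequency); the toy documents are hypothetical.

    ```python
    import math
    from collections import Counter

    def tf_idf(docs):
        """Per-document TF-IDF weights for a list of tokenized documents."""
        n = len(docs)
        df = Counter()                       # document frequency per term
        for doc in docs:
            df.update(set(doc))
        weights = []
        for doc in docs:
            tf = Counter(doc)
            total = len(doc)
            weights.append({
                # Terms occurring in every document get idf = log(1) = 0.
                term: (count / total) * math.log(n / df[term])
                for term, count in tf.items()
            })
        return weights
    ```

    A term like "city" that appears in every document is weighted to zero, while rarer, more discriminative terms keep positive weight.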
  • Abstractive Text Summarization Using Multimodal Information

    Rafi S., Das R.

    Conference paper, 2023 10th International Conference on Soft Computing and Machine Intelligence, ISCMI 2023, 2023, DOI Link

    View abstract ⏷

    Much text is generated over the internet through news articles, story writing and blogs. Reading and understanding such an enormous amount of data is problematic for the user, costing time and effort. Automatic abstractive text summarization has therefore gained importance as a way to increase the user's understanding and reduce time. It shortens the given input while preserving the meaning and identifying the context of the whole document to generate meaningful sentences. The research community has proposed different methods for text reduction and generating abstractive summaries; however, problems like semantics and contextual relationships in the summary-generation process still need improvement. Multimodal abstractive text summarization is a technique that combines text and image information, addressing semantics and contextual relationships through a proposed Multimodality Image Text (MIT) layer: global text features extracted with GloVe embeddings preserve the semantics of the vocabulary, while contextual-relationship features of text-related images come from Inception v3, and the two are fused in the MIT layer to generate efficient multimodal abstractive text summaries, trained and tested with a sequence-to-sequence model. Experiments with the MSMO dataset achieve superior performance over other state-of-the-art results.
  • Sentiment-Based Abstractive Text Summarization Using Attention Oriented LSTM Model

    Debnath D., Das R., Rafi S.

    Conference paper, Smart Innovation, Systems and Technologies, 2022, DOI Link

    View abstract ⏷

    Product reviews are essential as they help customers make purchase decisions. Often these product reviews tend to be too abundant, lengthy and descriptive. However, an abstractive text summarization (ATS) system can build an internal semantic representation of the text, using natural language processing to create sentiment-based summaries of entire reviews. An ATS system is proposed in this work, framed as a many-to-many sequence problem, in which attention-based long short-term memory (LSTM) is incorporated to generate summaries of the product reviews. The proposed ATS system works in two stages. In the first stage, a data-processing module creates a structured representation of the text, with noise and other irrelevant data removed. In the second stage, the attention-based LSTM model is designed to train, validate and test the system. The proposed approach is found to deal better with rare words and to generate readable, short and informative summaries. An experimental analysis has been carried out over the Amazon product review dataset, with the ROUGE measure used to validate the results. The obtained results reflect the efficacy of the proposed method over other state-of-the-art methods.
  • Detection of Active Attacks Using Ensemble Machine Learning Approach

    Revathy S., Sathya Priya S., Rafi S., Pranideep P., Rajesh D.

    Conference paper, Lecture Notes in Networks and Systems, 2022, DOI Link

    View abstract ⏷

    Cyber attackers entrap online users and companies by stealing their sensitive information without their knowledge. Sensitive information such as login credentials, bank card details, and the centralized servers of organizations are hacked by the attackers. Some cyber attacks are phishing attacks, where attackers lead online users to trust their websites as legitimate and retrieve their personal information, making them prey to the attack. A malware attack is one where the attacker injects malicious software into a company server or an online user's device without their knowledge and steals all the data on those devices and servers. An intrusion is an invasion in which the attacker attacks the network and steals its resources. Many cyber-attack solutions have been proposed, such as visual-similarity-based approaches, intrusion detection systems, and signature-based, heuristic-based, specification-based and anomaly-based methods, but they have disadvantages. Because of unsecured HTTP websites and a lack of cyber knowledge, cyber attacks are increasing day by day. In our proposed system, a unified ensemble approach (UEA) is built by combining different machine learning algorithms, giving better accuracy and detection rate. This model aims to detect intrusions and phishing attacks and to prevent malware, thereby mitigating the cyber attacks encountered by individuals and organizations.
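    A minimal sketch of the ensemble idea, assuming simple majority voting over the member classifiers' label predictions (the paper's exact combination rule may differ):

    ```python
    from collections import Counter

    def majority_vote(predictions):
        """Combine per-classifier label lists into one ensemble prediction per sample.

        `predictions` is a list of lists: one list of predicted labels per classifier,
        all over the same samples in the same order."""
        ensemble = []
        for sample_votes in zip(*predictions):
            # Most common label among the classifiers wins for this sample.
            ensemble.append(Counter(sample_votes).most_common(1)[0][0])
        return ensemble
    ```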
  • RNN Encoder and Decoder with Teacher Forcing Attention Mechanism for Abstractive Summarization

    Rafi S., Das R.

    Conference paper, Proceedings of the 2021 IEEE 18th India Council International Conference, INDICON 2021, 2021, DOI Link

    View abstract ⏷

    Automatic text summarization is utilized to address the sustained increase in text data available online; to cope with such voluminous data, short summaries are preferred. In this work, abstractive text summarization is carried out with a Recurrent Neural Network-based Long Short-Term Memory encoder and attention decoder. In recent years, this type of summarization has been carried out with Recurrent Neural Networks, which learn from previous time steps. In this paper, we propose the Teacher Forcing technique to improve the slow convergence and poor performance of Recurrent Neural Networks. Our technique improves summarization performance by using an attention decoder that learns from the ground truth instead of from previous time steps, and minimizes the error rate with a Stochastic Gradient Descent optimizer, producing more accurate results on the WikiHow dataset when measured with the ROUGE metric and performing well compared with state-of-the-art results.
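    The teacher-forcing idea can be shown with a toy next-token table standing in for the RNN; the bigram table and tokens below are invented for illustration.

    ```python
    def decode(next_token, start, target, teacher_forcing=True):
        """Generate a sequence step by step. With teacher forcing, the ground-truth
        token is fed back at each step instead of the model's own prediction."""
        inp, out = start, []
        for gold in target:
            pred = next_token(inp)
            out.append(pred)
            inp = gold if teacher_forcing else pred
        return out

    # Toy "model": a bigram table with one deliberate error after "the".
    bigram = {"<s>": "the", "the": "dog", "dog": "ran", "cat": "sat"}
    toy_model = lambda tok: bigram.get(tok, "<unk>")
    ```

    With the gold sequence `["the", "cat", "sat"]`, teacher forcing recovers after the wrong prediction (the next input is the gold `"cat"`), whereas free-running decoding feeds its own wrong `"dog"` back in and the error compounds.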
  • A Linear Sub-Structure with Co-Variance Shift for Image Captioning

    Rafi S., Das R.

    Conference paper, 2021 8th International Conference on Soft Computing and Machine Intelligence, ISCMI 2021, 2021, DOI Link

    View abstract ⏷

    Automatic description of images has attracted many researchers in the field of computer vision, where image captioning in artificial intelligence connects with natural language processing. Exact caption generation for images is necessary but suffers from the gradient-diminishing problem; LSTM can overcome this problem by fusing local and global characteristics of image and text to generate sequenced word predictions for accurate image captioning. We consider the Flickr8k dataset, which pairs images with text descriptions. GloVe embeddings provide word representations that account for the global and local features of images, with Euclidean distance used to understand the relationships between words in vector space. The Inception V3 architecture, pretrained on ImageNet, is used to extract image features of the different objects in scenes. We propose a Linear Sub-Structure that helps generate the sequenced order of words for captioning by understanding the relationships between words. Image-feature extraction considers co-variance shift, which concentrates on the moving parts of the image, to generate an accurate description that maintains a semantic visual-grammar relationship between the image and the predicted caption. The proposed model is evaluated with the BLEU score, achieving greater than 81% accuracy and state-of-the-art results in our work compared with others.
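    BLEU scoring, used for evaluation above, can be sketched at the unigram level with a single reference; real BLEU also averages clipped precisions over higher-order n-grams.

    ```python
    import math
    from collections import Counter

    def bleu1(candidate, reference):
        """Unigram BLEU: clipped unigram precision times a brevity penalty."""
        cand, ref = Counter(candidate), Counter(reference)
        # Each candidate word counts at most as often as it appears in the reference.
        clipped = sum(min(c, ref[w]) for w, c in cand.items())
        precision = clipped / len(candidate)
        # Penalize candidates shorter than the reference.
        bp = 1.0 if len(candidate) >= len(reference) else math.exp(1 - len(reference) / len(candidate))
        return bp * precision
    ```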
  • Enhanced biomedical data modeling using unsupervised probabilistic machine learning technique

    Rizwana S., Challa K., Rafi S., Imambi S.S.

    Article, International Journal of Recent Technology and Engineering, 2019,

    View abstract ⏷

    Text mining approaches use feature-similarity or distributed keyword-searching techniques, whereas machine learning techniques develop a statistical model to categorize documents by learning from the vast number of medical documents available on PubMed. This is an unsupervised technique. The proposed algorithm enhances traditional document clustering techniques and generates an accurate and reliable model. We experimented with the algorithm on a 1000-document dataset; it showed significant improvement over other traditional algorithms.
Interests

  • Computer Vision
  • Content-Based Image Retrieval
  • Natural Language Processing

Publications
  • Smartphone Price Patterns Prediction Using Machine Learning Algorithms

    Rafi S., Rambabu Y., Rajasekhar P., Suhas B., Reddy M.S., Moturi S., Doda V.R.

    Conference paper, 6th IEEE International Conference on Recent Advances in Information Technology, RAIT 2025, 2025, DOI Link

    View abstract ⏷

    Selecting the best smartphone can be challenging due to the wide range of models available on the market. This study shows how machine learning models can predict mobile phone prices based on their features. We evaluated several machine learning techniques, including Logistic Regression, Decision Trees, Random Forest, SVC, K-Neighbors Classifier, Gaussian Naive Bayes (GaussianNB), AdaBoost, Gradient Boosting, Extra Trees, Bagging Classifiers, and XGBoost. The primary objective was to identify the most effective model for price forecasting and to investigate the factors influencing phone prices. Our research offers insights to both consumers and manufacturers, helping them make more informed decisions about phone features and pricing. We emphasize the importance of using diverse datasets that accurately represent various smartphone models and pricing points. Key factors affecting phone costs were identified, and model performance was assessed using metrics such as accuracy, F1-score, and classification reports. Model performance was further enhanced through hyperparameter tuning with GridSearchCV, achieving 97% accuracy with the Decision Tree, K-Neighbors Classifier, SVC, AdaBoost, and Random Forest models. Among these, the Decision Tree and SVC were selected as the optimal models, offering a good tradeoff between accuracy, flexibility, and time complexity. This study aims to provide valuable data to guide consumers in making informed choices about mobile phone features and price ranges.
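    GridSearchCV-style exhaustive tuning can be sketched without scikit-learn as a loop over the Cartesian product of a parameter grid; the grid values and scoring function below are hypothetical.

    ```python
    from itertools import product

    def grid_search(evaluate, grid):
        """Score every hyperparameter combination and return the best one.

        `evaluate` maps a parameter dict to a validation score (higher is better);
        `grid` maps parameter names to lists of candidate values."""
        keys = list(grid)
        best_params, best_score = None, float("-inf")
        for values in product(*(grid[k] for k in keys)):
            params = dict(zip(keys, values))
            score = evaluate(params)
            if score > best_score:
                best_params, best_score = params, score
        return best_params, best_score
    ```

    In practice the `evaluate` callback would run cross-validation for the given parameters; here any scoring function works.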
  • Chronic Kidney Disease Prediction Using Machine Learning and Deep Learning Models

    Rafi S., Revanth N., Reddy K.V.R., Babu K.M., Kumar Y.L.P., Kumar N.V.

    Conference paper, 2025 IEEE International Conference on Interdisciplinary Approaches in Technology and Management for Social Innovation, IATMSI 2025, 2025, DOI Link

    View abstract ⏷

    Chronic kidney disease is a noticeable health condition that can persist throughout an individual's life, resulting from either kidney malignancy or diminished kidney function. In this work, we investigate how several machine learning techniques might provide an early CKD diagnosis. While previous research has extensively explored this area, our aim is to refine the approach by employing predictive modeling techniques. Initially, we considered 25 variables alongside the class attribute. The dataset used in this study underwent extensive processing, including renaming columns for clarity, converting identified columns to numeric types, handling unique values containing letters, handling partitioned values, fixing incorrect values, filling null values with the mean, and encoding categorical values into numerical notation. In addition, principal component analysis (PCA) was employed to lower dimensionality. Our findings demonstrated that the XGBoost classifier surpassed every other algorithm, with an accuracy of 0.991.
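    Two of the preprocessing steps named above, mean-filling null values and encoding categorical values as numbers, can be sketched column-wise; the toy rows, the use of None for missing values, and the assumption that each column is either numeric-with-missing or purely categorical are all simplifications.

    ```python
    def preprocess(rows):
        """Fill missing numeric values with the column mean and map category
        strings to integer codes, one column at a time."""
        cols = list(zip(*rows))
        cleaned = []
        for col in cols:
            numeric = [v for v in col if isinstance(v, (int, float))]
            if numeric:
                # Numeric column: replace non-numeric entries (e.g. None) with the mean.
                mean = sum(numeric) / len(numeric)
                cleaned.append([v if isinstance(v, (int, float)) else mean for v in col])
            else:
                # Categorical column: stable integer codes in sorted order.
                codes = {v: i for i, v in enumerate(sorted(set(col)))}
                cleaned.append([codes[v] for v in col])
        return [list(r) for r in zip(*cleaned)]
    ```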
  • Detecting Sarcasm Across Headlines and Text

    Rafi S., Niharika A.L., Neelima S., Nikhitha K., Reddy M.S., Sireesha M.

    Conference paper, 2025 IEEE International Conference on Interdisciplinary Approaches in Technology and Management for Social Innovation, IATMSI 2025, 2025, DOI Link

    View abstract ⏷

    In this era of rapid growth in social media usage among the current generation, a huge amount of content and comments, much of it sarcastic, is seen. Sarcasm has turned out to be an important part of daily life, especially in news and social media, where sarcastic comments are often used to attract attention. However, detecting sarcasm is always challenging because it requires understanding the difference between what is said and what is meant. The current paper focuses on the detection of sarcasm in news headlines with the help of deep learning. Previous works were based on a wide range of datasets; however, these had limitations in either size or quality. In this respect, the authors propose creating a new dataset of headlines from sarcastic news sites and real news sites that is large and of high quality, hence appropriate for machine-learning model training. The authors have also used a CNN-BiLSTM architecture for text analysis, identifying sarcastic expression and deciding whether a headline is sarcastic or non-sarcastic, which gained an accuracy of 97%. This dataset is made publicly available to enable further research in this direction.
  • Multimodal Multi-objective Grey Wolf Optimisation with SVM and Random Forest as Classifier in Feature Selection

    Das R., Rafi S., Purwar H., Laskar R.H., Rajshekhar A., Chandrawanshi N.

    Conference paper, Lecture Notes in Networks and Systems, 2025, DOI Link

    View abstract ⏷

    A technique called feature selection, often referred to as attribute subset selection, selects the optimal subset of features for a given dataset by reducing dimensionality and eliminating unnecessary characteristics. There can be 2^n feasible solutions for a dataset with n features, which is challenging to address using conventional attribute-selection methods. Metaheuristic-based approaches perform better than traditional procedures in such situations, and numerous evolutionary computing techniques have been applied effectively to feature selection challenges. Some research has also been done on the distribution of solutions in the decision space. Since many optimisation problems contain two or more competing goals, achieving one necessitates trading off the others. The multi-objective optimisation technique discussed in this research finds the most effective trade-off between the objectives; multi-objective programs require multiple non-dominated solutions rather than just one. In the initial stage, we applied Grey Wolf Optimisation (GWO) to acquire the optimised features. Based on the selected features, we trained two classifiers, Support Vector Machine (SVM) and Random Forest (RF), in the second phase. Experiments have been carried out on three benchmark datasets, the Glass, Wine and Breast Cancer datasets retrieved from the UCI repository, to evaluate the effectiveness of the recommended feature-selection approach and show the supremacy of the proposed technique. The testing results show that the suggested GWO with Random Forest performs better than GWO with SVM.
  • Optimizing English Learning with AI: Machine Learning Techniques and Tools

    Jayaranjan M., Shekhar G.R., Rafi S., Ramesh J.V.N., Kiran A., Vimochana M.

    Conference paper, International Conference on Intelligent Systems and Computational Networks, ICISCN 2025, 2025, DOI Link

    View abstract ⏷

    This research presents a new AI-driven approach to optimizing English language learning with a TF-IDF-based Gated Recurrent Unit model. It provides an effective framework for assessing and improving student writing competency by using advanced text classification approaches. The six criteria evaluated on the basis of the ELLIPSE corpus of essays by grade 8-12 learners are coherence, syntax, vocabulary, phraseology, grammar, and conventions. After extensive preprocessing (tokenization, stop-word removal, stemming, and lemmatization), TF-IDF converts key text features into numerical representations that are fed into a GRU model capturing long-term dependencies and contextual meaning, followed by classification with a dense layer and softmax activation. Implemented in Python, the GRU model yields an accuracy of 99.7%, precision of 99.04%, recall of 99.56%, and F1-score of 99.54%. The work provides a rigorous methodology for text classification that improves students' writing skills and advances AI tools in education.
  • AI vs Human Text Detector: Identifying AI-Generated Content Using NLP

    Reddy C.R.G.R., Muthukumar D.S., Thogai Rani B., Ganesh N., Rafi S.M., Vamsi P.

    Conference paper, Proceedings of the 2025 3rd International Conference on Inventive Computing and Informatics, ICICI 2025, 2025, DOI Link

    View abstract ⏷

    The evolving capabilities of AI models such as GPT-3, GPT-4, and Deepseek have made it ever more difficult to distinguish between human-written and AI-generated text. The issue raises serious concerns regarding plagiarism, disinformation, and authorship of electronic content in academic, journalistic, and educational spheres. To this end, an innovative tool, the AI vs Human Text Detector, is introduced: a web application created to ascertain whether information was authored by a human or by an artificial intelligence model. The system uses Natural Language Processing (NLP) methods to examine prominent linguistic characteristics, such as perplexity and burstiness, that distinguish the homogeneous patterns of AI text from the naturally heterogeneous patterns of human writing. The pre-trained GPT-2 is utilized to quantify textual predictability and variability, thus improving classification accuracy. Visualization of results is facilitated through Matplotlib and Plotly. The software tool is optimized to run on low-cost, commonly available resources, ensuring both accessibility and scalability. The proposed model achieves a high accuracy of 99.2% in detecting AI-generated content, outperforming existing models like ERT, CNN, and BERT-CNN. Future enhancements involve multilingual detection and hybrid detection models. The contribution of this work lies in supporting content integrity and ethical AI use in digital communication.
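    The burstiness cue mentioned above can be approximated as the coefficient of variation of sentence lengths; this proxy and the regex-based sentence split are simplifying assumptions, not the tool's actual measure.

    ```python
    import math
    import re

    def burstiness(text):
        """Coefficient of variation of sentence lengths: a simple proxy for the
        'burstiness' cue, since human writing tends to vary sentence length more."""
        sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
        lengths = [len(s.split()) for s in sentences]
        mean = sum(lengths) / len(lengths)
        var = sum((l - mean) ** 2 for l in lengths) / len(lengths)
        return math.sqrt(var) / mean if mean else 0.0
    ```

    Perfectly uniform sentences score 0; a mix of very short and very long sentences scores high.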
  • Multiple Disease Detection using CNN based Model from Chest X-rays

    Seviappan A., Chethan D., Sri P.K., Rafi S., Shahi S.G.

    Conference paper, Proceedings of 3rd International Conference on Sustainable Computing and Data Communication Systems, ICSCDS 2025, 2025, DOI Link

    View abstract ⏷

    Chest X-rays are one of the most widely used diagnostic tools for identifying lung-related diseases such as pneumonia, tuberculosis, and lung cancer. However, manual interpretation of these images can be time-consuming and subject to inconsistencies, especially in high-pressure clinical settings or areas with limited access to radiologists. To address these challenges, this study introduces a deep learning-based system that leverages Convolutional Neural Networks (CNNs) for automatic disease detection from chest X-ray images. The model is trained on a large, diverse dataset and incorporates essential preprocessing steps like histogram equalization to improve image quality and enhance diagnostic accuracy. The CNN-based framework is designed to automatically extract features and classify X-ray images into multiple disease categories. It aims to reduce diagnostic delays and support radiologists by offering fast and consistent evaluations. The system's performance is assessed using standard metrics such as accuracy, sensitivity, specificity, and F1-score, demonstrating its effectiveness as a decision-support tool in clinical workflows. By reducing dependence on manual review, this approach enhances both scalability and reliability in medical imaging diagnostics. Beyond automated image analysis, this paper also explores the integration of a real-time health monitoring solution for elderly individuals using a smartwatch-based system. This wearable device tracks vital signs such as heart rate and physical activity and provides instant alerts to caregivers in the event of abnormal readings. The combined application of AI-powered diagnostics and wearable health tracking presents a dual solution that supports early disease detection and improves continuous patient monitoring - paving the way for more responsive, accessible, and patient-centered healthcare.
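    Histogram equalization, the preprocessing step named above, can be sketched for a flat list of grayscale intensities; the standard CDF-remapping formula is used, and the 4-pixel example is illustrative.

    ```python
    def equalize_histogram(pixels, levels=256):
        """Histogram equalization for a flat list of grayscale values in [0, levels)."""
        n = len(pixels)
        hist = [0] * levels
        for p in pixels:
            hist[p] += 1
        # Cumulative distribution of intensities.
        cdf, total = [], 0
        for count in hist:
            total += count
            cdf.append(total)
        cdf_min = next(c for c in cdf if c)  # first non-zero CDF value
        # Remap each pixel so the output intensities spread over the full range.
        return [round((cdf[p] - cdf_min) / (n - cdf_min) * (levels - 1)) for p in pixels]
    ```

    A low-contrast image whose values cluster in a narrow band gets stretched across the full 0-255 range, which is what makes faint X-ray structures easier for a downstream CNN to pick up.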
  • Detecting multimodal cyber-bullying behaviour in social-media using deep learning techniques

    MohammedJany S., Killi C.B.R., Rafi S., Rizwana S.

    Article, Journal of Supercomputing, 2025, DOI Link

    View abstract ⏷

    Cyberbullying detection refers to the process of classifying and identifying cyberbullying behavior, which involves the use of technology to harass or bully individuals, typically through online platforms. A growing concern is the spread of bullying memes on social media, which can perpetuate harmful behavior. While much of the existing research focuses on detecting cyberbullying in text-based data, image-based cyberbullying has not received as much attention. This is a significant issue because many social media posts combine images with text, and the visual content can be a key component of cyberbullying. To address this, our research aims to develop a multimodal cyberbullying detection model (MCB) capable of detecting bullying in both images and text. For this, we used the VGG16 pretrained model to detect bullying in images and the XLM-RoBERTa with BiGRU model to detect bullying in text. We then integrated these models (VGG16 + XLM-RoBERTa and BiGRU) using attention mechanisms, CLIP, feedback mechanisms, CentralNet, etc., and proposed a model for detecting cyberbullying in image-text-based memes. Our model produced a reasonable accuracy of 74%, indicating that the system is effective in recognizing most cyberbullying activity.
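A minimal numpy sketch of attention-style fusion of two modality vectors, in the spirit of the integration described above. The real model fuses VGG16 and XLM-RoBERTa + BiGRU outputs; the random stand-in vectors, shapes, and function names here are assumptions for illustration only:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def fuse(img_feat, txt_feat, w_img, w_txt):
    """Attention-style late fusion: score each (already projected) modality
    vector, turn the two scores into weights, and take the weighted blend."""
    attn = softmax(np.array([img_feat @ w_img, txt_feat @ w_txt]))
    return attn[0] * img_feat + attn[1] * txt_feat, attn

rng = np.random.default_rng(0)
img_feat, txt_feat = rng.normal(size=8), rng.normal(size=8)  # stand-in features
w_img, w_txt = rng.normal(size=8), rng.normal(size=8)        # learned scoring vectors
fused, attn = fuse(img_feat, txt_feat, w_img, w_txt)
```

The fused vector would then feed a final bully/non-bully classifier head.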
  • Multimodal: A Text-Image based Cyber-Bullying Detecting with Deep Learning

    Ankarao A., Rafi S., Eluri R.K., Reddy K.V.N.

    Conference paper, 2025 IEEE International Conference on Interdisciplinary Approaches in Technology and Management for Social Innovation, IATMSI 2025, 2025, DOI Link

    View abstract ⏷

    The practice of categorizing and identifying cyberbullying behavior, which includes using technology to harass or intimidate people, usually through online platforms, is known as cyberbullying detection. To tackle this, we examined a publicly available dataset labeled as bully or non-bully based on text, image, and image-text. We then proposed a deep learning model that can identify cyberbullying in multimodal data. Bullying in text is detected using the XLM-RoBERTa with BiGRU model, while bullying in images is identified by the VGG16 pre-trained model. Using attention mechanisms, CLIP, feedback mechanisms, CentralNet, and other tools, we combined these models (VGG16 + XLM-RoBERTa and BiGRU) and developed a model for identifying cyberbullying in image-text-based memes. With a respectable accuracy of 72%, our final model demonstrated that the system is capable of identifying the majority of cyberbullying incidents.
  • Recognizing Image Manipulations Utilizing CNN and ELA

    Nallamothu K., Rafi S., Kokkiligadda S., Jany S.M.

    Conference paper, Lecture Notes in Networks and Systems, 2025, DOI Link

    View abstract ⏷

    Tampering of digital photos or images is known as image forgery. The ability to create phony images or information has become easier due to the rapid advancement of technology. In order to detect image forgeries, this paper proposes a model that employs Error Level Analysis (ELA) with Convolutional Neural Networks (CNN). ELA is used as a preprocessing step to highlight regions of an image that may have been tampered with. CNN is then trained on this enhanced data to classify images based on their authenticity and detect digital modifications. This initiative’s main goals include image classification, attribute extraction, image authenticity verification, and digital image modification detection. Our suggested solution makes use of CNNs’ deep learning capabilities and the refinement found by ELA.
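The core idea behind ELA (a recompressed image differs most where content was freshly pasted in) can be illustrated with a toy numpy model in which coarse quantization stands in for JPEG compression. Everything below is a simplified assumption for illustration, not the paper's implementation:

```python
import numpy as np

def quantize(img, q=16):
    # Stand-in for one round of lossy JPEG compression.
    return (img // q) * q

def error_level_map(img, q=16):
    """Toy Error Level Analysis: recompress and take the absolute difference.
    Pixels that already survived one compression round are idempotent under
    a second one (zero error); freshly pasted content is not."""
    return np.abs(img.astype(int) - quantize(img, q).astype(int))

rng = np.random.default_rng(1)
original = quantize(rng.integers(0, 256, size=(32, 32)))   # once-compressed photo
tampered = original.copy()
tampered[8:16, 8:16] = rng.integers(0, 256, size=(8, 8))   # pasted-in, never compressed
ela = error_level_map(tampered)
```

In the paper's pipeline, maps like `ela` (computed with real JPEG recompression) are what the CNN is trained on to classify images as authentic or forged.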
  • Topic-guided abstractive multimodal summarization with multimodal output

    Rafi S., Das R.

    Article, Neural Computing and Applications, 2025, DOI Link

    View abstract ⏷

    Summarization is a technique that produces condensed text from large text documents by using different deep-learning techniques. Over the past few years, abstractive summarization has drawn much attention because of the capability of generating human-like sentences with the help of machines. However, it still struggles with repetition, redundancy and lexical problems while generating sentences. Previous studies show that incorporating images with text modality in the abstractive summary may reduce redundancy, but the focus still needs to be on the semantics of the sentences. This paper considers adding a topic to a multimodal summary to address semantic and linguistic problems, which stresses the need to develop a multimodal summarization system with a topic. Multimodal summarization uses two or more modalities to extract the essential features to increase user satisfaction in generating an abstractive summary. The paper's primary aim is to explore the generation of user-preference summaries on a particular topic by proposing a Hybrid Image Text Topic (HITT) model that guides the essential information extracted from text and image modalities with the help of a topic, addressing semantic and linguistic problems to generate a topic-guided abstractive multimodal summary. Furthermore, a caption-summary order space technique has been introduced in this work to retrieve the relevant image for the generated summary. Finally, the MSMO dataset is used to compare and validate the results with ROUGE and image-precision scores. Besides, we also calculated the model's loss using sparse categorical cross-entropy and showed significant improvement over other state-of-the-art techniques.
  • Reducing extrinsic hallucination in multimodal abstractive summaries with post-processing technique

    Rafi S., Laitonjam L., Das R.

    Article, Neural Computing and Applications, 2025, DOI Link

    View abstract ⏷

    Multimodal abstractive summarization integrates information from diverse modalities, such as text and images, to generate concise, coherent summaries. Despite its advancements, extrinsic hallucination, a phenomenon where generated summaries include content not present in the source, remains a critical challenge, leading to inaccuracies and reduced reliability. Existing techniques largely focus on intrinsic hallucinations and often require substantial architectural changes, leaving extrinsic hallucination inadequately addressed. This paper proposes a novel post-processing approach to mitigate extrinsic hallucination by leveraging external knowledge vocabularies as a corrective mechanism. The framework identifies hallucinated tokens in generated summaries using a cosine similarity metric and replaces them with factually consistent tokens sourced from external knowledge, ensuring improved coherence and faithfulness. By operating at the token level, the approach preserves linguistic, semantic, and syntactic structures, effectively reducing unfaithful content. The proposed method incorporates domain knowledge through a Word2Vec-based vocabulary, offering a scalable solution to enhance factual consistency without modifying model architectures. The contributions of this study include a detailed methodology for identifying and correcting hallucinated tokens, a robust post-processing pipeline for refining summaries, and a demonstration of the method's effectiveness in reducing extrinsic hallucinations. Experimental results on the MSMO dataset highlight the approach's potential to improve the accuracy, coherence, and reliability of multimodal abstractive summarization systems, addressing a significant gap in the field and showing state-of-the-art results with the RMSProp optimizer.
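The token-level correction step can be sketched as follows, with tiny hand-made embeddings standing in for the paper's Word2Vec vocabulary; all words and vectors below are illustrative assumptions:

```python
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

def correct_summary(tokens, embed, source_vocab):
    """Keep tokens grounded in the source; replace any out-of-source token
    (treated as an extrinsic hallucination) with its nearest in-vocabulary
    neighbour by cosine similarity."""
    return [tok if tok in source_vocab
            else max(source_vocab, key=lambda w: cosine(embed[tok], embed[w]))
            for tok in tokens]

embed = {
    "flood": np.array([1.0, 0.2]),
    "rain":  np.array([0.9, 0.3]),
    "storm": np.array([0.2, 0.9]),
    "alien": np.array([0.1, 1.0]),   # never appears in the source article
}
source_vocab = {"flood", "rain", "storm"}
fixed = correct_summary(["rain", "alien"], embed, source_vocab)
```

Because the swap happens token by token after generation, the model architecture is untouched, which is the point of the post-processing design.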
  • Unveiling Student Success: A Multifaceted Approach with Learning Coefficients and Beyond

    VijayaKumar N., Rafi S., Mahith T., Reddy D.V., Naik K.R., Raju K.

    Conference paper, 2nd IEEE International Conference on Integrated Intelligence and Communication Systems, ICIICS 2024, 2024, DOI Link

    View abstract ⏷

    Student performance is examined in this study using a number of Educational Data Mining (EDM) methods. Clustering and classification techniques are employed to classify the course as well as performance in the entrance examination. The results show that Random Forest and XGBoost, which are machine learning models, outperform traditional methods for predicting student success. Moreover, CNN and LSTM networks, which are deep learning models, improve prediction accuracy even further. Evaluated through metrics like accuracy, precision, recall and F1-score, this study shows that early recognition of patterns helps to reduce failure rates to a considerable extent. The results of this study suggest that there is potential scope for further improving prediction algorithms and the management of educational resources, which is of great relevance to institutions furthering student success.
  • Rainfall Prediction Using Machine Learning

    Sreenivasu S.V.N., Rafi S., Lakshmi V.V.A.S., Sivanageswara Rao S., Rajani Ch.

    Conference paper, Proceedings of 2024 2nd International Conference on Recent Trends in Microelectronics, Automation, Computing, and Communications Systems: Exploration and Blend of Emerging Technologies for Future Innovation, ICMACC 2024, 2024, DOI Link

    View abstract ⏷

    The initiative's dataset is stored in Microsoft Excel and processed in Python. A wide range of machine learning algorithms are used to discover which strategy generates the most accurate predictions. In many parts of the country, rainfall forecasting is critical for avoiding major natural disasters. This forecast was created using a variety of machine learning approaches, including CatBoost, XGBoost, decision tree, random forest, logistic regression, neural network, and LightGBM. The Weather Dataset was utilized. The primary goal of the research is to evaluate a variety of algorithms and determine which one performs best. Farmers may greatly profit from growing the appropriate crops based on the amount of water they require.
  • Automatic Attendance Management System Using CNN

    Sreenivasu S.V.N., Rajani C., Rani B.U., Dasaradha A., Rafi S.

    Conference paper, 2024 4th International Conference on Artificial Intelligence and Signal Processing, AISP 2024, 2024, DOI Link

    View abstract ⏷

    Facial recognition technology plays a crucial role in various applications, from enhancing security at banks and organizations to streamlining attendance tracking in public gatherings and educational institutions. Traditional methods of attendance marking, such as signatures, names, and biometrics, can be time-consuming and error-prone. To address these challenges, a smart attendance system is proposed, leveraging Deep Learning, Convolutional Neural Networks (CNN), and the OpenCV library in Python for efficient face detection and recognition. The system utilizes advanced algorithms, including Eigenfaces and Fisherfaces, to recognize faces accurately. While deep learning models excel with large datasets, they may not perform optimally with few samples. By comparing input faces with images in the dataset, the system automatically records recognized names and timestamps in a CSV file, which is then sent to the respective organization's head. Additionally, the system allows users to upload a single photo or a group photo and returns matched photos as output using a CNN, a feature that enhances the system's flexibility and usability.
  • Classification of Music Genre Using Deep Learning Approaches

    Srinivas U.M., Rafi S., Manohar T.V., Rao M.V.

    Conference paper, 2024 4th International Conference on Artificial Intelligence and Signal Processing, AISP 2024, 2024, DOI Link

    View abstract ⏷

    Music genre classification is a pivotal area of research within audio technology, holding immense importance for content organization and recommendation. Audio feature extraction and music genre classification together form an integrated recognition system for comprehensive genre identification and organization. This technology is frequently utilized to accurately detect and classify the various music genres or characteristics present in audio signals, contributing significantly to the effective organization and recommendation of music content. Our experiment was conducted with the GTZAN dataset taken from the Kaggle repository. Convolutional neural networks (CNN) are employed to train our model, which is subsequently utilized for the classification of music genres in audio signals.
  • ChronicNet: Random Forest Classifier-based Chronic Heart Failure Detection with CNN Feature Analysis

    Malleswari D., Vazram B.J., Shaik R.

    Conference paper, Proceedings - 2024 13th IEEE International Conference on Communication Systems and Network Technologies, CSNT 2024, 2024, DOI Link

    View abstract ⏷

    Chronic heart failure (CHF) is a common illness that affects the heart. Conventional machine learning and deep learning algorithms have failed to identify CHF in its early stages, so this work implemented ChronicNet by combining the properties of both deep learning and machine learning. This study creates a network for detecting CHF from sound recordings of the heart, called phonocardiogram (PCG) data. The first step (noise removal, signal boosting, and segmentation) ensures that heartbeat sounds are of good quality. We use Mel-frequency cepstral coefficients (MFCC) to analyze the frequency changes in a heartbeat signal, which helps pick out the features that make CHF heart disease distinctive. Convolutional neural network (CNN) feature extraction, using deep learning, automatically discovers additional special features from the MFCCs. The Random Forest classifier (RFC) is used to build a model that can predict CHF faster. For hard-to-classify groups, the RFC uses a set of decision trees, which gives benefits such as robustness, scalability, and high accuracy. The proposed system uses CNN features to find CHF in PCG data and correctly identify it using the RFC. The simulation results show that the proposed ChronicNet outperformed traditional approaches with 98.84% accuracy.
  • SCT: Summary Caption Technique for Retrieving Relevant Images in Alignment with Multimodal Abstractive Summary

    Rafi S., Das R.

    Article, ACM Transactions on Asian and Low-Resource Language Information Processing, 2024, DOI Link

    View abstract ⏷

    This work proposes an efficient Summary Caption Technique that takes the multimodal summary and image captions as input to retrieve, from the captions, the corresponding images that are most relevant to the multimodal summary. Matching a multimodal summary with an appropriate image is a challenging task in computer vision and natural language processing. Bridging these fields is tedious, though the research community has steadily focused on cross-modal retrieval. Related issues include visual question answering, matching queries with images, and semantic relationship matching between two modalities for retrieving the corresponding image. Relevant works consider questions to match the relationships in visual information and object detection, match text with visual information, and employ structural-level representations to align images with text. However, these techniques primarily focus on retrieving images for text or for image captioning, and less effort has been spent on retrieving relevant images for a multimodal summary. Hence, our proposed technique extracts and merges features in the Hybrid Image Text layer and captions in semantic embeddings with word2vec, where the contextual features and semantic relationships are compared and matched vector by vector between the modalities using cosine semantic similarity. In cross-modal retrieval, we obtain the top five related images and align with the multimodal summary the relevant image that achieves the highest cosine score among those retrieved. The model has been trained with a seq-to-seq model for 100 epochs, reducing information loss with sparse categorical cross-entropy. Further experiments on the multimodal summarization with multimodal output dataset help evaluate the quality of image alignment with an image-precision metric, demonstrating the best results.
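The cosine-ranked retrieval of the top five images can be sketched with numpy; the two-dimensional embeddings below are toy stand-ins for the model's summary and caption vectors, not values from the paper:

```python
import numpy as np

def top_k_images(summary_vec, caption_vecs, k=5):
    """Rank candidate images by cosine similarity between the summary
    embedding and each image-caption embedding; best match first."""
    caps = np.asarray(caption_vecs, dtype=float)
    sims = caps @ summary_vec / (np.linalg.norm(caps, axis=1) * np.linalg.norm(summary_vec))
    return list(np.argsort(-sims)[:k])

summary = np.array([1.0, 0.0])
captions = [[0.0, 1.0], [0.7, 0.7], [1.0, 0.1], [-1.0, 0.0], [0.9, 0.0], [0.2, 1.0]]
ranked = top_k_images(summary, captions)
```

The first index in `ranked` plays the role of the image aligned with the multimodal summary; the remaining four are the rest of the top-five pool.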
  • Machine Learning Framework for Women Safety Prediction using Decision Tree

    Sowmika P.S., Rao S.S.N., Rafi S.

    Conference paper, Proceedings - 5th International Conference on Smart Systems and Inventive Technology, ICSSIT 2023, 2023, DOI Link

    View abstract ⏷

    In every city, harassment and violence have become major problems for women. Further, women's personal lives suffer from the bullying and abusive content posted on Online Social Networking (OSN) platforms. Therefore, it is necessary to assess women's safety in the OSN environment. When it came to predicting maximum safety, however, traditional methodologies came up short. This study therefore employs a decision tree (WSP-DT) classifier to make predictions about women's safety. After considering the Twitter dataset for system implementation, it is pre-processed to remove blanks and unknowns. The tweets were then processed with the Natural Language Toolkit (NLTK), which handled tasks including tokenization, case conversion, stop-word detection, stemming, and lemmatization. Next, a TextBlob procedure determines the positive, negative, and neutral polarity of the pre-processed tweets. To further extract data characteristics based on word and character frequency, term frequency-inverse document frequency (TF-IDF) is used. Finally, a decision tree classifier was used, after several rounds of training, to determine whether a tweet was fake or real. Testing on the Twitter dataset demonstrates that the proposed WSP-DT classifier outperforms the competition in simulations.
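The TF-IDF weighting step in the pipeline above can be sketched in pure Python; the three toy documents are illustrative, not samples from the Twitter dataset:

```python
import math
from collections import Counter

def tf_idf(docs):
    """TF-IDF feature weights: term frequency within a document times the
    log inverse document frequency across the corpus."""
    n = len(docs)
    df = Counter(w for d in docs for w in set(d.split()))
    table = []
    for d in docs:
        tf = Counter(d.split())
        total = sum(tf.values())
        table.append({w: (c / total) * math.log(n / df[w]) for w, c in tf.items()})
    return table

docs = ["city safe", "city unsafe", "city calm"]
scores = tf_idf(docs)
```

A word occurring in every document ("city") gets weight zero, while discriminative words keep positive weight; these weight vectors are what a decision tree like WSP-DT would be trained on.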
  • Abstractive Text Summarization Using Multimodal Information

    Rafi S., Das R.

    Conference paper, 2023 10th International Conference on Soft Computing and Machine Intelligence, ISCMI 2023, 2023, DOI Link

    View abstract ⏷

    A vast amount of text is generated on the internet through news articles, story writing and blogs. Reading and understanding such an enormous amount of data is problematic for users, costing time and effort. Automatic abstractive text summarization has gained importance as a way to increase the user's understanding and reduce time. It shortens the given input by preserving the meaning and identifying the context of the whole document to generate meaningful sentences. The research community has proposed different methods for text reduction and generating abstractive summaries. However, problems like semantics and contextual relationships in the summary generation process still need improvement. Multimodal abstractive text summarization is a technique that combines text and image information, which helps address semantics and contextual relationships. We propose a Multimodality Image Text (MIT) layer that fuses global text features extracted by GloVe embedding, preserving the semantics of the vocabulary, with contextual relationship features that text-related images provide through Inception V3; these combine in the MIT layer to generate efficient multimodal abstractive text summaries by training and testing with a seq-to-seq model. Experiments with the MSMO dataset achieve superior performance over other state-of-the-art results.
  • Sentiment-Based Abstractive Text Summarization Using Attention Oriented LSTM Model

    Debnath D., Das R., Rafi S.

    Conference paper, Smart Innovation, Systems and Technologies, 2022, DOI Link

    View abstract ⏷

    Product reviews are essential as they help customers make purchase decisions. Often these product reviews are too abundant, lengthy and descriptive. However, an abstractive text summarization (ATS) system can help build an internal semantic representation of the text, using natural language processing to create sentiment-based summaries of entire reviews. An ATS system is proposed in this work, cast as a many-to-many sequence problem where attention-based long short-term memory (LSTM) is incorporated to generate summaries of product reviews. The proposed ATS system works in two stages. In the first stage, a data processing module is designed to create a structured representation of the text, where noise and other irrelevant data are removed. In the second stage, the attention-based LSTM model is designed to train, validate and test the system. It is found that the proposed approach deals better with rare words and can generate readable, short and informative summaries. An experimental analysis has been carried out on the Amazon product review dataset, and the ROUGE measure is used to validate the results. The obtained results reflect the efficacy of the proposed method over other state-of-the-art methods.
  • Detection of Active Attacks Using Ensemble Machine Learning Approach

    Revathy S., Sathya Priya S., Rafi S., Pranideep P., Rajesh D.

    Conference paper, Lecture Notes in Networks and Systems, 2022, DOI Link

    View abstract ⏷

    Cyber attackers entrap online users and companies by stealing their sensitive information without their knowledge. Sensitive information such as login credentials, bank card details, and centralized servers of organizations are hacked by attackers. Some cyber attacks are phishing attacks, where attackers make online users trust their websites as legitimate and retrieve their personal information, making them prey for cyber attacks. A malware attack is one where the attacker injects malicious software into a company server or an online user's device without their knowledge and steals all the data on their devices and servers. Intrusion is an invasion where the attacker attacks the network and steals network resources. There are many types of cyber attack solutions, such as visual similarity-based approaches, intrusion detection systems, and signature-based, heuristic-based, specification-based, and anomaly-based methods, but they have some disadvantages. Because of unsecured HTTP websites and a lack of cyber knowledge, cyber attacks are increasing day by day. In our proposed system, a unified ensemble approach (UEA) is proposed by combining different machine learning algorithms in an ensemble that gives better accuracy and detection rate. This model aims to detect intrusion and phishing attacks and prevent malware, thereby mitigating the cyber attacks encountered by individuals and organizations.
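One simple way to combine base classifiers is majority voting; the abstract does not state the UEA's exact combination rule, so treat the sketch below (and its toy labels) as an illustrative assumption:

```python
from collections import Counter

def majority_vote(predictions):
    """Ensemble sketch: each base classifier votes a label per sample;
    the ensemble label is the most common vote for that sample."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]

# Toy per-sample labels from three hypothetical base classifiers.
clf_a = ["attack", "benign", "benign"]
clf_b = ["attack", "attack", "benign"]
clf_c = ["benign", "benign", "benign"]
labels = majority_vote([clf_a, clf_b, clf_c])
```

In practice each base classifier would be a different learning algorithm (e.g. one tuned for phishing, one for malware, one for intrusion), which is where the ensemble's improved detection rate comes from.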
  • RNN Encoder and Decoder with Teacher Forcing Attention Mechanism for Abstractive Summarization

    Rafi S., Das R.

    Conference paper, Proceedings of the 2021 IEEE 18th India Council International Conference, INDICON 2021, 2021, DOI Link

    View abstract ⏷

    Automatic text summarization is used to address the sustained increase in text data available online; to handle such a volume of data, short summaries are preferred. In this work, abstractive text summarization is reviewed with a Recurrent Neural Network-based Long Short-Term Memory encoder and attention decoder. In recent years, this type of summarization has been carried out with Recurrent Neural Networks, which learn from previous time steps. In this paper, we propose the Teacher Forcing technique to improve the slow convergence and poor performance of Recurrent Neural Networks. Our technique improves the performance of text summarization by using an attention decoder that learns from the ground truth instead of previous time steps, and minimizes the error rate with the Stochastic Gradient Descent optimizer, yielding more accurate results on the WikiHow dataset when measured with the ROUGE metric and performing well compared with state-of-the-art results.
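The difference between teacher forcing and free running can be shown with a toy numpy decoder; the tiny vocabulary, random weights, and tied-embedding scoring below are illustrative assumptions, not the paper's architecture:

```python
import numpy as np

def decode(h, embed, W, start_tok, targets=None, steps=3):
    """Toy decoder loop. With `targets`, the input at step t+1 is the
    ground-truth token (teacher forcing); without, it is the model's own
    previous prediction (free running)."""
    tok, preds = start_tok, []
    for t in range(steps):
        h = np.tanh(W @ h + embed[tok])
        pred = int(np.argmax(embed @ h))  # score the vocabulary via tied embeddings
        preds.append(pred)
        tok = targets[t] if targets is not None else pred
    return preds

rng = np.random.default_rng(0)
V, H = 5, 4
embed, W, h0 = rng.normal(size=(V, H)), rng.normal(size=(H, H)), rng.normal(size=H)
forced = decode(h0, embed, W, 0, targets=[1, 2, 3])  # teacher forcing
free = decode(h0, embed, W, 0)                       # free running
```

Feeding the ground-truth token at each training step is what stabilizes and speeds up convergence; at inference time the model necessarily runs in the free-running mode.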
  • A Linear Sub-Structure with Co-Variance Shift for Image Captioning

    Rafi S., Das R.

    Conference paper, 2021 8th International Conference on Soft Computing and Machine Intelligence, ISCMI 2021, 2021, DOI Link

    View abstract ⏷

    Automatic description of images has attracted many researchers in the field of computer vision to image captioning in artificial intelligence, which connects with Natural Language Processing. Exact generation of captions for images is necessary, but it suffers from the vanishing gradient problem; LSTM can overcome this problem by fusing local and global characteristics of image and text, generating sequenced word predictions for accurate image captioning. We consider the Flickr8k dataset, which consists of text descriptions of images. GloVe embeddings provide word representations that consider the global and local features of images, using Euclidean distance to understand the relationships between words in vector space. The Inception V3 architecture, pretrained on ImageNet, is used to extract image features of different objects in scenes. We propose a Linear Sub-Structure that helps generate a sequenced order of words for captioning by understanding the relationships between words. Image feature extraction considers co-variance shift, which concentrates on the moving parts of the image, to generate an accurate description and maintain a semantic visual-grammar relationship between the predicted caption and the image. The proposed model is evaluated with the BLEU score and achieves state-of-the-art results compared with others, with greater than 81% accuracy.
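The Euclidean word-distance idea can be sketched with toy vectors standing in for GloVe embeddings; the words and coordinates below are made up for illustration:

```python
import numpy as np

def nearest_words(word, vectors, k=2):
    """Rank vocabulary words by Euclidean distance to `word` in the
    embedding space, nearest first."""
    dists = {w: float(np.linalg.norm(vectors[w] - vectors[word]))
             for w in vectors if w != word}
    return sorted(dists, key=dists.get)[:k]

vectors = {
    "dog":   np.array([1.0, 1.0]),
    "puppy": np.array([1.1, 0.9]),
    "cat":   np.array([0.8, 1.2]),
    "car":   np.array([5.0, 5.0]),
}
neighbours = nearest_words("dog", vectors)
```

Distances like these are what let the captioning model prefer word orderings whose neighbouring words are semantically close in the embedding space.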
  • Enhanced biomedical data modeling using unsupervised probabilistic machine learning technique

    Rizwana S., Challa K., Rafi S., Imambi S.S.

    Article, International Journal of Recent Technology and Engineering, 2019,

    View abstract ⏷

    Text mining approaches use feature similarity or distributed keyword searching techniques, but machine learning techniques develop a statistical model to categorize documents by learning from the vast number of medical documents available at PubMed. These are unsupervised techniques. The proposed algorithm enhances traditional document clustering techniques and generates an accurate and reliable model. We experimented with the algorithm on a 1000-document dataset; it showed significant improvement over other traditional algorithms.