Dr Sowkuntla Pandu

Assistant Professor

Department of Computer Science and Engineering

Contact Details

pandu.s@srmap.edu.in

Office Location

CV Raman Block, Level 5, Cabin No: 1

Education

2021
University of Hyderabad
India
2010
M.Tech
JNTU Hyderabad
India
2006
B.Tech
JNTU Hyderabad
India

Experience

  • Feb 2016 – Oct 2021 | Research Scholar | School of Computer and Information Sciences, University of Hyderabad, Hyderabad, Telangana, India.
  • June 2011 – Nov 2015 | Assistant Professor | Department of Computer Science and Engineering, School of Engineering, NNRG group of institutions (affiliated to JNTU Hyderabad), Hyderabad, Telangana, India.
  • June 2006 – July 2007 | IT Associate | Institute for Electronic Governance, Government of Andhra Pradesh, Hyderabad, India.

Research Interests

  • Current research focuses on MapReduce-based parallel/distributed attribute reduction using rough sets and fuzzy-rough sets.
  • Investigating MapReduce-based strategies for attribute reduction that scale simultaneously in both the object space and the attribute space (high dimensionality) of big data sets.
  • Proposing MapReduce-based incremental attribute reduction approaches for streaming data.

Awards & Fellowships

  • December 2014 - National Eligibility Test (NET) - UGC
  • June 2015 - State Eligibility Test (SET) - UGC (AP/TS)
  • Feb 2016-Jan 2021 - Research Fellowship from Visvesvaraya PhD scheme - Ministry of Electronics and Information Technology (MeitY), Govt. of India.

Memberships

  • IEEE Membership (96133891)

Publications

  • Parallel attribute reduction in high-dimensional data: An efficient MapReduce strategy with fuzzy discernibility matrix

    Sowkuntla P., Sai Prasad P.S.V.S.

    Article, Applied Soft Computing, 2025

    The hybrid paradigm of fuzzy-rough set theory, which combines fuzzy and rough sets, has proven effective in attribute reduction for hybrid decision systems encompassing both numerical and categorical attributes. However, current parallel/distributed approaches are limited to handling datasets with either categorical or numerical attributes and often rely on fuzzy dependency measures. There exists little research on parallel/distributed attribute reduction for large-scale hybrid decision systems. The challenge of handling high-dimensional data in hybrid decision systems necessitates efficient distributed computing techniques to ensure scalability and performance. MapReduce, a widely used framework for distributed processing, provides an organized approach to handling large-scale data. Despite its potential, there is a noticeable lack of attribute reduction techniques that leverage MapReduce's capabilities with a fuzzy discernibility matrix, which can significantly improve the efficiency of processing high-dimensional hybrid datasets. This paper introduces a vertically partitioned fuzzy discernibility matrix within the MapReduce computation model to address the high dimensionality of hybrid datasets. The proposed MapReduce strategy for attribute reduction minimizes data movement during the shuffle and sort phase, overcoming limitations present in existing approaches. Furthermore, the method's efficiency is enhanced by integrating a feature known as SAT-region removal, which removes matrix entries that satisfy the maximum satisfiability conditions during the attribute reduction process. Extensive experimental analysis validates the proposed method, demonstrating its superior performance compared to recent parallel/distributed methods in attribute reduction.
  • MapReduce based parallel fuzzy-rough attribute reduction using discernibility matrix

    Sowkuntla P., Prasad P.S.V.S.S.

    Article, Applied Intelligence, 2022

    Fuzzy-rough set theory is an efficient method for attribute reduction that can effectively handle imprecision and uncertainty in the data. Despite their efficacy, current approaches to fuzzy-rough attribute reduction are not efficient for processing large data sets because of their high space complexity. A limited number of accelerators and parallel/distributed approaches have been proposed for fuzzy-rough attribute reduction in large data sets. However, all of these approaches are dependency measure based methods in which fuzzy similarity matrices are used for performing attribute reduction. Alternative discernibility matrix based attribute reduction methods are found to have lower space requirements and to be more amenable to parallelization when building parallel/distributed algorithms. This paper therefore introduces a fuzzy discernibility matrix-based attribute reduction accelerator (DARA) to accelerate attribute reduction. DARA is used to build a sequential approach and the corresponding parallel/distributed approach for attribute reduction in large data sets. The proposed approaches are compared to the existing state-of-the-art approaches with a systematic experimental analysis to assess computational efficiency. The experimental study, along with theoretical validation, shows that the proposed approaches are effective and perform better than the current approaches.
  • MapReduce based parallel attribute reduction in Incomplete Decision Systems

    Sowkuntla P., Dunna S., Sai Prasad P.S.V.S.

    Article, Knowledge-Based Systems, 2021

    The scale of the data collected today from real-world applications is massive. This data can also include missing (incomplete) values, giving rise to large-scale incomplete decision systems (IDS). Parallel attribute reduction in big data is an essential preprocessing step for scalable machine learning model construction. Rough set theory has been used as a powerful tool for attribute reduction in complete decision systems (CDS). Furthermore, extensions to classical rough set theory have been proposed to deal with IDS. Much research has been done on efficient attribute reduction in IDS using these extensions, but no parallel/distributed approaches have been proposed for attribute reduction in large-scale IDS. Processing large-scale IDS is difficult because of two challenges: scale and incompleteness. To address these challenges, we propose MapReduce based parallel/distributed approaches for attribute reduction in massive IDS. The proposed approaches resolve the challenge of incompleteness with the existing Novel Granular Framework (NGF). Each proposed approach follows a different data partitioning strategy to handle data sets that are large-scale in terms of the number of objects and attributes. One of the proposed approaches adopts an alternative representation of the NGF and uses a horizontal partitioning (division in object space) of the data across the nodes of the cluster. Another approach embraces the existing NGF and uses a vertical partitioning (division in attribute space) of the data. Extensive experimental analysis was carried out on various data sets with different percentages of incompleteness in the data. The experimental results show that the horizontal partitioning based approach performs well for data sets with a massive object space, while the vertical partitioning based approach is relevant and scales well for extremely high dimensional data sets.
  • MapReduce based improved quick reduct algorithm with granular refinement using vertical partitioning scheme

    Sowkuntla P., Sai Prasad P.S.V.S.

    Article, Knowledge-Based Systems, 2020

    In the last few decades, rough sets have evolved to become an essential technology for feature subset selection by way of reduct computation in categorical decision systems. In recent years, with the proliferation of MapReduce for distributed/parallel algorithms, several scalable reduct computation algorithms have been developed in this field for large-scale decision systems using MapReduce. The existing MapReduce based reduct computation approaches use horizontal partitioning (division in object space) of the dataset across the nodes of the cluster, requiring a complicated shuffle and sort phase. In this work, we propose an algorithm, MR_IQRA_VP, which is designed using vertical partitioning (division in attribute space) of the dataset with a simplified shuffle and sort phase of the MapReduce framework. MR_IQRA_VP is a distributed/parallel implementation of the Improved Quick Reduct Algorithm (IQRA_IG) and is implemented using the iterative MapReduce framework of Apache Spark. We have conducted an extensive experimental comparison on benchmark decision systems against existing horizontal partitioning based reduct computation algorithms. Through experimental analysis, along with theoretical validation, we have established that MR_IQRA_VP is suitable for and scales to datasets with a large attribute space and a moderate object space, which are prevalent in the areas of Bioinformatics and Web mining.
  • MR_IMQRA: An Efficient MapReduce Based Approach for Fuzzy Decision Reduct Computation

    Bandagar K., Sowkuntla P., Moiz S.A., Sai Prasad P.S.V.S.

    Conference paper, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2019

    Fuzzy-rough set theory, an extension to classical rough set theory, is effectively used for attribute reduction in hybrid decision systems. However, its applicability is restricted to smaller datasets because of higher space and time complexities. In this work, an algorithm MR_IMQRA is developed as a MapReduce based distributed/parallel version of the standalone fuzzy-rough attribute reduction algorithm IMQRA. This algorithm uses a vertical partitioning technique to distribute the input data in the cluster environment of the MapReduce framework. Owing to the vertical partitioning, the proposed algorithm is scalable in attribute space and is relevant for scalable attribute reduction in the areas of Bioinformatics and document classification. This technique reduces the complexity of data movement in the shuffle and sort phase of the MapReduce framework. A comparative performance analysis is conducted on hybrid decision systems with a large attribute space (high dimensional). The comparative experimental results demonstrated that the proposed MR_IMQRA algorithm obtained good sizeup/speedup measures and induced classifiers achieving better classification accuracy.
  • Fuzzy Rough Discernibility Matrix Based Feature Subset Selection with MapReduce

    Pavani N.L., Sowkuntla P., Rani K.S., Prasad P.S.V.S.S.

    Conference paper, IEEE Region 10 Annual International Conference, Proceedings/TENCON, 2019

    Fuzzy-rough set theory (FRST) is a hybridization of fuzzy sets with rough sets with applications to attribute reduction in hybrid decision systems. The existing reduct computation approaches in fuzzy-rough sets are not scalable to large-scale decision systems owing to higher space complexity requirements. The iterative MapReduce framework of Apache Spark facilitates the development of scalable distributed algorithms with fault tolerance. This work introduces the algorithm MR-FRDM-SBE as one of the first attempts towards scalable fuzzy-rough set based attribute reduction. MR-FRDM-SBE combines a novel incremental approach for constructing a distributed fuzzy-rough discernibility matrix with distributed fuzzy-rough attribute reduction over that matrix, driven by a Sequential Backward Elimination control strategy. A comparative experimental study conducted using large-scale benchmark hybrid decision systems demonstrated the relevance of the proposed approach in scalable attribute reduction and better classification model construction.

Patents

  • A system and method for disease detection in agricultural practices

    Dr Sowkuntla Pandu

    Patent Application No: 202541016653, Date Filed: 25/02/2025, Date Published: 07/03/2025, Status: Published

  • A system and a method for eggshell waste management and calcium extraction

    Dr Sanjay Kumar, Dr Sowkuntla Pandu

    Patent Application No: 202441063206, Date Filed: 21/08/2024, Date Published: 30/08/2024, Status: Published

  • System and method for enhancing operational efficiency and facilitating dynamic collaboration in professional practices

    Mr Gavaskar S, Dr Sowkuntla Pandu

    Patent Application No: 202541002139, Date Filed: 09/01/2025, Date Published: 24/01/2025, Status: Published

  • System and method for interpreting cognitive and emotional states of a user

    Dr Sanjay Kumar, Dr Sowkuntla Pandu

    Patent Application No: 202441086466, Date Filed: 09/11/2024, Date Published: 15/11/2024, Status: Published

Scholars

Doctoral Scholars

  • Mr Arla Gopala Krishna

Interests

  • Artificial Intelligence
  • Data Science
  • Machine Learning
  • Vision Computing

Education
2006
B.Tech
JNTU Hyderabad
India
2010
M.Tech
JNTU Hyderabad
India
2021
University of Hyderbad
India
Experience
  • Feb 2016 – Oct 2021 | Research Scholar | School of Computer and Information Sciences, University of Hyderabad, Hyderabad, Telangana, India.
  • June 2011 – Nov 2015 | Assistant Professor | Department of Computer Science and Engineering, School of Engineering, NNRG group of institutions (affiliated to JNTU Hyderabad), Hyderabad, Telangana, India.
  • June 2006 – July 2007 |IT Associate | Institute for Electronic Governance, Government of Andhra Pradesh, Hyderabad, India.
Research Interests
  • The current research is focused in the area of MapReduce based parallel/distributed attribute reduction using Rough Sets and Fuzzy-Rough Sets.
  • Investigating appropriate MapReduce-based strategies for scalable attribute reduction that can simultaneously scale in both huge object space and huge attribute space (high dimensional) of the big data sets.
  • Proposing MapReduce-based incremental attribute reduction approaches for streaming data.
Awards & Fellowships
  • December 2014 - National Eligibility Test (NET) – UGC
  • June 2015 - State Eligibility Test (SET) -UGC (AP/TS)
  • Feb 2016-Jan 2021 - Research Fellowship from Visvesvaraya PhD scheme - Ministry of Electronics and Information Technology (MeitY), Govt. of India.
Memberships
  • IEEE Membership (96133891)
Publications
  • Parallel attribute reduction in high-dimensional data: An efficient MapReduce strategy with fuzzy discernibility matrix

    Sowkuntla P., Sai Prasad P.S.V.S.

    Article, Applied Soft Computing, 2025, DOI Link

    View abstract ⏷

    The hybrid paradigm of fuzzy-rough set theory, which combines fuzzy and rough sets, has proven effective in attribute reduction for hybrid decision systems encompassing both numerical and categorical attributes. However, current parallel/distributed approaches are limited to handling datasets with either categorical or numerical attributes and often rely on fuzzy dependency measures. There exists little research on parallel/distributed attribute reduction for large-scale hybrid decision systems. The challenge of handling high-dimensional data in hybrid decision systems necessitates efficient distributed computing techniques to ensure scalability and performance. MapReduce, a widely used framework for distributed processing, provides an organized approach to handling large-scale data. Despite its potential, there is a noticeable lack of attribute reduction techniques that leverage MapReduce's capabilities with a fuzzy discernibility matrix, which can significantly improve the efficiency of processing high-dimensional hybrid datasets. This paper introduces a vertically partitioned fuzzy discernibility matrix within the MapReduce computation model to address the high dimensionality of hybrid datasets. The proposed MapReduce strategy for attribute reduction minimizes data movement during the shuffle and sort phase, overcoming limitations present in existing approaches. Furthermore, the method's efficiency is enhanced by integrating a feature known as SAT-region removal, which removes matrix entries that satisfy the maximum satisfiability conditions during the attribute reduction process. Extensive experimental analysis validates the proposed method, demonstrating its superior performance compared to recent parallel/distributed methods in attribute reduction.
  • MapReduce based parallel fuzzy-rough attribute reduction using discernibility matrix

    Sowkuntla P., Prasad P.S.V.S.S.

    Article, Applied Intelligence, 2022, DOI Link

    View abstract ⏷

    Fuzzy-rough set theory is an efficient method for attribute reduction. It can effectively handle the imprecision and uncertainty of the data in the attribute reduction. Despite its efficacy, current approaches to fuzzy-rough attribute reduction are not efficient for the processing of large data sets due to the requirement of higher space complexities. A limited number of accelerators and parallel/distributed approaches have been proposed for fuzzy-rough attribute reduction in large data sets. However, all of these approaches are dependency measure based methods in which fuzzy similarity matrices are used for performing attribute reduction. Alternative discernibility matrix based attribute reduction methods are found to have less space requirements and more amicable to parallelization in building parallel/distributed algorithms. This paper therefore introduces a fuzzy discernibility matrix-based attribute reduction accelerator (DARA) to accelerate the attribute reduction. DARA is used to build a sequential approach and the corresponding parallel/distributed approach for attribute reduction in large data sets. The proposed approaches are compared to the existing state-of-the-art approaches with a systematic experimental analysis to assess computational efficiency. The experimental study, along with theoretical validation, shows that the proposed approaches are effective and perform better than the current approaches.
  • MapReduce based parallel attribute reduction in Incomplete Decision Systems

    Sowkuntla P., Dunna S., Sai Prasad P.S.V.S.

    Article, Knowledge-Based Systems, 2021, DOI Link

    View abstract ⏷

    The scale of the data collected today from applications in the real-world is massive. Sometimes this data can also include missing (incomplete) values that give rise to large-scale incomplete decision systems (IDS). Parallel attribute reduction in big data is an essential preprocessing step for scalable machine learning model construction. Rough set theory has been used as a powerful tool for attribute reduction in complete decision systems (CDS). Furthermore extensions to classical rough set theory have been proposed to deal with IDS. A lot of research works have been done on efficient attribute reduction in IDS using these extensions, but no parallel/distributed approaches have been proposed for attribute reduction in large-scale IDS. Since, owing to its two challenges, large-scale and incompleteness, the processing of large-scale IDS is difficult. To address these challenges, we propose MapReduce based parallel/distributed approaches for attribute reduction in massive IDS. The proposed approaches resolve the challenge of incompleteness with the existing Novel Granular Framework (NGF). And each proposed approach follows a different data partitioning strategy to handle the data sets that are large-scale in terms of number of objects and attributes. One of the proposed approaches adopts an alternative representation of the NGF and uses a horizontal partitioning (division in object space) of the data to the nodes of cluster. Another approach embraces the existing NGF and uses a vertical partitioning (division in attribute space) of the data. Extensive experimental analysis carried out on various data sets with different percentages of incompleteness in the data. The experimental results show that the horizontal partitioning based approach performs well for the massive object space data sets. And the vertical partitioning based approach is relevant and scales well for extremely high dimensional data sets.
  • MapReduce based improved quick reduct algorithm with granular refinement using vertical partitioning scheme

    Sowkuntla P., Sai Prasad P.S.V.S.

    Article, Knowledge-Based Systems, 2020, DOI Link

    View abstract ⏷

    In the last few decades, rough sets have evolved to become an essential technology for feature subset selection by way of reduct computation in categorical decision systems. In recent years with the proliferation of MapReduce for distributed/parallel algorithms, several scalable reduct computation algorithms have been developed in this field for large-scale decision systems using MapReduce. The existing MapReduce based reduct computation approaches use horizontal partitioning (division in object space) of the dataset into the nodes of the cluster, requiring a complicated shuffle and sort phase. In this work, we propose an algorithm MR_IQRA_VP which is designed using vertical partitioning (division in attribute space) of the dataset with a simplified shuffle and sort phase of the MapReduce framework. MR_IQRA_VP is a distributed/parallel implementation of the Improved Quick Reduct Algorithm (IQRA_IG) and is implemented using iterative MapReduce framework of Apache Spark. We have done an extensive comparative study through experimentation on benchmark decision systems using existing horizontal partitioning based reduct computation algorithms. Through experimental analysis, along with theoretical validation, we have established that MR_IQRA_VP is suitable and scalable to datasets of larger size attribute space and moderate object space prevalent in the areas of Bioinformatics and Web mining.
  • MR_IMQRA: An Efficient MapReduce Based Approach for Fuzzy Decision Reduct Computation

    Bandagar K., Sowkuntla P., Moiz S.A., Sai Prasad P.S.V.S.

    Conference paper, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2019, DOI Link

    View abstract ⏷

    Fuzzy-rough set theory, an extension to classical rough set theory, is effectively used for attribute reduction in hybrid decision systems. However, it’s applicability is restricted to smaller size datasets because of higher space and time complexities. In this work, an algorithm MR_IMQRA is developed as a MapReduce based distributed/parallel approach for standalone fuzzy-rough attribute reduction algorithm IMQRA. This algorithm uses a vertical partitioning technique to distribute the input data in the cluster environment of the MapReduce framework. Owing to the vertical partitioning, the proposed algorithm is scalable in attribute space and is relevant for scalable attribute reduction in the areas of Bioinformatics and document classification. This technique reduces the complexity of movement of data in shuffle and sort phase of MapReduce framework. A comparative and performance analysis is conducted on larger attribute space (high dimensional) hybrid decision systems. The comparative experimental results demonstrated that the proposed MR_IMQRA algorithm obtained good sizeup/speedup measures and induced classifiers achieving better classification accuracy.
  • Fuzzy Rough Discernibility Matrix Based Feature Subset Selection with MapReduce

    Pavani N.L., Sowkuntla P., Rani K.S., Prasad P.S.V.S.S.

    Conference paper, IEEE Region 10 Annual International Conference, Proceedings/TENCON, 2019, DOI Link

    View abstract ⏷

    Fuzzy-rough set theory (FRST) is a hybridization of fuzzy sets with rough sets with applications to attribute reduction in hybrid decision systems. The existing reduct computation approaches in fuzzy-rough sets are not scalable to large scale decision systems owing to higher space complexity requirements. Iterative MapReduce framework of Apache Spark facilitates the development of scalable distributed algorithms with fault tolerance. This work introduces algorithm MR-FRDM-SBE as one of the first attempts towards scalable fuzzy-rough set based attribute reduction. MR-FRDM-SBE algorithm is a combination of a novel incremental approach for the construction of distributed fuzzy-rough discernibility matrix and Sequential Backward Elimination control strategy based distributed fuzzy-rough attribute reduction using a discernibility matrix. A comparative experimental study conducted using large scale benchmark hybrid decision systems demonstrated the relevance of the proposed approach in scalable attribute reduction and better classification model construction.
Contact Details

pandu.s@srmap.edu.in

Scholars

Doctoral Scholars

  • Mr Arla Gopala Krishna

Interests

  • Artificial Intelligence
  • Data Science
  • Machine Learning
  • Vision Computing

Education
2006
B.Tech
JNTU Hyderabad
India
2010
M.Tech
JNTU Hyderabad
India
2021
University of Hyderbad
India
Experience
  • Feb 2016 – Oct 2021 | Research Scholar | School of Computer and Information Sciences, University of Hyderabad, Hyderabad, Telangana, India.
  • June 2011 – Nov 2015 | Assistant Professor | Department of Computer Science and Engineering, School of Engineering, NNRG group of institutions (affiliated to JNTU Hyderabad), Hyderabad, Telangana, India.
  • June 2006 – July 2007 |IT Associate | Institute for Electronic Governance, Government of Andhra Pradesh, Hyderabad, India.
Research Interests
  • The current research is focused in the area of MapReduce based parallel/distributed attribute reduction using Rough Sets and Fuzzy-Rough Sets.
  • Investigating appropriate MapReduce-based strategies for scalable attribute reduction that can simultaneously scale in both huge object space and huge attribute space (high dimensional) of the big data sets.
  • Proposing MapReduce-based incremental attribute reduction approaches for streaming data.
Awards & Fellowships
  • December 2014 - National Eligibility Test (NET) – UGC
  • June 2015 - State Eligibility Test (SET) -UGC (AP/TS)
  • Feb 2016-Jan 2021 - Research Fellowship from Visvesvaraya PhD scheme - Ministry of Electronics and Information Technology (MeitY), Govt. of India.
Memberships
  • IEEE Membership (96133891)
Publications
  • Parallel attribute reduction in high-dimensional data: An efficient MapReduce strategy with fuzzy discernibility matrix

    Sowkuntla P., Sai Prasad P.S.V.S.

    Article, Applied Soft Computing, 2025, DOI Link

    View abstract ⏷

    The hybrid paradigm of fuzzy-rough set theory, which combines fuzzy and rough sets, has proven effective in attribute reduction for hybrid decision systems encompassing both numerical and categorical attributes. However, current parallel/distributed approaches are limited to handling datasets with either categorical or numerical attributes and often rely on fuzzy dependency measures. There exists little research on parallel/distributed attribute reduction for large-scale hybrid decision systems. The challenge of handling high-dimensional data in hybrid decision systems necessitates efficient distributed computing techniques to ensure scalability and performance. MapReduce, a widely used framework for distributed processing, provides an organized approach to handling large-scale data. Despite its potential, there is a noticeable lack of attribute reduction techniques that leverage MapReduce's capabilities with a fuzzy discernibility matrix, which can significantly improve the efficiency of processing high-dimensional hybrid datasets. This paper introduces a vertically partitioned fuzzy discernibility matrix within the MapReduce computation model to address the high dimensionality of hybrid datasets. The proposed MapReduce strategy for attribute reduction minimizes data movement during the shuffle and sort phase, overcoming limitations present in existing approaches. Furthermore, the method's efficiency is enhanced by integrating a feature known as SAT-region removal, which removes matrix entries that satisfy the maximum satisfiability conditions during the attribute reduction process. Extensive experimental analysis validates the proposed method, demonstrating its superior performance compared to recent parallel/distributed methods in attribute reduction.
  • MapReduce based parallel fuzzy-rough attribute reduction using discernibility matrix

    Sowkuntla P., Prasad P.S.V.S.S.

    Article, Applied Intelligence, 2022, DOI Link

    View abstract ⏷

    Fuzzy-rough set theory is an efficient method for attribute reduction. It can effectively handle the imprecision and uncertainty of the data in the attribute reduction. Despite its efficacy, current approaches to fuzzy-rough attribute reduction are not efficient for the processing of large data sets due to the requirement of higher space complexities. A limited number of accelerators and parallel/distributed approaches have been proposed for fuzzy-rough attribute reduction in large data sets. However, all of these approaches are dependency measure based methods in which fuzzy similarity matrices are used for performing attribute reduction. Alternative discernibility matrix based attribute reduction methods are found to have less space requirements and more amicable to parallelization in building parallel/distributed algorithms. This paper therefore introduces a fuzzy discernibility matrix-based attribute reduction accelerator (DARA) to accelerate the attribute reduction. DARA is used to build a sequential approach and the corresponding parallel/distributed approach for attribute reduction in large data sets. The proposed approaches are compared to the existing state-of-the-art approaches with a systematic experimental analysis to assess computational efficiency. The experimental study, along with theoretical validation, shows that the proposed approaches are effective and perform better than the current approaches.
  • MapReduce based parallel attribute reduction in Incomplete Decision Systems

    Sowkuntla P., Dunna S., Sai Prasad P.S.V.S.

    Article, Knowledge-Based Systems, 2021, DOI Link

    View abstract ⏷

    The scale of the data collected today from applications in the real-world is massive. Sometimes this data can also include missing (incomplete) values that give rise to large-scale incomplete decision systems (IDS). Parallel attribute reduction in big data is an essential preprocessing step for scalable machine learning model construction. Rough set theory has been used as a powerful tool for attribute reduction in complete decision systems (CDS). Furthermore extensions to classical rough set theory have been proposed to deal with IDS. A lot of research works have been done on efficient attribute reduction in IDS using these extensions, but no parallel/distributed approaches have been proposed for attribute reduction in large-scale IDS. Since, owing to its two challenges, large-scale and incompleteness, the processing of large-scale IDS is difficult. To address these challenges, we propose MapReduce based parallel/distributed approaches for attribute reduction in massive IDS. The proposed approaches resolve the challenge of incompleteness with the existing Novel Granular Framework (NGF). And each proposed approach follows a different data partitioning strategy to handle the data sets that are large-scale in terms of number of objects and attributes. One of the proposed approaches adopts an alternative representation of the NGF and uses a horizontal partitioning (division in object space) of the data to the nodes of cluster. Another approach embraces the existing NGF and uses a vertical partitioning (division in attribute space) of the data. Extensive experimental analysis carried out on various data sets with different percentages of incompleteness in the data. The experimental results show that the horizontal partitioning based approach performs well for the massive object space data sets. And the vertical partitioning based approach is relevant and scales well for extremely high dimensional data sets.
  • MapReduce based improved quick reduct algorithm with granular refinement using vertical partitioning scheme

    Sowkuntla P., Sai Prasad P.S.V.S.

    Article, Knowledge-Based Systems, 2020, DOI Link

    In the last few decades, rough sets have evolved to become an essential technology for feature subset selection by way of reduct computation in categorical decision systems. In recent years, with the proliferation of MapReduce for distributed/parallel algorithms, several scalable reduct computation algorithms have been developed in this field for large-scale decision systems using MapReduce. The existing MapReduce based reduct computation approaches use horizontal partitioning (division in object space) of the dataset across the nodes of the cluster, requiring a complicated shuffle and sort phase. In this work, we propose an algorithm, MR_IQRA_VP, which is designed using vertical partitioning (division in attribute space) of the dataset with a simplified shuffle and sort phase of the MapReduce framework. MR_IQRA_VP is a distributed/parallel implementation of the Improved Quick Reduct Algorithm (IQRA_IG) and is implemented using the iterative MapReduce framework of Apache Spark. We have done an extensive comparative study through experimentation on benchmark decision systems against existing horizontal partitioning based reduct computation algorithms. Through experimental analysis, along with theoretical validation, we have established that MR_IQRA_VP is suitable for and scalable to datasets with a large attribute space and a moderate object space, which are prevalent in the areas of Bioinformatics and Web mining.
  • MR_IMQRA: An Efficient MapReduce Based Approach for Fuzzy Decision Reduct Computation

    Bandagar K., Sowkuntla P., Moiz S.A., Sai Prasad P.S.V.S.

    Conference paper, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2019, DOI Link

    Fuzzy-rough set theory, an extension to classical rough set theory, is effectively used for attribute reduction in hybrid decision systems. However, its applicability is restricted to smaller datasets because of its higher space and time complexities. In this work, an algorithm, MR_IMQRA, is developed as a MapReduce based distributed/parallel approach for the standalone fuzzy-rough attribute reduction algorithm IMQRA. This algorithm uses a vertical partitioning technique to distribute the input data in the cluster environment of the MapReduce framework. Owing to the vertical partitioning, the proposed algorithm is scalable in attribute space and is relevant for scalable attribute reduction in the areas of Bioinformatics and document classification. This technique reduces the complexity of data movement in the shuffle and sort phase of the MapReduce framework. A comparative and performance analysis is conducted on hybrid decision systems with a larger attribute space (high dimensional). The comparative experimental results demonstrated that the proposed MR_IMQRA algorithm obtained good sizeup/speedup measures and that the induced classifiers achieved better classification accuracy.
  • Fuzzy Rough Discernibility Matrix Based Feature Subset Selection with MapReduce

    Pavani N.L., Sowkuntla P., Rani K.S., Prasad P.S.V.S.S.

    Conference paper, IEEE Region 10 Annual International Conference, Proceedings/TENCON, 2019, DOI Link

    Fuzzy-rough set theory (FRST) is a hybridization of fuzzy sets with rough sets, with applications to attribute reduction in hybrid decision systems. The existing reduct computation approaches in fuzzy-rough sets are not scalable to large-scale decision systems owing to their higher space complexity requirements. The iterative MapReduce framework of Apache Spark facilitates the development of scalable distributed algorithms with fault tolerance. This work introduces the algorithm MR-FRDM-SBE as one of the first attempts towards scalable fuzzy-rough set based attribute reduction. The MR-FRDM-SBE algorithm combines a novel incremental approach for constructing a distributed fuzzy-rough discernibility matrix with distributed fuzzy-rough attribute reduction on that matrix, based on a Sequential Backward Elimination control strategy. A comparative experimental study conducted using large-scale benchmark hybrid decision systems demonstrated the relevance of the proposed approach in scalable attribute reduction and better classification model construction.

Scholars

Doctoral Scholars

  • Mr Arla Gopala Krishna