To Reduce Gross NPA and Classify Defaulters Using Shannon Entropy

Mansimransinghanand
Mar 19, 2022

COVID-19 hit India in March 2020, when I was in the middle of my internship. The nationwide lockdown forced us to change our daily routines. Working from home meant that the time once spent travelling to and from the office could be put to better use, so I teamed up with some of my colleagues and batchmates to do research on topics related to banking and healthcare.

In this article I discuss “To Reduce Gross NPA and Classify Defaulters Using Shannon Entropy”, one of the research papers we were able to publish in an internationally recognized Springer journal.

Non-Performing Assets (NPAs) have drawn serious attention from banks over the past few years. NPAs cause huge losses to banks, so identifying which loans are likely to become NPAs, and thereby deciding which loans to grant and which to reject, is a critical step.

Amounts that tycoon Vijay Mallya owed to various banks, all of which have become NPAs

Factors leading to the growth of NPAs range from mismanagement at the staff level and inappropriate lending rules to diversion of funds, a growing number of defaults, and the economic conditions of the country. Despite the measures taken, the problems banks face with NPAs are on the rise. We believe that the growing percentage of NPAs can be brought down if we can accurately predict the factors that contribute to them.

In our paper, we build a classifier based on Shannon entropy. The classifier assigns each data point to one of two categories, ‘accepted’ or ‘rejected’. Both local entropy and global entropy are used to determine the output. The entropy classifier is then compared with existing classifiers used to predict NPAs, which gives an idea of its relative performance.

We followed these steps to decide whether a particular loan could be classified as an NPA:

  1. We used a data set containing a total of 15,32,428 cases, of which 4,06,601 were defaulter cases and 1,25,827 were non-defaulter cases. The data set had 45 columns in total. We pre-processed it carefully.
  2. After pre-processing the data, we moved on to calculating the entropy of a single column.
  3. The entropy used in our paper is Shannon entropy, formulated by Claude E. Shannon. Shannon entropy measures the uncertainty associated with a variable, which corresponds to the average number of bits needed to encode a string of symbols given their frequencies. Since our data set was imbalanced towards one class, this helped us eliminate the bias and balance out the occurrence probability of each data point.
  4. From the local entropy calculation (µ) we moved on to the global entropy calculation (Ω), and finally to the difference between a reference metric of these entropies (Difference of Entropy Metrics, or DEM), which decides whether a data point belongs to a particular class.
  5. This approach was then compared with other approaches and found to be more effective.
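
To make step 2 concrete, here is a minimal Python sketch of the Shannon entropy of a single column, H(X) = -Σ p(x) log2 p(x), estimated from value frequencies. It is only an illustration of the standard formula, not the code from our paper: the function name `shannon_entropy` and the toy `employment_type` column are hypothetical, and the local entropy (µ), global entropy (Ω) and DEM calculations are not shown.

```python
import math
from collections import Counter


def shannon_entropy(values):
    """Shannon entropy, in bits, of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # an empty column carries no information
    # Sum p * log2(1 / p) over the distinct values; this equals
    # -sum(p * log2(p)) and keeps every term non-negative.
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical categorical column, used only for illustration.
employment_type = ["salaried", "salaried", "salaried", "self-employed"]
print(shannon_entropy(employment_type))  # roughly 0.81 bits
```

A column with a single repeated value has an entropy of 0 bits, while a column split evenly between two values has an entropy of exactly 1 bit, so the more skewed a column is, the lower its entropy.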

Hence we could conclude that Shannon entropy can be transformed from a mere tool for analyzing data representations and distributions into a classification tool. Classifying each application as either a sanction to be granted or a potential defaulter will help reduce the gross debt caused by non-performing assets.
