UTILIZING DATA SCIENCE TO INCREASE THE NUMBER OF MSME DEBTORS AT PT BANK CENTRAL ASIA TBK (CASE STUDY OF PT BANK CENTRAL ASIA TBK KCU TEBING TINGGI)

This study aims to increase the number of MSME debtors at the BCA Tebing Tinggi Branch. Since the enactment of Bank Indonesia Regulation Number 23/13/PBI/2021 concerning the Macroprudential Inclusive Financing Ratio (RPIM) for Conventional Commercial Banks, Sharia Commercial Banks, and Sharia Business Units, commercial banks have begun to adjust the percentage of funds used to finance MSMEs and Low Income Individuals (PBR). The BCA Tebing Tinggi Branch is committed to meeting the required increase in the RPIM percentage. One way to explore potential financing is to utilize data science. Data science studies data, especially quantitative data, with the aim of finding hidden patterns in it. The researchers study profile information and account transaction patterns to find MSME customers who are suitable candidates for financing. The data are processed using a machine learning method, the random forest algorithm.


INTRODUCTION
Since the enactment of Bank Indonesia Regulation Number 23/13/PBI/2021 concerning the Macroprudential Inclusive Financing Ratio (RPIM) for Conventional Commercial Banks, Sharia Commercial Banks, and Sharia Business Units, commercial banks have begun to adjust the percentage of funds used to finance Micro, Small and Medium Enterprises (MSMEs) and Low Income Individuals (PBR). This regulation (PBI) was issued as one of Bank Indonesia's efforts to increase economic inclusion, open access to finance, and strengthen the role of MSMEs in the national economic recovery. BCA, as the largest private commercial bank in Indonesia, is committed to complying with this regulation. As of June 2022, BCA had disbursed IDR 94.2 trillion in loans to MSMEs, an increase of 19.6% year on year (yoy). However, this portion was only 13.94% of the company's total loans, which is still far from the regulator's target. BCA's 2024 target for lending to MSMEs is 30%, in accordance with regulatory provisions. In line with this, BCA branches throughout Indonesia are expected to seek potential MSME financing more quickly and actively.
Micro, Small and Medium Enterprises (MSMEs) thrive in a number of regions. According to data reported by the Ministry of Cooperatives and Small and Medium Enterprises (Kemenkop UKM), the total number of MSMEs in Indonesia exceeded 8.71 million business units in 2022. Java dominates this sector. West Java ranked first with 1.49 million business units, followed closely by Central Java with 1.45 million units and East Java with 1.15 million units. Outside the top three, the gap is considerable: DKI Jakarta, in fourth position, recorded almost 660 thousand units, and North Sumatra, in fifth, had 596 thousand units. The regions with the fewest businesses are West Papua with 4.6 thousand business units, North Maluku with 4.1 thousand units, and Papua with 3.9 thousand units. The following is the data and distribution of MSMEs in Indonesia in 2022.
As of April 2023, BCA had 49,185 MSME debtors, of which 132 were managed by the BCA Tebing Tinggi branch. Compared with the total number of MSMEs in North Sumatra, there is still considerable potential to grow the number of debtors. One way to find financing potential is to take advantage of the data available at BCA. As of December 2022, KCU Tebing Tinggi had 40,562 customers. From this data, 1,140 K1 customers were selected as potential financing prospects. K1 customers are those included in the list of Group 1 (K1) customers determined by the Network Management & Regional Planning Work Unit and the Individual Customer Business Development Division which, based on data analysis, bring the biggest business and profit to BCA. Customer data is retrieved with the help of RPA (Robotic Process Automation) and then processed so that it becomes useful. Processing such a large amount of data requires data science, because conventional tabulation and analysis would take a long time and be less efficient.
Data science is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge and insight from structured and unstructured data (Dhar, 2013) and to apply that knowledge and actionable insight across various application domains. Data science is related to data mining, machine learning, and big data. In simple terms, data science happens when we work with data to find answers to questions. The emphasis is on the data itself rather than on the science or knowledge. When we have data, we become curious about what useful content it holds; we answer that curiosity by studying and exploring the data and analyzing it with suitable science and technology.
The ultimate goal of data science is to find insight in data. Data science can be seen as a process of distilling, extracting, or exploring insights from data, which can be very large. This insight can be likened to gold or diamonds: although few or small, they are still valuable. Insight can take the form of important information or models built from data that are very useful in making decisions. The search for insight needs to start from strong curiosity, whether of an individual or of an organization (in the form of a need, because there is a problem to be solved by utilizing data). Armed with this, a data scientist can then carry out various activities using appropriate knowledge and technology to obtain the desired insight (Sembiring & Hasibuan, 2021). The researchers, through the BCA Tebing Tinggi Branch, explore this potential by utilizing data science to accelerate financing acquisition. Data science is a collection of techniques for processing data so that it becomes valuable (Vijay Kotu, 2019). The technique used here is to look for patterns in customer account transactions that indicate potential MSME financing.

LITERATURE REVIEW
Bank Credit
Article 1 of Banking Law Number 10 of 1998 explains that credit is the provision of money or equivalent claims, based on a loan agreement between a bank and another party, that requires the borrower to repay the debt after a certain period of time with interest. According to Hermansyah (2008), approval of a credit application refers to the 5C formula, namely Character, Capacity, Capital, Collateral, and Condition of Economy. In the banking world, credit means that creditors, including financial institutions, provide loans to debtors on the basis of trust (Tjoekam, 1999). In relation to a business activity, credit means an activity that provides economic value to the debtor on the basis of trust, with the same economic value to be returned to the creditor within a period agreed by both parties.

MSME Financing
Growth in MSMEs' access to credit can increase economic growth. Apart from benefiting the state by helping to stabilize the economy, credit also benefits MSMEs as a secure and sustainable source of funding. In Kakuru's (2008) study, almost all commercial banks include MSMEs in their credit schemes to develop access to formal credit. Lusimbo & Muturi (2012) define access to formal credit as the absence of obstacles, related to administrative or procedural costs at formal credit providers, that MSMEs experience when applying for credit. MSMEs are inseparable from financing problems. According to Agus (2016), the problems faced by MSMEs concern not only financial management and human resources but also financing. To make financing easier for MSMEs, the government is expanding MSME financing facilities in Indonesia through the People's Business Credit (KUR) program. KUR aims to facilitate capital for MSME actors, but this has not been fully realized. As of May 2016, KUR disbursement was only IDR 39.2 trillion of the total target of IDR 120 trillion for that year (Noordiansyah, 2016). As a result, not all MSMEs have benefited from KUR distribution.

Definition and concept of data science
Data science is a concept to unify statistics, data analysis, informatics, and related methods in order to understand and analyze actual phenomena with data (Hayashi, 1998).

Journal of Accounting Research, Utility Finance and Digital Assets
Data science is different from computer science and information science. Turing Award winner Jim Gray envisioned data science as the "fourth paradigm" of science (empirical, theoretical, computational, and now data-driven) and asserted that everything about science is changing because of the impact of information technology and the data deluge (Tony Hey et al., 2009).
Figure 2.1 Interdisciplinary Discipline of Data Science (Source: Palmer, Shelly)
The availability of very large data sets, with the properties inherent in big data, affects how data analysis is conducted. Data analysis includes the process of collecting, processing, and evaluating data to understand and draw conclusions from the information found in the data. The goal is to identify patterns, trends, and relationships between variables in the data in order to solve problems or make informed decisions. Data analysis can involve various techniques such as statistics, mathematics, and computer engineering (Kudyba, S., 2014).

Data Mining
According to Davies (2004), data mining is, put simply, the mining of new information by looking for certain patterns or rules in a very large amount of data. Data mining is often also referred to as knowledge discovery in databases (KDD), an activity that includes collecting and using historical data to find regularities, patterns, or relationships in large data sets (Santosa, 2007). Meanwhile, according to Turban (2005), data mining is a process that uses statistical techniques, mathematics, artificial intelligence, and machine learning to extract and identify useful information and related knowledge from various large databases.

Machine Learning
Data mining is a more effective statistical learning approach that uses powerful features for data analysis. However, there are some problems with the complexity of data pre-processing and the reliance on expert knowledge. Recent advances in machine learning technology help provide a new, effective paradigm for deriving end-to-end machine learning models from complex data (Mastoli et al., 2019). Machine learning classification methods have long been used in data mining and many other fields of computer science. As classification algorithms that draw on methods from statistics, artificial intelligence, and database management, they can also be regarded as a key element in data interpretation and data visualization (Berry & Linoff, 2004). Machine learning requires data to learn from, so it is also described as learning from data (Alpaydin, 2010).

RESEARCH METHODS
Types of Research
This research combines descriptive and quantitative approaches. The first method is descriptive analysis, which aims to provide an overview of the data to be studied. The second method is machine learning with the random forest algorithm, which is used to classify credit usage in the existing debtor data of BCA KCU Tebing Tinggi. The trained model is then used to predict the likely credit usage of K1 customers who are not yet debtors.

Research Variables
Five feature variables are used for training in this study: average balance, credit mutation (credit transfers), the credit and balance mutation ratio, age, and line of business. There is one target variable, the usage classification, which describes the average loan usage. The following is a description of the variables above.

Data Collection Techniques
In this study, the data used are secondary data obtained from BCA KCU Tebing Tinggi. The data are retrieved with the help of Robotic Process Automation (RPA) from branch computers through a CRM application called Specta.

Tools and Ways of Organizing Data
The data processing tools used are Microsoft Excel and Orange Data Mining. At this stage, the data obtained are assessed on the basis of theory, namely descriptive data analysis and the random forest method, by implementing the following steps:
1. Perform a descriptive analysis of the variables used with a scatter plot.
2. Determine the training data that will be used for the random forest method.
3. Choose a value of n, which represents the number of trees.
4. Make a plot of the training process in order to get more optimal results.
5. Choose the sampling technique that will be used in the random forest method.
6. Classify the data using the random forest method with the predetermined training data.
7. Conduct accuracy testing using test data to see how accurate the random forest method is for the research data.
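Steps 2, 3, 5, 6 and 7 above can be sketched in plain Python. The sketch below is only an illustration under stated assumptions: it uses bagged decision stumps (depth-one trees), a deliberately simplified stand-in for the Random Forest model in Orange, and every name in it (`fit_stump`, `fit_forest`, `predict`) as well as the toy records is hypothetical rather than taken from the study. The toy data merely mimic the reported tendency that low usage goes with a large average balance and small credit mutations.

```python
import math
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def majority(labels):
    """Most frequent label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels, features):
    """Best single-threshold split (lowest weighted Gini) over the given features."""
    best = None
    for f in features:
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between neighbouring values
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t, majority(left), majority(right))
    if best is None:  # all sampled rows share the same values: predict one class
        return (features[0], math.inf, majority(labels), majority(labels))
    return best[1:]


def fit_forest(rows, labels, n_trees, rng):
    """Steps 3, 5 and 6: n_trees stumps, each on a bootstrap sample and a random feature subset."""
    n, n_features = len(rows), len(rows[0])
    k = max(1, int(math.sqrt(n_features)))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sampling with replacement
        features = rng.sample(range(n_features), k)
        forest.append(fit_stump([rows[i] for i in idx], [labels[i] for i in idx], features))
    return forest


def predict(forest, row):
    """Majority vote over all stumps."""
    votes = [left if row[f] <= t else right for f, t, left, right in forest]
    return majority(votes)


# Hypothetical toy records: (average balance, credit mutation), not BCA data.
rows = [(80 + i, 5 + i) for i in range(10)] + [(5 + i, 80 + i) for i in range(10)]
labels = ["Low"] * 10 + ["High"] * 10

rng = random.Random(0)
order = list(range(len(rows)))
rng.shuffle(order)
train, test = order[:14], order[14:]  # step 2: hold out 6 records for step 7

forest = fit_forest([rows[i] for i in train], [labels[i] for i in train], n_trees=10, rng=rng)
hits = sum(predict(forest, rows[i]) == labels[i] for i in test)
accuracy = hits / len(test)  # step 7: accuracy on the held-out records
print(f"held-out accuracy: {accuracy:.2f}")
```

A full random forest grows deeper trees and re-draws the feature subset at every split; the stumps here keep the example short while still showing bootstrap sampling and majority voting.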

Data Mining
The research began with data collection from the SPECTA CRM BCA application, carried out automatically with the help of Robotic Process Automation (RPA) technology using Microsoft's Power Automate tool. The scatter plot shows a relationship between credit usage and both average balance and credit mutations. Red dots, representing debtors with low usage, predominate in the area of large average balances and small credit mutations. Blue dots, representing debtors with high usage, dominate the area of small average balances and large credit mutations. From this descriptive analysis it can be concluded that debtors' credit usage can be affected by average balance and credit mutations. The scatter plot also shows that some data points lie far away from the main group. These points need to be excluded so that the machine learning algorithm can work optimally.

Random Forest Algorithm
The outlier function in the Orange software helps remove data that are out of range. It identified 10 records that are not valid for use. After this data reduction, the data are ready to be used in the machine learning algorithm. Here the researchers use the random forest algorithm, trained on data obtained from existing debtors.
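The idea of removing out-of-range records before training can be illustrated with a simple interquartile-range (IQR) rule. This is a swapped-in, simpler technique chosen for illustration: the text does not state which detection method was configured in Orange, and the function names, the column names and the toy records below are all hypothetical.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def drop_outliers(records, columns, k=1.5):
    """Keep only records whose value lies inside the fences for every column."""
    fences = {c: iqr_bounds([r[c] for r in records], k) for c in columns}
    return [
        r for r in records
        if all(fences[c][0] <= r[c] <= fences[c][1] for c in columns)
    ]


# Hypothetical records, not BCA data; the last one has an extreme balance.
balances = [10, 12, 11, 13, 12, 11, 10, 14, 13, 500]
mutations = [5, 6, 7, 5, 6, 7, 5, 6, 7, 6]
records = [{"avg_balance": b, "credit_mutation": m} for b, m in zip(balances, mutations)]

cleaned = drop_outliers(records, ["avg_balance", "credit_mutation"])
print(len(records), "records before,", len(cleaned), "after")
```

The multiplier `k` controls how aggressive the filter is; a rule of this kind looks at each column separately, whereas multivariate detectors consider the columns jointly.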

Figure 9 Decision Tree
The random forest algorithm is a collection of decision trees that are used together through the Bagging (Bootstrap Aggregating) method. The Pythagorean forest visualization, with a total of 10 trees, shows how the random forest algorithm arrives at its predictions.
Confusion Matrix
The confusion matrix summarizes the predictions of the random forest algorithm for a total of 92 records. For High credit usage, all 56 records are predicted correctly. For Low credit usage, 18 of 19 records are predicted correctly. For Moderate credit usage, 16 of 17 records are predicted correctly.
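The counts reported above are enough to work out per-class hit rates and the overall accuracy. The short calculation below is a sketch that assumes the stated totals (56, 19 and 17) count the records that truly belong to each class; the off-diagonal cells of the matrix are not given in the text, so they are not reconstructed here.

```python
# Counts as reported for the 92 existing-debtor records.
correct = {"High": 56, "Low": 18, "Moderate": 16}
total = {"High": 56, "Low": 19, "Moderate": 17}

# Per-class hit rate, assuming `total` holds the true class sizes.
hit_rate = {label: correct[label] / total[label] for label in total}
accuracy = sum(correct.values()) / sum(total.values())

for label, rate in hit_rate.items():
    print(f"{label}: {correct[label]}/{total[label]} = {rate:.3f}")
print(f"overall accuracy: {sum(correct.values())}/{sum(total.values())} = {accuracy:.3f}")
```

Under these counts the overall accuracy is 90 of 92, about 97.8%. If the 92 records are the same ones the model was trained on, this is a training accuracy and may overstate performance on new customers.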

Figure 13 Learning results and predictions of the random forest algorithm with train data
In the final stage of the research, the random forest algorithm, which has learned from the 92 training records of existing debtors, is used as a prediction tool. The algorithm is applied to the K1 data to give an overview of likely credit usage if the customer becomes an MSME debtor. Of the 1,141 K1 records that were cleaned and transformed, 685 could be carried forward to the prediction stage.
Figure 14 Random Forest Algorithm Final Flow and Orange Prediction
In summary, the machine learning flow starts from the input training data of existing debtors, which have gone through cleaning and transformation; out-of-range records are then removed using the outlier function. The random forest algorithm then learns from the training data. Next, the 685 K1 records that have gone through cleaning and transformation are input, and the random forest algorithm produces a prediction for each of them.

Machine Learning Prediction Results
The prediction results for the 685 K1 customers indicate that, if credit is given to these customers, their usage is likely to follow the pattern of the existing debtors.

Figure 15 Results of predictions and recommendations for new MSME debtors
According to the prediction results, 136 K1 customers would have High usage if given credit, 17 would have Moderate usage, and 532 would have Low usage. These results have managerial implications for determining the strategy for approaching prospective borrowers. Offering credit to the 136 customers with predicted high usage is more likely to succeed than offering credit to the 532 customers with predicted low usage. This strategy is expected to help grow the number of MSME debtors at the BCA Tebing Tinggi branch.
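To put the three groups in proportion, the reported counts can be converted into shares of the 685 predicted customers. The snippet below is a small illustrative calculation; the `priority` ordering is a hypothetical way of expressing the prospecting strategy described above, not a rule stated in the study.

```python
# Predicted usage classes for the 685 K1 customers, as reported.
predicted = {"High": 136, "Moderate": 17, "Low": 532}
total = sum(predicted.values())
shares = {label: count / total for label, count in predicted.items()}

# Hypothetical prospecting order: highest predicted usage first.
priority = ["High", "Moderate", "Low"]
for label in priority:
    print(f"{label}: {predicted[label]} customers ({shares[label]:.1%})")
```

Roughly one in five of the predicted customers falls in the High group, which is the group the strategy targets first.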

CONCLUSION
Based on the analysis of the research results for existing debtors and K1 customers, it can be concluded that:
1. The descriptive analysis using scatter plots shows a relationship between variables such as average balance, account mutation, ratio, age, and line of business and the credit usage of existing debtors. This can be the basis for treating these variables as factors influencing credit usage.
2. The random forest algorithm was applied to learn the tendency of credit usage; it was trained on 92 records of existing debtors, and the results have fairly good accuracy. The algorithm was then used to predict 685 K1 customers who could potentially become debtors. The prediction results show that there are 136 customers