Predicting financial distress of companies using textual information of Board of Directors' Reports

Document Type : Research Paper

Authors

1 Assistant Professor, Department of Accounting, Allameh Tabataba’i University, Tehran, Iran

2 Accounting, Management and Accounting, Allameh Tabataba'i University, Tehran, Iran

3 Associate Professor, Department of Industrial Management, Allameh Tabataba’i University, Tehran, Iran

4 Ph.D. Candidate, Department of Accounting, Allameh Tabataba’i University, Tehran, Iran

10.22103/jak.2024.22992.4017

Abstract

Objective: In the modern view of the company as a socio-economic unit, companies have, in addition to their economic function, diverse non-economic functions, such as supporting the labor market and employment, preserving or damaging the environment, affecting future generations, and transferring debt or capital. Given these economic and non-economic consequences, the financial distress and bankruptcy of companies, or their health and efficiency, strongly affect socio-economic indicators at both the macro and micro levels. Predicting these situations can therefore raise awareness and help stakeholders prepare for possible future conditions, especially in the economic dimension.

So far, most research on the financial distress and bankruptcy of companies has relied on structured data and quantitative disclosures, even though a large part of information is now published through qualitative disclosures. Although the quantitative information in companies' financial reports is of great importance, this information alone cannot give users a complete picture of a company's condition. To use the qualitative information in financial reports, investors must first decode it and then process it. For this reason, presenting written information alongside quantitative information in financial reports, both to aid understanding of the quantitative information and to convey information that cannot be expressed in numbers, has attracted the attention of producers, users, and regulators of accounting information. Today, textual information occupies more of companies' financial reports than quantitative information. The current research therefore seeks first to convert the unstructured data of the board of directors' report into structured data and then to use machine learning algorithms, implemented in Python, to provide a model for predicting financial distress.

Method: In this research, the board of directors' reports of 100 companies for the period 2012-2022 were collected, and the Python programming language was used for text mining to convert the unstructured data of these reports into structured data. After the text pre-processing, feature generation, feature selection, and related steps for preparing the textual data, the TF-IDF matrix was used to select features. These matrices were then combined with the financial distress variables, machine learning algorithms were applied, and finally the resulting model was validated using accepted validation methods.
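To illustrate how report text can be turned into a TF-IDF matrix, the following is a minimal sketch in pure Python. It is not the authors' pipeline: the four toy "reports", the whitespace tokenizer, and the weighting formula (term frequency times ln(N / document frequency)) are all assumptions made for illustration, and the paper does not state which TF-IDF variant or library was used.

```python
import math
from collections import Counter

# Toy stand-ins for pre-processed report texts (hypothetical, not the paper's data).
reports = [
    "sales growth profit increase",
    "loss debt liquidity shortage",
    "profit growth export increase",
    "debt loss lawsuit shortage",
]


def tfidf_matrix(docs):
    """Return (vocab, matrix), where matrix[i][j] is the TF-IDF weight of
    term vocab[j] in document i.

    Assumed weighting: tf = count / document length, idf = ln(N / df).
    """
    tokenized = [doc.split() for doc in docs]
    vocab = sorted({tok for toks in tokenized for tok in toks})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    matrix = []
    for toks in tokenized:
        counts = Counter(toks)
        row = [(counts[t] / len(toks)) * math.log(n_docs / df[t]) for t in vocab]
        matrix.append(row)
    return vocab, matrix


vocab, X = tfidf_matrix(reports)
```

Each row of `X` is one report and each column one term, so the matrix can be paired with a distress label per report. Terms that occur in fewer reports receive a larger weight than terms shared by many reports.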

Findings: After extracting the dependent variable (the y matrix) and obtaining the explanatory variables (the x matrix), we entered the modeling phase with different machine learning methods: logistic regression, random forest, decision tree, nearest neighbor, and support vector machines (SVM) with linear, RBF, sigmoid, and polynomial kernels.
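As a rough illustration of the modeling phase, the sketch below implements the simplest of the listed methods, a nearest-neighbor classifier, in pure Python. The feature rows, the labels (1 = distressed, 0 = healthy), the use of a single neighbor, and the Euclidean distance are all assumptions for illustration. The paper does not report these settings, and the other listed models would normally come from a machine learning library.

```python
import math


def nearest_neighbor_predict(X_train, y_train, x_new):
    """Return the label of the training row closest to x_new (1-NN, Euclidean)."""
    distances = [math.dist(row, x_new) for row in X_train]
    return y_train[distances.index(min(distances))]


# Hypothetical two-feature rows (e.g. two TF-IDF weights per report).
X_train = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.7]]
y_train = [0, 0, 1, 1]  # 1 = financially distressed, 0 = healthy (assumed coding)

pred_distressed = nearest_neighbor_predict(X_train, y_train, [0.15, 0.85])
pred_healthy = nearest_neighbor_predict(X_train, y_train, [0.85, 0.10])
```

In this toy setup a new report is assigned the label of the most similar labelled report, which is the core idea behind the nearest neighbor method named in the findings.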

According to the results, the decision tree model and the SVM with a radial (RBF) kernel perform best among the methods on the Precision index, at 80 and 82 percent, respectively. On the Recall criterion, which measures how many of the companies that actually experienced financial distress were identified, the two models cover 78 and 86 percent of distress cases, respectively. On the F-score criterion, which combines Precision and Recall, the two models reach 79% and 84%, respectively, confirming their ability to make correct diagnoses. The other models differed significantly from these two models in diagnosing financial distress.
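The reported F-scores can be checked against the reported Precision and Recall values. The sketch below assumes the F-score is the usual F1 measure, the harmonic mean of Precision and Recall. The abstract does not state the formula, so this is an assumption, but it reproduces the reported figures after rounding.

```python
def f_score(precision, recall):
    """F1 score: harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and Recall as reported in the findings.
decision_tree_f = f_score(0.80, 0.78)  # decision tree
svm_rbf_f = f_score(0.82, 0.86)        # SVM with radial (RBF) kernel

print(round(decision_tree_f * 100))  # 79
print(round(svm_rbf_f * 100))        # 84
```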

Discussion and Conclusion: Many models have been presented for predicting financial distress, but the effect of the information content of the board of directors' report on such predictions has received less attention. The board of directors' report contains valuable clues that, like financial data, can reveal the existence of financial distress in companies. This research therefore attempted to predict the financial distress of companies by using the information capacity of the board of directors' reports. After preprocessing, data mining operations, and extraction of the dependent variable, financial distress was predicted using different machine learning methods (logistic regression, random forest, decision tree, nearest neighbor, and support vector machines with linear, RBF, sigmoid, and polynomial kernels). The prediction results indicate the high predictive power of the decision tree model (79%) and the SVM with a radial kernel (85%) compared to the other models. The results showed that, instead of attending only to the numbers and the ratios derived from them, the text mining technique can also be used for analysis and prediction, and that combining it with the results obtained from quantitative information makes it possible to determine the financial distress of companies. Although prediction from unstructured data is less accurate than prediction from structured data, which has a predictability of about 90%, recognizing and using the capacity of this type of information is very important, since many academics and practitioners believe that quantitative disclosures alone are not effective for economic decision-making.

This research adds to knowledge in the fields of accounting and data mining in several ways. First, it has tried to move away from the structured data extracted from financial statements, market data, and economic data, relying instead on the board of directors' reports (unstructured data) for a major part of the analysis. Second, its results showed that, by relying on the analysis of "the text of reports" instead of "numbers and figures", the financial distress of companies can be predicted very accurately.

Articles in Press, Accepted Manuscript
Available Online from 13 May 2024
  • Receive Date: 22 February 2024
  • Revise Date: 12 May 2024
  • Accept Date: 13 May 2024