Word Vectors | Training a Word2Vec Model on the MD&A 2001-2022 Corpus
...
...
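Learn "Python Empirical Indicator Construction and Text Analysis" (「Python实证指标构建与文本分析」) more efficiently with ChatGPT. Take a course that covers three kinds of content (Python syntax, coding techniques, and research applications), such as [Python Empirical Indicator Construction and Text Analysis], and master and internalize the minimum necessary knowledge. Understand the principles, learn to turn a requirement into concrete questions, and put those questions to ChatGPT. If a social science data analysis requirement is a castle, we need the ability to break it down into many small building blocks and then have ChatGPT implement each block for us. What we need to do is ...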
Papers published in Economic Research Journal (《经济研究》) that use machine learning and text analysis methods
In recent years, textual information has become a focus of empirical accounting research outside China. Many scholars have applied text analysis methods to accounting and finance questions and have produced many valuable results. By comparison, such research in China remains scarce. To help fill this gap, this paper systematically reviews and assesses the international research of the past ten years. First, it sets out the definition, characteristics, and measurement methods of accounting textual information. Second, it summarizes and analyzes, at different levels, the factors that influence accounting textual information and the consequences of that information. Third, it points out the shortcomings of current international research. On this basis, the paper proposes a framework for future research in China along three directions (foundation, introduction, and extension): how to build text analysis methods suited to Chinese accounting language, how to test existing international theories and questions in the Chinese setting, and what original research can be developed in the Chinese context.
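This paper uses a financial sentiment dictionary and text analysis techniques to analyze multi-dimensional textual information, including **text sentiment, text similarity, and text readability**, in the Monetary Policy Implementation Reports of the People's Bank of China. It characterizes the textual features of these reports and explores how their textual information relates to the macro economy and the stock market. **The empirical results show that an improvement in the text sentiment of a report triggers a significantly positive stock market price reaction, that an increase in report text similarity significantly reduces stock market volatility, and that report readability has no significant effect on stock market volatility after release.** The text sentiment of the reports is also significantly correlated with many macroeconomic indicators. Further analysis finds that the significant stock market reaction is driven by the part of the text sentiment that reflects monetary policy guidance, while the part that reflects the historical state of the macro economy has no significant effect on the stock market. From the perspective of textual big data analysis, the paper provides evidence that central bank communication in China is effective, and it is a useful complement to domestic research on central bank communication.

This paper uses text analysis techniques to analyze 71 Monetary Policy Implementation Reports (hereinafter "the reports") of the People's Bank of China (PBOC), calculates the text sentiment (tone), similarity, readability, and other text indicators of the reports, and explores the relationship between these indicators and the macro economy and the stock market. Based on the Chinese financial sentiment dictionary developed by Jiang et al. (2020), the paper uses the sentiment unit method to calculate the tone of the reports. In addition, it uses TF-IDF weighted cosine similarity to characterize the similarity of the reports and average sentence length to characterize their readability. The paper then uses correlation analysis to examine the relationship between the tone of the reports and macroeconomic indicators such as economic growth, inflation, and interest rates. Following Ehrmann and Fratzscher (2009) and Zhang and Hu (2014), the paper adds tone, similarity, and readability to an EGARCH model to explore whether the textual indicators of the reports affect stock market returns and volatility on the trading day after release. Furthermore, the paper decomposes the content of the reports into two parts, economic and financial fundamentals and central bank policy guidelines, calculates the tone of each part, and examines their respective impacts on the stock market. ...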
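To make two of the indicators named in the abstract above more concrete, here is a minimal sketch of TF-IDF weighted cosine similarity and average sentence length, using only the Python standard library. It is an illustration under stated assumptions, not the paper's implementation: the smoothed IDF formula, the sentence-splitting punctuation set, the function names, and the pre-tokenized toy documents are all hypothetical choices made for this example (real Chinese text would first need a word segmenter).

```python
import math
import re
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: TF-IDF weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Smoothed IDF is an assumption; the paper's exact weighting
        # is not stated in the abstract.
        vectors.append({
            term: (count / len(doc)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def average_sentence_length(text):
    """Average number of non-whitespace characters per sentence.

    Sentences are split on Chinese and ASCII terminal punctuation
    (an assumed rule for this sketch; it does not split on '.').
    """
    sentences = [s for s in re.split(r"[。！？!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    lengths = [len(re.sub(r"\s+", "", s)) for s in sentences]
    return sum(lengths) / len(lengths)


# Hypothetical, pre-tokenized toy "reports" for demonstration only.
toy_reports = [
    ["货币", "政策", "保持", "稳健"],
    ["货币", "政策", "保持", "灵活"],
]
toy_vectors = tfidf_vectors(toy_reports)
print(cosine_similarity(toy_vectors[0], toy_vectors[1]))
print(average_sentence_length("经济运行平稳。货币政策保持稳健。"))
```

In a study like the one described, the similarity would typically be computed between each report and the preceding one, so that a higher value indicates less change in wording from one release to the next.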