Literature roundup | Word embeddings and bias (attitudes) in the social sciences

Word embeddings

Just a few days ago I shared,

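When people write letters or post in online forums, the language they leave behind also carries traces of their subjective cognition, such as biases and attitudes.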

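Word embeddings are a kind of word vector model that can extract implicit contextual information from text, including attitudes and biases. By measuring distances between word vectors, we can indirectly measure the attitudes and biases that different groups hold toward a concept (an organization, group, brand, region, and so on). The technique seems broadly useful, so I recently collected the relevant literature from PNAS, Nature, and Science. Incidentally, a good share of the PNAS papers on word embeddings provide their raw data and code.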
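The distance-based measurement described above can be sketched in a few lines of Python. This is only a minimal illustration, not the method of any paper listed below: the three-dimensional vectors are made up for the example (real studies use embeddings trained on a corpus), and the names `cosine_similarity`, `mean_similarity`, `relative_association`, and `toy_vectors` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_similarity(target, attribute_words, vectors):
    """Average cosine similarity between a target word and a word list."""
    sims = [cosine_similarity(vectors[target], vectors[w]) for w in attribute_words]
    return sum(sims) / len(sims)


def relative_association(target, group_a, group_b, vectors):
    """Positive when the target sits closer to group_a than to group_b."""
    return mean_similarity(target, group_a, vectors) - mean_similarity(
        target, group_b, vectors
    )


# Toy vectors invented for illustration only (assumption, not real data).
toy_vectors = {
    "engineer": [0.9, 0.1, 0.3],
    "nurse": [0.1, 0.9, 0.3],
    "he": [1.0, 0.0, 0.2],
    "man": [0.8, 0.1, 0.1],
    "she": [0.0, 1.0, 0.2],
    "woman": [0.1, 0.8, 0.1],
}

male_words = ["he", "man"]
female_words = ["she", "woman"]

for occupation in ("engineer", "nurse"):
    score = relative_association(occupation, male_words, female_words, toy_vectors)
    print(f"{occupation}: {score:+.3f}")
```

With these toy vectors the score is positive for "engineer" and negative for "nurse". Whether such a gap reflects a real attitude depends on the corpus the embeddings were trained on and on the chosen word lists.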

Some Python libraries can already use word embedding models to reveal human cognitive biases, for example:

Related literature

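Ran Yaxuan, Li Zhiqiang, Liu Jiani and Zhang Yishi. Expanding social science research methods in the era of big data: applications of text analysis based on word embedding techniques [J/OL]. Nankai Business Review (南开管理评论): 1-27 [2022-04-08]. http://kns.cnki.net/kcms/detail/12.1288.F.20210905.1337.002.html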
Kozlowski, A.C., Taddy, M. and Evans, J.A., 2019. The geometry of culture: Analyzing the meanings of class through word embeddings. American Sociological Review, 84(5), pp.905-949.
Toubia, O., Berger, J. and Eliashberg, J., 2021. How quantifying the shape of stories predicts their success. Proceedings of the National Academy of Sciences, 118(26).
Caliskan, A., Bryson, J.J. and Narayanan, A., 2017. Semantics derived automatically from language corpora contain human-like biases. Science, 356, pp.183-186.
Garg, N., Schiebinger, L., Jurafsky, D. and Zou, J., 2018. Word embeddings quantify 100 years of gender and ethnic stereotypes. Proceedings of the National Academy of Sciences, 115(16), pp.E3635-E3644. doi:10.1073/pnas.1720347115
Peng, H., Ke, Q., Budak, C., Romero, D.M. and Ahn, Y.Y., 2021. Neural embeddings of scholarly periodicals reveal complex disciplinary organizations. Science Advances, 7(17), p.eabb9004.
Waller, I. and Anderson, A., 2021. Quantifying social organization and political polarization in online platforms. Nature, 600(7888), pp.264-268.
Arseniev-Koehler, A., Cochran, S.D., Mays, V.M., Chang, K.W. and Foster, J.G., 2022. Integrating topic modeling and word embedding to characterize violent deaths. Proceedings of the National Academy of Sciences, 119(10), p.e2108801119.
Bollen, J., Ten Thij, M., Breithaupt, F., Barron, A.T., Rutter, L.A., Lorenzo-Luaces, L. and Scheffer, M., 2021. Historical language records reveal a surge of cognitive distortions in recent decades. Proceedings of the National Academy of Sciences, 118(30).
Kim, L., Smith, D.S., Hofstra, B. and McFarland, D.A., 2022. Gendered knowledge in fields and academic careers. Research Policy, 51(1), p.104411.
Lawson, M.A., Martin, A.E., Huda, I. and Matz, S.C., 2022. Hiring women into senior leadership positions is associated with a reduction in gender stereotypes in organizational language. Proceedings of the National Academy of Sciences, 119(9), p.e2026443119.
Brady, W.J., McLoughlin, K., Doan, T.N. and Crockett, M.J., 2021. How social learning amplifies moral outrage expression in online social networks. Science Advances, 7(33), p.eabe5641.
Bailey, A.H., Williams, A. and Cimpian, A., 2022. Based on billions of words on the internet, people = men. Science Advances, 8(13), p.eabm2463.
Lewis, M. and Lupyan, G., 2020. Gender stereotypes are reflected in the distributional structure of 25 languages. Nature Human Behaviour, 4(10), pp.1021-1028.
Schramowski, P., Turan, C., Andersen, N., Rothkopf, C.A. and Kersting, K., 2022. Large pre-trained language models contain human-like biases of what is right and wrong to do. Nature Machine Intelligence, 4(3), pp.258-268.
Costa-jussà, M.R., 2019. An analysis of gender bias studies in natural language processing. Nature Machine Intelligence, 1(11), pp.495-496.
Rodman, E., 2020. A timely intervention: Tracking the changing meanings of political concepts with word vectors. Political Analysis, 28(1), pp.87-111.
Bhatia, S., 2017. Associative judgment and vector space semantics. Psychological Review, 124(1), p.1.
Kurdi, B., Mann, T.C., Charlesworth, T.E. and Banaji, M.R., 2019. The relationship between implicit intergroup attitudes and beliefs. Proceedings of the National Academy of Sciences, 116(13), pp.5862-5871.
Charlesworth, T.E., Yang, V., Mann, T.C., Kurdi, B. and Banaji, M.R., 2021. Gender stereotypes in natural language: Word embeddings show robust consistency across child and adult language corpora of more than 65 million words. Psychological Science, 32(2), pp.218-240.
Bhatia, S., 2019. Predicting risk perception: New insights from data science. Management Science, 65(8), pp.3800-3823.
Rheault, L. and Cochrane, C., 2020. Word embeddings for the analysis of ideological placement in parliamentary corpora. Political Analysis, 28(1), pp.112-133.
Yang, K., Lau, R.Y. and Abbasi, A., 2022. Getting Personal: A Deep Learning Artifact for Text-Based Measurement of Personality. Information Systems Research.
Margulis, E.H., Wong, P.C., Turnbull, C., Kubit, B.M. and McAuley, J.D., 2022. Narratives imagined in response to instrumental music reveal culture-bounded intersubjectivity. Proceedings of the National Academy of Sciences, 119(4).
Thompson, B., Roberts, S.G. and Lupyan, G., 2020. Cultural influences on word meanings revealed through large-scale semantic alignment. Nature Human Behaviour, 4(10), pp.1029-1038.
