JCR2021 | Computing the Linguistic Concreteness of Text

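Linguistic concreteness describes the degree to which a word refers to an actual, tangible, or "real" entity, describing objects and actions in a way that is more specific, more familiar, and more easily perceived by the eye or the mind (i.e., imaginable or vivid; Brysbaert, Warriner, and Kuperman 2014; Semin and Fiedler 1988). I found three papers on the topic and share them briefly here. ...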

2022-04-07 · 3 min · 大邓

Repost | Expanding Social Science Research Methods in the Big Data Era: Applications of Text Analysis Based on Word Embedding

In the era of big data, analysis and processing techniques built on large datasets have created new opportunities for data-driven social science research. Among them, word embedding has drawn growing attention in text analysis for its efficient word representations and strong transfer learning ability. Unlike traditional approaches to text analysis, word embedding not only represents unstructured text data but also preserves rich semantic information, which makes it possible to mine deep cultural information from texts across time periods and cultures and greatly enriches the empirical methods of traditional social science. The article summarizes the basic principles and characteristics of word embedding and systematically reviews its six application themes: social bias, concept association, semantic evolution, organizational relationships, text sentiment, and individual decision-making mechanisms. It then outlines the basic workflow for applying word embedding. Word embedding also faces three challenges: the choice of text data, word segmentation for Chinese text, and the level at which word semantics are represented. The article summarizes approaches for addressing each. Finally, given the strong adaptability of word embedding, future research can look further at its prospects in management, covering six areas: policy effect evaluation, user recommendation systems, brand management, enterprise relationship management, internal organizational management, and traditional Chinese wisdom and management problems. ...

2022-04-07 · 5 min · 冉雅璇, 李志强, 刘佳妮, 张逸石

PyPlutchik library | Visualizing the emotion wheel (emotional fingerprint) of text

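To measure emotion, a growing number of social network researchers have produced emotion visualizations based on the model proposed by **psychologist Robert Plutchik** (usually called the "**Plutchik wheel**", which groups human emotions into 8 broad categories). To some extent, the Plutchik wheel can be read as an emotional fingerprint: for example, different film genres are distributed differently across the 8 emotion categories. ...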

2022-04-03 · 2 min · 大邓

whatlies library | Visualizing word vectors

Words can be compared by how close or distant they are to one another...

2022-04-02 · 1 min · 大邓

cntext library | Python text analysis package update

Dictionary expansion, sentiment analysis, and readability, with 9 built-in sentiment dictionaries covering Chinese and English...

2022-04-01 · 5 min · 大邓