{ "cells": [ { "cell_type": "markdown", "id": "32eaa36a", "metadata": {}, "source": [ "\n", "\n", "## 一、高管数据集\n", "\n", "\n", "\n", "
\n", "\n", "\n", "### 1.1 介绍\n", "[数据集 | 90w条中国上市公司高管数据](https://textdata.cn/blog/2022-11-25-senior-manager-resume-dataset/)\n", "\n", "90w 条中国上市公司高管简历,数据源-新浪财经,统计的日期范围**1990-2021**年。\n", " \n", " \n", "### 1.2 字段\n", "数据集的字段含,大多是从「个人简历」中计算衍生出来的。\n", "\n", "```\n", "- ID\n", "- 姓名\n", "- 证券代码\n", "- 统计截止日期\n", "- 个人简历\n", "- 国籍\n", "- 籍贯\n", "- 籍贯所在地区代码\n", "- 出生地\n", "- 出生地所在地区代码\n", "- 性别\n", "- 年龄\n", "- 毕业院校\n", "- 学历 1=中专及中专以下; 2=大专; 3=本科; 4=硕士研究生; 5=博士研究生; 6=其他(以其他形式公布的学历,如荣誉博士、函授等); 7=MBA/EMBA\n", "- 专业\n", "- 职称\n", "- 是否领取薪酬\n", "- 报告期报酬总额\n", "- 年末持股数\n", "- 是否高管团队成员\n", "- 是否董事会成员\n", "- 是否独立董事\n", "- 是否兼任董事长和CEO\n", "- 是否监事\n", "- 具体职务\n", "```\n", "\n", "\n", "
\n", "\n", "### 1.3 应用价值\n", "\n", "这里粘贴部分应用高管数据论文\n", "\n", "- 何瑛,于文蕾,戴逸驰,王砚羽.高管职业经历与企业创新[J].管理世界,2019,35(11):174-192.\n", "- 杨林,和欣,顾红芳.高管团队经验、动态能力与企业战略突变:管理自主权的调节效应[J].管理世界,2020,36(06):168-188+201+252.\n", "- 周楷唐,麻志明,吴联生.高管学术经历与公司债务融资成本[J].经济研究,2017,52(07):169-183.\n", "- 陆瑶,张叶青,黎波,赵浩宇.高管个人特征与公司业绩——基于机器学习的经验证据[J].管理科学学报,2020,23(02):120-140.\n", "- 柳光强,孔高文.高管经管教育背景与企业内部薪酬差距[J].会计研究,2021,(03):110-121.\n", "- 郑建明,孙诗璐,李金甜.高管文化背景与企业债务成本——基于劳模文化的视角[J].会计研究,2021,(03):137-145.\n", " \n", " \n", "
\n", "\n", "## 二、代码案例\n", "\n", "用Python实现以下五个技术难题,主要对高管简介进行操作\n", "\n", "1. 读取xlsx文件(90w高管数据)\n", "2. 简介文本中是否含指定词语(例如找出有【清华大学】求学经历的高管)\n", "3. 大学高管数量排行榜\n", "4. 统计文本中指定词语出现次数(例如统计每位高管内【大学】出现次数)\n", "5. 找出每位高管的出生年份(用正则表达式)\n", "6. 统计每位高管经历的时间点个数\n", "...\n", "\n", "### 2.1 导入数据\n" ] }, { "cell_type": "code", "execution_count": 1, "id": "1bc443a4", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
ID姓名证券代码统计截止日期个人简历国籍籍贯籍贯所在地区代码出生地出生地所在地区代码...是否领取薪酬报告期报酬总额津贴年末持股数是否高管团队成员是否董事会成员是否独立董事是否兼任董事长和CEO是否监事具体职务
030276026赵勇6051692020-10-30赵勇先生,副总经理,中国国籍,无境外永久居留权,1969年6月出生,大专学历;曾任新疆生产建...中华人民共和国北京市110000北京市110000...Y269600.0NaNNaN10000副总经理
13056116林菁6890092020-10-29林菁先生,1965年3月出生,中国国籍,无境外永久居留权,对外经济贸易大学工商管理(MBA)...中华人民共和国北京市110000北京市110000...N166300.0166300.0NaN01100独立董事
230568910信意安29952020-08-05信意安,男,1972年出生,中国国籍,无境外永久居留权,专科学历,现任发行人董事长兼总经理,...中华人民共和国北京市110000北京市110000...Y950000.0NaN22509250.011010董事长,非独立董事,总经理
330101636赵炳弟6885612020-07-22赵炳弟先生,独立董事,生于1960年10月,中国籍,无境外永久居留权,研究生学历,毕业于中央...NaN北京市110000北京市110000...N105000.0105000.0NaN01100独立董事
430138872张金6883772020-07-08张金先生:中国国籍,1962年出生,无境外永久居留权,本科学历,高级工程师,2009年2月被...中华人民共和国北京市110000北京市110000...N60000.060000.0NaN01100独立董事
\n", "

5 rows × 26 columns

\n", "
" ], "text/plain": [ " ID 姓名 证券代码 统计截止日期 \\\n", "0 30276026 赵勇 605169 2020-10-30 \n", "1 3056116 林菁 689009 2020-10-29 \n", "2 30568910 信意安 2995 2020-08-05 \n", "3 30101636 赵炳弟 688561 2020-07-22 \n", "4 30138872 张金 688377 2020-07-08 \n", "\n", " 个人简历 国籍 籍贯 籍贯所在地区代码 \\\n", "0 赵勇先生,副总经理,中国国籍,无境外永久居留权,1969年6月出生,大专学历;曾任新疆生产建... 中华人民共和国 北京市 110000 \n", "1 林菁先生,1965年3月出生,中国国籍,无境外永久居留权,对外经济贸易大学工商管理(MBA)... 中华人民共和国 北京市 110000 \n", "2 信意安,男,1972年出生,中国国籍,无境外永久居留权,专科学历,现任发行人董事长兼总经理,... 中华人民共和国 北京市 110000 \n", "3 赵炳弟先生,独立董事,生于1960年10月,中国籍,无境外永久居留权,研究生学历,毕业于中央... NaN 北京市 110000 \n", "4 张金先生:中国国籍,1962年出生,无境外永久居留权,本科学历,高级工程师,2009年2月被... 中华人民共和国 北京市 110000 \n", "\n", " 出生地 出生地所在地区代码 ... 是否领取薪酬 报告期报酬总额 津贴 年末持股数 是否高管团队成员 是否董事会成员 \\\n", "0 北京市 110000 ... Y 269600.0 NaN NaN 1 0 \n", "1 北京市 110000 ... N 166300.0 166300.0 NaN 0 1 \n", "2 北京市 110000 ... Y 950000.0 NaN 22509250.0 1 1 \n", "3 北京市 110000 ... N 105000.0 105000.0 NaN 0 1 \n", "4 北京市 110000 ... N 60000.0 60000.0 NaN 0 1 \n", "\n", " 是否独立董事 是否兼任董事长和CEO 是否监事 具体职务 \n", "0 0 0 0 副总经理 \n", "1 1 0 0 独立董事 \n", "2 0 1 0 董事长,非独立董事,总经理 \n", "3 1 0 0 独立董事 \n", "4 1 0 0 独立董事 \n", "\n", "[5 rows x 26 columns]" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import pandas as pd\n", "\n", "df = pd.read_excel('高管数据.xlsx')\n", "#剔除「个人简历」字段中的缺失值\n", "df.dropna(subset=['个人简历'], inplace=True)\n", "df.head()" ] }, { "cell_type": "markdown", "id": "3fa652c5", "metadata": {}, "source": [ "
\n", "\n", "### 2.2 简介文本长度" ] }, { "cell_type": "code", "execution_count": 91, "id": "18f17838", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 161\n", "1 154\n", "2 395\n", "3 306\n", "4 335\n", " ... \n", "900882 40\n", "900883 54\n", "900884 71\n", "900885 41\n", "900886 62\n", "Name: 个人简历, Length: 736970, dtype: int64" ] }, "execution_count": 91, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df['个人简历'].str.len()\n", "\n", "#新增一个字段length,将简介文本长度保存到length中\n", "#df['length'] = df['个人简历'].str.len()" ] }, { "cell_type": "markdown", "id": "9d06e0cd", "metadata": {}, "source": [ "
\n", "\n", "### 2.3 简介文本中是否含指定词语\n", "\n", "例如找出有【清华大学】求学经历的高管,这里直接使用**Series.str.contains()**方法来直接搜某字段(Series)是否含某个词\n", "\n", "- ``len(df[df['个人简历'].str.contains('清华大学')])`` 有「清华大学」学习经历的高管人数\n", "- ``len(df[df['个人简历'].str.contains('北京大学')])`` 有「北京大学」学习经历的高管人数\n", "- ``len(df[df['个人简历'].str.contains('清华大学|北京大学')])`` 有「清华大学」或「北京大学」学习经历的高管人数\n", "- ``len(df[df['个人简历'].str.contains('清华大学') & df['个人简历'].str.contains('北京大学')])`` 同时有「清华大学」和「北京大学」学习经历的高管人数\n", "\n", "第三个(北大清华)表达式的数量应该是最多的(前两者之和), 第四个表达式是最少。 注意, 逻辑【或|】【且&】可以有任意多个" ] }, { "cell_type": "code", "execution_count": 78, "id": "56648e52", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "10377" ] }, "execution_count": 78, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#统计有【清华大学】学习经历的高管人数\n", "len(df[df['个人简历'].str.contains('清华大学')])" ] }, { "cell_type": "code", "execution_count": 77, "id": "3dc0e124", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "8709" ] }, "execution_count": 77, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(df[df['个人简历'].str.contains('北京大学')])" ] }, { "cell_type": "code", "execution_count": 76, "id": "abbfd513", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "18647" ] }, "execution_count": 76, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(df[df['个人简历'].str.contains('清华大学|北京大学')])" ] }, { "cell_type": "code", "execution_count": 72, "id": "8b043d0c", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "439" ] }, "execution_count": 72, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(df[df['个人简历'].str.contains('清华大学') & df['个人简历'].str.contains('北京大学')])" ] }, { "cell_type": "markdown", "id": "5f8c8340", "metadata": {}, "source": [ "
\n", "\n", "### 2.4 大学高管数量排行榜" ] }, { "cell_type": "code", "execution_count": 69, "id": "d7f97dcb", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "大学高管人数排行\n" ] }, { "data": { "text/plain": [ "[('清华大学', 10377),\n", " ('北京大学', 8709),\n", " ('中国人民大学', 7012),\n", " ('浙江大学', 5816),\n", " ('中山大学', 4065),\n", " ('上海交通大学', 3844),\n", " ('武汉大学', 3578),\n", " ('南京大学', 3272),\n", " ('西安交通大学', 2972),\n", " ('南开大学', 2716),\n", " ('湖南大学', 2502),\n", " ('华中科技大学', 2356),\n", " ('同济大学', 2089),\n", " ('吉林大学', 2044),\n", " ('四川大学', 1934),\n", " ('山东大学', 1847),\n", " ('中南大学', 1615),\n", " ('天津大学', 1598),\n", " ('重庆大学', 1440),\n", " ('北京航空航天大学', 1334),\n", " ('东北大学', 1241),\n", " ('中国科学技术大学', 842),\n", " ('兰州大学', 745),\n", " ('中国地质大学', 437),\n", " ('大连理工大大学', 0)]" ] }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#测试列表(凭记忆手动输入的大学,各位可以自己设计测试列表)\n", "test_universitys = ['清华大学', '北京大学', '中国人民大学', '浙江大学', \n", " '上海交通大学', '西安交通大学', '同济大学', '南开大学', '天津大学', \n", " '武汉大学', '华中科技大学', '中国科学技术大学', '南京大学',\n", " '中山大学', '中南大学', '四川大学', '重庆大学', '兰州大学', '湖南大学', \n", " '山东大学', '吉林大学', '大连理工大大学', '东北大学', '北京航空航天大学', '中国地质大学']\n", "\n", "\n", "print('大学高管人数排行')\n", "\n", "uni_infos = []\n", "for university in test_universitys:\n", " num = len(df[df['个人简历'].str.contains(university)])\n", " uni_infos.append((university, num))\n", " \n", "uni_infos = sorted(uni_infos, key=lambda k:k[1], reverse=True)\n", "uni_infos" ] }, { "cell_type": "markdown", "id": "a2d14420", "metadata": {}, "source": [ "
\n", "\n", "### 2.5 统计文本中指定词语出现次数\n", "例如统计每位高管内【大学】出现次数" ] }, { "cell_type": "code", "execution_count": 79, "id": "15a1520f", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 0\n", "1 2\n", "2 0\n", "3 0\n", "4 0\n", " ..\n", "900882 0\n", "900883 0\n", "900884 0\n", "900885 0\n", "900886 0\n", "Name: 个人简历, Length: 736970, dtype: int64" ] }, "execution_count": 79, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df['个人简历'].str.count('大学')" ] }, { "cell_type": "code", "execution_count": 81, "id": "bc83ef8c", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "高管总人数: 736970\n", "无大学经历高管人数: 515172\n", "有大学经历高管人数: 221798\n" ] } ], "source": [ "print('高管总人数: ', len(df))\n", "#简历中无「大学」字眼\n", "print('无大学经历高管人数:' , len(df[df['个人简历'].str.count('大学')==0]))\n", "#简历中有「大学」字眼\n", "print('有大学经历高管人数:' , len(df[df['个人简历'].str.count('大学')>0]))" ] }, { "cell_type": "code", "execution_count": 84, "id": "3aacf916", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 84, "metadata": {}, "output_type": "execute_result" }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAiMAAAGhCAYAAACzurT/AAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8qNh9FAAAACXBIWXMAAA9hAAAPYQGoP6dpAAAnk0lEQVR4nO3dfVjVdZ7/8dcB5EAqmKJHScQzZUZSqx5aA8eySZmlm42pTZw2zUmbGKyG2GokZsdg5wp3pnV0d4NyRtfVGof2MttcMTu73ekwNSthNWWTkzeQHiKwBbPpoPD5/cHPc3XiRg5y/Ag9H9f1va4535vzfdMgPM/33OAwxhgBAABYEmF7AAAA8PVGjAAAAKuIEQAAYBUxAgAArCJGAACAVcQIAACwihgBAABWESMAAMCqKNsD9EZ7e7uOHDmi4cOHy+Fw2B4HAAD0gjFGx44dU2JioiIiur/+MSBi5MiRI0pKSrI9BgAA6IO6ujqNHz++2+0DIkaGDx8uqeOLiYuLszwNAADojZaWFiUlJQV+j3dnQMTIqadm4uLiiBEAAAaY073EghewAgAAq4gRAABgFTECAACsIkYAAIBVxAgAALCKGAEAAFYRIwAAwCpiBAAAWEWMAAAAq4gRAABgVZ9ipKysTG63WzExMfJ4PNq5c2e3+y5atEgOh6PTMmXKlD4PDQAABo+QY6SiokL5+fkqKipSTU2NZs2apaysLNXW1na5/+rVq+Xz+QJLXV2dRo4cqVtvvfWMhwcAAAOfwxhjQjlgxowZmj59usrLywPrUlJSlJ2drdLS0tMe/9xzz+nmm2/WgQMHlJyc3KtztrS0KD4+Xs3NzfyhPAAABoje/v4O6cpIa2urqqurlZmZGbQ+MzNTVVVVvbqPtWvXas6cOT2GiN/vV0tLS9ACAAAGp5BipLGxUW1tbXK5XEHrXS6X6uvrT3u8z+fT9u3btWTJkh73Ky0tVXx8fGBJSkoKZUwAADCARPXlIIfDEXTbGNNpXVfWr1+vESNGKDs7u8f9CgsLVVBQELjd0tJy2iCZuGzbac9/OgdXXH/G9wEAAEITUowkJCQoMjKy01WQhoaGTldLvsoYo3Xr1mnBggWKjo7ucV+n0ymn0xnKaAAAYIAK6Wma6OhoeTweeb3eoPVer1cZGRk9Hvvqq6/qT3/6kxYvXhz6lAAAYNAK+WmagoICLViwQGlpaUpPT9eaNWtUW1ur3NxcSR1PsRw+fFgbNmwIOm7t2rWaMWOGUlNT+2dyAAAwKIQcIzk5OWpqalJJSYl8Pp9SU1NVWVkZeHeMz+fr9Jkjzc3N2rx5s1avXt0/UwMAgEEj5M8ZsaE371PmBawAAJxbwvI5IwAAAP2NGAEAAFYRIwAAwCpiBAAAWEWMAAAAq4gRAABgFTECAACsIkYAAIBVxAgAALCKGAEAAFYRIwAAwCpiBAAAWEWMAAAAq4gRAABgFTECAACsIkYAAIBVxAgAALCKGAEAAFYRIwAAwCpiBAAAWEWMAAAAq4gRAABgFTECAACsIkYAAIBVxAgAALCKGAEAAFYRIwAAwCpiBAAAWEWMAAAAq4gRAABgFTECAACsIkYAAIBVxAgAALCKGAEAAFYRIwAAwCpiBAAAWEWMAAAAq4gRAABgFTECAACs6lOMlJWVye12KyYmRh6PRzt37uxxf7/fr6KiIiUnJ8vpdOrCCy/UunXr+jQwAAAYXKJCPaCiokL5+fkqKyvTzJkz9eSTTyorK0vvvfeeJkyY0OUx8+bN08cff6y1a9fqoosuUkNDg06ePHnGwwMAgIHPYYwxoRwwY8YMTZ8+XeXl5YF1KSkpys7OVmlpaaf9X3jhBc2fP1/79+/XyJEj+zRkS0uL4uPj1dzcrLi4uC73mbhsW5/u+8sOrrj+jO8DAAB06M3vbynEp2laW1tVXV2tzMzMoPWZmZmqqqrq8pjnn39eaWlp+tnPfqYLLrhAF198sR544AH9+c9/7vY8fr9fLS0tQQsAABicQnqaprGxUW1tbXK5XEHrXS6X6uvruzxm//792rVrl2JiYrRlyxY1NjYqLy9PR48e7fZ1I6WlpSouLg5lNAAAMED16QWsDocj6LYxptO6U9rb2+VwOPT000/rL//yL3Xddddp5cqVWr9+fbdXRwoLC9Xc3BxY6urq+jImAAAYAEK6MpKQkKDIyMhOV0EaGho6XS05Zdy4cbrgggsUHx8fWJeSkiJjjD766CNNmjSp0zFOp1NOpzOU0QAAwAAV0pWR6OhoeTweeb3eoPVer1cZGRldHjNz5kwdOXJEn332WWDdBx98oIiICI0fP74PIwMAgMEk5KdpCgoK9Ktf/Urr1q3T3r17df/996u2tla5ubmSOp5iWbhwYWD/2267TaNGjdL3vvc9vffee3rttdf04IMP6s4771RsbGz/fSUAAGBACvlzRnJyctTU1KSSkhL5fD6lpqaqsrJSycnJkiSfz6fa2trA/sOGDZPX69W9996rtLQ0jRo1SvPmzdNPf/rT/vsqAADAgBXy54zYwOeMAAAw8ITlc0YAAAD6GzECAACsIkYAAIBVxAgAALCKGAEAAFYRIwAAwCpiBAAAWEWMAAAAq4gRAABgFTECAACsIkYAAIBVxAgAALCKGAEAAFYRIwAAwCpiBAAAWEWMAAAAq4gRAABgFTECAACsIkYAAIBVxAgAALCKGAEAAFYRIwAAwCpiBAAAWEWMAAAAq4gRAABgFTECAACsIkYAAIBVxAgAALCKGAEAAFYRIwAAwCpiBAAAWEWMAAAAq4gRAABgFTECAACsIkYAAIBVxAgAALCKGAEAAFYRIwAAwCpiBAAAWNWnGCkrK5Pb7VZMTIw8Ho927tzZ7b6vvPKKHA5Hp+X999/v89AAAGDwCDlGKioqlJ+fr6KiItXU1GjWrFnKyspSbW1tj8f98Y9/lM/nCyyTJk3q89AAAGDwCDlGVq5cqcWLF2vJkiVKSUnRqlWrlJSUpPLy8h6PGzNmjMaOHRtYIiMj+zw0AAAYPEKKkdbWVlVXVyszMzNofWZmpqqqqno8dtq0aRo3bpyuvfZavfzyy6FPCgAABqWoUHZubGxUW1ubXC5X0HqXy6X6+voujxk3bpzWrFkjj8cjv9+vjRs36tprr9Urr7yiq666qstj/H6//H5/4HZLS0soYwIAgAEkpBg5xeFwBN02xnRad8rkyZM1efLkwO309HTV1dXpscce6zZGSktLVVxc3JfRAADAABPS0zQJCQmKjIzsdBWkoaGh09WSnlx55ZXat29ft9sLCwvV3NwcWOrq6kIZEwAADCAhxUh0dLQ8Ho+8Xm/Qeq/Xq4yMjF7fT01NjcaNG9ftdqfTqbi4uKAFAAAMTiE/TVNQUKAFCxYoLS1N6enpWrNmjWpra5Wbmyup46rG4cOHtWHDBknSqlWrNHHiRE2ZMkWtra166qmntHnzZm3evLl/vxIAADAghRwjOTk5ampqUklJiXw+n1JTU1VZWank5GRJks/nC/rMkdbWVj3wwAM6fPiwYmNjNWXKFG3btk3XXXdd/30VAABgwHIYY4ztIU6npaVF8fHxam5u7vYpm4nLtp3xeQ6uuP6M7wMAAHToze9vib9NAwAALCNGAACAVcQIAACwihgBAABWESMAAMAqYgQAAFhFjAAAAKuIEQAAYBUxAgAArCJGAACAVcQIAACwihgBAABWESMAAMAqYgQAAFhFjAAAAKuIEQAAYBUxAgAArCJGAACAVcQIAACwihgBAABWESMAAMAqYgQAAFhFjAAAAKuIEQAAYBUxAgAArCJGAACAVcQIAACwihgBAABWESMAAMAqYgQAAFhFjAAAAKuIEQAAYBUxAgAArCJGAACAVcQIAACwihgBAABWESMAAMAqYgQAAFhFjAAAAKuIEQAAYFWfYqSsrExut1sxMTHyeDzauXNnr4777W9/q6ioKE2dOrUvpwUAAINQyDFSUVGh/Px8FRUVqaamRrNmzVJWVpZqa2t7PK65uVkLFy7Utdde2+dhAQDA4BNyjKxcuVKLFy/WkiVLlJKSolWrVikpKUnl5eU9Hnf33XfrtttuU3p6ep+HBQAAg09IMdLa2qrq6mplZmYGrc/MzFRVVVW3x/3bv/2bPvzwQy1fvrxX5/H7/WppaQlaAADA4BRSjDQ2NqqtrU0ulytovcvlUn19fZfH7Nu3T8uWLdPTTz+tqKioXp2ntLRU8fHxgSUpKSmUMQEAwADSpxewOhyOoNvGmE7rJKmtrU233XabiouLdfHFF/f6/gsLC9Xc3BxY6urq+jImAAAYAHp3qeL/S0hIUGRkZKerIA0NDZ2ulkjSsWPHtHv3btXU1Oiee+6RJLW3t8sYo6ioKL344ov61re+1ek4p9Mpp9MZymgAAGCACunKSHR0tDwej7xeb9B6r9erjIyMTvvHxcXpnXfe0Z49ewJLbm6uJk+erD179mjGjBlnNj0AABjwQroyIkkFBQVasGCB0tLSlJ6erjVr1qi2tla5ubmSOp5iOXz4sDZs2KCIiAilpqYGHT9mzBjFxMR0Wg8AAL6eQo6RnJwcNTU1qaSkRD6fT6mpqaqsrFRycrIkyefznfYzRwAAAE5xGGOM7SFOp6WlRfHx8WpublZcXFyX+0xctu2Mz3NwxfVnfB8AAKBDb35/S/xtGgAAYBkxAgAArCJGAACAVcQIAACwihgBAABWESMAAMAqYgQAAFhFjAAAAKuIEQAAYBUxAgAArCJGAACAVcQIAACwihgBAABWESMAAMAqYgQAAFhFjAAAAKuIEQAAYBUxAgAArCJGAACAVcQIAACwihgBAABWESMAAMAqYgQAAFhFjAAAAKuIEQAAYBUxAgAArCJGAACAVcQIAACwihgBAABWESMAAMAqYgQAAFhFjAAAAKuIEQAAYBUxAgAArCJGAACAVcQIAACwihgBAABWESMAAMAqYgQAAFjVpxgpKyuT2+1WTEyMPB6Pdu7c2e2+u3bt0syZMzVq1CjFxsbqkksu0S9+8Ys+DwwAAAaXqFAPqKioUH5+vsrKyjRz5kw9+eSTysrK0nvvvacJEyZ02n/o0KG65557dPnll2vo0KHatWuX7r77bg0dOlTf//73++WLAAAAA5fDGGNCOWDGjBmaPn26ysvLA+tSUlKUnZ2t0tLSXt3HzTffrKFDh2rjxo292r+lpUXx8fFqbm5WXFxcl/tMXLatV/fVk4Mrrj/j+wAAAB168/tbCvFpmtbWVlVXVyszMzNofWZmpqqqqnp1HzU1NaqqqtLVV1/d7T5+v18tLS1BCwAAGJxCipHGxka1tbXJ5XIFrXe5XKqvr+/x2PHjx8vpdCotLU1Lly7VkiVLut23tLRU8fHxgSUpKSmUMQEAwADSpxewOhyOoNvGmE7rvmrnzp3avXu3nnjiCa1atUqbNm3qdt/CwkI1NzcHlrq6ur6MCQAABoCQXsCakJCgyMjITldBGhoaOl0t+Sq32y1Juuyyy/Txxx/rkUce0Xe/+90u93U6nXI6naGMBgAABqiQroxER0fL4/HI6/UGrfd6vcrIyOj1/Rhj5Pf7Qzk1AAAYpEJ+a29BQYEWLFigtLQ0paena82aNaqtrVVubq6kjqdYDh8+rA0bNkiSHn/8cU2YMEGXXHKJpI7PHXnsscd077339uOXAQAABqqQYyQnJ0dNTU0qKSmRz+dTamqqKisrlZycLEny+Xyqra0N7N/e3q7CwkIdOHBAUVFRuvDCC7VixQrdfffd/fdVAACAASvkzxmxgc8ZAQBg4AnL54wAAAD0N2IEAABYRYwAAACriBEAAGAVMQIAAKwiRgAAgFXECAAAsIoYAQAAVhEjAADAKmIEAABYRYwAAACriBEAAGAVMQIAAKwiRgAAgFXECAAAsIoYAQAAVhEjAADAKmIEAABYRYwAAACriBEAAGAVMQIAAKwiRgAAgFXECAAAsIoYAQAAVhEjAADAKmIEAABYRYwAAACriBEAAGAVMQIAAKwiRgAAgFXECAAAsIoYAQAAVhEjAADAKmIEAABYRYwAAACriBEAAGAVMQIAAKwiRgAAgFXECAAAsKpPMVJWVia3262YmBh5PB7t3Lmz232fffZZzZ07V6NHj1ZcXJzS09O1Y8eOPg8MAAAGl5BjpKKiQvn5+SoqKlJNTY1mzZqlrKws1dbWdrn/a6+9prlz56qyslLV1dW65pprdOONN6qmpuaMhwcAAAOfwxhjQjlgxowZmj59usrLywPrUlJSlJ2drdLS0l7dx5QpU5STk6Of/OQnvdq/paVF8fHxam5uVlxcXJf7TFy2rVf31ZODK64/4/sAAAAdevP7W5KiQrnT1tZWVVdXa9myZUHrMzMzVVVV1av7aG9v17FjxzRy5MhQTj0gEEQAAIQupBhpbGxUW1ubXC5X0HqXy6X6+vpe3cc//dM/6fjx45o3b163+/j9fvn9/sDtlpaWUMYEAAADSJ9ewOpwOIJuG2M6revKpk2b9Mgjj6iiokJjxozpdr/S0lLFx8cHlqSkpL6MCQAABoCQYiQhIUGRkZGdroI0NDR0ulryVRUVFVq8eLGeeeYZzZkzp8d9CwsL1dzcHFjq6upCGRMAAAwgIcVIdHS0PB6PvF5v0Hqv16uMjIxuj9u0aZMWLVqkX//617r++tO/JsLpdCouLi5oAQAAg1NIrxmRpIKCAi1YsEBpaWlKT0/XmjVrVFtbq9zcXEkdVzUOHz6sDRs2SOoIkYULF2r16tW68sorA1dVYmNjFR8f349fCgAAGIhCjpGcnBw1NTWppKREPp9PqampqqysVHJysiTJ5/MFfebIk08+qZMnT2rp0qVaunRpYP0dd9yh9evXn/lXAAAABrSQY0SS8vLylJeX1+W2rwbGK6+80pdTAACArwn+Ng0AALCKGAEAAFYRIwAAwCpiBAAAWEWMAAAAq4gRAABgFTECAACsIkYAAIBVxAgAALCKGAEAAFYRIwAAwCpiBAAAWEWMAAAAq4gRAABgFTECAACsIkYAAIBVxAgAALCKGAEAAFYRIwAAwCpiBAAAWEWMAAAAq4gRAABgFTECAACsIkYAAIBVxAgAALCKGAEAAFYRIwAAwCpiBAAAWEWMAAAAq4gRAABgFTECAACsIkYAAIBVxAgAALCKGAEAAFYRIwAAwCpiBAAAWEWMAAAAq4gRAABgFTECAACsIkYAAIBVfYqRsrIyud1uxcTEyOPxaOfOnd3u6/P5dNttt2ny5MmKiIhQfn5+X2cFAACDUMgxUlFRofz8fBUVFammpkazZs1SVlaWamtru9zf7/dr9OjRKioq0l/8xV+c8cAAAGBwCTlGVq5cqcWLF2vJkiVKSUnRqlWrlJSUpPLy8i73nzhxolavXq2FCxcqPj7+jAcGAACDS0gx0traqurqamVmZgatz8zMVFVVVb8N5ff71dLSErQAAIDBKaQYaWxsVFtbm1wuV9B6l8ul+vr6fhuqtLRU8fHxgSUpKanf7hsAAJxb+vQCVofDEXTbGNNp3ZkoLCxUc3NzYKmrq+u3+wYAAOeWqFB2TkhIUGRkZKerIA0NDZ2ulpwJp9Mpp9PZb/cHAADOXSFdGYmOjpbH45HX6w1a7/V6lZGR0a+DAQCAr4eQroxIUkFBgRYsWKC0tDSlp6drzZo1qq2tVW5urqSOp1gOHz6sDRs2BI7Zs2ePJOmzzz7TJ598oj179ig6OlqXXnpp/3wVAABgwAo5RnJyctTU1KSSkhL5fD6lpqaqsrJSycnJkjo+5Oyrnzkybdq0wP+urq7Wr3/9ayUnJ+vgwYNnNj0AABjwQo4RScrLy1NeXl6X29avX99pnTGmL6cBAABfA/xtGgAAYBUxAgAArCJGAACAVcQIAACwihgBAABWESMAAMAqYgQAAFhFjAAAAKuIEQAAYBUxAgAArCJGAACAVcQIAACwihgBAABWESMAAMAqYgQAAFgVZXsA9K+Jy7ad8X0cXHF9P0wCAEDvcGUEAABYRYwAAACriBEAAGAVMQIAAKwiRgAAgFXECAAAsIoYAQAAVhEjAADAKmIEAABYRYwAAACriBEAAGAVMQIAAKwiRgAAgFXECAAAsIoYAQAAVhEjAADAKmIEAABYFWV7AAw+E5dtO+P7OLji+n6YBAAwEHBlBAAAWEWMAAAAq4gRAABgFTECAACsIkYAAIBVfXo3TVlZmX7+85/L5/NpypQpWrVqlWbNmtXt/q+++qoKCgr07rvvKjExUQ899JByc3P7PDRwOryjBwAGjpCvjFRUVCg/P19FRUWqqanRrFmzlJWVpdra2i73P3DggK677jrNmjVLNTU1evjhh3Xfffdp8+bNZzw8AAAY+EK+MrJy5UotXrxYS5YskSStWrVKO3bsUHl5uUpLSzvt/8QTT2jChAlatWqVJCklJUW7d+/WY489pltuueXMpgfOYVydAYDeCSlGWltbVV1drWXLlgWtz8zMVFVVVZfH/O53v1NmZmbQum9/+9tau3atTpw4oSFDhoQ4MoBQnGkUEUQAwi2kGGlsbFRbW5tcLlfQepfLpfr6+i6Pqa+v73L/kydPqrGxUePGjet0jN/vl9/vD9xubm6WJLW0tHQ7W7v/815/Hd3p6f57gxmY4VyboT/m6I8ZUpfvOKPj/1D8bWZghn6dAWfHqZ8fxpge9+vTC1gdDkfQbWNMp3Wn27+r9aeUlpaquLi40/qkpKRQRw1J/Kqw3n2vMEMHZujADB2YoQMzdDgXZkBojh07pvj4+G63hxQjCQkJioyM7HQVpKGhodPVj1PGjh3b5f5RUVEaNWpUl8cUFhaqoKAgcLu9vV1Hjx7VqFGjeoye7rS0tCgpKUl1dXWKi4sL+fj+cC7McK7MwQzMwAzMwAxfjxmMMTp27JgSExN73C+kGImOjpbH45HX69V3vvOdwHqv16ubbrqpy2PS09O1devWoHUvvvii0tLSun29iNPplNPpDFo3YsSIUEbtUlxcnNUQOFdmOFfmYAZmYAZmYIbBP0NPV0ROCfmtvQUFBfrVr36ldevWae/evbr//vtVW1sb+NyQwsJCLVy4MLB/bm6uDh06pIKCAu3du1fr1q3T2rVr9cADD4R6agAAMAiF/JqRnJwcNTU1qaSkRD6fT6mpqaqsrFRycrIkyefzBX3miNvtVmVlpe6//349/vjjSkxM1D//8z/ztl4AACCpjy9gzcvLU15eXpfb1q9f32nd1VdfrTfffLMvp+oXTqdTy5cv7/TUz9dthnNlDmZgBmZgBmZghi9zmNO93wYAACCM+EN5AADAKmIEAABYRYwAAACriBF8LfFSKQA4d/Tp3TTnuo8++kjl5eWqqqpSfX29HA6HXC6XMjIylJubG/aPlce5z+l06q233lJKSortUfA15fP5VF5erl27dsnn8ykyMlJut1vZ2dlatGiRIiMjbY8InDWD7t00u3btUlZWlpKSkpSZmSmXyyVjjBoaGuT1elVXV6ft27dr5syZVuesq6vT8uXLtW7durCe589//rOqq6s1cuRIXXrppUHbvvjiCz3zzDNBH1IXDnv37tXrr7+u9PR0XXLJJXr//fe1evVq+f1+3X777frWt74VtnN/+c8KfNnq1at1++23B/4kwcqVK8M2Q1c+/fRT/fu//7v27duncePG6Y477gh7JNfU1GjEiBFyu92SpKeeekrl5eWqra1VcnKy7rnnHs2fPz+sM9x7772aN2+eZs2aFdbznM6//Mu/aPfu3br++us1b948bdy4UaWlpWpvb9fNN9+skpISRUWF77Ha7t27NWfOHLndbsXGxuqNN97Q3/7t36q1tVU7duxQSkqKduzYoeHDh4dtBuCcYgaZtLQ0k5+f3+32/Px8k5aWdhYn6tqePXtMREREWM/xxz/+0SQnJxuHw2EiIiLM1VdfbY4cORLYXl9fH/YZtm/fbqKjo83IkSNNTEyM2b59uxk9erSZM2eOufbaa01UVJT5n//5n7Cd3+FwmKlTp5rZs2cHLQ6Hw1xxxRVm9uzZ5pprrgnb+U8ZN26caWxsNMYYs3//fjN27FgzduxYM3fuXDN+/HgTHx9v9u7dG9YZpk2bZl566SVjjDG//OUvTWxsrLnvvvtMeXm5yc/PN8OGDTNr164N6wynvhcnTZpkVqxYYXw+X1jP15WSkhIzfPhwc8stt5ixY8eaFStWmFGjRpmf/vSn5tFHHzWjR482P/nJT8I6w8yZM80jjzwSuL1x40YzY8YMY4wxR48eNVOnTjX33XdfWGc45bPPPjNr1qwxixYtMn/1V39lsrKyzKJFi8wvf/lL89lnn52VGXpSX19viouLw36exsZG89JLL5mmpiZjjDGffPKJWbFihSkuLjbvvfde2M9/Sl1dnTl27Fin9a2trebVV189a3OcOueWLVvMz372M7Nx48awfj8MuhiJiYkx77//frfb9+7da2JiYsI+x3/+53/2uPziF78IewhkZ2ebG264wXzyySdm37595sYbbzRut9scOnTIGHN2YiQ9Pd0UFRUZY4zZtGmTOf/8883DDz8c2P7www+buXPnhu38jz76qHG73Z2CJyoqyrz77rthO+9XORwO8/HHHxtjjJk/f76ZPXu2OX78uDHGmC+++MLccMMN5m/+5m/COsN5550X+P9+2rRp5sknnwza/vTTT5tLL700rDM4HA7z3//93+aHP/yhSUhIMEOGDDF//dd/bbZu3Wra2trCeu5TvvGNb5jNmzcbYzoeFERGRpqnnnoqsP3ZZ581F110UVhniI2NNR9++GHgdltbmxkyZIipr683xhjz4osvmsTExLDOYIwx7777rklMTDQjRowwN910k/n+979v7rrrLnPTTTeZESNGmAsuuOCs/jvpytl44PbGG2+Y+Ph443A4zPnnn292795t3G63mTRpkrnoootMbGysqa6uDusMR44cMVdccYWJiIgwkZGRZuHChUFRcrZ+Xn/66afGGGMaGhrMZZddZqKjo82kSZNMTEyMmTBhgvnoo4/Ccu5BFyNut9usW7eu2+3r1q0zbrc77HOcegTocDi6XcL9jTVmzBjz9ttvB63Ly8szEyZMMB9++OFZ+eaOi4sz+/btM8Z0/MCNiooK+kf9zjvvGJfLFdYZfv/735uLL77Y/N3f/Z1pbW01xtiNka7i6PXXXzfjx48P6wyjRo0yu3fvNsZ0fG/s2bMnaPuf/vQnExsbG9YZvvzfobW11VRUVJhvf/vbJjIy0iQmJpqHH3448P0SLrGxsYEoM8aYIUOGmD/84Q+B2wcPHjTnnXdeWGdITk42u3btCtw+cuSIcTgc5vPPPzfGGHPgwIGz8qBp9uzZZv78+cbv93fa5vf7zXe/+10ze/bssM7w1ltv9bhUVFSE/efUnDlzzJIlS0xLS4v5+c9/bsaPH2+WLFkS2L548WKTnZ0d1hkWLlxorrzySvO///u/xuv1mrS0NOPxeMzRo0eNMR0x4nA4wjrDl/993nXXXWbq1KmBq5eNjY0mIyPD3HnnnWE596CLkccff9xER0ebpUuXmueee8787ne/M6+//rp57rnnzNKlS43T6TTl5eVhnyMxMdFs2bKl2+01NTVh/wc2fPjwLi8v3nPPPWb8+PHmtddeO6sxYowxw4YNC3pEePDgwbPyQ/fYsWNm4cKF5vLLLzdvv/22GTJkyFmPkYaGBmNMx/fGl3/5GdPxy8fpdIZ1httvv90sXrzYGGPMrbfean784x8HbX/00UfNZZddFtYZvvzD7ssOHTpkli9fbpKTk8P+Pel2u8327duNMcZ88MEHJiIiwjzzzDOB7du2bTMTJ04M6ww//OEPTWpqqtm+fbt56aWXzDXXXBP0S/+FF14wF154YVhnMKYjzHr6d/DOO++clUDt7oHbqfXh/p44//zzAz8rW1tbTUREhHnjjTcC2998801zwQUXhHWGxMTEoHN+8cUX5qabbjJTp041TU1NZ+XB45f/fV588cXmv/7rv4K2v/zyy2H7tzHoYsQYY37zm9+YGTNmmKioqMA3dVRUlJkxY4apqKg4KzPceOON5u///u+73b5nz56wV+4VV1xhNmzY0OW2pUuXmhEjRoT9m/vyyy8P/OA3puOH24kTJwK3d+7ceVauVJ2yadMm43K5TERExFmPkcsuu8xMmzbNDBs2zDz77LNB21999dWw/7A7fPiwmThxornqqqtMQUGBiY2NNd/85jfNXXfdZa666ioTHR1ttm3bFtYZuouRU9rb282LL74Y1hmKiorM6NGjzZIlS4zb7TaFhYVmwoQJpry83DzxxBMmKSnJ3H///WGd4dixY2bevHmBn1EZGRlm//79ge07duwICqRwSUxMNM8991y327ds2RL2p4sSEhLM2rVrzcGDB7tctm3bFvafU0OHDjUHDhwI3P7qg6ZDhw6F/UHT0KFDzQcffBC07sSJEyY7OzvwIOpsxMipB01jxozp9DPy4MGDYXvQNCjf2puTk6OcnBydOHFCjY2NkqSEhAQNGTLkrM3w4IMP6vjx491uv+iii/Tyyy+HdYbvfOc72rRpkxYsWNBp27/+67+qvb1dTzzxRFhn+MEPfqC2trbA7dTU1KDt27dvD+u7ab5q/vz5+uY3v6nq6urAX5o+G5YvXx50+7zzzgu6vXXr1rC/wyQxMVE1NTVasWKFtm7dKmOMfv/736uurk4zZ87Ub3/7W6WlpYV1huTk5B7fsupwODR37tywzlBcXKzY2Fi9/vrruvvuu/WjH/1Il19+uR566CF9/vnnuvHGG/UP//APYZ1h2LBhqqio0BdffKGTJ09q2LBhQdszMzPDev5T7rrrLt1xxx368Y9/rLlz58rlcsnhcKi+vl5er1ePPvqo8vPzwzqDx+PRkSNHuv33+H//939h/1ygpKQk7d+/XxMnTpQk/eY3v9G4ceMC230+nxISEsI6wze+8Q29/fbbmjRpUmBdVFSU/uM//kO33nqrbrjhhrCe/5RFixbJ6XTqxIkTOnToUNC7MH0+n0aMGBGW8w66t/YCAHrvH//xH7V69erAZzJJHR8KOHbsWOXn5+uhhx4K6/m3bNmi48eP6/bbb+9y+6effqrnn39ed9xxR9hmKC4u1uTJk7t9a3tRUZHef/99bd68OWwz/OhHP9KePXu0Y8eOTttOnjypW265RVu3blV7e3vYZvje974XdPu6667TrbfeGrj94IMP6p133tELL7zQ7+cmRgAAOnDggOrr6yVJY8eODXweDaTPP/9ckZGRcjqdYTvHyZMn9fnnnysuLq7L7W1tbfroo4/O6hXdrzp+/LgiIyMVExPT7/fNx8EDAOR2u5Wenq709PRAiNTV1enOO++0Ote5MENTU5N+8IMfhPUcUVFR3YaIJB05ckTFxcVhneF0jh49qry8vLDcN1dGAABdeuuttzR9+vSg130xAzOEY4ZB+QJWAMDpPf/88z1u379/PzMww1mZgSsjAPA1FRERIYfD0eO7VRwOR1gfjTMDM0i8ZgQAvrbGjRunzZs3q729vcvlzTffZAZmOCszECMA8DXl8Xh6/AVzukfJzMAM/YXXjADA19S58OGMzMAMEq8ZAQAAlvE0DQAAsIoYAQAAVhEjAADAKmIEAABYRYwAAACriBEAAGAVMQIAAKwiRgAAgFX/Dz1cLVpLe3Y/AAAAAElFTkSuQmCC\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "#有些企业单位名字中带有「大学」,但这类企业非常少。\n", "#「大学」词语出现次数可以近似看做学习经历次数\n", "#如此, 1可以看做本科学历,2看做研究生学历, 3看做博士学历\n", "df['个人简历'].str.count('大学').value_counts(normalize=True).plot(kind='bar')" ] }, { "cell_type": "markdown", "id": "6189eabb", "metadata": {}, "source": [ "
\n", "\n", "### 2.6 找出每位高管的出生年份(用正则表达式)" ] }, { "cell_type": "code", "execution_count": 86, "id": "faff7c2d", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 [1969]\n", "1 [1965, 1984, 1986, 1990, 1994, 1995]\n", "2 [1972, 1998, 1999, 2000, 2015, 2002, 2016, 200...\n", "3 [1960, 1982, 1989, 1990, 1991, 1991, 2002, 200...\n", "4 [1962, 2009, 1985, 1996, 1996, 2008, 1993, 200...\n", " ... \n", "900882 []\n", "900883 []\n", "900884 []\n", "900885 []\n", "900886 []\n", "Name: 个人简历, Length: 736970, dtype: object" ] }, "execution_count": 86, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df['个人简历'].str.findall('\\d{4}')" ] }, { "cell_type": "code", "execution_count": 93, "id": "c45af8ba", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 1969\n", "1 1965\n", "2 1972\n", "3 1960\n", "4 1962\n", " ... \n", "900882 0\n", "900883 0\n", "900884 0\n", "900885 0\n", "900886 0\n", "Name: 个人简历, Length: 736970, dtype: object" ] }, "execution_count": 93, "metadata": {}, "output_type": "execute_result" } ], "source": [ "def birth_year(years):\n", " try:\n", " #返回出生年份\n", " return years[0]\n", " except:\n", " #没有年份的,返回0\n", " return 0\n", " \n", " \n", "#高管出生年份\n", "df['个人简历'].str.findall('\\d{4}').apply(birth_year)" ] }, { "cell_type": "code", "execution_count": 92, "id": "29a4171a", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "0 1\n", "1 6\n", "2 10\n", "3 10\n", "4 8\n", " ..\n", "900882 0\n", "900883 0\n", "900884 0\n", "900885 0\n", "900886 0\n", "Name: 个人简历, Length: 736970, dtype: int64" ] }, "execution_count": 92, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#高管时间点个数(感觉可以看做经历的个数)\n", "df['个人简历'].str.findall('\\d{4}').apply(lambda ys: len(set(ys)))" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.13" } }, "nbformat": 4, "nbformat_minor": 5 }