Research Article
Improving Loanword Identification in Low-Resource Language with Data Augmentation and Multiple Feature Fusion
Loanword identification results (%)

| Donor | Model | P | P (+) | R | R (+) | F1 | F1 (+) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Russian | Rule (+) | 72.04 | 72.89 | 69.31 | 70.18 | 70.65 | 71.28 |
| | CRF (+) | 71.63 | 72.45 | 67.28 | 68.15 | 69.39 | 70.23 |
| | BLSTM-CNN (+) | 71.45 | 72.26 | 70.50 | 71.31 | 70.97 | 71.78 |
| | ClEmbedding (+) | 73.12 | 73.94 | 71.84 | 72.62 | 72.47 | 73.27 |
| | Ours (+) | 74.80 | 75.62 | 73.64 | 74.20 | 74.22 | 74.90 |
| Arabic | Rule (+) | 69.05 | 69.84 | 68.17 | 69.02 | 68.61 | 69.43 |
| | CRF (+) | 69.83 | 70.65 | 67.42 | 68.29 | 68.60 | 69.45 |
| | BLSTM-CNN (+) | 68.70 | 69.52 | 69.85 | 70.67 | 69.27 | 70.09 |
| | ClEmbedding (+) | 72.95 | 73.76 | 72.03 | 72.85 | 72.49 | 73.30 |
| | Ours (+) | 73.91 | 74.62 | 72.35 | 73.06 | 73.12 | 73.83 |
| Turkish | Rule (+) | 72.02 | 72.86 | 69.87 | 70.50 | 70.93 | 71.66 |
| | CRF (+) | 71.46 | 72.29 | 69.02 | 69.95 | 70.22 | 71.10 |
| | BLSTM-CNN (+) | 71.25 | 72.04 | 70.43 | 71.18 | 70.84 | 71.61 |
| | ClEmbedding (+) | 72.96 | 73.64 | 73.08 | 73.85 | 73.02 | 73.74 |
| | Ours (+) | 75.24 | 76.09 | 74.36 | 75.14 | 74.80 | 75.61 |
| Chinese | Rule (+) | 70.32 | 71.13 | 69.77 | 70.58 | 70.04 | 70.85 |
| | CRF (+) | 70.85 | 71.64 | 69.24 | 70.05 | 70.04 | 70.84 |
| | BLSTM-CNN (+) | 70.58 | 71.34 | 69.98 | 70.79 | 70.28 | 71.06 |
| | ClEmbedding (+) | 71.67 | 72.48 | 71.35 | 72.14 | 71.51 | 72.31 |
| | Ours (+) | 74.30 | 75.07 | 72.88 | 73.95 | 73.58 | 74.51 |
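The F1 values reported above can be cross-checked against the precision and recall beside them, since F1 is the harmonic mean of the two. The sketch below is only an illustration of that arithmetic, not the authors' evaluation code: the function name `f1_score` and the `reported` dictionary are hypothetical, and the numbers are the first, third and fifth values of the last row in each donor-language group.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, in the same units as the inputs."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) in percent, copied from the table:
# last row of each donor-language group, first/third/fifth value.
reported = {
    "Russian": (74.80, 73.64, 74.22),
    "Arabic": (73.91, 72.35, 73.12),
    "Turkish": (75.24, 74.36, 74.80),
    "Chinese": (74.30, 72.88, 73.58),
}

for donor, (p, r, f1) in reported.items():
    recomputed = f1_score(p, r)
    # Reported figures are rounded to two decimals, so allow a small tolerance.
    status = "consistent" if abs(recomputed - f1) < 0.01 else "mismatch"
    print(f"{donor}: recomputed {recomputed:.2f}, reported {f1:.2f} ({status})")
```

The same check can be run on any other row by substituting its precision and recall; a mismatch larger than the rounding tolerance would point to a transcription error in that row.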