4 answers
yfwxisqw1#
I ran into another problem today. My Flask web app uses jieba's custom dictionary loading feature, and I start it with the following command:
gunicorn -w 4 -p gevent -b 0.0.0.0:9999 --reload run:app
jieba then keeps printing the messages below over and over. I suspect this happens because gunicorn starts multiple threads. How can I fix it?
loading model from cache /tmp/jieba.cache
loading model cost 2.44023799896 seconds.
Trie has been built succesfully.
[2016-09-29 17:05:37 +0000] [32528] [INFO] Booting worker with pid: 32528
Building Trie..., from /root/py27/lib/python2.7/site-packages/jieba/dict.txt
loading model from cache /tmp/jieba.cache
loading model cost 2.28571200371 seconds.
Trie has been built succesfully.
[2016-09-29 17:06:06 +0000] [32556] [INFO] Booting worker with pid: 32556
Building Trie..., from /root/py27/lib/python2.7/site-packages/jieba/dict.txt
loading model from cache /tmp/jieba.cache
loading model cost 2.27150511742 seconds.
Trie has been built succesfully.
[2016-09-29 17:06:10 +0000] [32560] [INFO] Booting worker with pid: 32560
Building Trie..., from /root/py27/lib/python2.7/site-packages/jieba/dict.txt
loading model from cache /tmp/jieba.cache
w6lpcovy2#
gunicorn forks multiple processes, but jieba loads its dictionary lazily. After import jieba, call jieba.initialize() once. That way the dictionary will not be loaded multiple times.
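One caveat to the advice above: the dictionary is loaded only once if the module that calls jieba.initialize() is imported before gunicorn forks its workers. As far as I know, gunicorn's preload_app setting is one way to arrange that. Below is a minimal sketch of a config file, not a verified fix for this app. The file name, and the reading of "-p gevent" as a worker-class choice, are my assumptions.

```python
# gunicorn.conf.py (file name is a choice; pass it with
# `gunicorn -c gunicorn.conf.py run:app`)

# Same address and worker count as the command in the question.
bind = "0.0.0.0:9999"
workers = 4

# Assumption: "-p gevent" in the question was meant to select the gevent
# worker class. In gunicorn, "-p" sets the pid file and "-k" sets the
# worker class.
worker_class = "gevent"

# Import the application once in the master process, before the workers are
# forked, so a module-level jieba.initialize() runs a single time.
preload_app = True
```

I left --reload out on purpose. If I recall the gunicorn docs correctly, the reloader is described as incompatible with application preloading, so it is probably best kept for development runs only.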
u91tlkcl3#
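I have a problem with jieba's load_userdict as well. I added a word to my own dictionary and set its parameters (frequency and part-of-speech tag), for example:
萌萌哒 50 a
but the result of segmenting with posseg is:
萌萌哒 x
Is this a version issue, or a problem with some other setting?
yuvru6vn4#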
@fxsjy
The code uses these parts specifically:
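import jieba
jieba.initialize()  # load the main dictionary now instead of on first use
import os
if os.path.exists('cbi360.txt'):
    jieba.load_userdict('cbi360.txt')  # add the entries from the custom dictionary
import jieba.posseg as peg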
Here cbi360.txt is my own dictionary, and I also use jieba.posseg. What is the correct order for these calls?