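(Sorry for bothering you on the issue tracker with something that may not be a bug, but I have no other working way to communicate with the project.)
While packaging NLTK for openSUSE, I wanted to run the tests. The problem is that our build system (like the build systems of all distributions) is isolated from the Internet, so I need to be able to run the test suite without touching the network. I therefore downloaded all of ntlk_data and set the NTLK_DATA variable accordingly. Unfortunately, the result is not good: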
[ 78s] + cd /home/abuild/rpmbuild/BUILD
[ 78s] + cd nltk-3.7
[ 78s] ++ readlink -f ./ntlk_data/
[ 78s] + export NLTK_DATA=/home/abuild/rpmbuild/BUILD/nltk-3.7/ntlk_data
[ 78s] + NLTK_DATA=/home/abuild/rpmbuild/BUILD/nltk-3.7/ntlk_data
[ 78s] ++ '[' -f _current_flavor ']'
[ 78s] ++ cat _current_flavor
[ 78s] + last_flavor=python38
[ 78s] + '[' -z python38 ']'
[ 78s] + '[' python38 '!=' python39 ']'
[ 78s] + '[' -d build ']'
[ 78s] + mv build _build.python38
[ 78s] + '[' -d _build.python39 ']'
[ 78s] + mv _build.python39 build
[ 78s] + echo python39
[ 78s] + python_flavor=python39
[ 78s] + PYTHONPATH=/home/abuild/rpmbuild/BUILDROOT/python-nltk-3.7-0.x86_64/usr/lib/python3.9/site-packages
[ 78s] + PYTHONDONTWRITEBYTECODE=1
[ 78s] + pytest-3.9 --ignore=_build.python39 --ignore=_build.python310 --ignore=_build.python38 -v
[ 79s] ============================= test session starts ==============================
[ 79s] platform linux -- Python 3.9.10, pytest-6.2.5, py-1.11.0, pluggy-1.0.0 -- /usr/bin/python3.9
[ 79s] cachedir: .pytest_cache
[ 79s] rootdir: /home/abuild/rpmbuild/BUILD/nltk-3.7
[ 79s] plugins: cov-3.0.0, mock-3.6.1
[ 95s] collecting ... collected 424 items / 3 errors / 421 selected
[ 95s]
[ 95s] ==================================== ERRORS ====================================
[ 95s] _______________ ERROR collecting nltk/test/unit/test_corpora.py ________________
[ 95s] nltk/corpus/util.py:84: in __load
[ 95s] root = nltk.data.find(f"{self.subdir}/{zip_name}")
[ 95s] nltk/data.py:583: in find
[ 95s] raise LookupError(resource_not_found)
[ 95s] E LookupError:
[ 95s] E **********************************************************************
[ 95s] E Resource ptb not found.
[ 95s] E Please use the NLTK Downloader to obtain the resource:
[ 95s] E
[ 95s] E >>> import nltk
[ 95s] E >>> nltk.download('ptb')
[ 95s] E
[ 95s] E For more information see: https://www.nltk.org/data.html
[ 95s] E
[ 95s] E Attempted to load corpora/ptb.zip/ptb/
[ 95s] E
[ 95s] E Searched in:
[ 95s] E - '/home/abuild/rpmbuild/BUILD/nltk-3.7/ntlk_data'
[ 95s] E - '/home/abuild/nltk_data'
[ 95s] E - '/usr/nltk_data'
[ 95s] E - '/usr/share/nltk_data'
[ 95s] E - '/usr/lib/nltk_data'
[ 95s] E - '/usr/share/nltk_data'
[ 95s] E - '/usr/local/share/nltk_data'
[ 95s] E - '/usr/lib/nltk_data'
[ 95s] E - '/usr/local/lib/nltk_data'
[ 95s] E **********************************************************************
[ 95s]
[ 95s] During handling of the above exception, another exception occurred:
[ 95s] nltk/test/unit/test_corpora.py:186: in <module>
[ 95s] ???
[ 95s] nltk/corpus/util.py:121: in __getattr__
[ 95s] self.__load()
[ 95s] nltk/corpus/util.py:86: in __load
[ 95s] raise e
[ 95s] nltk/corpus/util.py:81: in __load
[ 95s] root = nltk.data.find(f"{self.subdir}/{self.__name}")
[ 95s] nltk/data.py:583: in find
[ 95s] raise LookupError(resource_not_found)
[ 95s] E LookupError:
[ 95s] E **********************************************************************
[ 95s] E Resource ptb not found.
[ 95s] E Please use the NLTK Downloader to obtain the resource:
[ 95s] E
[ 95s] E >>> import nltk
[ 95s] E >>> nltk.download('ptb')
[ 95s] E
[ 95s] E For more information see: https://www.nltk.org/data.html
[ 95s] E
[ 95s] E Attempted to load corpora/ptb
[ 95s] E
[ 95s] E Searched in:
[ 95s] E - '/home/abuild/rpmbuild/BUILD/nltk-3.7/ntlk_data'
[ 95s] E - '/home/abuild/nltk_data'
[ 95s] E - '/usr/nltk_data'
[ 95s] E - '/usr/share/nltk_data'
[ 95s] E - '/usr/lib/nltk_data'
[ 95s] E - '/usr/share/nltk_data'
[ 95s] E - '/usr/local/share/nltk_data'
[ 95s] E - '/usr/lib/nltk_data'
[ 95s] E - '/usr/local/lib/nltk_data'
[ 95s] E **********************************************************************
[ 95s] _______________ ERROR collecting nltk/test/unit/test_nombank.py ________________
[ 95s] nltk/corpus/util.py:84: in __load
[ 95s] root = nltk.data.find(f"{self.subdir}/{zip_name}")
[ 95s] nltk/data.py:583: in find
[ 95s] raise LookupError(resource_not_found)
[ 95s] E LookupError:
[ 95s] E **********************************************************************
[ 95s] E Resource nombank.1.0 not found.
[ 95s] E Please use the NLTK Downloader to obtain the resource:
[ 95s] E
[ 95s] E >>> import nltk
[ 95s] E >>> nltk.download('nombank.1.0')
[ 95s] E
[ 95s] E For more information see: https://www.nltk.org/data.html
[ 95s] E
[ 95s] E Attempted to load corpora/nombank.1.0.zip/nombank.1.0/
[ 95s] E
[ 95s] E Searched in:
[ 95s] E - '/home/abuild/rpmbuild/BUILD/nltk-3.7/ntlk_data'
[ 95s] E - '/home/abuild/nltk_data'
[ 95s] E - '/usr/nltk_data'
[ 95s] E - '/usr/share/nltk_data'
[ 95s] E - '/usr/lib/nltk_data'
[ 95s] E - '/usr/share/nltk_data'
[ 95s] E - '/usr/local/share/nltk_data'
[ 95s] E - '/usr/lib/nltk_data'
[ 95s] E - '/usr/local/lib/nltk_data'
[ 95s] E **********************************************************************
[ 95s]
[ 95s] During handling of the above exception, another exception occurred:
[ 95s] nltk/test/unit/test_nombank.py:10: in <module>
[ 95s] nombank.nouns()
[ 95s] nltk/corpus/util.py:121: in __getattr__
[ 95s] self.__load()
[ 95s] nltk/corpus/util.py:86: in __load
[ 95s] raise e
[ 95s] nltk/corpus/util.py:81: in __load
[ 95s] root = nltk.data.find(f"{self.subdir}/{self.__name}")
[ 95s] nltk/data.py:583: in find
[ 95s] raise LookupError(resource_not_found)
[ 95s] E LookupError:
[ 95s] E **********************************************************************
[ 95s] E Resource nombank.1.0 not found.
[ 95s] E Please use the NLTK Downloader to obtain the resource:
[ 95s] E
[ 95s] E >>> import nltk
[ 95s] E >>> nltk.download('nombank.1.0')
[ 95s] E
[ 95s] E For more information see: https://www.nltk.org/data.html
[ 95s] E
[ 95s] E Attempted to load corpora/nombank.1.0
[ 95s] E
[ 95s] E Searched in:
[ 95s] E - '/home/abuild/rpmbuild/BUILD/nltk-3.7/ntlk_data'
[ 95s] E - '/home/abuild/nltk_data'
[ 95s] E - '/usr/nltk_data'
[ 95s] E - '/usr/share/nltk_data'
[ 95s] E - '/usr/lib/nltk_data'
[ 95s] E - '/usr/share/nltk_data'
[ 95s] E - '/usr/local/share/nltk_data'
[ 95s] E - '/usr/lib/nltk_data'
[ 95s] E - '/usr/local/lib/nltk_data'
[ 95s] E **********************************************************************
[ 95s] _______________ ERROR collecting nltk/test/unit/test_wordnet.py ________________
[ 95s] nltk/corpus/util.py:84: in __load
[ 95s] root = nltk.data.find(f"{self.subdir}/{zip_name}")
[ 95s] nltk/data.py:583: in find
[ 95s] raise LookupError(resource_not_found)
[ 95s] E LookupError:
[ 95s] E **********************************************************************
[ 95s] E Resource wordnet not found.
[ 95s] E Please use the NLTK Downloader to obtain the resource:
[ 95s] E
[ 95s] E >>> import nltk
[ 95s] E >>> nltk.download('wordnet')
[ 95s] E
[ 95s] E For more information see: https://www.nltk.org/data.html
[ 95s] E
[ 95s] E Attempted to load corpora/wordnet.zip/wordnet/
[ 95s] E
[ 95s] E Searched in:
[ 95s] E - '/home/abuild/rpmbuild/BUILD/nltk-3.7/ntlk_data'
[ 95s] E - '/home/abuild/nltk_data'
[ 95s] E - '/usr/nltk_data'
[ 95s] E - '/usr/share/nltk_data'
[ 95s] E - '/usr/lib/nltk_data'
[ 95s] E - '/usr/share/nltk_data'
[ 95s] E - '/usr/local/share/nltk_data'
[ 95s] E - '/usr/lib/nltk_data'
[ 95s] E - '/usr/local/lib/nltk_data'
[ 95s] E **********************************************************************
[ 95s]
[ 95s] During handling of the above exception, another exception occurred:
[ 95s] nltk/test/unit/test_wordnet.py:10: in <module>
[ 95s] wn.ensure_loaded()
[ 95s] nltk/corpus/util.py:121: in __getattr__
[ 95s] self.__load()
[ 95s] nltk/corpus/util.py:86: in __load
[ 95s] raise e
[ 95s] nltk/corpus/util.py:81: in __load
[ 95s] root = nltk.data.find(f"{self.subdir}/{self.__name}")
[ 95s] nltk/data.py:583: in find
[ 95s] raise LookupError(resource_not_found)
[ 95s] E LookupError:
[ 95s] E **********************************************************************
[ 95s] E Resource wordnet not found.
[ 95s] E Please use the NLTK Downloader to obtain the resource:
[ 95s] E
[ 95s] E >>> import nltk
[ 95s] E >>> nltk.download('wordnet')
[ 95s] E
[ 95s] E For more information see: https://www.nltk.org/data.html
[ 95s] E
[ 95s] E Attempted to load corpora/wordnet
[ 95s] E
[ 95s] E Searched in:
[ 95s] E - '/home/abuild/rpmbuild/BUILD/nltk-3.7/ntlk_data'
[ 95s] E - '/home/abuild/nltk_data'
[ 95s] E - '/usr/nltk_data'
[ 95s] E - '/usr/share/nltk_data'
[ 95s] E - '/usr/lib/nltk_data'
[ 95s] E - '/usr/share/nltk_data'
[ 95s] E - '/usr/local/share/nltk_data'
[ 95s] E - '/usr/lib/nltk_data'
[ 95s] E - '/usr/local/lib/nltk_data'
[ 95s] E **********************************************************************
[ 95s] =============================== warnings summary ===============================
[ 95s] nltk/test/unit/test_tokenize.py:22
[ 95s] /home/abuild/rpmbuild/BUILD/nltk-3.7/nltk/test/unit/test_tokenize.py:22: DeprecationWarning:
[ 95s] The StanfordTokenizer will be deprecated in version 3.2.5.
[ 95s] Please use nltk.parse.corenlp.CoreNLPTokenizer instead.'
[ 95s] seg = StanfordSegmenter()
[ 95s]
[ 95s] -- Docs: https://docs.pytest.org/en/stable/warnings.html
[ 95s] =========================== short test summary info ============================
[ 95s] ERROR nltk/test/unit/test_corpora.py - LookupError:
[ 95s] ERROR nltk/test/unit/test_nombank.py - LookupError:
[ 95s] ERROR nltk/test/unit/test_wordnet.py - LookupError:
[ 95s] !!!!!!!!!!!!!!!!!!! Interrupted: 3 errors during collection !!!!!!!!!!!!!!!!!!!!
[ 95s] ======================== 1 warning, 3 errors in 16.17s =========================
[ 95s] error: Bad exit status from /var/tmp/rpm-tmp.xNuAZW (%check)
Complete log
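In the log above, the first entry under "Searched in:" is the one that comes from the NLTK_DATA variable. A small pre-flight check could confirm that every such entry really exists before pytest starts. The following is only a sketch: the helper name missing_data_dirs is made up for illustration, and it assumes the variable holds one or more directories separated by os.pathsep.

```python
import os
from pathlib import Path


def missing_data_dirs(nltk_data_value):
    """Return the entries of an NLTK_DATA-style value that are not existing directories.

    Hypothetical helper: entries are assumed to be separated by os.pathsep.
    """
    entries = [entry for entry in nltk_data_value.split(os.pathsep) if entry]
    return [entry for entry in entries if not Path(entry).is_dir()]


if __name__ == "__main__":
    # Report every configured data directory that is absent on disk.
    for entry in missing_data_dirs(os.environ.get("NLTK_DATA", "")):
        print(f"NLTK_DATA entry does not exist: {entry}")
```

Running this in the %check step ahead of the test suite would flag a misspelled or missing directory immediately, instead of surfacing it as a LookupError during collection.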
Any ideas?
Thanks for any reply,
Matěj
https://matej.ceplovi.cz/blog/ , Jabber: mcepl@ceplovi.cz
GPG fingerprint: 3C76 A027 CA45 AD70 98B5 BC1D 7920 5802 880B C9D8
Clear ideas and thick chocolate.
(Ideas should be clear and chocolate thick.)
-- Spanish proverb
2 answers
8ljdwjyq1#
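@mcepl The library is called NLTK (not "NTLK"): it looks like you accidentally swapped the "L" and the "T" 😊
Please let us know whether this fixes the problem. I suspect you may run into some other issues (e.g. with the test_downloader.py tests), but please try this first :)
zwghvu4y2#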
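Yes, that was a mistake, but getting the tests to work offline takes more effort:
The data needed is much larger than all (at least comtrans, conll2007, jeita, knbc, machado, masc_tagged, nombank.1.0, panlex_swadesh, perluniprops, propbank, reuters, semcor, universal_treebanks_v20 had to be added).
rm tools/nltk_term_index.py tools/run_doctests.py nltk_data/corpora/semcor/semcor.py, which seem outdated and non-portable (or use obsolete libraries ... epydoc? Really?).
a. Allow skipping network tests: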
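One way such a switch could look is sketched below. This is not necessarily the change made for the package: the environment variable NLTK_TEST_NETWORK, the helper network_allowed, and the DownloaderTest class are all hypothetical names chosen for illustration. The sketch relies only on the standard unittest skip decorators, which pytest also honors.

```python
import os
import unittest


def network_allowed(environ=None):
    """True only when the (hypothetical) NLTK_TEST_NETWORK variable is set to "1"."""
    environ = os.environ if environ is None else environ
    return environ.get("NLTK_TEST_NETWORK") == "1"


# Decorator for tests that need Internet access.
# The decision is made once, when this module is imported.
requires_network = unittest.skipUnless(
    network_allowed(),
    "network access disabled (set NLTK_TEST_NETWORK=1 to enable)",
)


class DownloaderTest(unittest.TestCase):
    @requires_network
    def test_fetch_index(self):
        # Placeholder body: a real test would contact the download server here.
        self.assertTrue(network_allowed())
```

With the variable unset, as it would be in an isolated build environment, the decorated test is reported as skipped rather than failing.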
With the %check section in effect, it reads: (%pytest expands here to: ), so when I run it, I get:
I am skipping the doctests for now.
Merry Christmas and all the best for 2023!
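The extra packages listed above could be checked for before the test run starts. The sketch below uses a hypothetical helper, missing_packages, and assumes the layout suggested by the "Attempted to load" lines in the log: each package lives under the data directory as either <category>/<name> (unpacked) or <category>/<name>.zip.

```python
from pathlib import Path

# Package names taken from the list above.
EXTRA_PACKAGES = [
    "comtrans", "conll2007", "jeita", "knbc", "machado", "masc_tagged",
    "nombank.1.0", "panlex_swadesh", "perluniprops", "propbank", "reuters",
    "semcor", "universal_treebanks_v20",
]


def missing_packages(data_dir, names):
    """Return names found neither as <category>/<name> nor as <category>/<name>.zip.

    Assumption: packages sit exactly one directory level below data_dir.
    """
    root = Path(data_dir)
    missing = []
    for name in names:
        unpacked = any(path.is_dir() for path in root.glob(f"*/{name}"))
        zipped = any(path.is_file() for path in root.glob(f"*/{name}.zip"))
        if not (unpacked or zipped):
            missing.append(name)
    return missing
```

Calling missing_packages(os.environ["NLTK_DATA"], EXTRA_PACKAGES) (after importing os) in the build script would list whatever still has to be added to the offline data bundle.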