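First of all, thank you for sharing this great code.
After setting everything up, I got the following error when I tried to launch the demo. Please help me resolve it.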
(kosmos-2) wendell@:~/unilm/kosmos-2$ bash run_gradio.sh
run_gradio.sh: line 2: $'\r': command not found
run_gradio.sh: line 4: $'\r': command not found
run_gradio.sh: line 6: $'\r': command not found
/home/wendell/anaconda3/envs/kosmos-2/lib/python3.9/site-packages/torch/distributed/launch.py:181: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use-env is set by default in torchrun.
If your script expects `--local-rank` argument to be set, please
change it to read from `os.environ['LOCAL_RANK']` instead. See
https://pytorch.org/docs/stable/distributed.html#launch-utility for
further instructions
warnings.warn(
/home/wendell/anaconda3/envs/kosmos-2/lib/python3.9/site-packages/torchvision/io/image.py:13: UserWarning: Failed to load image Python extension: libtorch_cuda_cu.so: cannot open shared object file: No such file or directory
warn(f"Failed to load image Python extension: {e}")
Please install pip install -r visual_requirement.txt for VL dataset
usage: gradio_app.py [-h] [--no-progress-bar] [--log-interval LOG_INTERVAL] [--log-format {json,none,simple,tqdm}] [--log-file LOG_FILE] [--tensorboard-logdir TENSORBOARD_LOGDIR] [--wandb-project WANDB_PROJECT]
[--azureml-logging] [--seed SEED] [--cpu] [--tpu] [--bf16] [--memory-efficient-bf16] [--fp16] [--memory-efficient-fp16] [--fp16-no-flatten-grads] [--fp16-init-scale FP16_INIT_SCALE]
[--fp16-scale-window FP16_SCALE_WINDOW] [--fp16-scale-tolerance FP16_SCALE_TOLERANCE] [--on-cpu-convert-precision] [--min-loss-scale MIN_LOSS_SCALE] [--threshold-loss-scale THRESHOLD_LOSS_SCALE]
[--amp] [--amp-batch-retries AMP_BATCH_RETRIES] [--amp-init-scale AMP_INIT_SCALE] [--amp-scale-window AMP_SCALE_WINDOW] [--user-dir USER_DIR] [--empty-cache-freq EMPTY_CACHE_FREQ]
[--all-gather-list-size ALL_GATHER_LIST_SIZE] [--model-parallel-size MODEL_PARALLEL_SIZE] [--quantization-config-path QUANTIZATION_CONFIG_PATH] [--profile] [--reset-logging] [--suppress-crashes]
[--use-plasma-view] [--plasma-path PLASMA_PATH] [--deepspeed] [--zero ZERO] [--exit-interval EXIT_INTERVAL]
[--criterion {adaptive_loss,composite_loss,cross_entropy,ctc,fastspeech2,hubert,label_smoothed_cross_entropy,latency_augmented_label_smoothed_cross_entropy,label_smoothed_cross_entropy_with_alignment,legacy_masked_lm_loss,masked_lm,model,nat_loss,sentence_prediction,sentence_ranking,tacotron2,speech_to_unit,speech_to_spectrogram,wav2vec,vocab_parallel_cross_entropy,unigpt}]
[--tokenizer {moses,nltk,space}] [--bpe {byte_bpe,bytes,characters,fastbpe,gpt2,bert,hf_byte_bpe,sentencepiece,subword_nmt}]
[--optimizer {adadelta,adafactor,adagrad,adam,adamax,composite,cpu_adam,lamb,nag,sgd}]
[--lr-scheduler {cosine,fixed,inverse_sqrt,manual,pass_through,polynomial_decay,reduce_lr_on_plateau,step,tri_stage,triangular}] [--scoring {sacrebleu,bleu,chrf,meteor,wer}] [--task TASK]
[--num-workers NUM_WORKERS] [--skip-invalid-size-inputs-valid-test] [--max-tokens MAX_TOKENS] [--batch-size BATCH_SIZE] [--required-batch-size-multiple REQUIRED_BATCH_SIZE_MULTIPLE]
[--required-seq-len-multiple REQUIRED_SEQ_LEN_MULTIPLE] [--dataset-impl {raw,lazy,cached,mmap,fasta,huffman}] [--data-buffer-size DATA_BUFFER_SIZE] [--train-subset TRAIN_SUBSET]
[--valid-subset VALID_SUBSET] [--combine-valid-subsets] [--ignore-unused-valid-subsets] [--validate-interval VALIDATE_INTERVAL] [--validate-interval-updates VALIDATE_INTERVAL_UPDATES]
[--validate-after-updates VALIDATE_AFTER_UPDATES] [--fixed-validation-seed FIXED_VALIDATION_SEED] [--disable-validation] [--max-tokens-valid MAX_TOKENS_VALID]
[--batch-size-valid BATCH_SIZE_VALID] [--max-valid-steps MAX_VALID_STEPS] [--curriculum CURRICULUM] [--gen-subset GEN_SUBSET] [--num-shards NUM_SHARDS] [--shard-id SHARD_ID]
[--grouped-shuffling] [--update-epoch-batch-itr UPDATE_EPOCH_BATCH_ITR] [--update-ordered-indices-seed] [--distributed-world-size DISTRIBUTED_WORLD_SIZE]
[--distributed-num-procs DISTRIBUTED_NUM_PROCS] [--distributed-rank DISTRIBUTED_RANK] [--distributed-backend DISTRIBUTED_BACKEND] [--distributed-init-method DISTRIBUTED_INIT_METHOD]
[--distributed-port DISTRIBUTED_PORT] [--device-id DEVICE_ID] [--distributed-no-spawn] [--ddp-backend {c10d,fully_sharded,legacy_ddp,no_c10d,pytorch_ddp,slowmo}] [--ddp-comm-hook {none,fp16}]
[--bucket-cap-mb BUCKET_CAP_MB] [--fix-batches-to-gpus] [--find-unused-parameters] [--gradient-as-bucket-view] [--fast-stat-sync] [--heartbeat-timeout HEARTBEAT_TIMEOUT] [--broadcast-buffers]
[--slowmo-momentum SLOWMO_MOMENTUM] [--slowmo-base-algorithm SLOWMO_BASE_ALGORITHM] [--localsgd-frequency LOCALSGD_FREQUENCY] [--nprocs-per-node NPROCS_PER_NODE] [--pipeline-model-parallel]
[--pipeline-balance PIPELINE_BALANCE] [--pipeline-devices PIPELINE_DEVICES] [--pipeline-chunks PIPELINE_CHUNKS] [--pipeline-encoder-balance PIPELINE_ENCODER_BALANCE]
[--pipeline-encoder-devices PIPELINE_ENCODER_DEVICES] [--pipeline-decoder-balance PIPELINE_DECODER_BALANCE] [--pipeline-decoder-devices PIPELINE_DECODER_DEVICES]
[--pipeline-checkpoint {always,never,except_last}] [--zero-sharding {none,os}] [--no-reshard-after-forward] [--fp32-reduce-scatter] [--cpu-offload] [--use-sharded-state]
[--not-fsdp-flatten-parameters] [--path PATH] [--post-process [POST_PROCESS]] [--quiet] [--model-overrides MODEL_OVERRIDES] [--results-path RESULTS_PATH] [--beam BEAM] [--nbest NBEST]
[--max-len-a MAX_LEN_A] [--max-len-b MAX_LEN_B] [--min-len MIN_LEN] [--match-source-len] [--unnormalized] [--no-early-stop] [--no-beamable-mm] [--lenpen LENPEN] [--unkpen UNKPEN]
[--replace-unk [REPLACE_UNK]] [--sacrebleu] [--score-reference] [--prefix-size PREFIX_SIZE] [--no-repeat-ngram-size NO_REPEAT_NGRAM_SIZE] [--sampling] [--sampling-topk SAMPLING_TOPK]
[--sampling-topp SAMPLING_TOPP] [--constraints [{ordered,unordered}]] [--temperature TEMPERATURE] [--diverse-beam-groups DIVERSE_BEAM_GROUPS] [--diverse-beam-strength DIVERSE_BEAM_STRENGTH]
[--diversity-rate DIVERSITY_RATE] [--print-alignment [{hard,soft}]] [--print-step] [--lm-path LM_PATH] [--lm-weight LM_WEIGHT] [--iter-decode-eos-penalty ITER_DECODE_EOS_PENALTY]
[--iter-decode-max-iter ITER_DECODE_MAX_ITER] [--iter-decode-force-max-iter] [--iter-decode-with-beam ITER_DECODE_WITH_BEAM] [--iter-decode-with-external-reranker] [--retain-iter-history]
[--retain-dropout] [--retain-dropout-modules RETAIN_DROPOUT_MODULES] [--decoding-format {unigram,ensemble,vote,dp,bs}] [--no-seed-provided] [--save-dir SAVE_DIR] [--restore-file RESTORE_FILE]
[--continue-once CONTINUE_ONCE] [--finetune-from-model FINETUNE_FROM_MODEL] [--reset-dataloader] [--reset-lr-scheduler] [--reset-meters] [--reset-optimizer]
[--optimizer-overrides OPTIMIZER_OVERRIDES] [--save-interval SAVE_INTERVAL] [--save-interval-updates SAVE_INTERVAL_UPDATES] [--keep-interval-updates KEEP_INTERVAL_UPDATES]
[--keep-interval-updates-pattern KEEP_INTERVAL_UPDATES_PATTERN] [--keep-last-epochs KEEP_LAST_EPOCHS] [--keep-best-checkpoints KEEP_BEST_CHECKPOINTS] [--no-save] [--no-epoch-checkpoints]
[--no-last-checkpoints] [--no-save-optimizer-state] [--best-checkpoint-metric BEST_CHECKPOINT_METRIC] [--maximize-best-checkpoint-metric] [--patience PATIENCE]
[--checkpoint-suffix CHECKPOINT_SUFFIX] [--checkpoint-shard-count CHECKPOINT_SHARD_COUNT] [--load-checkpoint-on-all-dp-ranks] [--write-checkpoints-asynchronously] [--buffer-size BUFFER_SIZE]
[--input INPUT] [--source-lang SOURCE_LANG] [--target-lang TARGET_LANG] [--load-alignments] [--left-pad-source] [--left-pad-target] [--max-source-positions MAX_SOURCE_POSITIONS]
[--max-target-positions MAX_TARGET_POSITIONS] [--upsample-primary UPSAMPLE_PRIMARY] [--truncate-source] [--num-batch-buckets NUM_BATCH_BUCKETS] [--eval-bleu] [--eval-bleu-args EVAL_BLEU_ARGS]
[--eval-bleu-detok EVAL_BLEU_DETOK] [--eval-bleu-detok-args EVAL_BLEU_DETOK_ARGS] [--eval-tokenized-bleu] [--eval-bleu-remove-bpe [EVAL_BLEU_REMOVE_BPE]] [--eval-bleu-print-samples]
[--force-anneal FORCE_ANNEAL] [--lr-shrink LR_SHRINK] [--warmup-updates WARMUP_UPDATES] [--pad PAD] [--eos EOS] [--unk UNK]
data
gradio_app.py: error: unrecognized arguments: --local-rank=0
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 2) local_rank: 0 (pid: 648457) of binary: /home/wendell/anaconda3/envs/kosmos-2/bin/python
Traceback (most recent call last):
File "/home/wendell/anaconda3/envs/kosmos-2/lib/python3.9/runpy.py", line 197, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/wendell/anaconda3/envs/kosmos-2/lib/python3.9/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/home/wendell/anaconda3/envs/kosmos-2/lib/python3.9/site-packages/torch/distributed/launch.py", line 196, in <module>
main()
File "/home/wendell/anaconda3/envs/kosmos-2/lib/python3.9/site-packages/torch/distributed/launch.py", line 192, in main
launch(args)
File "/home/wendell/anaconda3/envs/kosmos-2/lib/python3.9/site-packages/torch/distributed/launch.py", line 177, in launch
run(args)
File "/home/wendell/anaconda3/envs/kosmos-2/lib/python3.9/site-packages/torch/distributed/run.py", line 785, in run
elastic_launch(
File "/home/wendell/anaconda3/envs/kosmos-2/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 134, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/wendell/anaconda3/envs/kosmos-2/lib/python3.9/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
demo/gradio_app.py FAILED
------------------------------------------------------------
Failures:
<NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
time : 2023-10-14_19:11:23
host : DESKTOP-3Q0HFJ3.
rank : 0 (local_rank: 0)
exitcode : 2 (pid: 648457)
error_file: <N/A>
traceback : To enable traceback see: https://pytorch.org/docs/stable/elastic/errors.html
============================================================
run_gradio.sh: line 8: --task: command not found
run_gradio.sh: line 9: --path: command not found
run_gradio.sh: line 11: --model-overrides: command not found
run_gradio.sh: line 12: --dict-path: command not found
run_gradio.sh: line 13: --required-batch-size-multiple: command not found
run_gradio.sh: line 14: --remove-bpe=sentencepiece: command not found
run_gradio.sh: line 15: --max-len-b: command not found
run_gradio.sh: line 16: --add-bos-token: command not found
run_gradio.sh: line 17: --beam: command not found
run_gradio.sh: line 18: --buffer-size: command not found
run_gradio.sh: line 19: --image-feature-length: command not found
run_gradio.sh: line 20: --locate-special-token: command not found
run_gradio.sh: line 21: --batch-size: command not found
run_gradio.sh: line 22: --nbest: command not found
run_gradio.sh: line 23: --no-repeat-ngram-size: command not found
run_gradio.sh: line 24: --location-bin-size: command not found
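A note on the `unrecognized arguments: --local-rank=0` line in the log above: `torch.distributed.launch` appended `--local-rank=0` to the script's arguments, and the parser in `gradio_app.py` rejected it. The FutureWarning at the top of the same log suggests reading `LOCAL_RANK` from the environment instead. The following is only a minimal sketch of that idea, not the actual parser used by `gradio_app.py`; the helper name `parse_local_rank` is hypothetical, and it assumes the script can pre-parse its arguments before handing the rest to its real parser.

```python
import argparse
import os


def parse_local_rank(argv=None):
    """Return (local_rank, remaining_args).

    Hypothetical helper: takes the rank from the command line when the
    launcher passes it, otherwise falls back to the LOCAL_RANK variable.
    """
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    # Accept both the dashed and the underscored spelling of the flag.
    parser.add_argument(
        "--local-rank", "--local_rank",
        dest="local_rank",
        type=int,
        default=int(os.environ.get("LOCAL_RANK", 0)),
    )
    # parse_known_args returns unknown arguments instead of raising an
    # error, so they can still be passed on to the script's own parser.
    args, remaining = parser.parse_known_args(argv)
    return args.local_rank, remaining
```

Called with `["None", "--local-rank=0", "--task", "generation_obj"]`, the helper consumes only the rank flag and returns the other arguments unchanged. Launching with `torchrun`, as the warning recommends, avoids the extra flag altogether because the rank is then passed only through the environment.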
run_gradio.sh
#!/bin/bash
model_path=./path/kosmos2.pt
master_port=$((RANDOM%1000+20000))
CUDA_LAUNCH_BLOCKING=1 CUDA_VISIBLE_DEVICES=0 python -m torch.distributed.launch --master_port=$master_port --nproc_per_node=1 demo/gradio_app.py None \
--task generation_obj \
--path $model_path \
--model-overrides "{'visual_pretrained': '',
'dict_path':'data/dict.txt'}" \
--dict-path 'data/dict.txt' \
--required-batch-size-multiple 1 \
--remove-bpe=sentencepiece \
--max-len-b 500 \
--add-bos-token \
--beam 1 \
--buffer-size 1 \
--image-feature-length 64 \
--locate-special-token 1 \
--batch-size 1 \
--nbest 1 \
--no-repeat-ngram-size 3 \
--location-bin-size 32
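The `$'\r': command not found` messages in the log are what bash typically prints when a script has Windows (CRLF) line endings. In that case a trailing `\` is followed by a carriage return instead of the newline, so the line continuation is lost, which would also be consistent with the later `--task: command not found` errors. As a hedged sketch only, assuming the script is at `run_gradio.sh` in the current directory, the line endings could be normalized like this (the function name `convert_crlf_to_lf` is made up for illustration):

```python
from pathlib import Path


def convert_crlf_to_lf(path):
    """Rewrite the file with Unix line endings.

    Returns the number of CRLF sequences that were replaced; the file is
    left untouched when there are none.
    """
    file_path = Path(path)
    data = file_path.read_bytes()
    crlf_count = data.count(b"\r\n")
    if crlf_count:
        file_path.write_bytes(data.replace(b"\r\n", b"\n"))
    return crlf_count
```

For example, `convert_crlf_to_lf("run_gradio.sh")` returns 0 if the file already uses LF endings. Working on bytes avoids any implicit newline translation when reading and writing the file.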
Package Version
------------------------- -------------------------
aiofiles 23.2.1
aiohttp 3.8.6
aiosignal 1.3.1
altair 5.1.2
annotated-types 0.6.0
antlr4-python3-runtime 4.8
anyio 3.7.1
apex 0.1
async-timeout 4.0.3
attrs 23.1.0
bitarray 2.8.2
blis 0.7.11
braceexpand 0.1.7
catalogue 2.0.10
certifi 2023.7.22
cffi 1.16.0
charset-normalizer 3.3.0
click 8.1.7
colorama 0.4.6
confection 0.1.3
contourpy 1.1.1
cycler 0.12.1
cymem 2.0.8
Cython 3.0.3
deepspeed 0.4.4+165739a5
exceptiongroup 1.1.3
fairscale 0.4.0
fairseq 1.0.0a0+b237f42
fastapi 0.103.2
ffmpy 0.3.1
filelock 3.12.4
fonttools 4.43.1
frozenlist 1.4.0
fsspec 2023.9.2
ftfy 6.1.1
gmpy2 2.1.2
gradio 3.37.0
gradio_client 0.6.0
h11 0.14.0
httpcore 0.17.3
httpx 0.24.1
huggingface-hub 0.18.0
hydra-core 1.0.7
idna 3.4
importlib-resources 6.1.0
infinibatch 0.1.0
Jinja2 3.1.2
jsonschema 4.19.1
jsonschema-specifications 2023.7.1
kiwisolver 1.4.5
langcodes 3.3.0
linkify-it-py 2.0.2
lxml 4.9.3
markdown-it-py 2.2.0
MarkupSafe 2.1.1
matplotlib 3.8.0
mdit-py-plugins 0.3.3
mdurl 0.1.2
mpmath 1.3.0
multidict 6.0.4
murmurhash 1.0.10
networkx 3.1
ninja 1.11.1.1
numpy 1.23.0
nvidia-cublas-cu11 11.10.3.66
nvidia-cuda-nvrtc-cu11 11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11 8.5.0.96
omegaconf 2.0.6
open-clip-torch 1.3.0
opencv-python-headless 4.8.0.74
orjson 3.9.9
packaging 23.2
pandas 2.1.1
pathy 0.10.2
Pillow 10.0.1
pip 23.2.1
portalocker 2.8.2
preshed 3.0.9
protobuf 3.20.3
psutil 5.9.5
pycparser 2.21
pydantic 1.10.11
pydantic_core 2.10.1
pydub 0.25.1
pyparsing 3.1.1
python-dateutil 2.8.2
python-multipart 0.0.6
pytz 2023.3.post1
PyYAML 6.0.1
referencing 0.30.2
regex 2023.10.3
requests 2.31.0
rpds-py 0.10.6
sacrebleu 2.3.1
scipy 1.8.0
semantic-version 2.10.0
sentencepiece 0.1.99
setuptools 68.0.0
six 1.16.0
smart-open 6.4.0
sniffio 1.3.0
spacy 3.6.0
spacy-legacy 3.0.12
spacy-loggers 1.0.5
srsly 2.4.8
starlette 0.27.0
sympy 1.11.1
tabulate 0.9.0
tensorboardX 1.8
thinc 8.1.10
tiktoken 0.5.1
timm 0.4.12
toolz 0.12.0
torch 1.13.0
torchscale 0.1.1
torchvision 0.14.0
tqdm 4.66.1
triton 2.0.0
typer 0.9.0
typing_extensions 4.7.1
tzdata 2023.3
uc-micro-py 1.0.2
urllib3 2.0.6
uvicorn 0.23.2
wasabi 1.1.2
wcwidth 0.2.8
webdataset 0.2.57
websockets 11.0.3
wheel 0.41.2
xformers 0.0.23.dev652+git.705810f
yarl 1.9.2
zipp 3.17.0
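I ran into many difficulties setting up the environment. Even after making sure everything was configured correctly, I still get errors when I run run_gradio.sh.
Any help would be appreciated. Thank you!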
7 answers
o0lyfsai1#
lf5gs5x22#
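First of all, thank you for your help! I tried the method you provided and got the following message.
Then I ran run_gradio.sh and hit the following error.
I am using WSL (Windows Subsystem for Linux) with Ubuntu 22.04.2. I am not sure whether that matters.
I think the xformers warning can be ignored, but I am not sure whether the current error comes from a mistake I made while applying your method. I am not very familiar with Git, sorry about that.
Please help me resolve this.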
xtupzzrd3#
I see. This error may be caused by running under WSL (Windows Subsystem for Linux). I am not sure whether Gradio can run under WSL.
gcuhipw94#
OK, I understand. I will try switching to a different environment. Thank you very much for your help!
cetgtptt5#
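"I see. This error may be caused by using WSL. I am not sure whether WSL supports Gradio."
Hello @donglixp, thank you for all your help so far. I have confirmed that WSL supports Gradio.
Current error:
I have xformers installed, but it is currently version 1.0.1.
If you can help, please let me know. Thank you!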
68bkxrlz6#
I changed the xformers version. Current error:
ygya80vv7#
Has anyone solved the last error?