PaddleOCR: evaluating the model provided for the PGNet algorithm does not reach the published metrics

ef1yzkbh posted on 2022-10-27 in Other

Environment information:

Published metrics:

Link: https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.5/doc/doc_ch/algorithm_e2e_pgnet.md#%E6%A8%A1%E5%9E%8B%E8%AE%AD%E7%BB%83%E3%80%81%E8%AF%84%E4%BC%B0%E3%80%81%E6%8E%A8%E7%90%86

I evaluated the en_server_pgnetA model from the download link in the figure above. The printed log is as follows:

[2022/09/14 14:39:37] ppocr INFO: Architecture :
[2022/09/14 14:39:37] ppocr INFO: Backbone :
[2022/09/14 14:39:37] ppocr INFO: layers : 50
[2022/09/14 14:39:37] ppocr INFO: name : ResNet
[2022/09/14 14:39:37] ppocr INFO: Head :
[2022/09/14 14:39:37] ppocr INFO: name : PGHead
[2022/09/14 14:39:37] ppocr INFO: Neck :
[2022/09/14 14:39:37] ppocr INFO: name : PGFPN
[2022/09/14 14:39:37] ppocr INFO: Transform : None
[2022/09/14 14:39:37] ppocr INFO: algorithm : PGNet
[2022/09/14 14:39:37] ppocr INFO: model_type : e2e
[2022/09/14 14:39:37] ppocr INFO: Eval :
[2022/09/14 14:39:37] ppocr INFO: dataset :
[2022/09/14 14:39:37] ppocr INFO: data_dir : ./train_data/total_text/test
[2022/09/14 14:39:37] ppocr INFO: label_file_list : ['./train_data/total_text/test/test.txt']
[2022/09/14 14:39:37] ppocr INFO: name : PGDataSet
[2022/09/14 14:39:37] ppocr INFO: transforms :
[2022/09/14 14:39:37] ppocr INFO: DecodeImage :
[2022/09/14 14:39:37] ppocr INFO: channel_first : False
[2022/09/14 14:39:37] ppocr INFO: img_mode : BGR
[2022/09/14 14:39:37] ppocr INFO: E2ELabelEncodeTest : None
[2022/09/14 14:39:37] ppocr INFO: E2EResizeForTest :
[2022/09/14 14:39:37] ppocr INFO: max_side_len : 768
[2022/09/14 14:39:37] ppocr INFO: NormalizeImage :
[2022/09/14 14:39:37] ppocr INFO: mean : [0.485, 0.456, 0.406]
[2022/09/14 14:39:37] ppocr INFO: order : hwc
[2022/09/14 14:39:37] ppocr INFO: scale : 1./255.
[2022/09/14 14:39:37] ppocr INFO: std : [0.229, 0.224, 0.225]
[2022/09/14 14:39:37] ppocr INFO: ToCHWImage : None
[2022/09/14 14:39:37] ppocr INFO: KeepKeys :
[2022/09/14 14:39:37] ppocr INFO: keep_keys : ['image', 'shape', 'polys', 'texts', 'ignore_tags', 'img_id']
[2022/09/14 14:39:37] ppocr INFO: loader :
[2022/09/14 14:39:37] ppocr INFO: batch_size_per_card : 1
[2022/09/14 14:39:37] ppocr INFO: drop_last : False
[2022/09/14 14:39:37] ppocr INFO: num_workers : 2
[2022/09/14 14:39:37] ppocr INFO: shuffle : False
[2022/09/14 14:39:37] ppocr INFO: Global :
[2022/09/14 14:39:37] ppocr INFO: cal_metric_during_train : False
[2022/09/14 14:39:37] ppocr INFO: character_dict_path : ppocr/utils/ic15_dict.txt
[2022/09/14 14:39:37] ppocr INFO: character_type : EN
[2022/09/14 14:39:37] ppocr INFO: checkpoints : pretrain_models/en_server_pgnetA/best_accuracy
[2022/09/14 14:39:37] ppocr INFO: distributed : False
[2022/09/14 14:39:37] ppocr INFO: epoch_num : 600
[2022/09/14 14:39:37] ppocr INFO: eval_batch_step : [1000, 1000]
[2022/09/14 14:39:37] ppocr INFO: infer_img : None
[2022/09/14 14:39:37] ppocr INFO: log_smooth_window : 20
[2022/09/14 14:39:37] ppocr INFO: max_text_length : 50
[2022/09/14 14:39:37] ppocr INFO: max_text_nums : 30
[2022/09/14 14:39:37] ppocr INFO: pretrained_model : None
[2022/09/14 14:39:37] ppocr INFO: print_batch_step : 10
[2022/09/14 14:39:37] ppocr INFO: save_epoch_step : 20
[2022/09/14 14:39:37] ppocr INFO: save_inference_dir : None
[2022/09/14 14:39:37] ppocr INFO: save_model_dir : ./output/pgnet_r50_vd_totaltext/
[2022/09/14 14:39:37] ppocr INFO: save_res_path : ./output/pgnet_r50_vd_totaltext/predicts_pgnet.txt
[2022/09/14 14:39:37] ppocr INFO: tcl_len : 64
[2022/09/14 14:39:37] ppocr INFO: use_gpu : False
[2022/09/14 14:39:37] ppocr INFO: use_visualdl : True
[2022/09/14 14:39:37] ppocr INFO: valid_set : totaltext
[2022/09/14 14:39:37] ppocr INFO: Loss :
[2022/09/14 14:39:37] ppocr INFO: max_text_length : 50
[2022/09/14 14:39:37] ppocr INFO: max_text_nums : 30
[2022/09/14 14:39:37] ppocr INFO: name : PGLoss
[2022/09/14 14:39:37] ppocr INFO: pad_num : 36
[2022/09/14 14:39:37] ppocr INFO: tcl_bs : 64
[2022/09/14 14:39:37] ppocr INFO: Metric :
[2022/09/14 14:39:37] ppocr INFO: character_dict_path : ppocr/utils/ic15_dict.txt
[2022/09/14 14:39:37] ppocr INFO: gt_mat_dir : ./train_data/total_text/gt
[2022/09/14 14:39:37] ppocr INFO: main_indicator : f_score_e2e
[2022/09/14 14:39:37] ppocr INFO: mode : A
[2022/09/14 14:39:37] ppocr INFO: name : E2EMetric
[2022/09/14 14:39:37] ppocr INFO: Optimizer :
[2022/09/14 14:39:37] ppocr INFO: beta1 : 0.9
[2022/09/14 14:39:37] ppocr INFO: beta2 : 0.999
[2022/09/14 14:39:37] ppocr INFO: lr :
[2022/09/14 14:39:37] ppocr INFO: learning_rate : 0.001
[2022/09/14 14:39:37] ppocr INFO: name : Adam
[2022/09/14 14:39:37] ppocr INFO: regularizer :
[2022/09/14 14:39:37] ppocr INFO: factor : 0
[2022/09/14 14:39:37] ppocr INFO: name : L2
[2022/09/14 14:39:37] ppocr INFO: PostProcess :
[2022/09/14 14:39:37] ppocr INFO: mode : slow
[2022/09/14 14:39:37] ppocr INFO: name : PGPostProcess
[2022/09/14 14:39:37] ppocr INFO: score_thresh : 0.5
[2022/09/14 14:39:37] ppocr INFO: Train :
[2022/09/14 14:39:37] ppocr INFO: dataset :
[2022/09/14 14:39:37] ppocr INFO: data_dir : ./train_data/total_text/train
[2022/09/14 14:39:37] ppocr INFO: label_file_list : ['./train_data/total_text/train/train.txt']
[2022/09/14 14:39:37] ppocr INFO: name : PGDataSet
[2022/09/14 14:39:37] ppocr INFO: ratio_list : [1.0]
[2022/09/14 14:39:37] ppocr INFO: transforms :
[2022/09/14 14:39:37] ppocr INFO: DecodeImage :
[2022/09/14 14:39:37] ppocr INFO: channel_first : False
[2022/09/14 14:39:37] ppocr INFO: img_mode : BGR
[2022/09/14 14:39:37] ppocr INFO: E2ELabelEncodeTrain : None
[2022/09/14 14:39:37] ppocr INFO: PGProcessTrain :
[2022/09/14 14:39:37] ppocr INFO: batch_size : 4
[2022/09/14 14:39:37] ppocr INFO: max_text_size : 512
[2022/09/14 14:39:37] ppocr INFO: min_crop_size : 24
[2022/09/14 14:39:37] ppocr INFO: min_text_size : 4
[2022/09/14 14:39:37] ppocr INFO: KeepKeys :
[2022/09/14 14:39:37] ppocr INFO: keep_keys : ['images', 'tcl_maps', 'tcl_label_maps', 'border_maps', 'direction_maps', 'training_masks', 'label_list', 'pos_list', 'pos_mask']
[2022/09/14 14:39:37] ppocr INFO: loader :
[2022/09/14 14:39:37] ppocr INFO: batch_size_per_card : 4
[2022/09/14 14:39:37] ppocr INFO: drop_last : True
[2022/09/14 14:39:37] ppocr INFO: num_workers : 2
[2022/09/14 14:39:37] ppocr INFO: shuffle : True
[2022/09/14 14:39:37] ppocr INFO: profiler_options : None
[2022/09/14 14:39:37] ppocr INFO: train with paddle 2.3.2 and device Place(cpu)
[2022/09/14 14:39:37] ppocr INFO: Initialize indexs of datasets:['./train_data/total_text/test/test.txt']
[2022/09/14 14:39:40] ppocr INFO: resume from pretrain_models/en_server_pgnetA/best_accuracy
[2022/09/14 14:39:40] ppocr INFO: metric in ckpt***************
[2022/09/14 14:39:40] ppocr INFO: f_score:0.7829733997188428
[2022/09/14 14:39:40] ppocr INFO: total_num_gt:2543
[2022/09/14 14:39:40] ppocr INFO: seqerr:0.3176906452608811
[2022/09/14 14:39:40] ppocr INFO: recall_e2e:0.521431380259536
[2022/09/14 14:39:40] ppocr INFO: f_score_e2e:0.5293413173652695
[2022/09/14 14:39:40] ppocr INFO: total_num_det:2467
[2022/09/14 14:39:40] ppocr INFO: precision:0.8026753141467351
[2022/09/14 14:39:40] ppocr INFO: recall:0.7642154935115983
[2022/09/14 14:39:40] ppocr INFO: global_accumulative_recall:1943.3999999999946
[2022/09/14 14:39:40] ppocr INFO: fps:10.138212364362793
[2022/09/14 14:39:40] ppocr INFO: precision_e2e:0.5374949331171464
[2022/09/14 14:39:40] ppocr INFO: best_epoch:448
[2022/09/14 14:39:40] ppocr INFO: hit_str_count:1326
[2022/09/14 14:39:40] ppocr INFO: start_epoch:451
[2022/09/14 14:39:40] ppocr INFO: is_float16:False
eval model:: 100%|████████████████████████████| 300/300 [30:45<00:00, 5.64s/it]
[2022/09/14 15:10:26] ppocr INFO: metric eval***************
[2022/09/14 15:10:26] ppocr INFO: total_num_gt:2543
[2022/09/14 15:10:26] ppocr INFO: total_num_det:2473
[2022/09/14 15:10:26] ppocr INFO: global_accumulative_recall:1960.5999999999951
[2022/09/14 15:10:26] ppocr INFO: hit_str_count:1342
[2022/09/14 15:10:26] ppocr INFO: recall:0.7709791584742411
[2022/09/14 15:10:26] ppocr INFO: precision:0.8086534573392629
[2022/09/14 15:10:26] ppocr INFO: f_score:0.7893670411656245
[2022/09/14 15:10:26] ppocr INFO: seqerr:0.31551565847189467
[2022/09/14 15:10:26] ppocr INFO: recall_e2e:0.5277231616201337
[2022/09/14 15:10:26] ppocr INFO: precision_e2e:0.542660735948241
[2022/09/14 15:10:26] ppocr INFO: f_score_e2e:0.5350877192982456
[2022/09/14 15:10:26] ppocr INFO: fps:0.1657851570315855

What is the reason for this?

sqxo8psd #1

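The paper metrics published in the README have to be computed with mode B of the metric. Mode A takes labels in the same format as PPOCR, but it yields lower scores.

To evaluate accuracy in mode B:

Download the ground truth:
wget https://paddleocr.bj.bcebos.com/dataset/Groundtruth.tar
Then change the Metric section of the config: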
Metric:
  name: E2EMetric
  mode: B   # two ways for eval, A: label from txt,  B: label from gt_mat
  gt_mat_dir:  ./train_data/Groundtruth/  # the dir of gt_mat
  character_dict_path: ppocr/utils/ic15_dict.txt
  main_indicator: f_score_e2e

The resulting metrics:

[2022/09/15 02:35:36] ppocr INFO: load pretrain successful from ./en_server_pgnetA/best_accuracy
eval model:: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 300/300 [01:15<00:00,  3.97it/s]
[2022/09/15 02:36:52] ppocr INFO: metric eval***************
[2022/09/15 02:36:52] ppocr INFO: total_num_gt:2204
[2022/09/15 02:36:52] ppocr INFO: total_num_det:2070
[2022/09/15 02:36:52] ppocr INFO: global_accumulative_recall:1818.3999999999967
[2022/09/15 02:36:52] ppocr INFO: hit_str_count:1267
[2022/09/15 02:36:52] ppocr INFO: recall:0.8250453720508152
[2022/09/15 02:36:52] ppocr INFO: precision:0.8749758454106266
[2022/09/15 02:36:52] ppocr INFO: f_score:0.8492773672439888
[2022/09/15 02:36:52] ppocr INFO: seqerr:0.30323361196656273
[2022/09/15 02:36:52] ppocr INFO: recall_e2e:0.5748638838475499
[2022/09/15 02:36:52] ppocr INFO: precision_e2e:0.6120772946859904
[2022/09/15 02:36:52] ppocr INFO: f_score_e2e:0.5928872250818905
[2022/09/15 02:36:52] ppocr INFO: fps:20.85154714822483
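The end-to-end numbers in both logs above can be recomputed from the raw counts they print, which makes it easier to see what changes between mode A and mode B. The sketch below is in Python; the formulas are inferred from the logged values, not taken from the PaddleOCR source, and the function name `e2e_metrics` is hypothetical.

```python
def e2e_metrics(hit_str_count, total_num_gt, total_num_det,
                global_accumulative_recall):
    """Recompute the end-to-end metrics from the counts printed in the log.

    Assumption: these relations are inferred from the logged numbers;
    they are not copied from the PaddleOCR implementation.
    """
    recall_e2e = hit_str_count / total_num_gt
    precision_e2e = hit_str_count / total_num_det
    f_score_e2e = (2 * precision_e2e * recall_e2e
                   / (precision_e2e + recall_e2e))
    # Share of matched detections whose recognized string is wrong.
    seqerr = 1 - hit_str_count / global_accumulative_recall
    return {
        "recall_e2e": recall_e2e,
        "precision_e2e": precision_e2e,
        "f_score_e2e": f_score_e2e,
        "seqerr": seqerr,
    }


# Counts copied from the two "metric eval" logs in this thread.
mode_a = e2e_metrics(1342, 2543, 2473, 1960.5999999999951)
mode_b = e2e_metrics(1267, 2204, 2070, 1818.3999999999967)

for name, metrics in (("A", mode_a), ("B", mode_b)):
    print(f"mode {name}: " + ", ".join(
        f"{key}={value:.4f}" for key, value in metrics.items()))
```

In these two logs total_num_gt drops from 2543 in mode A to 2204 in mode B, so the two modes are scored against different ground-truth sets. That accounts for part of the gap in f_score_e2e.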
uinbv5nw #2

OK, thanks.
One more question: which dataset was used to pre-train PGNet?

oyxsuwqo #3

The second-stage training data is synthtexk150k_irregular, synthtexk150k_curved, ArTV2 and Total-Text, with data ratios of [0.0023, 0.0070, 0.1653, 0.8254] respectively.
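To get a feel for what such a mix means, here is a minimal Python sketch. It assumes the four ratios are per-sample mixing proportions (they sum to 1), which the answer does not state explicitly, and it does not show how PaddleOCR actually applies them. The helper `expected_counts` is hypothetical.

```python
import random

# Dataset names and ratios as given in the answer above.
datasets = ["synthtexk150k_irregular", "synthtexk150k_curved",
            "ArTV2", "Total-Text"]
ratios = [0.0023, 0.0070, 0.1653, 0.8254]


def expected_counts(ratios, n_samples):
    """Expected number of samples per dataset under the assumed mix."""
    return [round(ratio * n_samples) for ratio in ratios]


# The ratios sum to 1, so they can be read as sampling probabilities.
assert abs(sum(ratios) - 1.0) < 1e-9

counts = dict(zip(datasets, expected_counts(ratios, 10_000)))
for name, count in counts.items():
    print(f"{name}: about {count} of 10000 samples")

# Illustration only: draw one hypothetical batch of 4 under the same mix.
rng = random.Random(0)
batch_sources = rng.choices(datasets, weights=ratios, k=4)
print(batch_sources)
```

Under this reading, roughly 99% of the second-stage samples come from ArTV2 and Total-Text, and the two synthetic sets contribute less than 1% combined.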
