
Why is there a precision difference between the Python DBPostProcess in PaddleOCR and fd.vision.ocr.DBDetectorPostprocessor? #2607


Open
huotong1212 opened this issue Apr 6, 2025 · 1 comment

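Friendly reminder: according to informal community statistics, asking questions by following the template speeds up responses and resolution.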


Environment

  • [FastDeploy version]: fastdeploy-python 1.0.7

Problem log and steps to reproduce

Here, DBPostProcess and create_operators are taken from the PaddleOCR 2.5 code.

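import unittest

import cv2
import numpy as np
import fastdeploy as fd
# Assumed PaddleOCR 2.5 import paths for the helpers mentioned above.
from ppocr.data import create_operators, transform
from ppocr.postprocess.db_postprocess import DBPostProcess
# SyncGRPCTritonRunner is the author's Triton gRPC client wrapper; its definition is not shown here.


class DBPostTest(unittest.TestCase):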
    def setUp(self):
        url = "localhost:8008"
        self.det_runner = SyncGRPCTritonRunner(url, "v4_det_runtime", "1")

        pre_process_list = [{'DetResizeForTest': {'limit_side_len': 960, 'limit_type': 'max'}}, {
            'NormalizeImage': {'std': [0.229, 0.224, 0.225], 'mean': [0.485, 0.456, 0.406], 'scale': '1./255.',
                               'order': 'hwc'}}, {'ToCHWImage': None}, {'KeepKeys': {'keep_keys': ['image', 'shape']}}]
        self.det_preprocess_op = create_operators(pre_process_list)
        self.det_post = DBPostProcess(thresh=0.3, box_thresh=0.6, unclip_ratio=1.5)

        self.det_pre_fd = fd.vision.ocr.DBDetectorPreprocessor()
        self.det_pre_fd.max_side_len = 960

        self.det_post_fd = fd.vision.ocr.DBDetectorPostprocessor()
        self.det_post_fd.det_db_thresh = 0.3
        self.det_post_fd.det_db_box_thresh = 0.6
        self.det_post_fd.det_db_unclip_ratio = 1.5

    def test_db_py(self):
        image = cv2.imread("images/d1.jpg")
        output = transform({'image': image}, self.det_preprocess_op)
        img, shape_list = output
        print(f"shape list:{shape_list.tolist()}")
        inputs = np.expand_dims(img, axis=0)
        shape_list = np.expand_dims(shape_list, axis=0)
        outputs = self.det_runner.Run([inputs])
        preds = {
            "maps": list(outputs.values())[0]
        }
        results = self.det_post(preds, shape_list)
        for row in results[0]["points"]:
            print(row.tolist())

    def test_db_c(self):
        image = cv2.imread("images/d1.jpg")
        inputs, shape_list = self.det_pre_fd.run(image[np.newaxis, :, :, :])
        print(f"shape list:{shape_list}")
        inputs = inputs[0].numpy()
        outputs = self.det_runner.Run([inputs])
        preds = list(outputs.values())[0].copy()
        results = self.det_post_fd.run([preds,], shape_list)
        for row in results[0]:
            print(row)

The output is as follows:

python PaddleOCR

shape_list [1439.0, 1528.0, 0.6226546212647672, 0.6282722513089005]

[[564, 1040], [726, 1023], [730, 1062], [568, 1079]]
[[708, 1019], [917, 1008], [919, 1048], [710, 1059]]
[[292, 1001], [537, 994], [538, 1052], [294, 1059]]
[[1122, 1002], [1316, 999], [1317, 1037], [1123, 1041]]
[[296, 940], [535, 933], [537, 989], [297, 996]]
[[959, 942], [1097, 934], [1099, 974], [961, 982]]
[[658, 929], [847, 921], [849, 971], [660, 979]]
[[1126, 919], [1264, 913], [1266, 958], [1128, 964]]
[[314, 891], [532, 874], [536, 926], [318, 942]]
[[561, 877], [744, 870], [745, 909], [563, 916]]
[[759, 858], [1058, 849], [1059, 888], [760, 897]]
[[294, 819], [535, 814], [537, 876], [295, 882]]
[[563, 792], [735, 792], [735, 830], [563, 830]]
[[278, 704], [925, 685], [927, 734], [279, 752]]
[[278, 544], [544, 527], [547, 577], [281, 593]]
[[856, 466], [1185, 457], [1186, 502], [857, 511]]
[[281, 465], [664, 452], [666, 502], [283, 514]]
[[341, 389], [1258, 395], [1257, 450], [340, 443]]
[[1031, 45], [1197, 45], [1197, 191], [1031, 191]]

fastdeploy
shape_list [[1528, 1439, 960, 896]]
(ratios: 960/1528 = 0.6282722513089005, 896/1439 = 0.6226546212647672)

[563, 1040, 725, 1023, 730, 1061, 568, 1079]
[708, 1018, 916, 1008, 918, 1048, 709, 1058]
[292, 1000, 536, 994, 537, 1051, 294, 1058]
[1122, 1002, 1316, 998, 1316, 1037, 1122, 1040]
[296, 939, 534, 933, 536, 989, 297, 995]
[959, 941, 1096, 934, 1098, 974, 961, 981]
[658, 928, 846, 921, 848, 971, 660, 978]
[1126, 918, 1263, 913, 1265, 958, 1128, 963]
[313, 891, 531, 873, 536, 925, 318, 942]
[561, 876, 743, 870, 744, 909, 563, 915]
[736, 880, 751, 880, 751, 886, 736, 886]
[759, 857, 1058, 849, 1058, 888, 759, 896]
[294, 819, 534, 814, 536, 876, 296, 881]
[563, 791, 735, 791, 735, 830, 563, 830]
[278, 703, 924, 685, 926, 733, 280, 751]
[273, 610, 913, 594, 915, 660, 275, 676]
[276, 544, 544, 526, 547, 576, 280, 594]
[574, 538, 588, 538, 588, 552, 574, 552]
[856, 465, 1184, 457, 1185, 502, 857, 510]
[281, 464, 663, 452, 665, 502, 283, 513]
[340, 388, 1257, 395, 1257, 449, 340, 443]
[1031, 44, 1196, 44, 1196, 191, 1031, 191]
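
Before comparing boxes, it may help to confirm that both preprocessors agree on the resize. The sketch below assumes the PaddleOCR shape_list is laid out as [src_h, src_w, ratio_h, ratio_w] and the FastDeploy one as [src_w, src_h, resized_w, resized_h]; that reading is inferred from the printed values, not taken from either library's documentation.

```python
import math

# Values copied from the two shape_list printouts above.
py_shape = [1439.0, 1528.0, 0.6226546212647672, 0.6282722513089005]
fd_shape = [1528, 1439, 960, 896]

# Assumed layouts (inferred from the numbers, not from library docs).
src_h, src_w, ratio_h, ratio_w = py_shape
fd_w, fd_h, fd_resized_w, fd_resized_h = fd_shape

# Recompute the ratios implied by the FastDeploy sizes.
fd_ratio_w = fd_resized_w / fd_w
fd_ratio_h = fd_resized_h / fd_h

same_size = (src_w == fd_w) and (src_h == fd_h)
same_ratio = math.isclose(ratio_w, fd_ratio_w) and math.isclose(ratio_h, fd_ratio_h)
print("same source size:", same_size)
print("same resize ratios:", same_ratio)
```

If both checks pass under that layout assumption, the resize step is unlikely to be the source of the mismatch.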

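After comparison, the model inference outputs are identical, but the post-processing results differ, as shown above (for example, 564 vs. 563). Why is that? Is it caused by a precision difference in the C++ implementation? Any guidance would be appreciated.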
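
To quantify the mismatch, the printed boxes can be compared coordinate by coordinate. A minimal sketch using only the first two boxes from each listing (values copied from the output above; the variable names are illustrative):

```python
import numpy as np

# First two boxes from the PaddleOCR (Python) output: four (x, y) points each.
py_boxes = np.array([
    [[564, 1040], [726, 1023], [730, 1062], [568, 1079]],
    [[708, 1019], [917, 1008], [919, 1048], [710, 1059]],
])

# The same two boxes from the FastDeploy output: flat x1, y1, ..., x4, y4.
fd_boxes = np.array([
    [563, 1040, 725, 1023, 730, 1061, 568, 1079],
    [708, 1018, 916, 1008, 918, 1048, 709, 1058],
]).reshape(-1, 4, 2)

# Per-coordinate difference (Python minus FastDeploy).
diff = py_boxes - fd_boxes
max_abs_diff = int(np.abs(diff).max())
print(diff.tolist())
print("max abs diff:", max_abs_diff)
```

For these two boxes every difference is 0 or 1 pixel, and the Python value is never the smaller one. That pattern would be consistent with one side truncating where the other rounds, though this has not been checked against either implementation.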

Collaborator

ChaoII commented Apr 25, 2025

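Check whether the post-processing casts float directly to int anywhere, since that discards the fractional part. Some post-processing code instead calls round first and then converts to int. I don't think either approach affects the final result.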
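
The two conversions mentioned above can be compared directly. This is a sketch with made-up sub-pixel coordinates; the values are hypothetical and are not the actual intermediate values from either post-processor.

```python
import numpy as np

def to_int_truncate(coords):
    """Cast float coordinates straight to int, dropping the fractional part."""
    return np.asarray(coords, dtype=np.float64).astype(np.int32)

def to_int_round(coords):
    """Round to the nearest integer first, then cast to int."""
    return np.round(np.asarray(coords, dtype=np.float64)).astype(np.int32)

# Hypothetical float coordinates, chosen only to illustrate the effect.
coords = [563.7, 1040.2, 725.6, 1023.4]

truncated = to_int_truncate(coords)
rounded = to_int_round(coords)
diff = rounded - truncated

print("truncated:", truncated.tolist())
print("rounded:  ", rounded.tolist())
print("diff:     ", diff.tolist())
```

For non-negative coordinates the two results differ by at most one pixel, which is usually negligible for cropping text regions but is enough to make an exact comparison of box coordinates fail.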
