Research on Collaborative Optimization of OCR Semantic Correction under Adversarial Interference: An Empirical Study on Multimodal Enhancement Based on PaddleOCR and BERT
DOI: https://doi.org/10.61173/8b950109
Keywords: BERT, OCR-Corrector, PaddleOCR, OCR semantic correction
Abstract
In OCR of complex scenes, semantic deviations caused by local interference seriously threaten system reliability. This study constructs a correction framework that integrates PaddleOCR and BERT. Controlled interference experiments (semi-transparent occlusion plus stroke misdirection) generate 15 multi-scene samples, which are used to verify how effectively the two components work together. In the experiment, the character-level accuracy of the original OCR output drops to 26.67% (4/15) under interference and rises to 60.00% (9/15) after semantic correction, an absolute gain of 33.33 percentage points, with no negative corrections (no output that was already correct is made wrong). Typical successful cases include correction of visually similar characters (e.g., "生减" → "生成") and semantic completion (e.g., "如暑" → "如果"). Limitations remain for connected characters ("自自" is not corrected) and domain-specific terms ("¥5,000.00" is not standardized). The study proposes a multimodal enhancement strategy and open-sources a toolchain that supports parameterized adjustment of the interference, providing a reproducible robustness benchmark for industrial scenarios.
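The reported figures follow from simple counting, and a minimal sketch may make the arithmetic explicit. The names `Sample`, `evaluate`, and `placeholder_samples` are hypothetical and are not taken from the paper's toolchain. The sketch also assumes that a sample counts as correct only on an exact match with the ground truth, which the abstract does not state. The strings are synthetic placeholders that reproduce only the published counts (4/15 before correction, 9/15 after), not the study's actual data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    truth: str       # ground-truth text
    ocr: str         # raw OCR output under interference
    corrected: str   # output after semantic correction


def evaluate(samples):
    """Return accuracy before/after correction, the gain, and negative corrections."""
    n = len(samples)
    if n == 0:
        raise ValueError("at least one sample is required")
    ocr_ok = sum(s.ocr == s.truth for s in samples)
    corrected_ok = sum(s.corrected == s.truth for s in samples)
    # A negative correction turns an already-correct OCR result into a wrong one.
    negative = sum(s.ocr == s.truth and s.corrected != s.truth for s in samples)
    return {
        "ocr_accuracy": ocr_ok / n,
        "corrected_accuracy": corrected_ok / n,
        "gain_points": 100 * (corrected_ok - ocr_ok) / n,
        "negative_corrections": negative,
    }


def placeholder_samples(n_total=15, n_ocr_ok=4, n_corrected_ok=9):
    """Build synthetic samples matching the reported counts (assumption: not real data)."""
    samples = []
    for i in range(n_total):
        if i < n_ocr_ok:
            samples.append(Sample("ok", "ok", "ok"))    # correct before and after
        elif i < n_corrected_ok:
            samples.append(Sample("ok", "xx", "ok"))    # repaired by correction
        else:
            samples.append(Sample("ok", "xx", "xx"))    # still wrong after correction
    return samples


if __name__ == "__main__":
    result = evaluate(placeholder_samples())
    print(f"OCR accuracy:         {100 * result['ocr_accuracy']:.2f}%")
    print(f"Corrected accuracy:   {100 * result['corrected_accuracy']:.2f}%")
    print(f"Absolute gain:        {result['gain_points']:.2f} points")
    print(f"Negative corrections: {result['negative_corrections']}")
```

Counting negative corrections separately matters because a corrector could raise overall accuracy while still damaging some outputs that were already right; the net gain alone would hide that.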