DAHSF: An Algorithm for Sequence Parsing for Specific Scenarios and Lightweight Deployment


Full Paper: https://alphaxiv.org/pdf/2412.14054

Project Link: https://blog.csdn.net/m0_62984100/article/details/140054725

GitHub: https://github.com/Magic-Abracadabra/DAHSF/blob/main/DAHSF.pdf

Digestion Algorithm in Hierarchical Symbolic Forests: A Fast Text Normalization Algorithm and Semantic Parsing Framework for Specific Scenarios and Lightweight Deployment

Abstract

Text Normalization and Semantic Parsing have numerous applications in natural language processing, such as natural language programming, paraphrasing, data augmentation, constructing expert systems, and text matching. Despite the prominent achievements of deep learning in Large Language Models (LLMs), the interpretability of neural network architectures is still poor, which undermines their credibility and hence limits their deployment in risk-sensitive scenarios. In scenario-specific domains with scarce data, rapidly obtaining a large number of supervised learning labels is challenging, and the workload of manually labeling data would be enormous. Catastrophic forgetting in neural networks further leads to low data utilization rates. In situations where swift responses are vital, the density of such models makes local deployment difficult and response times long, which is not conducive to local applications in these fields. Inspired by the multiplication rule, a principle of combinatorial mathematics, and by human thinking patterns, a multilayer framework along with its algorithm, the Digestion Algorithm in Hierarchical Symbolic Forests (DAHSF), is proposed to address these issues by combining the text normalization and semantic parsing workflows. The Chinese scripting language "Fire Bunny Intelligent Development Platform V2.0" is an important test and application of the technology discussed in this paper. DAHSF can run locally in scenario-specific domains on small datasets, with model size and memory usage reduced by at least two orders of magnitude, which improves execution speed and leaves promising room for further optimization.

Keywords: Text Normalization; Semantic Parsing; Lightweight Deployment; Fast Algorithm; Scenario-Specific

Important Note: This framework can be generalized to any sequence, including text.
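
The abstract names the multiplication rule from combinatorics as one inspiration for the framework. The sketch below is only a toy illustration of that rule, not the DAHSF algorithm itself: the slot tables, the canonical symbols, and the function names `normalize` and `coverage` are all hypothetical and do not come from the paper. It shows how a small table of per-slot variants (whose size grows as a sum) can cover a much larger number of surface sentences (which grows as a product), all normalized to one canonical form.

```python
from itertools import product
from math import prod

# Hypothetical toy vocabulary (not from the paper): each slot maps
# surface variants to a single canonical symbol.
SLOTS = [
    {"please open": "OPEN", "open": "OPEN", "launch": "OPEN"},
    {"the file": "FILE", "the document": "FILE"},
    {"now": "NOW", "immediately": "NOW", "right away": "NOW", "at once": "NOW"},
]


def normalize(phrases):
    """Map one surface phrase per slot to its canonical symbol."""
    return tuple(slot[phrase] for slot, phrase in zip(SLOTS, phrases))


def coverage():
    """Return (entries stored, distinct surface sentences covered)."""
    stored = sum(len(slot) for slot in SLOTS)    # grows additively
    covered = prod(len(slot) for slot in SLOTS)  # multiplication rule
    return stored, covered


stored, covered = coverage()

# Enumerate every combination of surface phrases and normalize each one.
all_forms = {normalize(combo) for combo in product(*SLOTS)}

print(f"{stored} stored entries cover {covered} surface sentences")
print(f"distinct canonical forms: {len(all_forms)}")
```

With these made-up tables, 3 + 2 + 4 = 9 stored entries cover 3 × 2 × 4 = 24 surface sentences, and every one of them normalizes to the same canonical tuple. This gap between additive storage and multiplicative coverage is one plausible reading of why a layered symbolic approach could stay small on scenario-specific data; the paper itself should be consulted for how DAHSF actually structures its layers.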
