A Piece of Code: Solving a Complex Translation Problem (To Be Continued)

This article is an original work by 大侠 (AhcaoZhu); please credit the source when reposting.
Link: https://blog.csdn.net/Ahcao2008


Abstract

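  • This article presents a piece of Python code that parses the __doc__ of Python functions, modules, classes, and so on, and calls an AI to translate it in full. The goal is output that is right in both content and formatting.
  • The translation step itself was implemented long ago. Parsing the English text, however, turns out to be genuinely complicated.
  • So I am using AI to help with this fairly complex parsing. (To be continued)
  • [Original work: AhcaoZhu 大侠]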
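
To make the idea in the abstract concrete, here is a minimal sketch of the overall flow: read an object's docstring, split it into paragraphs, and pass each paragraph to a translator. It is not the code from this post. The names doc_paragraphs, translate_doc and fake_translate are made up for illustration, and fake_translate is only a stand-in for a real AI translation call, which is not shown here.

```python
import inspect
from typing import Callable, List


def doc_paragraphs(obj) -> List[str]:
    """Return obj's cleaned docstring split into blank-line-separated paragraphs."""
    # inspect.getdoc removes the common indentation and returns None if there is no docstring.
    doc = inspect.getdoc(obj) or ""
    paragraphs = []
    current = []
    for line in doc.splitlines():
        if line.strip():
            current.append(line)
        elif current:
            # A blank line closes the paragraph collected so far.
            paragraphs.append("\n".join(current))
            current = []
    if current:
        paragraphs.append("\n".join(current))
    return paragraphs


def translate_doc(obj, translate: Callable[[str], str]) -> str:
    """Translate each paragraph with the supplied callable and rejoin them."""
    return "\n\n".join(translate(p) for p in doc_paragraphs(obj))


def fake_translate(text: str) -> str:
    # Hypothetical placeholder: a real version would call a translation service.
    return "[zh] " + text


if __name__ == "__main__":
    print(translate_doc(len, fake_translate))
```

Passing the translator in as a callable keeps the parsing testable without any network access, which is the part this post is actually about.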

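I was not well prepared today, or rather, I have spent the past few days working on something bigger, so I really have nothing ready to write about. I do not want to break my posting streak, though, so here is a piece of code that is not finished yet.
I wrote it together with an AI.
Surprisingly, it made it through debugging.
The process was quite painful, though: not efficient at all.
It will have to do for now.
This is a throwaway post, and I am honestly embarrassed. Apologies to my followers; feel free to skip it.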

#!/usr/bin/env python3
# -*- coding: utf-8 -*-

import re

_doc = '''
Executes a function after this request.  This is useful to modify
    response objects.  The function is passed the response object and has
    to return the same or a new one.
Executes a function after this request.  This is useful to modify
    response objects.  The function is passed the response object and has
    to return the same or a new one.    
Executes a function after this request.  This is useful to modify
    response objects.  The function is passed the response object and has
    to return the same or a new one.
Executes a function after this request.  This is useful to modify
    response objects.  The function is passed the response object and has
    to return the same or a new one.  
    
    Example::

        @app.route('/')
        def index():
            @after_this_request
            def add_header(response):
                response.headers['X-Foo'] = 'Parachute'
                return response
            return 'Hello World!'

    This is more useful if a function other than the view function wants to
    modify a response.  For instance think of a decorator that wants to add
    some headers without converting the return value into a response object.

    .. versionadded:: 0.9
    
'''


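def parsedoc(str1: str) -> list:
    """Split a docstring into paragraphs separated by blank lines."""
    doc_strings = []
    current_doc = ""
    for line in str1.split("\n"):
        # Note: stripping discards indentation, so code examples lose their layout.
        line = line.strip()
        if line == "":
            # A blank line closes the current paragraph, if there is one.
            if current_doc != "":
                doc_strings.append(current_doc)
                current_doc = ""
        else:
            current_doc += line + "\n"
    # Flush the last paragraph when the text does not end with a blank line.
    if current_doc != "":
        doc_strings.append(current_doc)
    return doc_strings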


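def split_paragraph(text):
    # Maximum chunk length and the punctuation marks a chunk may end on.
    limit = 500
    breakpoints = '.!?:,'
    # Emit chunks until the remainder fits within the limit.
    while len(text) > limit:
        # Search backwards from the limit for the nearest punctuation mark.
        index = limit - 1
        while index > 0 and text[index] not in breakpoints:
            index -= 1
        # No punctuation in range: fall back to a hard cut at the limit.
        if index == 0:
            index = limit - 1
        # Yield the chunk, including its closing punctuation mark.
        yield text[:index + 1] + '\n'
        # Continue with the rest of the text.
        text = text[index + 1:]
    # Yield the final part, which is no longer than the limit.
    if len(text) > 0:
        yield text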


def parsestr(str1: str) -> list:
    # Decide what kind of block this is.
    if str1.startswith('```'):      # Case 2: code block (placeholder for now)
        return ['Code']
    elif str1.startswith('['):      # Case 3: special paragraph (placeholder for now)
        return ['Special Paragraph']
    else:                           # Case 1: ordinary prose
        # Split long prose into chunks of roughly 500 characters.
        return list(split_paragraph(str1))


def main():
    # Split the sample docstring into paragraphs, then each paragraph into chunks.
    li_out1 = []
    for doc in parsedoc(_doc):
        li_out1.extend(parsestr(doc))

    # Print every chunk followed by a separator line.
    for doc in li_out1:
        print(doc)
        print('--------')


if __name__ == '__main__':
    main()
