Clipboard operations in Python on Windows with win32clipboard

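While translating papers recently, I found that copied text often carries a lot of line breaks. To strip them conveniently, I wrote a small Python function.
The code is as follows (for Python 3):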

import win32clipboard as wc
import win32con
import sys

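def stripClipboard():
    # Open the clipboard; it must be closed again before we return
    wc.OpenClipboard()
    try:
        # Try to read the clipboard content as Unicode text.
        # If a file was copied rather than text, the read raises an
        # exception, so error handling is needed.
        try:
            txt = wc.GetClipboardData(win32con.CF_UNICODETEXT)
        except Exception:
            print("Clipboard content is not text; cannot remove line breaks.")
            wc.EmptyClipboard()
            sys.exit("Clipboard emptied; exiting.")
        # Split the string into lines
        lines = str(txt).strip().splitlines()
        # Number of line breaks removed (0 if the text was empty)
        n = max(len(lines) - 1, 0)
        # Join the lines with spaces, then collapse every run of
        # whitespace into a single space
        txt = ' '.join(' '.join(lines).split())
        # Empty the clipboard
        wc.EmptyClipboard()
        # Put the processed text on the clipboard. Note the format is
        # win32con.CF_UNICODETEXT; with win32con.CF_TEXT, txt would have
        # to be encoded first, otherwise it comes out garbled.
        wc.SetClipboardData(win32con.CF_UNICODETEXT, txt)
    finally:
        # Always close the clipboard, even on error or sys.exit
        wc.CloseClipboard()
    print('Removed {} line breaks\n'.format(n))
    print(txt + '\n')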

if __name__ == '__main__':
    stripClipboard()
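
The text-cleaning step does not depend on the clipboard, so it can be pulled out into a pure function and tried on any platform. The following is a minimal sketch of that idea; the name strip_line_breaks and its tuple return value are hypothetical and not part of the original script, and the count is clamped at 0 for empty input as an added assumption.

```python
def strip_line_breaks(text):
    """Return (cleaned_text, removed), where removed is the number of
    line breaks dropped. Hypothetical helper, not in the original script."""
    # Trim the ends, then split on any line-break style (\n, \r\n, ...)
    lines = text.strip().splitlines()
    # Assumption: report 0 rather than -1 when the text is empty
    removed = max(len(lines) - 1, 0)
    # Join lines with spaces and collapse whitespace runs to one space
    cleaned = ' '.join(' '.join(lines).split())
    return cleaned, removed


if __name__ == '__main__':
    sample = "of a target sequence given a source\nsequence.  The performance"
    cleaned, removed = strip_line_breaks(sample)
    print(cleaned)
    print(removed)
```

With a helper like this, the clipboard function would only need to read the text, call the helper, and write the result back.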

Save the file as "stripClipboard.py". After copying text from a paper, run python stripClipboard.py and then press Ctrl+V to paste the text with all line breaks removed.

For example, suppose the following text was copied (Ctrl+C) from a paper:

In this paper, we propose a novel neural
network model called RNN Encoder–
Decoder that consists of two recurrent
neural networks (RNN). One RNN encodes
a sequence of symbols into a fixedlength
vector representation, and the other
decodes the representation into another sequence
of symbols. The encoder and decoder
of the proposed model are jointly
trained to maximize the conditional probability
of a target sequence given a source
sequence. The performance of a statistical
machine translation system is empirically
found to improve by using the conditional
probabilities of phrase pairs computed
by the RNN Encoder–Decoder as an
additional feature in the existing log-linear
model. Qualitatively, we show that the
proposed model learns a semantically and
syntactically meaningful representation of
linguistic phrases.

After running python stripClipboard.py, the clipboard content becomes:

In this paper, we propose a novel neural network model called RNN Encoder– Decoder that consists of two recurrent neural networks (RNN). One RNN encodes a sequence of symbols into a fixedlength vector representation, and the other decodes the representation into another sequence of symbols. The encoder and decoder of the proposed model are jointly trained to maximize the conditional probability of a target sequence given a source sequence. The performance of a statistical machine translation system is empirically found to improve by using the conditional probabilities of phrase pairs computed by the RNN Encoder–Decoder as an additional feature in the existing log-linear model. Qualitatively, we show that the proposed model learns a semantically and syntactically meaningful representation of linguistic phrases.

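Note 1: win32clipboard is part of pywin32, which can be installed with pip install -U pywin32.
Note 2: Clipboard operations in Python on Mac will be covered in a separate article.
Note 3: Chinese characters may come out garbled. If that happens, the likely cause is that Chinese-language Windows defaults to GBK encoding; try setting the encoding to UTF-8.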
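
To see why an encoding mismatch garbles Chinese text, the sketch below encodes a string with one codec and decodes the bytes with another. It is plain Python and does not touch the clipboard; the helper name roundtrip and the sample string are hypothetical, chosen only for illustration.

```python
def roundtrip(text, encode_with, decode_with):
    """Encode text with one codec, then decode the bytes with another.
    Undecodable bytes are replaced instead of raising an error."""
    return text.encode(encode_with).decode(decode_with, errors='replace')


if __name__ == '__main__':
    original = '中文 text'
    # Same codec on both sides: the text survives intact
    print(roundtrip(original, 'gbk', 'gbk') == original)    # True
    # UTF-8 bytes read back as GBK: the Chinese part is garbled
    print(roundtrip(original, 'utf-8', 'gbk') == original)  # False
```

ASCII characters are encoded identically in both codecs, which is why only the non-ASCII part of a string is affected.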
