<div class="col-sm-1 col-lg-offset-1"> </div>
<div class="switch switch-large">
<p>Rules</p>
<div class="col-sm-1 col-lg-offset-1"></div>
<input type="checkbox" checked />
</div>
Attempts:
para.style.cssText = "border:1px; border-style:solid; border-color:black"; // did not work
var p = document.createElement(""); // did not work
p.setAttribute("style", "font-size:15px; color:blueviolet; font-weight:bold"); // worked
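with open('total_info.json', 'r', encoding='utf8') as fp:
    json_data = json.loads(fp)  # intended to convert the JSON object into a Python dict
    print(json_data)
json.loads() should not be used here, because it expects a string; to read from a file object, use json.load() instead.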
References:
https://blog.csdn.net/NOT_GUY/article/details/80954328
https://blog.csdn.net/m0_37052320/article/details/102558444
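The difference between the two functions can be sketched as follows. This is a minimal illustration, not the original script: the in-memory `io.StringIO` object and its sample content are assumptions standing in for the opened total_info.json file.

```python
import io
import json

# In-memory stand-in for an opened JSON file (hypothetical sample content).
fp = io.StringIO('{"content": "hello"}')

# json.load() reads and parses from a file-like object.
data_from_file = json.load(fp)

# json.loads() parses a string that already holds JSON text.
data_from_str = json.loads('{"content": "hello"}')

# Passing the file object itself to json.loads() raises TypeError.
fp.seek(0)
try:
    json.loads(fp)
    loads_error = None
except TypeError as exc:
    loads_error = exc
```

An equivalent fix is to keep json.loads() and pass it the text, as in json.loads(fp.read()).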
Error: json.decoder.JSONDecodeError: Invalid \escape
Fix:
The JSON file contained backslashes; after removing them, json.loads(str) worked.
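A minimal sketch of this fix, assuming a made-up sample line (the string `raw` below is hypothetical and not taken from total_info.json):

```python
import json

# Hypothetical sample line: "\q" is not a valid JSON escape sequence.
raw = r'{"content": "a\qb"}'

# Parsing the line as-is fails with "Invalid \escape".
try:
    json.loads(raw)
    error_message = None
except json.JSONDecodeError as exc:
    error_message = str(exc)

# Removing the backslashes lets the line parse.
# Caveat: this also strips valid escapes such as \n or \".
cleaned = json.loads(raw.replace("\\", ""))
```

Removing every backslash is a blunt approach. It is fine when the escaped characters do not matter for later processing, but it changes any text that relied on legitimate escapes.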
Implementation (the other four are similar):
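import json

# Read the JSON file line by line (one JSON object per line)
tmp = []
with open('total_info.json', 'r', encoding='utf8') as src:
    for line in src:
        tmp.append(json.loads(line))
# Append the 'content' field of every record to a text file
with open('./total_info.txt', 'a', encoding='utf-8') as f:
    for item in tmp:
        f.write(item['content'])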
1. Upload the generated sentences for each subspace to Google Drive
2. Mount Google Drive:
from google.colab import drive
drive.mount('/content/gdrive')
Change the working directory:
import os
os.chdir("/content/gdrive/My Drive/BERT-Keyword-Extractor-master/BERT-Keyword-Extractor-master")
Download the required packages:
import nltk
nltk.download('punkt')
!pip install seqeval
!pip install pytorch_pretrained_bert
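Run training:
Error:
Cause:
This error appears when running the code and is an argument type error: the arguments passed must be of type torch.LongTensor, which can be created with a = torch.LongTensor():
Modified code in BERT-Keyword Extractor.ipynb: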
def keywordextract(sentence):
    text = sentence
    tkns = tokenizer.tokenize(text)
    indexed_tokens = tokenizer.convert_tokens_to_ids(tkns)
    segments_ids = [0] * len(tkns)
    # Build the inputs as LongTensor to avoid the type error
    tokens_tensor = torch.LongTensor([indexed_tokens]).to(device)
    segments_tensors = torch.LongTensor([segments_ids]).to(device)
    model.eval()
    prediction = []
    logit = model(tokens_tensor, token_type_ids=None,
                  attention_mask=segments_tensors)
    logit = logit.detach().cpu().numpy()
    ans = []
    prediction.extend([list(p) for p in np.argmax(logit, axis=2)])
    # Collect the tokens whose predicted tag is 0 or 1
    for k, j in enumerate(prediction[0]):
        if j == 1 or j == 0:
            ans.append(tokenizer.convert_ids_to_tokens(tokens_tensor[0].to('cpu').numpy())[k])
    return ans
tr_inputs = torch.LongTensor(tr_inputs)
val_inputs = torch.LongTensor(val_inputs)
tr_tags = torch.LongTensor(tr_tags)
val_tags = torch.LongTensor(val_tags)
tr_masks = torch.tensor(tr_masks)
val_masks = torch.tensor(val_masks)
Feed the sentences of the five subspaces into this model for prediction:
Note that empty strings must be excluded:
with open('sentence_TextCNN.txt', 'r', encoding='utf-8') as f:
    with open('sentence_TextCNN_ans.txt', 'a', encoding='utf-8') as ff:
        lines = f.readlines()
        for i in range(len(lines)):
            # print(lines[i].strip())
            if lines[i].strip() != "":
                ans = keywordextract(lines[i].strip())
                for j in range(len(ans)):
                    ff.write(ans[j])
                    ff.write(" ")
                ff.write("\n")
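The empty-line filtering can be checked without loading the model. The sketch below is only an illustration under stated assumptions: `extract_stub` is a hypothetical stand-in for keywordextract(), `write_keywords` is a helper name introduced here, and in-memory streams replace the real files.

```python
import io


def extract_stub(sentence):
    # Hypothetical stand-in for keywordextract(): keeps words longer than 4 characters.
    return [word for word in sentence.split() if len(word) > 4]


def write_keywords(src, dst, extract):
    """Write one space-separated keyword line per non-empty input line."""
    for line in src:
        sentence = line.strip()
        if not sentence:  # skip empty or whitespace-only lines
            continue
        dst.write(" ".join(extract(sentence)) + "\n")


# Sample input with an empty line and a whitespace-only line.
src = io.StringIO("deep learning models\n\n   \nkeyword extraction\n")
dst = io.StringIO()
write_keywords(src, dst, extract_stub)
result = dst.getvalue()
```

Unlike the loop above, " ".join() leaves no trailing space at the end of each output line.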