Apollo2Mars

Projects Review - UKPLab/emnlp2017-bilstm-cnn-crf

UKPLab/emnlp2017-bilstm-cnn-crf
https://github.com/UKPLab/emnlp2017-bilstm-cnn-crf


In the following repository you can find a BiLSTM-CRF implementation used for Sequence Tagging, e.g. POS-tagging, Chunking, or Named Entity Recognition. The implementation is based on Keras 1.x and can be run with Theano (0.9.0) or TensorFlow (0.12.1) as backend.

The architecture is described in our papers Reporting Score Distributions Makes a Difference: Performance Study of LSTM-networks for Sequence Tagging and Optimal Hyperparameters for Deep LSTM-Networks for Sequence Labeling Tasks.

The hyperparameters of this network can be easily configured, so that you can re-run the systems proposed by Huang et al. (Bidirectional LSTM-CRF Models for Sequence Tagging), Ma and Hovy (End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF) and Lample et al. (Neural Architectures for Named Entity Recognition).

The implementation was optimized for performance by shuffling the training data so that sentences of the same length are grouped together. This speeds up training severalfold in comparison to the implementations by Ma or Lample.
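The length-grouping idea can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the function name `length_grouped_batches` and its parameters are hypothetical, and the actual implementation may bucket and shuffle differently.

```python
import random
from collections import defaultdict


def length_grouped_batches(sentences, batch_size, seed=None):
    """Return mini-batches in which every sentence has the same length."""
    rng = random.Random(seed)

    # Bucket sentences by their number of tokens.
    buckets = defaultdict(list)
    for sentence in sentences:
        buckets[len(sentence)].append(sentence)

    # Shuffle inside each bucket, then cut each bucket into batches.
    batches = []
    for same_length in buckets.values():
        rng.shuffle(same_length)
        for start in range(0, len(same_length), batch_size):
            batches.append(same_length[start:start + batch_size])

    # Shuffle the batch order so training does not run from short to long.
    rng.shuffle(batches)
    return batches


if __name__ == "__main__":
    # Toy sentences (hypothetical data, tokens only).
    toy = [["a"], ["b", "c"], ["d"], ["e", "f"], ["g", "h", "i"]]
    for batch in length_grouped_batches(toy, batch_size=2, seed=0):
        print(batch)
```

Because all sentences in a batch share one length, no padding is needed inside a batch, which is presumably where the speed-up comes from.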

The training of the network is simple and can easily be extended to new datasets and languages. For example, see Train_POS.py.

Pretrained models can be stored and loaded for inference.
Simply execute the following command:
python RunModel.py models/modelname.h5 input.txt
Pretrained models for some sequence tagging tasks using this LSTM-CRF implementation are provided in Pretrained Models.

This implementation can be used for Multi-Task Learning, i.e. learning several tasks simultaneously with non-overlapping datasets. The file Train_MultiTask.py shows an example of how the LSTM-CRF network can be used to learn POS-tagging and Chunking simultaneously. The number of tasks is not limited. Tasks can be supervised at the same level or at different output levels, for example, to re-implement the approach by Sogaard and Goldberg, Deep multi-task learning with low level tasks supervised at lower layers.

Citation
If you find the implementation useful, please cite the following paper: Reporting Score Distributions Makes a Difference: Performance Study of LSTM-networks for Sequence Tagging

@InProceedings{Reimers:2017:EMNLP,
author = {Reimers, Nils and Gurevych, Iryna},
title = {{Reporting Score Distributions Makes a Difference: Performance Study of LSTM-networks for Sequence Tagging}},
booktitle = {Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
month = {09},
year = {2017},
address = {Copenhagen, Denmark},
pages = {(to-appear)},
url = {https://arxiv.org/abs/1707.09861}
}

Abstract:

In this paper we show that reporting a single performance score is insufficient to compare non-deterministic approaches. We demonstrate for common sequence tagging tasks that the seed value for the random number generator can result in statistically significant (p < 10^{-4}) differences for state-of-the-art systems. For two recent systems for NER, we observe an absolute difference of one percentage point F1-score depending on the selected seed value, making these systems perceived either as state-of-the-art or mediocre. Instead of publishing and reporting single performance scores, we propose to compare score distributions based on multiple executions. (::multiple parameters) Based on the evaluation of 50,000 LSTM-networks for five sequence tagging tasks, we present network architectures that perform superior as well as produce results with higher stability on unseen data.
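To make the idea of comparing score distributions concrete, here is a small sketch that summarizes the scores of several runs that differ only in their seed. The helper `summarize_scores` is hypothetical, and the numbers in the usage part are placeholders, not results from the paper.

```python
import statistics


def summarize_scores(scores):
    """Summarize test scores obtained from runs with different seeds."""
    ordered = sorted(scores)
    return {
        "runs": len(ordered),
        "min": ordered[0],
        "median": statistics.median(ordered),
        "max": ordered[-1],
        "mean": statistics.mean(ordered),
        # The sample standard deviation needs at least two runs.
        "stdev": statistics.stdev(ordered) if len(ordered) > 1 else 0.0,
    }


if __name__ == "__main__":
    # Placeholder F1 scores for two imaginary systems, one entry per seed.
    system_a = [90.1, 90.6, 90.9, 91.0, 91.3]
    system_b = [90.4, 90.5, 90.7, 90.8, 90.9]
    for name, scores in (("A", system_a), ("B", system_b)):
        print(name, summarize_scores(scores))
```

Reporting the spread alongside the mean or median shows whether a difference between two systems is larger than the variation caused by the seed alone.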

Contact person: Nils Reimers, [email protected]

https://www.ukp.tu-darmstadt.de/

https://www.tu-darmstadt.de/

Don’t hesitate to send us an e-mail or report an issue, if something is broken (and it shouldn’t be) or if you have further questions.

This repository contains experimental software and is published for the sole purpose of giving additional background details on the respective publication.

Setup
In order to run the code, you need either Python 2.7 or Python 3.6, Keras 1.2.x and, as backend, either Theano 0.9.0 or TensorFlow 0.12.1. Note that at the moment the code cannot be run with Keras 2.x or TensorFlow 1.x!

If you want to use the character based word representations (charEmbeddings), you have to use the Theano backend(::why?). You can change this for Keras in the home folder in the file: .keras/keras.json by setting the option backend to theano. Another option is to set an environment variable:

export KERAS_BACKEND=theano
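
The same environment variable can also be set from inside a Python script, as long as this happens before Keras is imported for the first time. A minimal sketch, assuming nothing in the process has imported Keras yet:

```python
import os

# Keras reads KERAS_BACKEND when it is first imported,
# so this line has to run before any `import keras`.
os.environ["KERAS_BACKEND"] = "theano"

print("Selected Keras backend:", os.environ["KERAS_BACKEND"])
```
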
Setup with virtual environment
Set up a Python virtual environment (optional):

virtualenv .env
source .env/bin/activate

Install the requirements:

.env/bin/pip install -r requirements.txt

If everything works well, you can run python Train_POS.py to train a deep POS-tagger for the POS-tagset from universal dependencies.

Setup with docker
See the docker-folder for more information on how to run these scripts in a docker container.

Training
Training new models is simple. Look at Train_POS.py and Train_Chunking.py for examples.

Place new datasets in the folder data. The system expects three files, train.txt, dev.txt and test.txt, in a CoNLL format, i.e. each token is on a new line and the columns are separated by whitespace (either a space or a tab). Sentences are separated by an empty line.

For an example look at data/conll2000_chunking/train.txt. Files with multiple columns, like data/unidep_pos/train.txt are no problem, as we will specify later which columns should be used for training.
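The file format described above can be read with a few lines of Python. The following is only a sketch; `read_conll` is a hypothetical helper (not the repository's reader), and the sample lines are made up for illustration.

```python
def read_conll(lines):
    """Split CoNLL-style lines into sentences; each token is a list of columns."""
    sentences, current = [], []
    for line in lines:
        line = line.strip()
        if not line:
            # An empty line closes the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        # split() without arguments accepts both spaces and tabs.
        current.append(line.split())
    if current:
        sentences.append(current)
    return sentences


if __name__ == "__main__":
    # Made-up sample in a three-column layout (token, POS, chunk).
    sample = ["The DT B-NP", "cat\tNN\tI-NP", "", "It PRP B-NP", "sleeps VBZ B-VP"]
    for sentence in read_conll(sample):
        print(sentence)
```

To read a real file, pass an open file object, e.g. `read_conll(open("data/conll2000_chunking/train.txt"))`.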

To train an LSTM-network, you must specify the following lines of code (Train_POS.py):

datasetName = 'unidep_pos'
dataColumns = {1:'tokens', 3:'POS'} #Tab separated columns, column 1 contains the token, 3 the universal POS tag
labelKey = 'POS'

embeddingsPath = 'levy_deps.words' #Word embeddings by Levy et al: https://levyomer.wordpress.com/2014/04/25/dependency-based-word-embeddings/

datasetName defines the name of the dataset, here it will use the data in the folder data/unidep_pos. dataColumns specifies the columns that should be read from the CoNLL file. Counting starts at 0, so in this case columns 1 and 3 (the second and fourth columns of the file) are read: column 1 contains the tokens and column 3 the POS tag. Note that we must always specify a 'tokens' column. The other columns can be named arbitrarily.

labelKey will specify which column should serve as label, in this case we want to perform POS-tagging. The name must match with the name specified in the dictionary dataColumns.
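How dataColumns and labelKey interact can be illustrated with a short sketch. `select_columns` is a hypothetical helper, not part of the repository; the two token lines are taken from the corpus sample further down this page. With 0-based indices, key 1 selects the word form and key 3 the universal POS tag.

```python
def select_columns(sentence, data_columns):
    """Turn one sentence (a list of column lists) into {name: [values]}."""
    return {
        name: [token[index] for token in sentence]
        for index, name in data_columns.items()
    }


if __name__ == "__main__":
    dataColumns = {1: 'tokens', 3: 'POS'}
    labelKey = 'POS'

    # Two tokens from the universal dependencies sample, split on whitespace.
    sentence = [
        "1 DPA DPA PROPN NNP Number=Sing 0 root _ SpaceAfter=No".split(),
        "2 : : PUNCT : _ 1 punct _ _".split(),
    ]

    columns = select_columns(sentence, dataColumns)
    print("tokens:", columns['tokens'])
    print("labels:", columns[labelKey])
```
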

embeddingsPath contains the path to pre-trained word embeddings. The format for this must be text-based, i.e. each line contains the embedding for a word and the first column in that line is the word, followed by the dense vector. Our script will automatically download the embeddings by Levy et al. if they are not present.
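A reader for this text-based embedding format might look as follows. This is a sketch under the assumption that the word and the vector entries are separated by whitespace; `load_text_embeddings` is a hypothetical helper and the two sample lines are invented.

```python
def load_text_embeddings(lines):
    """Parse text embeddings: one word per line, followed by its dense vector."""
    embeddings = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # skip empty or malformed lines
        word, values = parts[0], parts[1:]
        embeddings[word] = [float(value) for value in values]
    return embeddings


if __name__ == "__main__":
    # Invented three-dimensional vectors, for illustration only.
    sample = ["the 0.1 -0.2 0.3", "cat 0.4 0.5 -0.6"]
    vectors = load_text_embeddings(sample)
    print(len(vectors), "words with dimension", len(vectors["the"]))
```
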

If you want to perform chunking instead of POS-tagging, simply change the first lines (Train_Chunking.py):

datasetName = 'conll2000_chunking'
dataColumns = {0:'tokens', 1:'POS', 2:'chunk_BIO'} #Tab separated columns, column 0 contains the token, 1 the POS, 2 the chunk information using a BIO encoding
labelKey = 'chunk_BIO'

Note: By appending a _BIO to a column name, we indicate that this column is BIO encoded. The system will then compute the F1-score instead of the accuracy.
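
The reason for scoring BIO-encoded columns with F1 is that the unit of interest is a whole chunk, not a single token. A rough sketch of a chunk-level F1 is given below; `bio_to_spans` and `span_f1` are hypothetical helpers, and the repository's own evaluation may treat ill-formed tag sequences differently.

```python
def bio_to_spans(tags):
    """Convert a BIO tag sequence into a set of (start, end, type) spans."""
    spans, start, label = set(), None, None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes the last span
        inside = tag.startswith("I-") and label == tag[2:]
        if start is not None and not inside:
            spans.add((start, i, label))
            start, label = None, None
        # A B- tag opens a span; a stray I- tag is treated leniently as an opener.
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, label = i, tag[2:]
    return spans


def span_f1(gold_tags, predicted_tags):
    """F1 over exactly matching spans (same boundaries and same type)."""
    gold, pred = bio_to_spans(gold_tags), bio_to_spans(predicted_tags)
    correct = len(gold & pred)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    gold = ["B-NP", "I-NP", "O", "B-VP"]
    pred = ["B-NP", "I-NP", "O", "O"]
    print(span_f1(gold, pred))
```
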

Running a stored model
If enabled during the training process, models are stored to the 'models' folder. Those models can be loaded and used to tag new data. An example is implemented in RunModel.py:

python RunModel.py models/modelname.h5 input.txt

This script will read the model models/modelname.h5 as well as the text file input.txt. The text will be split into sentences and tokenized using NLTK. The tagged output will be written in a CoNLL format to standard out.
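
Writing tagged sentences in a CoNLL-like layout (one token per line, an empty line between sentences) could look like the sketch below. `write_conll` is a hypothetical helper; the exact column layout that RunModel.py prints is not specified here, so a simple token/tag pair separated by a tab is assumed.

```python
import io
import sys


def write_conll(tagged_sentences, out=sys.stdout):
    """Write (tokens, tags) pairs as one token per line, blank line per sentence."""
    for tokens, tags in tagged_sentences:
        for token, tag in zip(tokens, tags):
            out.write("%s\t%s\n" % (token, tag))
        out.write("\n")


if __name__ == "__main__":
    # Hypothetical tagger output for one short sentence.
    buffer = io.StringIO()
    write_conll([(["Hello", "world"], ["INTJ", "NOUN"])], out=buffer)
    print(buffer.getvalue(), end="")
```
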

Multi-Task-Learning
The class neuralnets/MultiTaskLSTM.py implements a Multi-Task-Learning setup using LSTM. The code and parameters are similar to the Single-Task setup.

The file Train_MultiTask.py contains an example of how to run the code. There, we define which datasets should be used:

posName = 'unidep_pos'
posColumns = {1:'tokens', 3:'POS'}
chunkingName = 'conll2000_chunking'
chunkingColumns = {0:'tokens', 1:'POS', 2:'chunk_BIO'}


datasetFiles = [
        (posName, posColumns),
        (chunkingName, chunkingColumns)
    ]
datasetTuples = {
    'POS': (posData, 'POS', True),
    'Chunking': (chunkingData, 'chunk_BIO', True)
    }

As before, we define the dataset names with the column names and store this information in the datasetFiles array. The dictionary datasetTuples contains, for each task, the preprocessed dataset (posData and chunkingData), the column we want to use as label (POS and chunk_BIO), and a boolean parameter that defines whether this dataset should be evaluated. If it is set to False, no performance scores will be printed for this dataset(::why?).

LSTM-Hyperparameters
The parameters in the LSTM-CRF network can be configured by passing a parameter-dictionary to the BiLSTM-constructor: BiLSTM(params).

The following parameters exist:

miniBatchSize: Size (number of sentences) for mini-batch training. Default value: 32
dropout: Set to 0 for no dropout. For naive dropout, set it to a real value between 0 and 1. For variational dropout, set it to a two-element tuple or list, with the first entry corresponding to the output dropout and the second entry to the recurrent dropout. Default value: [0.25, 0.25]
classifier: Set to Softmax to use a softmax classifier or to CRF to use a CRF-classifier as the last layer of the network. Default value: Softmax
LSTM-Size: List of integers with the number of recurrent units for the stacked LSTM-network. The list [100,75,50] would create 3 stacked BiLSTM-layers with 100, 75, and 50 recurrent units. Default value: [100]
optimizer: Available optimizers: SGD, AdaGrad, AdaDelta, RMSProp, Adam, Nadam. Default value: nadam
earlyStopping: Early stopping after a certain number of epochs if no improvement on the development set was achieved. Default value: 5
addFeatureDimensions: Dimension for additional features that are passed to the network. Default value: 10
charEmbeddings: Available options: [None, 'CNN', 'LSTM']. If set to None, no character-based representations will be used. With CNN, the approach by Ma & Hovy using a CNN will be used. With LSTM, an LSTM network will be used to derive the character-based representation (Lample et al.). Default value: None. Note that charEmbeddings only works with Theano as backend.
charEmbeddingsSize: The dimension for characters, if the character-based representation is enabled. Default value: 30
charFilterSize: If the CNN approach is used, this parameter defines the filter size, i.e. the output dimension of the convolution. Default: 30
charFilterLength: If the CNN approach is used, this parameter defines the filter length. Default: 3
charLSTMSize: If the LSTM approach is used, this parameter defines the size of the recurrent units. Default: 25
clipvalue: If non-zero, the gradient will be clipped to this value. Default: 0
clipnorm: If non-zero, the norm of the gradient will be normalized to this value. Default: 1

For the MultiTaskLSTM.py-network, the following additional parameter exists:

customClassifier: A dictionary that maps each dataset to an individual classifier. For example, the POS tag could use a Softmax-classifier, while the Chunking dataset is trained with a CRF-classifier.
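
A parameter dictionary for BiLSTM(params) can be assembled as in the sketch below. The keys and default values are copied from the list above; `DEFAULT_PARAMS` and `build_params` are hypothetical helpers for illustration, and the network itself presumably applies its own defaults, so passing only the overrides may be enough in practice.

```python
# Defaults as listed in the parameter description above.
DEFAULT_PARAMS = {
    "miniBatchSize": 32,
    "dropout": [0.25, 0.25],
    "classifier": "Softmax",
    "LSTM-Size": [100],
    "optimizer": "nadam",
    "earlyStopping": 5,
    "addFeatureDimensions": 10,
    "charEmbeddings": None,
    "charEmbeddingsSize": 30,
    "charFilterSize": 30,
    "charFilterLength": 3,
    "charLSTMSize": 25,
    "clipvalue": 0,
    "clipnorm": 1,
}


def build_params(overrides):
    """Merge user overrides into the defaults and reject misspelled keys."""
    unknown = set(overrides) - set(DEFAULT_PARAMS)
    if unknown:
        raise KeyError("Unknown parameter(s): " + ", ".join(sorted(unknown)))
    params = dict(DEFAULT_PARAMS)
    params.update(overrides)
    return params


if __name__ == "__main__":
    # Three stacked BiLSTM layers with a CRF as the last layer.
    params = build_params({"classifier": "CRF", "LSTM-Size": [100, 75, 50]})
    print(params)
    # The resulting dictionary would then be passed on: BiLSTM(params)
```

Rejecting unknown keys early guards against typos such as "LSTMSize", which would otherwise be ignored silently.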

Acknowledgments
This code uses the CRF implementation by Philipp Gross from the Keras Pull Request #4621. Thank you for contributing this to the community.

Train_POS.py
corpus

1   [   [   PUNCT   -LRB-   _   10  punct   _   SpaceAfter=No
2   This    this    DET DT  Number=Sing|PronType=Dem    3   det _   _
3   killing killing NOUN    NN  Number=Sing 10  nsubj   _   _
4   of  of  ADP IN  _   7   case    _   _
5   a   a   DET DT  Definite=Ind|PronType=Art   7   det _   _
6   respected   respected   ADJ JJ  Degree=Pos  7   amod    _   _
7   cleric  cleric  NOUN    NN  Number=Sing 3   nmod    _   _
8   will    will    AUX MD  VerbForm=Fin    10  aux _   _
9   be  be  AUX VB  VerbForm=Inf    10  aux _   _
10  causing cause   VERB    VBG VerbForm=Ger    0   root    _   _
11  us  we  PRON    PRP Case=Acc|Number=Plur|Person=1|PronType=Prs  10  iobj    _   _
12  trouble trouble NOUN    NN  Number=Sing 10  dobj    _   _
13  for for ADP IN  _   14  case    _   _
14  years   year    NOUN    NNS Number=Plur 10  nmod    _   _
15  to  to  PART    TO  _   16  mark    _   _
16  come    come    VERB    VB  VerbForm=Inf    14  acl _   SpaceAfter=No
17  .   .   PUNCT   .   _   10  punct   _   SpaceAfter=No
18  ]   ]   PUNCT   -RRB-   _   10  punct   _   _

1   DPA DPA PROPN   NNP Number=Sing 0   root    _   SpaceAfter=No
2   :   :   PUNCT   :   _   1   punct   _   _
3   Iraqi   iraqi   ADJ JJ  Degree=Pos  4   amod    _   _
4   authorities authority   NOUN    NNS Number=Plur 5   nsubj   _   _
5   announced   announce    VERB    VBD Mood=Ind|Tense=Past|VerbForm=Fin    1   parataxis   _   _
6   that    that    SCONJ   IN  _   9   mark    _   _
7   they    they    PRON    PRP Case=Nom|Number=Plur|Person=3|PronType=Prs  9   nsubj   _   _
8   had have    AUX VBD Mood=Ind|Tense=Past|VerbForm=Fin    9   aux _   _
9   busted  bust    VERB    VBN Tense=Past|VerbForm=Part    5   ccomp   _   _
10  up  up  ADP RP  _   9   compound:prt    _   _
11  3   3   NUM CD  NumType=Card    13  nummod  _   _
12  terrorist   terrorist   ADJ JJ  Degree=Pos  13  amod    _   _
13  cells   cell    NOUN    NNS Number=Plur 9   dobj    _   _
14  operating   operate VERB    VBG VerbForm=Ger    13  acl _   _
15  in  in  ADP IN  _   16  case    _   _
16  Baghdad Baghdad PROPN   NNP Number=Sing 14  nmod    _   SpaceAfter=No
17  .   .   PUNCT   .   _   1   punct   _   _
