Gensim Official API

Click the official link for full details.

The modules in the official API reference are listed below:

interfaces – Core gensim interfaces

utils – Various utility functions

matutils – Math utils

corpora.bleicorpus – Corpus in Blei’s LDA-C format

corpora.csvcorpus – Corpus in CSV format

corpora.dictionary – Construct word<->id mappings

corpora.hashdictionary – Construct word<->id mappings

corpora.indexedcorpus – Random access to corpus documents

corpora.lowcorpus – Corpus in List-of-Words format

corpora.malletcorpus – Corpus in Mallet format of List-Of-Words

corpora.mmcorpus – Corpus in Matrix Market format

corpora.sharded_corpus – Corpus stored in separate files

corpora.svmlightcorpus – Corpus in SVMlight format

corpora.textcorpus – Building corpora with dictionaries

corpora.ucicorpus – Corpus in UCI bag-of-words format

corpora.wikicorpus – Corpus from a Wikipedia dump

models.ldamodel – Latent Dirichlet Allocation

models.ldamulticore – Parallelized Latent Dirichlet Allocation

models.lsimodel – Latent Semantic Indexing

models.ldaseqmodel – Dynamic Topic Modeling in Python

models.tfidfmodel – TF-IDF model

models.rpmodel – Random Projections

models.hdpmodel – Hierarchical Dirichlet Process

models.logentropy_model – LogEntropy model

models.normmodel – Normalization model

models.translation_matrix – Translation Matrix model

models.lsi_dispatcher – Dispatcher for distributed LSI

models.lsi_worker – Worker for distributed LSI

models.lda_dispatcher – Dispatcher for distributed LDA

models.lda_worker – Worker for distributed LDA

models.atmodel – Author-topic models

models.word2vec – Deep learning with word2vec

models.keyedvectors – Store and query word vectors

models.doc2vec – Deep learning with paragraph2vec

models.fasttext – FastText model

models.phrases – Phrase (collocation) detection

models.coherencemodel – Topic coherence pipeline

models.basemodel – Core TM interface

models.callbacks – Callbacks for tracking and visualizing the LDA training process

models.wrappers.ldamallet – Latent Dirichlet Allocation via Mallet

models.wrappers.dtmmodel – Dynamic Topic Models (DTM) and Dynamic Influence Models (DIM)

models.wrappers.ldavowpalwabbit – Latent Dirichlet Allocation via Vowpal Wabbit

models.wrappers.wordrank – Word Embeddings from WordRank

models.wrappers.varembed – VarEmbed Word Embeddings

models.wrappers.fasttext – FastText Word Embeddings

similarities.docsim – Document similarity queries

similarities.index – Fast Approximate Nearest Neighbor Similarity with Annoy package

sklearn_api.atmodel – Scikit learn wrapper for Author-topic model

sklearn_api.d2vmodel – Scikit learn wrapper for paragraph2vec model

sklearn_api.hdp – Scikit learn wrapper for Hierarchical Dirichlet Process model

sklearn_api.ldamodel – Scikit learn wrapper for Latent Dirichlet Allocation

sklearn_api.ldaseqmodel – Scikit learn wrapper for LdaSeq model

sklearn_api.lsimodel – Scikit learn wrapper for Latent Semantic Indexing

sklearn_api.phrases – Scikit learn wrapper for phrase (collocation) detection

sklearn_api.rpmodel – Scikit learn wrapper for Random Projection model

sklearn_api.text2bow – Scikit learn wrapper for word<->id mapping

sklearn_api.tfidf – Scikit learn wrapper for TF-IDF model

sklearn_api.w2vmodel – Scikit learn wrapper for word2vec model

topic_coherence.aggregation – Aggregation module

topic_coherence.direct_confirmation_measure – Direct confirmation measure module

topic_coherence.indirect_confirmation_measure – Indirect confirmation measure module

topic_coherence.probability_estimation – Probability estimation module

topic_coherence.segmentation – Segmentation module

topic_coherence.text_analysis – Analyzing the texts of a corpus to accumulate statistical information about word occurrences

scripts.glove2word2vec – Convert glove format to word2vec

scripts.make_wikicorpus – Convert articles from a Wikipedia dump to vectors

scripts.word2vec_standalone – Train word2vec on text file CORPUS

scripts.make_wiki_online – Convert articles from a Wikipedia dump

scripts.make_wiki_online_lemma – Convert articles from a Wikipedia dump

scripts.make_wiki_online_nodebug – Convert articles from a Wikipedia dump

scripts.word2vec2tensor – Convert the word2vec format to Tensorflow 2D tensor

scripts.segment_wiki – Convert wikipedia dump to json-line format

parsing.porter – Porter Stemming Algorithm

parsing.preprocessing – Functions to preprocess raw text

summarization.bm25 – BM25 ranking function

summarization.commons – Common graph functions

summarization.graph – TextRank graph

summarization.keywords – Keywords for TextRank summarization algorithm

summarization.pagerank_weighted – Weighted PageRank algorithm

summarization.summarizer – TextRank Summariser

summarization.syntactic_unit – Syntactic Unit class

summarization.textcleaner – Summarization pre-processing
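
Several entries above, such as corpora.dictionary and sklearn_api.text2bow, are described only as "word<->id mappings". The plain-Python sketch below illustrates the general idea: each distinct token gets an integer id, and a document becomes a list of (token_id, count) pairs, i.e. a bag-of-words vector. It deliberately does not import Gensim and is not Gensim's implementation; the helper names build_vocab and to_bow are hypothetical, and Gensim's own id assignment may differ.

```python
from collections import Counter


def build_vocab(documents):
    """Assign each distinct token an integer id, in first-seen order (hypothetical helper)."""
    token2id = {}
    for tokens in documents:
        for token in tokens:
            if token not in token2id:
                token2id[token] = len(token2id)
    return token2id


def to_bow(tokens, token2id):
    """Turn a token list into sorted (token_id, count) pairs, skipping unknown tokens."""
    counts = Counter(token2id[t] for t in tokens if t in token2id)
    return sorted(counts.items())


# Two tiny tokenized documents, made up for illustration only.
documents = [
    ["human", "computer", "interface"],
    ["graph", "trees", "graph"],
]

token2id = build_vocab(documents)                    # word -> id
id2token = {i: t for t, i in token2id.items()}       # id -> word

# "unseen" is not in the vocabulary, so it is dropped from the result.
bow = to_bow(["graph", "human", "graph", "unseen"], token2id)
print(token2id)
print(bow)
```

Most of the model modules in the list (TF-IDF, LSI, LDA and so on) work on sparse vectors of this shape rather than on raw text, which is why the mapping step usually comes first in a pipeline.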
