[vectorStore] -- Developing an in-memory vector store component

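vectorStore: this component is typically used as an in-memory vector store; the same store can also be used to obtain a retriever for similarity search.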

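The in-memory vector store takes three parameters: the documents, the embeddings, and the outputs.
Its output is either the vectorStore (the vector store itself) or a retriever built on top of it, so the two kinds of output have to be handled separately.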
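
To make the parameter layout concrete, here is a minimal sketch of how the requested output could be read from the parameter dictionary. The helper name `resolve_output` is hypothetical and exists only for illustration; the keys `document`, `embeddings`, and `outputs` are the ones used by the code in this post, and the placeholder values stand in for real LangChain objects.

```python
from typing import Any, Dict, Optional


def resolve_output(param_dict: Optional[Dict[str, Any]]) -> str:
    """Return the requested output kind, lower-cased; default to "retriever"."""
    outputs = (param_dict or {}).get("outputs") or {}
    return str(outputs.get("output", "retriever")).lower()


# Shape of the parameter dictionary; "document" and "embeddings" hold
# placeholders here instead of real LangChain objects.
params = {
    "document": [],       # nested lists of Document objects in real use
    "embeddings": None,   # an Embeddings instance in real use
    "outputs": {"output": "vectorStore"},
}

print(resolve_output(params))  # vectorstore
print(resolve_output({}))      # retriever
```

Lower-casing the value once keeps the later comparison against "retriever" and "vectorstore" simple.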

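Steps:
Step 1: read the parameter values.
Step 2: define the store type and instantiate it [using MMR for retrieval].
Step 3: handle each kind of output separately.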
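
Step 2 relies on MMR (maximal marginal relevance), which picks results that are relevant to the query but not redundant with each other. The following is a small pure-Python sketch of the general greedy technique, not LangChain's own implementation; the function names `cosine` and `mmr_select` and the toy vectors are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr_select(
    query: Sequence[float],
    candidates: Sequence[Sequence[float]],
    k: int = 2,
    lambda_mult: float = 0.5,
) -> List[int]:
    """Greedy MMR: return indices of up to k candidates, in selection order."""
    selected: List[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:
        best_idx, best_score = remaining[0], -math.inf
        for i in remaining:
            relevance = cosine(query, candidates[i])
            # Similarity to the closest already-selected item (0.0 if none yet)
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        remaining.remove(best_idx)
    return selected


query = [1.0, 0.2]
candidates = [[1.0, 0.0], [1.0, 0.1], [0.6, 0.8]]
print(mmr_select(query, candidates, k=2, lambda_mult=0.5))  # [1, 2]
print(mmr_select(query, candidates, k=2, lambda_mult=1.0))  # [1, 0]
```

With `lambda_mult=1.0` only relevance counts, so the two near-duplicate vectors are returned; with `lambda_mult=0.5` the second pick moves to the more distinct vector.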

Reference link:

https://www.langchain.com.cn/modules/prompts/example_selectors/examples/mmr

from typing import Any, Dict, Optional, Union
from langchain.schema import Document, BaseRetriever
from langchain.vectorstores import Chroma
from langchain.vectorstores.base import VectorStore
from langchain.embeddings.base import Embeddings
import chromadb

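class MemoryVectorStore:
    """In-memory vector store that exposes either the store or a retriever."""

    def __init__(self, param_dict: Optional[Dict[str, Any]] = None):
        # Step 1: read the parameter values
        param_dict = param_dict or {}
        documents = param_dict.get("document")
        embeddings = param_dict.get("embeddings")
        if not documents:
            raise ValueError("'document' must be a non-empty list of document lists")
        if not isinstance(embeddings, Embeddings):
            raise TypeError("'embeddings' must be an Embeddings instance")
        # Flatten the nested document lists into plain strings, dropping newlines
        texts = []
        for doc in documents:
            for doc_copy in doc:
                texts.append(doc_copy.page_content.replace("\n", ""))
        # Step 2: build the store. from_texts is used because texts holds strings,
        # not Document objects. chromadb.Client() is assumed here to be the
        # intended client (an ephemeral, in-memory one).
        self.__vectorstore = Chroma.from_texts(
            texts=texts,
            embedding=embeddings,
            client=chromadb.Client(),
        )
        # "outputs" is a dict such as {"output": "retriever"}; default to the retriever
        outputs: Optional[Dict[str, Any]] = param_dict.get("outputs")
        self.__output = outputs.get("output", "retriever") if outputs else "retriever"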
        
    
    def source(self) -> Optional[Union[BaseRetriever, VectorStore]]:
        # Step 3: return the retriever or the store, depending on the requested output
        output = self.__output.lower()
        if output == "retriever":
            # Use MMR (maximal marginal relevance) as the search type
            return self.__vectorstore.as_retriever(search_type="mmr")
        elif output == "vectorstore":
            return self.__vectorstore
        else:
            return None
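
The constructor above expects `document` to be a list of lists, each inner list holding objects with a `page_content` attribute. The sketch below isolates that flattening step so it can be tried without LangChain installed; `FakeDocument` is a hypothetical stand-in for `langchain.schema.Document`, and `flatten_texts` is an illustrative helper name, not part of any library.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class FakeDocument:
    """Hypothetical stand-in for langchain.schema.Document (text only)."""
    page_content: str


def flatten_texts(document_groups: Iterable[Iterable[FakeDocument]]) -> List[str]:
    """Collect page_content from nested groups, with newlines removed."""
    texts: List[str] = []
    for group in document_groups:
        for doc in group:
            texts.append(doc.page_content.replace("\n", ""))
    return texts


groups = [
    [FakeDocument("first\nchunk"), FakeDocument("second chunk")],
    [],  # an empty group contributes nothing
    [FakeDocument("third chunk")],
]
print(flatten_texts(groups))  # ['firstchunk', 'second chunk', 'third chunk']
```

Removing newlines outright joins the words on either side of the break, as the first result shows; replacing them with a space may be preferable, depending on the source text.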
