A Look at langchain4j's Web Search Engine

This post takes a look at langchain4j's Web Search Engine support.

Steps

pom.xml


    <dependency>
        <groupId>dev.langchain4j</groupId>
        <artifactId>langchain4j-web-search-engine-searchapi</artifactId>
        <version>1.0.0-beta2</version>
    </dependency>

example

    @Test
    public void testSearchEngine() {
        SearchApiWebSearchEngine searchEngine = SearchApiWebSearchEngine.builder()
                .apiKey(System.getenv("SEARCH_API_KEY"))
                .engine("baidu")
                .build();
        WebSearchTool webTool = WebSearchTool.from(searchEngine);

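        // `Assistant` (the AI service interface) and `model` (the chat model) are defined elsewhere in the test class
        Assistant assistant = AiServices.builder(Assistant.class)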
                .chatLanguageModel(model)
                .tools(webTool)
                .chatMemory(MessageWindowChatMemory.withMaxMessages(10))
                .build();
        // The prompt asks, in Chinese, "What is today's date?"
        String result = assistant.search("今天日期是哪一天");
        System.out.println(result);
    }
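The output (originally in Chinese, translated here) is as follows:
According to the search results, today's date is March 25, 2025 (Tuesday), which is the 26th day of the second lunar month of 2025.

For more details, you can visit the following websites: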
- [易灵算命网](http://old.d02.cn/suanming/nongli.php)
- [今天几号网](https://www.jintianjihao.com/)
- [快乐日历 - 今天什么日子,今天星期几,农历几月几日@快乐家园](http://www.joyurl.cn/)
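
The test above references an `Assistant` type that the post does not show. Judging only from the call `assistant.search(...)`, which takes and returns a `String`, the AI service interface presumably looks something like the sketch below. The parameter name `userMessage` and the stand-in lambda are assumptions for illustration; in the real test, `AiServices` generates the implementation.

```java
// Assumed shape of the AI service interface used in the test above.
// Only the method name and the String-in/String-out signature are
// inferred from the call site; the parameter name is hypothetical.
interface Assistant {
    String search(String userMessage);
}

class AssistantSketch {
    public static void main(String[] args) {
        // Stand-in implementation so the interface can be exercised
        // without a model; AiServices would normally provide this.
        Assistant echo = userMessage -> "You asked: " + userMessage;
        System.out.println(echo.search("What is today's date?"));
    }
}
```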

Source code

WebSearchEngine

dev/langchain4j/web/search/WebSearchEngine.java

public interface WebSearchEngine {

    /**
     * Performs a search query on the web search engine and returns the search results.
     *
     * @param query the search query
     * @return the search results
     */
    default WebSearchResults search(String query) {
        return search(WebSearchRequest.from(query));
    }

    /**
     * Performs a search request on the web search engine and returns the search results.
     *
     * @param webSearchRequest the search request
     * @return the web search results
     */
    WebSearchResults search(WebSearchRequest webSearchRequest);
}
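langchain4j defines the WebSearchEngine interface, whose search method takes a WebSearchRequest and returns WebSearchResults. A default overload accepts a plain query string and wraps it with WebSearchRequest.from(query).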
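
To make the delegation concrete, here is a minimal sketch of the same pattern using stand-in types. `SketchRequest`, `SketchResults`, `SketchEngine`, and `FixedEngine` are hypothetical names invented for this illustration, not langchain4j classes. The point is that an implementation only needs to supply the request-based method and gets the string overload for free.

```java
import java.util.List;

// Hypothetical stand-ins for WebSearchRequest / WebSearchResults,
// reduced to what the delegation pattern needs.
record SketchRequest(String searchTerms) {
    static SketchRequest from(String searchTerms) {
        return new SketchRequest(searchTerms);
    }
}

record SketchResults(List<String> titles) {}

// Mirrors the shape of WebSearchEngine: one abstract method plus a
// default convenience overload that wraps the query in a request.
interface SketchEngine {
    default SketchResults search(String query) {
        return search(SketchRequest.from(query));
    }

    SketchResults search(SketchRequest request);
}

// An implementation only supplies the request-based method.
class FixedEngine implements SketchEngine {
    @Override
    public SketchResults search(SketchRequest request) {
        return new SketchResults(List.of("Result for: " + request.searchTerms()));
    }
}
```

Calling `new FixedEngine().search("langchain4j")` goes through the default method and ends up in the request-based implementation.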

WebSearchRequest

dev/langchain4j/web/search/WebSearchRequest.java

public class WebSearchRequest {

    private final String searchTerms;
    private final Integer maxResults;
    private final String language;
    private final String geoLocation;
    private final Integer startPage;
    private final Integer startIndex;
    private final Boolean safeSearch;
    private final Map<String, Object> additionalParams;

    private WebSearchRequest(Builder builder){
        this.searchTerms = ensureNotBlank(builder.searchTerms,"searchTerms");
        this.maxResults = builder.maxResults;
        this.language = builder.language;
        this.geoLocation = builder.geoLocation;
        this.startPage = getOrDefault(builder.startPage,1);
        this.startIndex = builder.startIndex;
        this.safeSearch = getOrDefault(builder.safeSearch,true);
        this.additionalParams = getOrDefault(builder.additionalParams, () -> new HashMap<>());
    }

    //......
}    
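WebSearchRequest defines the properties searchTerms, maxResults, language, geoLocation, startPage, startIndex, safeSearch, and additionalParams. searchTerms must not be blank; startPage defaults to 1, safeSearch to true, and additionalParams to an empty map.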
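
The defaulting rules in that constructor can be sketched in isolation. `SketchSearchRequest` below is a hypothetical, trimmed-down stand-in that keeps only the fields with validation or default logic. It does not reproduce the real builder, and the exception type is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for WebSearchRequest that keeps only the
// fields with defaulting or validation logic.
class SketchSearchRequest {
    final String searchTerms;                    // required, must not be blank
    final Integer maxResults;                    // no default: stays null if unset
    final Integer startPage;                     // defaults to 1
    final Boolean safeSearch;                    // defaults to true
    final Map<String, Object> additionalParams;  // defaults to an empty map

    SketchSearchRequest(String searchTerms, Integer maxResults, Integer startPage,
                        Boolean safeSearch, Map<String, Object> additionalParams) {
        if (searchTerms == null || searchTerms.isBlank()) {
            // Assumed exception type, chosen for this sketch only
            throw new IllegalArgumentException("searchTerms cannot be null or blank");
        }
        this.searchTerms = searchTerms;
        this.maxResults = maxResults;
        this.startPage = startPage != null ? startPage : 1;
        this.safeSearch = safeSearch != null ? safeSearch : true;
        this.additionalParams = additionalParams != null ? additionalParams : new HashMap<>();
    }

    // Convenience factory, analogous to WebSearchRequest.from(query)
    static SketchSearchRequest from(String searchTerms) {
        return new SketchSearchRequest(searchTerms, null, null, null, null);
    }
}
```

A request built from just a query therefore starts at page 1 with safe search on, while maxResults stays unset unless the caller provides it.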

WebSearchResults

dev/langchain4j/web/search/WebSearchResults.java

public class WebSearchResults {

    private final Map<String, Object> searchMetadata;
    private final WebSearchInformationResult searchInformation;
    private final List<WebSearchOrganicResult> results;

    /**
     * Constructs a new instance of WebSearchResults.
     *
     * @param searchInformation The information about the web search.
     * @param results           The list of organic search results.
     */
    public WebSearchResults(WebSearchInformationResult searchInformation, List<WebSearchOrganicResult> results) {
        this(null, searchInformation, results);
    }

    /**
     * Constructs a new instance of WebSearchResults.
     *
     * @param searchMetadata    The metadata associated with the web search.
     * @param searchInformation The information about the web search.
     * @param results           The list of organic search results.
     */
    public WebSearchResults(Map<String, Object> searchMetadata, WebSearchInformationResult searchInformation, List<WebSearchOrganicResult> results) {
        this.searchMetadata = searchMetadata;
        this.searchInformation = ensureNotNull(searchInformation, "searchInformation");
        this.results = results;
    }

    //......
}    
WebSearchResults defines the properties searchMetadata, searchInformation, and results; searchInformation must not be null.

SearchApiWebSearchEngine

dev/langchain4j/web/search/searchapi/SearchApiWebSearchEngine.java

public class SearchApiWebSearchEngine implements WebSearchEngine {

    private static final String DEFAULT_BASE_URL = "https://www.searchapi.io";
    private static final String DEFAULT_ENGINE = "google";

    private final String apiKey;
    private final String engine;
    private final SearchApiClient client;
    private final Map<String, Object> optionalParameters;

    /**
     * @param apiKey             Required - the Search API key for accessing their API
     * @param baseUrl            overrides the default SearchApi base url
     * @param timeout            the timeout duration for API requests
     *                           Default value is 30 seconds.
     * @param engine             the engine used by Search API to execute the search
     *                           Default engine is Google Search.
     * @param optionalParameters parameters to be passed on every request of this the engine, they can be overridden by the WebSearchRequest additional parameters for matching keys
     *                           Check Search API for more information on available parameters for each engine
     */
    @Builder
    public SearchApiWebSearchEngine(String apiKey,
                                    String baseUrl,
                                    Duration timeout,
                                    String engine,
                                    Map<String, Object> optionalParameters) {
        this.apiKey = ensureNotBlank(apiKey, "apiKey");
        this.engine = getOrDefault(engine, DEFAULT_ENGINE);
        this.optionalParameters = getOrDefault(copyIfNotNull(optionalParameters), new HashMap<>());
        this.client = SearchApiClient.builder()
                .timeout(getOrDefault(timeout, ofSeconds(30)))
                .baseUrl(getOrDefault(baseUrl, DEFAULT_BASE_URL))
                .build();
    }

    /**
     * @param webSearchRequest Check Search API for more information on available additional
     *                         parameters for each engine that can be inside the request
     */
    @Override
    public WebSearchResults search(WebSearchRequest webSearchRequest) {
        SearchApiWebSearchRequest request = SearchApiWebSearchRequest.builder()
                .apiKey(apiKey)
                .engine(engine)
                .query(webSearchRequest.searchTerms())
                .optionalParameters(optionalParameters)
                .additionalRequestParameters(webSearchRequest.additionalParams())
                .build();
        SearchApiWebSearchResponse response = client.search(request);
        return toWebSearchResults(response);
    }

    private WebSearchResults toWebSearchResults(SearchApiWebSearchResponse response) {
        List<OrganicResult> organicResults = response.getOrganicResults();
        Long totalResults = getTotalResults(response.getSearchInformation());
        WebSearchInformationResult searchInformation = WebSearchInformationResult.from(
                totalResults,
                getCurrentPage(response.getPagination()),
                null
        );
        Map<String, Object> searchMetadata = getOrDefault(response.getSearchParameters(), new HashMap<>());
        addToMetadata(searchMetadata, response.getSearchMetadata());
        return WebSearchResults.from(
                searchMetadata,
                searchInformation,
                toWebSearchOrganicResults(organicResults));
    }

    private static long getTotalResults(Map<String, Object> searchInformation) {
        if (searchInformation != null && searchInformation.containsKey("total_results")) {
            Object totalResults = searchInformation.get("total_results");
            return totalResults instanceof Integer
                    ? ((Integer) totalResults).longValue()
                    : (Long) totalResults; // changes depending on the amount of total_results
        }
        return 0;
    }

    private Integer getCurrentPage(Map<String, Object> pagination) {
        if (pagination != null && pagination.containsKey("current")) {
            return (Integer) pagination.get("current");
        }
        return null;
    }

    private void addToMetadata(Map<String, Object> metadata, Map<String, Object> dataToAdd) {
        if (dataToAdd != null) {
            metadata.putAll(dataToAdd);
        }
    }

    private List<WebSearchOrganicResult> toWebSearchOrganicResults(List<OrganicResult> organicResults) {
        return organicResults.stream()
                .map(result -> {
                    Map<String, String> metadata = new HashMap<>(2);
                    metadata.put("position", result.getPosition());
                    return WebSearchOrganicResult.from(
                            result.getTitle(),
                            URI.create(result.getLink()),
                            getOrDefault(result.getSnippet(), ""),
                            null, // by default google custom search api does not return content
                            metadata);
                })
                .collect(Collectors.toList());
    }

    public static WebSearchEngine withApiKey(String apiKey) {
        return builder().apiKey(apiKey).build();
    }
}

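SearchApiWebSearchEngine implements WebSearchEngine and sends requests to Search API through SearchApiClient. The engine defaults to google and the timeout to 30 seconds. Note that search reads only searchTerms and additionalParams from the WebSearchRequest, so fields such as maxResults are not forwarded by this method.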
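
Two details of this class are easy to miss: engine-level optionalParameters can be overridden per request, and total_results may arrive as either Integer or Long. The sketch below illustrates both with plain maps. `SearchApiSketch` and its methods are hypothetical; the merge order is an assumption based on the Javadoc wording, since the actual merging happens in code not shown in this post. The `Number`-based conversion is a variation on getTotalResults rather than a copy of it (it returns 0 for non-numeric values instead of failing on the cast).

```java
import java.util.HashMap;
import java.util.Map;

class SearchApiSketch {

    // Engine-level parameters go in first, then request-level ones,
    // so a matching key in the request wins. Inputs are not modified.
    static Map<String, Object> mergeParams(Map<String, Object> optionalParameters,
                                           Map<String, Object> additionalParams) {
        Map<String, Object> merged = new HashMap<>(optionalParameters);
        merged.putAll(additionalParams);
        return merged;
    }

    // total_results may be deserialized as Integer or Long depending on
    // its magnitude; treating it as a Number covers both cases.
    static long totalResults(Map<String, Object> searchInformation) {
        if (searchInformation == null) {
            return 0L;
        }
        Object value = searchInformation.get("total_results");
        return value instanceof Number n ? n.longValue() : 0L;
    }
}
```

With this ordering, a parameter set once on the engine acts as a default that individual requests can still replace.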

WebSearchTool

dev/langchain4j/web/search/WebSearchTool.java

public class WebSearchTool {

    private final WebSearchEngine searchEngine;

    public WebSearchTool(WebSearchEngine searchEngine) {
        this.searchEngine = ensureNotNull(searchEngine, "searchEngine");
    }

    /**
     * Runs a search query on the web search engine and returns a pretty-string representation of the search results.
     *
     * @param query the search user query
     * @return a pretty-string representation of the search results
     */
    @Tool("This tool can be used to perform web searches using search engines such as Google, particularly when seeking information about recent events.")
    public String searchWeb(@P("Web search query") String query) {
        WebSearchResults results = searchEngine.search(query);
        return format(results);
    }

    private String format(WebSearchResults results) {
        if (isNullOrEmpty(results.results()))
            return "No results found.";

        return results.results()
                .stream()
                .map(organicResult -> "Title: " + organicResult.title() + "\n"
                        + "Source: " + organicResult.url().toString() + "\n"
                        + (organicResult.content() != null ? "Content:" + "\n" + organicResult.content() : "Snippet:" + "\n" + organicResult.snippet()))
                .collect(Collectors.joining("\n\n"));
    }

    /**
     * Creates a new WebSearchTool with the specified web search engine.
     *
     * @param searchEngine the web search engine to use for searching the web
     * @return a new WebSearchTool
     */
    public static WebSearchTool from(WebSearchEngine searchEngine) {
        return new WebSearchTool(searchEngine);
    }
}
WebSearchTool holds a WebSearchEngine and exposes a searchWeb method annotated with @Tool. The method calls searchEngine.search(query) and formats the results as plain text for the model.
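
To see what the model actually receives from the tool, the formatting logic can be reproduced on its own. `FormatSketch` and its `Organic` record are hypothetical stand-ins for WebSearchTool and WebSearchOrganicResult; the layout follows the format method quoted above.

```java
import java.net.URI;
import java.util.List;
import java.util.stream.Collectors;

class FormatSketch {

    // Hypothetical stand-in for WebSearchOrganicResult with the four
    // accessors that the formatter reads.
    record Organic(String title, URI url, String snippet, String content) {}

    // Same layout as the quoted format method: full content when
    // present, otherwise the snippet; results separated by a blank line.
    static String format(List<Organic> results) {
        if (results == null || results.isEmpty()) {
            return "No results found.";
        }
        return results.stream()
                .map(r -> "Title: " + r.title() + "\n"
                        + "Source: " + r.url() + "\n"
                        + (r.content() != null
                                ? "Content:\n" + r.content()
                                : "Snippet:\n" + r.snippet()))
                .collect(Collectors.joining("\n\n"));
    }
}
```

Since SearchApiWebSearchEngine always passes null as the content, results from that engine are rendered with the Snippet branch.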

Summary

langchain4j defines the WebSearchEngine interface, whose search method takes a WebSearchRequest and returns WebSearchResults. It also provides WebSearchTool, which wraps a WebSearchEngine as a tool so that a model can call it. Several implementations are available: langchain4j-web-search-engine-google-custom provides Google Custom Search, langchain4j-web-search-engine-searchapi supports Search API, langchain4j-community-web-search-engine-searxng supports SearXNG, and langchain4j-web-search-engine-tavily supports Tavily.
