weixin_30678821

全文检索的配置

全文检索不同于特定字段的模糊查询，使用全文检索的效率再高，并且能够对于中文进行分词处理。

haystack:全文检索框架，支持whoosh、solr、Xaplan、Elasticsearc四种全文检索引擎
whoosh:纯python编写的全文搜索引擎，虽然性能比不上sphinx、xapian、elasticsearc等，但是无二进制包，程序不会莫名其妙的崩溃，对于小型的站点，whoosh已经足够使用，
jieba:中文分词包

安装包：

pip install django-haystack
pip install whoosh
pip install jieba

修改settings文件

INSTALLWS_APPS = [
    'haystack',  # 全文检索框架
]


 全文检索框架的配置
HAYSTACK_CONNECTIONS = {
    
    'default': {
    
        # 使用whoosh搜索引擎
        'ENGINE': 'goods.whoosh_cn_backend.WhooshEngine',
        # 索引文件的路径
        'PATH': os.path.join(BASE_DIR, 'whoosh_index'),
    },
}

# 当添加、修改、删除数据时，自动生成索引
HAYSTACK_SIGNAL_PROCESSOR = 'haystack.signals.RealtimeSignalProcessor'

# 设置每页显示的数目，默认为20，可以自己修改  
HAYSTACK_SEARCH_RESULTS_PER_PAGE  =  8

在要检索的app下面，创建search_indexes.py文件

# 定义索引类
from haystack import indexes
from .models import Goods


# 索引类名格式：模型类名+Index
class GoodsIndex(indexes.SearchIndex, indexes.Indexable):
    """

    """
    # 索引字段：use_template 指定根据表中的哪些字段 建立索引文件
    # 把说明放在一个文件中
    text = indexes.CharField(document=True, use_template=True)
    
    # 建立检索字段，model_attr模型属性，如果需要多字段的话，在这里添加需要检索的字段
    goods_name = indexes.NgramField(model_attr="goods_name")  

    def get_model(self):
        return Goods  # 返回的模型类

    def index_queryset(self, using=None):
        return self.get_model().objects.all()

在templates下面创建如下文件夹 search/indexes/goods,在这下面创建goods_text.txt（goods是需要检索的模型类的小写）

注：名称是固定的，不可随意更改

在goods_test.txt里面写入需要检索的字段

# 指定根据表中的哪些字段建立索引数据
{
    {
    object.goods_name}}  # 根据商品的名称建立索引

进入到项目所在的目录，创建索引文件：python manage.py rebuild_index

html下搜索框固定设置

<form action="./search" method="get"><--form的method必须为get-->   
    {% csrf_token %}
    <input type="text" placeholder="搜索品牌 店铺..." name="q">  <--inpu的name必须为q-->
<input type="submit"  value="搜索" >
form>

配置项目下的urls

from django.urls import path, include

urlpatterns = [
    path('search/', include('haystack.urls'))  # 全文检过框架
]

搜索出来的结果，haystack会把搜索结果传递给templates/search目录下的search.html，所以需要在templates的search文件夹下创建search.html文件。传递的上下文包括：

query: 搜索的关键字

page:当前页的page对象

serchResult类的实例对象，对象的属性是object

paginator：分页paginator对象

# 设置每页显示的数目，默认为20，可以自己修改  
HAYSTACK_SEARCH_RESULTS_PER_PAGE  =  8

配置中文分词器，这里用到的模块为jieba,文件名为：tokenizer.py,把本文件放在与search_indexes.py同目录下，我这里放在了goods文件夹下

from jieba import cut_for_search
from whoosh.analysis import Tokenizer, Token


class ChineseTokenizer(Tokenizer):
    def __call__(self, value, positions=False, chars=False,
                 keeporiginal=False, removestops=True,
                 start_pos=0, start_char=0, mode='', **kwargs):

        t = Token(positions, chars, removestops=removestops, mode=mode,
                  **kwargs)
        # seglist = cut(value, cut_all=False)  # (精确模式)使用结巴分词库进行分词
        seglist = cut_for_search(value)  # (搜索引擎模式) 使用结巴分词库进行分词
        for w in seglist:
            t.original = t.text = w
            t.boost = 1.0
            if positions:
                t.pos = start_pos + value.find(w)
            if chars:
                t.startchar = start_char + value.find(w)
                t.endchar = start_char + value.find(w) + len(w)
            yield t  # 通过生成器返回每个分词的结果token


def ChineseAnalyzer():
    return ChineseTokenizer()

中文搜索引擎配置,文件名：whoosh_cn_backend.py，把本文件放在与search_indexes.py同目录下，我这里放在了goods文件夹下,修改172行的app名

# encoding: utf-8

from __future__ import absolute_import, division, print_function, unicode_literals

import json
import os
import re
import shutil
import threading
import warnings

from django.conf import settings
from django.core.exceptions import ImproperlyConfigured
from django.utils import six
from django.utils.datetime_safe import datetime
from django.utils.encoding import force_text

from haystack.backends import BaseEngine, BaseSearchBackend, BaseSearchQuery, EmptyResults, log_query
from haystack.constants import DJANGO_CT, DJANGO_ID, ID
from haystack.exceptions import MissingDependency, SearchBackendError, SkipDocument
from haystack.inputs import Clean, Exact, PythonData, Raw
from haystack.models import SearchResult
from haystack.utils import log as logging
from haystack.utils import get_identifier, get_model_ct
from haystack.utils.app_loading import haystack_get_model

try:
    import whoosh
except ImportError:
    raise MissingDependency(
        "The 'whoosh' backend requires the installation of 'Whoosh'. Please refer to the documentation.")

# Handle minimum requirement.
if not hasattr(whoosh, '__version__') or whoosh.__version__ < (2, 5, 0):
    raise MissingDependency("The 'whoosh' backend requires version 2.5.0 or greater.")

# Bubble up the correct error.
from whoosh import index
from whoosh.analysis import StemmingAnalyzer
from whoosh.fields import ID as WHOOSH_ID
from whoosh.fields import BOOLEAN, DATETIME, IDLIST, KEYWORD, NGRAM, NGRAMWORDS, NUMERIC, Schema, TEXT
from whoosh.filedb.filestore import FileStorage, RamStorage
from whoosh.highlight import highlight as whoosh_highlight
from whoosh.highlight import ContextFragmenter, HtmlFormatter
from whoosh.qparser import QueryParser
from whoosh.searching import ResultsPage
from whoosh.writing import AsyncWriter

DATETIME_REGEX = re.compile(
    '^(?P\d{4})-(?P\d{2})-(?P\d{2})T(?P\d{2}):(?P\d{2}):(?P\d{2})(\.\d{3,6}Z?)?$')
LOCALS = threading.local()
LOCALS.RAM_STORE = None


class WhooshHtmlFormatter(HtmlFormatter):
    """
    This is a HtmlFormatter simpler than the whoosh.HtmlFormatter.
    We use it to have consistent results across backends. Specifically,
    Solr, Xapian and Elasticsearch are using this formatting.
    """
    template = '<%(tag)s>%(t)s'


class WhooshSearchBackend(BaseSearchBackend):
    # Word reserved by Whoosh for special use.
    RESERVED_WORDS = (
        'AND',
        'NOT',
        'OR',
        'TO',
    )

    # Characters reserved by Whoosh for special use.
    # The '\\' must come first, so as not to overwrite the other slash replacements.
    RESERVED_CHARACTERS = (
        '\\', '+', '-', '&&', '||', '!', '(', ')', '{', '}',
        '[', ']', '^', '"', '~', '*', '?', ':', '.',
    )

    def __init__(self, connection_alias, **connection_options):
        super(WhooshSearchBackend, self).__init__(connection_alias, **connection_options)
        self.setup_complete = False
        self.use_file_storage = True
        self.post_limit = getattr(connection_options, 'POST_LIMIT', 128 * 1024 * 1024)
        self.path = connection_options.get('PATH')

        if connection_options.get('STORAGE', 'file') != 'file':
            self.use_file_storage = False

        if self.use_file_storage and not self.path:
            raise ImproperlyConfigured(
                "You must specify a 'PATH' in your settings for connection '%s'." % connection_alias)

        self.log = logging.getLogger('haystack')

    def setup(self):
        """
        Defers loading until needed.
        """
        from haystack import connections
        new_index = False

        # Make sure the index is there.
        if self.use_file_storage and not os.path.exists(self.path):
            os.makedirs(self.path)
            new_index = True

        if self.use_file_storage and not os.access(self.path, os.W_OK):
            raise IOError("The path to your Whoosh index '%s' is not writable for the current user/group." % self.path)

        if self.use_file_storage:
            self.storage = FileStorage(self.path)
        else:
            global LOCALS

            if getattr(LOCALS, 'RAM_STORE', None) is None:
                LOCALS.RAM_STORE = RamStorage()

            self.storage = LOCALS.RAM_STORE

        self.content_field_name, self.schema = self.build_schema(
            connections[self.connection_alias].get_unified_index().all_searchfields())
        self.parser = QueryParser(self.content_field_name, schema=self.schema)

        if new_index is True:
            self.index = self.storage.create_index(self.schema)
        else:
            try:
                self.index = self.storage.open_index(schema=self.schema)
            except index.EmptyIndexError:
                self.index = self.storage.create_index(self.schema)

        self.setup_complete = True

    def build_schema(self, fields):
        schema_fields = {
    
            ID: WHOOSH_ID(stored=True, unique=True),
            DJANGO_CT: WHOOSH_ID(stored=True),
            DJANGO_ID: WHOOSH_ID(stored=True),
        }
        # Grab the number of keys that are hard-coded into Haystack.
        # We'll use this to (possibly) fail slightly more gracefully later.
        initial_key_count = len(schema_fields)
        content_field_name = ''

        for field_name, field_class in fields.items():
            if field_class.is_multivalued:
                if field_class.indexed is False:
                    schema_fields[field_class.index_fieldname] = IDLIST(stored=True, field_boost=field_class.boost)
                else:
                    schema_fields[field_class.index_fieldname] = KEYWORD(stored=True, commas=True, scorable=True,
                                                                         field_boost=field_class.boost)
            elif field_class.field_type in ['date', 'datetime']:
                schema_fields[field_class.index_fieldname] = DATETIME(stored=field_class.stored, sortable=True)
            elif field_class.field_type == 'integer':
                schema_fields[field_class.index_fieldname] = NUMERIC(stored=field_class.stored, numtype=int,
                                                                     field_boost=field_class.boost)
            elif field_class.field_type == 'float':
                schema_fields[field_class.index_fieldname] = NUMERIC(stored=field_class.stored, numtype=float,
                                                                     field_boost=field_class.boost)
            elif field_class.field_type == 'boolean':
                # Field boost isn't supported on BOOLEAN as of 1.8.2.
                schema_fields[field_class.index_fieldname] = BOOLEAN(stored=field_class.stored)
            elif field_class.field_type == 'ngram':
                schema_fields[field_class.index_fieldname] = NGRAM(minsize=3, maxsize=15, stored=field_class.stored,
                                                                   field_boost=field_class.boost)
            elif field_class.field_type == 'edge_ngram':
                schema_fields[field_class.index_fieldname] = NGRAMWORDS(minsize=2, maxsize=15, at='start',
                                                                        stored=field_class.stored,
                                                                        field_boost=field_class.boost)
            else:
                from goods.tokenizer import ChineseAnalyzer
                schema_fields[field_class.index_fieldname] = TEXT(stored=True, analyzer=ChineseAnalyzer(),
                                                                  field_boost=field_class.boost, sortable=True)
            if field_class.document is True:
                content_field_name = field_class.index_fieldname
                schema_fields[field_class.index_fieldname].spelling = True

        # Fail more gracefully than relying on the backend to die if no fields
        # are found.
        if len(schema_fields) <= initial_key_count:
            raise SearchBackendError(
                "No fields were found in any search_indexes. Please correct this before attempting to search.")

        return (content_field_name, Schema(**schema_fields))

    def update(self, index, iterable, commit=True):
        if not self.setup_complete:
            self.setup()

        self.index = self.index.refresh()
        writer = AsyncWriter(self.index)

        for obj in iterable:
            try:
                doc = index.full_prepare(obj)
            except SkipDocument:
                self.log.debug(u"Indexing for object `%s` skipped", obj)
            else:
                # Really make sure it's unicode, because Whoosh won't have it any
                # other way.
                for key in doc:
                    doc[key] = self._from_python(doc[key])

                # Document boosts aren't supported in Whoosh 2.5.0+.
                if 'boost' in doc:
                    del doc['boost']

                try:
                    writer.update_document(**doc)
                except Exception as e:
                    if not self.silently_fail:
                        raise

                    # We'll log the object identifier but won't include the actual object
                    # to avoid the possibility of that generating encoding errors while
                    # processing the log message:
                    self.log.error(u"%s while preparing object for update" % e.__class__.__name__,
                                   exc_info=True, extra={
    "data": {
    "index": index,
                                                                  "object": get_identifier(obj)}})

        if len(iterable) > 0:
            # For now, commit no matter what, as we run into locking issues otherwise.
            writer.commit()

    def remove(self, obj_or_string, commit=True):
        if not self.setup_complete:
            self.setup()

        self.index = self.index.refresh()
        whoosh_id = get_identifier(obj_or_string)

        try:
            self.index.delete_by_query(q=self.parser.parse(u'%s:"%s"' % (ID, whoosh_id)))
        except Exception as e:
            if not self.silently_fail:
                raise

            self.log.error("Failed to remove document '%s' from Whoosh: %s", whoosh_id, e, exc_info=True)

    def clear(self, models=None, commit=True):
        if not self.setup_complete:
            self.setup()

        self.index = self.index.refresh()

        if models is not None:
            assert isinstance(models, (list, tuple))

        try:
            if models is None:
                self.delete_index()
            else:
                models_to_delete = []

                for model in models:
                    models_to_delete.append(u"%s:%s" % (DJANGO_CT, get_model_ct(model)))

                self.index.delete_by_query(q=self.parser.parse(u" OR ".join(models_to_delete)))
        except Exception as e:
            if not self.silently_fail:
                raise

            if models is not None:
                self.log.error("Failed to clear Whoosh index of models '%s': %s", ','.join(models_to_delete),
                               e, exc_info=True)
            else:
                self.log.error("Failed to clear Whoosh index: %s", e, exc_info=True)

    def delete_index(self):
        # Per the Whoosh mailing list, if wiping out everything from the index,
        # it's much more efficient to simply delete the index files.
        if self.use_file_storage and os.path.exists(self.path):
            shutil.rmtree(self.path)
        elif not self.use_file_storage:
            self.storage.clean()

        # Recreate everything.
        self.setup()

    def optimize(self):
        if not self.setup_complete:
            self.setup()

        self.index = self.index.refresh()
        self.index.optimize()

    def calculate_page(self, start_offset=0, end_offset=None):
        # Prevent against Whoosh throwing an error. Requires an end_offset
        # greater than 0.
        if end_offset is not None and end_offset <= 0:
            end_offset = 1

        # Determine the page.
        page_num = 0

        if end_offset is None:
            end_offset = 1000000

        if start_offset is None:
            start_offset = 0

        page_length = end_offset - start_offset

        if page_length and page_length > 0:
            page_num = int(start_offset / page_length)

        # Increment because Whoosh uses 1-based page numbers.
        page_num += 1
        return page_num, page_length

    @log_query
    def search(self, query_string, sort_by=None, start_offset=0, end_offset=None,
               fields='', highlight=False, facets=None, date_facets=None, query_facets=None,
               narrow_queries=None, spelling_query=None, within=None,
               dwithin=None, distance_point=None, models=None,
               limit_to_registered_models=None, result_class=None, **kwargs):
        if not self.setup_complete:
            self.setup()

        # A zero length query should return no results.
        if len(query_string) == 0:
            return {
    
                'results': [],
                'hits': 0,
            }

        query_string = force_text(query_string)

        # A one-character query (non-wildcard) gets nabbed by a stopwords
        # filter and should yield zero results.
        if len(query_string) <= 1 and query_string != u'*':
            return {
    
                'results': [],
                'hits': 0,
            }

        reverse = False

        if sort_by is not None:
            # Determine if we need to reverse the results and if Whoosh can
            # handle what it's being asked to sort by. Reversing is an
            # all-or-nothing action, unfortunately.
            sort_by_list = []
            reverse_counter = 0

            for order_by in sort_by:
                if order_by.startswith('-'):
                    reverse_counter += 1

            if reverse_counter and reverse_counter != len(sort_by):
                raise SearchBackendError("Whoosh requires all order_by fields"
                                         " to use the same sort direction")

            for order_by in sort_by:
                if order_by.startswith('-'):
                    sort_by_list.append(order_by[1:])

                    if len(sort_by_list) == 1:
                        reverse = True
                else:
                    sort_by_list.append(order_by)

                    if len(sort_by_list) == 1:
                        reverse = False

            sort_by = sort_by_list

        if facets is not None:
            warnings.warn("Whoosh does not handle faceting.", Warning, stacklevel=2)

        if date_facets is not None:
            warnings.warn("Whoosh does not handle date faceting.", Warning, stacklevel=2)

        if query_facets is not None:
            warnings.warn("Whoosh does not handle query faceting.", Warning, stacklevel=2)

        narrowed_results = None
        self.index = self.index.refresh()

        if limit_to_registered_models is None:
            limit_to_registered_models = getattr(settings, 'HAYSTACK_LIMIT_TO_REGISTERED_MODELS', True)

        if models and len(models):
            model_choices = sorted(get_model_ct(model) for model in models)
        elif limit_to_registered_models:
            # Using narrow queries, limit the results to only models handled
            # with the current routers.
            model_choices = self.build_models_list()
        else:
            model_choices = []

        if len(model_choices) > 0:
            if narrow_queries is None:
                narrow_queries = set()

            narrow_queries.add(' OR '.join(['%s:%s' % (DJANGO_CT, rm) for rm in model_choices]))

        narrow_searcher = None

        if narrow_queries is not None:
            # Potentially expensive? I don't see another way to do it in Whoosh...
            narrow_searcher = self.index.searcher()

            for nq in narrow_queries:
                recent_narrowed_results = narrow_searcher.search(self.parser.parse(force_text(nq)),
                                                                 limit=None)

                if len(recent_narrowed_results) <= 0:
                    return {
    
                        'results': [],
                        'hits': 0,
                    }

                if narrowed_results:
                    narrowed_results.filter(recent_narrowed_results)
                else:
                    narrowed_results = recent_narrowed_results

        self.index = self.index.refresh()

        if self.index.doc_count():
            searcher = self.index.searcher()
            parsed_query = self.parser.parse(query_string)

            # In the event of an invalid/stopworded query, recover gracefully.
            if parsed_query is None:
                return {
    
                    'results': [],
                    'hits': 0,
                }

            page_num, page_length = self.calculate_page(start_offset, end_offset)

            search_kwargs = {
    
                'pagelen': page_length,
                'sortedby': sort_by,
                'reverse': reverse,
            }

            # Handle the case where the results have been narrowed.
            if narrowed_results is not None:
                search_kwargs['filter'] = narrowed_results

            try:
                raw_page = searcher.search_page(
                    parsed_query,
                    page_num,
                    **search_kwargs
                )
            except ValueError:
                if not self.silently_fail:
                    raise

                return {
    
                    'results': [],
                    'hits': 0,
                    'spelling_suggestion': None,
                }

            # Because as of Whoosh 2.5.1, it will return the wrong page of
            # results if you request something too high. :(
            if raw_page.pagenum < page_num:
                return {
    
                    'results': [],
                    'hits': 0,
                    'spelling_suggestion': None,
                }

            results = self._process_results(raw_page, highlight=highlight, query_string=query_string,
                                            spelling_query=spelling_query, result_class=result_class)
            searcher.close()

            if hasattr(narrow_searcher, 'close'):
                narrow_searcher.close()

            return results
        else:
            if self.include_spelling:
                if spelling_query:
                    spelling_suggestion = self.create_spelling_suggestion(spelling_query)
                else:
                    spelling_suggestion = self.create_spelling_suggestion(query_string)
            else:
                spelling_suggestion = None

            return {
    
                'results': [],
                'hits': 0,
                'spelling_suggestion': spelling_suggestion,
            }

    def more_like_this(self, model_instance, additional_query_string=None,
                       start_offset=0, end_offset=None, models=None,
                       limit_to_registered_models=None, result_class=None, **kwargs):
        if not self.setup_complete:
            self.setup()

        field_name = self.content_field_name
        narrow_queries = set()
        narrowed_results = None
        self.index = self.index.refresh()

        if limit_to_registered_models is None:
            limit_to_registered_models = getattr(settings, 'HAYSTACK_LIMIT_TO_REGISTERED_MODELS', True)

        if models and len(models):
            model_choices = sorted(get_model_ct(model) for model in models)
        elif limit_to_registered_models:
            # Using narrow queries, limit the results to only models handled
            # with the current routers.
            model_choices = self.build_models_list()
        else:
            model_choices = []

        if len(model_choices) > 0:
            if narrow_queries is None:
                narrow_queries = set()

            narrow_queries.add(' OR '.join(['%s:%s' % (DJANGO_CT, rm) for rm in model_choices]))

        if additional_query_string and additional_query_string != '*':
            narrow_queries.add(additional_query_string)

        narrow_searcher = None

        if narrow_queries is not None:
            # Potentially expensive? I don't see another way to do it in Whoosh...
            narrow_searcher = self.index.searcher()

            for nq in narrow_queries:
                recent_narrowed_results = narrow_searcher.search(self.parser.parse(force_text(nq)),
                                                                 limit=None)

                if len(recent_narrowed_results) <= 0:
                    return {
    
                        'results': [],
                        'hits': 0,
                    }

                if narrowed_results:
                    narrowed_results.filter(recent_narrowed_results)
                else:
                    narrowed_results = recent_narrowed_results

        page_num, page_length = self.calculate_page(start_offset, end_offset)

        self.index = self.index.refresh()
        raw_results = EmptyResults()

        searcher = None
        if self.index.doc_count():
            query = "%s:%s" % (ID, get_identifier(model_instance))
            searcher = self.index.searcher()
            parsed_query = self.parser.parse(query)
            results = searcher.search(parsed_query)

            if len(results):
                raw_results = results[0].more_like_this(field_name, top=end_offset)

            # Handle the case where the results have been narrowed.
            if narrowed_results is not None and hasattr(raw_results, 'filter'):
                raw_results.filter(narrowed_results)

        try:
            raw_page = ResultsPage(raw_results, page_num, page_length)
        except ValueError:
            if not self.silently_fail:
                raise

            return {
    
                'results': [],
                'hits': 0,
                'spelling_suggestion': None,
            }

        # Because as of Whoosh 2.5.1, it will return the wrong page of
        # results if you request something too high. :(
        if raw_page.pagenum < page_num:
            return {
    
                'results': [],
                'hits': 0,
                'spelling_suggestion': None,
            }

        results = self._process_results(raw_page, result_class=result_class)

        if searcher:
            searcher.close()

        if hasattr(narrow_searcher, 'close'):
            narrow_searcher.close()

        return results

    def _process_results(self, raw_page, highlight=False, query_string='', spelling_query=None, result_class=None):
        from haystack import connections
        results = []

        # It's important to grab the hits first before slicing. Otherwise, this
        # can cause pagination failures.
        hits = len(raw_page)

        if result_class is None:
            result_class = SearchResult

        facets = {}
        spelling_suggestion = None
        unified_index = connections[self.connection_alias].get_unified_index()
        indexed_models = unified_index.get_indexed_models()

        for doc_offset, raw_result in enumerate(raw_page):
            score = raw_page.score(doc_offset) or 0
            app_label, model_name = raw_result[DJANGO_CT].split('.')
            additional_fields = {}
            model = haystack_get_model(app_label, model_name)

            if model and model in indexed_models:
                for key, value in raw_result.items():
                    index = unified_index.get_index(model)
                    string_key = str(key)

                    if string_key in index.fields and hasattr(index.fields[string_key], 'convert'):
                        # Special-cased due to the nature of KEYWORD fields.
                        if index.fields[string_key].is_multivalued:
                            if value is None or len(value) is 0:
                                additional_fields[string_key] = []
                            else:
                                additional_fields[string_key] = value.split(',')
                        else:
                            additional_fields[string_key] = index.fields[string_key].convert(value)
                    else:
                        additional_fields[string_key] = self._to_python(value)

                del (additional_fields[DJANGO_CT])
                del (additional_fields[DJANGO_ID])

                if highlight:
                    sa = StemmingAnalyzer()
                    formatter = WhooshHtmlFormatter('em')
                    terms = [token.text for token in sa(query_string)]

                    whoosh_result = whoosh_highlight(
                        additional_fields.get(self.content_field_name),
                        terms,
                        sa,
                        ContextFragmenter(),
                        formatter
                    )
                    additional_fields['highlighted'] = {
    
                        self.content_field_name: [whoosh_result],
                    }

                result = result_class(app_label, model_name, raw_result[DJANGO_ID], score, **additional_fields)
                results.append(result)
            else:
                hits -= 1

        if self.include_spelling:
            if spelling_query:
                spelling_suggestion = self.create_spelling_suggestion(spelling_query)
            else:
                spelling_suggestion = self.create_spelling_suggestion(query_string)

        return {
    
            'results': results,
            'hits': hits,
            'facets': facets,
            'spelling_suggestion': spelling_suggestion,
        }

    def create_spelling_suggestion(self, query_string):
        spelling_suggestion = None
        reader = self.index.reader()
        corrector = reader.corrector(self.content_field_name)
        cleaned_query = force_text(query_string)

        if not query_string:
            return spelling_suggestion

        # Clean the string.
        for rev_word in self.RESERVED_WORDS:
            cleaned_query = cleaned_query.replace(rev_word, '')

        for rev_char in self.RESERVED_CHARACTERS:
            cleaned_query = cleaned_query.replace(rev_char, '')

        # Break it down.
        query_words = cleaned_query.split()
        suggested_words = []

        for word in query_words:
            suggestions = corrector.suggest(word, limit=1)

            if len(suggestions) > 0:
                suggested_words.append(suggestions[0])

        spelling_suggestion = ' '.join(suggested_words)
        return spelling_suggestion

    def _from_python(self, value):
        """
        Converts Python values to a string for Whoosh.

        Code courtesy of pysolr.
        """
        if hasattr(value, 'strftime'):
            if not hasattr(value, 'hour'):
                value = datetime(value.year, value.month, value.day, 0, 0, 0)
        elif isinstance(value, bool):
            if value:
                value = 'true'
            else:
                value = 'false'
        elif isinstance(value, (list, tuple)):
            value = u','.join([force_text(v) for v in value])
        elif isinstance(value, (six.integer_types, float)):
            # Leave it alone.
            pass
        else:
            value = force_text(value)
        return value

    def _to_python(self, value):
        """
        Converts values from Whoosh to native Python values.

        A port of the same method in pysolr, as they deal with data the same way.
        """
        if value == 'true':
            return True
        elif value == 'false':
            return False

        if value and isinstance(value, six.string_types):
            possible_datetime = DATETIME_REGEX.search(value)

            if possible_datetime:
                date_values = possible_datetime.groupdict()

                for dk, dv in date_values.items():
                    date_values[dk] = int(dv)

                return datetime(date_values['year'], date_values['month'], date_values['day'], date_values['hour'],
                                date_values['minute'], date_values['second'])

        try:
            # Attempt to use json to load the values.
            converted_value = json.loads(value)

            # Try to handle most built-in types.
            if isinstance(converted_value, (list, tuple, set, dict, six.integer_types, float, complex)):
                return converted_value
        except:
            # If it fails (SyntaxError or its ilk) or we don't trust it,
            # continue on.
            pass

        return value


class WhooshSearchQuery(BaseSearchQuery):
    def _convert_datetime(self, date):
        if hasattr(date, 'hour'):
            return force_text(date.strftime('%Y%m%d%H%M%S'))
        else:
            return force_text(date.strftime('%Y%m%d000000'))

    def clean(self, query_fragment):
        """
        Provides a mechanism for sanitizing user input before presenting the
        value to the backend.

        Whoosh 1.X differs here in that you can no longer use a backslash
        to escape reserved characters. Instead, the whole word should be
        quoted.
        """
        words = query_fragment.split()
        cleaned_words = []

        for word in words:
            if word in self.backend.RESERVED_WORDS:
                word = word.replace(word, word.lower())

            for char in self.backend.RESERVED_CHARACTERS:
                if char in word:
                    word = "'%s'" % word
                    break

            cleaned_words.append(word)

        return ' '.join(cleaned_words)

    def build_query_fragment(self, field, filter_type, value):
        from haystack import connections
        query_frag = ''
        is_datetime = False

        if not hasattr(value, 'input_type_name'):
            # Handle when we've got a ``ValuesListQuerySet``...
            if hasattr(value, 'values_list'):
                value = list(value)

            if hasattr(value, 'strftime'):
                is_datetime = True

            if isinstance(value, six.string_types) and value != ' ':
                # It's not an ``InputType``. Assume ``Clean``.
                value = Clean(value)
            else:
                value = PythonData(value)

        # Prepare the query using the InputType.
        prepared_value = value.prepare(self)

        if not isinstance(prepared_value, (set, list, tuple)):
            # Then convert whatever we get back to what pysolr wants if needed.
            prepared_value = self.backend._from_python(prepared_value)

        # 'content' is a special reserved word, much like 'pk' in
        # Django's ORM layer. It indicates 'no special field'.
        if field == 'content':
            index_fieldname = ''
        else:
            index_fieldname = u'%s:' % connections[self._using].get_unified_index().get_index_fieldname(field)

        filter_types = {
    
            'content': '%s',
            'contains': '*%s*',
            'endswith': "*%s",
            'startswith': "%s*",
            'exact': '%s',
            'gt': "{%s to}",
            'gte': "[%s to]",
            'lt': "{to %s}",
            'lte': "[to %s]",
            'fuzzy': u'%s~',
        }

        if value.post_process is False:
            query_frag = prepared_value
        else:
            if filter_type in ['content', 'contains', 'startswith', 'endswith', 'fuzzy']:
                if value.input_type_name == 'exact':
                    query_frag = prepared_value
                else:
                    # Iterate over terms & incorportate the converted form of each into the query.
                    terms = []

                    if isinstance(prepared_value, six.string_types):
                        possible_values = prepared_value.split(' ')
                    else:
                        if is_datetime is True:
                            prepared_value = self._convert_datetime(prepared_value)

                        possible_values = [prepared_value]

                    for possible_value in possible_values:
                        terms.append(filter_types[filter_type] % self.backend._from_python(possible_value))

                    if len(terms) == 1:
                        query_frag = terms[0]
                    else:
                        query_frag = u"(%s)" % " AND ".join(terms)
            elif filter_type == 'in':
                in_options = []

                for possible_value in prepared_value:
                    is_datetime = False

                    if hasattr(possible_value, 'strftime'):
                        is_datetime = True

                    pv = self.backend._from_python(possible_value)

                    if is_datetime is True:
                        pv = self._convert_datetime(pv)

                    if isinstance(pv, six.string_types) and not is_datetime:
                        in_options.append('"%s"' % pv)
                    else:
                        in_options.append('%s' % pv)

                query_frag = "(%s)" % " OR ".join(in_options)
            elif filter_type == 'range':
                start = self.backend._from_python(prepared_value[0])
                end = self.backend._from_python(prepared_value[1])

                if hasattr(prepared_value[0], 'strftime'):
                    start = self._convert_datetime(start)

                if hasattr(prepared_value[1], 'strftime'):
                    end = self._convert_datetime(end)

                query_frag = u"[%s to %s]" % (start, end)
            elif filter_type == 'exact':
                if value.input_type_name == 'exact':
                    query_frag = prepared_value
                else:
                    prepared_value = Exact(prepared_value).prepare(self)
                    query_frag = filter_types[filter_type] % prepared_value
            else:
                if is_datetime is True:
                    prepared_value = self._convert_datetime(prepared_value)

                query_frag = filter_types[filter_type] % prepared_value

        if len(query_frag) and not isinstance(value, Raw):
            if not query_frag.startswith('(') and not query_frag.endswith(')'):
                query_frag = "(%s)" % query_frag

        return u"%s%s" % (index_fieldname, query_frag)


class WhooshEngine(BaseEngine):
    backend = WhooshSearchBackend
    query = WhooshSearchQuery

Linux入门Linux服务器搭建工作需要掌握的核心点
虚拟机的使用Linux安装（注意事项）服务器搭建（重点）网络配置（本地虚拟机）SSH连接远程服务器（putty、xshell6）FTP文件传输（FlashFXP、winscp）安装python（Linux自带python2.7.5）虚拟环境管理（virtualenv）django安装web服务器（Nginx + uwsgi） django项目发布数据库mysqlDNS解析（域名）Nginx多项目配置虚拟机安装虚拟机安装[重要]：https://blog.csdn.net/qq_39038465/article/details/81478847Linux目录结构bin:存放二进制可执行文件boot: 存放用于系统引导时使用的各种文件dev: 用于存放设备文件etc: 存放系统配置文件home: 存放所有用户文件的根目录lib 存放跟文件系统中的程序运行所需要的共享库及内核模块mnt 系统管理员安装临时文件系统的安装点opt 额外安装的可选应用程序包所放置的位置proc 虚拟文件系统，存放当前内存的映射root 超级用户目录sbin 存放二进制可执行文件，只有root才可以访问tmp 用于存放临时文件usr 用于存放系统应用程序var 用于存放运行时需要改变数据的文件Linux命令IP地址和主机名相关的命令查看IP：ifconfig重启网卡：service network restart查看网卡状态：service network status修改IP地址：vim /etc/sysconfig/network-scripts/ifcfg-ens33YPE="Ether net" # 网络类型，以太网BOOTPROTO="static"# 改为静态IPIPADDR="192.168.8.88"# IP地址NETMASK="255.255.255.0"# 子网掩码GATEWAY="192.168.8.1"# 网关DNS1="192.168.8.1"# 首选DNSONBOOT="yes"# 是否可以上网（默认为ON）1234567查看主机名：hostname修改主机名：vim /etc/hostnameCentOS7更换清华yum镜像清华：https://mirror.tuna.tsinghua.edu.cn/help/centos/
VIM常用命令 dd 删除光标所在的那一行u 撤销上一步操作ndd 删除光标所在位置起的多行 n为数字yy 复制光标当前所在的那一行nyy 复制多行 n为数字p 将已复制的内容粘贴到光标所在的位置的下一行大P 将已复制的内容粘贴到光标所在的位置的上一行np 粘贴多行到光标的下一行 n为数字ctrl+r 重复上一次操作$ 跳到一行的尾部0 跳到一行的头部gg 移动到这个文件的第一行G 跳到这个文件的最后一行nG 跳到n行set nu 显示行号H 光标移动到屏幕的最上方那一行的第一个字符M 光标移动到屏幕的中央那一行的第一个字符L 光标移动到屏幕的最下面那一行的第一个字符123456789101112131415161718CentOS7下安装python3# 1、安装pyhton3.7 的依赖包
yum -y groupinstall "Development tools"
yum -y install zlib-devel bzip2-devel openssl-devel ncurses-devel sqlite-devel readline-devel tk-devel gdbm-devel db4-devel libpcap-devel xz-devel libffi-devel
# 2、下载python3.7的“源码”： wget https://www.python.org/ftp/python/3.7.0/Python-3.7.0.tar.xz
# 3、解压并编译安装： tar -xJvf Python-3.7.0.tar.xz   # 4、用cd命令进入解压出来的Python文件夹 cd Python-3.7.0
# 5、用./方法执行configure,并指定安装到usr目录下 ./configure --prefix=/usr/local/python3 --enable-shared
# 6、开始编译安装 make && make install
# 7、配置环境变量，创建软链接 ln -s /usr/local/python3/bin/python3 /usr/bin/python3 # 创建python3的软链接     ln -s /usr/local/python3/bin/pip3 /usr/bin/pip3 # 创建pip的软链接# 8、将编译目录下的libpython3.7m.so.1.0文件复制到cp /root/Python-3.7.0/libpython3.7m.so.1.0 /usr/lib64/libpython3.7m.so.1.012345678910111213141516171819202122232425262728CentOS7下安装MySQL# 1、下载mysql的repo源 wget http://repo.mysql.com/mysql-community-release-el7-5.noarch.rpm
# 2、安装mysql-community-release-el7-5.noarch.rpm包 rpm -ivh mysql-community-release-el7-5.noarch.rpm
# 3、安装mysql yum install mysql-server
# 4、授权用户可以使用mysql chown -R root:root /var/lib/mysql
# 5、重启服务 service mysqld restart
# 6、接下来登录重置密码：     mysql -u root # 进入mysql     # 下面为mysql命令     use mysql;     update user set password=password('root') where user='root';     grant all privileges on *.* to 'root'@'%' identified by 'root'; #设置远程登陆密码 flush privileges; #刷新当前配置注：如果不管用，重启虚拟机ctrl+c,退出myql# 7、开放3306端口： # 设置 iptables serviceyum -y install iptables-services
# 如果要修改防火墙配置，如增加防火墙端口3306vi /etc/sysconfig/iptables
# 增加规则-A INPUT -p tcp -m state --state NEW -m tcp --dport 3306 -j ACCEPT #保存退出后
# 8、配置防火墙: systemctl restart iptables.service # 重启防火墙使配置生效 systemctl enable iptables.service # 设置防火墙开机启动12345678910111213141516171819202122232425262728293031323334353637连接MySQL数据库，并新建一个库以备的django使用使用navicat连接数据库

打开navicat以后，点击右上角的连接，弹出以下窗口，按下图中的内容填写

新建数据库，以便的django连接

数据库已经建好，下面就是django配置数据库django配置MySQL数据库# 1、在settings文件中，把数据库配置地方更改为以下内容：DATABASES = {'default': { 'ENGINE': 'django.db.backends.mysql', # 默认数据库为MySQL 'NAME': 'library', # 数据库名为library 'USER': 'root', # 连接数据库的用户 "root" 'PASSWORD': '123456', # 用户密码 "123456" 'HOST': 'www.XXXXXX.cn', # 主机的IP或者域名都可以 'PORT': 3306, # 数据库端口，默认为3306}}1234567891011django项目目录下的__init__.py文件中，导入pymysqlimport pymysqlpymysql.install_as_MySQLdb()12CentOS7安装虚拟环境# 安装虚拟环境pip3 install virtualenv
# 创建软链接ln -s /usr/local/python3/bin/virtualenv /usr/bin/virtualenv12345创建目录# 创建报错虚拟环境目录名字是任意的mkdir -p /data/env # 个人网站发布文件夹 .名字都是任意的!mkdir -p /data/wwwroot1234创建、进入虚拟环境# 进入env目录cd /data/env# 创建虚拟环境virtualenv --python=/usr/bin/python3 py3_django2# 激活虚拟环境cd /data/env/py3_django2/binsource activate # 退出: deactivate# 安装django、uwsgi等.pip install djangopip install uwsgi # django项目发布相关# 退出虚拟环境cd /data/env/py3_django2/bindeactivate 12345678910111213为uwsgi创建软链接# 给uwsgi建立软链接，方便使用ln -s /usr/local/python3/bin/uwsgi /usr/bin/uwsgi12创建xml文件，保存名字与项目名同名，后缀为.xml     127.0.0.1:8000    /data/wwwroot/library/     library.wsgi    4    uwsgi.log 12345678本地项目上传到CentOS7服务器上# 1、在windows系统下，用cmd进入项目的目录，生成项目包依赖列表（如果依赖包少的话，这一步忽略）pip freeze > requirements.txt# 2、settings文件设置ALLOWED_HOSTS = ['*'] # 允许所有IP访问1234CentOS7下安装相关依赖包# CentOS7下安装相关依赖包（如果依赖包少的话，这一步忽略）pip install -r requirements.txt12迁移静态文件# 一、指定收集静态文件的目录，修改settings文件中静态文件路径 STATIC_ROOT = '/data/wwwroot/library/static'
# 二、收集所有静态文件到STATIC_ROOT指定的目录python3 manage.py collectstatic12345安装Nginx# 1、用wget下载Nginxwget http://nginx.org/download/nginx-1.13.7.tar.gz   # 2、下载完成后，解压tar -zxvf nginx-1.13.7.tar.gz
# 3、进入到nginx-1.13.7目录下，并执行以下命令./configuremake && make install
# 4、nginx一般默认安装好的路径为/usr/local/nginx 在/usr/local/nginx/conf/中先备份一下nginx.conf文件，以防意外。cd /usr/local/nginx/conf/cp nginx.conf nginx.conf.bak
# 5、然后打开nginx.conf，把原来的内容删除，直接加入以下内容：events { worker_connections 1024;}http { include mime.types; default_type application/octet-stream; sendfile on; server { listen 80; server_name www.xxxxxx.cn; #改为自己的域名，没域名修改为127.0.0.1:80 charset utf-8; location / { include uwsgi_params; uwsgi_pass 127.0.0.1:8000; #端口要和uwsgi里配置的一样 uwsgi_param UWSGI_SCRIPT library.wsgi; #wsgi.py所在的目录名+.wsgi uwsgi_param UWSGI_CHDIR /data/wwwroot/library/; #项目路径    } location /static/ { alias /data/wwwroot/library/static/; #静态资源路径 } }}
""" 6、要留意备注的地方，要和UWSGI配置文件mysite.xml，还有项目路径对应上。进入/usr/local/nginx/sbin/目录执行./nginx -t命令先检查配置文件是否有错，没有错就执行以下命令："""cd /usr/local/nginx/sbin/./nginx
# 没有提示，证明成功# 测试127.0.0.1：80
12345678910111213141516171819202122232425262728293031323334353637383940414243444546474849505152启动项目# 进入djnago项目cd /data/wwwroot/library/
# uwsgi 解析项目中的配置文件uwsgi -x library.xml
#以上步骤都没有出错的话。cd /usr/local/nginx/sbin/
# 重启nginx./nginx -s reload
# 服务器内部测试是否发布成功curl 1270.0.0.0:80 #就可以看到网站!   # 关闭防火墙,否则远程不能访问!systemctl stop firewalld.service1234567891011121314151617Linux下防火墙管理# CentOS 7.0默认使用的是firewall作为防火墙，使用iptables必须重新设置一下
# 1、直接关闭防火墙systemctl stop firewalld.service # 停止firewallsystemctl disable firewalld.service # 禁止firewall开机启动
# 2、设置 iptables service （防火墙）yum -y install iptables-services # 安装防火墙管理# 如果要修改防火墙配置，如增加防火墙端口3306vi /etc/sysconfig/iptables # 用vi编辑器修改防火墙配置# 增加规则-A INPUT -p tcp -m state --state NEW -m tcp --dport 3306 -j ACCEPT# 保存退出后esc :wq   systemctl restart iptables.service # 重启防火墙使配置生效systemctl enable iptables.service # 设置防火墙开机启动
123456789101112131415161718常用指令# 查看python版本python -V
# 查看python命令如何解析which python3 # 找到命令位置 /usr/bin/python
# cd 命令进入到 /usr/binls -al python3*
# 查看虚拟环境目录和项目发布目录--------------------- 作者：董海明来源：CSDN 原文：https://blog.csdn.net/haiming0415/article/details/89946928 版权声明：本文为博主原创文章，转载请附上博文链接！

转载于:https://www.cnblogs.com/guofeng-1016/p/11290685.html

你可能感兴趣的:(python,数据库,操作系统)

rpg_trajectory_evaluation工具评估SLAM/VIO系统
rpg_trajectory_evaluation工具评估SLAM/VIO系统1、安装系统环境：ubuntu18.04+ROSmelodic代码：https://github.com/uzh-rpg/rpg_trajectory_evaluationtutorial:http://rpg.ifi.uzh.ch/docs/IROS18_Zhang.pdf1.1首先安装依赖的python库pipins
做人脸识别遇到的问题 princesshu python pycharm
最开始安装的时候直接用pipinstalldlib却一直显示错误提示“Failedbuildingwheelfordlib”之后去网上搜来了各种下载链接依然错误我发现问题是！！python版本问题，我下载所有的包都与我的python版本不匹配于是我先安装了cmakeboost之后最后直接在终端安好了dlib～
【Hugging Face全面拥抱LangChain：全新官方合作包】
文末有福利！❝最近HuggingFace官宣发布langchain_huggingface，这是一个由HuggingFace和LangChain共同维护的LangChain合作伙伴包。这个新的Python包旨在将HuggingFace最新功能引入LangChain并保持同步。通过HuggingFace官方包的加持，开发小伙伴们通过简单的api调用就能在langchain中轻松使用HuggingFa
【技术工具】python人员照片简介批量对照（千人级） Allen_Lyb 医疗高效编程研发 python 开发语言自然语言处理健康医疗语言模型
要实现根据照片上的工号批量添加人员姓名和工号到照片上，可以按照以下步骤操作（使用Python+PIL/Pillow+OpenCV+pytesseract）：解决方案步骤准备数据创建人员信息表（CSV格式）：姓名,工号确保所有照片文件名包含工号（如工号.jpg），或照片中有清晰可见的工号文本安装依赖库pipinstallpillowopencv-pythonpandaspytesseract#额外安
大数据技术是解决什么问题的？ @佳瑞大数据
基础知识1TB（太字节）=1024GB1PB（拍字节）=1024TB大数据核心框架HadoopHadoop作为大数据技术生态的核心框架，主要解决了海量数据（TB/PB级）的存储、处理和分析难题，尤其是在传统数据库（如MySQL）和单机计算无法应对的场景下，提供了低成本、高可靠、可扩展的解决方案。其核心解决的问题可归纳为以下几点：海量数据的存储问题传统痛点：单机存储容量有限（如单服务器硬盘通常在TB
Linux机器上Selenium+Python3+Chrome使用driver.get()只能获取到标签而没有内容的解决方法
代码：#!/usr/bin/python3#coding=utf8fromseleniumimportwebdriverfromselenium.webdriver.chrome.optionsimportOptionschrome_options=Options()chrome_options.add_argument('--headless')chrome_options.add_argume
解决 python 中的 huggingface_hub code_welike python 前端数据库
解决python中的huggingface_hub.utils._validators.HFValidationErrorRepoidmustbeintheformrepo_nameorname问题在使用python的huggingface_hub库时，有时候会遇到类似于“huggingface_hub.utils._validators.HFValidationErrorRepoidmustbe
使用Python调用Hugging Face Question Answering (问答)模型墨如夜色 python easyui 开发语言 Python
使用Python调用HuggingFaceQuestionAnswering(问答)模型在自然语言处理领域，问答系统是一种能够回答用户提出的问题的智能系统。HuggingFace是一个知名的开源软件库，提供了许多强大的自然语言处理工具和模型。其中，HuggingFace的QuestionAnswering模型可以帮助我们构建问答系统，使得我们能够从给定的文本中提取答案。本文将介绍如何使用Pytho
深入解析与实战应用：利用Python和Amazon Product Advertising API实战分析不进则退i python 开发语言
在电商平台的运营中，关键词搜索接口是不可或缺的一部分，特别是在亚马逊这样的全球电商平台。通过关键词搜索接口，商家可以高效地获取商品信息，优化选品策略，提升销售业绩。本文将详细介绍如何接入亚马逊的关键字搜索接口，并提供一个Python代码示例。点击获取key和secret1.注册开发者账号并获取API权限首先，你需要访问亚马逊开发者中心，注册一个开发者账号，并获取相应的API权限。在注册过程中，你将
Python爬虫【四十七章】异步爬虫与K8S弹性伸缩：构建百万级并发数据采集引擎程序员_CLUB Python入门到进阶 kubernetes python 爬虫
目录一、背景与行业痛点二、核心技术架构解析2.1异步爬虫引擎设计2.2K8S弹性伸缩架构三、生产环境实践数据3.1性能基准测试3.2成本优化效果四、高级优化技巧4.1协程级熔断降级4.2预测式扩容五、总结Python爬虫相关文章（推荐）一、背景与行业痛点在数字经济时代，企业每天需要处理TB级结构化数据。某头部金融风控平台曾面临以下挑战：数据时效性：需实时采集10万+新闻源，传统爬虫系统延迟超12小
Python爬虫【四十五章】爬虫攻防战：异步并发+AI反爬识别的技术解密程序员_CLUB Python入门到进阶 python 爬虫人工智能
目录引言：当爬虫工程师遇上AI反爬官一、异步并发基础设施层1.1混合调度框架设计1.2智能连接池管理二、机器学习反爬识别层2.1特征工程体系2.2轻量级在线推理三、智能决策系统3.1动态策略引擎3.2实时对抗案例四、性能优化实战4.1全链路压测数据4.2典型故障处理案例五、总结：构建智能化的爬虫生态系统Python爬虫相关文章（推荐）引言：当爬虫工程师遇上AI反爬官在大数据采集领域，我们正经历着技
Python处理MySQL大数据量：分页查询与性能优化 AI天才研究院 AI人工智能与大数据 python mysql 性能优化 ai
Python处理MySQL大数据量：分页查询与性能优化关键词：Python分页查询、MySQL性能优化、大数据量处理、LIMITOFFSET、索引优化摘要：当数据库表数据量达到百万级时，传统的LIMITOFFSET分页查询会出现明显性能瓶颈。本文从实际场景出发，用“图书馆找书”的通俗比喻拆解分页原理，结合Python代码示例和MySQL执行计划分析，详细讲解传统分页的痛点、优化思路（索引分页/覆盖
MacOS上安装Homebrew的详细教程
MacOS上安装Homebrew的详细教程一、引言Homebrew（通常简称为brew）是一款专门为MacOS操作系统设计的开源包管理器，它提供了一种简单、高效的方式来安装、管理和升级命令行工具、编程语言环境以及各种应用程序。其核心概念和作用如下：简化安装流程：在MacOS中，用户无需手动下载软件源码并配置编译环境，只需通过Homebrew提供的命令即可一键安装软件。Homebrew会自动处理软件
Mac上安装Claude Code的步骤
以下是基于现有信息的简明安装指南，适用于macOS系统。请按照以下步骤操作：前提条件操作系统：macOS10.15或更高版本。Node.js和npm：ClaudeCode基于Node.js，需安装Node.js18+和npm。请检查是否已安装：打开终端，运行node--version和npm--version。如果未安装，访问Node.js官方网站下载并安装最新LTS版本，或使用Homebrew：
第5章：数据访问层 liangxh2010 微服务后端架构
5.1SpringDataJPA使用文字讲解SpringDataJPA是SpringData项目的一部分，旨在极大地简化JPA（JavaPersistenceAPI）的使用。它通过提供基于Repository接口的编程模型，让我们无需编写任何实现代码就能完成大多数数据访问操作。核心概念：Entity：一个使用@Entity注解的普通Java对象（POJO），它映射到数据库中的一张表。Reposit
在Mac上安装Homebrew XbpObjectivec macos 操作系统
Homebrew是一个流行的软件包管理器，可以在Mac操作系统上轻松安装、更新和管理各种开源软件。下面是在Mac上安装Homebrew的详细步骤。步骤一：打开终端在Mac上，你可以通过启动应用程序文件夹中的"终端"应用程序来打开终端。步骤二：安装Homebrew在终端中，执行以下命令来安装Homebrew：/bin/bash-c"$(curl-fsSLhttps://raw.githubuserc
大学专业科普 | 计算智能、信息学与大数据鸭鸭鸭进京赶烤大数据
一、专业背景随着信息技术的飞速发展，数据的产生速度呈爆炸式增长，传统数据处理技术已经无法满足如此庞大的数据量和复杂的数据类型，大数据专业应运而生，旨在培养能够应对大数据挑战的专业人才。二、主要课程内容数学基础课程高等数学、概率论与数理统计、线性代数是大数据分析的核心数学基础，为数据处理、算法优化和模型构建提供必要的理论支持。计算机基础课程数据结构与算法、计算机网络、操作系统是大数据技术的重要支撑，
大学专业科普 | 人工智能、物联网和云计算技术鸭鸭鸭进京赶烤人工智能物联网云计算 5G 信号处理信息与通信网络
一、专业概述人工智能专业是一门融合计算机科学、数学、信息学等多学科知识的交叉学科。它旨在培养学生掌握人工智能领域的基本理论、方法和技能，以应对人工智能在各个领域的应用需求和发展挑战。二、主要课程基础课程：包括高等数学、线性代数、概率论与数理统计、离散数学等数学基础课程，为人工智能算法提供理论支撑；以及数据结构、算法设计与分析、计算机组成原理、操作系统、计算机网络等计算机科学基础课程，帮助学生理解人
Reids 子柒s redis 数据库
标题目录Redis概述Redis数据库特点Redis应用场景Redis安装RockyLinux操作系统Windows操作系统Mac操作系统Redis服务启动失败解决方案配置文件详解常见数据类型全局命令String类型字符串数值应用场景列表List基本命令应用场景Hash散列特性基本命令应用场景Set类型基本命令应用场景SortedSet类型有序集合示例基本命令应用场景数据持久化RDB数据持久化SA
解读一个大学专业——信号与图像处理
专业定义与核心内容维度内容定义研究如何采集、处理、分析和理解一维信号（语音、雷达、脑电）和二维/三维图像（医学、遥感、工业视觉）。关键词数字信号处理（DSP）、图像处理、计算机视觉、模式识别、压缩感知、深度学习、GPU加速、嵌入式系统。技术栈MATLAB/Python+OpenCV/PyTorch+DSP/FPGA+GPU（CUDA）第五届先进算法与信号、图像处理国际学术会议（AASIP2025）
浅谈全球化部署(二)
接上文，讲到多机房中的方案，本文继续说明多机房中数据同步的几中方式。上图为，全球化部署环境下，多机房部署，使用到相关技术：1.智能DNS：负责就近机房解析；2.API网关：负责关键数据读写分离；3.数据同步：负责底层数据库的同步；4.其它：如消息中心等；多机房的数据同步数据同步的方式存在如下几种：一写多读如上图所示。1.主机房，实现完整的读写；2.副机房，通过网关将写转到主机房，读在本机房完成；
docker容器中连接宿主机mysql数据库
最近要在docker中使用mysql数据库，首先考虑在ubuntu的镜像中安装mysql，这样的脚本和数据库都在容器中，直接访问localhost：3306，脚本很简单，如下：importpymysql#建立数据库连接db=pymysql.connect(port=3306,host="localhost",user="root",password="password",database="my_
MySql 运维性能优化
内存相关配置innodb_buffer_pool_size：这是InnoDB存储引擎最重要的参数，用于缓存数据和索引。建议设置为服务器可用内存的50%-70%（对于专用数据库服务器）。innodb_buffer_pool_size=8G#根据服务器内存调整innodb_log_buffer_size：用于缓存InnoDB日志。对于写入频繁的系统，可适当调大（默认16M）：innodb_log_bu
【python】向AWS Dynamodb中插入数据
一、背景AWSDynamodb数据库在架构中起到的作用是配置数据库，s3上buckect_a-->bucket_b-->bucket_c对应着层与层之间的关系，总所周知，Dynamobd是非关系型数据库，数据插入的格式是键值对形式的二、代码importboto3importjsonimportpandasaspdAWS_ACCESS_KEY_ID=''AWS_SECRET_ACCESS_KEY='
在Python中对嵌套对象(DynamoDB和表)使用模拟潮易 python 开发语言
在Python中，我们可以使用boto3库来模拟AWSDynamoDB的行为。以下是一个简单的例子，说明如何使用boto3来模拟DynamoDB的表，然后插入和查询数据：首先，你需要安装boto3库。你可以使用pip来安装：```bashpipinstallboto3```然后，你可以创建一个模拟器，并添加一些模拟的数据：```pythonimportboto3frombotocore.stubi
进程线程，并发并行的基本概念以及线程的初步使用还得是乖乖 java 服务器 jvm
今天分享一个关于线程的编写以及制作一个简单的小动态。首先了解并区分一下进程/线程并发/并行这几个概念：进程：在操作系统中，进程是程序的一次动态执行过程。它不仅仅是一个静态的程序代码，还包括程序在执行时所涉及的数据和资源。更精确地说，进程是一个具有独立功能的程序在一个数据集合上运行的过程，它是系统进行资源分配和调度的独立单位。线程：线程是操作系统中程序执行的基本单位，是进程内部的独立执行路径。每个进
MySQL(150)如何进行数据库自动化运维？辞暮尔尔-烟火年年 MySQL 数据库运维 mysql
数据库自动化运维（DBAAutomation）是确保数据库高效、安全运行的关键步骤。自动化运维可以涵盖备份、恢复、监控、性能优化、数据迁移等多个方面。以下是一个详细的指南，展示如何使用Java进行数据库自动化运维，包括代码示例。一、环境准备确保安装有Java开发环境（JDK）、Maven（或Gradle）以及一个数据库（例如MySQL）。我们将使用JDBC来进行数据库操作，以及QuartzSche
直接内存溢出 p＆f° JVM jvm
一、什么是直接内存直接捏成是一块由操作系统直接管理的内存，也叫堆外内存可以使用Unsafe或ByteBuffer分配直接内存可用-XX:MaxDirectMemorySize控制，默认是0，表示不限制二、为什么使用直接内存直接内存vs堆内存io效率高推荐参考：Java直接内存与非直接内存性能测试-阿里云开发者社区三、什么场景使用直接内存1有很大的数据需要存储，它的生命周期又很长2适合频繁的IO操作
深度解析：Python生成器中yield与return的混合使用机制
核心结论：这是有意设计，不是缺陷！在生成器函数中，return语句确实是通过抛出StopIteration异常来实现的，这是Python生成器协议的有意设计而非缺陷。这种机制实现了四个关键目标：✅保持与迭代协议的兼容性✅清晰区分中间值（yield）和最终结果（return）✅支持yieldfrom的高级用法✅提供获取最终结果的标准化方式（通过异常值）生成器执行流程图是否是否是开始执行生成器函数遇到
Python 协程 & 异步编程(asyncio) GeekAGI python 开发语言
文章目录协程&异步编程(asyncio)1.协程的实现1.1greenlet1.2yield1.3asyncio1.4async&awit1.5小结2.协程的意义2.1爬虫案例2.2小结3.异步编程3.1事件循环3.2协程和异步编程3.2.1基本应用3.2.2await3.2.3Task对象3.2.4asyncio.Future对象3.2.5futures.Future对象3.2.6异步迭代器3.
多线程编程之存钱与取钱周凡杨 java thread 多线程存钱取钱
生活费问题是这样的：学生每月都需要生活费，家长一次预存一段时间的生活费，家长和学生使用统一的一个帐号，在学生每次取帐号中一部分钱，直到帐号中没钱时通知家长存钱，而家长看到帐户还有钱则不存钱，直到帐户没钱时才存钱。问题分析：首先问题中有三个实体，学生、家长、银行账户，所以设计程序时就要设计三个类。其中银行账户只有一个，学生和家长操作的是同一个银行账户，学生的行为是
java中数组与List相互转换的方法征客丶 JavaScript java jsonp
1.List转换成为数组。（这里的List是实体是ArrayList) 　　调用ArrayList的toArray方法。　　toArray 　　public T[] toArray(T[] a)返回一个按照正确的顺序包含此列表中所有元素的数组；返回数组的运行时类型就是指定数组的运行时类型。如果列表能放入指定的数组，则返回放入此列表元素的数组。否则，将根据指定数组的运行时类型和此列表的大小分
Shell 流程控制 daizj 流程控制 if else while case shell
Shell 流程控制和Java、PHP等语言不一样，sh的流程控制不可为空，如(以下为PHP流程控制写法)： <?php if(isset($_GET["q"])){ search(q);}else{// 不做任何事情} 在sh/bash里可不能这么写，如果else分支没有语句执行，就不要写这个else，就像这样 if else if if 语句语
Linux服务器新手操作之二周凡杨 Linux 简单操作
1.利用关键字搜寻Man Pages man -k keyword 其中-k 是选项，keyword是要搜寻的关键字如果现在想使用whoami命令，但是只记住了前3个字符who，就可以使用 man -k who来搜寻关键字who的man命令 [haself@HA5-DZ26 ~]$ man -k
socket聊天室之服务器搭建朱辉辉33 socket
因为我们做的是聊天室，所以会有多个客户端，每个客户端我们用一个线程去实现，通过搭建一个服务器来实现从每个客户端来读取信息和发送信息。我们先写客户端的线程。 public class ChatSocket extends Thread{ Socket socket; public ChatSocket(Socket socket){ this.sock
利用finereport建设保险公司决策分析系统的思路和方法老A不折腾 finereport 金融保险分析系统报表系统项目开发
决策分析系统呈现的是数据页面，也就是俗称的报表，报表与报表间、数据与数据间都按照一定的逻辑设定，是业务人员查看、分析数据的平台，更是辅助领导们运营决策的平台。底层数据决定上层分析，所以建设决策分析系统一般包括数据层处理（数据仓库建设）。项目背景介绍通常，保险公司信息化程度很高，基本上都有业务处理系统（像集团业务处理系统、老业务处理系统、个人代理人系统等）、数据服务系统（通过
始终要页面在ifream的最顶层林鹤霄
index.jsp中有ifream，但是session消失后要让login.jsp始终显示到ifream的最顶层。。。始终没搞定，后来反复琢磨之后，得到了解决办法，在这儿给大家分享下。。 index.jsp--->主要是加了颜色的那一句 <html> <iframe name="top" ></iframe> <ifram
MySQL binlog恢复数据 aigo mysql
1，先确保my.ini已经配置了binlog： # binlog log_bin = D:/mysql-5.6.21-winx64/log/binlog/mysql-bin.log log_bin_index = D:/mysql-5.6.21-winx64/log/binlog/mysql-bin.index log_error = D:/mysql-5.6.21-win
OCX打成CBA包并实现自动安装与自动升级 alxw4616 ocx cab
近来手上有个项目,需要使用ocx控件 (ocx是什么? http://baike.baidu.com/view/393671.htm) 在生产过程中我遇到了如下问题. 1. 如何让 ocx 自动安装? a) 如何签名? b) 如何打包? c) 如何安装到指定目录? 2.
Hashmap队列和PriorityQueue队列的应用百合不是茶 Hashmap队列 PriorityQueue队列
HashMap队列已经是学过了的,但是最近在用的时候不是很熟悉,刚刚重新看以一次, HashMap是K,v键 ,值 put()添加元素 //下面试HashMap去掉重复的 package com.hashMapandPriorityQueue; import java.util.H
JDK1.5 returnvalue实例 bijian1013 java thread java多线程 returnvalue
Callable接口：返回结果并且可能抛出异常的任务。实现者定义了一个不带任何参数的叫做 call 的方法。 Callable 接口类似于 Runnable，两者都是为那些其实例可能被另一个线程执行的类设计的。但是 Runnable 不会返回结果，并且无法抛出经过检查的异常。 ExecutorService接口方
angularjs指令中动态编译的方法(适用于有异步请求的情况) 内嵌指令无效 bijian1013 JavaScript AngularJS
在directive的link中有一个$http请求，当请求完成后根据返回的值动态做element.append('......');这个操作，能显示没问题，可问题是我动态组的HTML里面有ng-click，发现显示出来的内容根本不执行ng-click绑定的方法！
【Java范型二】Java范型详解之extend限定范型参数的类型 bit1129 extend
在第一篇中，定义范型类时，使用如下的方式： public class Generics<M, S, N> { //M,S,N是范型参数 } 这种方式定义的范型类有两个基本的问题： 1. 范型参数定义的实例字段，如private M m = null;由于M的类型在运行时才能确定，那么我们在类的方法中，无法使用m，这跟定义pri
【HBase十三】HBase知识点总结 bit1129 hbase
1. 数据从MemStore flush到磁盘的触发条件有哪些？ a.显式调用flush，比如flush 'mytable' b.MemStore中的数据容量超过flush的指定容量，hbase.hregion.memstore.flush.size,默认值是64M 2. Region的构成是怎么样？ 1个Region由若干个Store组成
服务器被DDOS攻击防御的SHELL脚本 ronin47
mkdir /root/bin vi /root/bin/dropip.sh #!/bin/bash/bin/netstat -na|grep ESTABLISHED|awk ‘{print $5}’|awk -F:‘{print $1}’|sort|uniq -c|sort -rn|head -10|grep -v -E ’192.168|127.0′|awk ‘{if($2!=null&a
java程序员生存手册-craps 游戏-一个简单的游戏 bylijinnan java
import java.util.Random; public class CrapsGame { /** * *一个简单的赌*博游戏，游戏规则如下： *玩家掷两个骰子，点数为1到6，如果第一次点数和为7或11，则玩家胜， *如果点数和为2、3或12，则玩家输， *如果和为其它点数，则记录第一次的点数和，然后继续掷骰，直至点数和等于第一次掷出的点
TOMCAT启动提示NB: JAVA_HOME should point to a JDK not a JRE解决开窍的石头 JAVA_HOME
当tomcat是解压的时候，用eclipse启动正常，点击startup.bat的时候启动报错; 报错如下： The JAVA_HOME environment variable is not defined correctly This environment variable is needed to run this program NB: JAVA_HOME shou
[操作系统内核]操作系统与互联网 comsci 操作系统
我首先申明：我这里所说的问题并不是针对哪个厂商的，仅仅是描述我对操作系统技术的一些看法操作系统是一种与硬件层关系非常密切的系统软件，按理说，这种系统软件应该是由设计CPU和硬件板卡的厂商开发的，和软件公司没有直接的关系，也就是说，操作系统应该由做硬件的厂商来设计和开发
富文本框ckeditor_4.4.7 文本框的简单使用支持IE11 cuityang 富文本框
<html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> <title>知识库内容编辑</tit
Property null not found darrenzhu datagrid Flex Advanced propery null
When you got error message like "Property null not found ***", try to fix it by the following way: 1)if you are using AdvancedDatagrid, make sure you only update the data in the data prov
MySQl数据库字符串替换函数使用 dcj3sjt126com mysql 函数替换
需求：需要将数据表中一个字段的值里面的所有的 . 替换成 _ 原来的数据是 site.title site.keywords .... 替换后要为 site_title site_keywords 使用的SQL语句如下： updat
mac上终端起动MySQL的方法 dcj3sjt126com mysql mac
首先去官网下载: http://www.mysql.com/downloads/ 我下载了5.6.11的dmg然后安装,安装完成之后..如果要用终端去玩SQL.那么一开始要输入很长的:/usr/local/mysql/bin/mysql 这不方便啊,好想像windows下的cmd里面一样输入mysql -uroot -p1这样...上网查了下..可以实现滴. 打开终端,输入: 1
Gson使用一（Gson） eksliang json gson
转载请出自出处：http://eksliang.iteye.com/blog/2175401 一.概述从结构上看Json，所有的数据（data）最终都可以分解成三种类型：第一种类型是标量（scalar），也就是一个单独的字符串（string）或数字（numbers），比如"ickes"这个字符串。第二种类型是序列（sequence），又叫做数组（array）
android点滴4 gundumw100 android
Android 47个小知识 http://www.open-open.com/lib/view/open1422676091314.html Android实用代码七段（一） http://www.cnblogs.com/over140/archive/2012/09/26/2611999.html http://www.cnblogs.com/over140/arch
JavaWeb之JSP基本语法 ihuning javaweb
目录 JSP模版元素 JSP表达式 JSP脚本片断 EL表达式 JSP注释特殊字符序列的转义处理如何查找JSP页面中的错误 JSP模版元素 JSP页面中的静态HTML内容称之为JSP模版元素，在静态的HTML内容之中可以嵌套JSP
App Extension编程指南（iOS8/OS X v10.10）中文版啸笑天 ext
当iOS 8.0和OS X v10.10发布后，一个全新的概念出现在我们眼前，那就是应用扩展。顾名思义，应用扩展允许开发者扩展应用的自定义功能和内容，能够让用户在使用其他app时使用该项功能。你可以开发一个应用扩展来执行某些特定的任务，用户使用该扩展后就可以在多个上下文环境中执行该任务。比如说，你提供了一个能让用户把内容分
SQLServer实现无限级树结构 macroli oracle sql SQL Server
表结构如下：数据库id path titlesort 排序 1 0 首页 0 2 0,1 新闻 1 3 0,2 JAVA 2 4 0,3 JSP 3 5 0,2,3 业界动态 2 6 0,2,3 国内新闻 1 创建一个存储过程来实现，如果要在页面上使用可以设置一个返回变量将至传过去 create procedure test as begin decla
Css居中div，Css居中img，Css居中文本，Css垂直居中div qiaolevip 众观千象学习永无止境每天进步一点点 css
/**********Css居中Div**********/ div.center { width: 100px; margin: 0 auto; } /**********Css居中img**********/ img.center { display: block; margin-left: auto; margin-right: auto; }
Oracle 常用操作(实用) 吃猫的鱼 oracle
SQL>select text from all_source where owner=user and name=upper('&plsql_name'); SQL>select * from user_ind_columns where index_name=upper('&index_name'); 将表记录恢复到指定时间段以前
iOS中使用RSA对数据进行加密解密 witcheryne ios rsa iPhone objective c
RSA算法是一种非对称加密算法,常被用于加密数据传输.如果配合上数字摘要算法, 也可以用于文件签名. 本文将讨论如何在iOS中使用RSA传输加密数据. 本文环境 mac os openssl-1.0.1j, openssl需要使用1.x版本, 推荐使用[homebrew](http://brew.sh/)安装. Java 8 RSA基本原理 RS