kNN-LM
->REALM
->DPR
->RAG
->FiD
->COG
->GenRead
->REPLUG
->Adaptive retrieval
kNN-LM
, an approach that extends a pre-trained LM by linearly interpolating its next-word distribution with a k-nearest neighbors (kNN) model.
Datastore: $(\mathcal{K}, \mathcal{V})$, the set of all key-value pairs constructed from all the training examples in $\mathcal{D}$
Inference: interpolate the nearest-neighbor distribution $p_{kNN}$ with the model distribution $p_{LM}$ using a tuned parameter $\lambda$ to produce the final kNN-LM distribution (given input context $x$)
$p_{LM}(y|x)$: given the input context $x$, the model generates the output distribution over next words $p_{LM}(y|x)$
$p_{kNN}(y|x)$: a distribution over the k nearest neighbors
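Below is a minimal sketch of that interpolation step, assuming the k nearest neighbors and their distances have already been retrieved from the datastore (tensor names are illustrative, not from the released code):

```python
import torch
import torch.nn.functional as F

def knn_lm_distribution(p_lm, knn_distances, knn_values, vocab_size, lam=0.25):
    """Interpolate the LM distribution with a kNN distribution.

    p_lm:          (vocab_size,) next-token distribution from the base LM.
    knn_distances: (k,) L2 distances of the retrieved keys to the query context.
    knn_values:    (k,) token ids stored as values for the retrieved keys.
    lam:           interpolation weight lambda (tuned on validation data).
    """
    # Neighbors are weighted by a softmax over negative distances,
    # then aggregated per vocabulary item.
    weights = F.softmax(-knn_distances, dim=0)              # (k,)
    p_knn = torch.zeros(vocab_size)
    p_knn.scatter_add_(0, knn_values, weights)              # sum weights per token id
    # Final kNN-LM distribution: lambda * p_kNN + (1 - lambda) * p_LM
    return lam * p_knn + (1 - lam) * p_lm
```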
Performance on WIKITEXT-103
Can retrieving nearest neighbors from data be a substitute for training on it?
Training on WIKI-100M and retrieving from WIKI-3B is better than training on WIKI-3B
This suggests learning representations on a smaller dataset and augmenting them with a kNN-LM over a large corpus.
How does the amount of data used for kNN retrieval affect performance?
Domain adaptation: training on WIKI-3B and performing inference on BOOKS
Key function
Number of neighbors per query (Figure 4) and interpolation parameter (Figure 5)
Examples where kNN-LM is most helpful typically contain rare patterns.
Drawbacks of pre-trained language models
Limitations of prior work: retrieve relevant documents and extract an answer from the docs.
This paper extends retrieval to language model pre-training and proposes REALM, a retrieve-then-predict approach.
Methods compared with:
For both pre-training and fine-tuning, REALM
takes some input x and learns a distribution p(y | x) over possible outputs y.
pre-training: masked language modeling
fine-tuning: Open-QA
Two stages:
pretraining: use MLM loss
Open-QA fine-tuning: assume that the answer $y$ can be found as a contiguous sequence of tokens in some document $z$
$\mathrm{BERT}_{\mathrm{START}(s)}$ and $\mathrm{BERT}_{\mathrm{END}(s)}$ denote the Transformer output vectors corresponding to the start and end tokens of span $s$, respectively
The score of the correct span is made large, but don't we also need to ensure that the scores of incorrect spans are small?
do not update $\mathrm{Embed}_{doc}$ for simplicity
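A rough sketch of how the fine-tuning objective combines the retrieval distribution $p(z|x)$ with the span-based $p(y|z,x)$ described above; the helper names and tensor shapes are assumptions for illustration, not REALM's actual code:

```python
import torch
import torch.nn.functional as F

def realm_answer_log_prob(query_emb, doc_embs, start_vecs, end_vecs, span_mask, mlp):
    """log p(y | x) = log sum_z p(z | x) * p(y | z, x) over the top-k retrieved docs.

    query_emb: (d,)            Embed_input(x)
    doc_embs:  (k, d)          Embed_doc(z) for the k retrieved documents
    start_vecs, end_vecs: (k, n_spans, h)  BERT_START(s) / BERT_END(s) for candidate spans
    span_mask: (k, n_spans)    1 if the span matches the gold answer y in document z
    mlp:       scores a span from the concatenated start/end vectors
    """
    # Retrieval distribution p(z | x): softmax over inner products.
    log_p_z = F.log_softmax(doc_embs @ query_emb, dim=0)                      # (k,)

    # Span scores f(s) = MLP([h_START(s); h_END(s)]).
    span_scores = mlp(torch.cat([start_vecs, end_vecs], dim=-1)).squeeze(-1)  # (k, n_spans)
    log_norm = torch.logsumexp(span_scores, dim=-1)                           # (k,)
    # p(y | z, x): sum over spans that match the answer string.
    matched = span_scores.masked_fill(span_mask == 0, float("-inf"))
    log_p_y_given_z = torch.logsumexp(matched, dim=-1) - log_norm             # (k,)

    return torch.logsumexp(log_p_z + log_p_y_given_z, dim=0)
```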
Pretraining: 8 candidate documents; two choices of corpus: (1) Wikipedia (2) CC-News
Finetuning: consider top-5 candidates
Can we train a better dense embedding model using only pairs of questions and passages (or answers), without additional pretraining?
Propose DPR, a two-stage framework:
Encoders: two independent BERT encoders (one for questions, one for passages)
Training:
goal: create a vector space such that relevant pairs of questions and passages have smaller distance than irrelevant pairs (a minimal in-batch-negative training sketch follows the dataset list below)
source documents: Wikipedia dump from Dec. 20, 2018 (split into 100-word passages; each passage prepended with the article title)
QA datasets: Natural Questions
; TriviaQA
; WebQuestions
; CuratedTREC
; SQuAD v1.1
NQ, TriviaQA, SQuAD
TREC, WQ
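As referenced above, here is a minimal sketch of the contrastive training objective: the negative log-likelihood of the gold passage, shown here with in-batch negatives only (variable names are illustrative):

```python
import torch
import torch.nn.functional as F

def dpr_in_batch_loss(q_embs, p_embs):
    """Negative log-likelihood of the gold passage with in-batch negatives.

    q_embs: (B, d) question [CLS] embeddings from the question encoder.
    p_embs: (B, d) gold passage [CLS] embeddings from the passage encoder;
            passage j serves as a negative for every question i != j.
    """
    sim = q_embs @ p_embs.T                    # (B, B) dot-product similarities
    targets = torch.arange(q_embs.size(0))     # gold passage sits on the diagonal
    return F.cross_entropy(sim, targets)
```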
Retrieval
End-to-end QA
Besides the retriever, our QA system consists of a neural reader that extracts an answer span from the passages
BERT is used to predict the start_token and the end_token of the answer span
1.预训练模型存储知识的能力很强,但访问和精准操控知识的能力还受限,所以在knowledge-intensive任务上不如task-specific架构。
2.parametric memory with non-parametric (i.e., retrieval-based) memories结合可以解决一些问题
3.REALM
和 ORQA
利用了这种形式(基于masked language model),但是只探索了 open-domain extractive question answering
因此,本文将这种方式扩展到NLP的主力seq2seq models上
RAG-Sequence: uses the same retrieved document to generate the complete sequence.
RAG-Token: uses a different latent document for each target token.
We use a pre-trained bi-encoder from DPR to initialize our retriever and to build the document index
use BART-large and simply concatenate the input $x$ and the retrieved content $z$
jointly train the retriever and generator components without any direct supervision on what document should be retrieved.
RAG-Token: decode with standard beam search, since the marginal probability of each next token is available at every step.
RAG-Sequence: run beam search for each retrieved document, producing a candidate output $y$ per document and forming a set $Y$. A candidate $y$ generated from one document may not appear in the beams of other documents, so its probability is computed under every document and the final score is $\sum_{z\in \text{top-}k}p(z|x)\,p(y|x,z)$. This is called Thorough Decoding.
Fast Decoding: skip the extra forward passes and approximate $p(y|x,z)\approx 0$ whenever $y$ was not generated during beam search from $z$.
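A small sketch of the two marginalizations, assuming the retrieval scores and generator probabilities are already computed (function and variable names are placeholders):

```python
import torch

def rag_sequence_log_prob(doc_log_scores, seq_log_probs):
    """Thorough-decoding score for one candidate output y.

    doc_log_scores: (k,) log p(z | x) for the top-k retrieved documents.
    seq_log_probs:  (k,) log p(y | x, z), i.e. y re-scored under every document
                    (extra forward passes for documents whose beam did not produce y).
    Returns log p(y | x) = log sum_z p(z | x) * p(y | x, z).
    """
    return torch.logsumexp(doc_log_scores + seq_log_probs, dim=0)

def rag_token_next_token_dist(doc_probs, token_dists):
    """RAG-Token: marginalize per generation step.

    doc_probs:   (k,)   p(z | x)
    token_dists: (k, V) p(y_t | x, z, y_<t) from the generator for each document.
    Returns the (V,) mixture distribution consumed by standard beam search.
    """
    return (doc_probs.unsqueeze(-1) * token_dists).sum(dim=0)
```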
RAG is evaluated on four knowledge-intensive tasks.
open-domain QA
Abstractive Question Answering (MSMARCO)
Jeopardy Question Generation (Jeopardy)
Fact Verification (FVR3, FVR2)
Drawbacks of previous methods (e.g., DPR and REALM):
Propose retrieval + generation.
two steps:
Reformulate text generation by copying text segments from existing text collections
Possible improvement: learn the phrase table dynamically, supporting adding, deleting, modifying, and looking up its entries, or converting fixed phrases into dynamic phrases.
At each time step, a suitable phrase is selected and appended to the current prefix accordingly
For a document $D^i$, a phrase $k = D^i_{s:e}$ of length $e-s+1$ can be extracted, where $s$ and $e$ mark the start and end positions of the phrase in the document, respectively.
Denote all the phrases in the source text collection as $\mathcal{P}$; the phrase table is $\{(k, p_k)\mid k \in \mathcal{P}\}$.
To support scenarios where no suitable phrase is available, we also add the context-independent token embeddings $\{(w, v_w)\mid w \in V\}$ used in standard LMs to the phrase table.
The model consists of three major components:
a prefix encoder that maps prefixes to fixed-sized representations
a context-dependent phrase encoder that computes the vector representations of the phrases in the source text collection
For a document $D = D_1, \ldots, D_m$ of length $m$:
first apply a deep bidirectional Transformer (BERT-base-cased) to obtain contextualized token representations $\mathbf{D} \in \mathbb{R}^{m \times d_t}$
apply two MLPs, $\mathrm{MLP}_{\mathrm{start}}$ and $\mathrm{MLP}_{\mathrm{end}}$, to convert $\mathbf{D}$ into start and end token representations, respectively:
for each phrase $D_{s:e}$, use the concatenation of the corresponding start and end vectors as the phrase representation (see the sketch after the component list)
a set of context-independent token embeddings similar to the one used in standard neural language models
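As referenced above, a sketch of the context-dependent phrase encoder; the MLP heads and dimensions are illustrative assumptions, not the paper's exact configuration:

```python
import torch
import torch.nn as nn

class PhraseEncoder(nn.Module):
    """Turn contextualized token vectors into phrase representations."""

    def __init__(self, d_token=768, d_phrase=384):
        super().__init__()
        self.mlp_start = nn.Sequential(nn.Linear(d_token, d_phrase), nn.Tanh())
        self.mlp_end = nn.Sequential(nn.Linear(d_token, d_phrase), nn.Tanh())

    def forward(self, token_reps, spans):
        """token_reps: (m, d_token) BERT outputs for one document.
        spans: list of (s, e) start/end token positions of candidate phrases.
        Returns one vector per phrase: [MLP_start(D_s); MLP_end(D_e)].
        """
        starts = self.mlp_start(token_reps)   # (m, d_phrase)
        ends = self.mlp_end(token_reps)       # (m, d_phrase)
        return torch.stack([torch.cat([starts[s], ends[e]]) for s, e in spans])
```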
Question: since the prefix representations come from GPT-2 and the phrase representations come from BERT, why can they be matched against each other? Are the two in the same representation space?
A document $D$ has been split into $n$ phrases, $D = p_1, \ldots, p_n$
the training loss for next-phrase prediction
to retain the capability of token-level generation, we also train COG with the standard token-level autoregressive loss(next-token prediction)
The training loss is the sum of these two losses.
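A minimal sketch of how the two losses could be combined, with the phrase table flattened into one embedding matrix (all names and shapes are illustrative):

```python
import torch
import torch.nn.functional as F

def cog_training_loss(prefix_reps, phrase_table, gold_phrase_ids,
                      token_logits, gold_token_ids):
    """Sum of the next-phrase prediction loss and the standard next-token loss.

    prefix_reps:     (n, d)  prefix-encoder states before each gold phrase.
    phrase_table:    (P, d)  phrase vectors plus context-independent token embeddings.
    gold_phrase_ids: (n,)    index of the gold next phrase in the table.
    token_logits:    (m, V)  ordinary LM logits for next-token prediction.
    gold_token_ids:  (m,)    gold next tokens.
    """
    phrase_logits = prefix_reps @ phrase_table.T          # dot-product matching
    loss_phrase = F.cross_entropy(phrase_logits, gold_phrase_ids)
    loss_token = F.cross_entropy(token_logits, gold_token_ids)
    return loss_phrase + loss_token
```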
Inference Speed
The inference latency of kNN-LM is much higher than that of the Transformer baseline, while COG remains comparable to the Transformer.
Case Study
COG
allows a single model to be specialized in different domains, by simply switching the source text collection.
Idea:
Levenshtein Transformer: during generation, this model can insert, delete, and revise tokens in its output (NeurIPS 2019)
ICLR 2023 (review scores: 8, 8, 8, 10)
Three drawbacks of retrieve-then-read pipeline
Propose to leverage LLMs to directly generate contextual documents for a given question, with two advantages:
generated contextual documents contain the correct answer more often than the top retrieved documents
our approach significantly outperforms directly generating answers from large language models despite not incorporating any new external information
Two steps:
first prompt an LLM to generate contextual documents with respect to a given query
read the generated documents to predict the final answer (a large model like InstructGPT for zero-shot, or a smaller model like FiD for finetuning)
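A sketch of the two-step pipeline in the zero-shot case; the prompt wording and the `llm` callable are placeholders for illustration, not the paper's exact prompts:

```python
def generate_then_read(question, llm):
    """Step 1: ask the LLM for a background document; step 2: answer from it."""
    doc_prompt = f"Generate a background document to answer the given question.\n{question}"
    document = llm(doc_prompt)          # greedy decoding in the zero-shot setting

    answer_prompt = (f"Refer to the passage below and answer the question.\n"
                     f"Passage: {document}\nQuestion: {question}")
    return llm(answer_prompt)
```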
Zero-shot setting:
use a large model (InstructGPT) to generate documents based on the given question with a greedy decoding strategy
Supervised setting:
Explore how the generated documents from large language models can benefit the supervised setting.
use FiD to peruse the generated documents under the supervised setting (finetune the reader)
Clustering-based prompts:
The in-context (question, document) pairs are question-independent, i.e., the same demonstrations are reused for every question. As a result, the documents generated for different questions may all relate to one particular aspect of the question, because the relation demonstrated in the prompt is always the same.
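A sketch of the clustering-based prompt construction implied above, assuming some text-embedding function and scikit-learn K-means (both are illustrative choices, not the paper's exact setup):

```python
import numpy as np
from sklearn.cluster import KMeans

def build_diverse_prompts(qd_pairs, embed, n_clusters=10, demos_per_prompt=5):
    """Cluster (question, document) pairs and sample in-context demos per cluster.

    qd_pairs: list of (question, document) pairs, e.g. from an initial generation round.
    embed:    function mapping a text to a vector.
    Returns one list of demonstrations per cluster, so each prompt nudges the LLM
    to generate documents covering a different aspect of the question.
    """
    vecs = np.stack([embed(q + " " + d) for q, d in qd_pairs])
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(vecs)
    prompts = []
    for c in range(n_clusters):
        members = [qd_pairs[i] for i in np.where(labels == c)[0]]
        prompts.append(members[:demos_per_prompt])
    return prompts
```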
Zero-shot
Supervised setting
InstructGPT + FiD (FiD is fine-tuned on the training split of the target datasets)
Other tasks
Case Study
REPLUG, a retrieval-augmented LM framework that treats the language model as a black box. In REPLUG, the retrieved documents are simply prepended to the original input, so there is no need to update the LM parameters as in prior work; performance can be further improved by updating the retriever.
FAISS is used to quickly find the top-k documents.
The LM output probabilities computed from each retrieved document (each document is prepended to the input separately) are combined in a weighted ensemble, with weights given by a softmax over the retrieval scores.
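A minimal sketch of this black-box ensemble; `lm_next_token_dist` is a placeholder for a call to the frozen language model:

```python
import torch
import torch.nn.functional as F

def replug_next_token_dist(x, docs, retrieval_scores, lm_next_token_dist):
    """Weighted ensemble of per-document LM distributions.

    x:                  input context (string).
    docs:               top-k retrieved documents.
    retrieval_scores:   (k,) similarity scores s(d, x) from the retriever.
    lm_next_token_dist: black-box LM call returning a (V,) distribution.
    """
    weights = F.softmax(retrieval_scores, dim=0)                          # lambda(d, x)
    dists = torch.stack([lm_next_token_dist(d + " " + x) for d in docs])  # (k, V)
    return (weights.unsqueeze(-1) * dists).sum(dim=0)
```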
REPLUG LSR can be seen as an enhanced version of REPLUG. In REPLUG the retriever may not be well matched to the language model, so REPLUG LSR uses supervision signals fed back by the language model itself to adapt the retriever.
Core idea: our approach can be seen as adjusting the probabilities of the retrieved documents to match the output-sequence probabilities (perplexities) of the language model
The larger the probability of the ground-truth sequence, the better we consider the model to perform. This part describes how to compute the retrieved-document probability distribution and the output-sequence probability distribution.
Given an input $x$, we retrieve the top-k documents with the highest scores, $\mathcal{D}' \subset \mathcal{D}$. The retrieval likelihood of a document $d$ is
$$P_R(d \mid x)=\frac{e^{s(d, x) / \gamma}}{\sum_{d \in \mathcal{D}^{\prime}} e^{s(d, x) / \gamma}}$$
where $\gamma$ is a hyperparameter controlling the temperature of the softmax.
Ideally the normalization would be over the whole corpus $\mathcal{D}$, but that is too expensive, so it is approximated over $\mathcal{D}'$.
The language model is used to evaluate how much each document improves the LM's perplexity: first compute $P_{LM}(y\mid d,x)$, the generation probability of the ground truth $y$ given the input $x$ and a document $d$. The larger this probability, the more the document improves the perplexity. Then compute the distribution:
$$Q(d \mid x, y)=\frac{e^{P_{LM}(y \mid d, x) / \beta}}{\sum_{d \in \mathcal{D}^{\prime}} e^{P_{LM}(y \mid d, x) / \beta}}$$
With these two distributions, a loss function is used to match them:
given $x$ and $y$, compute the retrieval likelihood distribution and the LM likelihood distribution, and use the KL divergence between them to optimize the dense retriever
$$\mathcal{L}=\frac{1}{|\mathcal{B}|} \sum_{x \in \mathcal{B}} KL\left(P_R(d \mid x) \,\|\, Q_{\mathrm{LM}}(d \mid x, y)\right)$$
Because the retriever parameters are updated during training, the document embeddings become stale after each update; therefore the document embeddings are recomputed every $T$ steps and the above process is repeated.
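A sketch of the LSR training loss under the definitions above; the temperatures and the precomputed scores/likelihoods are assumed to be supplied by the caller:

```python
import torch
import torch.nn.functional as F

def lsr_loss(retrieval_scores, lm_log_likelihoods, gamma=0.1, beta=1.0):
    """KL(P_R(d|x) || Q_LM(d|x,y)) for one training query.

    retrieval_scores:   (k,) s(d, x) for the top-k documents D'.
    lm_log_likelihoods: (k,) log P_LM(y | d, x) for the ground-truth continuation y.
    """
    log_p_r = F.log_softmax(retrieval_scores / gamma, dim=0)    # retrieval likelihood
    q_lm = F.softmax(lm_log_likelihoods.exp() / beta, dim=0)    # LM likelihood (LM is frozen)
    # KL(P_R || Q_LM); only the retriever receives gradients.
    return torch.sum(log_p_r.exp() * (log_p_r - torch.log(q_lm)))
```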
All training data comes from the Pile training data (a language-model benchmark containing text from diverse domains).
800K sequences of 256 tokens each are used as training queries.
For the external corpus $\mathcal{D}$, 36M documents of 128 tokens each are sampled.
Documents sampled from the Pile training data (367M documents of 128 tokens) are used as the retrieval corpus for all models.
Atlas trains both the retriever and the language model, which we consider a white-box retrieval LM setting.
Datasets: Natural Questions and TriviaQA
Settings: few-shot (use a few training examples) and full data (use all training data)
RETRO, R2-D2, and Atlas are finetuned on the training data, either in a few-shot setting or with full training data
The performance of REPLUG and REPLUG LSR keeps improving as more documents are retrieved, but a small number of documents (e.g., 10) is already enough to do well.
The performance gains from REPLUG are consistent across model sizes and apply to different models.
REPLUG is more helpful when texts contain rare entities.
It is unclear when the model relies on retrieved knowledge versus parametric knowledge.
target: understand when we should and should not rely on LMs’ parametric knowledge, and how scaling and non-parametric memories can help
Dimensions of Analysis:
Dataset:
PopQA
: randomly sample knowledge triples of 16 relationship types from Wikidata
EntityQuestions: use Wikipedia hyperlink counts as a proxy for entity frequency and sample knowledge triples from Wikidata according to that frequency distribution
run an off-the-shelf retrieval system offline to retrieve context from Wikipedia relevant to a question, and concatenate the retrieved context (top one for simplicity) with the original question
BM25
/ Contriever
we use retrieval for questions whose popularity is lower than a threshold
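A minimal sketch of this adaptive decision rule, assuming a per-question popularity score (e.g., Wikipedia page views of the subject entity) and a threshold tuned on held-out data (both illustrative):

```python
def adaptive_answer(question, popularity, lm_answer, retrieval_augmented_answer,
                    threshold=1000):
    """Use retrieval only for less-popular (long-tail) questions.

    popularity: proxy frequency of the question's subject entity (e.g., page views).
    threshold:  assumed to be tuned on a development set.
    """
    if popularity < threshold:
        return retrieval_augmented_answer(question)  # non-parametric memory helps here
    return lm_answer(question)                       # rely on parametric knowledge
```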
LMs’ memorization (RQ1) is often limited to the popular factual knowledge and even GPT-3 davinci-003
fails to answer the majority of the long-tail questions
Non-parametric memories largely improve performance on long-tail distributions across models.
Devise a simple-yet-effective retrieval-augmented LM method, Adaptive Retrieval
, which adaptively combines parametric and non-parametric memories based on popularity