Lecture 19 Question Answering

Contents

      • Introduction
      • IR-based QA (dominant approach)
      • Knowledge-based QA
      • Hybrid QA
      • Conclusion

Introduction

  • Definition: question answering (“QA”) is the task of automatically determining the answer to a natural language question
  • Mostly focus on “factoid” questions
  • Factoid questions (not ambiguous)
    • Factoid questions have short, precise answers:
      • What war involved the battle of Chapultepec?
      • What is the date of Boxing Day?
      • What are some fragrant white climbing roses?
      • What are tannins?
  • Non-factoid questions
    • General non-factoid questions require longer answers involving critical analysis, summarisation, calculation and more:
      • Why is the date of Australia Day contentious?
      • What is the angle 60 degrees in radians?
  • Why focus on factoid questions?
    • They are easier
    • They have an objective answer
    • Current NLP technology cannot yet handle non-factoid answers (still under development)
  • Two key approaches
    • Information retrieval-based QA
      • Given a query, search for relevant documents
      • Extract answers from within these relevant documents
    • Knowledge-based QA
      • Build a semantic representation of the query
      • Query a database of facts to find answers

IR-based QA (dominant approach)

  • IR-based factoid QA: TREC-QA

    1. Use question to make query for IR engine
      • query formulation: extract key terms to search for documents in the database
      • answer type detection: guess what type of answer the question is looking for
    2. Find documents, then passages within those documents (find the most relevant passages)
    3. Extract a short answer string (using the relevant passages and the answer type information)
  • question processing

    • Find the key parts of the question that will help retrieval
      • Discard non-content words/symbols (wh-words, “?”, etc.)
      • Formulate as a tf-idf query, using unigrams or bigrams
      • Identify entities and prioritise matches
    • May reformulate the question using templates (a query-formulation sketch follows this list)
      • E.g. “Where is Federation Square located?”
      • Query = “Federation Square located”
      • Query = “Federation Square is located [in/at]”
    • Predict expected answer type (here = LOCATION)
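
To make query formulation concrete, here is a minimal Python sketch. The stopword list and the single “located” template are illustrative assumptions; a real system would issue tf-idf-weighted queries against an indexed collection.

```python
import re

# Illustrative wh-word/stopword list (an assumption, not exhaustive)
STOPWORDS = {"who", "what", "when", "where", "why", "which", "how",
             "is", "are", "was", "were", "the", "a", "an", "of"}

def formulate_queries(question: str) -> list[str]:
    queries = []
    # Template reformulation, e.g. "Where is X located?" -> "X is located in"
    m = re.match(r"where is (.+?) located\??$", question, re.IGNORECASE)
    if m:
        queries.append(f"{m.group(1)} is located in")
    # Keyword query: keep content words, drop wh-words, stopwords and punctuation
    tokens = re.findall(r"[A-Za-z]+", question)
    queries.append(" ".join(t for t in tokens if t.lower() not in STOPWORDS))
    return queries

print(formulate_queries("Where is Federation Square located?"))
# ['Federation Square is located in', 'Federation Square located']
```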
  • answer types

    • Knowing the type of answer can help in:
      • finding the right passage containing the answer
      • finding the answer string
    • Treat as classification (a closed set of answer types)
      • given a question, predict the answer type
      • a key feature is the question headword
      • What are the animals on the Australian coat of arms? (headword: “animals”)
      • Generally not a difficult task (a toy headword-based sketch follows)
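
As a toy illustration of answer type prediction, the sketch below keys off the wh-word and the question headword. The headword table and label set are made up for this example; in practice this step is a supervised classifier.

```python
# Hypothetical headword -> answer type table (illustrative only)
HEADWORD_TO_TYPE = {
    "animals": "ANIMAL",
    "capital": "LOCATION",
    "city": "LOCATION",
    "date": "NUMBER:date",
    "year": "NUMBER:date",
    "war": "EVENT",
}

def predict_answer_type(question: str) -> str:
    tokens = question.rstrip("?").lower().split()
    if tokens and tokens[0] == "where":
        return "LOCATION"            # the wh-word alone is often enough
    if tokens and tokens[0] == "when":
        return "NUMBER:date"
    for tok in tokens:               # otherwise fall back to a known headword
        if tok in HEADWORD_TO_TYPE:
            return HEADWORD_TO_TYPE[tok]
    return "ENTITY"                  # default catch-all class

print(predict_answer_type("What are the animals on the Australian coat of arms?"))  # ANIMAL
print(predict_answer_type("Where is Federation Square located?"))                   # LOCATION
```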
  • retrieval

    • Find top n documents matching query (standard IR)
    • Next find passages (paragraphs or sentences) in these documents (also driven by IR)
    • A good passage should contain:
      • many instances of the question keywords
      • several named entities of the answer type
      • close proximity of these terms in the passage
      • a high ranking by the IR engine
    • Re-rank IR outputs to find the best passage (e.g., using supervised learning); a hand-weighted scoring sketch follows
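
The sketch below scores a passage on exactly these signals with hand-picked weights. The weights, keyword/entity inputs and proximity measure are illustrative assumptions; a real re-ranker would learn them from labelled data.

```python
def score_passage(passage_tokens, keywords, answer_type_entities, ir_rank, total_passages):
    text = " ".join(passage_tokens)
    kw_positions = [i for i, tok in enumerate(passage_tokens) if tok.lower() in keywords]
    keyword_hits = len(kw_positions)                                     # question keyword instances
    entity_hits = sum(1 for ent in answer_type_entities if ent in text)  # entities of the answer type
    # Proximity: keyword mentions packed into a narrow window score higher
    proximity = 1.0 / (kw_positions[-1] - kw_positions[0] + 1) if keyword_hits >= 2 else 0.0
    ir_score = 1.0 - ir_rank / total_passages                            # reward highly-ranked passages
    return 2.0 * keyword_hits + 1.5 * entity_hits + 5.0 * proximity + 3.0 * ir_score

passage = ("The Division of Melbourne is an Australian Electoral Division in Victoria , "
           "represented since the 2010 election by Adam Bandt .").split()
print(score_passage(passage, {"federal", "mp", "melbourne"}, {"Adam Bandt"},
                    ir_rank=0, total_passages=10))
```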
  • answer extraction

    • Find a concise answer to the question, as a span in the passage

      • “Who is the federal MP for Melbourne?”
      • The Division of Melbourne is an Australian Electoral Division in Victoria, represented since the 2010 election by Adam Bandt, a member of the Greens.
      • “How many Australian PMs have there been since 2013?”
      • Australia has had five prime ministers in five years. No wonder Merkel needed a cheat sheet at the G-20.
    • how?

      • Use a neural network to extract the answer
      • AKA the reading comprehension task (assuming the query and an evidence passage are given, find the answer span)
      • But deep learning models require lots of data
      • Do we have enough data to train comprehension models?
    • dataset

      • MCTest (a dataset)

        • Crowdworkers write fictional stories, questions and answers
        • 500 stories, 2000 questions
        • Multiple choice questions
      • SQuAD

        • Use Wikipedia passages (easier to create than MCTest)
        • First set of crowdworkers create questions (given passage)
        • Second set of crowdworkers label the answer
        • 150K questions (!)
        • The second version includes unanswerable questions (no answer in the passage); a data-loading sketch follows
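
For reference, a sketch of loading SQuAD-style JSON is below; the file name is a placeholder and the field names follow the public SQuAD v2.0 layout.

```python
import json

def load_squad(path: str):
    with open(path) as f:
        data = json.load(f)["data"]
    examples = []
    for article in data:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                examples.append({
                    "question": qa["question"],
                    "context": context,
                    # v2.0 flags unanswerable questions; their answers list is empty
                    "is_impossible": qa.get("is_impossible", False),
                    "answers": [(a["text"], a["answer_start"]) for a in qa["answers"]],
                })
    return examples

# examples = load_squad("train-v2.0.json")   # placeholder file name
```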
    • reading comprehension

      • Given a question and a context passage, predict where the answer span starts and ends in the passage

      • Compute:

        • $P_{start}(i)$: probability that token $i$ is the start of the answer span
        • $P_{end}(i)$: probability that token $i$ is the end of the answer span
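
Given those two distributions, the predicted answer is the span (i, j) that maximises P_start(i) · P_end(j) with j ≥ i. A small decoding sketch, assuming the probabilities are already computed (the maximum span length is an arbitrary choice):

```python
import numpy as np

def best_span(p_start: np.ndarray, p_end: np.ndarray, max_len: int = 15):
    best, best_score = (0, 0), -1.0
    for i in range(len(p_start)):
        for j in range(i, min(i + max_len, len(p_end))):
            score = p_start[i] * p_end[j]       # spans scored by P_start(i) * P_end(j)
            if score > best_score:
                best_score, best = score, (i, j)
    return best, best_score

p_start = np.array([0.05, 0.7, 0.1, 0.1, 0.05])
p_end   = np.array([0.05, 0.1, 0.1, 0.7, 0.05])
print(best_span(p_start, p_end))   # ((1, 3), ~0.49)
```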
      • LSTM-based model

        • Feed question tokens to a bidirectional LSTM

        • Aggregate LSTM outputs via weighted sum to produce q, the final question embedding

        • Process passage in a similar way, using another bidirectional LSTM

        • More than just word embeddings as input

          • A feature to denote whether the word matches a question word
          • POS feature
          • Weighted question embedding: produced by attending to each question word
        • $\{p_1, \dots, p_m\}$: one vector for each passage token, from the bidirectional LSTM

        • To compute the start and end probabilities for each token (a PyTorch sketch follows):

          • $p_{start}(i) \propto \exp(p_i W_s q)$
          • $p_{end}(i) \propto \exp(p_i W_e q)$
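
A minimal PyTorch sketch of this bilinear scoring, with made-up dimensions; the bidirectional LSTM encoders that would produce the p_i vectors and the question embedding q are assumed to exist upstream.

```python
import torch
import torch.nn as nn

class SpanScorer(nn.Module):
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.W_s = nn.Linear(hidden_dim, hidden_dim, bias=False)   # start bilinear term W_s
        self.W_e = nn.Linear(hidden_dim, hidden_dim, bias=False)   # end bilinear term W_e

    def forward(self, passage_states, question_vec):
        # passage_states: (m, hidden) = {p_1, ..., p_m}; question_vec: (hidden,) = q
        start_logits = passage_states @ self.W_s(question_vec)     # p_i W_s q for each i
        end_logits = passage_states @ self.W_e(question_vec)       # p_i W_e q for each i
        return torch.softmax(start_logits, dim=0), torch.softmax(end_logits, dim=0)

scorer = SpanScorer(hidden_dim=128)
p = torch.randn(40, 128)    # 40 passage token vectors from the passage BiLSTM (random stand-ins)
q = torch.randn(128)        # aggregated question embedding (random stand-in)
p_start, p_end = scorer(p, q)
print(p_start.shape, p_end.shape)   # torch.Size([40]) torch.Size([40])
```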
      • BERT-based model

        • Fine-tune BERT to predict answer span

          • $p_{start}(i) \propto \exp(S^T T_i')$
          • $p_{end}(i) \propto \exp(E^T T_i')$
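
A rough sketch of that span head using the Hugging Face transformers library. Here the start vector S and end vector E are randomly initialised and nothing is fine-tuned, so the printed span is arbitrary; in practice S, E and BERT are trained jointly on SQuAD-style data.

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
bert = AutoModel.from_pretrained("bert-base-uncased")
S = torch.randn(bert.config.hidden_size)   # learned start vector (random here)
E = torch.randn(bert.config.hidden_size)   # learned end vector (random here)

question = "Who is the federal MP for Melbourne?"
passage = "The Division of Melbourne has been represented since 2010 by Adam Bandt."
inputs = tokenizer(question, passage, return_tensors="pt")   # [CLS] question [SEP] passage [SEP]
T = bert(**inputs).last_hidden_state[0]                      # (seq_len, hidden): the T_i'

p_start = torch.softmax(T @ S, dim=0)    # p_start(i) ∝ exp(S^T T_i')
p_end = torch.softmax(T @ E, dim=0)      # p_end(i)   ∝ exp(E^T T_i')
i, j = int(p_start.argmax()), int(p_end.argmax())
print(tokenizer.decode(inputs["input_ids"][0][i:j + 1]))     # untrained, so arbitrary
```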
        • why BERT works better than LSTM

          • It’s pre-trained and so already “knows” language before it’s adapted to the task
          • Self-attention architecture allows fine-grained analysis between words in question and context paragraph

Knowledge-based QA

  • QA over structured KB
    • Many large knowledge bases
      • Freebase, DBpedia, Yago, …
    • Can we support natural language queries?
      • E.g.
        • ‘When was Ada Lovelace born?’ → birth-year(Ada Lovelace, ?x)
        • ‘What is the capital of England?’ → capital-city(?x, England)
      • Link “Ada Lovelace” with the correct entity in the KB to find triple (Ada Lovelace, birth-year, 1815)
    • but
      • Converting a natural language sentence into a triple is not trivial
        • ‘When was Ada Lovelace born?’ → birth-year(Ada Lovelace, ?x)
      • Entity linking also an important component
        • Ambiguity: “When was Lovelace born?”
      • Can we simplify this two-step process?
    • semantic parsing
      • Convert questions into logical forms to query KB directly

        • Predicate calculus
        • Programming/query languages (e.g., SQL)
      • how to build a semantic parser

        • Text-to-text problem:
          • Input = natural language sentence
          • Output = string in logical form
        • Encoder-decoder model (but do we have enough data?); a toy template-based alternative is sketched below
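
To make the logical-form-and-query step concrete without a trained model, here is a toy template-based parser over a hand-written triple store. The triples, patterns and relation names are illustrative stand-ins for a real KB (Freebase, DBpedia, …) and a learned semantic parser.

```python
import re

# Toy KB of (subject, relation, object) triples -- illustrative stand-ins
KB = {
    ("Ada Lovelace", "birth-year", "1815"),
    ("London", "capital-city", "England"),
}

# Hand-written question templates -> logical forms containing a variable ?x
PATTERNS = [
    (r"when was (.+?) born\??$",          lambda e: (e, "birth-year", "?x")),
    (r"what is the capital of (.+?)\??$", lambda e: ("?x", "capital-city", e)),
]

def answer(question: str):
    for pattern, to_logical_form in PATTERNS:
        m = re.match(pattern, question, re.IGNORECASE)
        if not m:
            continue
        s, r, o = to_logical_form(m.group(1))    # e.g. birth-year(Ada Lovelace, ?x)
        # Execute the logical form by matching it against the KB triples
        for subj, rel, obj in KB:
            if rel == r and s == "?x" and obj == o:
                return subj
            if rel == r and o == "?x" and subj == s:
                return obj
    return None   # no template matched, or entity linking failed (e.g. "Lovelace" alone)

print(answer("When was Ada Lovelace born?"))      # 1815
print(answer("What is the capital of England?"))  # London
```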

Hybrid QA

  • hybrid methods

    • Why not use both text-based and knowledge-based resources for QA?
    • IBM’s Watson, which won the game show Jeopardy!, uses a wide variety of resources to answer questions
      • (question)THEATRE: A new play based on this Sir Arthur Conan Doyle canine classic opened on the London stage in 2007.
      • (answer)The Hound Of The Baskervilles
  • core idea of Watson

    • Generate lots of candidate answers from text-based and knowledge-based sources
    • Use a rich variety of evidence to score them
    • Many components in the system, most trained separately
  • QA evaluation

    • IR: Mean Reciprocal Rank for systems returning matching passages or answer strings
      • E.g. system returns 4 passages for a query, first correct passage is the 3rd passage
      • MRR = 1/3
    • MCTest: Accuracy
    • SQuAD: exact match of the answer string against the gold answer (sketches of these metrics follow)
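
Small sketches of these metrics, assuming ranked passages per query and string answers are available; note that the official SQuAD script applies a bit more normalisation (articles, punctuation) than shown here.

```python
def mean_reciprocal_rank(ranked_results, is_correct):
    """ranked_results: one ranked list per query; is_correct: (query_idx, item) -> bool."""
    total = 0.0
    for query_idx, results in enumerate(ranked_results):
        for rank, item in enumerate(results, start=1):
            if is_correct(query_idx, item):
                total += 1.0 / rank        # reciprocal rank of the first correct result
                break
    return total / len(ranked_results)

def exact_match(prediction: str, gold: str) -> bool:
    # SQuAD-style exact match after light normalisation
    return prediction.strip().lower() == gold.strip().lower()

# E.g. one query whose first correct passage is ranked 3rd -> MRR = 1/3
ranked = [["passage_a", "passage_b", "correct_passage", "passage_d"]]
print(mean_reciprocal_rank(ranked, lambda q, item: item == "correct_passage"))   # 0.333...
print(exact_match("Adam Bandt", "adam bandt"))                                   # True
```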

Conclusion

  • IR-based QA: search textual resources to answer questions
    • Reading comprehension: assumes question+passage
  • Knowledge-based QA: search structured resources to answer questions
  • Hot area: many new approaches & evaluation datasets being created all the time (narratives, QA, commonsense reasoning, etc)
