DeepEdit: Knowledge Editing as Decoding with Constraints


University of California, Los Angeles · University of California, Davis

We propose DeepEdit ✍, a new knowledge editing method that can be applied to any black-box large language model (LLM) without retraining. DeepEdit conducts knowledge editing by applying semantic constraints to the decoding of LLMs. Two examples of text generation with our DeepEdit and with the baseline method MeLLo are shown below.
[Figure: example generations with DeepEdit vs. the baseline MeLLo]

Abstract

We propose a new perspective of knowledge editing (KE) for large language models (LLMs) that treats it as a constrained decoding problem. We design decoding constraints to regulate LLMs, ensuring coherence between reasoning steps when incorporating new knowledge. To enforce these constraints, we utilize a depth-first search to adaptively substitute new knowledge for the LLMs' original reasoning steps, greedily seeking the optimal path of multi-hop reasoning with new knowledge. From this vantage, we propose DEEPEDIT: Depth-first Search-based Decoding for Knowledge Editing. DEEPEDIT improves the KE of LLMs by enhancing the conciseness, coherence, pertinence, and receptiveness of reasoning with new knowledge. DEEPEDIT is flexibly applicable to any black-box LLM without requiring access to model parameters or token-wise distributions. In addition to DEEPEDIT, we propose two new KE benchmarks: MQuAKE-2002 and MQuAKE-hard 🏁, which are designed to provide more precise and challenging assessments of KE approaches. Qualitatively, DEEPEDIT enables LLMs to produce more succinct reasoning outputs in accordance with new knowledge. Quantitatively, it yields significant improvements on multiple KE benchmarks.


New Benchmarks for Knowledge Editing

We provide two new benchmarks for evaluating KE methods: MQuAKE-2002 and MQuAKE-hard 🏁. Both are built on MQuAKE (Zhong et al., 2023), a recent KE dataset designed to evaluate how well KE methods help LLMs answer multi-hop questions given new knowledge. Every instance in MQuAKE includes a multi-hop question and one to four related edited facts, each of which can alter the ground-truth answer to the question. Zhong et al. (2023) suggest evaluating on a randomly sampled 3,000-instance subset of MQuAKE to reduce experimental costs. This evaluation benchmark is termed MQuAKE-3k.

Annotation Mistakes in One-Third of MQuAKE-3k Instances.
There are two issues with using MQuAKE-3k to evaluate KE methods. The first is that the new knowledge from different instances of MQuAKE-3k can conflict, corrupting the ground-truth answers. In other words, the ground-truth answer of instance A can be altered by the new knowledge from another instance B. We show an example of such a knowledge conflict in MQuAKE-3k in the figure below. These conflicts make the affected ground-truth answers incorrect, because inference on every instance retrieves the new knowledge from all instances. We observe that the ground-truth answers of 998 instances are broken by the new knowledge from other instances.

[Figure: an example of knowledge conflicts in MQuAKE-3k]

New Benchmark MQuAKE-2002 for More Precise Evaluation.
To address the issue of annotation mistakes in MQuAKE-3k, we provide a new subset of MQuAKE-3k that is free of knowledge conflicts across instances. We produce it by filtering out the instances whose ground-truth answers are broken by the new knowledge from other instances. The resulting subset includes 2,002 instances, so we term it MQuAKE-2002. Compared with MQuAKE-3k, MQuAKE-2002 provides a more precise evaluation of knowledge editing methods, since it removes the annotation mistakes caused by cross-instance knowledge conflicts. The data statistics of MQuAKE-2002 are provided in the table below.
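
As a rough illustration, the conflict filter can be sketched as follows. This is a minimal sketch assuming a simplified instance schema; the field names and the caller-supplied answer_under_edits oracle are our own stand-ins, not the actual MQuAKE format or our implementation.

def build_mquake_2002(instances, answer_under_edits):
    """Filter out instances whose answers break under cross-instance edits.

    instances: dicts with "question", "answer", and "edited_facts" keys
        (a simplified stand-in for the real MQuAKE schema).
    answer_under_edits(question, edits) -> answer: a caller-supplied,
        hypothetical oracle that re-derives the multi-hop answer under
        a given set of edited facts.
    """
    clean = []
    for inst in instances:
        # Pool the edited facts of every *other* instance, since inference
        # on one instance retrieves new knowledge from all instances.
        other_edits = [fact
                       for other in instances if other is not inst
                       for fact in other["edited_facts"]]
        if answer_under_edits(inst["question"], other_edits) == inst["answer"]:
            clean.append(inst)  # the ground-truth answer survives intact
    return clean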

New Benchmark MQuAKE-hard for More Challenging Evaluation.
The second issue with MQuAKE-3k is that more than 66% of its instances contain at most two edited facts that influence the answers, which is not challenging enough for evaluating knowledge editing methods on handling multiple edited facts that can alter the ground-truth answers. We construct a more challenging subset of MQuAKE by selecting the instances with the highest number of edited facts per instance. We term this challenging set MQuAKE-hard; it includes 429 instances, each containing four edited facts, and it has no overlap with MQuAKE-3k. The data statistics of MQuAKE-hard are also provided in the table below.
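
Under the same simplified schema as above, the MQuAKE-hard selection reduces to a filter on the edit count; the id field and the mquake_3k_ids set below are likewise our own stand-ins.

def build_mquake_hard(instances, mquake_3k_ids):
    # Keep the instances with the maximum number of edited facts (four)
    # and drop any instance already included in MQuAKE-3k.
    return [inst for inst in instances
            if len(inst["edited_facts"]) == 4
            and inst["id"] not in mquake_3k_ids]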

Data statistics of the different benchmarks. # Conflicted Instances denotes the number of instances whose ground-truth labels are affected by the new knowledge from other instances; an example is shown in the figure above.
Benchmark | # Instances | # Hops per Instance | # Edited Facts per Instance | # Conflicted Instances
MQuAKE-3k (Zhong et al., 2023) | 3,000 | 3.0 | 2.0 | 998
MQuAKE-2002 (Ours) | 2,002 | 2.7 | 2.2 | 0
MQuAKE-hard (Ours) | 429 | 4.0 | 4.0 | 0
See more details in our paper

Designing Decoding Constraints for Knowledge Editing

The central idea is to view knowledge editing as a problem of decoding with constraints.
We propose to impose semantic constraints on the LLMs' outputs that capture the desired properties benefiting knowledge editing. Specifically, we propose the following constraints: Conciseness 📍 prevents redundant loops of reasoning; Coherence 🐾 guarantees the coherence of adjacent steps; Receptiveness 💡 guarantees the LLMs' awareness of new knowledge; Pertinence 🙌 improves the relevance of reasoning to the input question. Together, these constraints improve LLMs' reasoning with new knowledge from complementary perspectives.
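
As a rough illustration, each constraint can be viewed as a predicate over a candidate step and its context. The sketch below uses crude token overlap as a stand-in for the semantic scoring the verifiers actually perform; all function names and thresholds are our own, not the paper's.

def _overlap(a: str, b: str) -> float:
    # Jaccard overlap of word sets; a crude stand-in for semantic similarity.
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(1, len(ta | tb))

def conciseness(step: str, history: list[str], thr: float = 0.9) -> bool:
    # Reject steps that nearly repeat an earlier step (redundant loops).
    return all(_overlap(step, prev) < thr for prev in history)

def coherence(step: str, history: list[str], thr: float = 0.1) -> bool:
    # Require the new step to connect to the immediately preceding step.
    return not history or _overlap(step, history[-1]) >= thr

def receptiveness(step: str, new_facts: list[str], thr: float = 0.6) -> bool:
    # When an edited fact is topically close to the step, the step should
    # incorporate it rather than restate outdated knowledge.
    relevant = [f for f in new_facts if _overlap(step, f) >= 0.3]
    return all(_overlap(step, f) >= thr for f in relevant)

def pertinence(step: str, question: str, thr: float = 0.2) -> bool:
    # Keep the reasoning step relevant to the input question.
    return _overlap(step, question) >= thr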

See more details in our paper

Decoding at the Reasoning Step Level

In this work, we change the granularity of decoding from the token level to the reasoning step level: instead of taking tokens predefined in a fixed vocabulary as the candidates, we take reasoning steps as the candidates. At every iteration, we let the LLM produce one reasoning step and, in parallel, retrieve a fixed number of new facts as additional step candidates. The new-fact retrieval is a nearest-neighbor search over sentence-level embeddings given by BERT. The motivation for this design is that new facts semantically close to the generated step are more likely to satisfy the constraints.
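
A minimal sketch of this retrieval, assuming a sentence-transformers encoder in place of whatever BERT checkpoint is actually used; the model name and function below are our own choices.

import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in sentence encoder

def retrieve_fact_candidates(generated_step: str, edited_facts: list[str],
                             k: int = 3) -> list[str]:
    # Embed the generated step and all edited facts at the sentence level.
    step_vec = encoder.encode([generated_step], normalize_embeddings=True)[0]
    fact_vecs = encoder.encode(edited_facts, normalize_embeddings=True)
    # With normalized embeddings, cosine similarity is a dot product.
    scores = fact_vecs @ step_vec
    top = np.argsort(-scores)[:k]  # indices of the k nearest facts
    return [edited_facts[i] for i in top]

The generated step plus these k retrieved facts form the candidate set for the next reasoning step.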

Taking the single generated step and the retrieved new facts as the step candidates, we select a valid candidate and take it as the new reasoning step. Note that more than one candidate may satisfy the constraints. We introduce how we design the decoding constraints for KE in Section 3.2, how we verify the constraints in Sections 3.3 and 3.4, and how we process multiple valid step candidates to search for the optimal path of the LLM's multi-hop reasoning in Section 3.5.

Our decoding design bypasses the requirement of accessing token-wise distributions, since all decoding actions are defined at the step level. Our decoding strategy can therefore be applied to black-box LLMs to produce satisfactory outputs in accordance with new knowledge.


Local Constraint Verifier

We store the candidates that satisfy the local verifier ⚖ in a stack 📚 for further use. Instead of retaining only one candidate, we store all valid candidates, since the global verifier still has to check the reasoning steps of the whole reasoning process: the correctness of a reasoning step is only finally decided by the global verifier.


Global Constraint Verifier

The global verifier ⚖ determines the feasibility of the whole reasoning process by checking whether the answer is found. If not, we pop the status stack 📚 and resume from another partial reasoning process that satisfies the local verifier.
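
Putting the pieces together, the depth-first search can be condensed as below. generate_step, local_verify, and answer_found are hypothetical stand-ins for the LLM call, the combined local constraint checks, and the global feasibility test; retrieve_fact_candidates is the retrieval sketch from above.

def deepedit_decode(question, edited_facts, generate_step, local_verify,
                    answer_found, max_depth=8):
    # Each stack entry is a partial reasoning path (a list of steps);
    # popping the most recent entry yields depth-first search.
    stack = [[]]
    while stack:
        path = stack.pop()
        if answer_found(path):        # global verifier: is the answer reached?
            return path
        if len(path) >= max_depth:    # dead end; backtrack via the stack
            continue
        # Step candidates: one LLM-generated step plus retrieved edited facts.
        step = generate_step(question, path)
        candidates = [step] + retrieve_fact_candidates(step, edited_facts)
        # Push every candidate that passes the local verifier; the global
        # verifier has the final say once a full path is assembled.
        for cand in candidates:
            if local_verify(cand, path, question, edited_facts):
                stack.append(path + [cand])
    return None  # no reasoning path satisfied all the constraints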


Performance of Knowledge Editing

Experimental results (accuracy, %) on MQuAKE-3k with a KE batch size of 100.
Method | MQuAKE-3k (accuracy, %)
text-davinci-003 w/ MeLLo | 32.2
text-davinci-003 w/ DEEPEDIT (Ours) | 49.1
gpt-3.5-turbo-instruct w/ MeLLo | 29.9
gpt-3.5-turbo-instruct w/ DEEPEDIT (Ours) | 47.2


BibTeX


@misc{wang2024deepedit,
title={DeepEdit: Knowledge Editing as Decoding with Constraints},
author={Yiwei Wang and Muhao Chen and Nanyun Peng and Kai-Wei Chang},
year={2024},
eprint={2401.10471},
archivePrefix={arXiv},
primaryClass={cs.CL}
}