
Measuring the Scope of Patent Claims Using Probabilities from Language Models


Core Concepts
The scope of a patent claim can be measured as the reciprocal of the self-information (surprisal) of the claim, where the self-information is calculated based on the probability of occurrence of the claim obtained from a language model.
Abstract
The paper proposes a novel approach to measuring the scope of patent claims based on probabilities obtained from language models. The key points are:

- The scope of a patent claim is defined as the reciprocal of the self-information (surprisal) of the claim, where self-information is the negative logarithm of the probability of occurrence of the claim.
- The probability of occurrence of a claim can be estimated using different language models, ranging from simple models based on word/character frequencies to advanced large language models (LLMs) that capture contextual information.
- Several practical implementations are explored, including models based on equal token probabilities, word/character frequencies, and conditional probabilities from LLMs such as GPT-2 and davinci-002.
- The performance of the different language models is assessed by applying them to several series of patent claims designed to have gradually decreasing scope. The LLMs outperform the simpler frequency-based models, and even the character count alone proves to be a more reliable indicator of scope than the word count.
- The proposed approach provides a more principled and semantically aware way of measuring patent claim scope than previous methods based solely on word or character counts.
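The chain-rule estimate used with autoregressive LLMs can be illustrated with a toy bigram model standing in for GPT-2. This is a minimal sketch, not the paper's implementation: the corpus, the add-one smoothing, and the function names are all assumptions made for illustration.

```python
import math
from collections import Counter

# Toy bigram "language model" standing in for an LLM such as GPT-2
# (illustration only; a real implementation would use model log-probs).
corpus = ("a device comprising a processor and a memory "
          "a device comprising a sensor and a processor "
          "a method comprising measuring a signal").split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
V = len(unigrams)  # vocabulary size of the toy model

def cond_prob(word, prev):
    # Add-one smoothed estimate of p(word | prev)
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + V)

def surprisal(tokens):
    # I(C) = -log p(C) = -[log p(t_1) + sum_i log p(t_i | t_{i-1})]
    logp = math.log((unigrams[tokens[0]] + 1) / (sum(unigrams.values()) + V))
    for prev, word in zip(tokens, tokens[1:]):
        logp += math.log(cond_prob(word, prev))
    return -logp

def scope(tokens):
    # S(C) = 1 / I(C): each added limitation raises surprisal, narrowing scope
    return 1.0 / surprisal(tokens)
```

Under this toy model, appending limitations to a claim adds negative log-probability terms, so surprisal grows and scope shrinks, mirroring the paper's intuition that narrower claims carry more information.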
Stats
The scope of a patent claim is inversely proportional to the self-information (surprisal) of the claim. Self-information is calculated as the negative logarithm of the probability of occurrence of the claim. The probability of occurrence can be estimated using various language models, from simple frequency-based models to advanced LLMs.
Quotes
"Self-information (also called surprisal, information content, or Shannon information) is defined as the negative log-probability. That is, given a claim C, and the probability p(C) of occurrence of this claim, the associated self-information I(C) is defined as: I(C) = -log(p(C)) = log(1/p(C))."

"According to this formulation, S→0 as p→0 and I→∞, while S→∞ as p→1 and I→0. In other words, a claim having a small value of self-information (meaning its definition requires little information) has a broad scope, whereas a claim requiring a lot of information to define it has a narrow scope."
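The simplest probability models mentioned in the abstract follow directly from this formula. Under equal token probabilities, p(C) = (1/V)^N, so I(C) = N·log V and scope collapses to an inverse word count; the character-level analogue collapses to an inverse character count. A minimal sketch, with illustrative (assumed, not the paper's) vocabulary and alphabet sizes:

```python
import math

VOCAB_SIZE = 50_000   # assumed token vocabulary size, for illustration
ALPHABET_SIZE = 27    # assumed: 26 letters + space, for a character model

def scope_equal_tokens(claim: str) -> float:
    # p(C) = (1/V)^N  =>  I(C) = N * log(V)  =>  S(C) = 1 / (N * log(V))
    n_tokens = len(claim.split())
    return 1.0 / (n_tokens * math.log(VOCAB_SIZE))

def scope_equal_chars(claim: str) -> float:
    # Character-level analogue: I(C) = (#chars) * log(alphabet size)
    return 1.0 / (len(claim) * math.log(ALPHABET_SIZE))
```

Since log V is constant, both measures rank claims exactly as inverse word count and inverse character count would, which is why the paper can compare these counts directly against the LLM-based estimates.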

Deeper Inquiries

What other types of language models beyond LLMs could be explored for estimating the probability of patent claim occurrence?

In addition to autoregressive Large Language Models (LLMs) like GPT-2, other types of language models could be explored for estimating the probability of patent claim occurrence. One option is bidirectional Transformer models such as BERT (Bidirectional Encoder Representations from Transformers) or RoBERTa (Robustly Optimized BERT Approach). These models have proven effective across natural language processing tasks and could provide probability estimates for patent claims via masked-token scoring. Contextual embedding models such as ELMo (Embeddings from Language Models) or fine-tuning approaches like ULMFiT (Universal Language Model Fine-tuning) could also be considered for their ability to capture contextual information in text.
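The masked-scoring idea can be sketched with a toy stand-in for a bidirectional model: estimate each token's probability from its left and right neighbours (here via smoothed trigram counts), then sum the masked log-probabilities as a pseudo-log-likelihood. The corpus, smoothing, and function names below are assumptions for illustration, not an actual BERT implementation.

```python
import math
from collections import Counter

# Toy stand-in for a bidirectional (masked) model such as BERT.
corpus = ("a device comprising a processor and a memory wherein "
          "the processor is coupled to the memory").split()
trigrams = Counter(zip(corpus, corpus[1:], corpus[2:]))
contexts = Counter((l, r) for l, _, r in zip(corpus, corpus[1:], corpus[2:]))
V = len(set(corpus))

def masked_prob(left, word, right):
    # Add-one smoothed p(word | left neighbour, right neighbour)
    return (trigrams[(left, word, right)] + 1) / (contexts[(left, right)] + V)

def pseudo_log_likelihood(tokens):
    # PLL(C) = sum over interior positions of log p(t_i | t_{i-1}, t_{i+1})
    return sum(math.log(masked_prob(l, w, r))
               for l, w, r in zip(tokens, tokens[1:], tokens[2:]))

def scope_pll(tokens):
    # Treat -PLL(C) as a stand-in for self-information I(C)
    return 1.0 / -pseudo_log_likelihood(tokens)
```

A real BERT-based variant would replace `masked_prob` with the model's softmax output for each masked position; the aggregation into a scope value would stay the same.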

How could the proposed approach be extended to account for the hierarchical structure of patent claims (independent vs. dependent claims)?

To account for the hierarchical structure of patent claims, the proposed approach could be extended by incorporating a weighting mechanism based on the relationship between independent and dependent claims. Independent claims, which define the core invention, could be given higher weights in the probability calculation compared to dependent claims, which build upon the independent claims. This weighting could be determined based on the position of the claim in the hierarchy (e.g., root independent claim vs. dependent claim) or the number of dependencies a claim has on other claims. By adjusting the probabilities of occurrence based on this hierarchical structure, the scope measurement would more accurately reflect the overall protection and innovation level of the patent.
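One way to realize this weighting is to discount each claim's scope by its depth in the dependency tree before averaging. This is a hypothetical scheme, not something proposed in the paper: the `decay` parameter and the depth-based weights are assumptions for illustration.

```python
def patent_scope(claims, decay=0.5):
    """Aggregate per-claim scopes into a patent-level scope.

    claims: list of (scope_value, depth) pairs, where depth 0 is an
    independent claim, depth 1 depends on an independent claim, and so on.
    decay: hypothetical discount factor applied per level of dependency.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for scope_value, depth in claims:
        weight = decay ** depth        # independent claims weigh the most
        weighted_sum += weight * scope_value
        total_weight += weight
    return weighted_sum / total_weight
```

Under this scheme, a patent whose broad claim is independent scores higher than one where the same broad claim is buried as a dependent, reflecting that independent claims define the core protection.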

Can the scope measurement be further improved by incorporating additional factors beyond just the probability of claim occurrence, such as the technical significance or novelty of the claimed invention?

Yes, the scope measurement can be enhanced by incorporating additional factors beyond just the probability of claim occurrence. Factors such as the technical significance, novelty, and inventive step of the claimed invention could be integrated into the scope calculation. This could involve assigning weights to these factors based on their importance in determining the breadth and depth of the patent claim. For example, a highly novel and technically significant feature within a claim could be given a higher weight, leading to a narrower scope measurement. By considering these additional factors, the scope measurement would not only reflect the linguistic complexity of the claim but also the substantive innovation and uniqueness of the invention, providing a more comprehensive evaluation of the patent claim's scope.
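A minimal sketch of such a combination, assuming novelty and significance scores in [0, 1] from some external assessment; the penalty form and the `alpha`/`beta` weights are hypothetical choices, not part of the paper's method.

```python
def adjusted_scope(base_scope, novelty, significance, alpha=0.5, beta=0.5):
    """Scale a probability-based scope by substantive factors.

    base_scope: scope from the language-model probability, S = 1/I(C).
    novelty, significance: assumed external scores in [0, 1].
    Higher scores mean the claimed features are more limiting, so the
    effective scope is narrowed by dividing through a penalty factor.
    """
    penalty = 1.0 + alpha * novelty + beta * significance
    return base_scope / penalty
```

With zero novelty and significance the measure reduces to the purely linguistic scope, so the adjustment only ever narrows, never broadens, the baseline estimate.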