Class Unifier

java.lang.Object
org.languagetool.rules.patterns.Unifier

public class Unifier extends Object
Implements unification of features over tokens.
  • Field Details

    • UNIFY_IGNORE

      private static final String UNIFY_IGNORE
    • tokSequence

      private final List<AnalyzedTokenReadings> tokSequence
    • tokSequenceEquivalences

      private final List<List<Map<String,Set<String>>>> tokSequenceEquivalences
      List of all equivalences matched per token in the sequence, kept exactly in sync with tokSequence, so that the equivalence map for reading 2 of token 1 is addressable as tokSequenceEquivalences.get(1).get(2).
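      The nesting described above (token index first, then reading index, then an equivalence map) can be illustrated with a minimal sketch. The class EquivalenceIndexSketch is hypothetical and models only the shape of the structure; the feature name "number" and the type "plural" are assumed sample values, and treating the map keys as feature names is likewise an assumption.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in: models only the nesting of tokSequenceEquivalences,
// not the behavior of the real Unifier.
class EquivalenceIndexSketch {

  // Builds an empty token -> reading -> equivalence-map structure.
  static List<List<Map<String, Set<String>>>> build(int tokens, int readingsPerToken) {
    List<List<Map<String, Set<String>>>> equivalences = new ArrayList<>();
    for (int t = 0; t < tokens; t++) {
      List<Map<String, Set<String>>> perReading = new ArrayList<>();
      for (int r = 0; r < readingsPerToken; r++) {
        perReading.add(new HashMap<>());
      }
      equivalences.add(perReading);
    }
    return equivalences;
  }

  public static void main(String[] args) {
    List<List<Map<String, Set<String>>>> eq = build(2, 3);
    // Assumed sample data: reading 2 of token 1 matched "plural" for "number".
    eq.get(1).get(2).computeIfAbsent("number", k -> new HashSet<>()).add("plural");
    System.out.println(eq.get(1).get(2));
  }
}
```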
    • equivalenceTypes

      private final Map<EquivalenceTypeLocator,PatternToken> equivalenceTypes
      A Map for storing the equivalence types for features. Features are specified as Strings, and map into types defined as maps from Strings to Elements.
    • equivalenceFeatures

      private final Map<String,List<String>> equivalenceFeatures
      A Map that stores all possible equivalence types listed for features.
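      As a rough illustration of a feature-to-types map with this signature, the sketch below registers type names under feature names. FeatureRegistrySketch, register and typesFor are hypothetical names, and the sample features ("number", "gender") are assumptions; the real class may populate the map differently.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a Map<String, List<String>> that lists the
// equivalence types known for each feature.
class FeatureRegistrySketch {
  private final Map<String, List<String>> equivalenceFeatures = new HashMap<>();

  // Records a type under a feature, skipping duplicates.
  public void register(String feature, String type) {
    List<String> types = equivalenceFeatures.computeIfAbsent(feature, k -> new ArrayList<>());
    if (!types.contains(type)) {
      types.add(type);
    }
  }

  // Returns the types listed for a feature, or an empty list if none.
  public List<String> typesFor(String feature) {
    return equivalenceFeatures.getOrDefault(feature, new ArrayList<>());
  }

  public static void main(String[] args) {
    FeatureRegistrySketch registry = new FeatureRegistrySketch();
    registry.register("number", "singular");
    registry.register("number", "plural");
    registry.register("gender", "feminine");
    System.out.println(registry.typesFor("number"));
  }
}
```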
    • equivalencesMatched

      private final List<Map<String,Set<String>>> equivalencesMatched
      List of maps holding the sets of matched equivalences in the unified sequence.
    • allFeatsIn

      private boolean allFeatsIn
    • tokCnt

      private int tokCnt
    • readingsCounter

      private int readingsCounter
    • featuresFound

      private List<Boolean> featuresFound
    • tmpFeaturesFound

      private List<Boolean> tmpFeaturesFound
    • equivalencesToBeKept

      private final Map<String,Set<String>> equivalencesToBeKept
    • unificationFeats

      private Map<String,List<String>> unificationFeats
    • inUnification

      private boolean inUnification
    • uniMatched

      private boolean uniMatched
    • uniAllMatched

      private boolean uniAllMatched
  • Constructor Details

  • Method Details

    • isSatisfied

      protected final boolean isSatisfied(AnalyzedToken aToken, Map<String,List<String>> uFeatures)
      Tests if a token has shared features with other tokens.
      Parameters:
      aToken - token to be tested
      uFeatures - features to be tested
      Returns:
      true if the token shares this type of feature with other tokens
    • checkNext

      private boolean checkNext(AnalyzedToken aToken, Map<String,List<String>> uFeatures)
    • startNextToken

      public final void startNextToken()
      Call after every complete token (AnalyzedTokenReadings) has been checked.
    • startUnify

      public final void startUnify()
      Starts testing only those equivalences that were previously matched.
    • getFinalUnificationValue

      public final boolean getFinalUnificationValue(Map<String,List<String>> uFeatures)
      Make sure that we really matched all the required features of the unification.
      Parameters:
      uFeatures - Features to be checked
      Returns:
      True if the token sequence has been found.
      Since:
      2.5
    • reset

      public final void reset()
      Resets after use of unification. Required.
    • getUnifiedTokens

      @Nullable public final @Nullable AnalyzedTokenReadings[] getUnifiedTokens()
      Gets a full sequence of filtered tokens.
      Returns:
      Array of AnalyzedTokenReadings that match equivalence relation defined for features tested, or null
    • addTokenToSequence

      private void addTokenToSequence(List<AnalyzedTokenReadings> tokenSequence, AnalyzedToken token, int pos)
    • isUnified

      public final boolean isUnified(AnalyzedToken matchToken, Map<String,List<String>> uFeatures, boolean lastReading, boolean isMatched)
      Tests if the token sequence is unified.

      Usage note: to test whether a sequence of tokens is unified (i.e., shares a group of features, such as the same gender, number, or grammatical case), call isUnified() for every reading of every token, setting lastReading to true on the last reading of each token. For all tokens but the last one, the returned value may be discarded; for the last token, check the truth value returned by this method. See AbstractPatternRule for an example.

      To make it work in XML rules, the Elements built based on <token>s inside the unify block have to be processed in a special way: namely the last Element has to be marked as the last one (by using PatternToken.setLastInUnification()).
      Parameters:
      matchToken - the AnalyzedToken to unify
      uFeatures - features to be tested
      lastReading - true when the matchToken is the last reading in the AnalyzedTokenReadings
      isMatched - true if the reading matches the element in the pattern rule, otherwise the reading is not considered in the unification
      Returns:
      true if the tokens in the sequence are unified
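      The reading-by-reading protocol in the usage note can be illustrated with a toy model. This is a simplified sketch under stated assumptions, not LanguageTool's algorithm: ToyUnifier, addReading and reading are hypothetical names, each reading is reduced to a feature-to-value map, and a sequence counts as unified when at least one combination of feature values is shared by some reading of every token.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model; it does not reproduce LanguageTool's Unifier.
class ToyUnifier {
  private final List<String> features;
  // Value combinations shared by every finished token; null before the first token.
  private Set<List<String>> shared;
  // Value combinations seen among the readings of the token in progress.
  private final Set<List<String>> current = new HashSet<>();

  ToyUnifier(List<String> features) {
    this.features = new ArrayList<>(features);
  }

  // Feeds one reading. A verdict is given only on the last reading of a token:
  // true if the tokens seen so far still share a value combination.
  boolean addReading(Map<String, String> reading, boolean lastReading) {
    List<String> values = new ArrayList<>();
    for (String feature : features) {
      String value = reading.get(feature);
      if (value != null) {
        values.add(value);
      }
    }
    // Readings that lack one of the features cannot take part.
    if (values.size() == features.size()) {
      current.add(values);
    }
    if (!lastReading) {
      return false; // no verdict until the token is complete
    }
    if (shared == null) {
      shared = new HashSet<>(current);
    } else {
      shared.retainAll(current);
    }
    current.clear();
    return !shared.isEmpty();
  }

  // Clears all state so the instance can be reused for another sequence.
  void reset() {
    shared = null;
    current.clear();
  }

  // Helper building an assumed two-feature reading.
  static Map<String, String> reading(String number, String gender) {
    Map<String, String> r = new HashMap<>();
    r.put("number", number);
    r.put("gender", gender);
    return r;
  }

  public static void main(String[] args) {
    ToyUnifier unifier = new ToyUnifier(Arrays.asList("number", "gender"));
    // First token: two readings, results discarded.
    unifier.addReading(reading("sg", "fem"), false);
    unifier.addReading(reading("pl", "fem"), true);
    // Last token: the returned value is the one that matters.
    boolean unified = unifier.addReading(reading("sg", "fem"), true);
    System.out.println(unified);
    unifier.reset();
  }
}
```

      As in the documented protocol, the value returned for earlier tokens is ignored and only the result for the final token decides whether the sequence is unified.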
    • isUnified

      public final boolean isUnified(AnalyzedToken matchToken, Map<String,List<String>> uFeatures, boolean lastReading)
    • addNeutralElement

      public final void addNeutralElement(AnalyzedTokenReadings analyzedTokenReadings)
      Used to add neutral elements (AnalyzedTokenReadings) to the unified sequence. Useful if the sequence contains punctuation or connectives, for example.
      Parameters:
      analyzedTokenReadings - A neutral element to be added.
      Since:
      2.5
    • getFinalUnified

      @Nullable public final @Nullable AnalyzedTokenReadings[] getFinalUnified()
      Used for getting a unified sequence when the simple test method isUnified(AnalyzedToken, Map, boolean) was used.
      Returns:
      An array of AnalyzedTokenReadings or null when not in unification