Class TreeWalk

java.lang.Object
org.eclipse.jgit.treewalk.TreeWalk
All Implemented Interfaces:
AutoCloseable, AttributesProvider
Direct Known Subclasses:
NameConflictTreeWalk

public class TreeWalk extends Object implements AutoCloseable, AttributesProvider
Walks one or more AbstractTreeIterators in parallel.

This class can perform n-way differences across as many trees as necessary.

Each tree added must have the same root as existing trees in the walk.

A TreeWalk instance can only be used once to generate results. Running a second time requires creating a new TreeWalk instance, or invoking reset() and adding new trees before starting again. Resetting an existing instance may be faster for some applications as some internal buffers may be recycled.

TreeWalk instances are not thread-safe. Applications must either restrict usage of a TreeWalk instance to a single thread, or implement their own synchronization at a higher level.

Multiple simultaneous TreeWalk instances per Repository are permitted, even from concurrent threads.
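
To make the idea of walking several trees in parallel more concrete, the following is a minimal standalone sketch that does not use JGit. It assumes each tree is a flat, already sorted list of file names (real Git trees also contain subtrees and follow their own ordering rules), and the names `ParallelWalkSketch` and `walk` are hypothetical. For every distinct name it reports which input trees contain it, which is the per-entry information an n-way difference is built on.

```java
import java.util.ArrayList;
import java.util.List;

/** Standalone model of an n-way parallel walk; this is not JGit code. */
final class ParallelWalkSketch {

    /**
     * Walks all trees in lockstep and returns one line per distinct name,
     * formatted as "name:" plus one flag per tree ('+' present, '-' missing).
     * Assumption: every input list is sorted ascending and has no duplicates.
     */
    static List<String> walk(List<List<String>> trees) {
        int[] pos = new int[trees.size()]; // one cursor per tree
        List<String> out = new ArrayList<>();
        while (true) {
            // Find the smallest name any cursor currently points at.
            String min = null;
            for (int i = 0; i < pos.length; i++) {
                List<String> t = trees.get(i);
                if (pos[i] < t.size()) {
                    String name = t.get(pos[i]);
                    if (min == null || name.compareTo(min) < 0) {
                        min = name;
                    }
                }
            }
            if (min == null) {
                return out; // every tree is exhausted
            }
            // Report presence per tree and advance only the cursors that matched.
            StringBuilder line = new StringBuilder(min).append(':');
            for (int i = 0; i < pos.length; i++) {
                List<String> t = trees.get(i);
                boolean present = pos[i] < t.size() && t.get(pos[i]).equals(min);
                line.append(present ? '+' : '-');
                if (present) {
                    pos[i]++;
                }
            }
            out.add(line.toString());
        }
    }

    public static void main(String[] args) {
        List<String> result = walk(List.of(
                List.of("a.txt", "b.txt"),
                List.of("b.txt", "c.txt")));
        System.out.println(result); // [a.txt:+-, b.txt:++, c.txt:-+]
    }
}
```

In this model a '-' flag is the counterpart of a tree that does not contain the current entry; the walker described here reports that case through FileMode.MISSING and ObjectId.zeroId().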

  • Field Details

    • NO_TREES

      private static final AbstractTreeIterator[] NO_TREES
    • operationType

      private TreeWalk.OperationType operationType
      Type of operation you want to retrieve the git attributes for.
    • filterCommandsByNameDotType

      private Map<String,String> filterCommandsByNameDotType
      The filter command as defined in gitattributes. The keys are filterName+"."+filterCommandType. E.g. "lfs.clean"
    • reader

      private final ObjectReader reader
    • closeReader

      private final boolean closeReader
    • idBuffer

      private final MutableObjectId idBuffer
    • filter

      private TreeFilter filter
    • trees

    • recursive

      private boolean recursive
    • postOrderTraversal

      private boolean postOrderTraversal
    • depth

      int depth
    • advance

      private boolean advance
    • postChildren

      private boolean postChildren
    • attributesNodeProvider

      private AttributesNodeProvider attributesNodeProvider
    • currentHead

    • attrs

      private Attributes attrs
      Cached attributes for the current entry.
    • attributesHandler

      private AttributesHandler attributesHandler
      Cached attributes handler
    • config

      private Config config
    • filterCommands

      private Set<String> filterCommands
  • Constructor Details

    • TreeWalk

      public TreeWalk(Repository repo)
      Create a new tree walker for a given repository.
      Parameters:
      repo - the repository the walker will obtain data from. An ObjectReader will be created by the walker, and will be closed when the walker is closed.
    • TreeWalk

      public TreeWalk(@Nullable Repository repo, ObjectReader or)
      Create a new tree walker for a given repository.
      Parameters:
      repo - the repository to read config data and the AttributesNodeProvider from; may be null. Tree data is read through the supplied reader, not through a reader created by the walker.
      or - the reader the walker will obtain tree data from. The reader is not closed when the walker is closed.
      Since:
      4.3
    • TreeWalk

      public TreeWalk(ObjectReader or)
      Create a new tree walker that reads tree data from the given object reader.
      Parameters:
      or - the reader the walker will obtain tree data from. The reader is not closed when the walker is closed.
    • TreeWalk

      private TreeWalk(@Nullable Repository repo, ObjectReader or, boolean closeReader)
  • Method Details

    • setOperationType

      public void setOperationType(TreeWalk.OperationType operationType)
      Set the operation type of this walk
      Parameters:
      operationType - a TreeWalk.OperationType object.
      Since:
      4.2
    • forPath

      Open a tree walk and filter to exactly one path.

      The returned tree walk is already positioned on the requested path, so the caller should not need to invoke next() unless they are looking for a possible directory/file name conflict.

      Parameters:
      reader - the reader the walker will obtain tree data from.
      path - single path to advance the tree walk instance into.
      trees - one or more trees to walk through, all with the same root.
      Returns:
      a new tree walk configured for exactly this one path; null if no path was found in any of the trees.
      Throws:
      IOException - reading a pack file or loose object failed.
      CorruptObjectException - a tree object could not be read as its data stream did not appear to be a tree, or could not be inflated.
      IncorrectObjectTypeException - an object we expected to be a tree was not a tree.
      MissingObjectException - a tree object was not found.
    • forPath

      Open a tree walk and filter to exactly one path.

      The returned tree walk is already positioned on the requested path, so the caller should not need to invoke next() unless they are looking for a possible directory/file name conflict.

      Parameters:
      repo - repository to read config data and AttributesNodeProvider from.
      reader - the reader the walker will obtain tree data from.
      path - single path to advance the tree walk instance into.
      trees - one or more trees to walk through, all with the same root.
      Returns:
      a new tree walk configured for exactly this one path; null if no path was found in any of the trees.
      Throws:
      IOException - reading a pack file or loose object failed.
      CorruptObjectException - a tree object could not be read as its data stream did not appear to be a tree, or could not be inflated.
      IncorrectObjectTypeException - an object we expected to be a tree was not a tree.
      MissingObjectException - a tree object was not found.
      Since:
      4.3
    • forPath

      Open a tree walk and filter to exactly one path.

      The returned tree walk is already positioned on the requested path, so the caller should not need to invoke next() unless they are looking for a possible directory/file name conflict.

      Parameters:
      db - repository to read tree object data from.
      path - single path to advance the tree walk instance into.
      trees - one or more trees to walk through, all with the same root.
      Returns:
      a new tree walk configured for exactly this one path; null if no path was found in any of the trees.
      Throws:
      IOException - reading a pack file or loose object failed.
      CorruptObjectException - a tree object could not be read as its data stream did not appear to be a tree, or could not be inflated.
      IncorrectObjectTypeException - an object we expected to be a tree was not a tree.
      MissingObjectException - a tree object was not found.
    • forPath

      Open a tree walk and filter to exactly one path.

      The returned tree walk is already positioned on the requested path, so the caller should not need to invoke next() unless they are looking for a possible directory/file name conflict.

      Parameters:
      db - repository to read tree object data from.
      path - single path to advance the tree walk instance into.
      tree - the single tree to walk through.
      Returns:
      a new tree walk configured for exactly this one path; null if no path was found in any of the trees.
      Throws:
      IOException - reading a pack file or loose object failed.
      CorruptObjectException - a tree object could not be read as its data stream did not appear to be a tree, or could not be inflated.
      IncorrectObjectTypeException - an object we expected to be a tree was not a tree.
      MissingObjectException - a tree object was not found.
    • getObjectReader

      public ObjectReader getObjectReader()
      Get the reader this walker is using to load objects.
      Returns:
      the reader this walker is using to load objects.
    • getOperationType

      public TreeWalk.OperationType getOperationType()
      Get the operation type
      Returns:
      the TreeWalk.OperationType
      Since:
      4.3
    • close

      public void close()

      Release any resources used by this walker's reader.

      A walker that has been released can be used again, but may need to be released after the subsequent usage.

      Specified by:
      close in interface AutoCloseable
      Since:
      4.0
    • getFilter

      public TreeFilter getFilter()
      Get the currently configured filter.
      Returns:
      the current filter. Never null as a filter is always needed.
    • setFilter

      public void setFilter(TreeFilter newFilter)
      Set the tree entry filter for this walker.

      Multiple filters may be combined by constructing an arbitrary tree of AndTreeFilter or OrTreeFilter instances to describe the boolean expression required by the application. Custom filter implementations may also be constructed by applications.

      Note that filters are not thread-safe and may not be shared by concurrent TreeWalk instances. Every TreeWalk must be supplied its own unique filter, unless the filter implementation specifically states it is (and always will be) thread-safe. Callers may use TreeFilter.clone() to create a unique filter tree for this TreeWalk instance.

      Parameters:
      newFilter - the new filter. If null the special TreeFilter.ALL filter will be used instead, as it matches every entry.
    • isRecursive

      public boolean isRecursive()
      Is this walker automatically entering into subtrees?

      If the walker is recursive then the caller will not see a subtree node and instead will only receive file nodes in all relevant subtrees.

      Returns:
      true if automatically entering subtrees is enabled.
    • setRecursive

      public void setRecursive(boolean b)
      Set the walker to enter (or not enter) subtrees automatically.

      If recursive mode is enabled the walker will hide subtree nodes from the calling application and will produce only file level nodes. If a tree (directory) is deleted then all of the file level nodes will appear to be deleted, recursively, through as many levels as necessary to account for all entries.

      Parameters:
      b - true to skip subtree nodes and only obtain file nodes.
    • isPostOrderTraversal

      public boolean isPostOrderTraversal()
      Does this walker return a tree entry after it exits the subtree?

      If post order traversal is enabled then the walker will return a subtree after it has returned the last entry within that subtree. This may cause a subtree to be seen by the application twice if isRecursive() is false, as the application will see it once, call enterSubtree(), and then see it again as it leaves the subtree.

      If an application does not enable isRecursive() and it does not call enterSubtree() then the tree is returned only once as none of the children were processed.

      Returns:
      true if subtrees are returned after entries within the subtree.
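
      The difference between the two reporting orders can be illustrated with a small standalone model that does not use JGit. The `Node` type, the `visit` method and the "(post)" marker are hypothetical; the marker stands in for the second sighting of a subtree, which this class signals through isPostChildren(). The sketch assumes the caller enters every subtree it sees.

```java
import java.util.ArrayList;
import java.util.List;

/** Standalone model of pre-order versus post-order reporting; not JGit code. */
final class PostOrderSketch {

    /** Hypothetical tree node: a file when children is null, otherwise a subtree. */
    static final class Node {
        final String name;
        final List<Node> children;

        Node(String name, List<Node> children) {
            this.name = name;
            this.children = children;
        }
    }

    /**
     * Lists entries the way a caller that enters every subtree would see them.
     * With postOrder enabled a subtree is reported a second time, marked
     * "(post)", once all of its children have been reported.
     */
    static List<String> visit(List<Node> entries, String prefix, boolean postOrder) {
        List<String> seen = new ArrayList<>();
        for (Node n : entries) {
            String path = prefix + n.name;
            seen.add(path); // first sighting: a file, or a subtree before its children
            if (n.children != null) {
                seen.addAll(visit(n.children, path + "/", postOrder));
                if (postOrder) {
                    seen.add(path + " (post)"); // second sighting, after the children
                }
            }
        }
        return seen;
    }

    public static void main(String[] args) {
        List<Node> root = List.of(
                new Node("README", null),
                new Node("src", List.of(new Node("Main.java", null))));
        System.out.println(visit(root, "", false)); // [README, src, src/Main.java]
        System.out.println(visit(root, "", true));  // [README, src, src/Main.java, src (post)]
    }
}
```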
    • setPostOrderTraversal

      public void setPostOrderTraversal(boolean b)
      Set the walker to return trees after their children.
      Parameters:
      b - true to get trees after their children.
    • setAttributesNodeProvider

      public void setAttributesNodeProvider(AttributesNodeProvider provider)
      Sets the AttributesNodeProvider for this TreeWalk.

      This is required for correct computation of the git attributes. If this TreeWalk has been built using the TreeWalk(Repository) constructor, the AttributesNodeProvider has already been set, because the Repository can provide one through the Repository.createAttributesNodeProvider() method. Otherwise you should provide one.

      Parameters:
      provider - an AttributesNodeProvider object.
      Since:
      4.2
    • getAttributesNodeProvider

      public AttributesNodeProvider getAttributesNodeProvider()
      Get the attributes node provider
      Returns:
      the AttributesNodeProvider for this TreeWalk.
      Since:
      4.3
    • getAttributes

      public Attributes getAttributes()
      Get attributes

      Retrieve the git attributes for the current entry.

      Git attribute computation

      • Gets the attributes matching the current path entry from the info file (see AttributesNodeProvider.getInfoAttributesNode()).
      • Completes the list of attributes using the .gitattributes files located on the current path (the further the directory that contains .gitattributes is from the path in question, the lower its precedence). For a checkin operation, it looks first in the working tree (if any); if there is no attributes file there, it falls back on the index. For a checkout operation, it first uses the index entry and then falls back on the working tree if there is none.
      • Finally, completes the list of matching attributes using the global attributes file defined in the configuration (see AttributesNodeProvider.getGlobalAttributesNode()).

      Iterator constraints

      In order to have a correct list of attributes for the current entry, this TreeWalk requires at least one AttributesNodeProvider and a DirCacheIterator to be set up. An AttributesNodeProvider is used to retrieve the attributes from the info attributes file and the global attributes file. The DirCacheIterator is used to retrieve the .gitattributes files stored in the index. A WorkingTreeIterator can also be provided to access the local version of the .gitattributes files. If none is provided, it falls back on the DirCacheIterator.

      Specified by:
      getAttributes in interface AttributesProvider
      Returns:
      the currently active attributes
      Since:
      4.2
    • getEolStreamType

      @Nullable public CoreConfig.EolStreamType getEolStreamType(TreeWalk.OperationType opType)
      Get the EOL stream type of the current entry using the config and getAttributes().
      Parameters:
      opType - the operation type (checkin/checkout) which should be used
      Returns:
      the EOL stream type of the current entry using the config and getAttributes(). Note that this method may return null if the TreeWalk is not based on a working tree
      Since:
      4.10
    • reset

      public void reset()
      Reset this walker so new tree iterators can be added to it.
    • reset

      Reset this walker to run over a single existing tree.
      Parameters:
      id - the tree we need to parse. The walker will execute over this single tree if the reset is successful.
      Throws:
      MissingObjectException - the given tree object does not exist in this repository.
      IncorrectObjectTypeException - the given object id does not denote a tree, but instead names some other non-tree type of object. Note that commits are not trees, even if they are sometimes called a "tree-ish".
      CorruptObjectException - the object claimed to be a tree, but its contents did not appear to be a tree. The repository may have data corruption.
      IOException - a loose object or pack file could not be read.
    • reset

      Reset this walker to run over a set of existing trees.
      Parameters:
      ids - the trees we need to parse. The walker will execute over this many parallel trees if the reset is successful.
      Throws:
      MissingObjectException - the given tree object does not exist in this repository.
      IncorrectObjectTypeException - the given object id does not denote a tree, but instead names some other non-tree type of object. Note that commits are not trees, even if they are sometimes called a "tree-ish".
      CorruptObjectException - the object claimed to be a tree, but its contents did not appear to be a tree. The repository may have data corruption.
      IOException - a loose object or pack file could not be read.
    • addTree

      Add an already existing tree object for walking.

      The position of this tree is returned to the caller, in case the caller has lost track of the order they added the trees into the walker.

      The tree must have the same root as existing trees in the walk.

      Parameters:
      id - identity of the tree object the caller wants walked.
      Returns:
      position of this tree within the walker.
      Throws:
      MissingObjectException - the given tree object does not exist in this repository.
      IncorrectObjectTypeException - the given object id does not denote a tree, but instead names some other non-tree type of object. Note that commits are not trees, even if they are sometimes called a "tree-ish".
      CorruptObjectException - the object claimed to be a tree, but its contents did not appear to be a tree. The repository may have data corruption.
      IOException - a loose object or pack file could not be read.
    • addTree

      public int addTree(AbstractTreeIterator p)
      Add an already created tree iterator for walking.

      The position of this tree is returned to the caller, in case the caller has lost track of the order they added the trees into the walker.

      The tree which the iterator operates on must have the same root as existing trees in the walk.

      Parameters:
      p - an iterator to walk over. The iterator should be new, with no parent, and should still be positioned before the first entry. The tree which the iterator operates on must have the same root as other trees in the walk.
      Returns:
      position of this tree within the walker.
    • getTreeCount

      public int getTreeCount()
      Get the number of trees known to this walker.
      Returns:
      the total number of trees this walker is iterating over.
    • next

      Advance this walker to the next relevant entry.
      Returns:
      true if there is an entry available; false if all entries have been walked and the walk of this set of tree iterators is over.
      Throws:
      MissingObjectException - isRecursive() was enabled, a subtree was found, but the subtree object does not exist in this repository. The repository may be missing objects.
      IncorrectObjectTypeException - isRecursive() was enabled, a subtree was found, and the subtree id does not denote a tree, but instead names some other non-tree type of object. The repository may have data corruption.
      CorruptObjectException - the contents of a tree did not appear to be a tree. The repository may have data corruption.
      IOException - a loose object or pack file could not be read.
    • stopWalk

      void stopWalk() throws IOException
      Notify iterators the walk is aborting.

      Primarily to notify DirCacheBuildIterator the walk is aborting so that it can copy any remaining entries.

      Throws:
      IOException - if traversal of remaining entries throws an exception during object access. This should never occur as remaining trees should already be in memory, however the methods used to finish traversal are declared to throw IOException.
    • getTree

      public <T extends AbstractTreeIterator> T getTree(int nth, Class<T> clazz)
      Obtain the tree iterator for the current entry.

      Entering into (or exiting out of) a subtree causes the current tree iterator instance to be changed for the nth tree. This allows the tree iterators to manage only one list of items, with the diving handled by recursive trees.

      Parameters:
      nth - tree to obtain the current iterator of.
      clazz - type of the tree iterator expected by the caller.
      Returns:
      the current iterator of the requested type; null if the tree has no entry to match the current path.
    • getRawMode

      public int getRawMode(int nth)
      Obtain the raw FileMode bits for the current entry.

      Every added tree supplies mode bits, even if the tree does not contain the current entry. In the latter case FileMode.MISSING's mode bits (0) are returned.

      Parameters:
      nth - tree to obtain the mode bits from.
      Returns:
      mode bits for the current entry of the nth tree.
    • getFileMode

      public FileMode getFileMode(int nth)
      Obtain the FileMode for the current entry.

      Every added tree supplies a mode, even if the tree does not contain the current entry. In the latter case FileMode.MISSING is returned.

      Parameters:
      nth - tree to obtain the mode from.
      Returns:
      mode for the current entry of the nth tree.
    • getFileMode

      public FileMode getFileMode()
      Obtain the FileMode for the current entry on the currentHead tree
      Returns:
      mode for the current entry of the currentHead tree.
      Since:
      4.3
    • getObjectId

      public ObjectId getObjectId(int nth)
      Obtain the ObjectId for the current entry.

      Using this method to compare ObjectId values between trees of this walker is very inefficient. Applications should try to use idEqual(int, int) or getObjectId(MutableObjectId, int) whenever possible.

      Every tree supplies an object id, even if the tree does not contain the current entry. In the latter case ObjectId.zeroId() is returned.

      Parameters:
      nth - tree to obtain the object identifier from.
      Returns:
      object identifier for the current tree entry.
    • getObjectId

      public void getObjectId(MutableObjectId out, int nth)
      Obtain the ObjectId for the current entry.

      Every tree supplies an object id, even if the tree does not contain the current entry. In the latter case ObjectId.zeroId() is supplied.

      Applications should try to use idEqual(int, int) when possible as it avoids conversion overheads.

      Parameters:
      out - buffer to copy the object id into.
      nth - tree to obtain the object identifier from.
    • idEqual

      public boolean idEqual(int nthA, int nthB)
      Compare two trees' current ObjectId values for equality.
      Parameters:
      nthA - first tree to compare the object id from.
      nthB - second tree to compare the object id from.
      Returns:
      result of getObjectId(nthA).equals(getObjectId(nthB)).
    • getNameString

      public String getNameString()
      Get the current entry's name within its parent tree.

      This method is not very efficient and is primarily meant for debugging and final output generation. Applications should try to avoid calling it, and if invoked do so only once per interesting entry, where the name is absolutely required for correct function.

      Returns:
      name of the current entry within the parent tree (or directory). The name never includes a '/'.
    • getPathString

      public String getPathString()
      Get the current entry's complete path.

      This method is not very efficient and is primarily meant for debugging and final output generation. Applications should try to avoid calling it, and if invoked do so only once per interesting entry, where the name is absolutely required for correct function.

      Returns:
      complete path of the current entry, from the root of the repository. If the current entry is in a subtree there will be at least one '/' in the returned string.
    • getRawPath

      public byte[] getRawPath()
      Get the current entry's complete path as a UTF-8 byte array.
      Returns:
      complete path of the current entry, from the root of the repository. If the current entry is in a subtree there will be at least one '/' in the returned array.
    • getPathLength

      public int getPathLength()
      Get the path length of the current entry.
      Returns:
      The path length of the current entry.
    • isPathMatch

      public int isPathMatch(byte[] p, int pLen)
      Test if the supplied path matches the current entry's path.

      This method detects if the supplied path is equal to, a subtree of, or not similar at all to the current entry. It is faster to use this method than to use getPathString() to first create a String object, then test startsWith or some other type of string match function.

      If the current entry is a subtree, then all paths within the subtree are considered to match it.

      Parameters:
      p - path buffer to test. Callers should ensure the path does not end with '/' prior to invocation.
      pLen - number of bytes from p to test.
      Returns:
      -1 if the current path is a parent to p; 0 if p matches the current path; 1 if the current path is different and will never match again on this tree walk.
      Since:
      4.7
    • isPathPrefix

      public int isPathPrefix(byte[] p, int pLen)
      Test if the supplied path matches the current entry's path.

      This method tests that the supplied path is exactly equal to the current entry or is one of its parent directories. It is faster to use this method than to use getPathString() to first create a String object, then test startsWith or some other type of string match function.

      If the current entry is a subtree, then all paths within the subtree are considered to match it.

      Parameters:
      p - path buffer to test. Callers should ensure the path does not end with '/' prior to invocation.
      pLen - number of bytes from p to test.
      Returns:
      < 0 if p is before the current path; 0 if p matches the current path; 1 if the current path is past p and p will never match again on this tree walk.
    • isPathSuffix

      public boolean isPathSuffix(byte[] p, int pLen)
      Test if the supplied path matches (being suffix of) the current entry's path.

      This method tests that the supplied path is exactly equal to the current entry, or is relative to one of the entry's parent directories. It is faster to use this method than to use getPathString() to first create a String object, then test endsWith or some other type of string match function.

      Parameters:
      p - path buffer to test.
      pLen - number of bytes from p to test.
      Returns:
      true if p is a suffix of the current path; false otherwise.
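
      As an illustration of what a byte-level suffix test looks like, here is a minimal standalone sketch that does not use JGit. The class `PathSuffixSketch` and its `isSuffix` method are hypothetical, and the current path is passed in explicitly instead of being taken from a walker. It is a plain byte comparison under those assumptions, not a statement about how this method is implemented.

```java
import java.nio.charset.StandardCharsets;

/** Standalone model of a byte-wise path suffix test; not JGit code. */
final class PathSuffixSketch {

    /**
     * Returns true if the first pLen bytes of p equal the last pLen bytes
     * of the first cLen bytes of current.
     */
    static boolean isSuffix(byte[] current, int cLen, byte[] p, int pLen) {
        if (pLen > cLen) {
            return false; // the candidate is longer than the current path
        }
        // Compare from the end of both buffers towards the front.
        for (int i = 1; i <= pLen; i++) {
            if (current[cLen - i] != p[pLen - i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] current = "src/main/App.java".getBytes(StandardCharsets.UTF_8);
        byte[] suffix = "main/App.java".getBytes(StandardCharsets.UTF_8);
        byte[] other = "test/App.java".getBytes(StandardCharsets.UTF_8);
        System.out.println(isSuffix(current, current.length, suffix, suffix.length)); // true
        System.out.println(isSuffix(current, current.length, other, other.length));   // false
    }
}
```

      Working on the raw bytes avoids building a String for every entry, which is the cost the description above warns about. Note that this sketch does not check that the suffix begins at a '/' boundary.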
    • getDepth

      public int getDepth()
      Get the current subtree depth of this walker.
      Returns:
      the current subtree depth of this walker.
    • isSubtree

      public boolean isSubtree()
      Is the current entry a subtree?

      This method is faster than testing the raw mode bits of all trees to see if any of them are a subtree. If at least one is a subtree then this method will return true.

      Returns:
      true if enterSubtree() will work on the current node.
    • isPostChildren

      public boolean isPostChildren()
      Is the current entry a subtree returned after its children?
      Returns:
      true if the current node is a tree that has been returned after its children were already processed.
    • enterSubtree

      Enter into the current subtree.

      If the current entry is a subtree this method arranges for its children to be returned before the next sibling following the subtree is returned.

      Throws:
      MissingObjectException - a subtree was found, but the subtree object does not exist in this repository. The repository may be missing objects.
      IncorrectObjectTypeException - a subtree was found, and the subtree id does not denote a tree, but instead names some other non-tree type of object. The repository may have data corruption.
      CorruptObjectException - the contents of a tree did not appear to be a tree. The repository may have data corruption.
      IOException - a loose object or pack file could not be read.
    • min

      Throws:
      CorruptObjectException
    • popEntriesEqual

      void popEntriesEqual() throws CorruptObjectException
      Throws:
      CorruptObjectException
    • skipEntriesEqual

      void skipEntriesEqual() throws CorruptObjectException
      Throws:
      CorruptObjectException
    • exitSubtree

      void exitSubtree()
    • parserFor

      Throws:
      IncorrectObjectTypeException
      IOException
    • pathOf

      static String pathOf(AbstractTreeIterator t)
    • pathOf

      static String pathOf(byte[] buf, int pos, int end)
    • getTree

      public <T extends AbstractTreeIterator> T getTree(Class<T> type)
      Get the tree of that type.
      Type Parameters:
      T - a tree type.
      Parameters:
      type - the type of the tree to be queried
      Returns:
      the tree of that type or null if none is present.
      Since:
      4.3
    • getFilterCommand

      public String getFilterCommand(String filterCommandType) throws IOException
      Inspect config and attributes to return a filter command applicable for the current path, but without expanding %f occurrences.
      Parameters:
      filterCommandType - which type of filterCommand should be executed. E.g. "clean", "smudge"
      Returns:
      a filter command
      Throws:
      IOException
      Since:
      4.2
    • getFilterCommandDefinition

      private String getFilterCommandDefinition(String filterDriverName, String filterCommandType)
      Get the filter command as it is defined in gitconfig. The returned string may contain "%f" which needs to be replaced by the current path before executing the filter command. These filter definitions are cached for better performance.
      Parameters:
      filterDriverName - The name of the filter driver as it is referenced in the gitattributes file. E.g. "lfs". For each filter driver there may be many commands defined in the .gitconfig
      filterCommandType - The type of the filter command for a specific filter driver. May be "clean" or "smudge".
      Returns:
      the definition of the command to be executed for this filter driver and filter command
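
      The lookup, caching and "%f" handling described for the filter command methods can be sketched without JGit as follows. Everything in this example is hypothetical: the class `FilterCommandSketch`, its methods, and the plain Map that stands in for git config (assumed to hold keys of the form filter.<driver>.<type>). It shows only the documented cache key shape, filterName + "." + filterCommandType, and a naive "%f" substitution.

```java
import java.util.HashMap;
import java.util.Map;

/** Standalone model of filter command lookup and "%f" expansion; not JGit code. */
final class FilterCommandSketch {

    // Hypothetical stand-in for git config: "filter.<driver>.<type>" -> command.
    private final Map<String, String> config;
    // Cache keyed by filterName + "." + filterCommandType, e.g. "lfs.clean".
    private final Map<String, String> cache = new HashMap<>();

    FilterCommandSketch(Map<String, String> config) {
        this.config = config;
    }

    /** Returns the raw definition, still containing "%f", or null if undefined. */
    String definition(String driver, String type) {
        String key = driver + "." + type;
        // computeIfAbsent stores nothing for a null result, so undefined
        // commands are simply looked up again next time.
        return cache.computeIfAbsent(key, k -> config.get("filter." + k));
    }

    /** Replaces every "%f" with the path, without any shell quoting. */
    static String expand(String definition, String path) {
        return definition.replace("%f", path);
    }

    public static void main(String[] args) {
        FilterCommandSketch sketch = new FilterCommandSketch(
                Map.of("filter.lfs.clean", "git-lfs clean -- %f"));
        String raw = sketch.definition("lfs", "clean");
        System.out.println(expand(raw, "assets/logo.png")); // git-lfs clean -- assets/logo.png
    }
}
```

      A caller that actually executes the command would also need to quote the path for the shell; the sketch leaves that out to keep the substitution step visible.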