d1_common.iter package

This package contains iterators that provide a convenient way to retrieve and iterate over Node contents.


Submodules

d1_common.iter.bytes module

Generator that returns a bytes object in chunks.

class d1_common.iter.bytes.BytesIterator(bytes_, chunk_size=1024)

Bases: object

Generator that returns a bytes object in chunks.

size

Returns:

int: The total number of bytes that will be returned by the iterator.
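The chunking behaviour described above can be sketched without the library. This is only an illustration, not the BytesIterator implementation: the class name ChunkedBytes is hypothetical, and it assumes that size simply reports the length of the wrapped bytes object.

```python
class ChunkedBytes:
    """Hypothetical stand-in that illustrates chunked iteration over bytes."""

    def __init__(self, bytes_, chunk_size=1024):
        self._bytes = bytes_
        self._chunk_size = chunk_size

    def __iter__(self):
        # Yield consecutive slices; the final chunk may be shorter.
        for start in range(0, len(self._bytes), self._chunk_size):
            yield self._bytes[start:start + self._chunk_size]

    @property
    def size(self):
        # Assumption: the total number of bytes the iterator will yield.
        return len(self._bytes)


chunks = list(ChunkedBytes(b"abcdefghij", chunk_size=4))
total = ChunkedBytes(b"abcdefghij").size
```

Knowing the total size up front is useful when a consumer needs it before reading any chunks, for example to set a content length.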

d1_common.iter.file module

Generator that returns the bytes of a file in chunks.

class d1_common.iter.file.FileIterator(path, chunk_size=1024)

Bases: object

Generator that returns the bytes of a file in chunks.

size

Returns:

int : The total number of bytes that will be returned by the iterator.

class d1_common.iter.file.FileLikeObjectIterator(file, chunk_size=1024)

Bases: object

Generator that returns the bytes of a file-like object in chunks.

size

Returns:

int : The total number of bytes that will be returned by the iterator.
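As a rough sketch of how chunked reading from a file-like object can work, the example below reads fixed-size chunks until the stream is exhausted. The helper names iter_file_chunks and stream_size are hypothetical and are not part of d1_common; the size calculation assumes a seekable binary stream and may differ from what the library does.

```python
import io


def iter_file_chunks(file, chunk_size=1024):
    """Hypothetical helper: yield chunks from a binary file-like object."""
    while True:
        chunk = file.read(chunk_size)
        if not chunk:
            # read() returns an empty bytes object at end of stream.
            break
        yield chunk


def stream_size(file):
    """Hypothetical helper: total size of a seekable stream, preserving position."""
    pos = file.tell()
    file.seek(0, io.SEEK_END)
    size = file.tell()
    file.seek(pos)
    return size


stream = io.BytesIO(b"0123456789")
total = stream_size(stream)
chunks = list(iter_file_chunks(stream, chunk_size=4))
```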

d1_common.iter.path module

Generator that resolves a list of file and dir paths and returns file paths with optional filtering and client feedback.

d1_common.iter.path.path_generator(path_list, include_glob_list=None, exclude_glob_list=None, recursive=True, ignore_invalid=False, default_excludes=True, return_dir_paths=False)


Parameters
  • path_list – list of str

    List of file and dir paths. File paths are used directly, and dirs are searched for files.

    path_list does not accept glob patterns, as it’s more convenient to let the shell expand glob patterns into explicit file and dir paths. E.g., to select all .py files in a subdir, the command may be called with sub/dir/*.py, which the shell expands to a list of files that is then passed to this function. The paths should be Unicode or UTF-8 strings. Tilde (“~”) in the paths is expanded to the home directory.

    The shell can also expand glob patterns to dir paths or a mix of file and dir paths.

  • include_glob_list – list of str

  • exclude_glob_list – list of str

    Patterns ending with “/” are matched only against dir names. All other patterns are matched only against file names.

    If the include list contains any file patterns, files must match one or more of the patterns in order to be returned.

    If the include list contains any dir patterns, dirs must match one or more of the patterns in order for the recursive search to descend into them.

    The exclude list works in the same way, except that matching files and dirs are excluded instead of included. If both include and exclude lists are specified, files and dirs must match the include patterns and must not match the exclude patterns in order to be returned or descended into.

  • recursive – bool

    • True (default): Search subdirectories

    • False: Do not search subdirectories

  • ignore_invalid – bool

    • True: Invalid paths in path_list are ignored.

    • False (default): EnvironmentError is raised if any of the paths in path_list do not reference an existing file or dir.

  • default_excludes – bool

    • True (default): A list of glob patterns for files and dirs that should typically be ignored is added to any exclude patterns passed to the function. These include dirs such as .git, and backup files such as files with names ending in “~”.

    • False: No files or dirs are excluded by default.

  • return_dir_paths – bool

    • False (default): Only file paths are returned.

    • True: Directory paths are also returned.
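The include and exclude rules above can be illustrated with the standard fnmatch module. This is a sketch of the described semantics only, not the library's code: split_patterns and is_selected are hypothetical helpers, and the use of case-sensitive matching is an assumption.

```python
import fnmatch


def split_patterns(glob_list):
    """Split patterns into (file_patterns, dir_patterns).

    Patterns ending with "/" apply to dir names; the rest apply to file names.
    """
    glob_list = glob_list or []
    dir_patterns = [g.rstrip("/") for g in glob_list if g.endswith("/")]
    file_patterns = [g for g in glob_list if not g.endswith("/")]
    return file_patterns, dir_patterns


def is_selected(name, include_patterns, exclude_patterns):
    """Apply the include/exclude rules to a single file or dir name."""
    # An empty include list places no restriction on the name.
    if include_patterns and not any(
        fnmatch.fnmatchcase(name, p) for p in include_patterns
    ):
        return False
    # Any matching exclude pattern rejects the name.
    return not any(fnmatch.fnmatchcase(name, p) for p in exclude_patterns)


include_files, include_dirs = split_patterns(["*.py", "src/"])
exclude_files, exclude_dirs = split_patterns(["test_*", ".git/"])
```

With these lists, a file named main.py is selected, test_main.py is rejected by the exclude list, and a dir named docs is not descended into because it does not match the only dir include pattern.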

Returns

File path iterator

Notes

During iteration, the iterator can be prevented from descending into a directory by sending a “skip” flag when the iterator yields the directory path. This allows the client to decide whether a directory should be searched based on, for instance, which files are present in it. This can be used in conjunction with the include and exclude glob lists. Note that, in order to receive directory paths that can be skipped, return_dir_paths must be set to True.

The regular for...in syntax does not support sending the “skip” flag back to the iterator. Instead, use a pattern like:

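itr = path_generator(..., return_dir_paths=True)
try:
  # Advance to the first path; send() cannot be used before the first yield.
  path = next(itr)
  while True:
    skip_dir = determine_if_dir_should_be_skipped(path)
    # Send the flag back to the iterator and receive the next path.
    path = itr.send(skip_dir)
except (KeyboardInterrupt, StopIteration):
  # Stop on Ctrl-C or when the iterator is exhausted.
  pass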
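To make the send() mechanism concrete, here is a small self-contained sketch of a generator that accepts a skip flag in the same way. It is not the path_generator implementation: walk_with_skip and collect are hypothetical names, and the traversal order (sorted directory listing) is an assumption made for the example.

```python
import os
import tempfile


def walk_with_skip(root_dir):
    """Hypothetical generator: yield dir and file paths under root_dir.

    After a dir path is yielded, the value passed to send() is read as the
    skip flag. Values sent after file paths are ignored.
    """
    for name in sorted(os.listdir(root_dir)):
        path = os.path.join(root_dir, name)
        if os.path.isdir(path):
            skip = yield path
            if not skip:
                # "yield from" forwards send() calls to the nested generator.
                yield from walk_with_skip(path)
        else:
            yield path


def collect(root_dir, should_skip):
    """Drive the generator, sending a skip flag for each yielded path."""
    paths = []
    itr = walk_with_skip(root_dir)
    try:
        path = next(itr)
        while True:
            paths.append(path)
            path = itr.send(os.path.isdir(path) and should_skip(path))
    except StopIteration:
        pass
    return paths


with tempfile.TemporaryDirectory() as root:
    for sub in ("keep", "skip_me"):
        os.mkdir(os.path.join(root, sub))
        open(os.path.join(root, sub, "a.txt"), "w").close()
    found = collect(root, lambda p: os.path.basename(p) == "skip_me")
    names = [os.path.relpath(p, root) for p in found]
```

In this run the skipped directory itself is still yielded, but the file inside it is not.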

Glob patterns are matched only against file and directory names, not the full paths.

Paths passed directly in path_list are not filtered.

The same file can be returned multiple times if path_list contains duplicated file paths or dir paths, or dir paths that implicitly include the same subdirs.

include_glob_list and exclude_glob_list are handy for filtering the files found in dir searches.

Remember to escape the include and exclude glob patterns on the command line so that they’re not expanded by the shell.

d1_common.iter.string module

Generator that returns the Unicode characters of a str in chunks.

class d1_common.iter.string.StringIterator(string, chunk_size=1024)

Bases: object

Generator that returns the Unicode characters of a str in chunks.

size

Returns:

int : The total number of characters that will be returned by the iterator.
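Chunking a str by characters differs from chunking its encoded bytes, since a chunk boundary never falls inside a multi-byte character. The sketch below illustrates that distinction; iter_str_chunks is a hypothetical helper, not the StringIterator implementation.

```python
def iter_str_chunks(string, chunk_size=1024):
    """Hypothetical helper: yield a str in chunks of chunk_size characters."""
    for start in range(0, len(string), chunk_size):
        yield string[start:start + chunk_size]


text = "na\u00efve caf\u00e9"  # 10 characters, two of them non-ASCII
chunks = list(iter_str_chunks(text, chunk_size=4))
char_count = len(text)
byte_count = len(text.encode("utf-8"))
```

Here the character count and the UTF-8 byte count differ, which is why a size expressed in characters cannot be used directly as a byte length.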