io.xskipper.search

DataSkippingFileFilter

class DataSkippingFileFilter extends Logging

A custom FileFilter that filters a sequence of PartitionDirectory using a given MetadataBackend

Linear Supertypes
Logging, AnyRef, Any

Instance Constructors

  1. new DataSkippingFileFilter(tid: String, metadataStoreManager: MetadataStoreManager, sparkSession: SparkSession, params: Map[String, String] = Map.empty[String, String])

    tid

    the table identifier for which the DataSkippingFileFilter is built

    metadataStoreManager

    the MetadataStoreManager used to obtain the io.xskipper.metadatastore.MetadataHandle for the identifier

    params

    a map of parameters to be set by the DataSkippingFileFilter on the io.xskipper.metadatastore.MetadataHandle

Value Members

  1. def handleStatistics(): Unit

    Updates the IndexMeta associated with the table

  2. def init(dataFilters: Seq[Expression], partitionFilters: Seq[Expression], metadataFilterFactories: Seq[MetadataFilterFactory], clauseTranslators: Seq[ClauseTranslator]): Unit

    Filters the partition directory by removing unnecessary objects from each partition directory

    dataFilters

    query predicates for actual data columns (not partitions)

    partitionFilters

    the partition predicates from the query

    metadataFilterFactories

    a sequence of MetadataFilterFactory to generate filters according to the index on the dataset

    clauseTranslators

    a sequence of ClauseTranslators to be applied on the clauses

    returns

    a sequence of PartitionDirectory after filtering the unnecessary objects using the metadata

  3. def isRequired(fs: FileStatus): Boolean

    Returns true if the given file is required for the current query, that is, if it is among the required files or is not indexed

    fs

    the file status to check

    returns

    true if the file is required, false otherwise

  4. def isSkipabble(): Boolean

    returns

    true if the current query is relevant for skipping, i.e., it has indexed files and a metadata query can be generated
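
The decision rules described for isRequired and isSkipabble can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the xskipper implementation: the names SkippingSketch, FilterState, canSkip and required are hypothetical, files are represented by plain path strings rather than FileStatus, and no Spark or xskipper classes are used.

```scala
// Hypothetical, simplified model of the file-level skipping decision.
// It does not use or reproduce the actual xskipper code.
object SkippingSketch {

  // Assumption: files are identified by plain path strings.
  final case class FilterState(
      indexedFiles: Set[String],          // files that have metadata in the index
      requiredFiles: Option[Set[String]]  // None when no metadata query could be generated
  ) {
    // Skipping is relevant only if there are indexed files
    // and a metadata query could be generated.
    def canSkip: Boolean = indexedFiles.nonEmpty && requiredFiles.isDefined

    // A file is required if the metadata lists it as required
    // or if it was never indexed (nothing is known about it, so it must be read).
    // When no metadata query exists, every file is treated as required.
    def required(path: String): Boolean =
      requiredFiles.forall(req => req.contains(path) || !indexedFiles.contains(path))
  }

  def main(args: Array[String]): Unit = {
    val state = FilterState(
      indexedFiles = Set("a.parquet", "b.parquet"),
      requiredFiles = Some(Set("a.parquet"))
    )
    val listing = Seq("a.parquet", "b.parquet", "c.parquet")

    // b.parquet is indexed but not required, so it is dropped;
    // c.parquet is not indexed, so it is kept.
    val kept = if (state.canSkip) listing.filter(state.required) else listing
    println(kept.mkString(", "))
  }
}
```

In this model an unindexed file is always kept, which mirrors the "present in the required files or not indexed" rule: a file can only be skipped when metadata exists that rules it out.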