Package

org.apache.spark.sql.execution.datasources

xskipper

package xskipper

Type Members

class CatalogDataSkippingFileIndex extends CatalogFileIndex with Logging

Used to preserve the capabilities of CatalogFileIndex
class DataSkippingFileIndexRule extends Rule[LogicalPlan]

A Catalyst rule that replaces a logical relation plan's InMemoryFileIndex with an extended DataSkippingFileIndex, which allows fine-grained file filtering.
The following applies when the plan is an instance of LogicalRelation with a HadoopFsRelation: the rule adds a dummy option to the options field of the new HadoopFsRelation. If the input plan's catalog is not an instance of IndexedCatalog, the dummy option is absent from the input plan, so its addition ensures that org.apache.spark.sql.catalyst.trees.TreeNode.fastEquals returns false when comparing the input plan to the returned plan.
If the input plan's catalog is already an instance of IndexedCatalog, the plan does not change and org.apache.spark.sql.catalyst.trees.TreeNode.fastEquals returns true, which prevents unnecessary Catalyst churn.
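The dummy-option mechanism described above can be illustrated with a small, Spark-free sketch. Everything in it is hypothetical and for illustration only: Relation, MarkerKey, addMarker and sameAs are stand-in names, not xskipper or Spark APIs, and the real option name is not stated in this documentation. The sketch assumes a comparison that succeeds on reference equality or structural equality, which is how fastEquals is understood to behave.

```scala
// Toy model of "add a dummy option so the rewritten plan compares unequal
// to the input once, and equal on every later pass".
// All names here are hypothetical; none come from xskipper or Spark.
object DummyOptionSketch {
  // Stand-in for a relation: only the options map matters for the comparison.
  final case class Relation(path: String, options: Map[String, String])

  // Hypothetical marker key used as the dummy option.
  val MarkerKey = "sketch.dataskipping.applied"

  // Returns the same instance when the marker is already present,
  // otherwise a copy with the marker added to the options.
  def addMarker(r: Relation): Relation =
    if (r.options.contains(MarkerKey)) r
    else r.copy(options = r.options + (MarkerKey -> "true"))

  // Reference equality first, then structural equality.
  def sameAs(a: Relation, b: Relation): Boolean = (a eq b) || a == b
}
```

In this toy model, the first call to addMarker yields a relation that sameAs reports as different from its input, while a second call returns its input unchanged, so a caller that re-applies the rewrite until nothing changes would stop after one extra pass.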
class InMemoryDataSkippingIndex extends InMemoryFileIndex
class PrunedInMemoryDataSkippingIndex extends InMemoryDataSkippingIndex

Used to support PrunedInMemoryFileIndex, which may result from running the PruneFileSourcePartitions rule. Because PrunedInMemoryFileIndex specifies its own partitionSpec, an InMemoryFileIndex cannot be used.

Value Members

object DataSkippingUtils extends Logging
