A Catalyst rule that replaces a logical relation plan's InMemoryFileIndex
with an extended DataSkippingFileIndex, which allows fine-grained file filtering.
The following applies when the plan is an instance of LogicalRelation
with a HadoopFsRelation:
the rule adds a dummy option to the options field of the new HadoopFsRelation.
If the input plan's catalog is not an instance of IndexedCatalog, the
dummy option is absent from the input plan, so its addition ensures that
org.apache.spark.sql.catalyst.trees.TreeNode.fastEquals
returns false when comparing the input plan to the returned plan.
If the input plan's catalog is already an instance of IndexedCatalog,
the plan does not change, so
org.apache.spark.sql.catalyst.trees.TreeNode.fastEquals returns true,
preventing unnecessary Catalyst churn.
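
The two cases above can be illustrated with a minimal, Spark-free sketch. Everything in it is a hypothetical stand-in rather than the rule's actual code: `Relation` plays the role of a LogicalRelation with a HadoopFsRelation, `PlainCatalog` and `ToyIndexedCatalog` play the roles of the non-indexed and indexed catalogs, and `MarkerKey` is an invented option name (the real dummy option is not named here). The local `fastEquals` assumes the usual TreeNode behaviour of reference equality followed by structural equality.

```scala
object MarkerOptionSketch {
  // Hypothetical stand-ins for the catalog types; not the real Spark classes.
  sealed trait Catalog
  final case class PlainCatalog(paths: Seq[String]) extends Catalog
  final case class ToyIndexedCatalog(paths: Seq[String]) extends Catalog

  // Hypothetical stand-in for a LogicalRelation wrapping a HadoopFsRelation.
  final case class Relation(catalog: Catalog, options: Map[String, String])

  // Invented marker key (assumption); the real dummy option name is not given.
  val MarkerKey = "dataSkippingMarker"

  // Assumed to mirror TreeNode.fastEquals: same reference, or structurally equal.
  def fastEquals(a: Relation, b: Relation): Boolean = a.eq(b) || a == b

  // Toy version of the rule: swap the catalog and add the marker option once.
  def applyRule(plan: Relation): Relation = plan.catalog match {
    case PlainCatalog(paths) =>
      // First application: new catalog plus marker option, so the plans differ.
      Relation(ToyIndexedCatalog(paths), plan.options + (MarkerKey -> "true"))
    case _: ToyIndexedCatalog =>
      // Already rewritten: return the very same plan, so fastEquals holds.
      plan
  }
}
```

Applying `applyRule` to a relation with a `PlainCatalog` yields a plan for which `fastEquals` against the input is false; applying it again to that result returns the same instance, so `fastEquals` is true and no further rewriting is triggered.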