class RuleExtension extends (SparkSessionExtensions) ⇒ Unit with Logging
Injects the DataSkippingFileIndexRule into Catalyst as part of
the operatorOptimization rules and enables it
via the Spark session extensions' injectOptimizerRule.
To use with a regular Spark session (either PySpark or Scala):
val spark = SparkSession
.builder()
.appName("Xskipper")
.config("spark.master", "local[*]") // comment out to run in production
.config("spark.sql.extensions", "io.xskipper.RuleExtension")
//.enableHiveSupport()
.getOrCreate()
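Since the text mentions PySpark but shows only the Scala setup, here is a sketch of the equivalent PySpark session configuration. It assumes a local Spark installation with the Xskipper jar on the driver/executor classpath; the option names mirror the Scala snippet above:

```python
# PySpark equivalent of the Scala session setup above (a sketch).
# Assumes Spark is installed and the Xskipper jar is on the classpath.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("Xskipper")
    .config("spark.master", "local[*]")  # comment out to run in production
    .config("spark.sql.extensions", "io.xskipper.RuleExtension")
    # .enableHiveSupport()
    .getOrCreate()
)
```

As in the Scala version, setting `spark.sql.extensions` is what registers the optimizer rule; no further code changes are needed once the session is created.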
To use with the Thrift server:
1) get an Xskipper jar file
2) start the Thrift server with the extension:
start-thriftserver.sh --jars <XskipperJar>
--conf spark.sql.extensions=io.xskipper.RuleExtension
3) you can now connect via JDBC (e.g., beeline, SQuirreL, or any other JDBC client)