Xskipper

Companion class Xskipper

object Xskipper

Linear Supertypes

AnyRef, Any


Value Members

final def !=(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def ==(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def asInstanceOf[T0]: T0

Definition Classes
Any
def clearStats(): Unit
Clears the stats for all active MetadataHandle instances in the active MetadataStoreManager. Should be called before each query to make sure the aggregated stats are cleared.
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@native() @throws( ... )
def disable(sparkSession: SparkSession): Unit
Python API Wrapper for disabling Xskipper in the given SparkSession
sparkSession
SparkSession object
def enable(sparkSession: SparkSession): Unit
Python API Wrapper for enabling Xskipper in the given SparkSession
sparkSession
SparkSession object
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
def get(key: String): String
Retrieves the value associated with the given key in the configuration
key
the key to lookup
returns
the value associated with the key, or null if the key doesn't exist (null is returned so that this function can be used from the Python module)
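Because get returns null for a missing key, Scala callers may prefer to wrap the result in an Option before using it. The sketch below illustrates that pattern with a hypothetical stand-in, lookup, backed by a java.util.HashMap (which also returns null for absent keys). It is not Xskipper's implementation, and the key names are made up for illustration.

```scala
import java.util.{HashMap => JHashMap}

object NullSafeLookup {
  // Hypothetical stand-in for a configuration store. Like the documented
  // get, looking up a missing key yields null instead of throwing.
  private val conf = new JHashMap[String, String]()
  conf.put("example.key", "example-value")

  def lookup(key: String): String = conf.get(key)

  // Option(x) is None when x is null and Some(x) otherwise.
  def lookupOpt(key: String): Option[String] = Option(lookup(key))
}
```

With the documented API the same wrapping would read Option(Xskipper.get(key)), which lets callers use getOrElse or pattern matching instead of explicit null checks.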
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
Annotations
@native()
def getConf(): Map[String, String]
Returns a map of all configurations currently set
def getLatestQueryAggregatedStats(sparkSession: SparkSession): DataFrame
Gets the aggregated skipping stats of the latest query for all active MetadataHandle instances in the current default MetadataStoreManager. To get reliable results, either clearStats or clearActiveMetadataHandles must be called before running the query.
This is needed because the skipping stats are aggregated by going over all active MetadataHandles of the MetadataStoreManager and aggregating their stats. When running multiple queries, the first query may use dataset a while the second query does not. When aggregatedStats is then called for the second query, the MetadataHandle for dataset a is still present as an active MetadataHandle, so its stats must be cleared beforehand.
The structure of the returned DataFrame depends on the outcome:
1. If the API was called on a query which didn't involve any index, or was called without running a query, the structure is: status, reason with status=FAILED.
2. If the query cannot be skipped, the structure is: status, isSkippable, skipped_Bytes, skipped_Objs, total_Bytes, total_Objs with status=SUCCESS, isSkippable=false and all other values set to -1. This happens when one of the following holds:
   a. No dataset in the query has indexed files.
   b. No query to the metadata store can be generated, either because a predicate cannot be used for skipping (or a metadata filter is missing) or because translating the abstract query failed.
3. Otherwise, the structure is the same as in case 2, with isSkippable=true and the relevant stats.
sparkSession
a spark session to construct the dataframe with the latest query stats
returns
a DataFrame object containing information about latest query stats
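The need to clear stats before each query can be illustrated with a small self-contained model. The Handle and Manager classes below are hypothetical stand-ins, not Xskipper's MetadataHandle or MetadataStoreManager, and combining stats by summation is an assumption of this sketch. The model only reproduces the behaviour described above: handles stay active across queries, and aggregation goes over all of them.

```scala
import scala.collection.mutable

object StaleStatsModel {
  // Hypothetical per-dataset handle that remembers the stats of the
  // last query that touched it.
  final class Handle(val dataset: String) {
    var skippedObjs: Long = 0L
    var totalObjs: Long = 0L
    def record(skipped: Long, total: Long): Unit = {
      skippedObjs = skipped
      totalObjs = total
    }
    def clear(): Unit = {
      skippedObjs = 0L
      totalObjs = 0L
    }
  }

  // Hypothetical manager: a handle stays active once it has been created.
  final class Manager {
    private val active = mutable.LinkedHashMap.empty[String, Handle]

    def handleFor(dataset: String): Handle =
      active.getOrElseUpdate(dataset, new Handle(dataset))

    def clearStats(): Unit = active.values.foreach(_.clear())

    // Aggregates over ALL active handles, not only those used by the
    // latest query. Returns (skipped objects, total objects).
    def aggregated: (Long, Long) =
      active.values.foldLeft((0L, 0L)) { case ((s, t), h) =>
        (s + h.skippedObjs, t + h.totalObjs)
      }
  }
}
```

In this model, recording (8, 10) for dataset a in a first query and (1, 5) for dataset b in a second query, without clearing in between, yields an aggregate of (9, 15), which wrongly includes dataset a. Calling clearStats() before the second query yields (1, 5).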
def hashCode(): Int

Definition Classes
AnyRef → Any
Annotations
@native()
def isEnabled(sparkSession: SparkSession): Boolean
Python API Wrapper for checking if Xskipper is enabled
sparkSession
SparkSession object
returns
true if Xskipper is enabled for the current SparkSession
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
def listIndexes(sparkSession: SparkSession): DataFrame
Returns information about the indexed datasets
returns
a DataFrame object containing information about the indexed datasets under the configured base path
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
Annotations
@native()
final def notifyAll(): Unit

Definition Classes
AnyRef
Annotations
@native()
def reset(sparkSession: SparkSession): Unit
Resets all Xskipper settings by:
1. disabling filtering
2. clearing all MetadataHandle instances in the default MetadataStoreManager
3. resetting the JVM wide configuration
sparkSession
the spark session to remove the rule from
def set(key: String, value: String): Unit
Sets a specific key in the JVM wide configuration
key
the key to set
value
the value associated with the key
def setConf(params: Map[String, String]): Unit
def setConf(params: Map[String, String]): Unit
Updates JVM wide Xskipper parameters. Only the given parameters are updated; all other parameters keep their current values.
params
a map of parameters to be set
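The "only given parameters" behaviour corresponds to a map merge in which the supplied entries win and all other entries are left alone. The sketch below models that on plain immutable maps. ConfMergeModel and its updated method are hypothetical names for illustration and say nothing about how Xskipper stores its configuration internally.

```scala
object ConfMergeModel {
  // Hypothetical model of a partial update: entries in params replace
  // entries with the same key in current; all other entries are kept.
  def updated(
      current: Map[String, String],
      params: Map[String, String]): Map[String, String] =
    current ++ params
}
```

For example, merging Map("k2" -> "c") into Map("k1" -> "a", "k2" -> "b") leaves k1 untouched and changes only k2.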
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
def toString(): String

Definition Classes
AnyRef → Any
def unset(key: String): Unit
Removes a key from the configuration
key
the key to remove
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@native() @throws( ... )


