Xskipper

Companion object Xskipper

class Xskipper extends AnyRef

Main class for programmatically interacting with Xskipper

Linear Supertypes

AnyRef, Any

Instance Constructors

new Xskipper(sparkSession: SparkSession, uri: String, metadataStoreManagerClassName: String)
Additional constructor for pySpark API
sparkSession
sparkSession instance for processing
uri
the URI of the dataset / the identifier of the table on which the index is defined
metadataStoreManagerClassName
fully qualified name of MetadataStoreManager to be used

Exceptions thrown
XskipperException if the metadataStoreManagerClassName is invalid
new Xskipper(sparkSession: SparkSession, uri: String, metadataStoreManager: MetadataStoreManager = ParquetMetadataStoreManager)
sparkSession
sparkSession instance for processing
uri
the URI of the dataset / the identifier of the table on which the index is defined
metadataStoreManager
The MetadataStoreManager to use
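
As an illustration of the second constructor, the sketch below creates an instance and relies on the default metadataStoreManager. The import path io.xskipper.Xskipper, the local Spark session settings, and the dataset location are assumptions made for the example; they are not stated on this page.

```scala
import org.apache.spark.sql.SparkSession
import io.xskipper.Xskipper // assumed package; adjust to your build

object ConstructXskipperSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only.
    val spark = SparkSession.builder()
      .appName("xskipper-constructor-sketch")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical dataset location; replace with a real URI or table identifier.
    val datasetUri = "/tmp/xskipper-demo/dataset"

    // Third argument omitted, so the default metadataStoreManager is used.
    val xskipper = new Xskipper(spark, datasetUri)
    println(s"Xskipper instance created for ${xskipper.uri}")

    spark.stop()
  }
}
```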

Value Members

final def !=(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def ##(): Int

Definition Classes
AnyRef → Any
final def ==(arg0: Any): Boolean

Definition Classes
AnyRef → Any
final def asInstanceOf[T0]: T0

Definition Classes
Any
def clone(): AnyRef

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@native() @throws( ... )
def describeIndex(): DataFrame
Describes the indexes on the URI (for table URI)
returns
DataFrame object containing information about the index

Exceptions thrown
XskipperException if the URI is not indexed
def describeIndex(reader: DataFrameReader): DataFrame
Describes the indexes on the URI (for non table URI)
reader
a DataFrameReader instance to enable reading the URI as a DataFrame
returns
DataFrame object containing information about the index

Exceptions thrown
XskipperException if the URI is not indexed
def dropIndex(): Unit
Deletes the index

Exceptions thrown
XskipperException if index cannot be removed
final def eq(arg0: AnyRef): Boolean

Definition Classes
AnyRef
def equals(arg0: Any): Boolean

Definition Classes
AnyRef → Any
def finalize(): Unit

Attributes
protected[java.lang]
Definition Classes
AnyRef
Annotations
@throws( classOf[java.lang.Throwable] )
def generateDescribeIndex(df: DataFrame): DataFrame
Returns meta-index information such as the indexing scheme and skipping stats

Attributes
protected
final def getClass(): Class[_]

Definition Classes
AnyRef → Any
Annotations
@native()
def getLatestQueryStats(): DataFrame
Return latest query skipping statistics for this Xskipper instance
If the API is called on a URI without an index, or before any query has been run, the returned DataFrame has the structure: status, reason, with status=FAILED.
If the query cannot be skipped for one of the following reasons:
1. The dataset has no indexed files
2. No query to the metadata store can be generated, either because a predicate cannot be used in skipping (or possibly because a metadata filter is missing) or because the abstract query failed to translate
then the returned DataFrame has the structure: status, isSkippable, skipped_Bytes, skipped_Objs, total_Bytes, total_Objs, with status=SUCCESS, isSkippable=false, and all other values set to -1.
Otherwise the DataFrame has the same structure as above, with isSkippable=true and the relevant stats.
returns
DataFrame object containing information about latest query stats
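
The three result shapes described for getLatestQueryStats can be told apart with a small helper. The sketch below is a hypothetical, Spark-free illustration: it takes the single stats row as a Map of column name to value (with Spark, such a map could be obtained via Row.getValuesMap) and compares values as strings, since the column types are not documented on this page.

```scala
object QueryStatsSketch {
  // Hypothetical helper, not part of Xskipper.
  // `fields` holds the single stats row as column name -> value.
  def describe(fields: Map[String, Any]): String = {
    // Read a column as text; missing or null columns become "".
    def text(name: String): String =
      fields.get(name).flatMap(v => Option(v)).map(_.toString).getOrElse("")

    val status = text("status")
    val skippable = text("isSkippable").equalsIgnoreCase("true")

    if (status == "FAILED") {
      // No index on the URI, or no query has been run yet.
      val reason = text("reason")
      s"no stats available: $reason"
    } else if (status == "SUCCESS" && skippable) {
      val skippedObjs = text("skipped_Objs")
      val totalObjs = text("total_Objs")
      val skippedBytes = text("skipped_Bytes")
      val totalBytes = text("total_Bytes")
      s"skipped $skippedObjs of $totalObjs objects ($skippedBytes of $totalBytes bytes)"
    } else if (status == "SUCCESS") {
      // isSkippable=false: the other columns are -1 and carry no information.
      "query ran but skipping was not possible"
    } else {
      s"unexpected status: $status"
    }
  }
}
```

Comparing values as strings is a deliberate choice here so that the sketch does not depend on whether isSkippable is stored as a Boolean or the counters as Long.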
def hashCode(): Int

Definition Classes
AnyRef → Any
Annotations
@native()
def indexBuilder(): IndexBuilder
Returns an IndexBuilder, a helper for configuring and building an index
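
A minimal sketch of how indexBuilder() might be used for a non table URI follows. The IndexBuilder methods addMinMaxIndex and build(reader), the import path, and the column name "temp" are assumptions not documented on this page; check the IndexBuilder reference before relying on them.

```scala
import org.apache.spark.sql.{DataFrame, DataFrameReader}
import io.xskipper.Xskipper // assumed package; adjust to your build

object BuildIndexSketch {
  // Builds an index on one column and returns the DataFrame produced by the build.
  def buildIndex(xskipper: Xskipper, reader: DataFrameReader): DataFrame =
    xskipper.indexBuilder()
      .addMinMaxIndex("temp") // assumed IndexBuilder method; "temp" is a hypothetical column
      .build(reader)          // assumed to accept the reader used to load the dataset
}
```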
def isIndexed(): Boolean
Checks if the URI is indexed
returns
true if the URI is indexed
final def isInstanceOf[T0]: Boolean

Definition Classes
Any
final def ne(arg0: AnyRef): Boolean

Definition Classes
AnyRef
final def notify(): Unit

Definition Classes
AnyRef
Annotations
@native()
final def notifyAll(): Unit

Definition Classes
AnyRef
Annotations
@native()
def refreshIndex(): DataFrame
Refresh index operation for table URI
returns
DataFrame object containing statistics about the refresh operation

Exceptions thrown
XskipperException if index cannot be refreshed
def refreshIndex(reader: DataFrameReader): DataFrame
Refresh index operation for non table URI
reader
a DataFrameReader instance to enable reading the URI as a DataFrame
Note: The reader is assumed to have all of the parameters configured. reader.load(Seq(<path>)) will be used by the indexing code to read each object separately
returns
DataFrame object containing statistics about the refresh operation

Exceptions thrown
XskipperException if index cannot be refreshed
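
The reader-based overloads above can be combined into a simple maintenance routine for a non table URI. This is a sketch under assumptions: the import path io.xskipper.Xskipper is not stated on this page, and the caller is expected to pass a DataFrameReader that already has the format and options used when the index was built.

```scala
import org.apache.spark.sql.DataFrameReader
import io.xskipper.Xskipper // assumed package; adjust to your build

object IndexMaintenanceSketch {
  // Describes and refreshes the index if one exists; otherwise reports and returns.
  def describeAndRefresh(xskipper: Xskipper, reader: DataFrameReader): Unit = {
    if (xskipper.isIndexed()) {
      // Both calls throw XskipperException on failure, per the entries above.
      xskipper.describeIndex(reader).show(false)
      xskipper.refreshIndex(reader).show(false)
    } else {
      println(s"No index found for ${xskipper.uri}; nothing to refresh")
    }
  }
}
```

Checking isIndexed() first avoids the XskipperException that describeIndex raises when the URI is not indexed.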
def setParams(params: Map[String, String]): Unit
Update instance specific MetadataHandle parameters
final def synchronized[T0](arg0: ⇒ T0): T0

Definition Classes
AnyRef
val tableIdentifier: String
def toString(): String

Definition Classes
AnyRef → Any
val uri: String
final def wait(): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long, arg1: Int): Unit

Definition Classes
AnyRef
Annotations
@throws( ... )
final def wait(arg0: Long): Unit

Definition Classes
AnyRef
Annotations
@native() @throws( ... )


