class Xskipper extends AnyRef

Main class for programmatically interacting with Xskipper

Linear Supertypes
AnyRef, Any

Instance Constructors

  1. new Xskipper(sparkSession: SparkSession, uri: String, metadataStoreManagerClassName: String)

    Additional constructor for the PySpark API

    sparkSession

    sparkSession instance for processing

    uri

    the URI of the dataset / the identifier of the table on which the index is defined

    metadataStoreManagerClassName

    fully qualified name of MetadataStoreManager to be used

    Exceptions thrown

    XskipperException if the metadataStoreManagerClassName is invalid

  2. new Xskipper(sparkSession: SparkSession, uri: String, metadataStoreManager: MetadataStoreManager = ParquetMetadataStoreManager)

    sparkSession

    sparkSession instance for processing

    uri

    the URI of the dataset / the identifier of the table on which the index is defined

    metadataStoreManager

    The MetadataStoreManager to use
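
As a minimal sketch of the second constructor, the snippet below creates an instance that relies on the default `metadataStoreManager`. The package name `io.xskipper`, the local Spark master, and the dataset path are assumptions made for illustration; the sketch also assumes any required metadata store configuration has already been done elsewhere.

```scala
import org.apache.spark.sql.SparkSession
import io.xskipper.Xskipper // assumed package of the Xskipper class

object XskipperSetupExample {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only
    val spark = SparkSession.builder()
      .appName("xskipper-setup-example")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical dataset location; replace with a real URI or table identifier
    val datasetUri = "/tmp/xskipper-demo/dataset"

    // Second constructor: the metadataStoreManager argument is left at its default
    val xskipper = new Xskipper(spark, datasetUri)

    // `uri` is one of the public value members of the class
    println(s"Xskipper instance created for ${xskipper.uri}")

    spark.stop()
  }
}
```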

Value Members

  1. def describeIndex(): DataFrame

    Describes the indexes on the URI (for table URI)

    returns

    DataFrame object containing information about the index

    Exceptions thrown

    XskipperException if the URI is not indexed

  2. def describeIndex(reader: DataFrameReader): DataFrame

    Describes the indexes on the URI (for non table URI)

    reader

    a DataFrameReader instance to enable reading the URI as a DataFrame

    returns

    DataFrame object containing information about the index

    Exceptions thrown

    XskipperException if the URI is not indexed
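
Because `describeIndex` throws when the URI is not indexed, one possible pattern is to guard the call with `isIndexed()`. The sketch below does this for a non-table URI; the package name `io.xskipper` and the helper name `describeIfIndexed` are assumptions, not part of the documented API.

```scala
import org.apache.spark.sql.DataFrameReader
import io.xskipper.Xskipper // assumed package of the Xskipper class

object DescribeIndexExample {
  /** Hypothetical helper: shows the index description only if an index exists. */
  def describeIfIndexed(xskipper: Xskipper, reader: DataFrameReader): Unit = {
    if (xskipper.isIndexed()) {
      // Non-table URI variant: the reader is used to read the URI as a DataFrame
      xskipper.describeIndex(reader).show(false)
    } else {
      println(s"No index found for ${xskipper.uri}")
    }
  }
}
```

A caller would pass a reader configured for the dataset's format, for example `spark.read.format("parquet")`.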

  3. def dropIndex(): Unit

    Deletes the index

    Exceptions thrown

    XskipperException if index cannot be removed
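
A small sketch of dropping an index only when one exists, which avoids calling `dropIndex()` on an unindexed URI. The package name `io.xskipper` and the helper name `dropIfIndexed` are assumptions; `dropIndex()` may still throw `XskipperException` if the index cannot be removed.

```scala
import io.xskipper.Xskipper // assumed package of the Xskipper class

object DropIndexExample {
  /** Hypothetical helper: returns true if an index existed and dropIndex() was called. */
  def dropIfIndexed(xskipper: Xskipper): Boolean = {
    if (xskipper.isIndexed()) {
      xskipper.dropIndex()
      true
    } else {
      false
    }
  }
}
```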

  4. def getLatestQueryStats(): DataFrame

    Returns the latest query skipping statistics for this Xskipper instance

    The structure of the returned DataFrame depends on the outcome:

    If the API was called on a URI without an index, or without running a query first, the columns are status, reason, with status=FAILED.

    If the query cannot be skipped, the columns are status, isSkippable, skipped_Bytes, skipped_Objs, total_Bytes, total_Objs, with status=SUCCESS, isSkippable=false, and all other values set to -1. This happens in either of the following cases:
      1. The dataset has no indexed files.
      2. No query to the metadata store can be generated, either because a predicate cannot be used for skipping (possibly due to a missing metadata filter) or because translating the abstract query failed.

    Otherwise, the columns are the same as in the previous case, with isSkippable=true and the relevant statistics.

    returns

    DataFrame object containing information about latest query stats
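
The three possible shapes of the statistics row can be sketched as a small classification function. To keep the example independent of Spark, it works on a `Map[String, String]` of column name to stringified value; building that map from the first row of the returned DataFrame is left to the caller. The type and function names are hypothetical, and the column value types in the real DataFrame are not stated in this page, so the string representation is an assumption.

```scala
object QueryStatsOutcome {
  sealed trait Outcome
  // status=FAILED: no index on the URI, or no query was run
  final case class Failed(reason: String) extends Outcome
  // status=SUCCESS, isSkippable=false: the numeric columns are all -1
  case object NotSkippable extends Outcome
  // status=SUCCESS, isSkippable=true: the numeric columns carry real statistics
  final case class Skippable(skippedBytes: Long, skippedObjs: Long,
                             totalBytes: Long, totalObjs: Long) extends Outcome

  /** Classifies one stats row given as column name -> stringified value (assumed encoding). */
  def classify(row: Map[String, String]): Outcome =
    row.get("status") match {
      case Some("SUCCESS") if row.get("isSkippable").exists(_.equalsIgnoreCase("true")) =>
        Skippable(
          row("skipped_Bytes").toLong,
          row("skipped_Objs").toLong,
          row("total_Bytes").toLong,
          row("total_Objs").toLong)
      case Some("SUCCESS") =>
        NotSkippable
      case _ =>
        Failed(row.getOrElse("reason", "unknown"))
    }
}
```

With Spark available, the input map could be built from the single stats row, for instance with `Row.getValuesMap` followed by converting each value to a string.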

  5. def indexBuilder(): IndexBuilder

    Returns an IndexBuilder, a helper class for setting and building an index

  6. def isIndexed(): Boolean

    Checks if the URI is indexed

    returns

    true if the URI is indexed

  7. def refreshIndex(): DataFrame

    Refresh index operation for table URI

    returns

    DataFrame object containing statistics about the refresh operation

    Exceptions thrown

    XskipperException if index cannot be refreshed

  8. def refreshIndex(reader: DataFrameReader): DataFrame

    Refresh index operation for non table URI

    reader

    a DataFrameReader instance to enable reading the URI as a DataFrame. Note: the reader is assumed to have all of its parameters configured; the indexing code uses reader.load(Seq(<path>)) to read each object separately

    returns

    DataFrame object containing statistics about the refresh operation

    Exceptions thrown

    XskipperException if index cannot be refreshed
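
Since the reader passed to `refreshIndex` is expected to be fully configured, a refresh for a non-table URI might look like the sketch below. The CSV format, the `header` option, the package name `io.xskipper`, and the helper name `refreshCsvIndex` are all assumptions chosen for illustration.

```scala
import org.apache.spark.sql.{DataFrameReader, SparkSession}
import io.xskipper.Xskipper // assumed package of the Xskipper class

object RefreshIndexExample {
  /** Hypothetical helper: refreshes the index of a CSV dataset and shows the statistics. */
  def refreshCsvIndex(spark: SparkSession, xskipper: Xskipper): Unit = {
    // Configure format and options up front; the indexing code reuses this
    // reader to load each object separately
    val reader: DataFrameReader = spark.read
      .format("csv")
      .option("header", "true")

    // Returns a DataFrame with statistics about the refresh operation
    xskipper.refreshIndex(reader).show(false)
  }
}
```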

  9. def setParams(params: Map[String, String]): Unit
  10. def setParams(params: Map[String, String]): Unit

    Update instance specific MetadataHandle parameters

  11. val tableIdentifier: String
  12. val uri: String