Additional constructor for the PySpark API
sparkSession instance for processing
the URI of the dataset / the identifier of the table on which the index is defined
fully qualified name of MetadataStoreManager to be used
XskipperException
if the metadataStoreManagerClassName is invalid
sparkSession instance for processing
the URI of the dataset / the identifier of the table on which the index is defined
The MetadataStoreManager to use
Describes the indexes on the URI (for table URI)
DataFrame object containing information about the index
XskipperException
if the URI is not indexed
Describes the indexes on the URI (for non table URI)
a DataFrameReader instance to enable reading the URI as a DataFrame
DataFrame object containing information about the index
XskipperException
if the URI is not indexed
Deletes the index
XskipperException
if the index cannot be removed
Returns index metadata, such as the indexing scheme and skipping statistics
Returns the latest query skipping statistics for this Xskipper instance
If the API was called on a URI without an index, or was called before any query was run, the returned DataFrame has the structure: status, reason, with status=FAILED.
If the query cannot be skipped for one of the following reasons:
1. The dataset has no indexed files
2. No query to the metadata store can be generated, either because a predicate cannot be used for skipping (or possibly because of a missing metadata filter) or because the abstract query could not be translated
then the returned DataFrame has the structure: status, isSkippable, skipped_Bytes, skipped_Objs, total_Bytes, total_Objs, with status=SUCCESS, isSkippable=false, and all other values set to -1.
Otherwise, the DataFrame has the same structure as in the previous case, with isSkippable=true and the relevant statistics.
DataFrame object containing information about latest query stats
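The three result shapes described for the latest query statistics can be told apart with a small helper. The following is a minimal sketch, not part of Xskipper: `summarize_skipping_stats` is a hypothetical name, and it assumes a single result row has already been converted to a plain mapping (for example with PySpark's `Row.asDict()`) in which `isSkippable` is a Python boolean.

```python
from typing import Any, Mapping


def summarize_skipping_stats(row: Mapping[str, Any]) -> str:
    """Classify one row of the latest-query-stats result (hypothetical helper)."""
    # Case 1: the URI has no index, or no query has been run yet.
    if row.get("status") == "FAILED":
        return f"no stats available: {row.get('reason', 'unknown reason')}"
    # Case 2: the query ran but could not be skipped; the counters are all -1.
    if not row.get("isSkippable", False):
        return "query ran, but skipping was not applicable"
    # Case 3: skipping applied; report how many objects were skipped.
    total_objs = row["total_Objs"]
    skipped_objs = row["skipped_Objs"]
    pct = 100.0 * skipped_objs / total_objs if total_objs > 0 else 0.0
    return f"skipped {skipped_objs} of {total_objs} objects ({pct:.1f}%)"
```

Checking `status` first matters because the FAILED shape has no `isSkippable` column at all, so the counters should only be read once both earlier cases are ruled out.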
Helper class for setting and building an index
Checks if the URI is indexed
true if the URI is indexed
Refresh index operation for table URI
DataFrame object containing statistics about the refresh operation
XskipperException
if the index cannot be refreshed
Refresh index operation for non table URI
a DataFrameReader instance to enable reading the URI as a DataFrame
Note: The reader is assumed to have all of the parameters configured. reader.load(Seq(<path>)) will be used by the indexing code to read each object separately.
DataFrame object containing statistics about the refresh operation
XskipperException
if the index cannot be refreshed
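The note on the reader parameter implies that the same preconfigured reader is reused for one load call per object, which is why every option has to be set before the reader is handed over. The sketch below illustrates that contract in pure Python. `StubReader` and `read_objects_separately` are hypothetical stand-ins, not Spark or Xskipper classes, and the paths are made up for illustration.

```python
from typing import Dict, List, Optional


class StubReader:
    """Hypothetical stand-in for a DataFrameReader; records each load call."""

    def __init__(self) -> None:
        self.fmt: Optional[str] = None
        self.options: Dict[str, str] = {}
        self.loads: List[List[str]] = []

    def format(self, fmt: str) -> "StubReader":
        self.fmt = fmt
        return self

    def option(self, key: str, value: str) -> "StubReader":
        self.options[key] = value
        return self

    def load(self, paths: List[str]) -> dict:
        # Record the call and echo back the configuration that was applied.
        self.loads.append(list(paths))
        return {"paths": list(paths), "format": self.fmt, "options": dict(self.options)}


def read_objects_separately(reader: StubReader, object_paths: List[str]) -> List[dict]:
    """Issue one load call per object, each with a single-element path list."""
    return [reader.load([path]) for path in object_paths]


# Configure the reader fully up front, then reuse it for every object.
reader = StubReader().format("csv").option("header", "true")
results = read_objects_separately(reader, ["/data/a.csv", "/data/b.csv"])
```

Because the per-object calls add no configuration of their own, any format, schema, or option missing from the reader at hand-over is missing for every object that gets read.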
Updates instance-specific MetadataHandle parameters
the URI of the dataset / the identifier of the table on which the index is defined
Main class for programmatically interacting with Xskipper