class IndexBuilder extends Logging

Helper class for building indexes

Linear Supertypes
Logging, AnyRef, Any

Instance Constructors

  1. new IndexBuilder(spark: SparkSession, uri: String, xskipper: Xskipper)

    spark

    org.apache.spark.sql.SparkSession object

    uri

    the URI of the dataset / the identifier of the table on which the index is defined

    xskipper

    the Xskipper instance associated with this IndexBuilder

Value Members

  1. final def !=(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  2. final def ##(): Int
    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  4. def addBloomFilterIndex(col: String, fpp: Double, keyMetadata: String): IndexBuilder

    Adds a BloomFilter index for the given column

    col

    the column to add the index on

    fpp

    the false positive rate to use

    keyMetadata

    the key metadata to be used

  5. def addBloomFilterIndex(col: String, fpp: Double = ..., ndv: Long = ...): IndexBuilder

    Adds a BloomFilter index for the given column

    col

    the column to add the index on

    fpp

    the false positive rate to use

    ndv

    the expected number of distinct values in the bloom filter
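    How fpp and ndv trade off against filter size can be illustrated with the textbook Bloom filter formulas. The sketch below is not taken from Xskipper; the object and method names are hypothetical, and whether the library sizes its filters this way is an assumption.

    ```scala
    // Textbook Bloom filter sizing; NOT Xskipper's implementation (assumption).
    object BloomSizing {
      /** Optimal number of bits: m = ceil(-n * ln(p) / (ln 2)^2). */
      def optimalNumOfBits(ndv: Long, fpp: Double): Long = {
        require(ndv > 0, "ndv must be positive")
        require(fpp > 0.0 && fpp < 1.0, "fpp must be in (0, 1)")
        math.ceil(-ndv * math.log(fpp) / (math.log(2) * math.log(2))).toLong
      }

      /** Optimal number of hash functions: k = max(1, round(m / n * ln 2)). */
      def optimalNumOfHashFunctions(ndv: Long, numBits: Long): Int =
        math.max(1, math.round(numBits.toDouble / ndv * math.log(2)).toInt)
    }
    ```

    Under these formulas, lowering fpp or raising ndv both increase the number of bits, so a larger expected number of distinct values implies a larger index for the same false positive rate.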

  6. def addBloomFilterIndex(col: String, keyMetadata: String): IndexBuilder

    Adds a BloomFilter index for the given column

    col

    the column to add the index on

    keyMetadata

    the key metadata to be used

  7. def addBloomFilterIndex(col: String): IndexBuilder

    Adds a BloomFilter index for the given column

    col

    the column to add the index on

  8. def addCustomIndex(index: String, cols: Array[String], params: Map[String, String]): IndexBuilder

    Adds a custom index (overload for Python)

    index

    the index name

    cols

    the sequence of columns

    params

    the map of parameters for the index

  9. def addCustomIndex(index: String, cols: Array[String], params: Map[String, String], keyMetadata: String): IndexBuilder

    Adds a custom index (overload for Python)

    index

    the index name

    cols

    the sequence of columns

    params

    the map of parameters for the index

    keyMetadata

    the key metadata to be used

  10. def addCustomIndex(index: Index): IndexBuilder

    Adds a custom index

    index

    the index instance to add

  11. def addMinMaxIndex(col: String, keyMetadata: String): IndexBuilder

    Adds a MinMax index for the given column

    col

    the column to add the index on

    keyMetadata

    the key metadata to be used

  12. def addMinMaxIndex(col: String): IndexBuilder

    Adds a MinMax index for the given column

    col

    the column to add the index on

  13. def addValueListIndex(col: String, keyMetadata: String): IndexBuilder

    Adds a ValueList index for the given column

    col

    the column to add the index on

    keyMetadata

    the key metadata to be used

  14. def addValueListIndex(col: String): IndexBuilder

    Adds a ValueList index for the given column

    col

    the column to add the index on

  15. final def asInstanceOf[T0]: T0
    Definition Classes
    Any
  16. def build(): DataFrame

    Builds the index for a table URI. It is assumed that the URI used in the Xskipper definition is the identifier of a table (<db>.<table>).

    returns

    a DataFrame indicating if the operation succeeded or not

  17. def build(reader: DataFrameReader): DataFrame

    Build index operation for non table URI

    reader

    a DataFrameReader instance used to read the URI as a DataFrame. Note: the reader is assumed to have all of its parameters configured; the indexing code calls reader.load(Seq(<path>)) to read each object separately.

    returns

    a DataFrame indicating if the operation succeeded or not
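    The add* methods each return the IndexBuilder, so index definitions can be chained before calling one of the build variants. The sketch below assumes the class lives in the package io.xskipper.index.execution and that an IndexBuilder has already been obtained; the object name, method names, and column names (temp, city, vid) are hypothetical.

    ```scala
    import io.xskipper.index.execution.IndexBuilder // assumed package path
    import org.apache.spark.sql.{DataFrame, DataFrameReader}

    object IndexBuilderUsage {
      // Non-table URI: the configured reader is passed to build(reader).
      def indexDataset(builder: IndexBuilder, reader: DataFrameReader): DataFrame =
        builder
          .addMinMaxIndex("temp")      // hypothetical column names
          .addValueListIndex("city")
          .addBloomFilterIndex("vid")
          .build(reader)

      // Table URI: the no-argument build() is used instead.
      def indexTable(builder: IndexBuilder): DataFrame =
        builder
          .addMinMaxIndex("temp")
          .build()
    }
    ```

    Both variants return the status DataFrame described above, which can be inspected with show() to check whether the operation succeeded.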

  18. def clone(): AnyRef
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @native() @throws( ... )
  19. def createOrRefreshExistingIndex(df: DataFrame, indexes: Seq[Index], isRefresh: Boolean): DataFrame

    Creates or refreshes an existing index from the DataFrame of the data to be indexed (assumed to be comprised of objects). This method first collects the objects that are already indexed and then indexes only the objects that are not yet indexed.

    df

    the DataFrame to be indexed - can be either a dataset created by SparkSession.read on some Hadoop file system path or a Hive table on top of some Hadoop file system

    indexes

    a sequence of Index that will be applied on the DataFrame

    isRefresh

    whether or not this is a refresh operation. This is required only because, in the case of a refresh, index stats are ignored (instead of being initialized)

    returns

    a DataFrame of the format status, #indexedFiles, #removedFiles
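    The incremental behaviour described above (skip objects that are already indexed, and report indexed and removed files) can be pictured as two set differences. The sketch below is only an illustration of that idea, not the library's code: the names are hypothetical, and identifying objects purely by name is a simplifying assumption.

    ```scala
    // Illustration only; NOT Xskipper's implementation (assumption).
    object IncrementalIndexing {
      final case class RefreshPlan(toIndex: Set[String], toRemove: Set[String])

      def plan(currentObjects: Set[String], alreadyIndexed: Set[String]): RefreshPlan =
        RefreshPlan(
          // objects present in the data but missing from the index
          toIndex = currentObjects.diff(alreadyIndexed),
          // objects in the index that no longer exist in the data
          toRemove = alreadyIndexed.diff(currentObjects)
        )
    }
    ```

    In this picture, the sizes of toIndex and toRemove would correspond to the #indexedFiles and #removedFiles columns of the returned DataFrame.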

  20. final def eq(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  21. def equals(arg0: Any): Boolean
    Definition Classes
    AnyRef → Any
  22. def finalize(): Unit
    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  23. final def getClass(): Class[_]
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  24. def hashCode(): Int
    Definition Classes
    AnyRef → Any
    Annotations
    @native()
  25. val indexes: ArrayBuffer[Index]
  26. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  27. def initializeLogIfNecessary(isInterpreter: Boolean): Unit
    Attributes
    protected
    Definition Classes
    Logging
  28. final def isInstanceOf[T0]: Boolean
    Definition Classes
    Any
  29. def isTraceEnabled(): Boolean
    Attributes
    protected
    Definition Classes
    Logging
  30. def log: Logger
    Attributes
    protected
    Definition Classes
    Logging
  31. def logDebug(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  32. def logDebug(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  33. def logError(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  34. def logError(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  35. def logInfo(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  36. def logInfo(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  37. def logName: String
    Attributes
    protected
    Definition Classes
    Logging
  38. def logTrace(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  39. def logTrace(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  40. def logWarning(msg: ⇒ String, throwable: Throwable): Unit
    Attributes
    protected
    Definition Classes
    Logging
  41. def logWarning(msg: ⇒ String): Unit
    Attributes
    protected
    Definition Classes
    Logging
  42. val metadataProcessor: MetadataProcessor
  43. final def ne(arg0: AnyRef): Boolean
    Definition Classes
    AnyRef
  44. final def notify(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  45. final def notifyAll(): Unit
    Definition Classes
    AnyRef
    Annotations
    @native()
  46. final def synchronized[T0](arg0: ⇒ T0): T0
    Definition Classes
    AnyRef
  47. def toString(): String
    Definition Classes
    AnyRef → Any
  48. final def wait(): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  49. final def wait(arg0: Long, arg1: Int): Unit
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  50. final def wait(arg0: Long): Unit
    Definition Classes
    AnyRef
    Annotations
    @native() @throws( ... )