io.xskipper.index

BloomFilterIndex

case class BloomFilterIndex(col: String, fpp: Double = ..., ndv: Long = ..., keyMetadata: Option[String] = None) extends Index with Product with Serializable

Represents an abstract bloom filter index

col

the column on which the index is applied

fpp

the expected false positive probability of the bloom filter

ndv

the expected number of distinct values

keyMetadata

optional key metadata
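
The fpp and ndv parameters are the two inputs that conventionally determine how large a Bloom filter needs to be. As a rough illustration, the sketch below applies the textbook sizing formulas (m = -n ln p / (ln 2)^2 bits and k = (m / n) ln 2 hash functions). The object and method names are hypothetical, and this page does not state whether the index sizes its filter this way.

```scala
// Hypothetical helper, not part of io.xskipper: textbook Bloom filter sizing.
object BloomSizing {

  /** Number of bits needed for `ndv` expected distinct values at a target
    * false positive probability `fpp`, using m = -n * ln(p) / (ln 2)^2.
    */
  def optimalNumBits(ndv: Long, fpp: Double): Long = {
    require(ndv > 0, "ndv must be positive")
    require(fpp > 0.0 && fpp < 1.0, "fpp must be in (0, 1)")
    val ln2 = math.log(2)
    math.ceil(-ndv * math.log(fpp) / (ln2 * ln2)).toLong
  }

  /** Number of hash functions for `numBits` bits and `ndv` expected distinct
    * values, using k = (m / n) * ln 2, rounded and never less than 1.
    */
  def optimalNumHashes(ndv: Long, numBits: Long): Int =
    math.max(1, math.round(numBits.toDouble / ndv * math.log(2)).toInt)
}
```

Under these formulas, one million distinct values at a 1% false positive probability need roughly 9.6 bits per value and 7 hash functions. Lowering fpp or raising ndv increases the filter size.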

Linear Supertypes
Product, Equals, Index, Logging, Serializable, Serializable, AnyRef, Any

Instance Constructors

  1. new BloomFilterIndex(col: String, fpp: Double = ..., ndv: Long = ..., keyMetadata: Option[String] = None)

    col

    the column on which the index is applied

    fpp

    the expected false positive probability of the bloom filter

    ndv

    the expected number of distinct values

    keyMetadata

    optional key metadata

Value Members

  1. final def !=(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  2. final def ##(): Int

    Definition Classes
    AnyRef → Any
  3. final def ==(arg0: Any): Boolean

    Definition Classes
    AnyRef → Any
  4. final def asInstanceOf[T0]: T0

    Definition Classes
    Any
  5. def clone(): AnyRef

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  6. val col: String

    the column on which the index is applied

  7. var colsMap: Map[String, IndexField]

    Definition Classes
    Index
  8. final def eq(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  9. def finalize(): Unit

    Attributes
    protected[java.lang]
    Definition Classes
    AnyRef
    Annotations
    @throws( classOf[java.lang.Throwable] )
  10. val fpp: Double

    the expected false positive probability of the bloom filter

  11. def generateBaseMetadata(): MetadataType

    returns

    the "zero" value of the index, used for the first comparison against the object's row data (null by default)

    Definition Classes
    Index
  12. def generateColsMap(schemaMap: Map[String, (String, StructField)]): Unit

    Generate the column map according to a given schema

    schemaMap

    a map containing column names (as they appear in the object) and their data types; the key is the column name in lower case

    Definition Classes
    Index
  13. final def getClass(): Class[_]

    Definition Classes
    AnyRef → Any
  14. def getCols: Seq[String]

    returns

    the index columns (in lower case)

    Definition Classes
    Index
  15. def getIndexCols: Iterable[IndexField]

    returns

    the columns on which the index is defined

    Definition Classes
    Index
  16. def getKeyMetadata(): Option[String]

    Definition Classes
    Index
  17. def getMetaDataTypeClassName(): String

    returns

    the (full) name of the MetaDataType class used by this index

    Definition Classes
    BloomFilterIndex → Index
  18. def getName: String

    returns

    the name of the index

    Definition Classes
    BloomFilterIndex → Index
  19. def getParams: Map[String, String]

    returns

    the index params map

    Definition Classes
    Index
  20. def getRowMetadata(row: Row): Any

    Gets a DataFrame row and extracts the raw metadata needed by the index

    row

    a row to be indexed

    returns

    the raw metadata needed by the index, or null if the row contains a null value

    Definition Classes
    BloomFilterIndex → Index
  21. val indexCols: Seq[String]

    Definition Classes
    Index
  22. def initializeLogIfNecessary(isInterpreter: Boolean, silent: Boolean): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  23. def initializeLogIfNecessary(isInterpreter: Boolean): Unit

    Attributes
    protected
    Definition Classes
    Logging
  24. def isEncrypted(): Boolean

    Definition Classes
    Index
  25. final def isInstanceOf[T0]: Boolean

    Definition Classes
    Any
  26. var isOptimized: Boolean

    Definition Classes
    Index
  27. def isTraceEnabled(): Boolean

    Attributes
    protected
    Definition Classes
    Logging
  28. def isValid(df: DataFrame, schemaMap: Map[String, (String, DataType)]): Unit

    Gets a DataFrame and checks whether it is valid for the index

    df

    the DataFrame to be checked

    schemaMap

    a map containing column names (as they appear in the object) and their data types; the key is the column name in lower case

    Definition Classes
    BloomFilterIndex → Index
    Exceptions thrown

    XskipperException if the index is invalid

  29. val keyMetadata: Option[String]

    optional key metadata

  30. def log: Logger

    Attributes
    protected
    Definition Classes
    Logging
  31. def logDebug(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  32. def logDebug(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  33. def logError(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  34. def logError(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  35. def logInfo(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  36. def logInfo(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  37. def logName: String

    Attributes
    protected
    Definition Classes
    Logging
  38. def logTrace(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  39. def logTrace(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  40. def logWarning(msg: ⇒ String, throwable: Throwable): Unit

    Attributes
    protected
    Definition Classes
    Logging
  41. def logWarning(msg: ⇒ String): Unit

    Attributes
    protected
    Definition Classes
    Logging
  42. val ndv: Long

    the expected number of distinct values

  43. final def ne(arg0: AnyRef): Boolean

    Definition Classes
    AnyRef
  44. final def notify(): Unit

    Definition Classes
    AnyRef
  45. final def notifyAll(): Unit

    Definition Classes
    AnyRef
  46. def optCollectMetaData(filePath: String, df: DataFrame, format: String, options: Map[String, String]): MetadataType

    For some formats there may be an optimized way to collect the metadata. This function enables it by receiving the entire file DataFrame instead of processing it row by row (for example, in Parquet the min/max can be read from the footer).

    filePath

    the path of the file that is being processed

    df

    a DataFrame with the file data

    format

    the format of the file

    options

    the options that were used to read the file

    returns

    the collected MetadataType or null if no metadata was collected

    Definition Classes
    Index
  47. def reduce(accuMetadata: MetadataType, curr: Any): MetadataType

    Given accumulated metadata and a new value, processes the new value and returns the updated accumulated metadata

    accuMetadata

    accumulated metadata created by processing all values until curr

    curr

    new value to be processed

    returns

    updated metadata for the index

    Definition Classes
    BloomFilterIndex → Index
  48. def reduce(md1: MetadataType, md2: MetadataType): MetadataType

    Same as the reduce above, but given two accumulated metadata values

    returns

    updated metadata for the index

    Definition Classes
    BloomFilterIndex → Index
  49. final def synchronized[T0](arg0: ⇒ T0): T0

    Definition Classes
    AnyRef
  50. final def wait(): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  51. final def wait(arg0: Long, arg1: Int): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
  52. final def wait(arg0: Long): Unit

    Definition Classes
    AnyRef
    Annotations
    @throws( ... )
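
The members generateBaseMetadata, getRowMetadata and the two reduce overloads read like the pieces of a fold: start from a "zero" value, fold each row's raw value into the accumulator, then merge accumulators built separately. The sketch below illustrates that shape only. ToyBloomMetadata, ToyBloomIndex, collect, the Map-based rows and the fixed filter size are all hypothetical stand-ins, not xskipper's MetadataType, Spark's Row, or the library's actual logic.

```scala
import scala.collection.immutable.BitSet
import scala.util.hashing.MurmurHash3

// Hypothetical stand-in for MetadataType: a toy Bloom filter over numBits bits.
final case class ToyBloomMetadata(bits: BitSet, numBits: Int, numHashes: Int) {
  // Derive numHashes bit positions from two base hashes (double hashing).
  private def positions(value: Any): Seq[Int] = {
    val h1 = value.hashCode
    val h2 = MurmurHash3.stringHash(value.toString)
    (0 until numHashes).map(i => java.lang.Math.floorMod(h1 + i * h2, numBits))
  }
  def add(value: Any): ToyBloomMetadata =
    copy(bits = positions(value).foldLeft(bits)(_ + _))
  def mightContain(value: Any): Boolean =
    positions(value).forall(bits.contains)
  def merge(other: ToyBloomMetadata): ToyBloomMetadata = {
    require(numBits == other.numBits && numHashes == other.numHashes)
    copy(bits = bits | other.bits)
  }
}

// Hypothetical index mirroring the member names documented above.
object ToyBloomIndex {
  private val NumBits = 1024 // arbitrary size chosen for the illustration
  private val NumHashes = 3

  // The "zero" value; null, matching the documented default.
  def generateBaseMetadata(): ToyBloomMetadata = null

  // Raw value for the indexed column, or null if the row holds no value.
  def getRowMetadata(row: Map[String, Any], col: String): Any =
    row.getOrElse(col, null)

  // Fold one raw value into the accumulator; null values are skipped.
  def reduce(acc: ToyBloomMetadata, curr: Any): ToyBloomMetadata =
    if (curr == null) acc
    else Option(acc).getOrElse(ToyBloomMetadata(BitSet.empty, NumBits, NumHashes)).add(curr)

  // Merge two accumulators, tolerating the null "zero" on either side.
  def reduce(md1: ToyBloomMetadata, md2: ToyBloomMetadata): ToyBloomMetadata =
    if (md1 == null) md2 else if (md2 == null) md1 else md1.merge(md2)

  // Row-by-row collection for one object, expressed as a left fold.
  def collect(rows: Seq[Map[String, Any]], col: String): ToyBloomMetadata =
    rows.map(getRowMetadata(_, col))
      .foldLeft(generateBaseMetadata())((acc, v) => reduce(acc, v))
}
```

With this shape, two partitions of the same object can be collected independently and combined with the two-argument-metadata reduce, and an object with no non-null values simply keeps the null "zero".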
