object ParquetUtils extends Logging

Linear Supertypes
Logging, AnyRef, Any

Value Members

  1. def getColumnName(idx: Index, version: Long = ...): String

    Returns the column name for the specified index and version.

    idx

    the index for which the column name needs to be created

    version

    the version number; the metadata spec of that version determines the column name

  2. def getColumnNameForCols(cols: Seq[String], idxName: String, version: Long = ...): String
  3. def getIndexSchema(index: Index, translators: Seq[ParquetMetaDataTranslator]): Option[DataType]

    Given an index and a list of schema translators, searches for the first available translation to a native DataFrame schema. Returns None if no translation is found.

    index

    the index to translate

    translators

    the list of available translators

    returns

    the DataType associated with the translation

  4. def getMdVersionStatus(version: Long): MetadataVersionStatus.MetadataVersionStatus
  5. def getMdVersionStatusFromDf(df: DataFrame): MetadataVersionStatus.MetadataVersionStatus
  6. def getVersion(schema: StructType): Long

    Retrieves the version number from a metadata DataFrame schema. Returns 0 if the version is not explicitly defined (files without a version number are implicitly declared version 0). The function assumes the obj_name column exists in the schema.

    schema

    the schema of the metadata DataFrame

  7. def getVersion(df: DataFrame): Long

    Retrieves the version number from a metadata DataFrame. Returns 0 if the version is not explicitly defined (files without a version number are implicitly declared version 0). The function assumes the obj_name column exists in the schema.

  8. def isPmeAvailable(): Boolean

    Checks whether Parquet Modular Encryption (PME) is available. The check is performed by verifying that org.apache.parquet.crypto.AesCipher is available (it is available if and only if PME is loaded).

    returns

    true if PME is loaded, else false

  9. def mdFileToDF(session: SparkSession, mdPath: String): DataFrame
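
Example (isPmeAvailable). The class-presence check described for isPmeAvailable could be sketched roughly as follows. This is an illustration only, not the library's actual implementation: the object PmeCheckSketch and the helper isClassAvailable are hypothetical names, and only the marker class name is taken from the documentation above.

```scala
object PmeCheckSketch {
  // Marker class named in the isPmeAvailable documentation.
  val PmeMarkerClass = "org.apache.parquet.crypto.AesCipher"

  // Hypothetical helper: true if the named class can be located by the
  // class loader. The class is looked up without being initialized.
  def isClassAvailable(className: String): Boolean =
    try {
      Class.forName(className, false, getClass.getClassLoader)
      true
    } catch {
      case _: ClassNotFoundException | _: LinkageError => false
    }

  // Mirrors the documented contract: true if PME is loaded, else false.
  def isPmeAvailable(): Boolean = isClassAvailable(PmeMarkerClass)
}
```

The result of PmeCheckSketch.isPmeAvailable() depends on whether a Parquet build that includes PME is on the classpath, so callers would typically use it to guard encryption-specific code paths.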
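
Example (getIndexSchema). The "first available translation" lookup described for getIndexSchema can be illustrated with a small, self-contained sketch. All types here (Index, DataType, Translator) are simplified stand-ins defined only for this example; the real Index, Spark DataType, and ParquetMetaDataTranslator have different shapes, and the actual implementation may differ.

```scala
object FirstTranslationSketch {
  // Hypothetical stand-ins for the real index and schema types.
  final case class Index(name: String)
  sealed trait DataType
  case object LongType extends DataType
  case object StringType extends DataType

  // Hypothetical translator interface: Some(schema) if this translator
  // can handle the index, None otherwise.
  trait Translator {
    def translate(index: Index): Option[DataType]
  }

  // Tries each translator in order and stops at the first one that
  // produces a schema; yields None if no translator does.
  def firstTranslation(index: Index, translators: Seq[Translator]): Option[DataType] =
    translators.iterator
      .map(_.translate(index))
      .collectFirst { case Some(dataType) => dataType }
}
```

Using an iterator keeps the search lazy, so translators after the first successful one are never invoked.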